----- Original Message -----
From: "Richard A. O'Keefe" <ok at cs.otago.ac.nz>
To: <paul at datavore.com>
Sent: Thursday, September 04, 2003 2:56 AM
Subject: Re: [R] Overlaying graphs

> I do not know how to overlay the curve graphic on top of hist graphic.
>
> Do you know about the "add=TRUE" option for plot()?
>
> I am hoping to show visually that the normal curve overlays the obtained
> probability distribution when plotted on the same graph. Unfortunately, I
> am not sure how to overlay them. Can anyone point me in the right direction
> or show me the code.
>
> This is a bad way to do it anyway. What you want is a qqnorm plot.
> See ?qqnorm.
My apologies for the last email that only contained the message and not my reply. Here is what I meant to send.

----- Original Message -----
From: "Richard A. O'Keefe" <ok at cs.otago.ac.nz>
To: <paul at datavore.com>
Sent: Thursday, September 04, 2003 2:56 AM
Subject: Re: [R] Overlaying graphs

> I do not know how to overlay the curve graphic on top of hist graphic.
>
> Do you know about the "add=TRUE" option for plot()?

I learned about it from one of the list members and it worked OK for me. This is the recipe I finally came up with:

  # Read the data and estimate the parameters of the normal curve
  fat  <- read.table("fat.dat", header=TRUE)
  mu   <- mean(fat$height)
  sdev <- sd(fat$height)

  # Density-scaled histogram (freq=FALSE) so the curve is on the same scale
  par(fin=c(4,4))
  hist(fat$height, breaks=20, freq=FALSE, col="lightblue", border="black",
       xlab="Male Height in Inches", main="Histogram of Male Height")

  # Overlay the fitted normal density on the existing plot
  curve(dnorm(x, mu, sdev), add=TRUE, from=64, to=78, col="red", lwd=5)

> I am hoping to show visually that the normal curve overlays the obtained
> probability distribution when plotted on the same graph. Unfortunately, I
> am not sure how to overlay them. Can anyone point me in the right direction
> or show me the code.
>
> This is a bad way to do it anyway. What you want is a qqnorm plot.
> See ?qqnorm.

Yes, qqnorm looks like a better tool for this particular job. It does not appear to be very general, though, in the sense that you could not use it to visually inspect whether Poisson-distributed data conform to a theoretical Poisson distribution.

I guess this leads to two more questions:

1. Is the Anderson-Darling goodness-of-fit test the recommended analytic test for determining whether observed data conform to a theoretical normal distribution?

2. Does R have a suite of "best-fit" tools for finding the best-fitting probability distribution for any observed probability distribution?

Regards,
Paul Meagher
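On the point about qqnorm not being general: a minimal sketch, assuming simulated stand-in data (the `fat.dat` file from the recipe is not available here, and the names `heights`, `counts`, `lambda_hat` and `theo` are made up for illustration). It shows the usual `qqnorm()`/`qqline()` pair, then a Q-Q plot against a Poisson distribution built by hand from `qpois()` and `ppoints()`. For the analytic test, base R's `shapiro.test()` is used as a stand-in; it is a Shapiro-Wilk test, not Anderson-Darling. The add-on package nortest provides `ad.test()`, which is left out here so that the sketch needs only base R.

```r
# Simulated stand-in data (hypothetical; not the fat.dat values from the thread)
set.seed(1)
heights <- rnorm(250, mean = 70, sd = 2.6)  # pretend heights in inches
counts  <- rpois(250, lambda = 4)           # pretend count data

# Normal case: sample quantiles against standard normal quantiles
qqnorm(heights)
qqline(heights, col = "red")

# General case: any distribution with a quantile function can be used.
# Here the Poisson mean is estimated from the data, then theoretical
# quantiles are taken at the plotting positions given by ppoints().
lambda_hat <- mean(counts)
theo <- qpois(ppoints(length(counts)), lambda_hat)
qqplot(theo, counts,
       xlab = "Theoretical Poisson quantiles", ylab = "Observed counts")
abline(0, 1, col = "red")  # points near this line suggest a reasonable fit

# Base-R normality test (Shapiro-Wilk, swapped in for Anderson-Darling)
sw <- shapiro.test(heights)
print(sw$p.value)
```

Because Poisson data are discrete, the points in the second plot fall in steps rather than along a smooth line, so the plot is best read as a rough check.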
Paul Meagher wrote:

> 2. Does R have a suite of "best-fit" tools for finding the best-fitting
> probability distribution for any observed probability distribution?

I think that the best-fitting probability distribution for an observed probability distribution is the empirical distribution of your observations.

(Perhaps you have some other criteria than just goodness of fit?)

Damon.
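If the criterion is goodness of fit penalised by the number of parameters, one possible approach to question 2 is to fit several candidate families by maximum likelihood and compare them by AIC. The sketch below assumes the MASS package (shipped with standard R installations) and uses simulated data; the candidate list and the names `x`, `fits` and `aic` are choices made for illustration, not an automatic "best-fit" suite.

```r
library(MASS)  # provides fitdistr() for maximum-likelihood fitting

# Simulated positive data, for illustration only
set.seed(2)
x <- rgamma(300, shape = 5, rate = 0.5)

# Fit a hand-picked set of candidate distributions
fits <- list(
  normal    = fitdistr(x, "normal"),
  lognormal = fitdistr(x, "lognormal"),
  gamma     = fitdistr(x, "gamma")
)

# A smaller AIC suggests a better trade-off between fit and complexity
aic <- sapply(fits, AIC)
print(sort(aic))
```

The ranking only compares the families that were put in the list, so it cannot say whether any of them fits well in absolute terms; a Q-Q plot against the winning family is a sensible follow-up.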
From: "Damon Wischik" <djw1005 at cam.ac.uk>

> Paul Meagher wrote:
> > 2. Does R have a suite of "best-fit" tools for finding the best-fitting
> > probability distribution for any observed probability distribution?
>
> I think that the best-fitting probability distribution for an observed
> probability distribution is the empirical distribution of your
> observations.
>
> (Perhaps you have some other criteria than just goodness of fit?)

You can certainly use the empirical distribution of observations to construct your probability distribution, and you are correct that, in some sense, this would be the best-fitting probability distribution.

Lately I have been asking myself why we bother in the first place to use theoretical probability distributions to model our empirical distributions. Why not construct the probability distribution directly from the data itself? I think that in some cases this is the correct route to go. Computers allow us to make inferences about the probability of certain outcomes using these irregularly shaped distributions. These inferences may be more accurate than those obtained from any of the available theoretical probability distributions.

The main reasons I can come up with for not using the empirical distribution itself as your probability distribution are:

1. Over-fitting, which limits your ability to generalize to new situations. This, I think, is the most important reason for engaging in the exercise of fitting your data to a theoretical distribution.

2. It is easier to derive inferences about your random variable. This is the second most important reason.

3. Anyone who plays with numbers constitutionally tends towards platonism.

Regards,
Paul

> Damon.
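The trade-off described above can be sketched in a few lines: a tail probability computed straight from the empirical distribution with `ecdf()`, the same probability from a fitted normal, and a bootstrap interval obtained by resampling the data. The sample is simulated, and the threshold `q` and all variable names are hypothetical.

```r
# Small simulated sample (hypothetical heights in inches)
set.seed(3)
x <- rnorm(40, mean = 70, sd = 2.6)

Fn <- ecdf(x)  # empirical distribution function of the sample
q  <- 75       # hypothetical threshold of interest

# P(X > q) taken directly from the data
p_empirical <- 1 - Fn(q)

# P(X > q) from a normal distribution fitted by mean and standard deviation
p_normal <- pnorm(q, mean(x), sd(x), lower.tail = FALSE)

print(c(empirical = p_empirical, normal = p_normal))

# Inference straight from the empirical distribution: bootstrap the mean
# by resampling the observations with replacement
boot_means <- replicate(2000, mean(sample(x, replace = TRUE)))
print(quantile(boot_means, c(0.025, 0.975)))
```

With 40 observations the empirical estimate can only take values that are multiples of 1/40, and it is exactly zero beyond the largest observation. That is one concrete form of the over-fitting concern in point 1, while the fitted normal still assigns a small positive probability to the tail.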