I am attempting to wrap the histogram function in my own custom function, so that I can quickly generate some standard plots.

A part of what I want to do is to draw a normal curve over the histogram:

> x <- rnorm(1000)
> hist(x, freq=F)
> curve(dnorm(x), lty=3, add=T)

(for normal use, x would be a vector of empirical values, but the rnorm() function works for testing)

That works just as you'd expect, but I've found something a bit strange.

If I try the following:

> curve(dnorm(x, mean=mean(x), sd=sd(x)), lty=3, add=T)

I get a much flatter and broader curve (which looks like it probably has the same area as the first curve, though I haven't tested).

However, if I do

> z <- sd(x)
> curve(dnorm(x, mean=mean(x), sd=z), lty=1, add=T)

I get the curve you'd expect; it draws right over the first curve (curve(dnorm(x),...), above).

I haven't touched x between the call to curve() containing dnorm(...,sd=sd(x)) and the call to curve() containing dnorm(...,sd=z), and tests show that z == sd(x).

I get similar results if I manually type in the standard deviation of x--the expected curve is drawn--so the broader and flatter curve is only drawn when I call dnorm with sd=sd(x).

Is there a reason for this, or is there something odd going on with the call to curve()?

Regards,

Tom

Thomas Hopper <thopper at cobasys.com> writes:

> I am attempting to wrap the histogram function in my own custom
> function, so that I can quickly generate some standard plots.
>
> A part of what I want to do is to draw a normal curve over the histogram:
>
> > x <- rnorm(1000)
> > hist(x, freq=F)
> > curve(dnorm(x), lty=3, add=T)
>
> (for normal use, x would be a vector of empirical values, but the
> rnorm() function works for testing)
>
> That works just as you'd expect, but I've found something a bit strange.
>
> If I try the following:
>
> > curve(dnorm(x, mean=mean(x), sd=sd(x)), lty=3, add=T)
>
> I get a much flatter and broader curve (which looks like it probably
> has the same area as the first curve, though I haven't tested).
>
> However, if I do
>
> > z <- sd(x)
> > curve(dnorm(x, mean=mean(x), sd=z), lty=1, add=T)
>
> I get the curve you'd expect; it draws right over the first curve
> (curve(dnorm(x),...), above).

I don't think that is guaranteed, actually. Notice that curve plots the *expression* as a function of the argument "x". So it takes a bunch of x values, evenly spread across the abscissa, collects them into a vector, and plugs that in as "x" in

    curve(dnorm(x, mean=mean(x), sd=sd(x)), lty=3, add=T)

I.e. the x that gets plugged into mean(x) and sd(x) has nothing to do with your original data (except that they both fit in the same xlim)!

-- 
   O__  ---- Peter Dalgaard             Blegdamsvej 3
  c/ /'_ --- Dept. of Biostatistics     2200 Cph. N
 (*) \(*) -- University of Copenhagen   Denmark      Ph: (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)             FAX: (+45) 35327907
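
To make the explanation above concrete, here is a small sketch that is not part of the original thread. It mimics what curve() does without plotting anything: the names `dat` and `grid` are illustrative, and the range -4 to 4 is only an assumed stand-in for the histogram's x-axis limits (curve() evaluates the expression at 101 evenly spaced points by default).

```r
# Illustrative only: `dat` and `grid` are made-up names, and the range
# -4..4 is an assumed stand-in for the histogram's x-axis limits.
set.seed(1)
dat  <- rnorm(1000)                    # the data, as in the original post
grid <- seq(-4, 4, length.out = 101)   # roughly what curve() binds to `x`

sd(dat)    # close to 1
sd(grid)   # much larger, hence the flatter, broader curve

# The two lines below differ only in which vector mean() and sd() see.
flat   <- dnorm(grid, mean = mean(grid), sd = sd(grid))  # what was drawn
wanted <- dnorm(grid, mean = mean(dat),  sd = sd(dat))   # what was intended

max(flat) < max(wanted)   # TRUE: the first curve has a lower peak
```

This also suggests why storing sd(x) in z beforehand gave the expected curve: z is computed once from the data and is not affected by whatever curve() later binds to x.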

For which x do you think sd(x) is evaluated? Hint: the help page shows

    expr: an expression written as a function of 'x', or alternatively
          the name of a function which will be plotted.

and you have written dnorm(x, mean=mean(x), sd=sd(x)) as function of x.

On Thu, 10 Feb 2005, Thomas Hopper wrote:

> I am attempting to wrap the histogram function in my own custom function, so
> that I can quickly generate some standard plots.
>
> A part of what I want to do is to draw a normal curve over the histogram:
>
>> x <- rnorm(1000)
>> hist(x, freq=F)
>> curve(dnorm(x), lty=3, add=T)
>
> (for normal use, x would be a vector of empirical values, but the rnorm()
> function works for testing)
>
> That works just as you'd expect, but I've found something a bit strange.
>
> If I try the following:
>
>> curve(dnorm(x, mean=mean(x), sd=sd(x)), lty=3, add=T)
>
> I get a much flatter and broader curve (which looks like it probably has the
> same area as the first curve, though I haven't tested).
>
> However, if I do
>
>> z <- sd(x)
>> curve(dnorm(x, mean=mean(x), sd=z), lty=1, add=T)
>
> I get the curve you'd expect; it draws right over the first curve
> (curve(dnorm(x),...), above).
>
> I haven't touched x between the call to curve() containing
> dnorm(...,sd=sd(x)) and the call to curve() containing dnorm(...,sd=z), and
> tests show that z == sd(x).
>
> I get similar results if I manually type in the standard deviation of x--the
> expected curve is drawn--so the broader and flatter curve is only drawn when
> I call dnorm with sd=sd(x).
>
> Is there a reason for this, or is there something odd going on with the call
> to curve()?

It's working as documented.

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
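
Given both replies, one possible shape for the wrapper the original poster was after is sketched below. This is not code from the thread: the function name `hist_with_normal` and its argument `values` are hypothetical, and the approach simply follows the poster's own workaround of computing the mean and standard deviation before calling curve(), so that the `x` inside the expression refers only to curve()'s plotting grid.

```r
# Hypothetical helper (name and arguments are illustrative, not from the thread).
hist_with_normal <- function(values, ...) {
  # Compute the summary statistics from the data up front, under names
  # that cannot clash with the `x` that curve() substitutes.
  m <- mean(values)
  s <- sd(values)

  # Density-scaled histogram, so the normal density is on the same scale.
  h <- hist(values, freq = FALSE, ...)

  # Here `x` is curve()'s own grid; m and s come from the data.
  curve(dnorm(x, mean = m, sd = s), lty = 3, add = TRUE)

  invisible(list(mean = m, sd = s, hist = h))
}

# Usage with simulated data, as in the original post.
set.seed(42)
res <- hist_with_normal(rnorm(1000), main = "Histogram with fitted normal")
```

Naming the data argument something other than `x` is a deliberate choice in this sketch: it makes it harder to write mean(x) or sd(x) inside the curve() expression by accident.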