I am attempting to wrap the histogram function in my own custom function, so that I can quickly generate some standard plots.

A part of what I want to do is to draw a normal curve over the histogram:

> x <- rnorm(1000)
> hist(x, freq=F)
> curve(dnorm(x), lty=3, add=T)

(for normal use, x would be a vector of empirical values, but the rnorm() function works for testing)

That works just as you'd expect, but I've found something a bit strange.

If I try the following:

> curve(dnorm(x, mean=mean(x), sd=sd(x)), lty=3, add=T)

I get a much flatter and broader curve (which looks like it probably has the same area as the first curve, though I haven't tested).

However, if I do

> z <- sd(x)
> curve(dnorm(x, mean=mean(x), sd=z), lty=1, add=T)

I get the curve you'd expect; it draws right over the first curve (curve(dnorm(x),...), above).

I haven't touched x between the call to curve() containing dnorm(...,sd=sd(x)) and the call to curve() containing dnorm(...,sd=z), and tests show that z == sd(x).

I get similar results if I manually type in the standard deviation of x--the expected curve is drawn--so the broader and flatter curve is only drawn when I call dnorm with sd=sd(x).

Is there a reason for this, or is there something odd going on with the call to curve()?

Regards,

Tom
Thomas Hopper <thopper at cobasys.com> writes:

> I am attempting to wrap the histogram function in my own custom
> function, so that I can quickly generate some standard plots.
>
> A part of what I want to do is to draw a normal curve over the histogram:
>
> > x <- rnorm(1000)
> > hist(x, freq=F)
> > curve(dnorm(x), lty=3, add=T)
>
> (for normal use, x would be a vector of empirical values, but the
> rnorm() function works for testing)
>
> That works just as you'd expect, but I've found something a bit strange.
>
> If I try the following:
>
> > curve(dnorm(x, mean=mean(x), sd=sd(x)), lty=3, add=T)
>
> I get a much flatter and broader curve (which looks like it probably
> has the same area as the first curve, though I haven't tested).
>
> However, if I do
>
> > z <- sd(x)
> > curve(dnorm(x, mean=mean(x), sd=z), lty=1, add=T)
>
> I get the curve you'd expect; it draws right over the first curve
> (curve(dnorm(x),...), above).

I don't think that is guaranteed, actually. Notice that curve plots the *expression* as a function of the argument "x". So it takes a bunch of x values, evenly spread across the abscissa, collects them into a vector, and plugs that in as "x" in

curve(dnorm(x, mean=mean(x), sd=sd(x)), lty=3, add=T)

I.e. the x that gets plugged into mean(x) and sd(x) has nothing to do with your original data (except that they both fit in the same xlim)!

--
   O__  ---- Peter Dalgaard             Blegdamsvej 3
  c/ /'_ --- Dept. of Biostatistics     2200 Cph. N
 (*) \(*) -- University of Copenhagen   Denmark      Ph: (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)             FAX: (+45) 35327907
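
The point above can be checked directly. The following is a minimal sketch, assuming a plotting range of -3 to 3 and curve()'s default of n = 101 evaluation points (both are choices made for illustration, not taken from the original post). It rebuilds the grid that curve() binds to x and compares the standard deviation of that grid with the standard deviation of the data.

```r
set.seed(1)            # arbitrary seed, only for reproducibility
x <- rnorm(1000)       # stand-in for the empirical data

# Assumed plotting range and curve()'s default number of points.
grid <- seq.int(-3, 3, length.out = 101)

sd_data <- sd(x)       # spread of the data, close to 1
sd_grid <- sd(grid)    # spread of the plotting grid, about 1.76

# What the expression evaluates to when x is the grid rather than the data.
y_grid <- dnorm(grid, mean = mean(grid), sd = sd(grid))

# curve() invisibly returns the points it drew, so the two can be compared.
if (!interactive()) pdf(NULL)   # avoid writing Rplots.pdf when run as a script
drawn <- curve(dnorm(x, mean = mean(x), sd = sd(x)), from = -3, to = 3)
same <- isTRUE(all.equal(drawn$y, y_grid))
```

Because the grid is spread evenly over the whole axis, its standard deviation is larger than that of the data, which would account for the flatter, broader curve.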
For which x do you think sd(x) is evaluated?
Hint: the help page shows

    expr: an expression written as a function of 'x', or alternatively
          the name of a function which will be plotted.

and you have written dnorm(x, mean=mean(x), sd=sd(x)) as a function of x.
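
One way to act on this hint inside a wrapper is sketched below. The function name hist_with_normal and its argument values are hypothetical, chosen only for illustration. The idea is to compute the mean and standard deviation from the data before calling curve(), so that the only x left in the expression is curve()'s own plotting variable.

```r
# hist_with_normal is a hypothetical helper name, not part of base R.
hist_with_normal <- function(values, ...) {
  # Summary statistics come from the data, computed outside the expression.
  m <- mean(values)
  s <- sd(values)
  h <- hist(values, freq = FALSE, ...)
  # Here x is only curve()'s plotting grid; m and s are fixed numbers.
  curve(dnorm(x, mean = m, sd = s), lty = 3, add = TRUE)
  invisible(list(hist = h, mean = m, sd = s))
}

if (!interactive()) pdf(NULL)   # avoid writing Rplots.pdf when run as a script
set.seed(1)                     # arbitrary seed, only for reproducibility
res <- hist_with_normal(rnorm(1000))
```

Naming the data argument something other than x also avoids any clash with the variable name that curve() uses by default.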
On Thu, 10 Feb 2005, Thomas Hopper wrote:
> I am attempting to wrap the histogram function in my own custom function,
> so that I can quickly generate some standard plots.
>
> A part of what I want to do is to draw a normal curve over the histogram:
>
>> x <- rnorm(1000)
>> hist(x, freq=F)
>> curve(dnorm(x), lty=3, add=T)
>
> (for normal use, x would be a vector of empirical values, but the rnorm()
> function works for testing)
>
> That works just as you'd expect, but I've found something a bit strange.
>
> If I try the following:
>
>> curve(dnorm(x, mean=mean(x), sd=sd(x)), lty=3, add=T)
>
> I get a much flatter and broader curve (which looks like it probably has
> the same area as the first curve, though I haven't tested).
>
> However, if I do
>
>> z <- sd(x)
>> curve(dnorm(x, mean=mean(x), sd=z), lty=1, add=T)
>
> I get the curve you'd expect; it draws right over the first curve
> (curve(dnorm(x),...), above).
>
> I haven't touched x between the call to curve() containing
> dnorm(...,sd=sd(x)) and the call to curve() containing dnorm(...,sd=z), and
> tests show that z == sd(x).
>
> I get similar results if I manually type in the standard deviation of
> x--the expected curve is drawn--so the broader and flatter curve is only
> drawn when I call dnorm with sd=sd(x).
>
> Is there a reason for this, or is there something odd going on with the
> call to curve()?

It's working as documented.
--
Brian D. Ripley, ripley at stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595