Dear R-Gurus,

I wonder why 'density' values as shown in hist or plot(density(x)) are sometimes over 1. How can that be?

Example

> hist(rnorm(1000,sd=.5),freq=FALSE)

The resulting plot shows density values below 1 on the y-axis. However,

> hist(rnorm(1000,sd=.1),freq=FALSE)

shows density values over 1.

How to interpret density values over 1?

Greetings,

Johannes
Because densities are not probabilities. It is the area under the density curve that represents probability.

Example: the chi-squared density with 1 degree of freedom has a singularity at zero and is unbounded. The area under the curve, however, is still 1.

(This is a distressingly common misconception. It is really not an R issue but a distribution theory issue.)

Bill Venables

________________________________________
From: r-help-bounces at r-project.org [r-help-bounces at r-project.org] On Behalf Of Johannes Elias [jelias at hygiene.uni-wuerzburg.de]
Sent: 02 March 2009 22:27
To: r-help at r-project.org
Subject: [R] density > 1?

______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
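The chi-squared example can be checked numerically. The following is a minimal sketch using the base R functions dchisq (density) and pchisq (cumulative probability, i.e. the area under the density up to a point); the evaluation points in x are arbitrary choices made here for illustration and are not part of the original reply.

```r
# Chi-squared density with 1 degree of freedom, evaluated ever closer to zero.
# The evaluation points are arbitrary illustrative choices.
x <- c(1e-1, 1e-2, 1e-3, 1e-4)
dens <- dchisq(x, df = 1)
print(dens)   # the values keep growing as x approaches 0

# The total area under the curve, read off the CDF, is still 1.
total_area <- pchisq(Inf, df = 1)
print(total_area)
```

The density values exceed 1 at every one of these points, while the total probability stays at 1.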
Hi Johannes,

it's more a statistical issue. In short: densities are not probabilities! With a continuous random variable, probability statements are typically about intervals, not about points. A density must be non-negative and integrate to 1, nothing else.

Consider the uniform(0, 0.5) distribution, where the density is f(x) = 2 for all 0 <= x <= 0.5. This is a perfectly valid probability density whose non-zero values are all > 1.

hth.

--
Eik Vettorazzi
Institut für Medizinische Biometrie und Epidemiologie
Universitätsklinikum Hamburg-Eppendorf
Martinistr. 52
20246 Hamburg
T ++49/40/42803-8243
F ++49/40/42803-7790
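The uniform(0, 0.5) example is easy to verify in R. This is a small sketch using dunif, punif and integrate from base R; the evaluation points and the interval (0.1, 0.2) are assumptions picked for illustration.

```r
# Uniform(0, 0.5): the density equals 2 everywhere on the support.
x <- c(0, 0.1, 0.25, 0.4, 0.5)
dens <- dunif(x, min = 0, max = 0.5)
print(dens)

# The density nevertheless integrates to 1 over the support.
area <- integrate(dunif, lower = 0, upper = 0.5, min = 0, max = 0.5)$value
print(area)

# A probability is the area over an interval, here P(0.1 < X < 0.2).
p <- punif(0.2, min = 0, max = 0.5) - punif(0.1, min = 0, max = 0.5)
print(p)
```

The interval probability is the density (2) times the interval length (0.1), which is how a density above 1 still yields a probability below 1.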
Johannes Elias wrote:
> How to interpret density values over 1?

This comes up every now and again. The real question is: Why do people believe that densities should be probabilities? They're not; they denote (differential) probability per unit on the x axis, and the denominator can be small. The density _integrates_ to 1, so if e.g. it is concentrated on (0, 0.5), it has to be at least 2 somewhere.

--
   O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark      Ph:  (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)              FAX: (+45) 35327907
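The "at least 2 somewhere" step can be written out as a short bound, and the same reasoning explains the two histograms in the question. The notation (f for the density, sigma for the standard deviation) is introduced here for illustration; the peak-height formula is the standard one for a normal density.

```latex
% A density concentrated on (0, 0.5) must reach at least 2:
1 = \int_{0}^{0.5} f(x)\,dx \;\le\; 0.5 \cdot \sup_{0 < x < 0.5} f(x)
\quad\Longrightarrow\quad
\sup_{0 < x < 0.5} f(x) \;\ge\; 2 .

% Peak height of a normal density with standard deviation \sigma:
\max_x f(x) = \frac{1}{\sigma\sqrt{2\pi}}
\approx
\begin{cases}
0.80 & \text{for } \sigma = 0.5,\\
3.99 & \text{for } \sigma = 0.1 .
\end{cases}
```

So the sd=.5 sample produces a curve that peaks below 1, while the sd=.1 sample squeezes the same total area into a narrower range and must rise above 1.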
On Mon, 2009-03-02 at 13:27 +0100, Johannes Elias wrote:
> How to interpret density values over 1?

Johannes,

Density is not the same as probability. In a histogram drawn with density, it is the area of each bar that equals the probability. In your example

set.seed(123)
hist(rnorm(1000,sd=.1),freq=FALSE)

the interval from -0.05 to 0 has density = 4, but the probability of a number falling in this interval is 4*.05 = .2. In fact:

set.seed(123)
hist(rnorm(1000,sd=.1),freq=FALSE)$density
 [1] 0.09999998 0.28000000 0.94000000 1.98000000 2.60000000 4.00000000
 [7] 4.04000000 2.92000000 1.66000000 0.92000000 0.44000000 0.10000000
[13] 0.02000000

set.seed(123)
sum(hist(rnorm(1000,sd=.1),freq=FALSE)$density*.05)
[1] 1

So the sum of the probabilities is 1, but the sum of the densities is 20 (each bin is 0.05 wide, and 20*.05 = 1).

--
Bernardo Rangel Tura, M.D,MPH,Ph.D
National Institute of Cardiology
Brazil
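The same check can be done without hard-coding the bin width. The sketch below assumes only base R: hist is called with plot = FALSE so that it returns the histogram object without drawing, and the variable names (h, widths, probs) are illustrative choices.

```r
set.seed(123)                  # same seed as in the message above
x <- rnorm(1000, sd = 0.1)

# plot = FALSE returns the histogram object without drawing it.
h <- hist(x, plot = FALSE)

widths <- diff(h$breaks)       # width of each bin
probs  <- h$density * widths   # bar areas = proportion of the data in each bin

print(max(h$density))          # a density value, which may exceed 1
print(sum(h$density))          # the densities themselves need not sum to 1
print(sum(probs))              # the bar areas sum to 1
```

Using diff(h$breaks) rather than a fixed .05 keeps the calculation valid if hist picks different breaks for other data.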