I am using density() to plot density curves. However, one of my variables is truncated at zero and has most of its density near zero. I would like to know how to plot this with the density function.

The problem is that if I call density() the regular way, values near zero automatically get a very low density estimate, because there are no observed values below zero. Furthermore, the estimate puts some density below zero, although there are no observed values there.

This illustrates the problem:

    mydata <- rnorm(100000)
    mydata <- mydata[mydata > 0]
    plot(density(mydata))

The 'real' density is exactly the right half of a normal distribution, i.e. truncated at zero. However, with the default options the curve falls off smoothly on the left, with some density below zero. This is pretty confusing for the reader. I have tried decreasing bw, which masks (but does not fix) some of the problem, but then the rest of the curve also loses smoothness. I would like to make a plot of this data that looks like the right half of a normal distribution, while keeping the curve relatively smooth.

Is there any way to specify this truncation in the density function, so that it will only use the positive domain to calculate the density?

--
View this message in context: http://www.nabble.com/plotting-density-for-truncated-distribution-tp20684995p20684995.html
Sent from the R help mailing list archive at Nabble.com.
Default kernel density estimation is poorly suited to this sort of situation. A better alternative is logspline -- see the eponymous package -- where you can specify a lower limit for the distribution as an option.

url: www.econ.uiuc.edu/~roger
Roger Koenker
email: rkoenker at uiuc.edu
Department of Economics
University of Illinois
Champaign, IL 61820
vox: 217-333-4558
fax: 217-244-6678

On Nov 25, 2008, at 10:43 AM, Jeroen Ooms wrote:
> Is there any way to specify this truncation in the density function,
> so that it will only use the positive domain to calculate density?
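For what it's worth, the logspline suggestion might look roughly like the sketch below. It assumes the CRAN package logspline is installed, and it uses that package's logspline() with the lbound argument and dlogspline() to evaluate the fit. The sample size, the seed, and the evaluation grid are arbitrary choices made for illustration, not something from the original posts.

```r
# Sketch only: assumes install.packages("logspline") has been run.
library(logspline)

set.seed(42)                       # arbitrary seed, for reproducibility
z <- rnorm(5000)
x <- z[z > 0]                      # half-normal sample, truncated at zero

# Tell logspline that the support starts at zero.
fit <- logspline(x, lbound = 0)

# Evaluate the fitted density on a grid and compare with the true
# half-normal density, 2 * dnorm(x) for x >= 0.
grid <- seq(0, 4, length.out = 200)
est  <- dlogspline(grid, fit)

plot(grid, est, type = "l", xlab = "x", ylab = "density")
lines(grid, 2 * dnorm(grid), lty = 2)   # dashed: true density
```

Because the lower bound is part of the model, no mass is placed below zero and the curve does not have to dip towards zero at the boundary.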
Another option:

    mydata <- rnorm(100000)
    mydata <- mydata[mydata > 0]
    plot(density(c(mydata, -mydata), from = 0))

If you want the area under the curve to be one, you'll need to double the density estimate:

    dx <- density(c(mydata, -mydata), from = 0)
    dx$y <- dx$y * 2
    plot(dx)

Chris

Jeroen Ooms wrote:
> Is there any way to specify this truncation in the density function, so
> that it will only use the positive domain to calculate density?
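A quick way to convince yourself that the reflection trick behaves as intended is to compare the doubled estimate with the known half-normal density, 2 * dnorm(x). The sketch below is only a check under the assumptions of the example (standard normal data truncated at zero); the seed and the trapezoid-rule area calculation are additions for illustration.

```r
set.seed(1)                         # arbitrary seed, for reproducibility
mydata <- rnorm(100000)
mydata <- mydata[mydata > 0]

# Reflect the data around zero, estimate on [0, Inf), then double the
# estimate so that it integrates to one over the positive half-line.
dx <- density(c(mydata, -mydata), from = 0)
dx$y <- dx$y * 2

# True density of the truncated variable on the same grid.
true_y <- 2 * dnorm(dx$x)

# Area under the doubled estimate (trapezoid rule); should be close to 1.
area <- sum(diff(dx$x) * (head(dx$y, -1) + tail(dx$y, -1)) / 2)

plot(dx, main = "Reflected density estimate")
lines(dx$x, true_y, lty = 2)        # dashed: true half-normal density
```

Note that reflection forces the estimate to be flat (zero slope) at the boundary. That suits the half-normal case here, but it may not suit data whose true density is steeply sloped at zero.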
Thank you, both solutions are really helpful!