Hello,

I would like some details and explanations about the results I get.
I start with a uniform sample between -1 and 1, and then plot its density.
My problem is that the range of the density is much wider than I expected:

samp <- runif(10000,-1,1)
plot(density(samp))

Instead of varying between -1 and 1, the density varies between approximately -1.5 and 1.5.
Could someone explain to me what is happening? Maybe some arguments for density estimation need to be set?

Waiting for an answer,
Thanks in advance

Isabelle.

Isabelle Zabalza-Mezghani, PhD
IFP - Research Reservoir Engineer
Hi,

| From: ZABALZA-MEZGHANI Isabelle <Isabelle.ZABALZA-MEZGHANI at ifp.fr>
| Date: Tue, 8 Apr 2003 10:21:45 +0200
|
| Hello,
|
| I would like some details and explanations about the results I get.
| I start with a uniform sample between -1 and 1, and then plot its
| density.
| My problem is that the range of the density is much wider than I expected:
|
| samp <- runif(10000,-1,1)
| plot(density(samp))
|
| Instead of varying between -1 and 1, the density varies between approximately
| -1.5 and 1.5

Using the default bandwidth, the estimated density is visibly positive on an interval of about (-1.3, 1.3); inside the data range its value is around 0.5. I guess you should try changing the bandwidth. Try

> plot(density(samp, bw=0.1))
> lines(density(samp, bw=0.03), col=2)
> lines(density(samp, bw=0.01), col=3)

Best wishes,
Ott

| Could someone explain to me what is happening? Maybe some arguments for
| density estimation need to be set?
|
| Isabelle.
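The width of the plotted interval can be worked out from the defaults of density(). The following is a minimal sketch, assuming the documented defaults (Gaussian kernel, bw = "nrd0", cut = 3); the seed and the variable names bw_default and expected_range are only illustrative.

```r
# Sketch: where does the plotted range of density() come from?
set.seed(1)                      # arbitrary seed, only for reproducibility
samp <- runif(10000, -1, 1)

d <- density(samp)               # defaults: Gaussian kernel, bw = "nrd0", cut = 3

# Default bandwidth: the rule of thumb implemented in bw.nrd0()
bw_default <- bw.nrd0(samp)

# density() evaluates the estimate from min(x) - cut*bw to max(x) + cut*bw
expected_range <- range(samp) + c(-3, 3) * bw_default

print(bw_default)                # roughly 0.08 for a sample like this one
print(range(d$x))                # roughly -1.25 to 1.25
print(expected_range)            # same interval, computed by hand

# A smaller bandwidth shrinks the overhang beyond the data
print(range(density(samp, bw = 0.01)$x))   # roughly -1.03 to 1.03
```

A smaller bw narrows the overhang but makes the estimate noisier inside [-1, 1], so the choice is a trade-off rather than a fix.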
On Tue, 8 Apr 2003, ZABALZA-MEZGHANI Isabelle wrote:

> Hello,
>
> I would like some details and explanations about the results I get.
> I start with a uniform sample between -1 and 1, and then plot its
> density.
> My problem is that the range of the density is much wider than I expected:
>
> samp <- runif(10000,-1,1)
> plot(density(samp))
>
> Instead of varying between -1 and 1, the density varies between approximately
> -1.5 and 1.5
> Could someone explain to me what is happening? Maybe some arguments for
> density estimation need to be set?
>

Try:

> samp <- runif(10000,-1,1)
> range(samp)
[1] -0.9995812  0.9996801

(for this samp)

> plot(density(samp), ylim=c(0,0.6))
> abline(v=c(-1,1))
> lines(density(samp, cut=0), col="green")
> lines(density(samp, from=-1, to=1), col="red")

So you can add arguments to density() (see help(density)), but they will not change the fact that, for the chosen bandwidth and kernel, the kernel extends outside the data range. Does:

> samp1 <- runif(10000,-1.5,1.5)
> plot(density(samp1, from=-1, to=1), ylim=c(0,0.6))
> abline(v=c(-1,1))

"look" "better"?

Roger

> Waiting for an answer,
> Thanks in advance
>
> Isabelle.
>
> Isabelle Zabalza-Mezghani, PhD
> IFP - Research Reservoir Engineer
>

--
Roger Bivand
Economic Geography Section, Department of Economics,
Norwegian School of Economics and Business Administration,
Breiviksveien 40, N-5045 Bergen, Norway.
voice: +47 55 95 93 55; fax +47 55 95 93 93
e-mail: Roger.Bivand at nhh.no
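The point that from, to and cut change only where the estimate is evaluated, not the estimate itself, can be checked numerically. This is a rough sketch, assuming the default Gaussian kernel; trapz is a small helper written here for illustration and is not part of base R.

```r
# Sketch: clipping the evaluation range does not remove the kernel overhang
set.seed(1)                                 # arbitrary seed for reproducibility
samp <- runif(10000, -1, 1)

d_full <- density(samp)                     # default grid, extends past the data
d_clip <- density(samp, from = -1, to = 1)  # same estimate, evaluated on [-1, 1] only

# Trapezoidal rule on the (x, y) grid returned by density() (illustrative helper)
trapz <- function(x, y) sum(diff(x) * (y[-1] + y[-length(y)]) / 2)

mass_total  <- trapz(d_full$x, d_full$y)    # close to 1
mass_inside <- trapz(d_clip$x, d_clip$y)    # below 1: some mass sits outside [-1, 1]
edge_height <- d_clip$y[c(1, length(d_clip$y))]  # estimate at x = -1 and x = 1

print(c(bw_full = d_full$bw, bw_clip = d_clip$bw))  # identical bandwidths
print(c(total = mass_total, inside = mass_inside))
print(edge_height)                          # near 0.25, about half the interior height
```

The clipped curve therefore integrates to slightly less than 1 and still drops to about half its interior height at the end points, which is why it may not look satisfactory there.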
On 08-Apr-03 ZABALZA-MEZGHANI Isabelle wrote:

> samp <- runif(10000,-1,1)
> plot(density(samp))
>
> Instead of varying between -1 and 1, the density varies between
> approximately -1.5 and 1.5
> Could someone explain to me what is happening? Maybe some arguments for
> density estimation need to be set?

density() computes a kernel-density estimate of the density, i.e. it replaces each observation by a distribution ("kernel") which is spread out over a certain width on either side of it, and sums these contributions. Therefore, observations near the ends of the range [-1,1] are replaced by distributions which extend beyond the range, with the result you have seen.

There are options to density() which can limit the plotted estimate to the range [-1,1]: try

  plot(density(samp,from=-1,to=1))

or

  plot(density(samp,cut=0))

(which both seem to give the same result), though you may not think that the result looks satisfactory at the ends.

Ideally, for this sort of problem it should be possible to make the width of the kernel depend on the position (rank) of the observation it is applied to. For a uniform distribution in particular, the variance of an order statistic is strongly dependent on its rank: for a sample of n, the median over [-1,1] has variance 1/(n+2), while the min or the max has variance 4n/((n+2)*(n+1)^2), which is approximately 4/(n^2). If you know that a sample is from a uniform distribution, the end-points are very precisely estimated from the extremes of the sample, and a fixed-width kernel-density estimate will not do justice to this.

I don't know whether this is possible directly with current R functions (though one can always write one which does it).

I hope this helps,
Ted.

--------------------------------------------------------------------
E-Mail: (Ted Harding) <Ted.Harding at nessie.mcc.ac.uk>
Fax-to-email: +44 (0)870 167 1972
Date: 08-Apr-03 Time: 10:40:50
------------------------------ XFMail ------------------------------
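Both points in the message above can be checked numerically. The sketch below first writes the kernel estimate out as an average of Gaussian bumps and compares it with density(), then simulates the variances of the sample median and minimum. The helper kde_by_hand, the sample size n = 25 (odd, so the median is a single order statistic) and the number of replications are assumptions chosen for illustration.

```r
set.seed(1)                           # arbitrary seed for reproducibility
samp <- runif(10000, -1, 1)
bw   <- bw.nrd0(samp)                 # same default bandwidth density() uses

# Part 1: the kernel estimate as an average of Gaussian bumps, one per observation
kde_by_hand <- function(at, x, bw) {
  sapply(at, function(a) mean(dnorm(a, mean = x, sd = bw)))
}
grid    <- seq(-1.2, 1.2, by = 0.1)   # includes points outside the data range
by_hand <- kde_by_hand(grid, samp, bw)
d       <- density(samp, bw = bw)
builtin <- approx(d$x, d$y, xout = grid)$y   # density() interpolated onto the grid
print(max(abs(by_hand - builtin)))    # small: density() uses a binned approximation

# Part 2: variance of order statistics for Uniform(-1, 1), checked by simulation
n    <- 25                            # odd sample size (assumption)
reps <- 20000                         # number of simulated samples (assumption)
sims <- matrix(runif(n * reps, -1, 1), nrow = n)   # one sample per column

var_median_sim    <- var(apply(sims, 2, median))
var_min_sim       <- var(apply(sims, 2, min))
var_median_theory <- 1 / (n + 2)
var_min_theory    <- 4 * n / ((n + 2) * (n + 1)^2)

print(c(simulated = var_median_sim, theory = var_median_theory))
print(c(simulated = var_min_sim,    theory = var_min_theory))
```

The hand-written estimate is positive at the grid points beyond -1 and 1, which is the overhang seen in the plot, and the simulated variance of the minimum comes out much smaller than that of the median, as the formulas predict.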