thr3ads.net - R help - [R] Density Estimation [Jun 2006]


Pedro Ramirez

2006-Jun-07 17:00 UTC

[R] Density Estimation

Dear R-list,

I have made a simple kernel density estimation by

x <- c(2,1,3,2,3,0,4,5,10,11,12,11,10)
kde <- density(x,n=100)

Now I would like to know the estimated probability that a
new observation falls into the interval 0<x<3.

How can I integrate over the corresponding interval?
In several R packages for kernel density estimation I did
not find a corresponding function. I could apply
Simpson's Rule for integrating, but perhaps somebody
knows a better solution.

Thanks a lot for help!

Pedro

_________

Greg Snow

2006-Jun-07 17:21 UTC

Not a direct answer to your question, but if you use a logspline density
estimate rather than a kernel density estimate then the logspline
package will help you: it has built-in functions dlogspline,
qlogspline, and plogspline that do the integrals for you.

If you want to stick with the KDE, then you could find the area under
each of the kernels for the range you are interested in (you need to work
out the standard deviation used from the bandwidth, then use pnorm for
the default Gaussian kernel), then just sum the individual areas.
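
A minimal sketch of the kernel-summing approach described above, assuming the default Gaussian kernel, for which the bw component returned by density() is the kernel's standard deviation. The helper name kde_prob is made up for this illustration; it averages the kernel areas so that the result is a probability.

```r
x <- c(2, 1, 3, 2, 3, 0, 4, 5, 10, 11, 12, 11, 10)
kde <- density(x, n = 100)

# Hypothetical helper (not part of any package): probability mass that a
# Gaussian KDE with bandwidth bw places on the interval (a, b).
kde_prob <- function(a, b, data, bw) {
  # Each observation contributes a normal kernel centred at that
  # observation with standard deviation bw; average the kernel areas.
  mean(pnorm(b, mean = data, sd = bw) - pnorm(a, mean = data, sd = bw))
}

p <- kde_prob(0, 3, x, kde$bw)
print(p)
```

Because pnorm is evaluated on the whole real line, kernels that extend past the ends of the interval are handled automatically; no grid or numerical integration is involved.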

Hope this helps,

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.snow at intermountainmail.org
(801) 408-8111
 

______________________________________________
R-help at stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide!
http://www.R-project.org/posting-guide.html

Rolf Turner

2006-Jun-07 17:48 UTC

Pedro wrote:
> I have made a simple kernel density estimation by
> 
> x <- c(2,1,3,2,3,0,4,5,10,11,12,11,10)
> kde <- density(x,n=100)
> 
> Now I would like to know the estimated probability that a
> new observation falls into the interval 0<x<3.
> 
> How can I integrate over the corresponding interval?
> In several R packages for kernel density estimation I did
> not find a corresponding function. I could apply
> Simpson's Rule for integrating, but perhaps somebody
> knows a better solution.
	One possibility is to use splinefun():

	> spiffy <- splinefun(kde$x,kde$y)
	> integrate(spiffy,0,3)
	0.2353400 with absolute error < 2e-09
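
For reference, a self-contained script version of the splinefun() approach, with one caveat made explicit: the spline only interpolates the density inside the grid returned by density(), so the integration limits should lie within range(kde$x). The check on the limits and the trapezoidal sanity check are additions for illustration, not part of the original post.

```r
x <- c(2, 1, 3, 2, 3, 0, 4, 5, 10, 11, 12, 11, 10)
kde <- density(x, n = 100)

# Interpolate the gridded density estimate with a cubic spline.
spiffy <- splinefun(kde$x, kde$y)

a <- 0
b <- 3
# Outside the grid the spline extrapolates, which says nothing about the KDE.
stopifnot(a >= min(kde$x), b <= max(kde$x))

res <- integrate(spiffy, a, b)
print(res$value)

# Sanity check (trapezoidal rule on the grid): the estimate should
# integrate to roughly 1 over the whole grid.
total <- sum(diff(kde$x) * (head(kde$y, -1) + tail(kde$y, -1)) / 2)
print(total)
```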

		cheers,

			Rolf Turner
			rolf at math.unb.ca

Pedro Ramirez

2006-Jun-07 17:54 UTC

>Not a direct answer to your question, but if you use a logspline density
>estimate rather than a kernel density estimate then the logspline
>package will help you and it has built in functions for dlogspline,
>qlogspline, and plogspline that do the integrals for you.
>
>If you want to stick with the KDE, then you could find the area under
>each of the kernels for the range you are interested in (need to work
>out the standard deviation used from the bandwidth, then use pnorm for
>the default gaussian kernel), then just sum the individual areas.
>
>Hope this helps,
Thanks a lot for your quick help! I think I will follow your first
suggestion (logspline density estimation) rather than summing over the
kernel areas, because truncated kernel areas can occur at the boundaries
of the range, so I think it is easier to do it with logsplines. Thanks
again for your help!

Pedro



Pedro Ramirez

2006-Jun-08 18:31 UTC

>In mathematical terms the optimal bandwidth for density estimation
>decreases at rate n^{-1/5}, while the one for the distribution function
>decreases at rate n^{-1/3}, if n is the sample size. In practical terms,
>one must choose an appreciably smaller bandwidth in the second case
>than in the first one.
Thanks a lot for your remark! I was not aware of the fact that the
optimal bandwidths for density and distribution do not decrease
at the same rate.
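
To get a feel for the two rates quoted above, here is a small sketch that tabulates them for a few sample sizes. The constants in front of the rates are not given in the thread, so only the ratio of the two rates is meaningful; the sample sizes are arbitrary choices for illustration.

```r
# Relative size of the two optimal-bandwidth rates quoted above.
# The multiplicative constants are unknown here, so only the ratio
# of the rates carries information.
n <- c(13, 100, 1000, 10000)
rate_density <- n^(-1/5)            # rate for density estimation
rate_cdf     <- n^(-1/3)            # rate for distribution function estimation
ratio <- rate_cdf / rate_density    # algebraically equal to n^(-2/15)
print(data.frame(n, rate_density, rate_cdf, ratio))
```

The ratio shrinks slowly with n, which is the sense in which the bandwidth for the distribution function becomes smaller relative to the one for the density as the sample grows.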
>Besides the computational aspect, there is a statistical one:
>the optimal choice of bandwidth for estimating the density function
>is not optimal (and possibly not even just sensible) for estimating
>the distribution function, and the stated problem is equivalent to
>estimation of the distribution function.
The given interval "0<x<3" was only an example, in fact I would
like to estimate the probability for intervals such as

"0<=x<1" , "1<=x<2" , "2<=x<3" ,
"3<=x<4" , ....

and compare it with the estimates of a corresponding histogram.
In this case the stated problem is no longer equivalent to the
estimation of the distribution function. What do you think, can
I go ahead in this case with the optimal bandwidth for the
density? Thanks a lot for your help!
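
The comparison described above could be sketched as follows. This does not settle the bandwidth question; it simply assumes the Gaussian kernel and the default bandwidth rule of density() (bw.nrd0), and the break points 0:13 are an assumption chosen to cover the example data.

```r
x <- c(2, 1, 3, 2, 3, 0, 4, 5, 10, 11, 12, 11, 10)
bw <- bw.nrd0(x)            # default bandwidth rule used by density()
breaks <- 0:13              # unit-width bins from 0 to 13 (assumed range)

# KDE-based probability of each bin: evaluate the KDE's distribution
# function (average of the kernel CDFs) at each break, then difference.
kde_cdf <- sapply(breaks, function(q) mean(pnorm(q, mean = x, sd = bw)))
kde_bin <- diff(kde_cdf)

# Histogram relative frequencies on the same left-closed bins.
h <- hist(x, breaks = breaks, right = FALSE, plot = FALSE)
hist_bin <- h$counts / length(x)

print(round(cbind(lower = head(breaks, -1), kde = kde_bin, hist = hist_bin), 3))
```

Note that the kde column sums to less than 1, because the Gaussian kernels place some mass below 0 and above 13, whereas the histogram frequencies sum to exactly 1.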

Best wishes
Pedro



>best wishes,
>
>Adelchi
