thr3ads.net - R help - [R] Density Estimation [Sep 2004]


Brian Mac Namee

2004-Sep-15 11:36 UTC

[R] Density Estimation

Hi there,

Sorry if this is a rather long post. I have a simple list of single
feature data points from which I would like to generate a probability
that an unseen point comes from the same distribution. To do this I am
trying to estimate the probability density of the list of points and
use this to generate a probability for the new unseen points. I have
managed to use the R density function to generate the density estimate
but have not been able to do anything with this - i.e. generate a
probability that a new point comes from the same distribution. Is
there a function to do this, or am I way off the mark using the
density function at all?

Thanks in advance,

Brian.

Vito Ricci

2004-Sep-15 12:53 UTC


[R] Density Estimation

Dear Brian,

I suggest using the density() function to get an
estimate of the pdf you are looking for (I assume it is
unknown). Then you can plot the result returned by
density() using plot(). This gives you a graphical
representation of your unknown pdf. From its shape
you can try to work out what kind of pdf it might be
(normal, gamma, Weibull, etc.).
After that you can estimate the parameters of the pdf
from your data with LS or ML methods.
Then you can calculate the goodness of fit for each
candidate pdf and use the best one.

I hope this is of some help.

Cordially
Vito Ricci
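
The parametric route described above could be sketched roughly as follows. This is only an illustration under assumptions: the data are simulated stand-ins, the three candidate families are an arbitrary choice, and fitdistr() from the MASS package is one of several ways to get ML estimates.

```r
library(MASS)  # recommended package shipped with standard R installations

set.seed(1)
x <- rgamma(200, shape = 2, rate = 1)  # stand-in for the observed data (assumption)

# Fit each candidate family by maximum likelihood.
# fitdistr() picks starting values itself for these three families.
candidates <- c("normal", "gamma", "weibull")
fits <- lapply(candidates, function(fam) suppressWarnings(fitdistr(x, fam)))
names(fits) <- candidates

# Compare the fits: a lower AIC means a better trade-off of fit and complexity.
aic <- sapply(fits, AIC)
print(aic)
best <- names(which.min(aic))

# Goodness of fit of the gamma candidate. The p-value is only approximate
# because the parameters were estimated from the same data.
est <- fits[["gamma"]]$estimate
gof <- ks.test(x, "pgamma", shape = est["shape"], rate = est["rate"])
print(gof$p.value)
```

Once a family has been chosen, its d/p/q/r functions (for example dgamma and pgamma) can be used on new points with the fitted parameters.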


Bob Wheeler

2004-Sep-15 13:54 UTC


[R] Density Estimation

Try fitting it with a Johnson function -- see SuppDists. If you can fit 
it you will then be able to use the functions in SuppDists just as you 
can for any other distribution supported by R.


-- 
Bob Wheeler --- http://www.bobwheeler.com/
         ECHIP, Inc. ---
Randomness comes in bunches.

Wolski

2004-Sep-15 14:00 UTC


[R] Density Estimation

Hi!

The function density returns an object of class "density".
This object has x and y components, which you can access as dd$x and dd$y.

Use approx and runif.

E.g.:

dd<-density(rnorm(100,3,5))
plot(dd)

Using the function approx (see ?approx) you can compute the density value for any x.
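# x is a dummy argument so that the function can be used with sapply.
mydist<-function(x,dd)
{
	ymax <- max(dd$y)	# upper bound of the estimated density
	repeat
	{
		# propose a point uniformly over the range of the density grid
		tmp <- runif(1,min=min(dd$x),max=max(dd$x))
		# interpolate the estimated density at the proposed point
		lev <- approx(dd$x,dd$y,tmp)$y
		# accept with probability lev/ymax (rejection sampling)
		if(runif(1,0,ymax) <= lev)
		{
			return(tmp)
		}
	}
}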

x <- 0
mydist(x,dd)

res<-rep(0,500)
res<-sapply(res,mydist,dd)
lines(density(res),col=2)


/E.
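
For the original question (a value for a new, unseen point), the same interpolation idea can be wrapped in a function. The following is a minimal sketch, assuming simulated data and made-up new points; approxfun is the function-returning counterpart of approx.

```r
set.seed(1)
x  <- rnorm(100, 3, 5)       # stand-in for the observed data (assumption)
dd <- density(x)

# Interpolating function for the estimated density.
# Outside the grid it returns 0 instead of NA (yleft / yright).
dens_at <- approxfun(dd$x, dd$y, yleft = 0, yright = 0)

new_points <- c(3, 15, 40)   # hypothetical unseen points
print(dens_at(new_points))

# A density value is not a probability. A probability needs an interval,
# for example P(0 < X < 6) under the estimated density:
p_int <- integrate(dens_at, lower = 0, upper = 6)$value
print(p_int)
```

The density values can be compared between points (higher means more typical of the old data), but only the integral over an interval is a probability.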





Dipl. bio-chem. Witold Eryk Wolski             @         MPI-Moleculare Genetic
Ihnestrasse 63-73 14195 Berlin                'v'    
tel: 0049-30-83875219                        /   \       
mail: witek96 at users.sourceforge.net        ---W-W----   
http://www.molgen.mpg.de/~wolski
      wolski at molgen.mpg.de

(Ted Harding)

2004-Sep-15 14:07 UTC


[R] Density Estimation

On 15-Sep-04 Brian Mac Namee wrote:
> Sorry if this is a rather long post. I have a simple list of single
> feature data points from which I would like to generate a probability
> that an unseen point comes from the same distribution. To do this I am
> trying to estimate the probability density of the list of points and
> use this to generate a probability for the new unseen points. I have
> managed to use the R density function to generate the density estimate
> but have not been able to do anything with this - i.e. generate a
> probability that a new point comes from the same distribution. Is
> there a function to do this, or am I way off the mark using the
> density function at all?

It's not clear what you're really after, but it looks as though you
may be wanting to sample from the distribution estimated by 'density'.

A possible approach, which you could refine, is exemplified by

  x<-rnorm(1000)
  d<-density(x,n=4096)
  y<-sample(d$x,size=1000,replace=TRUE,prob=d$y)

Check performance with

  hist(y)

Looks OK to me! See "?density" and "?sample".

On an alternative interpretation, perhaps you want to first estimate
the density based on data you already have, and then when you have
got further data (but these would then be "seen" and not "unseen")
come to a judgement about whether these new points are compatible
with coming from the distribution you have estimated.

A possible approach to this question (again susceptible to refinement)
would be as follows.

1. Use a fine-grained grid for 'density', i.e. a large value for "n".

2. Replace each of the points in the new data by the nearest point
   in this grid. Call these values z1, z2, ... , zk corresponding
   to index values i1, i2, ... , ik in d$x.

3. Evaluate the probability P(z1,...,zk) from the density as the
   product of d$y[i] where i<-c(i1,...,ik).
   Better still, evaluate the logarithm of this. Call the result L.

4. Now simulate a large number of draws of k values from d on the
   lines of sample(d$x,size=k,prob=d$y) as above, and evaluate L
   for each  of these. Where is the value of L from (3) situated in
   the distribution of these values of L from (4)? If (say) only
   1 per cent of the simulated values of L from "d" are less than
   the value of L from (3), then you have a basis for a test that
   your new data did not come from the distribution you have estimated
   from your old data, in that the new data are from the low-density
   part of the estimated distribution.
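
Steps 1 to 4 could be sketched along these lines. The data, the new points and the number of simulations are assumptions made purely for illustration, and loglik is a hypothetical helper name.

```r
set.seed(1)
old <- rnorm(1000)               # stand-in for the old data (assumption)
new <- c(2.5, 3.0, 2.8)          # hypothetical new points, k = 3
d   <- density(old, n = 4096)    # step 1: fine-grained grid

# Steps 2 and 3: snap each point to the nearest grid value and
# sum the log density values at those grid indices.
loglik <- function(z, d) {
  idx <- sapply(z, function(zi) which.min(abs(d$x - zi)))
  sum(log(d$y[idx]))
}
L_obs <- loglik(new, d)

# Step 4: simulate many draws of k values from the estimated density.
# replace = TRUE keeps the k draws independent of each other.
k <- length(new)
L_sim <- replicate(2000,
  loglik(sample(d$x, size = k, replace = TRUE, prob = d$y), d))

# Proportion of simulated L values at or below the observed L.
p_value <- mean(L_sim <= L_obs)
print(p_value)
```

A small proportion says the new points sit in the low-density part of the estimated distribution.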

There are of course alternative ways to view this question. The
value of "k" is relevant. In particular, if "k" is small (say 3
or 4) then the suggestion in (4) is probably the best way to
approach it. However, if "k" is large then you can use a test on
the lines of Kolmogorov-Smirnov with the reference distribution
estimated as the cumulative distribution of d$y and the distribution
being tested as the empirical cumulative distribution of your new
data.
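
A rough sketch of the Kolmogorov-Smirnov variant, assuming simulated old and new data. The reference CDF is built by cumulating d$y over the grid, which is one simple way to do it and not the only one.

```r
set.seed(1)
old <- rnorm(1000)               # stand-in for the old data (assumption)
new <- rnorm(200, mean = 0.5)    # hypothetical new sample, large k
d   <- density(old, n = 4096)

# Cumulative distribution of the density estimate on its grid,
# normalised so that it ends exactly at 1.
cdf_grid <- cumsum(d$y) / sum(d$y)
ref_cdf  <- approxfun(d$x, cdf_grid, yleft = 0, yright = 1)

# One-sample Kolmogorov-Smirnov test against the estimated reference CDF.
ks <- ks.test(new, ref_cdf)
print(ks$statistic)
print(ks$p.value)
```

Because the reference distribution is itself estimated from data, the reported p-value should be read as approximate.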

Even sharper focus is available if you are in a position to make
a parametric model for your data, but your description does not
suggest that this is the case.

Best wishes,
Ted.

--------------------------------------------------------------------
E-Mail: (Ted Harding) <Ted.Harding at nessie.mcc.ac.uk>
Fax-to-email: +44 (0)870 167 1972
Date: 15-Sep-04                                       Time: 15:07:33
------------------------------ XFMail ------------------------------
