Johannes Radinger
2015-Mar-21 14:27 UTC
[R] Optimization to fit data to custom density distribution
Thanks for the fast response. The fitdistr() function works well for the
predefined density functions. However, what is the recommended approach to
optimize/fit a density function described by two superimposed normal
distributions? In my case it is N1(mean=0,sd1)*p+N2(mean=0,sd2)*(1-p).
With fitdistr one can only choose among the 15 distributions. Probably this
needs an approach using optim()? However, I am so far unfamiliar with these
packages. So any suggestion is welcome. :)

/Johannes

On Sat, Mar 21, 2015 at 2:16 PM, Prof Brian Ripley <ripley at stats.ox.ac.uk> wrote:

> One way using the standard R distribution:
>
>     library(MASS)
>     ?fitdistr
>
> No optimization is needed to fit a normal distribution, though.
>
> On 21/03/2015 13:05, Johannes Radinger wrote:
>
>> Hi,
>>
>> I am looking for a way to fit data (vector of values) to a density
>> function using an optimization (ordinary least squares or maximum
>> likelihood fit). For example, if I have a vector of 100 values
>> generated with rnorm:
>>
>>     rnorm(n=100,mean=500,sd=50)
>>
>> How can I fit these data to a Gaussian density function to extract the
>> mean and sd value of the underlying normal distribution? So the result
>> should roughly meet the parameters of the normal distribution used to
>> generate the data. The results will ideally be closer to the true
>> parameters the more data (n) are used to optimize the density function.
>
> That's a concept called 'consistency' from the statistical theory of
> estimation. If you skipped that course, time to read up (but it is
> off-topic here).
>
> --
> Brian D. Ripley, ripley at stats.ox.ac.uk
> Emeritus Professor of Applied Statistics, University of Oxford
> 1 South Parks Road, Oxford OX1 3TG, UK
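The quoted remark that no optimization is needed for a normal distribution can be illustrated with a minimal sketch. It assumes only that the MASS package is installed; the seed and the sample sizes are arbitrary choices, not values from the thread. The maximum likelihood estimates of a normal distribution have a closed form (the sample mean, and the standard deviation computed with divisor n), and fitdistr() with densfun = "normal" reports those same values.

```r
library(MASS)  # provides fitdistr()

set.seed(1)  # arbitrary seed, only for reproducibility
x <- rnorm(n = 100, mean = 500, sd = 50)

# Closed-form maximum likelihood estimates for a normal distribution
n <- length(x)
mu_hat <- mean(x)
sd_hat <- sqrt(sum((x - mu_hat)^2) / n)  # divisor n, not n - 1

# fitdistr() gives the same estimates for densfun = "normal"
fit <- fitdistr(x, "normal")
print(fit)
print(c(mean = mu_hat, sd = sd_hat))

# Consistency: the estimates tend to move closer to the true values
# (mean = 500, sd = 50) as the sample size grows
est_by_n <- sapply(c(100, 10000, 1000000), function(n) {
  fitdistr(rnorm(n = n, mean = 500, sd = 50), "normal")$estimate
})
colnames(est_by_n) <- c("n=100", "n=1e4", "n=1e6")
print(est_by_n)
```

The standard errors printed by fitdistr() shrink roughly with the square root of n, which is the practical face of the consistency property mentioned in the quoted reply.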
Prof Brian Ripley
2015-Mar-21 14:41 UTC
[R] Optimization to fit data to custom density distribution
On 21/03/2015 14:27, Johannes Radinger wrote:

> Thanks for the fast response. The fitdistr() function works well for the
> predefined density functions. However, what is the recommended approach
> to optimize/fit a density function described by two superimposed normal
> distributions? In my case it is N1(mean=0,sd1)*p+N2(mean=0,sd2)*(1-p).
> With fitdistr one can only choose among the 15 distributions. Probably

That is simply not true. The help says

    densfun: Either a character string or a function returning a density
             evaluated at its first argument.

and the second alternative is used in the examples.

> this needs an approach using optim()? However I am so far unfamiliar
> with these packages. So any suggestion is welcome. :)

There are examples of that in MASS (the book), chapter 16.

--
Brian D. Ripley, ripley at stats.ox.ac.uk
Emeritus Professor of Applied Statistics, University of Oxford
1 South Parks Road, Oxford OX1 3TG, UK
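For readers who want to see the optim() route mentioned above, here is a minimal sketch. It is not taken from the MASS book. The function name negloglik, the seed, the starting values and the log/logit reparametrisation are all assumptions made for illustration. The idea is to minimise the negative log-likelihood of the mixture p*N(0,sd1) + (1-p)*N(0,sd2) directly, with parameters transformed so that the optimizer can work on an unconstrained scale.

```r
set.seed(42)  # arbitrary seed, only for reproducibility
x <- c(rnorm(n = 70, mean = 0, sd = 50), rnorm(n = 30, mean = 0, sd = 500))

# Negative log-likelihood of p * N(0, sd1) + (1 - p) * N(0, sd2).
# par = c(log(sd1), log(sd2), qlogis(p)), so every real vector maps to
# valid parameters (sd1 > 0, sd2 > 0, 0 < p < 1).
negloglik <- function(par, x) {
  sd1 <- exp(par[1])
  sd2 <- exp(par[2])
  p <- plogis(par[3])
  dens <- p * dnorm(x, mean = 0, sd = sd1) +
    (1 - p) * dnorm(x, mean = 0, sd = sd2)
  -sum(log(dens))
}

# Starting values on the transformed scale (sd1 = 30, sd2 = 300, p = 0.5)
start <- c(log(30), log(300), qlogis(0.5))
opt <- optim(par = start, fn = negloglik, x = x, method = "BFGS")

# Back-transform to the original parameter scale
est <- c(sd1 = exp(opt$par[1]), sd2 = exp(opt$par[2]), p = plogis(opt$par[3]))
print(est)
print(opt$convergence)  # 0 means optim() reports successful convergence
```

The reparametrisation is one design choice among several; box constraints with method = "L-BFGS-B" on the original scale are an alternative. Note that the two components are interchangeable, so swapping sd1 and sd2 while replacing p by 1 - p gives the same likelihood.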
Johannes Radinger
2015-Mar-23 18:40 UTC
[R] Optimization to fit data to custom density distribution
On Sat, Mar 21, 2015 at 3:41 PM, Prof Brian Ripley <ripley at stats.ox.ac.uk> wrote:

> On 21/03/2015 14:27, Johannes Radinger wrote:
>
>> Thanks for the fast response. The fitdistr() function works well for the
>> predefined density functions. However, what is the recommended approach
>> to optimize/fit a density function described by two superimposed normal
>> distributions? In my case it is N1(mean=0,sd1)*p+N2(mean=0,sd2)*(1-p).
>> With fitdistr one can only choose among the 15 distributions. Probably
>
> That is simply not true. The help says
>
>     densfun: Either a character string or a function returning a density
>              evaluated at its first argument.
>
> and the second alternative is used in the examples.

Of course, that was my mistake. So fitdistr() works fine for this case.
Here is an example to complete that case:

    library(MASS)  # provides fitdistr()

    # 70 values from N(0, sd=50) and 30 values from N(0, sd=500)
    x <- c(rnorm(n=70, mean=0, sd=50), rnorm(n=30, mean=0, sd=500))
    hist(x, breaks=30)

    # Density of the mixture; both components are centred at 0
    ddoublenorm <- function(x, sigma_stat, sigma_mob, p) {
      dnorm(x, mean=0, sd=sigma_stat)*p + dnorm(x, mean=0, sd=sigma_mob)*(1-p)
    }

    # The bounds keep both sds positive and p strictly inside (0, 1)
    fitdistr(x=x, densfun=ddoublenorm,
             start=list(sigma_stat=30, sigma_mob=300, p=0.5),
             method="L-BFGS-B",
             lower=c(0.001, 0.001, 0.00001), upper=c(Inf, Inf, 0.99999))

Thanks a lot!

Best regards,
Johannes

>> this needs an approach using optim()? However I am so far unfamiliar
>> with these packages. So any suggestion is welcome. :)
>
> There are examples of that in MASS (the book), chapter 16.
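One way to check a fit like the one in this message is to draw the fitted density over a histogram of the data. The following is only a sketch under stated assumptions: the seed is arbitrary, and pdoublenorm is a hypothetical helper (the mixture's distribution function) that does not appear in the thread. ddoublenorm and the fitdistr() call follow the example above.

```r
library(MASS)  # provides fitdistr()

set.seed(123)  # arbitrary seed, only for reproducibility
x <- c(rnorm(n = 70, mean = 0, sd = 50), rnorm(n = 30, mean = 0, sd = 500))

# Mixture density from the example above
ddoublenorm <- function(x, sigma_stat, sigma_mob, p) {
  dnorm(x, mean = 0, sd = sigma_stat) * p +
    dnorm(x, mean = 0, sd = sigma_mob) * (1 - p)
}

# Hypothetical helper: distribution function of the same mixture
pdoublenorm <- function(q, sigma_stat, sigma_mob, p) {
  pnorm(q, mean = 0, sd = sigma_stat) * p +
    pnorm(q, mean = 0, sd = sigma_mob) * (1 - p)
}

fit <- fitdistr(x = x, densfun = ddoublenorm,
                start = list(sigma_stat = 30, sigma_mob = 300, p = 0.5),
                method = "L-BFGS-B",
                lower = c(0.001, 0.001, 0.00001),
                upper = c(Inf, Inf, 0.99999))
est <- fit$estimate  # named vector: sigma_stat, sigma_mob, p

# Histogram on the density scale with the fitted mixture density on top
hist(x, breaks = 30, freq = FALSE)
curve(ddoublenorm(x, est[["sigma_stat"]], est[["sigma_mob"]], est[["p"]]),
      add = TRUE, n = 501)

# Empirical versus fitted distribution function
plot(ecdf(x), main = "Empirical vs fitted CDF")
curve(pdoublenorm(x, est[["sigma_stat"]], est[["sigma_mob"]], est[["p"]]),
      add = TRUE, col = "red", n = 501)
```

With only 100 observations the estimates can differ noticeably from the generating values (50, 500, 0.7), so the plots are a sanity check rather than a formal goodness-of-fit test.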