This practical problem in maximum likelihood estimation must be encountered quite a bit. What do you do when a data point has a probability that comes out in numerical evaluation to zero? In calculating the log likelihood you then have a log(0) problem. Here is a simple example (probit) which illustrates the problem: x<-c(1,2,3,4,100) ntrials<-100 yes<-round(ntrials*pnorm((x-3)/1)) #points fall on normal CDF mean=3, sd=1 no<-ntrials-yes pyes<-yes/ntrials py<-function(b,x) pnorm((x-b[1])/b[2]) loglike<-function(b) -sum( yes*log(py(b,x)) + no*log(1-py(b,x)) ) out<-nlm(loglike,p=c(3,1),hessian=TRUE) In this example the right-most point gives a p(yes) of 1; 1-1=0; log(0) gives "NA/Inf replaced by maximum positive value" Please tell me how to deal with this problem. Thanks very much for any help. Bill Simpson -.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.- r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html Send "info", "help", or "[un]subscribe" (in the "body", not the subject !) To: r-help-request at stat.math.ethz.ch _._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
On Tue, 9 Jan 2001, Bill Simpson wrote:> This practical problem in maximum likelihood estimation must be > encountered quite a bit. What do you do when a data point has a > probability that comes out in numerical evaluation to zero? In calculating > the log likelihood you then have a log(0) problem. > > Here is a simple example (probit) which illustrates the problem: > > x<-c(1,2,3,4,100) > ntrials<-100 > yes<-round(ntrials*pnorm((x-3)/1)) #points fall on normal CDF mean=3, sd=1 > no<-ntrials-yes > pyes<-yes/ntrials > py<-function(b,x) pnorm((x-b[1])/b[2]) > loglike<-function(b) -sum( yes*log(py(b,x)) + no*log(1-py(b,x)) ) > out<-nlm(loglike,p=c(3,1),hessian=TRUE) > > In this example the right-most point gives a p(yes) of 1; 1-1=0; log(0) > gives "NA/Inf replaced by maximum positive value" > > Please tell me how to deal with this problem. Thanks very much for any > help.Calculate your log-likelihood correctly! 0*log(0) is 0, not NaN. It is an example of good S programming on page 94 of V&R3. More generally, I think you should be using the full 5-arg form of pnorm, since pnorm(x, 3, 1, FALSE, TRUE) is a lot more accurate than your calculation. -- Brian D. Ripley, ripley at stats.ox.ac.uk Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272860 (secr) Oxford OX1 3TG, UK Fax: +44 1865 272595 -.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.- r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html Send "info", "help", or "[un]subscribe" (in the "body", not the subject !) To: r-help-request at stat.math.ethz.ch _._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
Thanks very much Brian and Ben for your helpful replies. Bill -.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.- r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html Send "info", "help", or "[un]subscribe" (in the "body", not the subject !) To: r-help-request at stat.math.ethz.ch _._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
You can use the "log" and "lower.tail" options to get around these numeric difficulties; essentially, "log" returns the log-probability calculation directly from pnorm(). You're right that this is a fairly standard problem in maximizing likelihoods. py<-function(b,x,...) pnorm((x-b[1])/b[2],log=TRUE,...) loglike<-function(b) -sum( yes*py(b,x) + no*py(b,x,lower.tail=FALSE)) out<-nlm(loglike,p=c(3,1),hessian=TRUE) On Tue, 9 Jan 2001, Bill Simpson wrote:> This practical problem in maximum likelihood estimation must be > encountered quite a bit. What do you do when a data point has a > probability that comes out in numerical evaluation to zero? In calculating > the log likelihood you then have a log(0) problem. > > Here is a simple example (probit) which illustrates the problem: > > x<-c(1,2,3,4,100) > ntrials<-100 > yes<-round(ntrials*pnorm((x-3)/1)) #points fall on normal CDF mean=3, sd=1 > no<-ntrials-yes > pyes<-yes/ntrials > py<-function(b,x) pnorm((x-b[1])/b[2]) > loglike<-function(b) -sum( yes*log(py(b,x)) + no*log(1-py(b,x)) ) > out<-nlm(loglike,p=c(3,1),hessian=TRUE) > > In this example the right-most point gives a p(yes) of 1; 1-1=0; log(0) > gives "NA/Inf replaced by maximum positive value" > > Please tell me how to deal with this problem. Thanks very much for any > help. > > Bill Simpson > > -.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.- > r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html > Send "info", "help", or "[un]subscribe" > (in the "body", not the subject !) To: r-help-request at stat.math.ethz.ch > _._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._ >-- 318 Carr Hall bolker at zoo.ufl.edu Zoology Department, University of Florida http://www.zoo.ufl.edu/bolker Box 118525 (ph) 352-392-5697 Gainesville, FL 32611-8525 (fax) 352-392-3704 -.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.- r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html Send "info", "help", or "[un]subscribe" (in the "body", not the subject !) To: r-help-request at stat.math.ethz.ch _._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
Reasonably Related Threads
- Error in maximum likelihood estimation.
- Problems in programming a simple likelihood
- fitting a mixture of distributions with optim and max log likelihood ?
- Invalid 'times' argument three-category ordered probit with maximum likelihood
- how to generate and evaluate a design using Algdesign