Dear all, I ran into problems with the function "optim" when I tried to do an mle estimation of a simple lognormal regression. Some warning message poped up saying NANs have been produced in the optimization process. But I could not figure out which part of my code has caused this. I wonder if anybody would help. The code is in the following and the data is in the attachment. da <- read.table("da.txt",header=TRUE) # fit with linear regression using log transformation of the response variable fit <- lm(log(yp) ~ as.factor(ay)+as.factor(lag),data=da) # define the log likelihood to be maximized over llk.mar <- function(parm,y,x){ # parm is the vector of parameters # the last element is sigma # y is the response # x is the design matrix l <- length(parm) beta <- parm[-l] sigma <- parm[l] x <- as.matrix(x) mu <- x %*% beta llk <- sum(dnorm(y, mu, sigma,log=TRUE)) return(llk) } # initial values parm <- c(as.vector(coef(fit)),summary(fit)$sigma) y <- log(da$yp) x <- model.matrix(fit) op <- optim(parm, llk.mar, y=y,x=x,control=list(fnscale=-1,maxit=100000)) After running the above code, I got the warning message: Warning messages: 1: In dnorm(x, mean, sd, log) : NaNs produced 2: In dnorm(x, mean, sd, log) : NaNs produced I would really appreciate if anybody would help to point out the problem with this code or tell me how to trace it down (using "trace"?)? Many thanks in advance. Wayne (Yanwei) Zhang Statistical Research CNA NOTICE: This e-mail message, including any attachments and appended messages, is for the sole use of the intended recipients and may contain confidential and legally privileged information. If you are not the intended recipient, any review, dissemination, distribution, copying, storage or other use of all or any portion of this message is strictly prohibited. If you received this message in error, please immediately notify the sender by reply e-mail and delete this message in its entirety. -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: da.txt URL: <https://stat.ethz.ch/pipermail/r-help/attachments/20100909/9427d224/attachment.txt>
This is only a guess because I don't have your data: sigma is must be positive in the dnorm function. My guess is that optim may attempt an iteration with a negative sigma. You may want to see help(optim) for dealing with this constraint. Specifically see the lower argument. If you specify the constraint on sigma, you'll probably still get the same answer as before, without the warnings. -tgs 2010/9/9 Zhang,Yanwei <Yanwei.Zhang@cna.com>> Dear all, > > I ran into problems with the function "optim" when I tried to do an mle > estimation of a simple lognormal regression. Some warning message poped up > saying NANs have been produced in the optimization process. But I could not > figure out which part of my code has caused this. I wonder if anybody would > help. The code is in the following and the data is in the attachment. > > > da <- read.table("da.txt",header=TRUE) > > # fit with linear regression using log transformation of the response > variable > fit <- lm(log(yp) ~ as.factor(ay)+as.factor(lag),data=da) > > # define the log likelihood to be maximized over > llk.mar <- function(parm,y,x){ > # parm is the vector of parameters > # the last element is sigma > # y is the response > # x is the design matrix > l <- length(parm) > beta <- parm[-l] > sigma <- parm[l] > x <- as.matrix(x) > mu <- x %*% beta > llk <- sum(dnorm(y, mu, sigma,log=TRUE)) > return(llk) > } > > # initial values > parm <- c(as.vector(coef(fit)),summary(fit)$sigma) > y <- log(da$yp) > x <- model.matrix(fit) > > op <- optim(parm, llk.mar, y=y,x=x,control=list(fnscale=-1,maxit=100000)) > > > After running the above code, I got the warning message: > Warning messages: > 1: In dnorm(x, mean, sd, log) : NaNs produced > 2: In dnorm(x, mean, sd, log) : NaNs produced > > > I would really appreciate if anybody would help to point out the problem > with this code or tell me how to trace it down (using "trace"?)? > Many thanks in advance. > > > > > > > > Wayne (Yanwei) Zhang > Statistical Research > CNA > > > > > > NOTICE: This e-mail message, including any attachments and appended > messages, is for the sole use of the intended recipients and may contain > confidential and legally privileged information. > If you are not the intended recipient, any review, dissemination, > distribution, copying, storage or other use of all or any portion of this > message is strictly prohibited. > If you received this message in error, please immediately notify the sender > by reply e-mail and delete this message in its entirety. > > ______________________________________________ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > >[[alternative HTML version deleted]]
Yanwei!!!!!!!!!!!!!!! Have you tried to write the likelihood function using log-normal directly? if you haven't so, you may want to check ?rlnorm -- View this message in context: http://r.789695.n4.nabble.com/Help-on-simple-problem-with-optim-tp2533420p2533487.html Sent from the R help mailing list archive at Nabble.com.
You can record all arguments and return values of the calls that optim(par,fn) makes to fn with a function like the following. It takes your function and makes a new function that returns the same thing but also records information it its environment. Thus, after optim is done you can see its path to the optimum. trackFn <- function (fn) { X <- NULL VALUE <- NULL force(fn) function(x) { X <<- rbind(X, x) val <- fn(x) VALUE <<- c(VALUE, val) val } } E.g.,> fn <- function(x)sum( (sin(x)-c(-1,0,1)) ^ 2 ) > optim(c(0,0,0), tfn <- trackFn(fn))$par [1] -1.583466e+00 6.726235e-05 1.558809e+00 $value [1] 1.612853e-08 $counts function gradient 146 NA $convergence [1] 0 $message NULL> with(environment(tfn), length(VALUE)) # agrees w/ counts above[1] 146> with(environment(tfn), plot(VALUE, log="y")) > with(environment(tfn), matplot(X, VALUE, log="y", type="l"))You could also record warning messages by including a call to withCallingHandlers() that stashed the warning in a list. Recall trackFn each time you call optim(). Bill Dunlap Spotfire, TIBCO Software wdunlap tibco.com> -----Original Message----- > From: r-help-bounces at r-project.org > [mailto:r-help-bounces at r-project.org] On Behalf Of Zhang,Yanwei > Sent: Thursday, September 09, 2010 11:54 AM > To: r-help at r-project.org > Subject: [R] Help on simple problem with optim > > Dear all, > > I ran into problems with the function "optim" when I tried to > do an mle estimation of a simple lognormal regression. Some > warning message poped up saying NANs have been produced in > the optimization process. But I could not figure out which > part of my code has caused this. I wonder if anybody would > help. The code is in the following and the data is in the attachment. > > > da <- read.table("da.txt",header=TRUE) > > # fit with linear regression using log transformation of the > response variable > fit <- lm(log(yp) ~ as.factor(ay)+as.factor(lag),data=da) > > # define the log likelihood to be maximized over > llk.mar <- function(parm,y,x){ > # parm is the vector of parameters > # the last element is sigma > # y is the response > # x is the design matrix > l <- length(parm) > beta <- parm[-l] > sigma <- parm[l] > x <- as.matrix(x) > mu <- x %*% beta > llk <- sum(dnorm(y, mu, sigma,log=TRUE)) > return(llk) > } > > # initial values > parm <- c(as.vector(coef(fit)),summary(fit)$sigma) > y <- log(da$yp) > x <- model.matrix(fit) > > op <- optim(parm, llk.mar, > y=y,x=x,control=list(fnscale=-1,maxit=100000)) > > > After running the above code, I got the warning message: > Warning messages: > 1: In dnorm(x, mean, sd, log) : NaNs produced > 2: In dnorm(x, mean, sd, log) : NaNs produced > > > I would really appreciate if anybody would help to point out > the problem with this code or tell me how to trace it down > (using "trace"?)? > Many thanks in advance. > > > > > > > > Wayne (Yanwei) Zhang > Statistical Research > CNA > > > > > > NOTICE: This e-mail message, including any attachments and > appended messages, is for the sole use of the intended > recipients and may contain confidential and legally > privileged information. > If you are not the intended recipient, any review, > dissemination, distribution, copying, storage or other use of > all or any portion of this message is strictly prohibited. > If you received this message in error, please immediately > notify the sender by reply e-mail and delete this message in > its entirety. >
It is indeed a negative value for sigma that causes the issue. You can check this by inserting this line if(sigma <= 0 ) cat("Negative sigma=",sigma,"\n") after the line mu <- x %*% beta in function llk.mar Negative values for sigma can be avoided with the use of a transformation for sigma, forcing it to be always positive. Make optim use log(sigma) as parameter and transform this to sigma with sigma <- exp(parm[l]) in llk.mar. Like this # define the log likelihood to be maximized over llk.mar <- function(parm,y,x){ # parm is the vector of parameters # the last element is sigma # y is the response # x is the design matrix l <- length(parm) beta <- parm[-l] sigma <- exp(parm[l]) # <=== transform x <- as.matrix(x) mu <- x %*% beta if(sigma <= 0 ) cat("Negative sigma=",sigma,"\n") llk <- sum(dnorm(y, mu, sigma,log=TRUE)) return(llk) } # initial values parm <- c(as.vector(coef(fit)),log(summary(fit)$sigma)) # use log(sigma) as independent parameter Caveat: transformations often help in situations like this but can lead to badly scaled problems and are not a universal remedy. /Berend -- View this message in context: http://r.789695.n4.nabble.com/Help-on-simple-problem-with-optim-tp2533420p2533939.html Sent from the R help mailing list archive at Nabble.com.
Did you check if the data in "da" has any NA in the dependent or the independent data? Remember that your function llk.mar is going to evaluate dnorm for each pair. If any of those pairs has an NA value, your function will return an NA at the end (sum(c(NA,1,2,3)) = NA) I would check if the llk.mar function is fine for the whole domain of your data. My suggestion is to add an if right before llk<-sum(dnorm....), where you evaluate for NAs in that vector. If you find one, get rid of it before returning the function!. Try that before optim, then let it do the solving. Cheers, Cristi?n Montes. -----Mensaje original----- De: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] En nombre de Zhang,Yanwei Enviado el: Jueves, 09 de Septiembre de 2010 02:54 p.m. Para: r-help at r-project.org Asunto: [R] Help on simple problem with optim Dear all, I ran into problems with the function "optim" when I tried to do an mle estimation of a simple lognormal regression. Some warning message poped up saying NANs have been produced in the optimization process. But I could not figure out which part of my code has caused this. I wonder if anybody would help. The code is in the following and the data is in the attachment. da <- read.table("da.txt",header=TRUE) # fit with linear regression using log transformation of the response variable fit <- lm(log(yp) ~ as.factor(ay)+as.factor(lag),data=da) # define the log likelihood to be maximized over llk.mar <- function(parm,y,x){ # parm is the vector of parameters # the last element is sigma # y is the response # x is the design matrix l <- length(parm) beta <- parm[-l] sigma <- parm[l] x <- as.matrix(x) mu <- x %*% beta llk <- sum(dnorm(y, mu, sigma,log=TRUE)) return(llk) } # initial values parm <- c(as.vector(coef(fit)),summary(fit)$sigma) y <- log(da$yp) x <- model.matrix(fit) op <- optim(parm, llk.mar, y=y,x=x,control=list(fnscale=-1,maxit=100000)) After running the above code, I got the warning message: Warning messages: 1: In dnorm(x, mean, sd, log) : NaNs produced 2: In dnorm(x, mean, sd, log) : NaNs produced I would really appreciate if anybody would help to point out the problem with this code or tell me how to trace it down (using "trace"?)? Many thanks in advance. Wayne (Yanwei) Zhang Statistical Research CNA NOTICE: This e-mail message, including any attachments and appended messages, is for the sole use of the intended recipients and may contain confidential and legally privileged information. If you are not the intended recipient, any review, dissemination, distribution, copying, storage or other use of all or any portion of this message is strictly prohibited. If you received this message in error, please immediately notify the sender by reply e-mail and delete this message in its entirety.