Matthieu Stigler
2012-Nov-15 15:25 UTC
[R] hessian fails for box-constrained problems when close to boundary?
Hi,

I am trying to recover the Hessian of a problem optimised with box constraints. The problem is that in some cases my estimates are very close to the boundary, which makes optim(..., hessian=TRUE) or optimHess() fail: they do not respect the box constraints, and hence evaluate the function in the infeasible parameter space.

As a simple example (my real problem is more complex; simple parameter transformations do not apply here), imagine estimating mu and sigma (restricted to be > 0) for normally distributed data where, however, the true sigma is very close to zero:

LLNorm <- function(para, dat=rn) return(-sum(dnorm(dat, para[1], para[2], log=TRUE)))
rn2 <- c(rep(10.3, 2000), 10.31)

> optim(c(10,1), fn=LLNorm, method="L-BFGS-B", lower=c(-Inf, 0.0000001), dat=rn2, hessian=TRUE)
Error in optim(c(10, 1), fn = LLNorm, method = "L-BFGS-B", lower = c(-Inf,  :
  non-finite finite-difference value [2]

The only solution/workaround I found is a two-step procedure: call optim() without the Hessian, then call optimHess(), setting the length of the numerical steps (argument ndeps) smaller than the distance between the parameter and the bound, i.e.:

op <- optim(c(10,1), fn=LLNorm, method="L-BFGS-B", lower=c(-Inf, 0.0000001), dat=rn2, hessian=FALSE)
H <- optimHess(op$par, LLNorm, dat=rn2, control=list(ndeps=rep((op$par[2]-0)/1000, 2)))

While this solution "works", it is surely not elegant, and I have the following questions:
1) Is this solution of using smaller steps sound at all?
2) I am not sure I understand why the Hessian of a constrained optim problem is evaluated without the constraints; could the failure at least be turned into a warning?

Furthermore, I realised that the function hessian() in the numDeriv package does not offer constraints on the parameters either, yet in the special case of the example above it does not fail (although it would have problems with a parameter even closer to the bound), unlike optimHess().
See:

library(numDeriv)
hessian(LLNorm, op$par, dat=rn2)

This brings me to a final question: is there a technical reason not to allow a "constrained" Hessian, as the fact that neither of the two implementations in R offers one seems to suggest? I found a very interesting answer by Spencer Graves to a similar question:
http://tolstoy.newcastle.edu.au/R/e4/help/08/06/15061.html
although it concerns the statistical implications more than the numerical issues.

Thanks a lot!

Matthieu
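For the toy example above, at least (though apparently not for the real problem, where transformations do not apply), the positivity constraint can be removed by a change of variable, which sidesteps the boundary issue entirely. A hedged sketch, optimising over theta = log(sigma):

```r
# Sketch only: reparameterise sigma = exp(theta) so the problem is
# unconstrained, then take the Hessian in (mu, theta) space directly.
LLNorm2 <- function(para, dat) {
  val <- -sum(dnorm(dat, para[1], exp(para[2]), log = TRUE))
  if (!is.finite(val)) 1e10 else val  # guard against overflow far from optimum
}
rn2 <- c(rep(10.3, 2000), 10.31)

op2 <- optim(c(10, 0), fn = LLNorm2, method = "BFGS", dat = rn2,
             hessian = TRUE, control = list(maxit = 500))
op2$hessian                              # Hessian w.r.t. (mu, log sigma)
c(mu = op2$par[1], sigma = exp(op2$par[2]))
```

Standard errors for sigma itself would then need the delta method, since the Hessian is with respect to log(sigma), not sigma.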
Paul Gilbert
2012-Nov-18 03:25 UTC
[R] hessian fails for box-constrained problems when close to boundary?
Matthieu

(Regarding the final question, but not the ones before it.)

The short answer is that you seem to have a solution, since hessian() in numDeriv works. The long answer follows.

Outside a constraint region one cannot assume that a function will evaluate properly; some constraints are imposed precisely because the function is not defined outside the region. This means that the gradient and Hessian are not defined in the usual sense at the boundary; the "left" and "right" derivatives should exist and agree in the limit. I have had on my to-do list for some time a project to add one-sided approximations to numDeriv, but that will probably not happen soon.

If your solution is actually on the boundary, you might consider some substitution which removes the constraint, and compute the Hessian on the reduced, unconstrained space.

In your situation I would also be concerned about the optimum being very close to, but not on, the boundary. You might want to check that this is not an artifact of the convergence criteria: check that the objective is actually worse at a point on the boundary close to your optimum. Also, the gradient should be zero at the optimum, though it may not be exactly so because of the numerical convergence criteria. If the gradient is pointing toward the boundary, that is a sign that you may just have converged too soon.

Finally, the Hessian coming from the optimization is built up as an approximation using information from the steps of the optimization. I am not familiar enough with the numerical handling of boundary constraints to know how this approximation might be affected, but you may need to be careful in this regard.

Paul

On 12-11-15 10:25 AM, Matthieu Stigler wrote:
> Hi
>
> I am trying to recover the hessian of a problem optimised with
> box-constraints.
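Paul's remark about one-sided approximations can be illustrated with a crude sketch (hypothetical code, not part of numDeriv or optim, and only O(h) accurate, well below Richardson extrapolation): forward differences step in the +h direction only, so as long as the point itself is feasible they never cross a lower bound.

```r
# Hypothetical helper: a forward-difference Hessian. Every evaluation point
# is x plus nonnegative steps, so a feasible x sitting on (or near) a *lower*
# bound is never pushed into the infeasible region. Steps toward an upper
# bound would need the sign flipped.
fd_hessian <- function(f, x, h = 1e-5, ...) {
  fwrap <- function(z) f(z, ...)
  n <- length(x)
  grad <- function(z) {              # one-sided (forward) gradient at z
    f0 <- fwrap(z)
    vapply(seq_len(n), function(i) {
      (fwrap(z + replace(numeric(n), i, h)) - f0) / h
    }, numeric(1))
  }
  g0 <- grad(x)
  H <- matrix(NA_real_, n, n)
  for (j in seq_len(n)) {
    H[, j] <- (grad(x + replace(numeric(n), j, h)) - g0) / h
  }
  (H + t(H)) / 2                     # symmetrise the O(h) approximation
}

# e.g. on a quadratic, where the answer is known exactly:
fd_hessian(function(x) sum(x^2), c(1, 2))   # ~ diag(2, 2)
```

The loss of accuracy relative to central differences is the price of feasibility; in practice one would also want h scaled to the size of each parameter, much as ndeps is in the workaround above.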