Karl Ove Hufthammer
2008-May-16 13:04 UTC
[R] How to determine sensible values for 'fnscale' and 'parscale' in optim
Dear R-help,
I'm using the 'optim' function to minimise functions, and have read the
documentation, but I'm still not sure how to determine sensible values to
use for the 'fnscale' and 'parscale' options.
If I have understood everything correctly, 'fnscale' should be used to
scale the objective function, so that, for example, if the default is
'sensible' (or even 'optimal') for minimising 'f', one should use
'fnscale=1e-6' for minimising the function 'function(...) 1e-6 * f(...)'.
But in which range of numbers should 'f' lie for the default 'fnscale' to
be reasonable (with other options, such as 'reltol', at their defaults)?
I understand that if 'f' takes values around, e.g., 1e-10 (at least for
parameter values close to the optimal ones), I need to use 'fnscale'. But
by how much should I scale?
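A minimal sketch of this reading of 'fnscale', using a made-up toy
objective (the names 'f', 'f_small' and 'start' are purely illustrative).
It relies on the documented behaviour that optim works internally on
fn(par)/fnscale, and that the 'reltol' stopping rule compares reductions
with reltol * (abs(val) + reltol), which is why the absolute size of 'f'
matters once it approaches 'reltol':

```r
# Toy objective with its minimum at c(1, 2); its values are of order 1 or larger.
f <- function(p) sum((p - c(1, 2))^2)

# The same objective shrunk by a factor of 1e-6.
f_small <- function(p) 1e-6 * f(p)

start <- c(10, -10)

# Default scaling on the original objective.
fit_ref <- optim(start, f, method = "BFGS")

# optim minimises fn(par)/fnscale internally, so fnscale = 1e-6
# undoes the shrinking and the optimiser sees values like those of f.
fit_scaled <- optim(start, f_small, method = "BFGS",
                    control = list(fnscale = 1e-6))

# Both runs should end up at essentially the same parameter values.
print(fit_ref$par)
print(fit_scaled$par)
print(all.equal(fit_ref$par, fit_scaled$par, tolerance = 1e-4))
```

This only shows that the two problems are treated alike; it does not say
which magnitude of 'f' the defaults are best suited to.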
The same applies to 'parscale'. How do I determine reasonable values?
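For 'parscale', the documentation states that optimisation is performed
on par/parscale, and that these scaled values should be comparable. A
minimal sketch under that reading, with a hypothetical toy objective
('target', 'g' and 'start' are invented for illustration) whose two
parameters differ by six orders of magnitude, and 'parscale' set to the
rough size of each parameter:

```r
# Hypothetical optimum: first parameter of order 1e4, second of order 1e-2.
target <- c(1e4, 1e-2)

# Toy objective measuring the squared relative distance to the optimum.
g <- function(p) sum(((p - target) / target)^2)

start <- c(5e3, 5e-2)

# With parscale equal to the rough magnitude of each parameter,
# optim works internally on par/parscale values of order 1.
fit_ps <- optim(start, g, method = "BFGS",
                control = list(parscale = c(1e4, 1e-2)))

# Same problem with the default parscale, for comparison only.
fit_raw <- optim(start, g, method = "BFGS")

print(fit_ps$par / target)  # estimates relative to the known optimum
print(fit_ps$counts)        # function and gradient evaluations with parscale
print(fit_raw$counts)       # the same without parscale
```

In a real problem the optimum is unknown, so the magnitudes passed to
'parscale' would have to come from starting values or prior knowledge.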
To make the question a bit less theoretical, how would one go about
choosing good values of 'fnscale' and 'parscale' to use when finding,
for example, the MLEs of a bivariate normal distribution using optim?
Here's code for this example:
-----------------------------------------------
library(MASS) # needed for mvrnorm
library(mvtnorm) # needed for dmvnorm
set.seed(20080516)
n=1000
mu1=3
mu2=5
sig1=7
sig2=20
rho=.5
sigmat=matrix(c(sig1^2,sig1*sig2*rho,sig1*sig2*rho,sig2^2),2)
xy=mvrnorm(n,c(mu1,mu2),sigmat) # n = 1000 observations from this
# distribution.
obj=function(par,xy) # The function to maximize (mean log-likelihood).
{
  mu=par[1:2]
  covar=par[3]*par[4]*par[5] # sig1*sig2*rho
  sigmat=matrix(c(par[3]^2,covar,covar,par[4]^2),2)
  mean(dmvnorm(xy, mu, sigmat, log=TRUE))
}
# Using optim to find the MLEs.
optim( c(5,5,10,10,.5), obj, control=list(fnscale=-1), xy=xy)
# We could of course also calculate the MLEs directly.
colMeans(xy)
apply(xy,2,sd)*sqrt(1-1/n) # column-wise SDs; sd() on a matrix no longer works per column
cor(xy)
-----------------------------------------------
Here optim converges to (approximately) the correct values, even with
not very good initial values (though with method="CG" we do not get
convergence without increasing maxit). But how should one choose
'fnscale' and 'parscale' for faster or better convergence?
--
Karl Ove Hufthammer
