Karl Ove Hufthammer
2008-May-16 13:04 UTC
[R] How to determine sensible values for 'fnscale' and 'parscale' in optim
Dear R-help,

I'm using the 'optim' function to minimise functions, and have read the documentation, but I'm still not sure how to determine sensible values for the 'fnscale' and 'parscale' options.

If I have understood everything correctly, 'fnscale' should be used to scale the objective function, so that, for example, if the default is 'sensible' (or even 'optimal') for minimising 'f', one should use 'fnscale=1e-6' for minimising the function 'function(...) 1e-6 * f(...)'. But in which range of numbers should 'f' lie for the default 'fnscale' to be reasonable (with other options, such as 'reltol', at their defaults)?

I understand that if 'f' takes values around, e.g., 1e-10 (at least for parameter values close to the optimal ones), I need to use 'fnscale'. But how much should I scale?

The same applies to 'parscale'. How do I determine reasonable values?

To make the question a bit less theoretical, how would one go about choosing good values of 'fnscale' and 'parscale' to use when finding, for example, the MLEs of a bivariate normal distribution using optim? Here's code for this example:

-----------------------------------------------
library(MASS)     # needed for mvrnorm
library(mvtnorm)  # needed for dmvnorm

set.seed(20080516)
n=1000
mu1=3
mu2=5
sig1=7
sig2=20
rho=.5
sigmat=matrix(c(sig1^2,sig1*sig2*rho,sig1*sig2*rho,sig2^2),2)
xy=mvrnorm(n,c(mu1,mu2),sigmat) # n = 1000 observations from this
                                # distribution.

obj=function(par,xy) # The function to maximize.
{
  mu=par[1:2]
  sigmat=matrix(c(par[3]^2,par[3]*par[4]*par[5],par[3]*par[4]*par[5],par[4]^2),2)
  mean(dmvnorm(xy, mu, sigmat, log=TRUE))
}

# Using optim to find the MLEs.
optim( c(5,5,10,10,.5), obj, control=list(fnscale=-1), xy=xy)

# We could of course also calculate the MLEs directly.
colMeans(xy)
apply(xy, 2, sd)*sqrt(1-1/n)  # column-wise SDs, rescaled to the MLE
cor(xy)
-----------------------------------------------

Here optim converges to (approximately) the correct values, even with not very good initial values (though with method="CG" we do not get convergence without increasing maxit). But how should one choose 'fnscale' and 'parscale' for faster or better convergence?

-- 
Karl Ove Hufthammer
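
As an illustration of the reading of 'fnscale' and 'parscale' described in the question, here is a minimal sketch. It is not part of the original example: the toy functions 'f', 'f_small' and 'g' are hypothetical. It relies only on the statements in ?optim that optimization is performed on fn(par)/fnscale and on par/parscale. Setting each 'parscale' entry to the rough magnitude of the corresponding parameter is an assumed rule of thumb, not a documented recommendation.

```r
# Toy objective (hypothetical, not from the post): minimum value 1 at (1, 2).
f <- function(p) (p[1] - 1)^2 + (p[2] - 2)^2 + 1

# The same function shrunk by 1e-6, as in the 'fnscale' question.
f_small <- function(p) 1e-6 * f(p)

# Default Nelder-Mead on f, and on f_small with the scaling undone by fnscale.
fit_ref    <- optim(c(0, 0), f)
fit_scaled <- optim(c(0, 0), f_small, control = list(fnscale = 1e-6))

# Both runs should end near (1, 2). 'value' is reported on the scale of the
# function that was passed in, so it is near 1e-6 for fit_scaled.
print(fit_ref$par)
print(fit_scaled$par)
print(fit_scaled$value)

# Toy objective with parameters on very different scales (hypothetical):
# minimum at (1, 2000).
g <- function(p) (p[1] - 1)^2 + (p[2] / 1000 - 2)^2

# Assumed heuristic: each 'parscale' entry is the rough magnitude of the
# corresponding parameter, so that par/parscale is of order one.
fit_par <- optim(c(0, 0), g, method = "BFGS",
                 control = list(parscale = c(1, 1000)))
print(fit_par$par)
```

With 'fnscale=1e-6' the optimizer works on f_small(par)/1e-6, which is 'f' again up to rounding, so the two Nelder-Mead runs should behave almost identically. Whether this matters in practice depends on which tolerances are in use, since the default 'reltol' test is relative to the function value.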