Dear R-helpers,

The function optim implements algorithms that I would like to use.

I have a function implemented in R which, given the parameters over which the minimization is to take place, returns a scalar as well as the gradient. Unfortunately, optim requires two functions, _fn_ and _gr_, where fn returns the function value and gr the gradient. Splitting my function in two would be easy; however, I am wondering whether evaluating both would not double the already very high computational cost: most of the computationally intensive operations are identical when computing the function value and the gradient.

Question: is there a way to tweak optim so that only one function evaluation is necessary? Are there other implementations of these algorithms which assume that the function to be minimized returns the function value and the gradient as well?

Thanks,
Eryk
nwew wrote:
> Dear R-helpers,
>
> The function optim implements algorithms that I would like to use.
> [...]
> Question: is there a way to tweak optim so that only one function
> evaluation is necessary? Are there other implementations of these
> algorithms which assume that the function to be minimized returns
> the function value and the gradient as well?

I don't know the answer to your question, but here's a different approach. Write a function that effectively splits your single function into two:

splitfn <- function(f) {
  lastx <- NA
  lastfn <- NA
  lastgr <- NA
  doeval <- function(x) {
    ## if x is unchanged from the last call, reuse the cached value
    if (identical(all.equal(x, lastx), TRUE)) return(lastfn)
    lastx <<- x
    both <- f(x)            # one expensive call computes both results
    lastfn <<- both$fnval
    lastgr <<- both$grval
    return(lastfn)
  }
  fn <- function(x) doeval(x)
  gr <- function(x) {
    doeval(x)               # refresh the cache if x has changed
    lastgr
  }
  list(fn = fn, gr = gr)
}

I haven't tested this, but the idea is that it sets up a local environment where the last x value and the last function and gradient values are stored. If the next call asks for the same x, then the cached values are returned. I don't know if it will actually improve efficiency: that depends on whether optim evaluates the gradient and function values at the same points or at different points.

You would use this as follows, assuming your function is called f:

f2 <- splitfn(f)
optim(par, f2$fn, f2$gr, ...)

Duncan Murdoch
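To make the contract concrete: splitfn() assumes a single function that returns a list with components fnval and grval (the names come from Duncan's code above). A minimal sketch with a made-up quadratic objective:

## hypothetical objective: f(x) = sum(x^2), gradient 2*x,
## both computed in one call, as splitfn() expects
f <- function(x) {
  list(fnval = sum(x^2),  # scalar function value
       grval = 2 * x)     # gradient vector
}

f2 <- splitfn(f)
optim(rep(1, 3), f2$fn, f2$gr, method = "BFGS")

With method = "BFGS", optim should request the gradient at the point where it last evaluated fn, so each gr call ought to hit the cache.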
On Thu, 4 Aug 2005, nwew wrote:

> Dear R-helpers,
>
> The function optim implements algorithms that I would like to use.

They are available to you as part of the R API at C level.

> I have a function implemented in R which, given the parameters over
> which the minimization is to take place, returns a scalar as well as
> the gradient.
>
> Unfortunately optim requires two functions _fn_ and _gr_, where fn
> returns the function value and gr the gradient. Splitting my function
> in two would be easy; however, I am wondering whether evaluating both
> would not double the very high computational cost. Most of the
> computationally intensive operations are identical when computing the
> function value and gradient.

That is an unusual situation.

> Question: is there a way to tweak optim so that only one function
> evaluation is necessary? Are there other implementations of these
> algorithms which assume that the function to be minimized returns
> the function value and the gradient as well?

You can of course write your function to cache the work and check whether the parameter value is unchanged from the last call. Then, if the optimizer asks for the gradient after the function value at the same place (and most methods will), you can just do the additional work for the gradient. That is what nnet does, at C level.

--
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
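A rough way to check, for your own problem, whether a given optim method really asks for the gradient at the point it has just evaluated — the property Prof. Ripley's caching scheme relies on — is to count cache hits with an instrumented toy objective (all names below are made up for the check):

lastx <- NULL
hits <- 0
misses <- 0
fn <- function(x) {
  lastx <<- x             # remember where fn was last evaluated
  sum(x^2)
}
gr <- function(x) {
  ## was gr called at the same x as the preceding fn call?
  if (isTRUE(all.equal(x, lastx))) hits <<- hits + 1
  else misses <<- misses + 1
  2 * x
}
optim(c(2, -3), fn, gr, method = "BFGS")
c(hits = hits, misses = misses)   # misses should be 0 for BFGS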
nwew wrote:
> Dear R-helpers,
>
> The function optim implements algorithms that I would like to use.
> [...]
> Question: is there a way to tweak optim so that only one function
> evaluation is necessary? Are there other implementations of these
> algorithms which assume that the function to be minimized returns
> the function value and the gradient as well?

Hi, Eryk,

?optim does not *require* the "gr" argument. If one is not supplied, numerical gradients are used, for which optim evaluates "fn" twice per parameter (central differences). However, in many cases analytical gradients can improve both numerical accuracy and computational speed. Trivial example:

fn <- function(beta) {
  ## residual sum of squares
  sum((y - x %*% beta)^2)
}
gr <- function(beta) {
  ## analytic gradient: -2 * t(x) %*% (y - x %*% beta)
  colSums(-2 * x * drop(y - x %*% beta))
}

set.seed(1)
n <- 10000
p <- 5
g <- factor(rep(LETTERS[1:p], each = n/p))
x <- model.matrix(~ g)
beta <- rnorm(p)
y <- drop(x %*% beta + rnorm(n))
start <- rep(0, p)

system.time(f1 <- optim(start, fn, gr, hessian = TRUE))
# [1] 0.47 0.00 0.48 NA NA
system.time(f2 <- optim(start, fn, hessian = TRUE))
# [1] 0.54 0.00 0.53 NA NA

f1$par
# [1] -0.6408643  0.2128109 -0.8378791  1.5983054  0.3366216
f2$par
# [1] -0.6408643  0.2128109 -0.8378791  1.5983054  0.3366216

f1$hessian
#       [,1] [,2] [,3] [,4] [,5]
# [1,] 20000 4000 4000 4000 4000
# [2,]  4000 4000    0    0    0
# [3,]  4000    0 4000    0    0
# [4,]  4000    0    0 4000    0
# [5,]  4000    0    0    0 4000

f2$hessian
#       [,1]         [,2] [,3]         [,4] [,5]
# [1,] 20000 4.000000e+03 4000 4.000000e+03 4000
# [2,]  4000 4.000000e+03    0 2.273737e-07    0
# [3,]  4000 0.000000e+00 4000 0.000000e+00    0
# [4,]  4000 2.273737e-07    0 4.000000e+03    0
# [5,]  4000 0.000000e+00    0 0.000000e+00 4000

Note how the Hessian from the numerical-gradient fit picks up small round-off terms. Does this answer your question? Furthermore, have you read the chapter in MASS that discusses optim?

HTH,
--sundar
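As a cross-check (not part of Sundar's post): this least-squares problem has a closed-form solution, so both the coefficients and the analytic Hessian above can be verified directly:

qr.solve(x, y)      # exact least-squares coefficients; compare f1$par
coef(lm(y ~ g))     # same thing via lm()
2 * crossprod(x)    # analytic Hessian of fn; compare f1$hessian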
Check out:

http://finzi.psych.upenn.edu/R/Rhelp02a/archive/18289.html

On 8/4/05, nwew <W.E.Wolski at newcastle.ac.uk> wrote:
> Dear R-helpers,
>
> The function optim implements algorithms that I would like to use.
> [...]
> Question: is there a way to tweak optim so that only one function
> evaluation is necessary? Are there other implementations of these
> algorithms which assume that the function to be minimized returns
> the function value and the gradient as well?
>
> Thanks,
> Eryk