thr3ads.net - R help - [R] vectorized mle / optim [Oct 2007]

If this information is useful, please help other people find it:
Share via:

Arnaud Le Rouzic

2007-Oct-24 17:00 UTC

[R] vectorized mle / optim

Hi the list,

I would need some advice on something that looks like a FAQ: the 
possibility of providing vectors to optim() function.

Here is a stupid and short example summarizing the problem:

-------------------------------- example 1 ------------ 8< 
----------------------
library(stats4)
data <- rnorm(100,0,1)
lik1 <- function(m, v, data) {
  N <- length(data)
  lik.mean <- dnorm(mean(data), m, sqrt(v/N), log=T)
  lik.var <- dchisq(N*var(data)/v, N-1, log=T)
  return(-lik.mean - lik.var)
}
ml.result <- mle(lik1, start=list(m=2, v=2), fixed=list(data=data))
summary(ml.result)
---------------------------------------------------------------------------------------

This works perfectly (except that the default algorithm sometimes tries 
some negative variance parameters).

However, if the parameters (m and v) are considered as a vector of 
parameters, the result is very disappointing:

 -------------------------------- example 2 ------------ 8< 
----------------------
lik2 <- function(param, data) {
  N <- length(data)
  lik.mean <- dnorm(mean(data), param["m"],
sqrt(param["v"]/N), log=T)
  lik.var <- dchisq(N*var(data)/param["v"], N-1, log=T)
  return(-lik.mean - lik.var)
}
ml.result <- mle(lik2, start=list(param=c(m=2, v=2)), fixed=list(data=data))
---------------------------------------------------------------------------------------
"Error in dnorm(mean(data), param["m"],
sqrt(param["v"]/N), log = T) :
  argument "param" is missing, with no default"

One could trust the error message and provide default values, but 
unfortunately,

 -------------------------------- example 2b ------------ 8< 
----------------------
lik2b <- function(param=c(m=1, v=1), data) {
  N <- length(data)
  lik.mean <- dnorm(mean(data), param["m"],
sqrt(param["v"]/N), log=T)
  lik.var <- dchisq(N*var(data)/param["v"], N-1, log=T)
  return(-lik.mean - lik.var)
}
ml.result <- mle(lik2b, start=list(param=c(m=2, v=2)), 
fixed=list(data=data))
---------------------------------------------------------------------------------------
"Error in validObject(.Object) :
  invalid class "mle" object: invalid object for slot
"fullcoef" in
class "mle": got class "list", should be or extend class
"numeric""

The example above is very stupid, but there are cases where the estimate 
of vectors of parameters can hardly be avoided. In particular, I'm 
working with time series models with random effects that has to be 
estimated at each time point (the parameters underlying the distribution 
of these random effects are "real" parameters, but the effect itself
is
a nuisance parameter). Of course, the number of variables to estimate 
will depend on the size of the data set, and the function is supposed to 
work for any dataset (convergence is another issue).

The archives of r-help are quite clear on the fact that mle (and optim) 
are not vectorized, dot. I tried to trick mle by defining a likelihood 
function with a "..." parameter, and converting the c(...) into named 
parameters afterwards, but it does not work: mle checks the list of 
parameters against "fullcoef <- formals(minuslogl)". I derived my
own
mle function to force the call of optim() without the "fool-proof" 
tests, but it was probably very naive because it crashes as well.

The fact is that the code of mle(), as well as the R part of optim(), is 
full of (not always clean) tricks to ensure that the parameters are  
exactly as they are supposed to be. I guess the aim of this code is not 
only to slow down the function and to annoy the user, so I would like to 
get some feedback before trying to modify mle() and optim() to force 
them, in one way or another, to turn lists of vectors into lists of 
numeric values. I'm a bit afraid that the problem goes down to the 
.Internal optim function, and I will probably give up if it is the case.

Thanks for any comment,

Arnaud.



PS: By the way, the "paranoid parameter tests" in mle / optim lead to
an
annoying bug: the "minuslogl" function cannot have
"computed" default
parameters, because all of them must be specified either as "fixed" or
"start".

 -------------------------------- example 3 ------------ 8< 
----------------------
lik1b <- function(m, v=var(data), data) {
  N <- length(data)
  lik.mean <- dnorm(mean(data), m, sqrt(v/N), log=T)
  lik.var <- dchisq(N*var(data)/v, N-1, log=T)
  return(-lik.mean - lik.var)
}
ml.result.1 <- mle(lik1b, start=list(m=2, v=2), fixed=list(data=data))
ml.result.2 <- mle(lik1b, start=list(m=2), fixed=list(data=data))
---------------------------------------------------------------------------------------

The second call to mle crashes
"Error in validObject(.Object) :
  invalid class "mle" object: invalid object for slot
"fullcoef" in
class "mle": got class "list", should be or extend class
"numeric""

I don't see any reason for that: a call to lik1b(m=0, data=data) returns 
the expected value. The bug can be bypassed by using a default such as 
v=NULL, and then if(is.null(v)) v <- var(data) , but it does not seem 
very elegant...

Charles C. Berry

2007-Oct-24 18:49 UTC

head link

[R] Passing args and data to optim. WAS: Re: vectorized mle / optim

On Wed, 24 Oct 2007, Arnaud Le Rouzic wrote:
> Hi the list,
>
> I would need some advice on something that looks like a FAQ: the
> possibility of providing vectors to optim() function.
>
Arnaud,

If I have misunderstood your intent, let me say I am sorry.

I think what you are struggling with is this:

 	Providing data and parameter values to an objective function to be
 	used by optim in a flexible way.

I have experienced some frustration in this myself.

One solution I have used is to write the objective function in a natural 
way - using multiple arguments as needed for convenience and clarity.

Then I write a wrapper function that optim can handle that takes only a 
vector argument. If I need to pass data or fixed values of parameters, I 
do this by assigning them in the environment of the wrapper function.

Section 1.7 of Intro to R has an example ('open.account') of a nifty
trick
that is instructive and helpful for this purpose; you may want to review 
it.

In one project, I found it helpful to write a function that allows me to 
create wrappers and the underlying objective function and gradient in a 
flexible way. It is not well documented, can I send this and an example to 
you off-list if you like.


HTH,

Chuck

> Here is a stupid and short example summarizing the problem:
>
> -------------------------------- example 1 ------------ 8<
> ----------------------
> library(stats4)
> data <- rnorm(100,0,1)
> lik1 <- function(m, v, data) {
>  N <- length(data)
>  lik.mean <- dnorm(mean(data), m, sqrt(v/N), log=T)
>  lik.var <- dchisq(N*var(data)/v, N-1, log=T)
>  return(-lik.mean - lik.var)
> }
> ml.result <- mle(lik1, start=list(m=2, v=2), fixed=list(data=data))
> summary(ml.result)
>
---------------------------------------------------------------------------------------
>
> This works perfectly (except that the default algorithm sometimes tries
> some negative variance parameters).
>
> However, if the parameters (m and v) are considered as a vector of
> parameters, the result is very disappointing:
>
> -------------------------------- example 2 ------------ 8<
> ----------------------
> lik2 <- function(param, data) {
>  N <- length(data)
>  lik.mean <- dnorm(mean(data), param["m"],
sqrt(param["v"]/N), log=T)
>  lik.var <- dchisq(N*var(data)/param["v"], N-1, log=T)
>  return(-lik.mean - lik.var)
> }
> ml.result <- mle(lik2, start=list(param=c(m=2, v=2)),
fixed=list(data=data))
>
---------------------------------------------------------------------------------------
> "Error in dnorm(mean(data), param["m"],
sqrt(param["v"]/N), log = T) :
>  argument "param" is missing, with no default"
>
> One could trust the error message and provide default values, but
> unfortunately,
>
> -------------------------------- example 2b ------------ 8<
> ----------------------
> lik2b <- function(param=c(m=1, v=1), data) {
>  N <- length(data)
>  lik.mean <- dnorm(mean(data), param["m"],
sqrt(param["v"]/N), log=T)
>  lik.var <- dchisq(N*var(data)/param["v"], N-1, log=T)
>  return(-lik.mean - lik.var)
> }
> ml.result <- mle(lik2b, start=list(param=c(m=2, v=2)),
> fixed=list(data=data))
>
---------------------------------------------------------------------------------------
> "Error in validObject(.Object) :
>  invalid class "mle" object: invalid object for slot
"fullcoef" in
> class "mle": got class "list", should be or extend
class "numeric""
>
> The example above is very stupid, but there are cases where the estimate
> of vectors of parameters can hardly be avoided. In particular, I'm
> working with time series models with random effects that has to be
> estimated at each time point (the parameters underlying the distribution
> of these random effects are "real" parameters, but the effect
itself is
> a nuisance parameter). Of course, the number of variables to estimate
> will depend on the size of the data set, and the function is supposed to
> work for any dataset (convergence is another issue).
>
> The archives of r-help are quite clear on the fact that mle (and optim)
> are not vectorized, dot. I tried to trick mle by defining a likelihood
> function with a "..." parameter, and converting the c(...) into
named
> parameters afterwards, but it does not work: mle checks the list of
> parameters against "fullcoef <- formals(minuslogl)". I derived
my own
> mle function to force the call of optim() without the
"fool-proof"
> tests, but it was probably very naive because it crashes as well.
>
> The fact is that the code of mle(), as well as the R part of optim(), is
> full of (not always clean) tricks to ensure that the parameters are
> exactly as they are supposed to be. I guess the aim of this code is not
> only to slow down the function and to annoy the user, so I would like to
> get some feedback before trying to modify mle() and optim() to force
> them, in one way or another, to turn lists of vectors into lists of
> numeric values. I'm a bit afraid that the problem goes down to the
> .Internal optim function, and I will probably give up if it is the case.
>
> Thanks for any comment,
>
> Arnaud.
>
>
>
> PS: By the way, the "paranoid parameter tests" in mle / optim
lead to an
> annoying bug: the "minuslogl" function cannot have
"computed" default
> parameters, because all of them must be specified either as
"fixed" or
> "start".
>
> -------------------------------- example 3 ------------ 8<
> ----------------------
> lik1b <- function(m, v=var(data), data) {
>  N <- length(data)
>  lik.mean <- dnorm(mean(data), m, sqrt(v/N), log=T)
>  lik.var <- dchisq(N*var(data)/v, N-1, log=T)
>  return(-lik.mean - lik.var)
> }
> ml.result.1 <- mle(lik1b, start=list(m=2, v=2), fixed=list(data=data))
> ml.result.2 <- mle(lik1b, start=list(m=2), fixed=list(data=data))
>
---------------------------------------------------------------------------------------
>
> The second call to mle crashes
> "Error in validObject(.Object) :
>  invalid class "mle" object: invalid object for slot
"fullcoef" in
> class "mle": got class "list", should be or extend
class "numeric""
>
> I don't see any reason for that: a call to lik1b(m=0, data=data)
returns
> the expected value. The bug can be bypassed by using a default such as
> v=NULL, and then if(is.null(v)) v <- var(data) , but it does not seem
> very elegant...
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
Charles C. Berry                            (858) 534-2098
                                             Dept of Family/Preventive Medicine
E mailto:cberry at tajo.ucsd.edu	            UC San Diego
http://famprevmed.ucsd.edu/faculty/cberry/  La Jolla, San Diego 92093-0901

Seemingly Similar Threads

Search for more possibly parallel threads

R help - Oct 2007 - vectorized mle / optim

[R] vectorized mle / optim

[R] Passing args and data to optim. WAS: Re: vectorized mle / optim

Seemingly Similar Threads