thr3ads.net - R help - [R] nlminb() - how do I constrain the parameter vector properly? [Oct 2013]

If this information is useful, please help other people find it:
Share via:

Steven LeBlanc

2013-Oct-20 22:01 UTC

[R] nlminb() - how do I constrain the parameter vector properly?

Greets,

I'm trying to use nlminb() to estimate the parameters of a bivariate normal
sample and during one of the iterations it passes a parameter vector to the
likelihood function resulting in an invalid covariance matrix that causes
dmvnorm() to throw an error. Thus, it seems I need to somehow communicate to
nlminb() that the final three parameters in my parameter vector are used to
construct a positive semi-definite matrix, but I can't see how to achieve
this using the constraint mechanism provided. Additional details are provided in
the code below.

Suggestions?

Best Regards,
Steven

Generate the data set I'm using:

mu<-c(1,5)
sigma<-c(1,2,2,6)
dim(sigma)<-c(2,2)
set.seed(83165026)
sample.full<-sample.deleted<-mvrnorm(50,mu,sigma)
delete.one<-c(1,5,13,20)
delete.two<-c(3,11,17,31,40,41)
sample.deleted[delete.one,1]<-NA
sample.deleted[delete.two,2]<-NA
missing<-c(delete.one,delete.two)
complete<-sample.deleted[!(is.na(sample.deleted[,1]) |
is.na(sample.deleted[,2])),]
deleted<-sample.deleted[missing,]

Try to estimate the parameters of the data set less the deleted values:

exact<-function(theta,complete,deleted){
    one.only<-deleted[!(is.na(deleted[,1])),1]
    two.only<-deleted[!(is.na(deleted[,2])),2]
    u<-c(theta[1],theta[2])
    sigma<-c(theta[3],theta[5],theta[5],theta[4])
    dim(sigma)<-c(2,2)
    -sum(log(dmvnorm(x=complete,mean=u,sigma=sigma)))-
        sum(log(dnorm(one.only,u[1],sigma[1,1])))-
            sum(log(dnorm(two.only,u[2],sigma[2,2])))
}
nlminb(start=c(0,0,1,1,0),objective=exact,complete=complete,deleted=deleted,control=list(trace=1))

Escape and it stops at Iteration 9 on my machine.

David Winsemius

2013-Oct-21 00:19 UTC

head link

[R] nlminb() - how do I constrain the parameter vector properly?

On Oct 20, 2013, at 3:01 PM, Steven LeBlanc wrote:
> Greets,
> 
> I'm trying to use nlminb() to estimate the parameters of a bivariate
normal sample and during one of the iterations it passes a parameter vector to
the likelihood function resulting in an invalid covariance matrix that causes
dmvnorm() to throw an error. Thus, it seems I need to somehow communicate to
nlminb() that the final three parameters in my parameter vector are used to
construct a positive semi-definite matrix, but I can't see how to achieve
this using the constraint mechanism provided. Additional details are provided in
the code below.
> 
I recently noticed that teh dmvnorn function in package mixtools doe not throw
an error when given a non-positive definite varaince-covariance matrix. Whether
it should be doing so might be up for debate. Peter Dalgaard seemed to think
that any self respecting such function _should_ be throwing NaN's back at
you.

The code it was using is rather compact:
> mixtools::dmvnormfunction (y, mu = NULL, sigma = NULL) 
{
    exp(logdmvnorm(y, mu = mu, sigma = sigma))
}
<environment: namespace:mixtools>
> mixtools::logdmvnormfunction (y, mu = NULL, sigma = NULL) 
{
    if (is.vector(y)) 
        y <- matrix(y, nrow = 1)
    d <- ncol(y)
    if (!is.null(mu)) 
        y <- sweep(y, 2, mu, "-")
    if (is.null(sigma)) 
        sigma <- diag(d)
    k <- d * 1.83787706640935
    a <- qr(sigma)
    logdet <- sum(log(abs(diag(a$qr))))
    if (nrow(y) == 1) 
        mahaldist <- as.vector(y %*% qr.solve(a, t(y)))
    else mahaldist <- rowSums((y %*% qr.solve(a)) * y)
    -0.5 * (mahaldist + logdet + k)
}

I noticed that you were logging the dmvnorm values and so I wondered also if
going directly to that function might save some time. (Note that I am way out of
my mathematical dept here. Caveat emptor, maxima caveat emptor).

-- 
David.
> Suggestions?
> 
> Best Regards,
> Steven
> 
> Generate the data set I'm using:
> 
> mu<-c(1,5)
> sigma<-c(1,2,2,6)
> dim(sigma)<-c(2,2)
> set.seed(83165026)
> sample.full<-sample.deleted<-mvrnorm(50,mu,sigma)
> delete.one<-c(1,5,13,20)
> delete.two<-c(3,11,17,31,40,41)
> sample.deleted[delete.one,1]<-NA
> sample.deleted[delete.two,2]<-NA
> missing<-c(delete.one,delete.two)
> complete<-sample.deleted[!(is.na(sample.deleted[,1]) |
is.na(sample.deleted[,2])),]
> deleted<-sample.deleted[missing,]
> 
> Try to estimate the parameters of the data set less the deleted values:
> 
> exact<-function(theta,complete,deleted){
>    one.only<-deleted[!(is.na(deleted[,1])),1]
>    two.only<-deleted[!(is.na(deleted[,2])),2]
>    u<-c(theta[1],theta[2])
>    sigma<-c(theta[3],theta[5],theta[5],theta[4])
>    dim(sigma)<-c(2,2)
>    -sum(log(dmvnorm(x=complete,mean=u,sigma=sigma)))-
>        sum(log(dnorm(one.only,u[1],sigma[1,1])))-
>            sum(log(dnorm(two.only,u[2],sigma[2,2])))
> }
>
nlminb(start=c(0,0,1,1,0),objective=exact,complete=complete,deleted=deleted,control=list(trace=1))
> 
> Escape and it stops at Iteration 9 on my machine.
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
David Winsemius
Alameda, CA, USA

William Dunlap

2013-Oct-21 01:41 UTC

head link

[R] nlminb() - how do I constrain the parameter vector properly?

Do you mean that your objective function (given to nlminb) parameterized
a positive definite matrix, P, as the elements in its upper (or lower) triangle?
If so, you could reparameterize it by the non-zero (upper triangular) elements
of the Choleski decomposition, C, of P.  Compute P as crossprod(C), compute
the initial estimate of C as chol(P).

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com

> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at
r-project.org] On Behalf
> Of Steven LeBlanc
> Sent: Sunday, October 20, 2013 3:02 PM
> To: r-help at R-project.org
> Subject: [R] nlminb() - how do I constrain the parameter vector properly?
> 
> Greets,
> 
> I'm trying to use nlminb() to estimate the parameters of a bivariate
normal sample and
> during one of the iterations it passes a parameter vector to the likelihood
function
> resulting in an invalid covariance matrix that causes dmvnorm() to throw an
error. Thus,
> it seems I need to somehow communicate to nlminb() that the final three
parameters in
> my parameter vector are used to construct a positive semi-definite matrix,
but I can't see
> how to achieve this using the constraint mechanism provided. Additional
details are
> provided in the code below.
> 
> Suggestions?
> 
> Best Regards,
> Steven
> 
> Generate the data set I'm using:
> 
> mu<-c(1,5)
> sigma<-c(1,2,2,6)
> dim(sigma)<-c(2,2)
> set.seed(83165026)
> sample.full<-sample.deleted<-mvrnorm(50,mu,sigma)
> delete.one<-c(1,5,13,20)
> delete.two<-c(3,11,17,31,40,41)
> sample.deleted[delete.one,1]<-NA
> sample.deleted[delete.two,2]<-NA
> missing<-c(delete.one,delete.two)
> complete<-sample.deleted[!(is.na(sample.deleted[,1]) |
is.na(sample.deleted[,2])),]
> deleted<-sample.deleted[missing,]
> 
> Try to estimate the parameters of the data set less the deleted values:
> 
> exact<-function(theta,complete,deleted){
>     one.only<-deleted[!(is.na(deleted[,1])),1]
>     two.only<-deleted[!(is.na(deleted[,2])),2]
>     u<-c(theta[1],theta[2])
>     sigma<-c(theta[3],theta[5],theta[5],theta[4])
>     dim(sigma)<-c(2,2)
>     -sum(log(dmvnorm(x=complete,mean=u,sigma=sigma)))-
>         sum(log(dnorm(one.only,u[1],sigma[1,1])))-
>             sum(log(dnorm(two.only,u[2],sigma[2,2])))
> }
>
nlminb(start=c(0,0,1,1,0),objective=exact,complete=complete,deleted=deleted,control=l
> ist(trace=1))
> 
> Escape and it stops at Iteration 9 on my machine.
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

Prof J C Nash (U30A)

2013-Oct-21 12:47 UTC

head link

[R] nlminb() - how do I constrain the parameter vector properly?

This is one area where more internal communication between the objective 
function (inadmissible inputs) and optimizer (e.g., Quasi-Newton) is 
needed. This is NOT done at the moment in R, nor in most software. An 
area for R&D. In Nash and Walker-Smith (1987) we did some of this in 
BASIC back in mid-80s. Still trying to redo that for R, but it won't be 
done quickly due to the much bigger infrastructure of R.

The "trick" with using a large value or Inf (which sometimes causes
other errors) usually slows the optimization, whereas communicating that 
the objective is inadmissible in a line search can often be simply a 
shortening of the step size.

JN

On 13-10-21 06:00 AM, r-help-request at r-project.org
wrote:> Message: 34
> Date: Mon, 21 Oct 2013 05:56:45 -0400
> From: Mark Leeds<markleeds2 at gmail.com>
> To: Steven LeBlanc<oreslag at gmail.com>
> Cc:"r-help at R-project.org"  <r-help at r-project.org>
> Subject: Re: [R] nlminb() - how do I constrain the parameter vector
> 	properly?
> Message-ID:
> 	<CAHz+bWYEtvZjiCCauGVxstGcqmF6ENw0N3MMb7jfa3OkYkBTKA at
mail.gmail.com>
> Content-Type: text/plain
>
> my mistake. since nlminb is minimizing, it should be +Inf  ( so that the
> likelihood
> is large ) as you pointed out. Note  that this approach is a heuristic and
> may  not work all the time.

William Dunlap

2013-Oct-21 14:52 UTC

head link

[R] nlminb() - how do I constrain the parameter vector properly?

Try defining the function
    theta345toSigma <- function(theta) {
       cholSigma <- cbind(c(theta[3], 0), c(theta[4], theta[5]))
       crossprod(cholSigma) # t(cholSigma) %*% cholSigma)
    }
This creates a positive definite matrix for any theta (and
any positive definite matrix has such a representation, generally
more than one).  It is like using the square root of a quantity
in the optimizer when you know the quantity must be non-negative.

Then change your
    sigma <- c(theta[3],theta[5],theta[5],theta[4])
    dim(sigma) <- c(2, 2)
to
    sigma <- theta345toSigma(theta)

If one of your variances is near 0 the optimizer may run into
trouble at saddlepoints.  Others may be able to help better
with that issue.

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com

> -----Original Message-----
> From: Steven LeBlanc [mailto:oreslag at gmail.com]
> Sent: Sunday, October 20, 2013 9:35 PM
> To: William Dunlap
> Subject: Re: [R] nlminb() - how do I constrain the parameter vector
properly?
> 
> 
> On Oct 20, 2013, at 6:41 PM, William Dunlap <wdunlap at tibco.com>
wrote:
> 
> > Do you mean that your objective function (given to nlminb)
parameterized
> > a positive definite matrix, P, as the elements in its upper (or lower)
triangle?
> > If so, you could reparameterize it by the non-zero (upper triangular)
elements
> > of the Choleski decomposition, C, of P.  Compute P as crossprod(C),
compute
> > the initial estimate of C as chol(P).
> >
> > Bill Dunlap
> > Spotfire, TIBCO Software
> > wdunlap tibco.com
> 
> Hi Bill,
> 
> I've clipped out the superfluous code to leave the objective function
and call to nlminb()
> below.
> 
> I'm using the first two parameters to construct the vector of means
'u' for a bivariate
> normal, and the final three parameters to construct the corresponding
covariance matrix
> 'sigma'. Both are required by dmvnorm(). In summary, nlminb()
required a vector of
> parameters, so I supplied the number of parameters I needed nlminb() to
optimize and
> simply built the required formats within the function.
> 
> It didn't occur to me until I saw the error that nlminb() would have no
way of knowing the
> proper boundaries of the parameter space unless there is some way to
communicate the
> constraints. nlminb() implements simple box constraints, but I don't
see a way to
> communicate "parameters 3, 4, and 5 must satisfy 3*4 - 5^2 > 0.
> 
> Regarding your suggestion, I don't think I understand. Might you
elaborate?
> 
> Thanks & Best Regards,
> Steven
> 
> > exact<-function(theta,complete,deleted){
> >>
> >>    u<-c(theta[1],theta[2])
> >>    sigma<-c(theta[3],theta[5],theta[5],theta[4])
> >>    dim(sigma)<-c(2,2)
> >>    -sum(log(dmvnorm(x=complete,mean=u,sigma=sigma)))-
> >>        sum(log(dnorm(one.only,u[1],sigma[1,1])))-
> >>            sum(log(dnorm(two.only,u[2],sigma[2,2])))
> >> }
> >>
>
nlminb(start=c(0,0,1,1,0),objective=exact,complete=complete,deleted=deleted,control=l
> >> ist(trace=1))

William Dunlap

2013-Oct-21 16:28 UTC

head link

[R] nlminb() - how do I constrain the parameter vector properly?

[I added r-help to the cc again.  Please keep the replies on the list as
there are others who can contribute to or learn from them.]
> I also learned you have to be very careful with the starting value, as the
simple identity
> matrix becomes singular under the transformation.
That is why I suggested using the non-zero entries in chol(sigmaStart), where
sigmaStart is your initial estimate of the variance matrix, as the initial
values of theta[3:5].

I should have said that crossProd(x) gives you a positive semidefinite matrix,
not positive definite.  When I said that variances near 0 would cause problems
I meant that a singular covariance matrix would cause problems.


Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com

> -----Original Message-----
> From: Steven LeBlanc [mailto:oreslag at gmail.com]
> Sent: Monday, October 21, 2013 9:21 AM
> To: William Dunlap
> Subject: Re: [R] nlminb() - how do I constrain the parameter vector
properly?
> 
> 
> On Oct 21, 2013, at 7:52 AM, William Dunlap <wdunlap at tibco.com>
wrote:
> 
> > Try defining the function
> >    theta345toSigma <- function(theta) {
> >       cholSigma <- cbind(c(theta[3], 0), c(theta[4], theta[5]))
> >       crossprod(cholSigma) # t(cholSigma) %*% cholSigma)
> >    }
> > This creates a positive definite matrix for any theta (and
> > any positive definite matrix has such a representation, generally
> > more than one).  It is like using the square root of a quantity
> > in the optimizer when you know the quantity must be non-negative.
> >
> > Then change your
> >    sigma <- c(theta[3],theta[5],theta[5],theta[4])
> >    dim(sigma) <- c(2, 2)
> > to
> >    sigma <- theta345toSigma(theta)
> >
> > If one of your variances is near 0 the optimizer may run into
> > trouble at saddlepoints.  Others may be able to help better
> > with that issue.
> >
> > Bill Dunlap
> > Spotfire, TIBCO Software
> > wdunlap tibco.com
> 
> Hi Bill,
> 
> I tried your suggestion and the optimizer produces a result, but it seems
substantially far
> from the anticipated result and from the result obtained when I use Inf as
a return value
> for an invalid covariance matrix. Perhaps it would work if I made other
adjustments to
> account for the 'bias' (using this term very loosely) induced by
changing an invalid
> parameter vector into something valid but incorrect?
> 
> I also learned you have to be very careful with the starting value, as the
simple identity
> matrix becomes singular under the transformation.
> 
> > theta<-c(0,0,1,1,0)
> > cholSigma<-cbind(c(theta[3], 0), c(theta[4], theta[5]))
> > sigma<-crossprod(cholSigma)
> > sigma
>      [,1] [,2]
> [1,]    1    1
> [2,]    1    1
> 
> In any case, the Inf trick works for now. I was asking about
'adjustments' strictly out of
> curiosity. Code and results are below.
> 
> Best Regards,
> Steven
> 
> > exact<-function(theta,complete,deleted){
> +     one.only<-deleted[!(is.na(deleted[,1])),1]
> +     two.only<-deleted[!(is.na(deleted[,2])),2]
> +     u<-c(theta[1],theta[2])
> +     sigma<-c(theta[3],theta[5],theta[5],theta[4])
> +     dim(sigma)<-c(2,2)
> +     if(!(is.positive.semi.definite(sigma))){return(Inf)}
> +     -sum(log(dmvnorm(x=complete,mean=u,sigma=sigma)))-
> +         sum(log(dnorm(one.only,u[1],sigma[1,1])))-
> +             sum(log(dnorm(two.only,u[2],sigma[2,2])))
> + }
> > exact2<-function(theta,complete,deleted){
> +     one.only<-deleted[!(is.na(deleted[,1])),1]
> +     two.only<-deleted[!(is.na(deleted[,2])),2]
> +     u<-c(theta[1],theta[2])
> +     cholSigma<-cbind(c(theta[3], 0), c(theta[4], theta[5]))
> +     sigma<-crossprod(cholSigma)
> +     -sum(log(dmvnorm(x=complete,mean=u,sigma=sigma)))-
> +         sum(log(dnorm(one.only,u[1],sigma[1,1])))-
> +             sum(log(dnorm(two.only,u[2],sigma[2,2])))
> + }
> >
nlminb(start=theta.hat.em,objective=exact,complete=complete,deleted=deleted)$par
> [1] 1.2289422 5.4995271 0.9395155 4.8069068 1.8009213
> >
nlminb(start=theta.hat.em,objective=exact2,complete=complete,deleted=deleted)$par
> [1] 1.2289421 5.4995265 0.9692861 1.8579876 1.1639544

Seemingly Similar Threads

Search for more reasonably related threads

R help - Oct 2013 - nlminb() - how do I constrain the parameter vector properly?

[R] nlminb() - how do I constrain the parameter vector properly?

[R] nlminb() - how do I constrain the parameter vector properly?

[R] nlminb() - how do I constrain the parameter vector properly?

[R] nlminb() - how do I constrain the parameter vector properly?

[R] nlminb() - how do I constrain the parameter vector properly?

[R] nlminb() - how do I constrain the parameter vector properly?

Seemingly Similar Threads