Dear All,

I am running a simulation to estimate the coverage probability of Wald-type
confidence intervals for the parameter d in a two-parameter model (mu, d).

I obtain the MLE with optim(), method "L-BFGS-B", and then invert the Hessian
matrix to get standard errors for the two parameter estimates. However, the
Hessian sometimes becomes non-invertible (it is no longer positive definite),
and I get the following error message:

  Error in solve.default(ac$hessian) : system is computationally singular:
  reciprocal condition number = 6.89585e-21

Below is the code I am running. I would really appreciate your comments and
suggestions. Thank you.

#Start Code
#option to trace/recover errors
#options(error = recover)

#sample size and true (mu, size) of the negative binomial
n <- 30
mu <- 5
size <- 2

#true value of the dispersion parameter d = variance/mean
d.true <- 1 + mu/size
d.true

#true value of the zero-inflation index phi = 1 + log(d)/(1 - d)
z.true <- 1 + (log(d.true)/(1 - d.true))
z.true

#allocate space for the simulation vectors and set the counters
counter <- 0
iter <- 10000
lower.d <- numeric(iter)
upper.d <- numeric(iter)

#set.seed(987654321)

#beginning of simulation loop
for (i in 1:iter) {
  r.NB <- rnbinom(n, mu = mu, size = size)
  y <- sort(r.NB)
  print(y)
  print(i)

  #sample moments
  xbar <- mean(y)
  variance <- sum((y - xbar)^2)/length(y)
  dbar <- variance/xbar
  #sample proportion of zeros (for the zero-inflation index)
  pbar <- length(y[y == 0])/length(y)

  #negative log-likelihood in the (mu, d) parameterization
  NegBin <- function(th) {
    mu <- th[1]
    d <- th[2]
    n <- length(y)
    #guard the logs against non-positive arguments
    arg1 <- n*mean(y)*ifelse(mu >= 0, log(mu), 0)
    arg2 <- n*ifelse(d >= 0, log(d), 0) *
      (mean(y) + mu/ifelse((d - 1) >= 0, d - 1, 1e-7))
    aa <- numeric(max(y))
    a <- numeric(n)
    for (k in 1:n) {
      if (y[k] > 0) {
        for (j in 1:y[k]) {
          aa[j] <- ifelse((j - 1)*(d - 1)/mu > 0,
                          log(1 + (j - 1)*(d - 1)/mu), 0)
        }
        a[k] <- sum(aa[1:y[k]])
      }
    }
    arg3 <- sum(a)
    llh <- arg1 + arg2 + arg3
    if (!is.finite(llh)) llh <- 1e+20
    -llh
  }

  ac <- optim(par = c(xbar, dbar), fn = NegBin, method = "L-BFGS-B",
              hessian = TRUE, lower = c(0, 1))
  print(ac$hessian)
  muhat <- ac$par[1]
  dhat <- ac$par[2]
  zhat <- 1 + (log(dhat)/(1 - dhat))

  #observed information and Wald standard errors
  infor <- solve(ac$hessian)
  var.dhat <- infor[2, 2]
  se.dhat <- sqrt(var.dhat)
  var.muhat <- infor[1, 1]
  se.muhat <- sqrt(var.muhat)

  #delta method for the variance function mu*d: gradient is (d, mu)
  var.func <- dhat*muhat
  d.prime <- cbind(dhat, muhat)
  se.var.func <- sqrt(d.prime %*% infor %*% t(d.prime))

  lower.d[i] <- dhat - 1.96*se.dhat
  upper.d[i] <- dhat + 1.96*se.dhat
  if (lower.d[i] <= d.true & d.true <= upper.d[i])
    counter <- counter + 1
}
counter
covg.prob <- counter/iter
covg.prob
#End Code
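A hedged alternative for the likelihood itself (a sketch, not the poster's
method): since d = variance/mean = 1 + mu/size implies size = mu/(d - 1),
one can reparameterize to log(mu) and log(d - 1), which makes the box
constraints and the ad-hoc log() guards unnecessary and lets dnbinom() do
the likelihood arithmetic.

#A reparameterized sketch (an assumption: d = 1 + mu/size)
negll <- function(lth, y) {
  mu <- exp(lth[1])
  d  <- 1 + exp(lth[2])                   #d > 1 by construction
  -sum(dnbinom(y, mu = mu, size = mu/(d - 1), log = TRUE))
}

set.seed(1)
y   <- rnbinom(30, mu = 5, size = 2)      #one simulated sample
st  <- c(log(mean(y)), log(max(var(y)/mean(y) - 1, 0.1)))  #moment starts
fit <- optim(st, negll, y = y, method = "BFGS", hessian = TRUE)
fit$par                                   #estimates on the log scale
sqrt(diag(solve(fit$hessian)))            #SEs on the log scale; use the
                                          #delta method to get SEs for d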
I have not really looked into the details of the lengthy and almost
unreadable code below. In any case, there are good reasons why numerical
software typically uses Fisher scoring / IWLS to fit GLMs. And if your
matrix is that "singular", even the common numerical tricks may not save the
day anymore: a reciprocal condition number of 7e-21 is very close to exact
singularity!

Uwe Ligges

On 02.09.2011 21:33, tzaihra at alcor.concordia.ca wrote:
> [original message and code snipped; see above]
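For instance, a minimal sketch of that IWLS route (assuming an
intercept-only negative binomial fit is what is wanted): MASS::glm.nb
estimates the mean by iteratively reweighted least squares and the shape
theta (= size) by maximum likelihood, and reports standard errors for both.

#A minimal IWLS sketch, assuming an intercept-only model
library(MASS)
set.seed(1)
y <- rnbinom(30, mu = 5, size = 2)
fit <- glm.nb(y ~ 1)
summary(fit)$coefficients                  #log(mu) estimate and its SE
c(theta = fit$theta, SE.theta = fit$SE.theta)
#d = 1 + mu/theta then follows by the delta method from (mu, theta)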
Unless you are supplying analytic Hessian code, you are almost certainly
getting an approximation. Worse, if you do not provide gradients, the
approximate Hessian is the result of two levels of differencing, so you
should expect some loss of precision.

Moreover, if your estimate of the optimum is a little bit off, or the
optimizer has terminated (algorithms converge; programs terminate) at a
point that is not an optimum, there is no reason the Hessian should be
positive definite.

The optimx package uses the Jacobian of the gradient when an analytic
gradient is available, which drops the differencing to one level. Even
better is to code the Hessian analytically, but that is messy and tedious in
most cases.

Best, JN

On 09/03/2011 06:00 AM, r-help-request at r-project.org wrote:
> [digest copy of the original message and code snipped; see above]
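A sketch of that one-level-of-differencing idea with numDeriv (my
illustration, applied to the standard (mu, size) negative binomial rather
than the poster's parameterization): given an analytic gradient, the Hessian
is the Jacobian of that gradient.

#One level of differencing instead of two (an illustration, not the
#poster's model): nll is the negative log-likelihood in (mu, k = size),
#ngr its analytic gradient.
library(numDeriv)
set.seed(1)
y <- rnbinom(30, mu = 5, size = 2)
nll <- function(p) -sum(dnbinom(y, mu = p[1], size = p[2], log = TRUE))
ngr <- function(p) {
  mu <- p[1]; k <- p[2]
  -c(sum(y/mu - (k + y)/(k + mu)),
     sum(digamma(y + k) - digamma(k) + log(k) + 1 -
         log(k + mu) - (k + y)/(k + mu)))
}
fit <- optim(c(mean(y), 2), nll, ngr, method = "L-BFGS-B",
             lower = c(1e-8, 1e-8))
H1 <- jacobian(ngr, fit$par)   #one level of differencing
H2 <- hessian(nll, fit$par)    #two levels, from the function alone
sqrt(diag(solve(H1)))          #standard errors of (mu, size)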
I wonder if your code is correct? I ran your script until an error was
reported. The data set of 30 observations was

  0 0 1 3 3 3 4 4 4 4 5 5 5 5 5 7 7 7 7 7 7 8 9 10 11 12 12 12 15 16

I created a tiny AD Model Builder program to do MLE on it:

DATA_SECTION
  init_int nobs
  init_vector y(1,nobs)
PARAMETER_SECTION
  init_number log_mu
  init_number log_alpha
  sdreport_number mu
  sdreport_number tau
  objective_function_value f
PROCEDURE_SECTION
  mu = exp(log_mu);
  tau = 1.0 + exp(log_alpha);
  for (int i=1; i<=nobs; i++)
  {
    f -= log_negbinomial_density(y(i), mu, tau);
  }

It converged quickly. The eigenvalues of the Hessian were 4.711089774 and
78.27632341, and the estimates and standard deviations of the parameters mu
and tau (where tau is the variance divided by the mean) were

  index  name  value       std dev
      3  mu    6.6000e+00  7.7318e-01
      4  tau   2.7173e+00  7.8944e-01

This was all so simple that I suspect your (rather difficult to read) R code
is wrong; otherwise R must really suck at this kind of problem.

Dave
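For comparison, a hedged cross-check in R on the same 30 observations (my
addition, not part of Dave's post): MASS::fitdistr fits the (size, mu)
parameterization by maximum likelihood and reports standard errors, and
Dave's tau equals 1 + mu/size.

#Cross-check in R on the same data
library(MASS)
y <- c(0, 0, 1, 3, 3, 3, 4, 4, 4, 4, 5, 5, 5, 5, 5,
       7, 7, 7, 7, 7, 7, 8, 9, 10, 11, 12, 12, 12, 15, 16)
fit <- fitdistr(y, "negative binomial")
fit                                        #estimates with standard errors
with(as.list(fit$estimate), 1 + mu/size)   #Dave's tau = variance/mean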
Regarding the numerical calculation of the Hessian matrix: I have found
numDeriv::hessian to often be more accurate than the Hessian returned by
optim.

Best,
Giovanni Petris

On Sat, 2011-09-03 at 13:39 -0400, John C Nash wrote:
> [John's reply and the quoted original message snipped; see above]
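A sketch of that substitution in the poster's loop (assuming the objects
NegBin, xbar, and dbar from the original post are in scope):

#Recompute the Hessian at optim's solution with numDeriv, then invert
library(numDeriv)
ac <- optim(par = c(xbar, dbar), fn = NegBin, method = "L-BFGS-B",
            hessian = FALSE, lower = c(0, 1))
H <- hessian(NegBin, ac$par)   #usually more accurate than ac$hessian
infor <- solve(H)
se <- sqrt(diag(infor))        #standard errors of (muhat, dhat)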
Yes, numDeriv::hessian is very accurate. So I normally take the output from
the optimizer, *if it is a local optimum*, apply numDeriv::hessian to it,
and then compute the standard errors from that.

However, it is important to know that you have actually obtained a local
optimum. You need the gradient and Hessian to verify this: the first- and
second-order KKT conditions. This is tricky in *constrained* optimization
problems. If you have constraints, and at least one of them is active at the
best parameter estimate, then the gradient of the original objective
function need not be close to zero, and its Hessian is also incorrect. If
you employ an augmented Lagrangian approach (see alabama::auglag), you can
obtain the correct gradient and Hessian at the best parameter estimate.
These correspond to the modified objective function, which includes a
Lagrangian term augmented by a quadratic penalty term, and they can be used
to check the KKT conditions.

Here is a simple example:

require(alabama)

fr <- function(x) {   #Rosenbrock banana function
  x1 <- x[1]
  x2 <- x[2]
  100 * (x2 - x1 * x1)^2 + (1 - x1)^2
}

grr <- function(x) {  #gradient of 'fr'
  x1 <- x[1]
  x2 <- x[2]
  c(-400 * x1 * (x2 - x1 * x1) - 2 * (1 - x1),
    200 * (x2 - x1 * x1))
}

p0 <- c(0, 0)

#bound-constrained fit: the upper bound x1 <= 0.9 is active at the solution
ans1 <- optim(par = p0, fn = fr, gr = grr, upper = c(0.9, Inf),
              method = "L-BFGS-B", hessian = TRUE)
grr(ans1$par)   #this will not be close to zero
ans1$hessian

#using an augmented Lagrangian optimizer with the same constraint
hcon <- function(x) 0.9 - x[1]
hcon.jac <- function(x) matrix(c(-1, 0), 1, 2)
ans2 <- auglag(par = p0, fn = fr, gr = grr, hin = hcon, hin.jac = hcon.jac)
ans2$gradient   #this will be close to zero
ans2$hessian

However, it is not known whether the standard errors obtained from this
Hessian are asymptotically valid.

Best,
Ravi.

-------------------------------------------------------
Ravi Varadhan, Ph.D.
Assistant Professor, Division of Geriatric Medicine and Gerontology
School of Medicine
Johns Hopkins University

Ph. (410) 502-2619
email: rvaradhan@jhmi.edu
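Relatedly, a short sketch (my addition, reusing fr and grr from the example
above): for an unconstrained or interior optimum, the optimx package that
John mentioned can report the KKT checks directly, with kkt1 indicating a
near-zero gradient and kkt2 a positive-definite Hessian.

#KKT diagnostics at an interior optimum via optimx
library(optimx)
res <- optimx(par = c(-1.2, 1), fn = fr, gr = grr, method = "BFGS",
              control = list(kkt = TRUE))
res[, c("kkt1", "kkt2")]   #both should be TRUE at a genuine local minimum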