Hi,
I would really appreciate if I could get some help here. I'm using nlm to
minimize my negative log likelihood function. What I did is as follows:
My log likelihood function (it returns negative log likelihood) with
'gradient' attribute defined inside as follows:
# ==========Method definition=====================logLikFunc3 <-
function(sigma, object, totalTime) {
y <- as.matrix(object at data$output[1:totalTime,1]);
x <- as.matrix(object at data$input[1:totalTime,]);
# compute necessary matrices
M <- as.matrix(object at model$M);
P <- diag(sigma*sigma);
A <- AMatrix(totalTime, M, object at data$input[1:totalTime,]);
Q <- IMatrix(totalTime)+A %*% outerM(IMatrix(totalTime-1),P) %*% t(A);
invQ <- solve(Q,IMatrix(dim(Q)[1]));
xM <- matrix(rep(0, dim(M)[2]*totalTime), ncol=dim(M)[2],
nrow=totalTime);
for (i in 1:totalTime) {
xM[i,] <- x[i,] %*% powerM(M, -totalTime+i);
}
tmp <- solve((t(xM) %*% invQ %*% xM), IMatrix(dim(xM)[2]));
Bt <- (tmp %*% t(xM)) %*% (invQ %*% y);
N <- IMatrix(totalTime)-(xM %*% tmp %*% t(xM) %*% invQ);
sigma2 <- (1/totalTime) * t(y- xM %*% Bt)%*% invQ %*% (y- xM %*% Bt);
# log likelihood function
loglik <-
-0.5*log(abs(det(diag(rep(sigma2,totalTime)))))-0.5*log(abs(det(Q)))-
(0.5/sigma2)* (t(y- (xM%*% Bt)) %*% invQ %*% (y-(xM %*% Bt)));
sgm <- sigma;
# gradients eq. (4.16)
gr <- function(sgm) {
gradVecs <- c();
# sgm <- c(sigma1, sigma2);
sgm <- sgm*sgm;
for (i in 1:length(sgm)) {
Eij <- matrix(rep(0, length(sgm)^2), nrow=length(sgm),
ncol=length(sgm));
Eij[i,i] <- 1.0;
# trace term
term1 <- -sum(diag((invQ %*% A) %*%
outerM(IMatrix(totalTime-1),Eij) %*% t(A)));
# very long term
term2 <- (1/totalTime)*solve((t(y) %*% t(N) %*% invQ %*% y),
IMatrix(dim(y)[2]));
term3 <- (t(y) %*% t(N) %*% invQ %*% A) %*%
outerM(IMatrix(totalTime-1),Eij) %*% (t(A) %*% invQ %*% N %*% y);
gradVecs <- -1*c(gradVecs, term1+ (term2 %*% term3));
} # end for
print(paste("Gradient has length:", length(gradVecs)));
return(gradVecs);
}
res <- -loglik;
attr(res, "gradient") <- gradVecs;
return(res);
}
#=========end method definition====================================
Then when I call the nlm on this function, i.e.
nlm(f=logLikFunc3, p=as.numeric(c(1,1)), object=this, totalTime=200,
print.level=2)
It complains that my analytic gradient returns vector length different from
number of my unknowns. In this case, I tried print the length of gradient vector
that I returned (as you could see in the code). It has the same length as my
input parameter vectors. Did I do anything wrong here?
Also, I would like to be able to put some constraints on this optimization as
well. I tried constrOptim with:
ui <- diag(c(1,1));
ci <- matrix(rep(0,2), ncol=1, nrow=2);
using the same parameters passed to nlm above. constrOptim gives me an error
that initial value is in infeasible region which I don't quite understand.
As my constraints simply says that the two parameters must be greater than zero.
My assigned initial values are both 1. So it should be ok. Any help would be
really appreciated. Thank you.
- adschai