Hi,

I would really appreciate some help here. I'm using nlm to minimize my negative log likelihood function. My function returns the negative log likelihood, with the 'gradient' attribute defined inside it, as follows:

# ==========Method definition=====================
logLikFunc3 <- function(sigma, object, totalTime) {
    y <- as.matrix(object@data$output[1:totalTime,1]);
    x <- as.matrix(object@data$input[1:totalTime,]);
    # compute necessary matrices
    M <- as.matrix(object@model$M);
    P <- diag(sigma*sigma);
    A <- AMatrix(totalTime, M, object@data$input[1:totalTime,]);
    Q <- IMatrix(totalTime)+A %*% outerM(IMatrix(totalTime-1),P) %*% t(A);
    invQ <- solve(Q,IMatrix(dim(Q)[1]));
    xM <- matrix(rep(0, dim(M)[2]*totalTime), ncol=dim(M)[2], nrow=totalTime);
    for (i in 1:totalTime) {
        xM[i,] <- x[i,] %*% powerM(M, -totalTime+i);
    }
    tmp <- solve((t(xM) %*% invQ %*% xM), IMatrix(dim(xM)[2]));
    Bt <- (tmp %*% t(xM)) %*% (invQ %*% y);
    N <- IMatrix(totalTime)-(xM %*% tmp %*% t(xM) %*% invQ);
    sigma2 <- (1/totalTime) * t(y- xM %*% Bt)%*% invQ %*% (y- xM %*% Bt);
    # log likelihood function
    loglik <- -0.5*log(abs(det(diag(rep(sigma2,totalTime)))))-0.5*log(abs(det(Q)))-
        (0.5/sigma2)* (t(y- (xM%*% Bt)) %*% invQ %*% (y-(xM %*% Bt)));
    sgm <- sigma;
    # gradients eq. (4.16)
    gr <- function(sgm) {
        gradVecs <- c();
        # sgm <- c(sigma1, sigma2);
        sgm <- sgm*sgm;
        for (i in 1:length(sgm)) {
            Eij <- matrix(rep(0, length(sgm)^2), nrow=length(sgm), ncol=length(sgm));
            Eij[i,i] <- 1.0;
            # trace term
            term1 <- -sum(diag((invQ %*% A) %*% outerM(IMatrix(totalTime-1),Eij) %*% t(A)));
            # very long term
            term2 <- (1/totalTime)*solve((t(y) %*% t(N) %*% invQ %*% y), IMatrix(dim(y)[2]));
            term3 <- (t(y) %*% t(N) %*% invQ %*% A) %*% outerM(IMatrix(totalTime-1),Eij) %*%
                (t(A) %*% invQ %*% N %*% y);
            gradVecs <- -1*c(gradVecs, term1+ (term2 %*% term3));
        } # end for
        print(paste("Gradient has length:", length(gradVecs)));
        return(gradVecs);
    }
    res <- -loglik;
    attr(res, "gradient") <- gradVecs;
    return(res);
}
#=========end method definition====================================

Then I call nlm on this function, i.e.

    nlm(f=logLikFunc3, p=as.numeric(c(1,1)), object=this, totalTime=200, print.level=2)

It complains that my analytic gradient returns a vector whose length differs from the number of unknowns. I tried printing the length of the gradient vector that I return (as you can see in the code), and it has the same length as my input parameter vector. Did I do anything wrong here?

Also, I would like to be able to put some constraints on this optimization. I tried constrOptim with:

    ui <- diag(c(1,1));
    ci <- matrix(rep(0,2), ncol=1, nrow=2);

using the same parameters passed to nlm above. constrOptim gives me an error that the initial value is in the infeasible region, which I don't quite understand: my constraints simply say that the two parameters must be greater than zero, and my initial values are both 1, so it should be OK.

Any help would be really appreciated. Thank you.

- adschai
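
For comparison, here is a minimal sketch of the two calls in isolation, in case it helps narrow things down. It assumes a toy quadratic objective (the names `f`, `target`, `theta0`, `fit` and `fit2` are hypothetical and have nothing to do with the likelihood above). It shows the pattern described in the nlm help page, where the gradient is computed and attached as an attribute in the same scope that returns the value, and the feasibility condition from the constrOptim help page, `ui %*% theta - ci > 0`, checked by hand before the call.

```r
# Hypothetical objective: a simple quadratic, NOT the likelihood above.
f <- function(p) {
  target <- c(2, 3)                  # assumed minimiser, for illustration only
  val <- sum((p - target)^2)
  # Gradient is computed here and attached in the same scope,
  # so it always has length(p) entries.
  attr(val, "gradient") <- 2 * (p - target)
  val
}

# nlm picks up the "gradient" attribute of the returned value.
fit <- nlm(f, p = c(1, 1))

# constrOptim: the feasible region is ui %*% theta - ci >= 0,
# and the starting value must lie strictly inside it.
ui <- diag(2)
ci <- c(0, 0)
theta0 <- c(1, 1)
feasible <- all(ui %*% theta0 - ci > 0)

# theta comes first, then the objective; grad = NULL selects Nelder-Mead.
# as.numeric() drops the gradient attribute, which is not needed here.
fit2 <- constrOptim(theta0, function(p) as.numeric(f(p)),
                    grad = NULL, ui = ui, ci = ci)
```

In this sketch `fit$estimate` and `fit2$par` should both end up close to `target`. Note that constrOptim takes the starting value as its first argument and the function as its second, which is the reverse of the nlm argument order.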