Hello,
I am using constrOptim to maximize a likelihood function (the values of my
parameter vector must be between zero and one and must sum up to <=1). I am
getting the error 'initial value in 'vmmin' is not finite'.
I've tracked it down to a problem in the function 'R' defined within
the constrOptim function. It is performing a check on my values, and its not
passing the check, and therefore returning NaN. This is not my initial value,
but rather during an internal iteration.
This is the function defined in constrOptim (the following is code from
debugging it)
debugging in: R(theta, theta.old)
debug: {
ui.theta <- ui %*% theta
gi <- ui.theta - ci
if (any(gi < 0))
return(NaN)
gi.old <- ui %*% theta.old - ci
bar <- sum(gi.old * log(gi) - ui.theta)
if (!is.finite(bar))
bar <- -Inf
f(theta, ...) - mu * bar
}
I see that I am getting the gi values that are just machine error below 0
(-8.881784e-16):
Browse[3]> ui.theta
[,1]
[1,] -1
[2,] 1
Browse[3]> ci
[1] -1 0
Browse[3]> gi
[,1]
[1,] -8.881784e-16
[2,] 1.000000e+00
I'm a bit surprised that constrOptim wouldn't be robust to that. I'm
wondering what I can do to get around this, if anything. Is this a sign that my
function that I defined is not behaving correctly in some way (e.g. should I be
handling machine error in the function I defined?). I would note in my example
below, I am only optimizing over a one-dimensional parameter, for which I
understand optim is not optimal, but my function is general and in other
situations it could be a vector of length greater than 1.
I give an example below, as well as my function and session info. The example is
pathological (x, the successes, are equal to m the trials) but I'm running
simulations, and this is being created in my simulations. When I run a large set
of simulations (over many different parameters) I am actually getting about half
the time an error like this, and so I don't know if it is always a
pathological example like this. This is just an example I found where I know it
failed.
Thanks for your help,
Elizabeth Purdom
Example data:> x <- c(12, 17, 9, 15, 15, 9, 15, 11, 12, 7)
> m <- c(12,17, 9, 15, 15, 9, 15, 11, 12, 7)
> AMat<-matrix(c(1,0,0,1),byrow=T,nrow=2,ncol=2)
> eventTiming(x, m, history = AMat)
Error message:
Error in optim(theta.old, fun, gradient, control = control, method = method, :
initial value in 'vmmin' is not finite
Enter a frame number, or 0 to exit
1: eventTiming(x, m, history = AMat)
2: eventTiming.R#84: constrOptim(theta = init, f = neglik, grad = neggr, ui =
3: eventTiming.R#84: optim(theta.old, fun, gradient, control = control, method
> sessionInfo()
R version 2.15.0 (2012-03-30)
Platform: x86_64-pc-linux-gnu (64-bit)
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=C LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
loaded via a namespace (and not attached):
[1] tools_2.15.0
My function:
eventTiming<-function(x,m,history,
exactAllele=FALSE,normCont=0,verbose=FALSE,coverageCutoff=1,minMutations=10){
A<-history
if(is.null(dim(A))) stop("'history' should be a matrix of size
nCopies x (nEvents +1)")
nCopies<-nrow(A)
K<-ncol(A)-1
possAlleles<-(1:nCopies)/nCopies*(1-normCont)
nMuts<-length(x)
if(length(m)!=nMuts) stop("x and m must be of same length")
if(coverageCutoff<1) stop("coverageCutoff must be at least 1")
if(any(m<coverageCutoff)){
if(verbose) warning("some mutations have no reads, will be ignored")
nBelowCutoff<-length(which(m<coverageCutoff))
nZero<-length(which(m==0))
x<-x[which(m>=coverageCutoff)]
m<-m[which(m>=coverageCutoff)]
}
else{
nBelowCutoff<-0
nZero<-0
}
if(length(m)<minMutations){
if(length(normCont)>0){
if(length(possAlleles)!=K+1) stop("invalid length for possible Allele
Frequency values")
if(is.null(names(possAlleles)))
names(possAlleles)<-paste("AlleleFreq",1:length(possAlleles),sep="")
return(list(pi=rep(NA,length=K+1),alleleSet=possAlleles,optimOut=NA,nBelowCutoff=nBelowCutoff,nZero=nZero,nMut=length(m)))
}
else
return(list(pi=rep(NA,length=K+1),alleleSet=NA,optimOut=NA,nBelowCutoff=nBelowCutoff,nZero=nZero,nMut=length(m)))
}
###Make 'weight matrix' of P(X=X[i]|X>0, P=p(k,s)) of size nStages x
nCopies for each x --
###an array that is nMuts x nCopies of the prob of X given a certain allele
probX<-function(p){
dbinom(x=x,size=m,prob=p)
}
wtMat<-sapply(possAlleles,probX) #nMuts x nAlleles
if(is.null(dim(wtMat))){
if(length(x)==1) wtMat<-matrix(wtMat,nrow=1)
else stop("Programming error -- single row wtMat")
}
###Need to convert so in terms of pi[-0]
transMat<-rbind(rep(-1,K),diag(K))
Aid<-A%*%transMat
numGr<-wtMat %*% Aid #N x K matrix
init<-rep(1/(K+1),length=K) #equal distribution; could perhaps choose better
init value
getProbFromTheta<-function(theta){
return(c(1-sum(theta),theta))
}
neglik<-function(theta){
probDist<-as.vector(A%*%getProbFromTheta(theta)) #should be a vector of
probabilities
if(length(probDist)!=K+1) stop("Programming error -- wrong length")
if(any(probDist< -1e-10)) stop("Programming error -- negative
probabilities")
if(any(probDist> 1+1e-10)) stop("Programming error -- >1
probabilities")
-sum(log(rowSums(sweep(wtMat,2,probDist,FUN="*"))))
}
#numerator of grad because linear in pi -- same regardless of value of theta
neggr<-function(theta){
#denominator:
probDist<-as.vector(A%*%getProbFromTheta(theta))
if(length(probDist)!=K+1) stop("Programming error -- wrong length")
if(any(probDist< -1e-10)) stop("Programming error -- negative
probabilities")
if(any(probDist> 1+1e-10)) stop("Programming error -- >1
probabilities")
denom<-rowSums(sweep(wtMat,2,probDist,FUN="*")) #should be a
vector length equal to N
-sum(sweep(numGr,1,denom,"/"))
}
lengthOfTheta<-K
uiL1<-diag(rep(-1,lengthOfTheta),nrow=lengthOfTheta,ncol=lengthOfTheta)
ciL1<-rep(1,lengthOfTheta)
uiGr0<-diag(rep(1,lengthOfTheta),nrow=lengthOfTheta,ncol=lengthOfTheta)
ciGr0<-rep(0,lengthOfTheta)
ui<-rbind(uiL1,uiGr0)
ci<- c(-ciL1,-ciGr0)
#The feasible region is defined by ui %*% theta - ci >= 0.
out<-constrOptim(theta=init,f = neglik, grad=neggr, ui=ui, ci=ci)
return(list(pi=getProbFromTheta(out$par),alleleSet=possAlleles,optimOut=out,nBelowCutoff=nBelowCutoff,nZero=nZero,nMut=length(m)))
}