Hi , I am trying to get some estimator based on lognormal distribution when we have left,interval, and right censored data. Since, there is now avalible pakage in R can help me in this, I had to write my own code using Newton Raphson method which requires first and second derivative of log likelihood but my problem after runing the code is the estimators were too high. with this email ,I provide simple example for estimation parameters for lognormal when we have only right censored data, and I tried to estimate the parameters using three methods, Surv pakage, optim function, and Newton Raphson calculation. For the Surv pakage and Optim function, I got similar results of estimation values to the true values, but for the Newton Raphson, I got problem with estimation, where it was too high" overestimation". Is there any one cen help me to know my problem when can be or recommend me to read some good references that can be useful? Best regards. Below is the example, cur=curd=cen=cens=array(1,100) Cur=RealCur=realcensoring=realcured=array(1,20) ExpCure=Bias=RealCure=array(1,21) ################################## Z1=c(rbinom(100,1,0.5)) Z2=c(rbinom(100,1,0.5)) dat1<-data.frame(time=rlnorm( 100,2,0.8),Censored=rbinom(100,1,0.9),Cured=rbinom(100,1,.3)) dat2<-dat1[order(dat1[,1]),] # order the data # for (i in 1:10) { dat2$Cured[i+90]=0 #Long term survivors/10 individuals# dat2$Censored[i+90]=0 dat2$time[i+90]=dat2$time[90] } cens<-c(dat2$Censored) #censored status # curd<-c(dat2$Cured) #cured status # tim<-c(dat2$time) #lifetimes # X1<-c(Z1) #X1 Covariate=Age # X2<-c(Z2) #X2 Covariate=Type of treatment(1=chemo,0=radio) # L1<-length(cens) #number of censored# for (j in 1:L1) { if ((cens[j]==1)&(curd[j]==0)) {(cen[j]=1)&(cur[j]=1)} else {(cen[j]=cens[j])&(cur[j]=curd[j])} } Now, the following is my data: ####### My Data only with uncensored and right censored #################### data=data.frame(Ti=dat1$time,Censored=cen) #################### Estimation using Surv pakage ############################ library(survival) survreg(Surv(Ti, Censored)~1, data=data, dist="lognormal") ########### Seperating the data for simply using optim function and Newton Raphson #### dat2<-c(split(data[1:2],data$Censored==1)) # Split the data(cen/uncen) # tun<-c((dat2$'TRUE')$Ti) # Life times data set for uncensored # Stat.uncen<-c((dat2$'TRUE')$Censored) # uncensored Status data set # tcen<-c((dat2$'FALSE')$Ti) # Life times data set for censored # Stat.cen<-c((dat2$'FALSE')$Censored) # censoring Status data set # ##################### Estimation using optim function ################ ml= function(par){mu=par[1] s=par[2] -sum(dlnorm(tun,mu,s,log=TRUE))-sum(1-plnorm(tcen,mu,s))} Max=optim(c(0.5,0.5),ml) param=c(Max$par) ## My Problem is when I try to estimate parameters using newton raphson ### ########### intial values ############## mu=1 s=1 ############# Derivative for mu ########## F1= function(par){ mu=par[1] s=par[2] sum((log(tun)-mu)/s^2)+ sum(((1/sqrt(2*pi))* exp(-((log(tcen)-mu)^2)/2*s^2))/(s^2*(1-plnorm(tcen,mu,s)))) } ############## Derivative for sigma" s " ####### F2= function(par){ mu=par[1] s=par[2] sum((log(tun)-mu)^2/s^3 -1/s)+ sum((log(tcen)-mu)*((1/sqrt(2*pi))* exp(-((log(tcen)-mu)^2)/2*s^2))/(s^2*(1-plnorm(tcen,mu,s))))} ############### Total Function ######################## F=function(par){ F=matrix(0,nrow=2) F[1]=F1(par) F[2]=F2(par) F } ################ the Jacobian matrix, a 2 x 2 matrix ############### d11=d12=d21=d22=array() J=function(par){ j=matrix(0,ncol=2,nrow=2) # The format of J is 2 by 2# d11=0; d12=0; d21=0;d22=0 ######################## second Derivative for mu ########## d11 = function(par){ mu=par[1] s=par[2] sum(-1/s^2)-sum((1/s^2)* (((1/sqrt(2*pi))*exp(-((log(tcen)-mu)^2)/2*s^2))/s*((1-plnorm(tcen,mu,s))))* (-(log(tcen)-mu)/s +((1/sqrt(2*pi))*exp(-((log(tcen)-mu)^2)/2*s^2))/ (s*(1-plnorm(tcen,mu,s)))))} ################ Derivative for mu/s ########################################## d12= function(par){ mu=par[1] s=par[2] -sum(2*(log(tun)-mu)/s^3)-sum((1/s^2)*(((1/sqrt(2*pi))* exp(-((log(tcen)-mu)^2)/2*s^2))/(s*(1-plnorm(tcen,mu,s)))+((log(tcen)-mu)/s)* ((1/sqrt(2*pi))*exp(-((log(tcen)-mu)^2)/2*s^2))/(s*(1-plnorm(tcen,mu,s)))* (((1/sqrt(2*pi))* exp(-((log(tcen)-mu)^2)/2*s^2))/(s*(1-plnorm(tcen,mu,s)))-(log(tcen)-mu)/s)))} d21=d12 ######## second Derivative for s ########################### d22=function(par){ mu=par[1] s=par[2] sum((-3*(log(tun)-mu)^2/s^4)+1/s^2)-sum((1/s^2)*((log(tcen)-mu)/s)*(((1/sqrt(2*pi))* exp(-((log(tcen)-mu)^2)/2*s^2))/(s*(1-plnorm(tcen,mu,s)))+((1/sqrt(2*pi))* exp(-((log(tcen)-mu)^2)/2*s^2))/(s*(1-plnorm(tcen,mu,s)))+((log(tcen)-mu)/s)*(((1/sqrt(2*pi))* exp(-((log(tcen)-mu)^2)/2*s^2))/(s*(1-plnorm(tcen,mu,s))))*(((1/sqrt(2*pi))* exp(-((log(tcen)-mu)^2)/2*s^2))/(s*(1-plnorm(tcen,mu,s)))-(log(tcen)-mu)/s)))} ############ The R codes for Newton's method ##################################### j[1,1]=d11(par); j[1,2]=d12(par); j[2,1]=d21(par); j[2,2]=d22(par) j } out1=outs=array() par=c(mu,s) #### initial values ##### v=matrix(0,ncol=length(par)) for (i in 1:30) { v=solve(J(par),F(par)) par=par-v ############ here see the values are to high ########### mu[i+1]=par[1] s[i+1]=par[2] } out1=c(mu) ############ the value of parameter at each step , no convergence #################### outs=c(s) [[alternative HTML version deleted]]
On 31-08-2012, at 03:54, Salma Wafi wrote:> Hi , > I am trying to get some estimator based on lognormal distribution when we have left,interval, and right censored data. Since, there is now avalible pakage in R can help me in this, I had to write my own code using Newton Raphson method which requires first and second derivative of log likelihood but my problem after runing the code is the estimators were too high. with this email ,I provide simple example for estimation parameters for lognormal when we have only right censored data, and I tried to estimate the parameters using three methods, Surv pakage, optim function, and Newton Raphson calculation. For the Surv pakage and Optim function, I got similar results of estimation values to the true values, but for the Newton Raphson, I got problem with estimation, where it was too high" overestimation". Is there any one cen help me to know my problem when can be or recommend me to read some good references that can be useful? > Best regards. > Below is the example, > > cur=curd=cen=cens=array(1,100) > Cur=RealCur=realcensoring=realcured=array(1,20) > ExpCure=Bias=RealCure=array(1,21) > ################################## > Z1=c(rbinom(100,1,0.5)) > Z2=c(rbinom(100,1,0.5)) > dat1<-data.frame(time=rlnorm( 100,2,0.8),Censored=rbinom(100,1,0.9),Cured=rbinom(100,1,.3)) > > dat2<-dat1[order(dat1[,1]),] # order the data # > for (i in 1:10) > { > dat2$Cured[i+90]=0 #Long term survivors/10 individuals# > dat2$Censored[i+90]=0 > dat2$time[i+90]=dat2$time[90] > } > cens<-c(dat2$Censored) #censored status # > curd<-c(dat2$Cured) #cured status # > tim<-c(dat2$time) #lifetimes # > X1<-c(Z1) #X1 Covariate=Age # > X2<-c(Z2) #X2 Covariate=Type of treatment(1=chemo,0=radio) # > > L1<-length(cens) #number of censored# > for (j in 1:L1) > { > if ((cens[j]==1)&(curd[j]==0)) {(cen[j]=1)&(cur[j]=1)} > > else {(cen[j]=cens[j])&(cur[j]=curd[j])} > } > Now, the following is my data: > > ####### My Data only with uncensored and right censored #################### > data=data.frame(Ti=dat1$time,Censored=cen) > > #################### Estimation using Surv pakage ############################ > library(survival) > survreg(Surv(Ti, Censored)~1, data=data, dist="lognormal") > ########### Seperating the data for simply using optim function and Newton Raphson #### > > dat2<-c(split(data[1:2],data$Censored==1)) # Split the data(cen/uncen) # > tun<-c((dat2$'TRUE')$Ti) # Life times data set for uncensored # > Stat.uncen<-c((dat2$'TRUE')$Censored) # uncensored Status data set # > tcen<-c((dat2$'FALSE')$Ti) # Life times data set for censored # > Stat.cen<-c((dat2$'FALSE')$Censored) # censoring Status data set # > ##################### Estimation using optim function ################ > ml= function(par){mu=par[1] > s=par[2] > -sum(dlnorm(tun,mu,s,log=TRUE))-sum(1-plnorm(tcen,mu,s))} > > Max=optim(c(0.5,0.5),ml) > param=c(Max$par) > > ## My Problem is when I try to estimate parameters using newton raphson ### > > ########### intial values ############## > mu=1 > s=1 > ############# Derivative for mu ########## > F1= function(par){ mu=par[1] > s=par[2] > sum((log(tun)-mu)/s^2)+ sum(((1/sqrt(2*pi))* > exp(-((log(tcen)-mu)^2)/2*s^2))/(s^2*(1-plnorm(tcen,mu,s)))) } > ############## Derivative for sigma" s " ####### > > F2= function(par){ > mu=par[1] > s=par[2] > sum((log(tun)-mu)^2/s^3 -1/s)+ sum((log(tcen)-mu)*((1/sqrt(2*pi))* > exp(-((log(tcen)-mu)^2)/2*s^2))/(s^2*(1-plnorm(tcen,mu,s))))} > ############### Total Function ######################## > F=function(par){ > F=matrix(0,nrow=2) > F[1]=F1(par) > F[2]=F2(par) > F } > ################ the Jacobian matrix, a 2 x 2 matrix ############### > d11=d12=d21=d22=array() > J=function(par){ > j=matrix(0,ncol=2,nrow=2) > # The format of J is 2 by 2# > d11=0; d12=0; > d21=0;d22=0 > ######################## second Derivative for mu ########## > d11 = function(par){ > mu=par[1] > s=par[2] > sum(-1/s^2)-sum((1/s^2)* > (((1/sqrt(2*pi))*exp(-((log(tcen)-mu)^2)/2*s^2))/s*((1-plnorm(tcen,mu,s))))* > (-(log(tcen)-mu)/s +((1/sqrt(2*pi))*exp(-((log(tcen)-mu)^2)/2*s^2))/ > (s*(1-plnorm(tcen,mu,s)))))} > > ################ Derivative for mu/s ########################################## > d12= function(par){ > mu=par[1] > s=par[2] > -sum(2*(log(tun)-mu)/s^3)-sum((1/s^2)*(((1/sqrt(2*pi))* > exp(-((log(tcen)-mu)^2)/2*s^2))/(s*(1-plnorm(tcen,mu,s)))+((log(tcen)-mu)/s)* > ((1/sqrt(2*pi))*exp(-((log(tcen)-mu)^2)/2*s^2))/(s*(1-plnorm(tcen,mu,s)))* > (((1/sqrt(2*pi))* > exp(-((log(tcen)-mu)^2)/2*s^2))/(s*(1-plnorm(tcen,mu,s)))-(log(tcen)-mu)/s)))} > d21=d12 > ######## second Derivative for s ########################### > d22=function(par){ > mu=par[1] > s=par[2] > sum((-3*(log(tun)-mu)^2/s^4)+1/s^2)-sum((1/s^2)*((log(tcen)-mu)/s)*(((1/sqrt(2*pi))* > exp(-((log(tcen)-mu)^2)/2*s^2))/(s*(1-plnorm(tcen,mu,s)))+((1/sqrt(2*pi))* > exp(-((log(tcen)-mu)^2)/2*s^2))/(s*(1-plnorm(tcen,mu,s)))+((log(tcen)-mu)/s)*(((1/sqrt(2*pi))* > exp(-((log(tcen)-mu)^2)/2*s^2))/(s*(1-plnorm(tcen,mu,s))))*(((1/sqrt(2*pi))* > exp(-((log(tcen)-mu)^2)/2*s^2))/(s*(1-plnorm(tcen,mu,s)))-(log(tcen)-mu)/s)))} > > ############ The R codes for Newton's method ##################################### > > j[1,1]=d11(par); j[1,2]=d12(par); > j[2,1]=d21(par); j[2,2]=d22(par) > j } > out1=outs=array() > par=c(mu,s) #### initial values ##### > v=matrix(0,ncol=length(par)) > for (i in 1:30) > { > v=solve(J(par),F(par)) > par=par-v ############ here see the values are to high ########### > mu[i+1]=par[1] > s[i+1]=par[2] > } > out1=c(mu) ############ the value of parameter at each step , no convergence #################### > outs=c(s)Your code is a mess and unreadable. Assuming that the function ml is what you want to minimize, I'm quite sure that the derivatives you are calculating are quite incorrect. Did you check them? Assuming that: - function F gives the gradient of ml - function J gives the Hessian of ml we can check these with package numDeriv. To make the following reproducible I inserted set.seed(123) at the start of your code and at the end of your code I inserted library(numDeriv) ml.grad <- function(par) grad(ml,par) ml.hessian <- function(par) hessian(ml,par) pstart <- c(0.5,.5) ml(pstart) F(pstart) ml.grad(pstart) J(pstart) ml.hessian(pstart) Running the code in R gives> ml(pstart)[1] 570.405> F(pstart)[,1] [1,] 1.770876e+15 [2,] 7.247892e+15> ml.grad(pstart)[1] -459.6117 -1423.2860> J(pstart)[,1] [,2] [1,] -2.634033e+01 -2.567008e+31 [2,] -2.567008e+31 -2.101266e+32> ml.hessian(pstart)[,1] [,2] [1,] 326.3977 1826.317 [2,] 1826.3172 9187.053 Functions F and J are most likely not returning correct values (I have no reason to question the results of numDeriv). So it is not surprising that your attempt at Newton fails miserably. You can also run this (p1.opt <- optim(pstart,ml)) (p2.opt <- optim(pstart,ml, gr=ml.grad, method="BFGS")) (p3.opt <- optim(pstart,ml, gr=ml.grad, method="CG")) (p4.opt <- optim(pstart,ml, gr=ml.grad, method="L-BFGS-B")) ml.grad(p1.opt$par) ml.grad(p2.opt$par) ml.grad(p3.opt$par) ml.grad(p4.opt$par) which will show you that optim is perfectly capable of minimizing function ml. I cannot judge whether the result is compatible with the result of survreg. That's up to you. Finally, it is often not a good idea to use an equation solver to find a solution to gradient of function=0 as a way of locating a minimum. If you insist on doing that, have a look at packages nleqslv and BB. regards, Berend
On Thu, 30 Aug 2012, Salma Wafi wrote:> I am trying to get some estimator based on lognormal distribution when we > have left,interval, and right censored data.Take a look at the NADA package. For left-censored data you can use cenmle with dist = 'lognormal'. Survival analyses work well for right-censored data as well as interval-censored data. Dennis Helsel's second edition, "Statistics for Censored Environmental Data Unsing Minitab and R" is the basis for NADA and a valuable resource. Rich
You should keep your replies on the R-help list by always cc-ing to that list. As you were asked to do in a previous thread you started. As I demonstrated in a previous reply, the R function optim() is perfectly capable of minimizing or maximizing your likelihood function. Your attempts at using the Newton method for finding a maximum will only work if the starting point is (very) close to the solution. In addition using Newton to solve gradient(f) = 0 to find a minimum/maximum is not the best way to go about this. There is no guarantee that the Hessian at the initial point is positive definite (for a minimum). You will be better off using functions like optim() or nlminb() I do not know much about the EM algorithm and I cannot judge whether what you are trying to do is correct or sensible. For speeding up EM iterations you could have a look the SQUAREM packages. I feel that you should consult an expert on this method. The code you have provided is not complete in the sense that you must be using packages such as numDeriv but I cannot see any library statements. From the limited experiments I have done with your code I get the impression that there is something wrong with the way parameter theta is used/implemented. Berend On 01-09-2012, at 14:11, Salma Wafi wrote:> Dear Berend, > Thanks alot for your response and help. really your help is very appreciated. Sorry if my code was unreadable and I hope you give me excuse in this , since I am new R user and there is no one in my area know about R. Dear, I am trying to estimate my parameters by calculating derivatives of loglikelihood function and using Newton raphson method because I need to estimate another parameter, which will be added to my likelihood function, and the information that I have about this parameter are partially missing. so I want to estimate the parameters using MLE via EM algorithm, which is composed of two steps; an Expectation (E) step which computes the expectation of the log likelihood function using the observed data set followed by a maximization (M) step which computes the parameters that maximize the expected log likelihood function found in the E-step. Then, I need to get estimators using Newton raphson method at one step and then use them to compute the E step, and so on till convergence has been met. But because I face "overestimation" problem, I tried to realize my mistake by estimating only MLE of lognormal censored data. Furthermore, I tried to write my code as what you suggest , using grad and hessian of log likelihood function and use them to calculate the Newton rapson method but also I faced some problem in this, where there is no convergence in my estimations . Dear, do you think the reason of this can be because of hessian matrix is not positive point? And in this case how we can avoid it?. > Dear, I am sorry to disturb you. Really, I work individually and lost more than 3 months trying to solve this problem. The following are, my trying to calculate Newton raphson using the grad and hessian of log likelihood instead of the derivatives and my original code that I am trying to estimate the parameters mu, s and theta using MLE via EM algorithm. > ################################### caculating Newton Rahson using grad and hessian of log likelihood #################### > cur=curd=cen=cens=array(1,100) > Cur=RealCur=realcensoring=realcured=array(1,20) > ExpCure=Bias=RealCure=array(1,21) > > dat1<-data.frame(time=rlnorm( 100,2,0.8),Censored=rbinom(100,1,0.9),Cured=rbinom(100,1,.3)) > > dat2<-dat1[order(dat1[,1]),] # order the data # > for (i in 1:10) > { > dat2$Cured[i+90]=0 #Long term survivors/10 individuals# > dat2$Censored[i+90]=0 > dat2$time[i+90]=dat2$time[90] > } > cens<-c(dat2$Censored) #censored status # > curd<-c(dat2$Cured) #cured status # > tim<-c(dat2$time) #lifetimes # > L1<-length(cens) #number of censored# > for (j in 1:L1) > { > if ((cens[j]==1)&(curd[j]==0)) {(cen[j]=1)&(cur[j]=1)} > > else {(cen[j]=cens[j])&(cur[j]=curd[j])} > } > ####### My Data only with uncensored and right censored #################### > data=data.frame(Ti=dat1$time,Censored=cen) > > ########### Seperating the data for simply using optim and Newton Raphson ## > > dat2<-c(split(data[1:2],data$Censored==1)) # Split the data(cen/uncen) # > tun<-c((dat2$'TRUE')$Ti) # Life times data set for uncensored # > tcen<-c((dat2$'FALSE')$Ti) # Life times data set for censored # > outm=outs=array() > #################### loglikelihood function ############################ > ml= function(par){mu=par[1] > s=par[2] > +sum(dlnorm(tun,mu,s,log=TRUE))+sum(log(1-plnorm(tcen,mu,s)))} > ######################### intial values ###################### > mu=0.5 > s=0.5 > ################# grad & hessian matrix of ml ################### > ml.grad <- function(par) grad(ml2,par) > ml.hessian <- function(par) hessian(ml2,par) > pstart <- c(mu,s) > ml(pstart) > ml.grad(pstart) > ml.hessian(pstart) > for (i in 1:10) > { > v=solve(ml.hessian(pstart),ml.grad(pstart)) > pstart=pstart-v > mu[i+1]=pstart[1] > s[i+1]=pstart[2] > } > outm=c(mu) > outs=c(s) ##### there is no convergence ############ > ############################################################################# > > Original Code " estimating by using MLE via EM algorithm " > ############################ under same data ############################## > outmu=outs1=outtheta=array() > g=pc=array() > ######################### intial values ###################### > mu=0.5 > s=0.5 > theta=0.5 > ############################################## > for(w in 1:10) ### loop for EM algorithm ### > { > for (i in 1:length(tcen)) ## loop for computing the E step ## > { > > pc[i]=((1-exp(-theta[w]))* > (1-plnorm(tcen[i],mu[w],s[w])))/ > (exp(-theta[w])+((1-exp(-theta[w]))*(1-plnorm(tcen[i],mu[w],s[w])))) > } > g=c(pc) > ################################## M step "maximizing log likelihood" ###################### > ml2= function(par){mu=par[1] > s=par[2] > theta=par[3] > +sum(dlnorm(tun,mu,s,log=TRUE)+log(1-exp(-theta)))+sum(g*log(1-exp(-theta))+g*log(1-plnorm(tcen,mu,s))-(1-g)*theta)} > > ################# grad & hessian matrix of ml ################### > ml2.grad <- function(par) grad(ml2,par) ### the problem here, why the grad and hessian have number of rows > and columns more than the number of parameters ### > ml2.hessian <- function(par) hessian(ml2,par) > pstart <- c(mu,s,theta) > ml2(pstart) > ml2.grad(pstart) > ml2.hessian(pstart) > for (i in 1:1) ####### compute newton rahson at one step ######### > { > v=solve(ml2.hessian(pstart),ml2.grad(pstart)) > pstart=pstart-v ### the problem is the number of the estimators exceeds number of parameters ### > } > mu[w+1]=pstart[1] > s[w+1]=pstart[2] > theta[w+1]=pstart[3] > } ## end loop w ### > outmu=c(mu) > outs1=c(s) > outtheta=c(theta) #### No convergence #### > > Finally, I am very thankful to you and all people who work on Nabble website. > Best regards. > <computing newton rahson><estimationg using MLE via EM algorithm>
Apparently Analagous Threads
- Estimation parameters of lognormal censored data
- interval-censored data in survreg()
- Problem with Newton_Raphson
- Accessing Results from cenmle function in NADA package
- using optimize with two unknowns, e.g. to parameterize a distribution with given confidence interval