Nadia Theron
2008-Apr-08 10:14 UTC
[R] Weibull maximum likelihood estimates for censored data
Hello! I have a matrix with data and a column indicating whether it is censored or not. Is there a way to apply weibull and exponential maximum likelihood estimation directly on the censored data, like in the paper: Backtesting Value-at-Risk: A Duration-Based Approach, P Chrisoffersen and D Pelletier (October 2003) page 8? The problem is that if I type out the code as below the likelihood ratio is greater than one.> InterestD C 1 17 1 2 10 0 3 15 0 4 2 0 5 42 0 6 53 0 7 193 0 8 11 0 9 2 0 10 8 0 11 12 1 library(stats4) dur_ind_test = function (CDMatrix) # Matrix with durations and censores { lLnw <- function(b){ D = CDMatrix NT = nrow(D) a =((NT-D[1,2]-D[NT,2])/ sum(D[,1]^b))^(1/b) f = sum(log((a^b)*b*(D[2:(NT-1),1]^(b-1))*exp(-((a*D[2:(NT-1),1])^b)))) fd1 = (a^b)*b*(D[1,1]^(b-1))*exp(-(a*D[1,1])^b) fdn = (a^b)*b*(D[NT,1]^(b-1))*exp(-(a*D[NT,1])^b) S1 = exp(-(a*D[1,1])^b) SN = exp(-(a*D[NT,1])^b) -(D[1,2]*log(S1)+(1-D[1,2])*log(fd1)+ f + D[NT,2]*log(SN)+ (1-D[NT,2])*log(fdn)) } lLne <- function(A){ D = CDMatrix NT = nrow(D) b=1 f = sum(log(A*b*(D[2:(NT-1),1]^(b-1))*exp(-(A^(1/b)*D[2:(NT-1),1])^b))) fd1 = A*b*(D[1,1]^(b-1))*exp(-(A^(1/b)*D[1,1])^b) fdn = A*b*(D[NT,1]^(b-1))*exp(-(A^(1/b)*D[NT,1])^b) S1 = exp(-(A^(1/b)*D[1,1])^b) SN = exp(-(A^(1/b)*D[NT,1])^b) lLw = D[1,2]*log(S1)+(1-D[1,2])*log(fd1)+ f + D[NT,2]*log(SN)+ (1-D[NT,2])*log(fdn) -(D[1,2]*log(S1)+(1-D[1,2])*log(fd1)+ f + D[NT,2]*log(SN)+ (1-D[NT,2])*log(fdn)) } fit1 <- mle(lLnw,start = list(b = 0.5)) # Estimate parameters using ml fit2 <- mle(lLne,start = list(A = 0.02)) Lw <- lLnw(coef(fit1)) # Maximum log likelihood : Weibull Le <- lLne(coef(fit2)) # Maximum log likelihood : Exponential LR0 <- (Le/Lw) # Likelihood ratio with duration sample NSimM <- cbind(as.matrix(sort(rchisq(nsim,1,0))),runif(nsim,0,1)) # chi-square df1 simulations, uniform rvs Uniftest <- runif(1,0,1) firstrow <- cbind(LR0,Uniftest) # use sample LR as LR NSimM <- rbind(firstrow,NSimM) Test <- matrix(rep(0,2*(nsim+1)),nrow=(nsim+1)) NSimM <- cbind(NSimM,Test) for(i in 2:nsim+1) { # indicates the number of simulation above the sample if (NSimM[i,1]< LR0)NSimM[i,3]<- 1 # likelihood ratio else if(NSimM[i,1]== LR0)if(NSimM[i,2]>= Uniftest)NSimM[i,4]<-1 # if equal, only indicate if rv for simulation } # is larger that rv for sample LR Gn <- 1-(sum(NSimM[,3]))/nsim + sum(NSimM[,4])/nsim pval <- (nsim*Gn+1)/(nsim+1) #Calculate Monte Carlo p-value out <- c(pval,confint(fit1)) now <- c(Le,Lw) LR0 }}> test_1 <- dur_ind_test(CDMatrix = Interest,nsim=1000)Profiling...> test_1A b 42.32406 41.59035 => likelihood ratio = 1.017641 Could someone please help? http://www.investec.com/EmailDisclaimer/emaildisclaimer.htm The disclaimer also provides our corporate information and names of our directors as required by law. The disclaimer is deemed to form part of this message in terms of Section 11 of the Electronic Communications and Transactions Act 25 of 2002. If you cannot access the disclaimer, please obtain a copy thereof from us by sending an email to: disclaimer@investec.co.za [[alternative HTML version deleted]]
Terry Therneau
2008-Apr-09 13:15 UTC
[R] Weibull maximum likelihood estimates for censored data
> I have a matrix with data and a column indicating whether it is censored > or not. Is there a way to apply weibull and exponential maximum > likelihood estimation directly on the censored data, like in the paper: > Backtesting Value-at-Risk: A Duration-Based Approach, P Chrisoffersen > and D Pelletier (October 2003) page 8?It would be easier to use the survreg function, which is part of the survival library. It will fit MLE estimates of the exponential, Weibull, log-normal, and others. >library(survival) > survreg(Surv(D, C) ~1, data=Interest) Coefficients: (Intercept) 5.531319 Scale= 1.303637 Loglik(model)= -12.3 Loglik(intercept only)= -12.3 Notes: 1. The survreg function uses a location/scale parameterization of the Weibull. Kalbfleisch and Prentice, "The statistical analysis of failure time data" is the standard text for this. There are several others. A simple online reference is a technical report from earlier versions of the survival package, TR #53 available at www.mayo.edu/biostatistics. (Somewhat dated wrt newer options in the package, but sufficient for your purposes). 2. You give page numbers but no journal name in your reference. 3. I don't know whether your variable "C" has 1=censored or 1=uncensored. The survreg function expects the latter. You can just change the call to survreg(Surv(D, 1-C) ~1) if yours is otherwise. 4. The rest of your message is a set of nested functions with hardly a single comment. It is very difficult for an outside reader to comment on "what went wrong" without further hints about what it is that you are actually trying to compute. Terry Therneau