Aher
2012-Jan-17 09:22 UTC
[R] Scoring using cox model: probability of survival before time t
Dear Members,

I need to score the probability of survival before a specified time using a fitted Cox model on a scoring dataset. On the training sample I am able to get the probability of survival before time point t, but on the scoring dataset, which has only predictor information, I am running into some issues. It would be a great help if you could tell me where I am going wrong. Here is the sample script:

#########################################################
library(survival)

n = 100
beta1 = 3; beta2 = -2
lambdaT = .01
lambdaC = .6
x1 = rnorm(n,0)
x2 = rnorm(n,0)
T = rweibull(n, shape=1, scale=lambdaT*exp(-beta1*x1-beta2*x2))
C = rweibull(n, shape=1, scale=lambdaC)
time = pmin(T,C)
event = time==T
train_sample=data.frame(time,event,x1,x2)
rm(time,event,x1,x2)

fit_coxph <- coxph(Surv(time, event)~ x1 + x2, data= train_sample, method="breslow")

#Save model to some directory
save(fit_coxph, file = file.path("C:/Desktop","fit_coxph.RData"))

#I can get probabilities on train_sample as below:
library(peperr)
pred_train <- predictProb.coxph(fit_coxph, Surv(train_sample$time, train_sample$event), train_sample, 0.4)
head(pred_train)
#             [,1]
#[1,] 5.126281e-03
#[2,] 4.324882e-01
#[3,] 4.444506e-61
#[4,] 0.000000e+00
#[5,] 0.000000e+00
#[6,] 3.249947e-01

#In the same way, I need probabilities on scoring_data. Now close the earlier
#session and run the script below in a new R session; it gives an error.

library(survival)
library(peperr)
load(file = file.path("C:/Desktop","fit_coxph.RData"))

n = 1000
set.seed(1)
x1 = rnorm(n,0)
x2 = rnorm(n,0)
score_data <- data.frame(x1,x2)

pred_score <- predictProb.coxph(fit_coxph, Surv(time, event), score_data, 0.04)
#Error in Surv(time, event) : Time variable is not numeric

#After creating a dummy placeholder for Surv(time, event), it gives another error:
time <- rep(2, n)
event <- rep(1, n)
pred_score <- predictProb.coxph(fit_coxph, Surv(time, event), score_data, 0.04)
#Error in inherits(x, "data.frame") : object 'train_sample' not found
########################################################

I appreciate your help. Is there any other way to get these probabilities on new data?

Thanks in advance!

~Aher

--
View this message in context: http://r.789695.n4.nabble.com/Scoring-using-cox-model-probability-of-survival-before-time-t-tp4302775p4302775.html
Sent from the R help mailing list archive at Nabble.com.
Terry Therneau
2012-Jan-18 14:36 UTC
[R] Scoring using cox model: probability of survival before time t
--begin included message ---
Dear Members,

I need to score the probability of survival before a specified time using a fitted Cox model on a scoring dataset. On the training sample I am able to get the probability of survival before time point t, but on the scoring dataset, which has only predictor information, I am running into some issues. It would be a great help if you could tell me where I am going wrong. Here is the sample script:
------------------

Your example isn't complete: the error comes from a function, predictProb.coxph(), which I have never heard of, and I wrote the survival library. I obviously can't comment on why it fails; you might want to contact the author of the function.

Using only the survival library (one of the more recent versions), you need to know that Pr(survival to t) = exp(-expected events by t). You can then use

   predict(fit_coxph, type="expected", newdata=....)

where newdata has a time variable that contains the desired time point for prediction.

Note that by default the coxph result does not store all of the data needed to make predicted survivals (it needs to keep the entire X matrix). You can override this by adding "model=TRUE" to the original call. If you do not, then it needs to look up the original data set to do the prediction.

Terry Therneau
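As an illustration of the approach described in the reply, here is a minimal sketch using only the survival package. It is not code from the thread: the simulated data, the coefficient values, the time point t0 and the names surv_prob and event_prob are all made up for the example. It also assumes that newdata must carry a placeholder event column alongside time, since predict() with type="expected" evaluates the full Surv(time, event) response on the new data.

```r
library(survival)

set.seed(42)
n  <- 200
x1 <- rnorm(n)
x2 <- rnorm(n)

# Hypothetical simulation: exponential event and censoring times
t_event  <- rexp(n, rate = 0.5 * exp(0.8 * x1 - 0.5 * x2))
t_censor <- rexp(n, rate = 0.3)
train_sample <- data.frame(
  time  = pmin(t_event, t_censor),
  event = t_event <= t_censor,
  x1    = x1,
  x2    = x2
)

# model = TRUE stores the model frame in the fit, so later predictions
# do not need to look up train_sample again
fit_coxph <- coxph(Surv(time, event) ~ x1 + x2, data = train_sample,
                   method = "breslow", model = TRUE)

# Scoring data: predictors only, plus the time point of interest and a
# placeholder event indicator (assumed to be required by the formula)
t0 <- 0.4
score_data <- data.frame(x1 = rnorm(5), x2 = rnorm(5))
score_data$time  <- t0
score_data$event <- TRUE

# Expected number of events by t0 for each new subject
expected <- predict(fit_coxph, newdata = score_data, type = "expected")

surv_prob  <- exp(-expected)  # Pr(survival to t0)
event_prob <- 1 - surv_prob   # Pr(event before t0)
print(cbind(score_data[, c("x1", "x2")], surv_prob, event_prob))
```

If the fitted object is saved with save() and reloaded in a fresh session, the same predict() call should work there provided model=TRUE was used in the original coxph() call; otherwise the training data has to be available under its original name.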