Terry Therneau
2007-May-07 12:53 UTC
[R] Predicted Cox survival curves - factor coding problems..
The combination of survfit, coxph, and factors is getting confused. It is not smart enough to match a new data frame that contains a numeric for sitenew to a fit that contained that variable as a factor. (Perhaps it should at least be smart enough to die gracefully, but it is not.)

The simple solution is to not use factors.

   site1 <- 1*(coxsnps$sitenew==1)
   site2 <- 1*(coxsnps$sitenew==2)
   test1 <- coxph(Surv(time, censor) ~ snp1 + sex + site1 + site2 + gene +
                  eth.self + strata(edu), data=coxsnps)

   # output

   profile1 <- data.frame(snp1=c(0,1), sex=c(0,0), site1=c(0,0),
                          site2=c(0,0), gene=c(0,0), eth.self=c(0,0))
   plot(survfit(test1, newdata=profile1))

Note that you do not have to explicitly make "edu" a factor. Since it is included in a strata statement, the coxph routine must treat it as discrete groups.

Terry Therneau
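The indicator-variable workaround described above can be tried end to end on made-up data. The following is only a sketch: the data frame `d` is simulated and hypothetical (it is not the poster's coxsnps data), it carries only a subset of the covariates, and it assumes a reasonably recent version of the survival package.

```r
library(survival)

# Simulated, hypothetical data standing in for coxsnps
set.seed(1)
n <- 200
d <- data.frame(
  time    = rexp(n, rate = 0.1),
  censor  = rbinom(n, 1, 0.7),               # 1 = event observed
  snp1    = rbinom(n, 1, 0.3),
  sex     = rbinom(n, 1, 0.5),
  sitenew = sample(0:2, n, replace = TRUE),  # three sites, coded 0/1/2
  edu     = sample(1:3, n, replace = TRUE)   # left numeric; used only in strata()
)

# Hand-made 0/1 indicators replace the factor; site 0 is the reference
d$site1 <- 1 * (d$sitenew == 1)
d$site2 <- 1 * (d$sitenew == 2)

fit <- coxph(Surv(time, censor) ~ snp1 + sex + site1 + site2 + strata(edu),
             data = d)

# One row per covariate profile; every column is numeric, as in the fit
profile <- data.frame(snp1 = c(0, 1), sex = c(0, 0),
                      site1 = c(0, 0), site2 = c(0, 0))

curves <- survfit(fit, newdata = profile)
plot(curves)
```

Because every covariate in the model is numeric, the new data frame only has to supply numeric columns with the same names, so there is no factor coding left to mismatch.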
Prof Brian Ripley
2007-May-07 13:45 UTC
[R] Predicted Cox survival curves - factor coding problems..
On Mon, 7 May 2007, Terry Therneau wrote:

> The combination of survfit, coxph, and factors is getting confused. It is
> not smart enough to match a new data frame that contains a numeric for sitenew
> to a fit that contained that variable as a factor. (Perhaps it should be smart
> enough to at least die gracefully -- but it's not).

The 'standard' model-fitting functions in R do make an attempt to match the new data to that used for fitting, or die gracefully. Perhaps Thomas could look into adding this to survfit and coxph (see http://developer.r-project.org/model-fitting-functions.txt).

> The simple solution is to not use factors.
>
> site1 <- 1*(coxsnps$sitenew==1)
> site2 <- 1*(coxsnps$sitenew==2)
> test1 <- coxph(Surv(time, censor) ~ snp1 + sex + site1 + site2 + gene +
>                eth.self + strata(edu), data=coxsnps)
>
> output
>
> profile1 <- data.frame(snp1=c(0,1), sex=c(0,0), site1=c(0,0),
>                        site2=c(0,0), gene=c(0,0), eth.self=c(0,0))
> plot(survfit(test1, newdata=profile1))
>
> Note that you do not have to explicitly make "edu" a factor. Since it is
> included in a strata statement, the coxph routine must treat it as discrete
> groups.
>
> Terry Therneau

--
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
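The matching behaviour of the standard model-fitting functions mentioned in this reply can be seen with lm and predict. This is a small sketch on simulated, hypothetical data (the names `d`, `site`, `x`, and `y` are made up for illustration): new data whose factor column carries only one of the fitted levels is mapped onto the levels stored with the fit, while a level that was never seen stops with an error instead of giving a silently wrong answer.

```r
# Simulated, hypothetical data with a three-level factor
set.seed(2)
d <- data.frame(
  site = factor(sample(c("a", "b", "c"), 60, replace = TRUE)),
  x    = rnorm(60)
)
d$y <- 1 + 0.5 * d$x + as.integer(d$site) + rnorm(60)

fit <- lm(y ~ x + site, data = d)

# New data given as plain text: matched against the levels stored in the fit
p_chr <- predict(fit, newdata = data.frame(x = 0, site = "b"))

# New data given as a one-level factor: also re-mapped to the fitted levels
p_fac <- predict(fit, newdata = data.frame(x = 0, site = factor("b")))

# A level that was not in the fitting data: predict() stops with an error
res <- try(predict(fit, newdata = data.frame(x = 0, site = "z")),
           silent = TRUE)

print(c(p_chr, p_fac))
print(inherits(res, "try-error"))
```

The first two predictions agree because the fit keeps the factor levels it was built with and re-codes the new column against them before building the design matrix.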