Dear R-helpers,

To get curves for a pseudo cohort other than the one centered at the mean of the covariates, I have been trying to use the newdata argument to survfit with no success. Here is my model statement, the newdata and the ensuing error. What am I doing wrong?

> summary(fit)
Call:
coxph(formula = Surv(Start, Stop, Event, type = "counting") ~
    Week + LagAOO + Prior.f + cluster(interaction(Station, Year)),
    data = data8, method = "breslow", x = T, y = T)

  n= 1878
            coef exp(coef) se(coef) robust se     z       p
Week     0.00582      1.01   0.0323    0.0239 0.244 8.1e-01
LagAOO   0.71929      2.05   0.1238    0.1215 5.918 3.3e-09
Prior.f2 0.12927      1.14   0.4402    0.4025 0.321 7.5e-01
Prior.f3 0.79082      2.21   0.5484    0.4460 1.773 7.6e-02
Prior.f4 2.04189      7.71   0.6008    0.4685 4.358 1.3e-05
Prior.f5 1.20450      3.34   0.6423    0.5481 2.198 2.8e-02

         exp(coef) exp(-coef) lower .95 upper .95
Week          1.01      0.994     0.960      1.05
LagAOO        2.05      0.487     1.618      2.61
Prior.f2      1.14      0.879     0.517      2.50
Prior.f3      2.21      0.453     0.920      5.29
Prior.f4      7.71      0.130     3.076     19.30
Prior.f5      3.34      0.300     1.139      9.76

Rsquare= 0.047   (max possible= 0.25 )
Likelihood ratio test= 91  on 6 df,   p=0
Wald test            = 209  on 6 df,   p=0
Score (logrank) test = 142  on 6 df,   p=0,   Robust = 17.4  p=0.00803

  (Note: the likelihood ratio and score tests assume independence of
     observations within a cluster, the Wald and robust score tests do not).

> newdat
      Week   LagAOO Prior.f2 Prior.f3 Prior.f4 Prior.f5
1 17.55218 1.191693        1        0        0        0
2 17.55218 1.191693        0        0        0        0

> survfit(fit,newdata=newdat)
Error in model.frame(formula, rownames, variables, varnames, extras, extranames,  :
        variable lengths differ
In addition: Warning message:
'newdata' had 2 rows but variable(s) found have 1878 rows

Regards,
Alex

Alex Hanke
Department of Fisheries and Oceans
St. Andrews Biological Station
531 Brandy Cove Road
St. Andrews, NB
Canada
E5B 2L9

You appear to have a coding for Prior.f in newdata rather than the factor itself.

It's a bit hard to be sure when we don't have data8 to compare with.

On Wed, 15 Jun 2005, Hanke, Alex wrote:

> Dear R-helpers,
> To get curves for a pseudo cohort other than the one centered at the mean of
> the covariates, I have been trying to use the newdata argument to survfit
> with no success. Here is my model statement, the newdata and the ensuing
> error. What am I doing wrong?

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
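
A minimal sketch of the distinction drawn in the reply above, using only base R. The numbers are copied from the newdat printed in the question; the names rhs, newdat_bad and newdat_ok are made up for illustration, and the cluster() term is left out for brevity. When a variable named in the formula is absent from newdata, model.frame looks for it elsewhere (here, presumably, the 1878-row original), which would be consistent with the "variable lengths differ" message.

```r
# Right-hand side of the model from the thread (cluster term omitted; an assumption
# made only to keep the sketch short)
rhs <- ~ Week + LagAOO + Prior.f

# What was supplied: hand-made dummy columns, but no column called Prior.f
newdat_bad <- data.frame(Week = c(17.55218, 17.55218),
                         LagAOO = c(1.191693, 1.191693),
                         Prior.f2 = c(1, 0), Prior.f3 = 0,
                         Prior.f4 = 0, Prior.f5 = 0)

# What the formula asks for: the factor itself, with all five levels declared.
# Row 1 had Prior.f2 = 1 (level 2); row 2 had all dummies 0 (reference level 1).
newdat_ok <- data.frame(Week = c(17.55218, 17.55218),
                        LagAOO = c(1.191693, 1.191693),
                        Prior.f = factor(c(2, 1), levels = 1:5))

# Variables the formula needs that each data frame fails to provide
missing_bad <- setdiff(all.vars(rhs), names(newdat_bad))
missing_ok  <- setdiff(all.vars(rhs), names(newdat_ok))
print(missing_bad)
print(missing_ok)
```

Comparing all.vars() of the formula with names(newdata) is a cheap check to run before calling survfit; it does not verify that factor levels match those used in the fit.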

Hi Brian,

The factor Prior.f has 5 levels (1,2,3,4,5), which coxph deals with by creating 4 dummy variables coded with 1 or zero. That's what I see when I look at fit$x.

> fit$x[1:5,]
   Week LagAOO factor(Prior.f)2 factor(Prior.f)3 factor(Prior.f)4
31   22      0                0                0                0
32   22      0                0                0                0
33   22      2                0                0                0
34   22      3                0                0                0
35   22      2                0                0                0
   factor(Prior.f)5
31                0
32                0
33                0
34                0
35                0

I have played with the formula a bit, adding a term at a time and then checking to see if I can produce the survival curves for pseudo cohorts. I get as far as the Prior.f term and am successful if I treat it as a continuous variable. If I introduce it as a factor and assume it wants four dummy variables as above, I get the variable lengths error. If I represent the term with one variable:

survfit(fit,list(Week=c(15,15),LagAOO=c(0,0),Prior.f=c(1,2)))

I get:

Error in x2 %*% coef : non-conformable arguments

Which is a nice change but still short of knowing what is going on.

Regards
Alex

-----Original Message-----
From: Prof Brian Ripley [mailto:ripley at stats.ox.ac.uk]
Sent: June 15, 2005 4:24 PM
To: Hanke, Alex
Cc: 'r-help at stat.math.ethz.ch'
Subject: Re: [R] Error using newdata argument in survfit

You appear to have a coding for prior.f in newdata rather than the factor itself.

It's a bit hard to be sure when we don't have data8 to compare with.
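
One plausible reading of the "non-conformable arguments" error, sketched with base R's model.matrix rather than with survfit itself. The fitted model has six coefficients (Week, LagAOO and four Prior.f dummies), so the design matrix built from the new data needs six columns. Whether survfit builds its matrix in exactly this way internally is an assumption; the data frames d_numeric and d_factor are hypothetical.

```r
# Prior.f supplied as plain numbers: it expands to a single column
d_numeric <- data.frame(Week = c(15, 15), LagAOO = c(0, 0),
                        Prior.f = c(1, 2))

# Prior.f supplied as a factor that declares all five levels:
# it expands to four dummy columns, even though only levels 1 and 2 occur
d_factor <- data.frame(Week = c(15, 15), LagAOO = c(0, 0),
                       Prior.f = factor(c(1, 2), levels = 1:5))

# Drop the intercept column, since a Cox model has none
X_numeric <- model.matrix(~ Week + LagAOO + Prior.f, d_numeric)[, -1, drop = FALSE]
X_factor  <- model.matrix(~ Week + LagAOO + Prior.f, d_factor)[, -1, drop = FALSE]

print(colnames(X_numeric))
print(colnames(X_factor))
```

With three columns on one side and six coefficients on the other, a matrix product such as x2 %*% coef cannot be formed, which matches the shape of the reported error.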

Thanks to Thomas, Brian and Ales, whose advice led me to the solution.

I actually had a second problem preventing Thomas' solution:

survfit(fit,list(Week=c(15,15),LagAOO=c(0,0),Prior.f=factor(c(1,2),levels=1:5)))

from working. In the model statement I created a factor from Prior.f via factor(Prior.f). Instead, one should predefine the factor variable with Prior.f<-factor(Prior.f) and use that term in the model; then Thomas' solution works fine.

Alex

-----Original Message-----
From: Thomas Lumley [mailto:tlumley at u.washington.edu]
Sent: June 16, 2005 11:00 AM
To: Hanke, Alex
Cc: 'r-help at stat.math.ethz.ch'
Subject: Re: [R] Error using newdata argument in survfit

On Thu, 16 Jun 2005, Hanke, Alex wrote:

> I get as far as the Prior.f term and am successful if I treat it as a
> continuous variable. If I introduce it as a factor and assume it wants four
> dummy variables as above I get the variable lengths error. If I represent
> the term with one variable:
> survfit(fit,list(Week=c(15,15),LagAOO=c(0,0),Prior.f=c(1,2)))
> I get:
> Error in x2 %*% coef : non-conformable arguments

Yes, but it wants a factor with *5* levels. Try

survfit(fit,list(Week=c(15,15),LagAOO=c(0,0),Prior.f=factor(c(1,2),levels=1:5)))

    -thomas

Thomas Lumley                   Assoc. Professor, Biostatistics
tlumley at u.washington.edu    University of Washington, Seattle
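
A self-contained sketch of the resolution described in this thread, for readers who want something to run. It is not the original analysis: data8 was never posted, so the data frame sim below is simulated, and the model is simplified to a right-censored Surv(Time, Event) without the counting-process form or the cluster() term. All column values and the object names sim, newdat and curves are assumptions for illustration. The thread reports the factor(Prior.f)-in-formula problem for the survival version in use in 2005; it does not establish how later versions behave.

```r
library(survival)

set.seed(1)
n <- 300

# Simulated stand-in for data8 (hypothetical values)
sim <- data.frame(Week   = runif(n, 10, 25),
                  LagAOO = rpois(n, 1),
                  Prior  = sample(1:5, n, replace = TRUE))

# Predefine the factor in the data, rather than writing factor(Prior) in the formula
sim$Prior.f <- factor(sim$Prior, levels = 1:5)

# Simple right-censored outcome (an assumption; the thread used counting-process data)
sim$Time  <- rexp(n, rate = 0.1 * exp(0.5 * sim$LagAOO))
sim$Event <- rbinom(n, 1, 0.8)

fit <- coxph(Surv(Time, Event) ~ Week + LagAOO + Prior.f,
             data = sim, method = "breslow")

# New data carries the factor itself, declaring all five levels
newdat <- data.frame(Week = c(15, 15), LagAOO = c(0, 0),
                     Prior.f = factor(c(1, 2), levels = 1:5))

# One survival curve per row of newdat
curves <- survfit(fit, newdata = newdat)
print(dim(curves$surv))
```

Each column of curves$surv corresponds to one row of newdat, so plot(curves) would draw the two pseudo-cohort curves side by side.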