thr3ads.net - R help - [R] Apparently Conflicting Results with coxph [Oct 2007]


Kevin E. Thorpe

2007-Oct-01 13:27 UTC

[R] Apparently Conflicting Results with coxph

Dear List:

I have a data frame prepared in the counting process style for including
a binary time-dependent covariate.  The first few rows look like this.

    PtNo Start    End Status Imp
1      1     0  608.0      0   0
2      2     0  513.0      0   0
3      2   513  887.0      0   1
4      3     0   57.0      0   0
5      3    57  604.0      0   1
6      4     0  150.0      1   0
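
For readers unfamiliar with this layout, the following is a minimal sketch of how such a counting-process frame could be built from a one-row-per-patient frame. The `wide` data frame, its columns `Days`, `Dead` and `ImplantDay`, and the helper `to_counting` are hypothetical names chosen for illustration; they are not the poster's data or code, and the sketch assumes every implant day falls strictly between 0 and the end of follow-up.

```r
# Hypothetical one-row-per-patient data (NOT the original study data).
# ImplantDay is NA for patients who never received an implant.
wide <- data.frame(PtNo       = 1:4,
                   Days       = c(608, 887, 604, 150),
                   Dead       = c(0, 0, 0, 1),
                   ImplantDay = c(NA, 513, 57, NA))

# Split each implanted patient's follow-up at the implant day.
to_counting <- function(d) {
  never <- is.na(d$ImplantDay)
  # Interval before implant (or the whole follow-up if never implanted)
  pre <- data.frame(PtNo   = d$PtNo,
                    Start  = 0,
                    End    = ifelse(never, d$Days, d$ImplantDay),
                    Status = ifelse(never, d$Dead, 0),
                    Imp    = 0)
  # Interval after implant, for implanted patients only
  post <- data.frame(PtNo   = d$PtNo[!never],
                     Start  = d$ImplantDay[!never],
                     End    = d$Days[!never],
                     Status = d$Dead[!never],
                     Imp    = 1)
  out <- rbind(pre, post)
  out <- out[order(out$PtNo, out$Start), ]
  rownames(out) <- NULL
  out
}

cp <- to_counting(wide)
print(cp)
```

The death indicator is attached only to a patient's last interval, so time before the implant is counted as unexposed time at risk.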


The outcome is mortality and the covariate is for an implantable
defibrillator, so it is expected that the implant would reduce the
risk of death.  The results of fitting coxph (survival package) are:

Call:
coxph(formula = Surv(Start, End, Status) ~ Imp, data = nina.excl)


     coef exp(coef) se(coef)     z    p
Imp 0.163      1.18    0.485 0.337 0.74

Likelihood ratio test=0.11  on 1 df, p=0.738  n= 335

Since this was unexpected, I created a non-counting process data
frame with an indicator variable for whether the patient ever received
an implant.  Here are the results:

Call:
coxph(formula = Surv(Days, Dead) ~ Implant, data = nina.excl0)


         coef exp(coef) se(coef)     z       p
Implant -1.77     0.171    0.426 -4.15 3.3e-05

Likelihood ratio test=19.1  on 1 df, p=1.21e-05  n= 197

I found this degree of discrepancy surprising, especially the point
estimate of the coefficient.  I have verified the data frames are
set up correctly.

Here is what I have tried to understand what is going on.

I tried fitting models adjusted for other covariates that I have in
the data frame.  This did not appreciably affect the coefficients
for the implant variable.

I ran cox.zph on the two models shown above and plotted the results.
In both cases, the point estimate of Beta(t) is sort of parabolic
in that the curves are monotonically increasing to a local maximum
after which they are monotonically decreasing (the CIs are a bit
more wiggly).

I would interpret this to mean that the effect of implant is probably
time-dependent.  If so, how do I actually get a "proper" estimate of
beta(t) for a variable like this?

Are there some other things I should look at to understand what's
going on?

Here is my sessionInfo.
R version 2.5.0 (2007-04-23)
i686-pc-linux-gnu

locale:
LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en_US.UTF-8;LC_MONETARY=en_US.UTF-8;LC_MESSAGES=en_US.UTF-8;LC_PAPER=en_US.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.UTF-8;LC_IDENTIFICATION=C

attached base packages:
[1] "splines"   "stats"     "graphics"  "grDevices" "utils"     "datasets"
[7] "methods"   "base"

other attached packages:
  cmprsk survival
 "2.1-7"   "2.31"


-- 
Kevin E. Thorpe
Biostatistician/Trialist, Knowledge Translation Program
Assistant Professor, Department of Public Health Sciences
Faculty of Medicine, University of Toronto
email: kevin.thorpe at utoronto.ca  Tel: 416.864.5776  Fax: 416.864.6057

Peter Dalgaard

2007-Oct-01 13:48 UTC

Kevin E. Thorpe wrote:
> [...]
> I would interpret this to mean that the effect of implant is probably
> time-dependent.  If so, how do I actually get a "proper" estimate of
> beta(t) for a variable like this?
>
> Are there some other things I should look at to understand what's
> going on?

If you want to play with time-dependent regression coefficients, have a
look at the timereg package and the book that it supports.
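
As a rough illustration of what checking and modelling a time-dependent coefficient can look like within the survival package itself (rather than timereg), here is a sketch on the `veteran` data set that ships with survival. It assumes a reasonably recent survival release that supports the `tt` argument of `coxph`. The choice of `karno` and of `log(t)` as the time transform is only an example and says nothing about the implant data discussed in this thread.

```r
library(survival)

# Ordinary Cox model with a single fixed covariate (Karnofsky score)
fit <- coxph(Surv(time, status) ~ karno, data = veteran)

# Test of proportional hazards based on scaled Schoenfeld residuals;
# plotting the result shows a smoothed estimate of beta(t)
zp <- cox.zph(fit)
print(zp)
if (interactive()) plot(zp)

# One possible parametric form: beta(t) = b1 + b2 * log(t),
# fitted through the time-transform mechanism of coxph
fit_tt <- coxph(Surv(time, status) ~ karno + tt(karno),
                data = veteran,
                tt = function(x, t, ...) x * log(t))
print(coef(fit_tt))
```

The second coefficient describes how the effect of `karno` changes with log time; a different functional form would need a different `tt` function.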

However, first you need to consider the possibility of selection effects
that can take place even with non-varying effects. In the case at hand I
would suspect a bias created by the fact that you don't implant devices
into people who are already dead.

-- 
   O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark          Ph:  (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)                  FAX: (+45) 35327907

Terry Therneau

2007-Oct-02 16:36 UTC

From my experience, what you are seeing is almost certainly a patient 
selection effect.  (The number 1 reason for puzzling results is incorrect coding
of a time-dependent covariate, but you appear to have been quite careful).

   Assigning the implant as a non-time-dependent covariate almost guarantees 
that the estimated effect will be beneficial.  The only people who get an 
implant are those who live longer than average (long enough to get an implant).
The size of such a bias is surprisingly large.  The problem is rediscovered in 
the cancer field every few years, in comparisons of responders to 
non-responders. 
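
The size of this bias can be seen in a small simulation. The sketch below is entirely hypothetical: the sample size, rates, and variable names are made up, and the "implant" has no effect on survival by construction. It fits the same two kinds of model as in the original post, one with a fixed ever/never indicator and one with the counting-process coding.

```r
library(survival)
set.seed(1)

n       <- 2000
death   <- rexp(n, rate = 1 / 500)   # true death time; the implant has NO effect
implant <- rexp(n, rate = 1 / 300)   # planned implant time, independent of prognosis
cens    <- 1000                      # administrative censoring time

Days <- pmin(death, cens)
Dead <- as.integer(death <= cens)
got  <- implant < Days               # only patients still under follow-up get the implant

# (1) Fixed "ever implanted" indicator
naive <- coxph(Surv(Days, Dead) ~ Implant,
               data = data.frame(Days, Dead, Implant = as.integer(got)))

# (2) Counting-process coding: unexposed interval, then exposed interval
pre  <- data.frame(Start = 0,
                   End    = ifelse(got, implant, Days),
                   Status = ifelse(got, 0L, Dead),
                   Imp    = 0)
post <- data.frame(Start  = implant[got],
                   End    = Days[got],
                   Status = Dead[got],
                   Imp    = 1)
td <- coxph(Surv(Start, End, Status) ~ Imp, data = rbind(pre, post))

b_naive <- unname(coef(naive))
b_td    <- unname(coef(td))
print(c(naive = b_naive, time_dependent = b_td))
```

In this setup the fixed indicator makes a useless device look strongly protective, while the time-dependent coding recovers a coefficient near zero, because here implant timing is unrelated to prognosis.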

   As a time-dependent covariate, you have the problem of indication for 
treatment.  Say for instance that the devices were very expensive, and were only
used for patients in imminent danger of death.  For a device that was a placebo 
you would find, not surprisingly, that being selected for implantation carried a
major risk.  The device may need to be extremely effective to overcome this type
of bias.  As a simple example, if you compare the death rate of those who have 
seen an oncologist (cancer doc) in the last month to those who have not done so, 
you find that the former group has a much higher death rate.
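
A second hypothetical simulation sketches this indication effect. Here a placebo device is given preferentially to sicker patients; the `sick` indicator, the rates, and the effect sizes are invented for illustration. In real data the variables that drive the treatment decision are often unmeasured, so the adjusted model shown at the end may not be available.

```r
library(survival)
set.seed(2)

n       <- 3000
sick    <- rbinom(n, 1, 0.3)                          # high-risk patients
death   <- rexp(n, rate = 0.001  * exp(2.0 * sick))   # sick patients die sooner
implant <- rexp(n, rate = 0.0005 * exp(2.5 * sick))   # and are implanted sooner
cens    <- 1000                                       # the device itself does nothing

Days <- pmin(death, cens)
Dead <- as.integer(death <= cens)
got  <- implant < Days

# Counting-process coding, carrying the baseline 'sick' indicator along
pre  <- data.frame(Start = 0,
                   End    = ifelse(got, implant, Days),
                   Status = ifelse(got, 0L, Dead),
                   Imp    = 0,
                   sick   = sick)
post <- data.frame(Start  = implant[got],
                   End    = Days[got],
                   Status = Dead[got],
                   Imp    = 1,
                   sick   = sick[got])
cp <- rbind(pre, post)

crude    <- coxph(Surv(Start, End, Status) ~ Imp,        data = cp)
adjusted <- coxph(Surv(Start, End, Status) ~ Imp + sick, data = cp)

b_crude <- unname(coef(crude)["Imp"])
b_adj   <- unname(coef(adjusted)["Imp"])
print(c(crude = b_crude, adjusted = b_adj))
```

Even with correct time-dependent coding, the crude model makes the placebo look harmful; only conditioning on the reason for treatment removes the apparent effect in this simulation.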

   Terry Therneau

Terry Therneau

2007-Oct-03 16:21 UTC

> I thought about this some more, and I'm not sure that possibility is
> "to blame."  In my time-dependent model, I don't think I'm doing
> anything different than is done for transplant in the Stanford
> Heart Study (the often used example for this kind of time-dependent
> covariate).  As in my case, you would not transplant a dead patient.
> So, I remain puzzled as to why my model is misbehaving.

  The Stanford Heart Study, quoted in nearly every survival book as you say, is 
a bit of an anomaly.  At the time it was run a good tissue match between the 
donor heart and the recipient was considered very important.  When a donor 
became available, the best match (or near best) among those waiting was chosen 
to receive it.  Since the donor genetics are unpredictable, this is essentially 
equal to a random pick from those waiting.  The Stanford study is nearly alone 
in examples of time-dependent treatment in not having selection effects.
  
  	Terry T.
