Vincent Vinh-Hung
2010-Nov-16  15:32 UTC
[R] Re : interpretation of coefficients in survreg AND obtaining the hazard function for an individual given a set of predictors
Thanks for sharing the questions and responses!

Is it possible to appreciate how much the coefficients matter in one or the other model? Say, using Biau's example, using coxph, as.factor(grade2 == "high")TRUE gives a hazard ratio of 1.27 (rounded). As a clinician I can grasp this HR as a 27% relative increase, and I can relate it to other published results. With survreg, the Weibull model gives a coefficient of -0.4035245: is it feasible or meaningful to translate it to an HR?

Thanks in advance,

Vincent Vinh-Hung
Radiation Oncology,
Geneva University Hospitals

On Sun, Nov 14, 2010 at 6:51 AM, Biau David <djmbiau at yahoo.fr> wrote:
> Dear R help list,
>
> I am modeling some survival data with coxph and survreg (dist='weibull') using
> package survival. I have 2 problems:
>
> 1) I do not understand how to interpret the regression coefficients in the
> survreg output, and it is not clear to me from ?survreg.object how to.
>
> Here is an example of the code that illustrates my problem:
> - data is stc1
> - the factor is dichotomous with 'low' and 'high' categories
>
> slr <- Surv(stc1$ti_lr, stc1$ev_lr==1)
>
> mca <- coxph(slr~as.factor(grade2=='high'), data=stc1)
> mcb <- coxph(slr~as.factor(grade2), data=stc1)
> mwa <- survreg(slr~as.factor(grade2=='high'), data=stc1, dist='weibull', scale=0)
> mwb <- survreg(slr~as.factor(grade2), data=stc1, dist='weibull', scale=0)
>
> > summary(mca)$coef
>                                      coef exp(coef)  se(coef)         z  Pr(>|z|)
> as.factor(grade2 == "high")TRUE 0.2416562  1.273356 0.2456232 0.9838494 0.3251896
>
> > summary(mcb)$coef
>                            coef exp(coef)  se(coef)          z  Pr(>|z|)
> as.factor(grade2)low -0.2416562 0.7853261 0.2456232 -0.9838494 0.3251896
>
> > summary(mwa)$coef
> (Intercept) as.factor(grade2 == "high")TRUE
>   7.9068380                      -0.4035245
>
> > summary(mwb)$coef
> (Intercept) as.factor(grade2)low
>   7.5033135            0.4035245
>
> No problem with the interpretation of the coefs in the cox model. However, I do
> not understand why
> a) the coefficients in the survreg model are the opposite (negative when the
> other is positive) of what I have in the cox model? Are these not the log(HR)
> given the categories of this variable?

No. survreg() fits accelerated failure models, not proportional hazards models. The coefficients are logarithms of ratios of survival times, so a positive coefficient means longer survival.

> b) how come the intercept coefficient changes (the scale parameter does not
> change)?

Because you have reversed the order of the factor levels. The coefficient of that variable changes sign and the intercept changes to compensate.

> 2) My second question relates to the first.
> a) given a model from survreg, say mwa above, how do I extract the
> base hazard and the hazard of each patient given a set of predictors? With the
> hazard function for the ith individual in the study given by
> h_i(t) = exp(\beta'x_i)*\lambda*\gamma*t^{\gamma-1}, it doesn't look to me like
> predict(mwa, type='linear') is \beta'x_i.

No, it's beta'x_i for the accelerated failure parametrization of the Weibull. In terms of the CDF

F_i(t) = F_0( exp((log(t) - beta'x_i)/scale) )

So you need to divide by the scale parameter and change sign to get the log hazard ratios.

> b) since I need the coefficient intercept from the model to obtain the scale
> parameter to obtain the base hazard function as defined in Collett
> (h_0(t)=\lambda*\gamma*t^{\gamma-1}), I am concerned that this coefficient
> intercept changes depending on the reference level of the factor entered in the
> model. The change is very important when I have more than one predictor in the
> model.

As Terry Therneau pointed out recently in the context of the Cox model, there is no such thing as "the" baseline hazard. The baseline hazard is the hazard when all your covariates are equal to zero, and this depends on how you parametrize. In mwa, zero is grade2="low", in mwb, zero is grade2="high", so the hazard at zero has to be different in the two cases.

    -thomas

--
Thomas Lumley
Professor of Biostatistics
University of Auckland
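The points about recoding and about a patient's hazard can be sketched in R as follows. This is only an illustration on simulated data, since stc1 is not available: the data-generating values, the variable names, and the helpers weibull_hazard() and to_ph() are assumptions, not part of the thread. It relies on survreg's Weibull parametrization, in which the Weibull shape is 1/scale, so that h(t | x) = (1/scale) * t^(1/scale - 1) * exp(-lp/scale), where lp is the linear predictor (intercept + beta'x).

```r
library(survival)

set.seed(1)
n <- 400
# Simulated stand-in for stc1 (assumption): two grades, Weibull survival times
grade2 <- factor(sample(c("low", "high"), n, replace = TRUE))
true_t <- rweibull(n, shape = 0.8,
                   scale = exp(7.9 - 0.4 * (grade2 == "high")))
cens_t <- runif(n, 0, 8000)
d <- data.frame(time   = pmin(true_t, cens_t),
                status = as.numeric(true_t <= cens_t),
                grade2 = grade2)

# The same model under the two codings used in the thread
mwa <- survreg(Surv(time, status) ~ as.factor(grade2 == "high"),
               data = d, dist = "weibull")
mwb <- survreg(Surv(time, status) ~ as.factor(grade2),
               data = d, dist = "weibull")

# Hazard at time t for every patient in the fitted data (hypothetical helper):
# h(t | x) = (1/scale) * t^(1/scale - 1) * exp(-lp / scale)
weibull_hazard <- function(fit, t) {
  lp    <- predict(fit, type = "lp")
  sigma <- fit$scale
  unname((1 / sigma) * t^(1 / sigma - 1) * exp(-lp / sigma))
}

# Collett-style parameters (hypothetical helper): gamma is the shape,
# lambda the baseline rate at x = 0, log_hr the log hazard ratio
to_ph <- function(fit) {
  sigma <- fit$scale
  b     <- unname(coef(fit))
  c(gamma = 1 / sigma, lambda = exp(-b[1] / sigma), log_hr = -b[2] / sigma)
}

# lambda and log_hr depend on the coding; gamma does not
ph <- rbind(mwa = to_ph(mwa), mwb = to_ph(mwb))
print(ph)

# Each patient's hazard is the same under both codings
h_a <- weibull_hazard(mwa, t = 1000)
h_b <- weibull_hazard(mwb, t = 1000)
print(all.equal(h_a, h_b, tolerance = 1e-3))
```

The baseline quantity lambda changes with the reference level, but the fitted hazard for any given patient does not, which is why the changing intercept is not a cause for concern.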
Thomas Lumley
2010-Nov-16  19:36 UTC
[R] Re : interpretation of coefficients in survreg AND obtaining the hazard function for an individual given a set of predictors
A coefficient of -0.4 means that survival times are multiplied by
exp(-0.4), that is, people survive only 67% as long.
    -thomas
-- 
Thomas Lumley
Professor of Biostatistics
University of Auckland
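For readers who want both summaries side by side, here is a small R sketch of the time ratio and of a hazard ratio derived from a Weibull survreg fit. The helper aft_to_hr() is a hypothetical name, and the lung data set shipped with the survival package stands in for stc1, which is not available. The hazard ratio uses survreg's parametrization (Weibull shape = 1/scale), under which the log hazard ratio is -coef/scale. The thread does not report the fitted scale for mwa, so the hazard ratio for the -0.4035245 coefficient cannot be computed from the posted numbers alone.

```r
library(survival)

# Time ratio for the coefficient quoted in the thread
time_ratio <- exp(-0.4035245)   # roughly 0.67: survival times about 67% as long
print(time_ratio)

# Hypothetical helper: Weibull AFT coefficient -> hazard ratio,
# assuming survreg's parametrization (shape = 1 / scale)
aft_to_hr <- function(coef, scale) exp(-coef / scale)

# Illustration on the lung data from the survival package (not stc1)
fit_w <- survreg(Surv(time, status) ~ sex, data = lung, dist = "weibull")
fit_c <- coxph(Surv(time, status) ~ sex, data = lung)

hr_weibull <- aft_to_hr(coef(fit_w)[["sex"]], fit_w$scale)
hr_cox     <- exp(coef(fit_c)[["sex"]])

# The two hazard ratios need not match exactly: the Weibull model assumes
# a parametric baseline hazard, while the Cox model leaves it unspecified
print(c(weibull = hr_weibull, cox = hr_cox))
```

The conversion to a hazard ratio is specific to the Weibull (and exponential) case, where the accelerated failure time model is also a proportional hazards model; for other survreg distributions the coefficients have only the time-ratio interpretation.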