thr3ads.net - R help - [R] coxph linear.predictors [Oct 2010]

If this information is useful, please help other people find it:
Share via:

Bond, Stephen

2010-Oct-27 16:12 UTC

[R] coxph linear.predictors

I would like to be able to construct hazard rates (or unconditional death prob)
for many subjects from a given survfit.

This will involve adjusting the ( n.event/n.risk)
with (coxph object )$linear.predictors
I must be having another silly day as I cannot reproduce the linear predictor:

fit <- coxph(Surv(futime, fustat) ~ age, data = ovarian)
fit$linear.predictors[1]
[1] 2.612756

coef(fit)*model.matrix(fit)[1,1]
     age
11.69021

The above is based on the help listing for coxph.object
coefficients: the coefficients of the linear predictor, which multiply
          the columns of the model matrix.  If the model is
          over-determined there will be missing values in the vector
          corresponding to the redundant columns in the model matrix.

Also, please comment whether n.event/n.risk gives the baseline hazard exp(alpha)
?

Please, help. Thanks a lot.

Stephen B


	[[alternative HTML version deleted]]

David Winsemius

2010-Oct-27 17:14 UTC

head link

[R] coxph linear.predictors

On Oct 27, 2010, at 12:12 PM, Bond, Stephen wrote:
> I would like to be able to construct hazard rates (or unconditional  
> death prob)
Hazards are not probabilities (since probabilities are constrained to  
the range [0,1] and hazards are unbounded upward.)
> for many subjects from a given survfit.
>
> This will involve adjusting the ( n.event/n.risk)
> with (coxph object )$linear.predictors
> I must be having another silly day as I cannot reproduce the linear  
> predictor:
>
> fit <- coxph(Surv(futime, fustat) ~ age, data = ovarian)
> fit$linear.predictors[1]
> [1] 2.612756
That's the linear predictor (the beta*X) and that particular number  
only applies to the first case.
>
> coef(fit)*model.matrix(fit)[1,1]
>     age
> 11.69021
>
I don't know what that might be and you are not telling us what you  
think it is.

> The above is based on the help listing for coxph.object
> coefficients: the coefficients of the linear predictor, which multiply
>          the columns of the model matrix.  If the model is
>          over-determined there will be missing values in the vector
>          corresponding to the redundant columns in the model matrix.
>
> Also, please comment whether n.event/n.risk
The Nelson-Aalen estimator of the cumulative hazard as a function of  
intervals prior to t is sum( n-event(t)/ n.risk(t))
> gives the baseline hazard exp(alpha) ?
No. The "baseline hazard", as you are calling this, would be an  
estimate for persons with all covariates = 0, so in this case is for  
women of age=0. (Not a particularly interpretable result in many  
situations. The baseline hazard following treated ovarian cancer for  
neonates is not medically sensible.)

What is the purpose of this request? Is someone telling you you need  
to provide estimates for instantaneous hazards?


-- 

David Winsemius, MD
West Hartford, CT

Terry Therneau

2010-Oct-28 22:38 UTC

head link

[R] coxph linear.predictors

Gentlemen,
  I read R-news in batch mode so I'm often a day behind.  Let me try to
answer some of the questions.

 1. X*beta  != linear.predictor.  
I'm sorry if the documentation isn't all it could be.  Between the book,
tech report, and help I've written about 400 pages, but this particular
topic isn't yet in it.  The final snipe about being "opaque like
SAS"
was really unfair.
The Cox model is a relative risk model, if lp is a linear predictor then
so is lp +c for any constant; they are equally good and equally valid.
The linear.predictor component in a coxph fit is (X-means) * beta.  The
computation exp(lp) occurs multiple times downstream and this keep the
exp function from overflowing when there is something like a Date object
as a predictor.  Adding this constant changes not a single downstream
calcuation.

2. Survfit is too slow.
 I'd like to hear more about this.  My work mostly involves modest data
sets so perhaps I haven't seen it.  Accuracy and maintainability have
been my first worries.

3. Baseline survival.
 Let xbase be a particular set of values for the x covariates (one for
each).  The survival curve for a given xbase is obtained from survfit
   fit <- coxph(....
   sfit <- survfit(fit, newdata=xbase)
   chaz <- -log(sfit$surv)  #cumulative hazard
(The xbase vector will need to have variable names for the function to
know which value goes to which of course).

The cumulative hazard for any other subject will be 
   newhaz <- chaz * exp(fit$coef%*% (x-xbase))
There is not a simple transformation of the standard error from one fit
to another, however.  You will need to call survfit with a data frame
for newdata, which will return one curve per row with the proper values.

In my view there is no such thing as "A" baseline survival curve.  Any
xbase you chose is a baseline.  However, it is wise to choose something
near the center of the data space in order to avoid numeric problems
with the exp function above.  I would never ever chose a vector of
zeros, although some text books do -- it saves them about 8 characters
of typing in the newhaz formula above.

Terry Therneau

Apparently Analagous Threads

Search for more apparently analagous threads

R help - Oct 2010 - coxph linear.predictors

[R] coxph linear.predictors

[R] coxph linear.predictors

[R] coxph linear.predictors

Apparently Analagous Threads