Eric Rescorla
2008-Nov-10 14:54 UTC
[R] coxph diagnostics plot for shape of hazard function?
Hi,

I've been banging my head against the following problem for a while and thought the fine people on r-help might be able to help. I'm using the survival package.

I'm studying the survival rate of a population with a preexisting linear-like event rate (there are theoretical reasons to believe it's linear, but of course it's subject to the usual sampling noise). Some of the population exhibit predictor X and some don't. [I'm not trying to be cagey about the setting here, it's just complicated to explain and I'm trying to keep my message short.]

When I plot the survival curves, there's a clear qualitative difference, and this is confirmed by survdiff. When I run cox.zph, however, it's pretty clear that the proportional hazards assumption isn't satisfied:

> zph <- cox.zph(cox)
> zph
                         rho chisq        p
Initially.Vulnerable -0.0476  32.5 1.19e-08

Similarly, when I do plot(zph), B(t) is fairly non-constant.

This isn't inherently a problem for me. I don't need a hard single number to characterize the shape of the excess risk. However, I'd like to be able to say something qualitative about the shape of the excess risk for the predictor. E.g., is it linear, monotonically increasing, monotonically decreasing, etc. Is it safe to use the coxph diagnostic plot for this purpose?

I did try heuristically subtracting out the background and then fitting a spline using locfit as described in the MASS supplement, but this seemed a little ad hoc; I was hoping for something more principled.

Thanks in advance.

-Ekr
Terry Therneau
2008-Nov-11 14:14 UTC
[R] coxph diagnostics plot for shape of hazard function?
> Similarly, when I do plot(zph), B(t) is fairly non-constant.
> This isn't inherently a problem for me. I don't need a hard single number
> to characterize the shape of the excess risk. However, I'd like to be
> able to say
> something qualitative about the shape of the excess risk for the predictor.
> E.g., is it linear, monotonically increasing, monotonically decreasing, etc.
> Is it safe to use the coxph diagnostic plot for this purpose?

Basically, yes you can. There are a few caveats:

1. As a computational shortcut, cox.zph assumes that var(X) is approximately constant over time, where X is the matrix of covariates. (Improving this has been on my to-do list for some time.) I have found this to be almost always true, but if you have a data set where, e.g., everyone in treatment 1 is crossed over at 6 months, then you can get odd results for that covariate. I've run across 2-3 such data sets in 10+ years.

2. The spline curve on the plot is "for the eye". You can certainly use other smoothings, fit a line, etc. Often you can find a simpler fit.

    zpfit <- cox.zph(mycoxfit, transform='identity')
    plot(zpfit$x, zpfit$y[,1], xlab='Time')    # look at variable 1
    lines(lowess(zpfit$x, zpfit$y[,1]), col=2)
    abline(lm(zpfit$y[,1] ~ zpfit$x), col=3)

    plot(zpfit$x, zpfit$y[,1], log='x')    # same as transform='log'
    etc.

Sometimes the regression spline fit, the default for cox.zph, puts an extra "hook" on the end of the curve, somewhat like polynomials will.

Terry T.
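
For anyone who wants to try the approach above without their own data, here is a minimal self-contained sketch. It is an illustration only: it assumes the veteran data set that ships with the survival package and uses karno as the single covariate, neither of which comes from the original question. The names fit, zp, tt, bt and lin are placeholders.

```r
library(survival)

# Assumption: the 'veteran' lung cancer data bundled with survival,
# used here only as a stand-in for the poster's data.
fit <- coxph(Surv(time, status) ~ karno, data = veteran)

# Scaled Schoenfeld residuals on the untransformed time axis
zp <- cox.zph(fit, transform = "identity")

tt <- zp$x        # event times (identity transform)
bt <- zp$y[, 1]   # residual-based values for the first (only) term

plot(tt, bt, xlab = "Time", ylab = "Beta(t) for karno")
lines(lowess(tt, bt), col = 2)   # local smoother, as an alternative to the spline
lin <- lm(bt ~ tt)
abline(lin, col = 3)             # straight-line fit

# The sign of the fitted slope is an informal indication of whether
# the effect drifts up or down over time.
coef(lin)
```

Comparing the lowess curve with the straight line by eye is a rough way to judge whether "roughly linear" or "monotone" is a fair description. The line fit is descriptive only, so its standard errors should not be read as a formal test.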