It turns out that the issue is ties in survival times. Replicating the rows
produces tied failure times; weighting does not, so the two fits give different
results under the default (Efron) handling of ties. The effect is unusually
large here, perhaps because of the small sample size. If you use the (less
accurate) Breslow approximation for ties, you do get the same answer for both
data sets.

> coxph(Surv(time,status)~x+strata(sex),data=test,weights=wt,method="breslow")
Call:
coxph(formula = Surv(time, status) ~ x + strata(sex), data = test,
    weights = wt, method = "breslow")

  coef exp(coef) se(coef)    z    p
x 1.01      2.73    0.734 1.37 0.17

Likelihood ratio test=1.99 on 1 df, p=0.159 n=6 (1 observation deleted due to
missingness)

> coxph(Surv(time,status)~x+strata(sex),data=test_freq,method="breslow")
Call:
coxph(formula = Surv(time, status) ~ x + strata(sex), data = test_freq,
    method = "breslow")

  coef exp(coef) se(coef)    z    p
x 1.01      2.73    0.734 1.37 0.17

Likelihood ratio test=1.99 on 1 df, p=0.159 n=18 (3 observations deleted due
to missingness)

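For anyone who wants to reproduce the comparison, here is a minimal sketch. It assumes the survival package is installed and rebuilds the data as a data frame from the values quoted further down; the helper fit_pair() and the object names are made up for illustration and are not from the original post.

```r
library(survival)

# Data as quoted in the original post (status has one NA, which coxph drops).
test1 <- data.frame(
  time   = c(4, 3, 1, 1, 2, 2, 3),
  status = c(1, NA, 1, 0, 1, 1, 0),
  x      = c(0, 2, 1, 1, 1, 0, 0),
  sex    = c(0, 0, 0, 0, 1, 1, 1),
  wt     = rep(3, 7)
)

# Replicate every row three times, mirroring test_freq in the original post.
test_freq <- test1[rep(seq_len(nrow(test1)), each = 3), ]

# Hypothetical helper: fit the weighted and the replicated model with the
# same tie-handling method and return the two coefficients for x.
fit_pair <- function(ties_method) {
  weighted <- coxph(Surv(time, status) ~ x + strata(sex), data = test1,
                    weights = wt, method = ties_method)
  replicated <- coxph(Surv(time, status) ~ x + strata(sex), data = test_freq,
                      method = ties_method)
  c(weighted = unname(coef(weighted)), replicated = unname(coef(replicated)))
}

efron_coefs   <- fit_pair("efron")    # the default tie handling in coxph
breslow_coefs <- fit_pair("breslow")

# One row per tie-handling method, one column per way of encoding the data.
print(rbind(efron = efron_coefs, breslow = breslow_coefs))
```

The Breslow row should show the same coefficient in both columns, while the Efron row should not, which is the discrepancy reported in the original post.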
-thomas
On Fri, 13 Jun 2008, mah wrote:
> I am confused by the results of the weights option for coxph. I
> replicated each row three times from the help page for coxph in the
> data frame test_freq. I had expected that the coefficients,
> significance tests, and tests of non-proportionality would yield the
> same results for the replicated and non-replicated data, but the
> output below shows differences in all three metrics. Is this the
> result of a curved response variable? This is likely more of a
> conceptual question than a language question, but all help is
> sincerely appreciated.
>
> Mike
>
>> test1
> $time
> [1] 4 3 1 1 2 2 3
>
> $status
> [1] 1 NA 1 0 1 1 0
>
> $x
> [1] 0 2 1 1 1 0 0
>
> $sex
> [1] 0 0 0 0 1 1 1
>
> $wt
> [1] 3 3 3 3 3 3 3
>
>> test_freq
> time status x sex
> 1 4 1 0 0
> 2 4 1 0 0
> 3 4 1 0 0
> 4 3 NA 2 0
> 5 3 NA 2 0
> 6 3 NA 2 0
> 7 1 1 1 0
> 8 1 1 1 0
> 9 1 1 1 0
> 10 1 0 1 0
> 11 1 0 1 0
> 12 1 0 1 0
> 13 2 1 1 1
> 14 2 1 1 1
> 15 2 1 1 1
> 16 2 1 0 1
> 17 2 1 0 1
> 18 2 1 0 1
> 19 3 0 0 1
> 20 3 0 0 1
> 21 3 0 0 1
>> t1 <- coxph( Surv(time, status) ~ x + strata(sex), data=test1, weights=wt)
>> summary(t1)
> Call:
> coxph(formula = Surv(time, status) ~ x + strata(sex), data = test1,
> weights = wt)
>
> n=6 (1 observation deleted due to missingness)
> coef exp(coef) se(coef) z p
> x 1.17 3.22 0.744 1.57 0.12
>
> exp(coef) exp(-coef) lower .95 upper .95
> x 3.22 0.311 0.749 13.8
>
> Rsquare= 0.353 (max possible= 0.999 )
> Likelihood ratio test= 2.61 on 1 df, p=0.106
> Wald test = 2.47 on 1 df, p=0.116
> Score (logrank) test = 2.67 on 1 df, p=0.102
>
>> cox.zph(t1)
> rho chisq p
> x -0.0716 0.00598 0.938
>> t_freq <- coxph( Surv(time, status) ~ x + strata(sex), data=test_freq)
>> summary(t_freq)
> Call:
> coxph(formula = Surv(time, status) ~ x + strata(sex), data = test_freq)
>
> n=18 (3 observations deleted due to missingness)
> coef exp(coef) se(coef) z p
> x 1.41 4.09 0.756 1.86 0.063
>
> exp(coef) exp(-coef) lower .95 upper .95
> x 4.09 0.245 0.929 18.0
>
> Rsquare= 0.185 (max possible= 0.879 )
> Likelihood ratio test= 3.69 on 1 df, p=0.0549
> Wald test = 3.47 on 1 df, p=0.0626
> Score (logrank) test = 3.84 on 1 df, p=0.0499
>
>> cox.zph(t_freq)
> rho chisq p
> x -0.0697 0.0526 0.819
>
Thomas Lumley Assoc. Professor, Biostatistics
tlumley at u.washington.edu University of Washington, Seattle