thr3ads.net - R help - [R] Kaplan Meier analysis: 95% CI wider in R than in SAS [Apr 2012]

If this information is useful, please help other people find it:
Share via:

Paul Miller

2012-Apr-13 12:13 UTC

[R] Kaplan Meier analysis: 95% CI wider in R than in SAS

Hello All,
?
Am replicating in R an analysis I did earlier using SAS. See this as a test of
whether I'm ready to start using R in my day-to-day work.
?
Just finished replicating a Kaplan Meier analysis. Everything seems to work out
fine except for one thing. The 95% CI around my estimate for the median is
substantially larger in R than in SAS. For example, in SAS I have a median of
3.29 with a 95% CI of [1.15, 5.29]. In R, I get a median of 3.29 with a 95% CI
of [1.35,?13.35].
?
Can anyone tell me why I get this difference?
?
My R code looks like:
?
survfrm <- Surv(progression_months_landmark_14,progression==1) ~
pr_rg_landmark_14
survobj <- survfit(survfrm, data=Survival)
survlrk <- survdiff(survfrm, data=Survival)
summary(survobj)
print(survobj)
print(survlrk)
?
My SAS code looks like:
?
proc lifetest data=survival;
strata pr_rg_landmark_14;
time progression_months_landmark_14 * progression(0);
run;

Thought maybe the difference could have something to do with the strata
statement in the SAS code not being translated properly into R. Tried changing
my R code to make pr_rg_landmark_14 a strata but this didn't seem to change
anything. Except that I no longer got a log rank test.

Thanks,

Paul
?
?
?
?

?
?
?

Paul Miller

2012-Apr-13 15:36 UTC

head link

[R] Kaplan Meier analysis: 95% CI wider in R than in SAS

Hi Enrico,

Not sure how SAS builds the CI but I can look into it. The SAS documentation
does have a section on computational formulas at:

http://support.sas.com/documentation/cdl/en/statug/63033/HTML/default/viewer.htm#statug_lifetest_a0000000259.htm

Although I can't provide my dataset, I can provide the data and code below.
This is the R-equivalent of an analysis from "Common Statistical Methods
for Clinical Research with SAS Examples."

R produces the follwoing output:
> print(surv.by.vac)Call: survfit(formula = Surv(WKS, CENS == 0) ~ VAC, data = hsv)

        records n.max n.start events median 0.95LCL 0.95UCL
VAC=GD2      25    25      25     14     35      15      NA
VAC=PBO      23    23      23     17     15      12      35

SAS has the same 95% CI for VAC=GD2 but has a 95% CI of [10, 27] for VAC=PBO.
This is just like in the analysis I'm doing currently.

Thanks,

Paul

?
#######################################
#### Chapter 21: The Log-Rank Test ####
#######################################
?
#####################################################
#### Example 21.1: HSV2 Vaccine with gD2 Vaccine ####
#####################################################
?
connection <- textConnection("
GD2? 1?? 8 12? GD2? 3 -12 10? GD2? 6 -52? 7
GD2? 7? 28 10? GD2? 8? 44? 6? GD2 10? 14? 8
GD2 12?? 3? 8? GD2 14 -52? 9? GD2 15? 35 11
GD2 18?? 6 13? GD2 20? 12? 7? GD2 23? -7 13
GD2 24 -52? 9? GD2 26 -52 12? GD2 28? 36 13
GD2 31 -52? 8? GD2 33?? 9 10? GD2 34 -11 16
GD2 36 -52? 6? GD2 39? 15 14? GD2 40? 13 13
GD2 42? 21 13? GD2 44 -24 16? GD2 46 -52 13
GD2 48? 28? 9? PBO? 2? 15? 9? PBO? 4 -44 10
PBO? 5? -2 12? PBO? 9?? 8? 7? PBO 11? 12? 7
PBO 13 -52? 7? PBO 16? 21? 7? PBO 17? 19 11
PBO 19?? 6 16? PBO 21? 10 16? PBO 22 -15? 6
PBO 25?? 4 15? PBO 27? -9? 9? PBO 29? 27 10
PBO 30?? 1 17? PBO 32? 12? 8? PBO 35? 20? 8
PBO 37 -32? 8? PBO 38? 15? 8? PBO 41?? 5 14
PBO 43? 35 13? PBO 45? 28? 9? PBO 47?? 6 15
")

hsv <- data.frame(scan(connection, list(VAC="", PAT=0, WKS=0,
X=0)))
hsv <- transform(hsv,
??CENS = ifelse(WKS < 1, 1, 0),
??WKS? = abs(WKS),
??TRT? = ifelse(VAC=="GD2", 1, 0))

library("survival")
surv.by.vac <- survfit(Surv(WKS,CENS==0)~VAC, data=hsv)

plot(surv.by.vac, 
?main = "The Log-Rank Test \n Example 21.1: HSV-Episodes with gD2
Vaccine",
?ylab = "Survival Distribution Function",
?xlab = "Survival Time in Weeks",
?lty = c(1,2))

legend(0.75,0.19, 
?legend = c("gD2","PBO"), 
?lty = c(1,2), title = "Treatment")

summary(surv.by.vac)
print(surv.by.vac)
?

Paul Miller

2012-Apr-14 13:36 UTC

head link

[R] Kaplan Meier analysis: 95% CI wider in R than in SAS

Hello Drs. Colosimo and Harrell,
 
Thank you for your replies to my question. From Dr. Colosimo, I was able to
determine that the SAS results can be replicated by adding the
option conf.type="log-log" to my code as in :
 
survobj <- survfit(survfrm, conf.type="log-log", data=Survival)
 
Originally, it looked like the SAS results could be replicated using
conf.type="plain". Applying this option to my actual data revealed
that this was not the case, however. 
 >From Dr. Harrell, I learned that using conf.type="log-log" may not
be such a good idea. Interestingly though, I've seen at least one instance
where experts in the R community use this option in their book. The book is
about 10 years old. So maybe opinion about the use of this option has shifted
since then. 
Thanks,
 
Paul  
 
	[[alternative HTML version deleted]]

Terry Therneau

2012-Apr-16 13:30 UTC

head link

[R] Kaplan Meier analysis: 95% CI wider in R than in SAS

On 04/14/2012 05:00 AM, r-help-request at r-project.org
wrote:> Am replicating in R an analysis I did earlier using SAS. See this as a test
of whether I'm ready to start using R in my day-to-day work.
> ?
> Just finished replicating a Kaplan Meier analysis. Everything seems to work
out fine except for one thing. The 95% CI around my estimate for the median is
substantially larger in R than in SAS. For example, in SAS I have a median of
3.29 with a 95% CI of [1.15, 5.29]. In R, I get a median of 3.29 with a 95% CI
of [1.35,?13.35].
> ?
> Can anyone tell me why I get this difference?
>
The confidence interval for the median is based on the confidence 
intervals for the curves.  There are several methods for computing 
confidence intervals for the curves: plain, log, log-log, or logit 
scale.  There are opinions on which is best, and it is a close race: 
except for the first of these.  The type "plain" intervals are awful, 
it's like putting me in one lane of a championship 100 meter dash.

Until about version 9 the only option in SAS was "plain", then for a 
time it was still the default.  By 9.2 they finally went to loglog.

Terry Therneau

Possibly Parallel Threads

Search for more maybe matching threads

R help - Apr 2012 - Kaplan Meier analysis: 95% CI wider in R than in SAS

[R] Kaplan Meier analysis: 95% CI wider in R than in SAS

[R] Kaplan Meier analysis: 95% CI wider in R than in SAS

[R] Kaplan Meier analysis: 95% CI wider in R than in SAS

[R] Kaplan Meier analysis: 95% CI wider in R than in SAS

Possibly Parallel Threads