Polwart Calum (COUNTY DURHAM AND DARLINGTON NHS FOUNDATION TRUST)
2013-Apr-29 08:48 UTC
[R] Comparing two different 'survival' events for the same subject using survdiff?
I have a dataset which, for the sake of simplicity, has two endpoints. We would like to test whether two different endpoints have the same eventual meaning. To take an example that people might understand better: let's assume we had a group of subjects who all received a treatment. They could stop treatment for any reason (side effects, treatment stops working, etc.). Getting that data is very easy. Measuring whether treatment has stopped working is very hard to capture... so we would like to test if duration on treatment (easy) is the same as time to treatment failure (hard).

My data might look like this:

library(survival)
A = c(9.77, 0.43, 0.03, 3.50, 7.07, 6.57, 8.57, 2.30, 6.17, 3.27, 2.57, 0.77)
B = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1)  # 1 = yes (censored)
C = c(9.80, 0.43, 5.93, 8.43, 6.80, 2.60, 8.93, 8.37, 12.23, 5.83, 13.17, 0.77)
D = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1)  # 1 = yes (censored)
myData = data.frame(TimeOnTx = A, StillOnTx = B, TimeToFailure = C, NotFailed = D)

We can do a survival analysis on those individually:

OnTxFit = survfit(Surv(TimeOnTx, StillOnTx == 0) ~ 1, data = myData)
FailedFit = survfit(Surv(TimeToFailure, NotFailed == 0) ~ 1, data = myData)
plot(OnTxFit)
lines(FailedFit)  # overlay the second curve on the first

But how can I do a survdiff type of comparison between the two? Do I have to restructure the data so that the times are all in one column, the event in another, and then a group column to indicate what type of event it is? That seems a complex way to do it (especially as the dataset is of course more complex than I've just shown)... so I thought maybe I'm missing something...
Andrews, Chris
2013-Apr-29 11:35 UTC
[R] Comparing two different 'survival' events for the same subject using survdiff?
It isn't that complex:

myDataLong <- data.frame(Time=c(A, C), Censored=c(B, D), group=rep(0:1, times=c(length(A), length(C))))
Fit = survfit(Surv(Time, Censored==0) ~ group, data=myDataLong)
plot(Fit, col=1:2)
survdiff(Surv(Time, Censored==0) ~ group, data=myDataLong)

However, your approach (a 'wide' data frame) suggests that there are equal numbers in the two survival studies. Are they even the same people? Is it even the same study? If so, this is a competing risks question and would have to be approached differently.

And, of course, absence of evidence is not evidence of absence. Failing to reject the null hypothesis that the distributions are equal is not proof that the distributions are equal.

Chris
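As a small illustration of the last point, here is a hedged sketch of how the log-rank p-value can be pulled out of the survdiff result for the long data frame above. The names lr, df and p are introduced only for this example; the p-value is computed from the chisq and n components of the returned object rather than from any field that may differ between versions of the survival package.

```r
library(survival)

# Data from the original post (1 = censored)
A <- c(9.77, 0.43, 0.03, 3.50, 7.07, 6.57, 8.57, 2.30, 6.17, 3.27, 2.57, 0.77)
B <- c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1)
C <- c(9.80, 0.43, 5.93, 8.43, 6.80, 2.60, 8.93, 8.37, 12.23, 5.83, 13.17, 0.77)
D <- c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1)

# Stack the two endpoints: group 0 = time on treatment, group 1 = time to failure
myDataLong <- data.frame(Time = c(A, C), Censored = c(B, D),
                         group = rep(0:1, times = c(length(A), length(C))))

lr <- survdiff(Surv(Time, Censored == 0) ~ group, data = myDataLong)

# Log-rank chi-square statistic, referred to a chi-square distribution
# with (number of groups - 1) degrees of freedom
df <- length(lr$n) - 1
p  <- pchisq(lr$chisq, df = df, lower.tail = FALSE)
print(p)
```

A large p here only means the test did not detect a difference; with 12 subjects it says little about whether the two curves are actually the same. Note also that this test treats the two groups as independent samples.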
Polwart Calum (COUNTY DURHAM AND DARLINGTON NHS FOUNDATION TRUST)
2013-Apr-29 11:56 UTC
[R] Comparing two different 'survival' events for the same subject using survdiff?
> It isn't that complex:
>
> myDataLong <- data.frame(Time=c(A, C), Censored=c(B, D), group=rep(0:1, times=c(length(A), length(C))))
> Fit = survfit(Surv(Time, Censored==0) ~ group, data=myDataLong)
> plot(Fit, col=1:2)
> survdiff(Surv(Time, Censored==0) ~ group, data=myDataLong)

Yes - for the example it's not complex - but once we get down to having more data columns I think it may be... Maybe I ignore those and just build 'myDataLong' for this specific test.

> However, your approach (a 'wide' data frame) suggests that there are equal numbers in the two survival
> studies. Are they even the same people? Is it even the same study? If so, this is a competing risks question
> and would have to be approached differently.

Yes, it's the same patients. The two events are technically independent of each other, but the hope is that the easier outcome measure would predict the other... I'm not familiar with competing risks and so will have to read up on it, but it isn't a scenario where A or B happens: A happens and B happens, and you might expect that A happened because B happened...

> And, of course, absence of evidence is not evidence of absence. Failing to reject the null hypothesis that the
> distributions are different is not proof that the distributions are equal.

Yes, absolutely - however, I'm half expecting to detect a difference and so then dismiss using A as a surrogate of B...

Thanks
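For the case with more data columns, a minimal sketch of one way to build the long data frame while carrying the other columns along. The id column and the names other_cols, long and Endpoint are assumptions made for this example, not part of the original data. Because both rows for a subject share an id, the sketch also shows one possible way to acknowledge that the same patients appear twice, a Cox model with a cluster(id) term for a robust variance. That is an illustration only, not a method proposed in this thread.

```r
library(survival)

# Data from the original post (1 = censored), plus a hypothetical subject id
A <- c(9.77, 0.43, 0.03, 3.50, 7.07, 6.57, 8.57, 2.30, 6.17, 3.27, 2.57, 0.77)
B <- c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1)
C <- c(9.80, 0.43, 5.93, 8.43, 6.80, 2.60, 8.93, 8.37, 12.23, 5.83, 13.17, 0.77)
D <- c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1)
myData <- data.frame(id = seq_along(A),  # assumed identifier, one per subject
                     TimeOnTx = A, StillOnTx = B,
                     TimeToFailure = C, NotFailed = D)

# Every column that is not one of the four endpoint columns is carried along
endpoint_cols <- c("TimeOnTx", "StillOnTx", "TimeToFailure", "NotFailed")
other_cols <- setdiff(names(myData), endpoint_cols)

# One block of rows per endpoint, stacked
long <- rbind(
  data.frame(myData[other_cols], Time = myData$TimeOnTx,
             Censored = myData$StillOnTx, Endpoint = "OnTx"),
  data.frame(myData[other_cols], Time = myData$TimeToFailure,
             Censored = myData$NotFailed, Endpoint = "Failure")
)
long$Endpoint <- factor(long$Endpoint, levels = c("OnTx", "Failure"))

# Same log-rank comparison as before, now on the data frame with extra columns
survdiff(Surv(Time, Censored == 0) ~ Endpoint, data = long)

# Illustration only: robust (sandwich) variance grouped by subject
fit <- coxph(Surv(Time, Censored == 0) ~ Endpoint + cluster(id), data = long)
summary(fit)
```

Neither call answers whether time on treatment is a valid stand-in for time to failure; they only compare the two marginal distributions.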
Terry Therneau
2013-Apr-30 14:55 UTC
[R] Comparing two different 'survival' events for the same subject using survdiff?
-----Original Message-----
I have a dataset which for the sake of simplicity has two endpoints. We would like to test if two different end-points have the same eventual meaning. To try and take an example that people might understand better: Let's assume we had a group of subjects who all received a treatment. They could stop treatment for any reason (side effects, treatment stops working etc). Getting that data is very easy. Measuring if treatment stops working is very hard to capture... so we would like to test if duration on treatment (easy) is the same as time to treatment failure (hard).
--- End ----

The problem you describe is known as "surrogate endpoints" and addressing it is harder than you think. You will need to look in the literature to gain an understanding of the issues before you proceed. Your question is an important one and lots of folks have thought about it more deeply than I.

Cohn JN (2004). "Introduction to Surrogate Markers". Circulation (American Heart Association) 109 (25 Suppl 1): IV20-1. doi:10.1161/01.CIR.0000133441.05780.1d

Fleming TR, DeMets DL. Surrogate End Points in Clinical Trials: Are We Being Misled? Ann Intern Med. 1996 Oct 1;125(7):605-13

Terry T.