Polwart Calum (COUNTY DURHAM AND DARLINGTON NHS FOUNDATION TRUST)
2011-Oct-31 18:29 UTC
[R] Kaplan Meier - not for dates
I have some data which is censored and I want to determine the median. It's actually cost data for a cohort of patients, many of whom are still on treatment and so are censored. I can do the same sort of analysis for a survival curve and get the median survival, but can I just use the survival curve functions to plot an X axis that is $ rather than date? If not, is there some other way to achieve this?

Thanks
Calum
I think it really depends on what your event of interest is. If your event is that the patient got better and "left treatment", then I think this could work. You would have to mark as censored any patient still in treatment, or any patient who stopped treatment without getting better (e.g. in the case of death). You would then be predicting the cost required to make the patient well enough to leave treatment. It is a little non-standard to use $ instead of time, but time is money after all.

You could set up your data frame with two columns: 1) cost, 2) event/censored. Then create your survival object:

    mySurv = Surv(my_data$cost, my_data$event)

And then use survfit to create your KM curves:

    myFit = survfit(mySurv ~ 1)

If you have other explanatory variables that you think may influence the cost, you can of course add them to your data frame and change the formula you use in survfit. For instance, you could have some severity measure, e.g. High, Medium, Low. You could then do:

    myFit = survfit(mySurv ~ my_data$severity)
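For anyone wanting to try the steps above end to end, here is a minimal sketch. The data frame is simulated, and the column names `cost`, `event` and `severity` as well as all the numbers are made up for illustration; none of them come from the original data. It only shows the mechanics of `Surv()` and `survfit()` from the survival package with a dollar amount in place of time; whether a Kaplan-Meier estimate on the cost scale is statistically appropriate is a separate question that this sketch does not address.

```r
library(survival)

set.seed(42)
n <- 200

# Hypothetical data: one row per patient.
#   cost     - cumulative treatment cost observed so far ($)
#   event    - 1 = stopped treatment (event), 0 = still on treatment (censored)
#   severity - an optional explanatory variable
my_data <- data.frame(
  cost     = rexp(n, rate = 1 / 20000),
  event    = rbinom(n, size = 1, prob = 0.7),
  severity = factor(sample(c("Low", "Medium", "High"), n, replace = TRUE),
                    levels = c("Low", "Medium", "High"))
)

# Overall Kaplan-Meier curve, with cost in place of time
fit_all <- survfit(Surv(cost, event) ~ 1, data = my_data)

# One curve per severity level
fit_sev <- survfit(Surv(cost, event) ~ severity, data = my_data)

# Printing a survfit object reports the median for each curve
print(fit_all)
print(fit_sev)

# Extract the overall median as a plain number
median_all <- as.numeric(quantile(fit_all, probs = 0.5, conf.int = FALSE))

# X axis in $ rather than time
plot(fit_sev, lty = 1:3,
     xlab = "Cumulative cost ($)",
     ylab = "Proportion still on treatment")
```

Passing `data = my_data` and putting `Surv(...)` directly in the formula is equivalent to building `mySurv` first; it just keeps the variables and the fit together.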
Polwart Calum (COUNTY DURHAM AND DARLINGTON NHS FOUNDATION TRUST)
2011-Nov-03 21:21 UTC
[R] Kaplan Meier - not for dates
Thanks for the reply.

The treatment is effectively for a chronic condition, so you stay on the treatment until it stops working. We know from trials how long that should be, and we know the theoretical cost of that treatment, but that is based on the textbook dose (patients dose-reduce and delay treatment, and dosing is based on weight, so it is variable).

We've been asked to provide our national planning team with an "average" cost based on our early experiences, so we have suggested to them that we might be able to get a median cost. Some patients will stay on treatment for several years, so it will be impossible to get an average for years.

So the censored patients will be those still on treatment (the event being stopping treatment).

I'll give what you've suggested a go.

Thanks
Calum

Calum Polwart BSc(Hons) MSc MRPharmS SPres IPres
Network Pharmacist - NECN and Pharmacy Clinical Team Manager (Cancer & Aseptic Services) - CDDFT

________________________________________
From: Lancaster, Robert (Orbitz) [ROBERT.LANCASTER at orbitz.com]
Sent: 03 November 2011 19:55
Subject: RE: Kaplan Meier - not for dates
--- begin included message --
I have some data which is censored and I want to determine the median. Its actually cost data for a cohort of patients, many of whom are still on treatment and so are censored. I can do the same sort of analysis for a survival curve and get the median survival... ...but can I just use the survival curve functions to plot an X axis that is $ rather than date? If not is there some other way to achieve this?
-- end inclusion --

1. The survfit routines will work, and the results that you plot will indeed be on the dollar scale, BUT
2. The answer will be wrong.

The reason is that the censoring occurs on a time scale, not a $ scale: you don't stop observing someone because total cost hits a threshold, but because calendar time does. The KM routines assume that the censoring process and the event process are on the same scale.

The result can be an overestimation of cost. See Dan-Yu Lin, Biometrics 1997, "Estimating medical costs from incomplete follow-up data".

Terry Therneau
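The point about scales can be explored with a small simulation. The sketch below is only an illustration under made-up assumptions (a patient-specific monthly cost that stays constant, exponential time on treatment, uniform follow-up); none of the numbers or variable names come from the thread, and it is not the estimator from the Lin paper. Because censoring happens in time, a patient who accrues cost quickly is censored at a high dollar amount and one who accrues slowly at a low one, so on the dollar scale the censoring is no longer independent of the eventual total cost. The code compares the Kaplan-Meier median on the dollar scale and on the time scale with the corresponding uncensored medians, which are known here only because the data are simulated.

```r
library(survival)

set.seed(2011)
n <- 5000

# Hypothetical data-generating assumptions (not taken from the thread):
monthly_cost <- rlnorm(n, meanlog = log(1000), sdlog = 0.8)  # $ per month, varies by patient
months_on_tx <- rexp(n, rate = log(2) / 9)                   # time to stopping, median about 9 months
follow_up    <- runif(n, min = 3, max = 36)                  # months of follow-up available

# Censoring is decided on the time scale
obs_months <- pmin(months_on_tx, follow_up)
event      <- as.integer(months_on_tx <= follow_up)

# Costs accrue at each patient's own rate
true_cost <- monthly_cost * months_on_tx   # total cost if fully observed
obs_cost  <- monthly_cost * obs_months     # cost observed up to stopping or censoring

# Kaplan-Meier median for a given "time" variable and event indicator
km_median <- function(time, status) {
  fit <- survfit(Surv(time, status) ~ 1)
  as.numeric(quantile(fit, probs = 0.5, conf.int = FALSE))
}

medians <- c(
  true_cost    = median(true_cost),
  km_on_cost   = km_median(obs_cost, event),
  true_months  = median(months_on_tx),
  km_on_months = km_median(obs_months, event)
)
print(round(medians, 1))
```

On the time scale the censoring is independent of the event by construction, so `km_on_months` should land close to `true_months`. Any gap between `km_on_cost` and `true_cost` reflects the mismatch of scales; its size depends entirely on the assumptions simulated here.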
Polwart Calum (COUNTY DURHAM AND DARLINGTON NHS FOUNDATION TRUST)
2011-Nov-07 13:15 UTC
[R] Kaplan Meier - not for dates
> 2. The answer will be wrong. The reason is that the censoring occurs on a time scale, not a $ scale: you don't stop observing someone because
> total cost hits a threshold, but because calendar time does. The KM routines assume that the censoring process and the event process are on the
> same scale.
> The result can be an overestimation of cost. See Dan-Yu Lin, Biometrics 1997, "Estimating medical costs from incomplete follow-up data".
>
> Terry Therneau

Thanks, that's extremely useful. I'll dig out that reference. You are correct: my censoring is happening on an event, the (dis)continuation of treatment, not on reaching a cumulative cost.

Calum
Polwart Calum (COUNTY DURHAM AND DARLINGTON NHS FOUNDATION TRUST)
2011-Nov-07 13:31 UTC
[R] Kaplan Meier - not for dates
> The result can be an overestimation of cost. See Dan-Yu Lin, Biometrics 1997, "Estimating medical costs from incomplete follow-up data".

Having now skimmed the paper, I see it deals with long-term follow-up. In my particular case the patients are getting treatment for relatively short periods (median time to stopping treatment will be ~9 months) and will discontinue treatment relatively quickly (I'd be surprised if anyone is still on treatment 3-4 years out). I only want the costs of that treatment, not the costs of their overall care to death. I'm not sure how that affects things, but I'm hoping it makes life simpler.

Calum