Rob Balshaw
2003-Mar-28 23:58 UTC
[R] Observational data questions <not S-language question>
< This is not an S-language question, but I hoped it would be of at least passing interest to some members of the group. >

I've encountered a situation which I'm sure is familiar to many. We're looking at an observational dataset with data from many thousands of patients. (So many patients, I won't bother to discuss the observed significance levels of our results. Everything is significant.)

One of our predictive factors of interest is Treatment (Trt A vs Trt B). There are several covariates measured on the patients at the time of entry into the study (say, X1 and X2). The outcome of interest is time to death.

Some patients will develop a disease prior to death, and it is thought that Disease is an important risk factor for death. The development of Disease has been linked to the use of Treatment B. Covariate X1 is also thought to predict Disease. Covariates X1 and X2 are thought to influence the risk of death but may also influence the choice of treatments. All in all, a pretty standard observational study scenario.

Now we conduct a proportional hazards regression analysis for time to death, with Trt, X1 and X2 as covariates. Using this model, we find that Trt A has a hazard ratio considerably less than 1. Treatment A appears to reduce the risk of death after adjusting for differences in the observed covariates X1 and X2.

Next we include Disease as a time-dependent covariate. Under this model, Trt A has a hazard ratio considerably greater than 1, as does Disease. Thus, Trt A now appears to *increase* the risk of death (after adjusting for the observed covariates X1 and X2 *and* the development of Disease).

My difficulty arises when I try to explain to clinicians that I do not find these results contradictory. The hazard ratio for Trt A relative to Trt B could easily be 1.2 when we attempt to 'adjust for' the development of Disease.
This addresses a completely different question than the analysis where Disease is ignored, so it is quite possible for the answer to appear to be so different.

My questions:

(1) Does my interpretation sound reasonable? I've had so many clinicians question me, I'm starting to lose confidence...

(2) Has this phenomenon been explained nicely anywhere? I'd love to be able to argue by appeal to authority...

Thanks for any comments or suggestions. (I'm tempted to build a simulation of this effect, but I'm not certain the clinicians would be too impressed.)

Cheers,
Rob

-- Robert Balshaw, Ph.D.
-- Senior Biostatistician, Syreon Corp.
-- Phone: 604.676.5900x220; Fax: 604.676.5911
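
A minimal sketch of the kind of simulation mentioned above, in Python, might look like the following. Everything in it is an assumption made for illustration, not something taken from the dataset described: treatment is randomized (so X1 and X2 are left out), all hazards are constant, Trt A is given a direct hazard ratio of 1.2 for death, Trt B is given a tenfold higher hazard of developing Disease, and Disease multiplies the death hazard by 5. Crude death rates per person-year stand in for the proportional hazards fits: the rate ratio over all follow-up plays the role of the model that ignores Disease, and the rate ratios within disease-free and diseased person-time play the role of the model with Disease as a time-dependent covariate. A direct harmful effect of Trt A is only one mechanism that could produce the reversal.

```python
import random

# All numbers below are hypothetical, chosen only to make the pattern visible.
BASE_DEATH = 0.05                      # death hazard per year, Trt B, disease-free
DIRECT_HR_A = 1.2                      # assumed direct effect of Trt A on death
DISEASE_HR = 5.0                       # assumed effect of Disease on death
DISEASE_RATE = {"A": 0.02, "B": 0.20}  # assumed: Trt B promotes Disease
FOLLOW_UP = 10.0                       # years until administrative censoring


def simulate_arm(trt, n, rng):
    """Return {state: [deaths, person_years]} for one treatment arm."""
    direct = DIRECT_HR_A if trt == "A" else 1.0
    tally = {"disease_free": [0, 0.0], "diseased": [0, 0.0]}
    for _ in range(n):
        t_disease = rng.expovariate(DISEASE_RATE[trt])
        t_death = rng.expovariate(BASE_DEATH * direct)
        if t_death < t_disease:
            # Death (or censoring) comes before Disease could develop.
            tally["disease_free"][0] += int(t_death <= FOLLOW_UP)
            tally["disease_free"][1] += min(t_death, FOLLOW_UP)
            continue
        tally["disease_free"][1] += min(t_disease, FOLLOW_UP)
        if t_disease >= FOLLOW_UP:
            continue  # censored while still disease-free
        # After onset, the death hazard is multiplied by DISEASE_HR.
        t_after = rng.expovariate(BASE_DEATH * direct * DISEASE_HR)
        remaining = FOLLOW_UP - t_disease
        tally["diseased"][0] += int(t_after <= remaining)
        tally["diseased"][1] += min(t_after, remaining)
    return tally


def rate_ratios(n_per_arm=20000, seed=1):
    """Death rate ratios, Trt A relative to Trt B."""
    rng = random.Random(seed)
    arms = {trt: simulate_arm(trt, n_per_arm, rng) for trt in ("A", "B")}

    def crude_rate(trt):
        deaths = sum(d for d, _ in arms[trt].values())
        years = sum(y for _, y in arms[trt].values())
        return deaths / years

    ratios = {"ignoring_disease": crude_rate("A") / crude_rate("B")}
    for state in ("disease_free", "diseased"):
        deaths_a, years_a = arms["A"][state]
        deaths_b, years_b = arms["B"][state]
        ratios[state] = (deaths_a / years_a) / (deaths_b / years_b)
    return ratios


if __name__ == "__main__":
    for label, ratio in rate_ratios().items():
        print(f"{label:>17}: rate ratio A/B = {ratio:.2f}")
```

Under these assumed values the ratio that ignores Disease should come out well below 1, because Trt A patients spend far less time in the high-risk diseased state, while the two ratios computed within disease status should sit near the assumed direct effect of 1.2. The two analyses answer different questions (overall effect of treatment versus effect of treatment at a given disease status), so the sketch shows both answers can hold in the same data.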