Terry Therneau
2011-Jul-22 12:04 UTC
[R] Cox model approximations (was "comparing SAS and R survival....)
For time scales that are truly discrete, Cox proposed the "exact partial likelihood". I call that the "exact" method and SAS calls it the "discrete" method. What we compute is precisely the same; however, they use a clever algorithm which is faster. To make things even more confusing, Prentice introduced an "exact marginal likelihood" which is not implemented in R, but which SAS calls the "exact" method.

Data is usually not truly discrete, however. More often ties are the result of imprecise measurement or grouping. The Efron approximation assumes that the data are actually continuous and that we see ties because of this; it also introduces an approximation at one point in the calculation which greatly speeds up the computation; numerically the approximation is very good.

In spite of the irrational love that our profession has for anything branded with the word "exact", I currently see no reason to ever use that particular computation in a Cox model. I'm not quite ready to remove the option from coxph, but certainly am not going to devote any effort toward improving that part of the code.

The Breslow approximation is less accurate, but it is the easiest to program and therefore was the only method in early Cox model programs; it persists as the default in many software packages because of history. Truth be told, unless the number of tied deaths is quite large, the difference in results between it and the Efron approximation will be trivial.

The worst approximation, and the one that can sometimes give seriously strange results, is to artificially remove ties from the data set by adding a random value to each subject's time.

Terry T

--- begin quote --
I didn't know precisely the specifics of each approximation method. I thus came back to section 3.3 of Therneau and Grambsch, Extending the Cox Model. I think I now see things more clearly. If I have understood correctly, both the "discrete" option and the "exact" functions assume "true" discrete event times in a model approximating the Cox model. The Cox partial likelihood cannot be exactly maximized, or even written, when there are some ties, am I right?

In my sample, many of the ties (those within a single observation of the process) are due to the fact that continuous event times are grouped into intervals. So I think the logistic approximation may not be the best for my problem, even though the estimates on my real data set (shown in my previous post) do give interesting results in the context of my data!

I was thinking about distributing the events uniformly within each interval. What do you think about this option? Can I expect a better approximation than directly applying the Breslow or Efron method to the grouped event data? Finally, it becomes a model problem more than a computational or algorithmic one, I guess.
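[Editor's note: for concreteness, a minimal sketch of how the three choices are requested in coxph; the data frame df, with columns time, status and a covariate x containing tied event times, is hypothetical.]

  library(survival)

  ## 'df' is an assumed data frame with tied event times
  fit_efron   <- coxph(Surv(time, status) ~ x, data = df, method = "efron")
  fit_breslow <- coxph(Surv(time, status) ~ x, data = df, method = "breslow")
  fit_exact   <- coxph(Surv(time, status) ~ x, data = df, method = "exact")

  ## unless ties are heavy, the coefficients should be nearly identical
  rbind(efron   = coef(fit_efron),
        breslow = coef(fit_breslow),
        exact   = coef(fit_exact))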
Mike Marchywka
2011-Jul-22 12:20 UTC
[R] Cox model approximations (was "comparing SAS and R survival....)
> From: therneau at mayo.edu
> To: aboueslati at gmail.com
> Date: Fri, 22 Jul 2011 07:04:15 -0500
> CC: r-help at r-project.org
> Subject: Re: [R] Cox model approximations (was "comparing SAS and R survival....)
>
> [...]
>
> The worst approximation, and the one that can sometimes give seriously
> strange results, is to artificially remove ties from the data set by
> adding a random value to each subject's time.

Care to elaborate on this at all? First, of course, I would agree that doing anything to the data, or making up data, and then handing it to an analysis tool that doesn't know you manipulated it can be a problem (often called interpolation or something else with a legitimate name, LOL). However, it is not unreasonable to do a sensitivity analysis by adding noise and checking the results. Presumably adding noise to remove things the algorithm doesn't happen to like would work, but you would need to take many samples and examine the statistics of how you broke the ties.

Now, if the model is bad to begin with, or the data are so coarsely binned that you can't get much out of them, then OK. I guess in this case, having not thought about it too much, ties would be most common either with lots of data, or if hazards spiked over time scales similar to your measurement precision, or if the measurement resolution is not comparable to the hazard rate. In the latter two cases, of course, the approach is probably quite limited. Consider turning exponential curves into step functions, for example.
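[Editor's note: a rough sketch of that kind of sensitivity check; the data frame df, the jitter scale, and the number of replicates are all made up for illustration.]

  library(survival)

  ## break ties with a small random shift, refit, and look at how much
  ## the estimate moves across replicates; the jitter scale (0.01) is
  ## arbitrary and should be matched to the measurement grid of the data
  set.seed(1)
  coefs <- replicate(200, {
    dfj <- df
    dfj$time <- dfj$time + runif(nrow(dfj), 0, 0.01)
    coef(coxph(Surv(time, status) ~ x, data = dfj, method = "breslow"))
  })
  summary(coefs)  # spread of the estimate over random tie-breakings
  sd(coefs)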
Göran Broström
2011-Jul-24 11:49 UTC
[R] Cox model approximations (was "comparing SAS and R survival....)
On Fri, Jul 22, 2011 at 2:04 PM, Terry Therneau <therneau at mayo.edu> wrote:

> For time scales that are truly discrete, Cox proposed the "exact partial
> likelihood".

Or "the method of partial likelihood" applied to the discrete logistic model.

> I call that the "exact" method and SAS calls it the
> "discrete" method. What we compute is precisely the same, however they
> use a clever algorithm which is faster.

Note that the model to estimate here is discrete. The "base-line" conditional probabilities at each failure time are eliminated through the partial likelihood argument. This can also be described as a conditional logistic regression, where we condition on the total number of failures in each risk set (thus eliminating the risk-set-specific parameters). Suppose that in a risk set of size n there are d failures. This method must then consider all possible ways of choosing d failures out of the n at risk, or choose(n, d) cases. This makes the computational burden huge with lots of ties.

The method "ml" in "coxreg" (package 'eha') uses a different approach. Instead of conditional logistic regression it performs unconditional logistic regression by adding one parameter per risk set. In principle this is possible to do with 'glm' after expanding the data set with "toBinary" in 'eha', but with large data sets and lots of risk sets, glm chokes. Instead, with the "ml" approach in "coxreg", the extra parameters just introduced are eliminated by profiling them out! This leads to a fast estimation procedure, compared to the above-mentioned 'exact' methods. A final note: with "ml", the logistic regression uses the cloglog link, to be compatible with the situation when data really are continuous but grouped, and a proportional hazards model holds. (Interestingly, conditional inference is usually used to simplify things; here it creates computational problems not present without conditioning.)

> To make things even more
> confusing, Prentice introduced an "exact marginal likelihood" which is
> not implemented in R, but which SAS calls the "exact" method.

This is not so confusing if we realize that we are now in the continuous-time model. Then, with a risk set of size n with d failures, we must consider all possible permutations of the d failures, or d! cases. That is, here we assume that ties occur because of imprecise measurement and that there is one true ordering. This method calculates an average contribution to the partial likelihood. (Btw, you refer to "Prentice", but isn't this from the Biometrika paper by Kalbfleisch & Prentice (1973)? And of course their classical book?)

> Data is usually not truly discrete, however. More often ties are the
> result of imprecise measurement or grouping. The Efron approximation
> assumes that the data are actually continuous but we see ties because of
> this; it also introduces an approximation at one point in the
> calculation which greatly speeds up the computation; numerically the
> approximation is very good.

Note that both Breslow's and Efron's approximations are approximations of the "exact marginal likelihood".

> In spite of the irrational love that our profession has for anything
> branded with the word "exact", I currently see no reason to ever use
> that particular computation in a Cox model.

Agreed; but only because it is so time consuming. The unconditional logistic regression with profiling is a good alternative.

> I'm not quite ready to
> remove the option from coxph, but certainly am not going to devote any
> effort toward improving that part of the code.
>
> The Breslow approximation is less accurate, but is the easiest to
> program and therefore was the only method in early Cox model programs;
> it persists as the default in many software packages because of history.
> Truth be told, unless the number of tied deaths is quite large the
> difference in results between it and the Efron approx will be trivial.
>
> The worst approximation, and the one that can sometimes give seriously
> strange results, is to artificially remove ties from the data set by
> adding a random value to each subject's time.

Maybe, but randomly breaking ties may not be a bad idea; you could regard that as getting an (unbiased?) estimator of the exact (continuous-time) partial likelihood. Expanding: instead of going through all possible permutations, why not take a random sample of size greater than one?

Göran

--
Göran Broström
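[Editor's note: a minimal sketch of the computations discussed above; the counting-process data frame dat, with columns enter, exit, event and x, is hypothetical, and the coxreg call follows the 'eha' documentation as recalled here.]

  library(eha)

  ## discrete-time model via unconditional logistic regression with the
  ## risk-set parameters profiled out ("ml"), versus Efron's approximation
  fit_ml    <- coxreg(Surv(enter, exit, event) ~ x, data = dat, method = "ml")
  fit_efron <- coxreg(Surv(enter, exit, event) ~ x, data = dat, method = "efron")

  ## why the 'exact' methods are expensive: one risk set with n = 50 at
  ## risk and d = 10 tied failures involves
  choose(50, 10)   # ~1.03e10 subsets for the exact partial likelihood
  factorial(10)    # 3,628,800 orderings for the exact marginal likelihood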
AO_Statistics
2012-Apr-02 16:03 UTC
[R] Cox model approximations (was "comparing SAS and R survival....)
I have a question about the approximations of Cox's partial likelihood in the "coxph" function of the "survival" package (and in SAS as well) in the presence of tied events generated by grouping continuous event times into intervals.

I am estimating models for recurrent events with time-dependent covariates in the Andersen and Gill formulation of Cox's model. If I have understood Breslow's and Efron's approximations correctly, they consist in modifying the denominators of the contributing likelihood terms when we do not know the order of occurrence of the events. This order matters only if the tied events are associated with different values of the covariate.

I would like to know whether the "breslow" and "efron" options still modify the initial denominators of the terms when the tied events correspond to the same covariate value. In particular, within the same trajectory of the observed process (the same individual), the covariate is measured once for all of the tied events. To my mind, we would introduce an unnecessary bias in this case, since the initial partial likelihood is correct as written.

Thank you.
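[Editor's note: one empirical check is to fit the Andersen-Gill model with both options and compare; a sketch, assuming a counting-process data frame rec with columns id, start, stop, status and a time-dependent covariate z.]

  library(survival)

  fit_e <- coxph(Surv(start, stop, status) ~ z + cluster(id),
                 data = rec, method = "efron")
  fit_b <- coxph(Surv(start, stop, status) ~ z + cluster(id),
                 data = rec, method = "breslow")

  ## if ties occur mostly within the same subject (same covariate value),
  ## the two sets of coefficients should agree closely
  cbind(efron = coef(fit_e), breslow = coef(fit_b))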