Therneau, Terry M., Ph.D.
2016-Jun-04 11:28 UTC
[R] Johannes Hengelbrock <j.hengelbrock@uke.de>
I'm traveling so chasing this down more fully will wait until I get home.
Four points.
1. This is an edge case. You will notice that if you add
"subset=1:100" to the coxph call that the function works perfectly.
You have to get up to 1000 or so before it fails.
2. The exact partial likelihood of Cox is referred to as "exact" by
coxph and as "discrete" by SAS phreg. What phreg calls
'exact' is the exact marginal likelihood of Prentice. I don't know
which of these you were using in SAS so can't verify that "phreg
works".
3. The computation and memory for the exact calculation go up phenomenally with
the number of ties. It is a sum over n choose d terms: if there were 10
events out of 200 at risk this is a sum over all ways to choose 10 subjects out
of 200, which is approx 2e16 terms. Your example requires all choices of 1541
out of 3000, which I would expect to take somewhere near age-of-the-universe
seconds to compute. The code uses a clever nested computation due to Gail et al
which will cut that time down to infinity/10.
4. This example drove coxph into a memory fault, I suspect. I will certainly
look into patching that once I get home. (There is a check for this but it
must have a flaw). My sympathy for your plight is low, however. I
can't conceive of the real data problem where someone would actually need to
compute this awful likelihood. 1541 events tied at the same time? It is even
harder to imagine a case where I would need it badly enough to wait a lifetime for
the answer. The Efron approximation is pretty darn good for cases like this,
and it is fast.
Terry Therneau
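
To put the counts in point 3 on a concrete footing, the number of terms can be checked in base R. This is only a sketch of the arithmetic and says nothing about how coxph organizes the sum internally. lchoose() is used for the large case because the count itself is far too big for a double.

```r
# Number of ways to choose d tied events out of n subjects at risk.
terms_small <- choose(200, 10)
terms_small                      # about 2.2e16, the figure quoted in point 3

# 1541 events out of 3000 at risk: work on the log scale.
log10_terms <- lchoose(3000, 1541) / log(10)
log10_terms                      # the number of terms is 10^log10_terms
```

The second value comes out at roughly 900, i.e. a sum with on the order of 10^900 terms if it were enumerated directly.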
---------------------------------------------------
Dear users,
I'm trying to estimate a conditional logistic model using the
coxph()-function from the survival package. Somehow, the model does not
converge if time is set to the same value for all observations:
library(survival)
set.seed(12345)
n <- 3000
a <- rbinom(n, 1, 0.5)
b <- rbinom(n, 1, 0.5)
coxph(formula = Surv(rep(1, 3000), a) ~ b, method = "exact")
Error in fitter(X, Y, strats, offset, init, control, weights = weights, :
  NA/NaN/Inf in foreign function call (arg 5)
In addition: Warning message:
In fitter(X, Y, strats, offset, init, control, weights = weights, :
  Ran out of iterations and did not converge
Changing iter.max does not help, apparently. Strangely, the exact same
model converges in SAS.
I know that I could estimate the model differently (via glm), but I
would like to understand why the model does converge in SAS but not in R.
Thanks,
Johannes
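
For reference, the alternatives raised in this thread can be tried on the same simulated data: the exact method restricted to the first 100 rows, the Efron approximation on all rows, and an ordinary logistic regression via glm(). This is a minimal sketch. The object names (d, fit_small, fit_efron, fit_glm) are illustrative and not from the original posts, and the glm() fit is an unconditional logistic model, so its coefficient need not match the conditional one exactly.

```r
library(survival)

set.seed(12345)
n <- 3000
a <- rbinom(n, 1, 0.5)
b <- rbinom(n, 1, 0.5)
d <- data.frame(time = rep(1, n), a = a, b = b)   # illustrative data frame

# Exact partial likelihood on a small subset (point 1 of the reply).
fit_small <- coxph(Surv(time, a) ~ b, data = d, subset = 1:100,
                   method = "exact")

# Efron approximation for ties on the full data (point 4 of the reply).
fit_efron <- coxph(Surv(time, a) ~ b, data = d, method = "efron")

# Unconditional logistic regression, the glm route mentioned in the question.
fit_glm <- glm(a ~ b, family = binomial, data = d)

coef(fit_small)
coef(fit_efron)
coef(fit_glm)["b"]
```

Since a and b are simulated independently, all three estimates of the coefficient for b should be close to zero, with the subset fit being the noisiest.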
RICHARD M. HEIBERGER
2016-Jun-04 20:43 UTC
[R] Johannes Hengelbrock <j.hengelbrock@uke.de>
fortune candidate

> On Jun 4, 2016, at 07:28, Therneau, Terry M., Ph.D. <therneau at mayo.edu> wrote:
>
> Your example requires all choices of 1541 out of 3000, which I would expect
> to take somewhere near age-of-the-universe seconds to compute. The code uses
> a clever nested computation due to Gail et al which will cut that time down
> to infinity/10.