buhr at biostat.wisc.edu
2010-Feb-23 21:25 UTC
[Rd] survival package: Surv handles invalid intervals (start time > stop (PR#14221)
In the latest version of the survival package (2.35-8), the Surv function handles invalid intervals (where start time is greater than stop time) by issuing a warning that NAs have been created and then setting the left endpoint of the offending intervals to NA. However, the code that sets the endpoint to NA subsets incorrectly, and in some circumstances an arbitrary selection of intervals will have an endpoint set to NA.

For example, for the interval/event data:

    interval   event
    (NA, 10]   1
    (1, 5]     1
    (6, 4]     1

the appropriate Surv call **should** result in a warning and the left endpoint of the third, invalid interval being set to NA, as here:

    > Surv(c(NA,1,6),c(10,5,4),c(1,1,1))
    [1] (NA,10 ] ( 1, 5 ] (NA, 4 ]
    Warning message:
    In Surv(c(NA, 1, 6), c(10, 5, 4), c(1, 1, 1)) :
      Stop time must be > start time, NA created

However, the Surv call **actually** results in:

    > Surv(c(NA,1,6), c(10,5,4), c(1,1,1))
    [1] (NA,10 ] (NA, 5 ] ( 6, 4 ]
    Warning message:
    In Surv(c(NA, 1, 6), c(10, 5, 4), c(1, 1, 1)) :
      Stop time must be > start time, NA created

Note that the endpoint of the valid, second interval has been set to NA in place of the invalid, third interval.

A similar problem exists for type="interval2" data. The **expected** behavior is:

    > Surv(c(NA,1,6), c(10,5,4), type="interval2")
    [1] 10-      [ 1,  5] [NA,  4]
    Warning message:
    In Surv(c(NA, 1, 6), c(10, 5, 4), type = "interval2") :
      Invalid interval: start > stop, NA created

but the **actual** behavior is:

    > Surv(c(NA,1,6), c(10,5,4), type="interval2")
    [1] 10-      [NA,  5] [ 6,  4]
    Warning message:
    In Surv(c(NA, 1, 6), c(10, 5, 4), type = "interval2") :
      Invalid interval: start > stop, NA created

The attached patch fixes the problem.

-- 
Kevin Buhr <buhr at biostat.wisc.edu>       Phone: +1 608 265 4587
Assistant Scientist                         Fax:   +1 608 263 0415
Statistical Data Analysis Center
Room 211, WARF Office Building, 610 Walnut St., Madison, WI 53726-2397
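
For reference, the kind of incorrect subsetting described in this report can be reproduced in base R without the survival package. The sketch below is an illustration only and is not taken from the Surv source or from the attached patch. The variable names (start_time, stop_time, ok, bad_short, bad_full) are hypothetical, and the use of >= follows the wording of the warning ("Stop time must be > start time"). It assumes the invalid-interval test is computed on the non-missing rows only and then used to index the full-length vector, which is one way to get the misplaced NA shown in the report.

```r
# Illustration only: NOT the survival package source.
start_time <- c(NA, 1, 6)
stop_time  <- c(10, 5, 4)

# Rows where both endpoints are known.
ok <- !(is.na(start_time) | is.na(stop_time))     # FALSE TRUE TRUE

# Validity test computed on the complete rows only: length 2, not 3.
bad_short <- start_time[ok] >= stop_time[ok]      # FALSE TRUE

# Buggy assignment: the length-2 logical index is recycled over the
# length-3 vector, so it selects position 2 rather than position 3.
buggy <- start_time
buggy[bad_short] <- NA

# One possible repair: build a full-length logical index first.
bad_full <- ok & (start_time >= stop_time)        # FALSE FALSE TRUE
fixed <- start_time
fixed[bad_full] <- NA

print(buggy)   # second element is NA, third is still 6
print(fixed)   # third element is NA, second is still 1
```

With these inputs, the buggy assignment blanks the left endpoint of the valid second interval and leaves the invalid third interval untouched, which matches the actual output reported above. The full-length index blanks only the third interval.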