Guido van Steen
2009-Oct-29 17:21 UTC
[R] How to turn individual consecutive information into survival objects?
Dear R List,

I have a dataset with the following structure:

"""
personal_id, p_0, p_1, p_2, .... , p_36, p_37
1, NA, 1, 4, .... , 1, NA
2, NA, NA, NA, .... , 4, NA
.
.
.
6020, NA, 3, 3, ...., NA, NA
6021, NA, 2, 2, ...., 4, NA
"""

I used some made-up data. It is just meant to show the structure of the dataset.

The variables of interest are p_0, ... p_37. They represent the types of activity of the 6021 persons interviewed in 38 consecutive periods. The values for p_0 through p_37 are coded as follows:

1 = self-employed
2 = employed
3 = in training
4 = unemployed
NA = no information available

p_0 is the period before p_1. This period was just before the survey, so none of the individuals were interviewed in this period.

p_37 is the period after p_36. This period was just after the survey, so none of the individuals were interviewed in this period.

I would like to transform this dataset into information on the length of the first spell of unemployment. (If there are multiple spells I would just like to use the first one.)

My question is: how can I convert the records in this dataset into survival objects - "Surv()" - so that they can be used with a function like "coxph()"? Could anyone give me a pointer to a function that performs this specific conversion?

Thanks in advance!

Guido
Karl Ove Hufthammer
2009-Oct-29 17:36 UTC
[R] How to turn individual consecutive information into survival objects?
On Thu, 29 Oct 2009 10:21:23 -0700 (PDT) Guido van Steen <gvsteen at yahoo.com> wrote:

> I would like to transform this dataset into information on the
> length of the first spell of unemployment. (If there are multiple
> spells I would just like to use the first one.)

If 'dat' is your data frame (or matrix):

    l.sum=apply(dat,1,rle)
    f.length=function(x) x$lengths[match(4,x$values)]
    sapply(l.sum,f.length)

The result is a vector of the lengths of the first run of '4' for each row (i.e., the first element corresponds to the first row, the second element to the second row, and so on).

I very often find uses for the 'rle' function. It's fast, it's fun, and it's great that it's part of base R. :-)

--
Karl Ove Hufthammer
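The lengths returned by 'rle' still need an event indicator before they can go into "Surv()". The following is only a sketch of one way to do that, not a method given in this thread. It rests on several assumptions: the toy data frame 'dat' is made up; the helper name 'first_spell' is hypothetical; the 'personal_id' column is dropped before calling 'rle' (otherwise an ID equal to 4 would be mistaken for a spell); and a spell counts as ended (status 1) only when it is followed by an observed non-unemployed state, while a spell followed by NA or by the end of the record is treated as right-censored (status 0). Spells that begin in the first observed period have an unknown true start, which this sketch does not handle.

```r
# Made-up toy data in the layout described in the question (six periods only).
dat <- data.frame(
  personal_id = 1:4,
  p_1 = c(2, 4, 2, NA),
  p_2 = c(4, 4, 2, 2),
  p_3 = c(4, 2, 2, 4),
  p_4 = c(2, 2, 2, 4),
  p_5 = c(2, 4, 2, NA),
  p_6 = c(2, 4, 2, NA)
)

# Hypothetical helper: length and status of the first run of `state` in x.
# Assumption: status = 1 if the run is followed by an observed state,
# status = 0 (censored) if it is followed by NA or by the end of the record.
first_spell <- function(x, state = 4) {
  r <- rle(x)                      # rle() puts every NA in its own run
  i <- match(state, r$values)      # index of the first run of `state`
  if (is.na(i)) {
    return(c(time = NA_integer_, status = NA_integer_))
  }
  nxt <- if (i < length(r$values)) r$values[i + 1] else NA
  c(time = r$lengths[i], status = as.integer(!is.na(nxt)))
}

# Apply row-wise to the period columns only (ID column dropped).
spells <- t(apply(as.matrix(dat[, -1]), 1, first_spell))
res <- data.frame(personal_id = dat$personal_id, spells)

# Persons who were never observed unemployed have no spell; drop them.
res <- res[!is.na(res$time), ]

# Build the survival object if the 'survival' package is available.
if (requireNamespace("survival", quietly = TRUE)) {
  surv_obj <- survival::Surv(res$time, res$status)
  print(surv_obj)
}
```

With covariates merged onto 'res' by 'personal_id', the same 'Surv(time, status)' expression could serve as the left-hand side of a "coxph()" formula. Whether the censoring rule above is appropriate depends on how the survey treats missing periods.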