Frank S.
2016-Oct-03 17:17 UTC
[R] Looping through data tables (or data frames) by removing previous individuals
Dear R users,

With this mail I send the third and last question I wanted to ask these days. First of all, many thanks for the support received on my previous mails! My question is this: starting from a series of (for example) "k" different dates (all contained in vector "v"), I want to get a list of "k" data tables (or data frames) so that each contains those individuals who are at least 65 for the first time, looping over the dates of vector "v". Let's consider the following example with 5 individuals:

    library(data.table)

    dt <- data.table(
      id = 1:5,
      fborn = as.Date(c("1935-07-25", "1942-10-05", "1942-09-07", "1943-09-07", "1943-12-31")),
      sex = as.factor(rep(c(0, 1), c(2, 3)))
    )

    v <- seq(as.Date("2006-01-01"), as.Date("2009-01-01"), by = "year")  # k = 4

I would expect to obtain k = 4 data tables so that:

    dt_p1: contains id = 1 (he is for the first time at least 65 on date v[1])
    dt_p2: is NULL (no subject reaches 65 for the first time on date v[2])
    dt_p3: contains id = 2 & id = 3 (they are for the first time at least 65 on v[3])
    dt_p4: contains id = 4 & id = 5 (they are for the first time at least 65 on v[4])

I have tried:

    dt_p <- list()  # Empty list to allocate data tables

    for (i in 1:length(v)) {
      dt_p[[i]] <- dt[ !(id %in% dt_p[[1:(i-1)]]$id) &   # Remove subjects from previous dt_p's
                       round((v[i] - fborn)/365.25, 2) >= 65, ][ , list(id, fborn, sex)]

      dt.names <- paste0("dt_p", 1:length(v))
      assign(dt.names[i], dt_p[[i]])   # Assign a name to each data table
    }

However, I cannot correctly refer to the previous data tables, because for the first data table in the loop there are none. Consequently, I get an error message:

    # Error in dt_p[[1:(i - 1)]] : no such index at level 1

I would be very grateful for any suggestion!

Frank S.
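A note on the error message: with i = 1, the index 1:(i-1) is 1:0, that is c(1, 0), and "[[" applied to a list with an index vector longer than one indexes recursively instead of selecting several elements, which is where "no such index at level 1" comes from. Even for i > 1 the expression would not collect the ids from all earlier tables. One way to keep the loop is to carry a running vector of ids that have already been assigned. The following is only a minimal sketch under assumptions, not code from the thread: it uses a plain data.frame instead of a data.table so that it runs in base R, and the names "seen", "age" and "keep" are introduced here for illustration.

```r
# Example data from the question, as a plain data.frame (assumption: base R only)
dt <- data.frame(
  id    = 1:5,
  fborn = as.Date(c("1935-07-25", "1942-10-05", "1942-09-07",
                    "1943-09-07", "1943-12-31")),
  sex   = factor(rep(c(0, 1), c(2, 3)))
)
v <- seq(as.Date("2006-01-01"), as.Date("2009-01-01"), by = "year")  # k = 4

dt_p <- vector("list", length(v))  # one slot per reference date
seen <- integer(0)                 # hypothetical helper: ids already assigned

for (i in seq_along(v)) {
  # Approximate age in years, computed the same way as in the question
  age  <- as.numeric(v[i] - dt$fborn) / 365.25
  keep <- !(dt$id %in% seen) & round(age, 2) >= 65
  dt_p[[i]] <- dt[keep, ]
  seen <- c(seen, dt_p[[i]]$id)    # remember them for the later dates
}
names(dt_p) <- paste0("dt_p", seq_along(v))
```

Keeping the results in a named list, rather than calling assign() for each table, means that dates with nobody newly aged 65 simply hold a zero-row data frame, and the tables can be reached as dt_p$dt_p1, dt_p$dt_p2 and so on.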
Ista Zahn
2016-Oct-03 19:34 UTC
[R] Looping through data tables (or data frames) by removing previous individuals
Hi Frank,

How about

    library(lubridate)

    # One row per (id, reference date) pair
    dtf <- merge(dt, expand.grid(id = dt$id, refdate = v), by = "id")
    # Flag the pairs where the person is older than 65 on the reference date
    dtf[, gt65 := as.period(interval(fborn, refdate), unit = "years") > years(65)]
    # For each id, keep only the earliest flagged reference date
    dtf <- dtf[gt65 == TRUE, ][, .SD[refdate == min(refdate)], by = id]

Best,
Ista
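The same long-format idea (pair every id with every reference date, keep the qualifying pairs, then take the earliest date per id) can also be sketched in base R. This is not Ista's code: it swaps the lubridate period comparison for an explicit 65th-birthday date built with POSIXlt arithmetic, so a person counts from the birthday itself (>= rather than >), and the names "bday65", "long", "first65" and "by_date" are made up for this illustration.

```r
# Example data from the question, as a plain data.frame
dt <- data.frame(
  id    = 1:5,
  fborn = as.Date(c("1935-07-25", "1942-10-05", "1942-09-07",
                    "1943-09-07", "1943-12-31")),
  sex   = factor(rep(c(0, 1), c(2, 3)))
)
v <- seq(as.Date("2006-01-01"), as.Date("2009-01-01"), by = "year")

# 65th birthday of each person (POSIXlt stores the year as years since 1900)
b65 <- as.POSIXlt(dt$fborn)
b65$year <- b65$year + 65
dt$bday65 <- as.Date(b65)

# Long format: one row per (id, reference date) pair
long <- merge(dt, expand.grid(id = dt$id, refdate = v), by = "id")

# Keep pairs where the person is at least 65, then the earliest date per id
long    <- long[long$refdate >= long$bday65, ]
long    <- long[order(long$id, long$refdate), ]
first65 <- long[!duplicated(long$id), c("id", "fborn", "sex", "refdate")]

# One table per reference date; dates with no new 65-year-olds stay as empty tables
by_date <- split(first65,
                 factor(as.character(first65$refdate), levels = as.character(v)))
```

Giving the factor all of the reference dates as levels is what keeps an empty entry for a date such as v[2], instead of dropping it from the result.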
Charles C. Berry
2016-Oct-03 19:38 UTC
[R] Looping through data tables (or data frames) by removing previous individuals
On Mon, 3 Oct 2016, Frank S. wrote:

> [deleted]
> I want to get a list of "k" data tables (or data frames) so that each
> contains those individuals who for the first time are at least 65,
> looping on each of the dates of vector "v".
> [deleted]

Here is a start (using a data.frame for dt):

    > vp <- as.POSIXlt( c( as.Date("1000-01-01"), v ))
    > vp$year <- vp$year-65
    > dt.cut <- as.numeric(cut(as.POSIXlt(dt$fborn),vp))
    > split(dt,factor(dt.cut, 1:length(v)))
    $`1`
      id      fborn sex
    1  1 1935-07-25   0

    $`2`
    [1] id    fborn sex
    <0 rows> (or 0-length row.names)

    $`3`
      id      fborn sex
    2  2 1942-10-05   0
    3  3 1942-09-07   1

    $`4`
      id      fborn sex
    4  4 1943-09-07   1
    5  5 1943-12-31   1

See

    ?as.POSIXlt
    ?cut.POSIXt
    ?split

HTH,

Chuck
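The approach above works by shifting each reference date back 65 years, which turns "at least 65 on v[i]" into "born on or before a cutoff date", and then binning the birth dates between consecutive cutoffs. As far as I can tell, cut() on date-times uses left-closed intervals by default (right = FALSE), so someone born exactly on a cutoff would land in the later bin. The variant below is a sketch of the same binning idea using findInterval() with left.open = TRUE, so that a person who turns 65 exactly on v[i] is counted on v[i]. It is an assumption-laden illustration rather than Chuck's code, and the names "cutoffs", "bin" and "res" are introduced here.

```r
# Example data from the question, as a plain data.frame
dt <- data.frame(
  id    = 1:5,
  fborn = as.Date(c("1935-07-25", "1942-10-05", "1942-09-07",
                    "1943-09-07", "1943-12-31")),
  sex   = factor(rep(c(0, 1), c(2, 3)))
)
v <- seq(as.Date("2006-01-01"), as.Date("2009-01-01"), by = "year")

# Birth-date cutoffs: born on or before cutoffs[i] means at least 65 on v[i]
lt <- as.POSIXlt(v)
lt$year <- lt$year - 65
cutoffs <- as.Date(lt)

# findInterval() with left.open = TRUE counts the cutoffs strictly below each
# birth date, so bin i holds birth dates in (cutoffs[i - 1], cutoffs[i]]
bin <- findInterval(as.numeric(dt$fborn), as.numeric(cutoffs),
                    left.open = TRUE) + 1L

# Anyone born after the last cutoff gets bin k + 1, which is not a factor
# level, becomes NA and is left out by split()
res <- split(dt, factor(bin, levels = seq_along(v)))
```

With the example data this should reproduce the four tables shown in the output above, including the empty second one.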
Frank S.
2016-Oct-04 10:31 UTC
[R] Looping through data tables (or data frames) by removing previous individuals
Thank you very much Ista and Charles!

Best,

Frank S.