thr3ads.net - R help - [R] Looping through data tables (or data frames) by removing previous individuals [Oct 2016]


Frank S.

2016-Oct-03 17:17 UTC

[R] Looping through data tables (or data frames) by removing previous individuals

Dear R users,

This is the third and last question I wanted to ask these days. First of all,
many thanks for the support I received on my previous mails! My question is
this: starting from a series of (for example) "k" different dates (all
contained in vector "v"), I want to get a list of "k" data tables (or data
frames) so that each contains those individuals who are at least 65 for the
first time, looping over each of the dates in vector "v". Let's consider the
following example with 5 individuals:


library(data.table)

dt <- data.table(
   id = 1:5,
   fborn = as.Date(c("1935-07-25", "1942-10-05", "1942-09-07",
                     "1943-09-07", "1943-12-31")),
   sex = as.factor(rep(c(0, 1), c(2, 3)))
   )

v <- seq(as.Date("2006-01-01"), as.Date("2009-01-01"), by = "year") # k = 4


I would expect to obtain k=4 data tables so that:
dt_p1: contains id = 1 (he is at least 65 for the first time on date v[1])
dt_p2: is NULL (no subject reaches 65 for the first time on date v[2])
dt_p3: contains id = 2 & id = 3 (they are at least 65 for the first time on v[3])
dt_p4: contains id = 4 & id = 5 (they are at least 65 for the first time on v[4])


I have tried:

dt_p <- list( )                        # Empty list to allocate data tables

for (i in 1:length(v)) {
  # Remove subjects from previous dt_p's
  dt_p[[i]] <- dt[ !(id %in% dt_p[[1:(i-1)]]$id) &
         round((v[i] - fborn)/365.25, 2) >= 65, ][ , list(id, fborn, sex)]

  dt.names <- paste0("dt_p", 1:length(v))
  assign(dt.names[i], dt_p[[i]])         # Assign a name to each data table
}

However, I cannot correctly refer to the previous data tables, because for the
first data table in the loop there are no previous ones. Consequently, I get
an error message:

# Error in dt_p[[1:(i - 1)]] : no such index at level 1


I would be very grateful for any suggestion!

Frank S.
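
A note on the error above: when i is 1, 1:(i-1) evaluates to c(1, 0), and `[[`
given a vector of length greater than one indexes recursively instead of
selecting several list elements, hence the message. One possible fix is to keep
a running vector of the ids already assigned. The sketch below uses a base-R
data.frame rather than a data.table (an assumption made to avoid extra
packages; the logic should carry over), and the name `seen` is illustrative,
not from the thread.

```r
# Example data from the question, as a plain data.frame (assumption).
dt <- data.frame(
  id    = 1:5,
  fborn = as.Date(c("1935-07-25", "1942-10-05", "1942-09-07",
                    "1943-09-07", "1943-12-31")),
  sex   = factor(rep(c(0, 1), c(2, 3)))
)
v <- seq(as.Date("2006-01-01"), as.Date("2009-01-01"), by = "year")

dt_p <- vector("list", length(v))            # one slot per reference date
names(dt_p) <- paste0("dt_p", seq_along(v))
seen <- integer(0)                           # ids placed in an earlier table

for (i in seq_along(v)) {
  # Approximate age in years, computed as in the question.
  age  <- as.numeric(v[i] - dt$fborn) / 365.25
  keep <- !(dt$id %in% seen) & round(age, 2) >= 65
  dt_p[[i]] <- dt[keep, ]                    # zero rows if nobody qualifies
  seen <- c(seen, dt$id[keep])
}
```

Keeping the results in a named list, rather than creating separate objects with
assign(), makes them reachable as dt_p$dt_p1, dt_p$dt_p2, and so on. Dates with
no newcomers yield a zero-row data frame instead of NULL.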


Ista Zahn

2016-Oct-03 19:34 UTC


Hi Frank,

How about

library(lubridate)
dtf <- merge(dt, expand.grid(id = dt$id, refdate = v), by = "id")
dtf[, gt65 := as.period(interval(fborn, refdate), unit = "years") > years(65)]
dtf <- dtf[gt65 == TRUE,][, .SD[refdate == min(refdate)], by = id]

Best,
Ista
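
The code above returns one long table with a refdate column, not the list of
"k" tables asked for. A minimal sketch of the same cross-join idea, followed by
a split per reference date, is given below. It is written in base R with a
data.frame instead of data.table and lubridate (an assumption made to keep it
dependency-free), it defines "at least 65" by the calendar date of the 65th
birthday (also an assumption), and the names `pairs`, `b65`, `first_date` and
`by_date` are illustrative.

```r
dt <- data.frame(
  id    = 1:5,
  fborn = as.Date(c("1935-07-25", "1942-10-05", "1942-09-07",
                    "1943-09-07", "1943-12-31")),
  sex   = factor(rep(c(0, 1), c(2, 3)))
)
v <- seq(as.Date("2006-01-01"), as.Date("2009-01-01"), by = "year")

# Every (individual, reference date) pair.
pairs <- data.frame(
  id      = rep(dt$id, times = length(v)),
  refdate = rep(v, each = nrow(dt))
)
dtf <- merge(dt, pairs, by = "id")

# Calendar date of the 65th birthday; keep pairs on or after it.
b65 <- as.POSIXlt(dtf$fborn)
b65$year <- b65$year + 65
dtf <- dtf[dtf$refdate >= as.Date(b65), ]

# For each id, keep only the earliest qualifying reference date.
first_date <- ave(as.numeric(dtf$refdate), dtf$id, FUN = min)
dtf <- dtf[as.numeric(dtf$refdate) == first_date, ]

# One table per reference date; dates without newcomers give zero-row tables.
by_date <- split(dtf[, c("id", "fborn", "sex")],
                 factor(as.character(dtf$refdate), levels = as.character(v)))
```

Passing every date in v as a factor level is what keeps the empty group for
v[2] in the result.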


Charles C. Berry

2016-Oct-03 19:38 UTC


Here is a start (using a data.frame for dt):
> vp <- as.POSIXlt( c( as.Date("1000-01-01"), v ))
> vp$year <- vp$year-65
> dt.cut <- as.numeric(cut(as.POSIXlt(dt$fborn),vp))
> split(dt,factor(dt.cut, 1:length(v)))
$`1`
  id      fborn sex
1  1 1935-07-25   0

$`2`
[1] id    fborn sex
<0 rows> (or 0-length row.names)

$`3`
  id      fborn sex
2  2 1942-10-05   0
3  3 1942-09-07   1

$`4`
  id      fborn sex
4  4 1943-09-07   1
5  5 1943-12-31   1


See
   ?as.POSIXlt
   ?cut.POSIXt
   ?split

HTH,

Chuck
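
A possible variation on the approach above shifts the birth dates forward by
65 years instead of shifting the reference dates back, and then looks up the
first reference date on or after each 65th birthday with findInterval(). This
is only a sketch: it assumes an R version in which findInterval() accepts the
left.open argument, it treats a birthday falling exactly on a reference date
as reaching 65 on that date, and the names `b65` and `first_idx` are
illustrative.

```r
dt <- data.frame(
  id    = 1:5,
  fborn = as.Date(c("1935-07-25", "1942-10-05", "1942-09-07",
                    "1943-09-07", "1943-12-31")),
  sex   = factor(rep(c(0, 1), c(2, 3)))
)
v <- seq(as.Date("2006-01-01"), as.Date("2009-01-01"), by = "year")

# Calendar date of each 65th birthday.
b65 <- as.POSIXlt(dt$fborn)
b65$year <- b65$year + 65
b65 <- as.Date(b65)

# Index of the first reference date on or after the 65th birthday.
# With left.open = TRUE the intervals are (v[k-1], v[k]], so a birthday
# equal to v[k] is assigned to v[k]; birthdays on or before v[1] give 0 + 1.
first_idx <- findInterval(as.numeric(b65), as.numeric(v), left.open = TRUE) + 1L

# Birthdays after the last date get index length(v) + 1, which is not a
# factor level, so those rows become NA and are dropped by split().
dt_p <- split(dt, factor(first_idx, levels = seq_along(v)))
names(dt_p) <- paste0("dt_p", seq_along(v))
```

As with the split() call above, dates on which nobody newly qualifies come back
as zero-row data frames.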

Frank S.

2016-Oct-04 10:31 UTC


Thank you very much Ista and Zahn!


Best,


Frank S.

