Ista:

Aha -- now I see the point. My bad. You are right. I was careless.

However, cut() with ifelse() might simplify the code a bit and/or make
it more readable. To be clear, this is just a matter of taste; e.g.
using your data and a data frame instead of a data table:

> DT <- within(DT,
     exposure <- {
        f <- cut(fini, as.Date(c("2000-01-01","2006-01-01","2006-06-30","2006-12-21")),
                 labels = letters[1:3])
        ifelse(f == "a", 1,
               ifelse(f == "c", .5,
                      difftime(as.Date("2007-01-01"), fini, units="days")/365.25))
     }
  )

> DT
  id       fini group  exposure f
1  2 2005-04-20     A 1.0000000 a
2  2 2005-04-20     A 1.0000000 a
3  2 2005-04-20     A 1.0000000 a
4  5 2006-02-19     B 0.8651608 b
5  5 2006-06-29     B 0.5092402 b
6  7 2006-10-08     A 0.5000000 c
7  7 2006-10-08     A 0.5000000 c

Bert Gunter

"The trouble with having an open mind is that people keep coming along
and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )

On Mon, Sep 26, 2016 at 12:07 PM, Ista Zahn <istazahn at gmail.com> wrote:
> On Mon, Sep 26, 2016 at 2:48 PM, Bert Gunter <bgunter.4567 at gmail.com> wrote:
>> I thought that that was a typo from the OP, as it disagrees with his
>> example. But the labels are arbitrary, so in fact cut() will do it
>> whichever way he meant.
>
> I don't see how cut will do it, at least not conveniently. Consider
> this slightly altered example:
>
> library(data.table)
> DT <- data.table(
>   id = rep(c(2, 5, 7), c(3, 2, 2)),
>   fini = rep(as.Date(c('2005-04-20',
>                        '2006-02-19',
>                        '2006-06-29',
>                        '2006-10-08')),
>              c(3, 1, 1, 2)),
>   group = rep(c("A", "B", "A"), c(3, 2, 2)) )
>
> DT[, exposure := vector(mode = "numeric", length = .N)]
> DT[fini < as.Date("2006-01-01"), exposure := 1]
> DT[fini >= as.Date("2006-01-01") & fini <= as.Date("2006-06-30"),
>    exposure := difftime(as.Date("2007-01-01"), fini, units="days")/365.25]
> DT[fini >= as.Date("2006-07-01"), exposure := 0.5]
>
> DT
>
> ##    id       fini group  exposure
> ## 1:  2 2005-04-20     A 1.0000000
> ## 2:  2 2005-04-20     A 1.0000000
> ## 3:  2 2005-04-20     A 1.0000000
> ## 4:  5 2006-02-19     B 0.8651608
> ## 5:  5 2006-06-29     B 0.5092402
> ## 6:  7 2006-10-08     A 0.5000000
> ## 7:  7 2006-10-08     A 0.5000000
>
> Best,
> Ista
>
>> -- Bert
>>
>> On Mon, Sep 26, 2016 at 11:37 AM, Ista Zahn <istazahn at gmail.com> wrote:
>>> On Mon, Sep 26, 2016 at 1:59 PM, Bert Gunter <bgunter.4567 at gmail.com> wrote:
>>>> This seems like a job for cut() .
>>>
>>> I thought that at first too, but the middle group shouldn't be .87 but rather
>>>
>>> "exposure" = "2007-01-01" - "fini"
>>>
>>> so, I think cut alone won't do it.
>>>
>>> Best,
>>> Ista
>>>>
>>>> (I made DT a data frame to avoid loading the data table package. But I
>>>> assume it would work with a data table too. Check this, though!)
>>>>
>>>> > DT <- within(DT, exposure <- cut(fini,
>>>>      as.Date(c("2000-01-01","2006-01-01","2006-06-30","2006-12-21")),
>>>>      labels = c(1,.87,.5)))
>>>>
>>>> > DT
>>>>   id       fini group exposure
>>>> 1  2 2005-04-20     A        1
>>>> 2  2 2005-04-20     A        1
>>>> 3  2 2005-04-20     A        1
>>>> 4  5 2006-02-19     B     0.87
>>>> 5  5 2006-02-19     B     0.87
>>>> 6  7 2006-10-08     A      0.5
>>>> 7  7 2006-10-08     A      0.5
>>>>
>>>> (but note that exposure is a factor, not numeric)
>>>>
>>>> Cheers,
>>>> Bert
>>>>
>>>> On Mon, Sep 26, 2016 at 10:05 AM, Ista Zahn <istazahn at gmail.com> wrote:
>>>>> Hi Frank,
>>>>>
>>>>> lapply(DT) iterates over each column. That doesn't seem to be what you want.
>>>>>
>>>>> There are probably better ways, but here is one approach.
>>>>>
>>>>> DT[, exposure := vector(mode = "numeric", length = .N)]
>>>>> DT[fini < as.Date("2006-01-01"), exposure := 1]
>>>>> DT[fini >= as.Date("2006-01-01") & fini <= as.Date("2006-06-30"),
>>>>>    exposure := difftime(as.Date("2007-01-01"), fini, units="days")/365.25]
>>>>> DT[fini >= as.Date("2006-07-01"), exposure := 0.5]
>>>>>
>>>>> Best,
>>>>> Ista
>>>>>
>>>>> On Mon, Sep 26, 2016 at 11:28 AM, Frank S. <f_j_rod at hotmail.com> wrote:
>>>>>> Dear all,
>>>>>>
>>>>>> I have an R data table like this:
>>>>>>
>>>>>> DT <- data.table(
>>>>>>   id = rep(c(2, 5, 7), c(3, 2, 2)),
>>>>>>   fini = rep(as.Date(c('2005-04-20', '2006-02-19', '2006-10-08')), c(3, 2, 2)),
>>>>>>   group = rep(c("A", "B", "A"), c(3, 2, 2)) )
>>>>>>
>>>>>> I want to construct a new variable "exposure" defined as follows:
>>>>>>
>>>>>> 1) If "fini" earlier than 2006-01-01 --> "exposure" = 1
>>>>>> 2) If "fini" in [2006-01-01, 2006-06-30] --> "exposure" = "2007-01-01" - "fini"
>>>>>> 3) If "fini" in [2006-07-01, 2006-12-31] --> "exposure" = 0.5
>>>>>>
>>>>>> So the desired output would be the following data table:
>>>>>>
>>>>>>    id       fini exposure group
>>>>>> 1:  2 2005-04-20     1.00     A
>>>>>> 2:  2 2005-04-20     1.00     A
>>>>>> 3:  2 2005-04-20     1.00     A
>>>>>> 4:  5 2006-02-19     0.87     B
>>>>>> 5:  5 2006-02-19     0.87     B
>>>>>> 6:  7 2006-10-08     0.50     A
>>>>>> 7:  7 2006-10-08     0.50     A
>>>>>>
>>>>>> I have tried:
>>>>>>
>>>>>> DT <- DT[ , list(id, fini, exposure = 0, group)]
>>>>>> DT.new <- lapply(DT, function(exposure){
>>>>>>   exposure[fini < as.Date("2006-01-01")] <- 1 # 1st case
>>>>>>   exposure[fini >= as.Date("2006-01-01") & fini <= as.Date("2006-06-30")] <- difftime(as.Date("2007-01-01"), fini, units="days")/365.25 # 2nd case
>>>>>>   exposure[fini >= as.Date("2006-07-01") & fini <= as.Date("2006-12-31")] <- 0.5 # 3rd case
>>>>>>   exposure # return value
>>>>>> })
>>>>>>
>>>>>> But I get an error message.
>>>>>>
>>>>>> Thanks for any help!!
>>>>>>
>>>>>> Frank S.
>>>>>>
>>>>>> ______________________________________________
>>>>>> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
>>>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>>>>> and provide commented, minimal, self-contained, reproducible code.
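
A note on the cut() breaks used in the reply above. For dates, cut() builds intervals that are closed on the left and open on the right by default (right = FALSE), so with a break at 2006-06-30 that date itself falls into the third interval, and any date after 2006-12-21 gets NA. The following is only a sketch, assuming the intervals in Frank's specification are the intended ones: it moves the last two breaks to 2006-07-01 and 2007-01-01. The data frame DF and the names f_old, f_new and years_left are hypothetical, introduced here to show the boundary cases; the lower break 2000-01-01 is taken from the message and is arbitrary.

```r
# Hypothetical data: one row per interesting boundary date
DF <- data.frame(
  id   = 1:5,
  fini = as.Date(c("2005-04-20", "2006-02-19", "2006-06-30",
                   "2006-07-01", "2006-12-31"))
)

# Breaks as posted: intervals are [start, end), so 2006-06-30 is
# labelled "c" and 2006-12-31 lies past the last break (NA)
f_old <- cut(DF$fini,
             as.Date(c("2000-01-01", "2006-01-01", "2006-06-30", "2006-12-21")),
             labels = letters[1:3])

# Assumed correction: breaks that reproduce [2006-01-01, 2006-06-30]
# and [2006-07-01, 2006-12-31] under the same left-closed convention
f_new <- cut(DF$fini,
             as.Date(c("2000-01-01", "2006-01-01", "2006-07-01", "2007-01-01")),
             labels = letters[1:3])

# Years remaining until 2007-01-01, as a plain numeric vector
years_left <- as.numeric(difftime(as.Date("2007-01-01"), DF$fini,
                                  units = "days")) / 365.25

DF$exposure <- ifelse(f_new == "a", 1,
                      ifelse(f_new == "c", 0.5, years_left))
```

With the data in the thread both sets of breaks give the same answer, because no fini value sits on a boundary; the difference only shows up for dates such as 2006-06-30 or late December 2006.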
... and just for fun, here's an alternative in which mapply() is used
to vectorize switch(); again, whether you like it may be just a matter
of taste, although I suspect it might be less efficient than ifelse(),
which is already vectorized:
DT <- within(DT,
   exposure <- {
      mapply(function(x, fac) switch(as.character(fac),
                a = 1,
                b = difftime(as.Date("2007-01-01"), x,
                             units = "days")/365.25,
                c = .5
             ),
             x = fini,
             fac = cut(fini, as.Date(c("2000-01-01","2006-01-01","2006-06-30","2006-12-21")),
                       labels = letters[1:3])
      )}
)
> DT
  id       fini group  exposure
1  2 2005-04-20     A 1.0000000
2  2 2005-04-20     A 1.0000000
3  2 2005-04-20     A 1.0000000
4  5 2006-02-19     B 0.8651608
5  5 2006-06-29     B 0.5092402
6  7 2006-10-08     A 0.5000000
7  7 2006-10-08     A 0.5000000
Cheers,
Bert
Bert Gunter
"The trouble with having an open mind is that people keep coming along
and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
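
To check that the mapply()/switch() version and the nested ifelse() version agree, the two can be run side by side outside within(). This is a minimal base-R sketch, not code from the thread: DF is a plain data frame copy of Ista's altered example, and years_left(), via_switch and via_ifelse are hypothetical names introduced for illustration. The breaks are the ones posted in the message.

```r
# Data frame copy of the altered example from the thread
DF <- data.frame(
  id    = rep(c(2, 5, 7), c(3, 2, 2)),
  fini  = rep(as.Date(c("2005-04-20", "2006-02-19",
                        "2006-06-29", "2006-10-08")),
              c(3, 1, 1, 2)),
  group = rep(c("A", "B", "A"), c(3, 2, 2))
)

# Interval labels, using the breaks as posted
fac <- cut(DF$fini,
           as.Date(c("2000-01-01", "2006-01-01", "2006-06-30", "2006-12-21")),
           labels = letters[1:3])

# Hypothetical helper: years from x to 2007-01-01 as a plain number
years_left <- function(x) {
  as.numeric(difftime(as.Date("2007-01-01"), x, units = "days")) / 365.25
}

# Element-wise: switch() is called once per row
via_switch <- mapply(function(x, fac) switch(as.character(fac),
                                             a = 1,
                                             b = years_left(x),
                                             c = 0.5),
                     x = DF$fini, fac = fac)

# Vectorized: one pass over whole vectors
via_ifelse <- ifelse(fac == "a", 1,
                     ifelse(fac == "c", 0.5, years_left(DF$fini)))

all.equal(via_switch, via_ifelse)
```

The mapply() form calls the anonymous function once per row, which is the reason to expect it to be slower on large tables; no timings were measured here.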
Many thanks Ista and Bert for your nice solutions!
As Ista commented in a previous mail, the 0.87 value in my example is not fixed;
for each subject it depends on the difference "2007-01-01 - fini". However, both of
your solutions take this fact into account.
Frank S.
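
The arithmetic behind that remark can be checked directly. A small sketch in base R, using the two second-interval dates from Ista's altered example; the names days_left and exposure are used here only for illustration:

```r
# The two dates that fall in [2006-01-01, 2006-06-30] in the altered example
fini <- as.Date(c("2006-02-19", "2006-06-29"))

# Subtracting Date objects gives a difference in days
days_left <- as.numeric(as.Date("2007-01-01") - fini)   # 316 186

# Convert days to years with the 365.25 divisor used in the thread
exposure <- days_left / 365.25
round(exposure, 7)                                      # 0.8651608 0.5092402
```

So 0.87 is just the rounded value for 2006-02-19, and any other date in that interval gets its own value.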
________________________________
From: Bert Gunter <bgunter.4567 at gmail.com>
Sent: Monday, 26 September 2016 23:18:52
To: Ista Zahn
Cc: Frank S.; r-help at r-project.org
Subject: Re: [R] Using lapply in R data table