Thanks Therneau, the duplicated() function works well. --- Kai
On Friday, September 10, 2021, 05:13:47 AM PDT, Therneau, Terry M., Ph.D.
<therneau at mayo.edu> wrote:
I prefer the duplicated() function, since the final code will be clear to a
future reader.
(Particularly when I am that future reader).
# point to the last row for each subject
last <- !duplicated(mydata$ID, fromLast=TRUE)
mydata$date3[last] <- NA
Terry T.
(I read the list once a day in digest form, so am always a late reply.)
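
For anyone who wants to try this on the sample data from the question, here is a minimal sketch. The data frame name mydata follows the reply above; storing the columns as Date values is an assumption, since the original post does not say how they are stored.

```r
# Sample data from the question, already sorted by ID and date1.
# Assumption: the date columns are stored as Date objects.
mydata <- data.frame(
  ID    = c(1, 2, 3, 3, 4, 5),
  date1 = as.Date(c("2015-10-08", "2016-01-16", "2016-08-01",
                    "2017-01-10", "2016-01-19", "2016-03-01")),
  date2 = as.Date(c("2015-12-17", NA, NA, NA,
                    "2016-02-24", "2016-03-10")),
  date3 = as.Date(c("2015-07-23", "2015-10-08", "2017-01-10",
                    "2016-01-16", "2016-08-01", "2016-01-19"))
)

# duplicated(..., fromLast = TRUE) scans from the bottom up, so it flags
# every row of an ID except the last one; negating it marks the last row.
last <- !duplicated(mydata$ID, fromLast = TRUE)

# Set date3 to missing on the last row of each ID.
mydata$date3[last] <- NA

print(mydata)
```

With this data, only the first ID 3 row (date1 = 2016-08-01) keeps its date3 value. Note that this relies on the rows already being sorted by ID and date1, as described in the question.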
On 9/10/21 5:00 AM, r-help-request at r-project.org wrote:
> Hello List,
> Please look at the sample data frame below:
>
> ID   date1        date2        date3
> 1    2015-10-08   2015-12-17   2015-07-23
> 2    2016-01-16   NA           2015-10-08
> 3    2016-08-01   NA           2017-01-10
> 3    2017-01-10   NA           2016-01-16
> 4    2016-01-19   2016-02-24   2016-08-01
> 5    2016-03-01   2016-03-10   2016-01-19
>
> This data frame was sorted by ID and date1. I need to set the column date3
> as missing for the "last" record for each ID. In the sample data set,
> IDs 1, 2, 4 and 5 have one row only, so each can be considered both the
> first and the last record, and date3 can be set as missing. But ID 3 has
> 2 rows. Since I sorted the data by ID and date1, the row with ID=3 and
> date1=2017-01-10 is the last record. For ID 3, I need to set date3=NA for
> this row only.
>
> The question is: how can I identify the "last" record for each ID and set
> it to NA in the date3 column?
> Thank you,
> Kai