Thank You Profs. Dalgaard and Barradas for the code, both codes worked perfectly for the data and I am going to use it in my big data set. Thanks once again. On Thu, Feb 6, 2020 at 2:02 AM peter dalgaard <pdalgd at gmail.com> wrote:> There is also > > > with(daT, tapply(as.character(ObsDate), list(id, ObsSite), > function(x)format(list(x)))) > site1 site4 site5 site7 > id1 NA NA NA "06/13/13" > id2 NA NA NA "07/03/14, 05/17/14" > id4 NA "05/08/14" NA NA > id5 NA NA "06/13/14" NA > id6 "05/30/14, 06/28/13" NA NA NA > id7 NA NA "06/25/13" NA > > ...with the added bonus that if you leave out the format() business, you > get a data structure that doesn't print as nicely, but can be used for > further computations: > > > with(daT, tapply(as.character(ObsDate), list(id, ObsSite), list, > simplify=FALSE)) > site1 site4 site5 site7 > id1 NULL NULL NULL List,1 > id2 NULL NULL NULL List,1 > id4 NULL List,1 NULL NULL > id5 NULL NULL List,1 NULL > id6 List,1 NULL NULL NULL > id7 NULL NULL List,1 NULL > > M <- with(daT, tapply(as.character(ObsDate), list(id, ObsSite), list, > simplify=FALSE)) > > M[["id2", "site7"]] > [[1]] > [1] "07/03/14" "05/17/14" > > -pd > > > On 6 Feb 2020, at 01:37 , Marna Wagley <marna.wagley at gmail.com> wrote: > > > > daT<-structure(list(id = structure(c(1L, 2L, 2L, 3L, 4L, 5L, 5L, 6L > > ), .Label = c("id1", "id2", "id4", "id5", "id6", "id7"), class > "factor"), > > ObsSite = structure(c(4L, 4L, 4L, 2L, 3L, 1L, 1L, 3L), .Label > > c("site1", > > "site4", "site5", "site7"), class = "factor"), ObsDate > > structure(c(4L, > > 8L, 2L, 1L, 5L, 3L, 7L, 6L), .Label = c("05/08/14", "05/17/14", > > "05/30/14", "06/13/13", "06/13/14", "06/25/13", "06/28/13", > > "07/03/14"), class = "factor")), .Names = c("id", "ObsSite", > > "ObsDate"), class = "data.frame", row.names = c(NA, -8L)) > > daT > > daT$date <- mdy(daT$ObsDate) > > -- > Peter Dalgaard, Professor, > Center for Statistics, Copenhagen Business School > Solbjerg Plads 3, 2000 Frederiksberg, Denmark > Phone: (+45)38153501 > Office: A 4.23 > Email: pd.mes at cbs.dk Priv: PDalgd at gmail.com > > > > > > > > > >[[alternative HTML version deleted]]
Dear Peter and Rui, There are many dates (value) in some of the cells, If we want to chose only one date (either oldest or newest) from that cell, how can we make a table with that condition? For example, Using the following code; M <- with(daT, tapply(as.character(ObsDate), list(id, ObsSite), list, simplify=F)) data.frame(M) site1 site4 site5 site7 id1 NULL NULL NULL 6/13/13 id2 NULL NULL NULL 07/03/14, 05/17/14 id4 NULL 5/8/14 NULL NULL id5 NULL NULL 6/13/14 NULL id6 05/30/14, 06/28/13 NULL NULL NULL id7 NULL NULL 6/25/13 NULL If we want to take only one value (oldest date) if two or more dates/values (classes) are found in the certain cells. For example would like to get the following table (given below). site1 site4 site5 site7 id1 NULL NULL NULL 6/13/13 id2 NULL NULL NULL 07/03/14 id4 NULL 5/8/14 NULL NULL id5 NULL NULL 6/13/14 NULL id6 6/28/13 NULL NULL NULL id7 NULL NULL 6/25/13 NULL here is the code and an example data daT<-structure(list(id = structure(c(1L, 2L, 2L, 3L, 4L, 5L, 5L, 6L ), .Label = c("id1", "id2", "id4", "id5", "id6", "id7"), class = "factor"), ObsSite = structure(c(4L, 4L, 4L, 2L, 3L, 1L, 1L, 3L), .Label c("site1", "site4", "site5", "site7"), class = "factor"), ObsDate structure(c(4L, 8L, 2L, 1L, 5L, 3L, 7L, 6L), .Label = c("05/08/14", "05/17/14", "05/30/14", "06/13/13", "06/13/14", "06/25/13", "06/28/13", "07/03/14"), class = "factor")), .Names = c("id", "ObsSite", "ObsDate"), class = "data.frame", row.names = c(NA, -8L)) daT daT$date <- mdy(daT$ObsDate) tmp <- split(daT, daT$id) head(tmp) M <- with(daT, tapply(as.character(ObsDate), list(id, ObsSite), list, simplify=F)) data.frame(M) M[["id2", "site7"]] # but I wanted to get the following table daT2<-structure(list(X = structure(1:6, .Label = c("id1", "id2", "id4", "id5", "id6", "id7"), class = "factor"), site1 = structure(c(2L, 2L, 2L, 2L, 1L, 2L), .Label = c("6/28/13", "NULL"), class = "factor"), site4 = structure(c(2L, 2L, 1L, 2L, 2L, 2L), .Label = c("5/8/14", "NULL"), class = "factor"), site5 = structure(c(3L, 3L, 3L, 1L, 3L, 2L), .Label = c("6/13/14", "6/25/13", "NULL"), class "factor"), site7 = structure(c(2L, 1L, 3L, 3L, 3L, 3L), .Label = c("5/17/14", "6/13/13", "NULL"), class = "factor")), .Names = c("X", "site1", "site4", "site5", "site7"), class = "data.frame", row.names = c(NA, -6L)) daT2 Thank you === On Thu, Feb 6, 2020 at 8:48 AM Marna Wagley <marna.wagley at gmail.com> wrote:> Thank You Profs. Dalgaard and Barradas for the code, both codes worked > perfectly for the data and I am going to use it in my big data set. > Thanks once again. > > > On Thu, Feb 6, 2020 at 2:02 AM peter dalgaard <pdalgd at gmail.com> wrote: > >> There is also >> >> > with(daT, tapply(as.character(ObsDate), list(id, ObsSite), >> function(x)format(list(x)))) >> site1 site4 site5 site7 >> id1 NA NA NA "06/13/13" >> id2 NA NA NA "07/03/14, 05/17/14" >> id4 NA "05/08/14" NA NA >> id5 NA NA "06/13/14" NA >> id6 "05/30/14, 06/28/13" NA NA NA >> id7 NA NA "06/25/13" NA >> >> ...with the added bonus that if you leave out the format() business, you >> get a data structure that doesn't print as nicely, but can be used for >> further computations: >> >> > with(daT, tapply(as.character(ObsDate), list(id, ObsSite), list, >> simplify=FALSE)) >> site1 site4 site5 site7 >> id1 NULL NULL NULL List,1 >> id2 NULL NULL NULL List,1 >> id4 NULL List,1 NULL NULL >> id5 NULL NULL List,1 NULL >> id6 List,1 NULL NULL NULL >> id7 NULL NULL List,1 NULL >> > M <- with(daT, tapply(as.character(ObsDate), list(id, ObsSite), list, >> simplify=FALSE)) >> > M[["id2", "site7"]] >> [[1]] >> [1] "07/03/14" "05/17/14" >> >> -pd >> >> > On 6 Feb 2020, at 01:37 , Marna Wagley <marna.wagley at gmail.com> wrote: >> > >> > daT<-structure(list(id = structure(c(1L, 2L, 2L, 3L, 4L, 5L, 5L, 6L >> > ), .Label = c("id1", "id2", "id4", "id5", "id6", "id7"), class >> "factor"), >> > ObsSite = structure(c(4L, 4L, 4L, 2L, 3L, 1L, 1L, 3L), .Label >> > c("site1", >> > "site4", "site5", "site7"), class = "factor"), ObsDate >> > structure(c(4L, >> > 8L, 2L, 1L, 5L, 3L, 7L, 6L), .Label = c("05/08/14", "05/17/14", >> > "05/30/14", "06/13/13", "06/13/14", "06/25/13", "06/28/13", >> > "07/03/14"), class = "factor")), .Names = c("id", "ObsSite", >> > "ObsDate"), class = "data.frame", row.names = c(NA, -8L)) >> > daT >> > daT$date <- mdy(daT$ObsDate) >> >> -- >> Peter Dalgaard, Professor, >> Center for Statistics, Copenhagen Business School >> Solbjerg Plads 3, 2000 Frederiksberg, Denmark >> Phone: (+45)38153501 >> Office: A 4.23 >> Email: pd.mes at cbs.dk Priv: PDalgd at gmail.com >> >> >> >> >> >> >> >> >> >>[[alternative HTML version deleted]]
Please read ?tapply *carefully.* Note that in ...,list, simplify = FALSE), "list" is the value of the FUN argument for tapply(). So, in theory, you could replace "list" (without quotes of course) with another function such as max or min to get the latest or earliest date. **Except** that won't work, because max() and min() are meaningless for character data. So what you must do is write a function that first converts the character data to "date" data, and then take the max or min of that (and maybe convert the result back after, although that may not really be needed). See ?as.Date and links therein for how to convert character data to/from dates. Although this may be less help than you would like, you will learn a lot by working through the details. However, tthers may be kinder than I and provide you more (and possibly better solutions). Note that the "lubridate" package has all sorts of this kind of functionality built in, and you may prefer using that as your interface to date handling. Bert Gunter "The trouble with having an open mind is that people keep coming along and sticking things into it." -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip ) On Fri, Feb 14, 2020 at 9:39 AM Marna Wagley <marna.wagley at gmail.com> wrote:> Dear Peter and Rui, > There are many dates (value) in some of the cells, If we want to chose only > one date (either oldest or newest) from that cell, how can we make a table > with that condition? > For example, > Using the following code; > M <- with(daT, tapply(as.character(ObsDate), list(id, ObsSite), list, > simplify=F)) > data.frame(M) > site1 site4 site5 site7 > id1 NULL NULL NULL 6/13/13 > id2 NULL NULL NULL 07/03/14, 05/17/14 > id4 NULL 5/8/14 NULL NULL > id5 NULL NULL 6/13/14 NULL > id6 05/30/14, 06/28/13 NULL NULL NULL > id7 NULL NULL 6/25/13 NULL > > If we want to take only one value (oldest date) if two or more dates/values > (classes) are found in the certain cells. For example would like to get the > following table (given below). > site1 site4 site5 site7 > id1 NULL NULL NULL 6/13/13 > id2 NULL NULL NULL 07/03/14 > id4 NULL 5/8/14 NULL NULL > id5 NULL NULL 6/13/14 NULL > id6 6/28/13 NULL NULL NULL > id7 NULL NULL 6/25/13 NULL > > here is the code and an example data > > daT<-structure(list(id = structure(c(1L, 2L, 2L, 3L, 4L, 5L, 5L, 6L > ), .Label = c("id1", "id2", "id4", "id5", "id6", "id7"), class = "factor"), > ObsSite = structure(c(4L, 4L, 4L, 2L, 3L, 1L, 1L, 3L), .Label > c("site1", > "site4", "site5", "site7"), class = "factor"), ObsDate > structure(c(4L, > 8L, 2L, 1L, 5L, 3L, 7L, 6L), .Label = c("05/08/14", "05/17/14", > "05/30/14", "06/13/13", "06/13/14", "06/25/13", "06/28/13", > "07/03/14"), class = "factor")), .Names = c("id", "ObsSite", > "ObsDate"), class = "data.frame", row.names = c(NA, -8L)) > daT > daT$date <- mdy(daT$ObsDate) > tmp <- split(daT, daT$id) > head(tmp) > M <- with(daT, tapply(as.character(ObsDate), list(id, ObsSite), list, > simplify=F)) > data.frame(M) > M[["id2", "site7"]] > # but I wanted to get the following table > daT2<-structure(list(X = structure(1:6, .Label = c("id1", "id2", "id4", > "id5", "id6", "id7"), class = "factor"), site1 = structure(c(2L, > 2L, 2L, 2L, 1L, 2L), .Label = c("6/28/13", "NULL"), class = "factor"), > site4 = structure(c(2L, 2L, 1L, 2L, 2L, 2L), .Label = c("5/8/14", > "NULL"), class = "factor"), site5 = structure(c(3L, 3L, 3L, > 1L, 3L, 2L), .Label = c("6/13/14", "6/25/13", "NULL"), class > "factor"), > site7 = structure(c(2L, 1L, 3L, 3L, 3L, 3L), .Label = c("5/17/14", > "6/13/13", "NULL"), class = "factor")), .Names = c("X", "site1", > "site4", "site5", "site7"), class = "data.frame", row.names = c(NA, > -6L)) > daT2 > Thank you > > ===> > > On Thu, Feb 6, 2020 at 8:48 AM Marna Wagley <marna.wagley at gmail.com> > wrote: > > > Thank You Profs. Dalgaard and Barradas for the code, both codes worked > > perfectly for the data and I am going to use it in my big data set. > > Thanks once again. > > > > > > On Thu, Feb 6, 2020 at 2:02 AM peter dalgaard <pdalgd at gmail.com> wrote: > > > >> There is also > >> > >> > with(daT, tapply(as.character(ObsDate), list(id, ObsSite), > >> function(x)format(list(x)))) > >> site1 site4 site5 site7 > >> id1 NA NA NA "06/13/13" > >> id2 NA NA NA "07/03/14, 05/17/14" > >> id4 NA "05/08/14" NA NA > >> id5 NA NA "06/13/14" NA > >> id6 "05/30/14, 06/28/13" NA NA NA > >> id7 NA NA "06/25/13" NA > >> > >> ...with the added bonus that if you leave out the format() business, you > >> get a data structure that doesn't print as nicely, but can be used for > >> further computations: > >> > >> > with(daT, tapply(as.character(ObsDate), list(id, ObsSite), list, > >> simplify=FALSE)) > >> site1 site4 site5 site7 > >> id1 NULL NULL NULL List,1 > >> id2 NULL NULL NULL List,1 > >> id4 NULL List,1 NULL NULL > >> id5 NULL NULL List,1 NULL > >> id6 List,1 NULL NULL NULL > >> id7 NULL NULL List,1 NULL > >> > M <- with(daT, tapply(as.character(ObsDate), list(id, ObsSite), list, > >> simplify=FALSE)) > >> > M[["id2", "site7"]] > >> [[1]] > >> [1] "07/03/14" "05/17/14" > >> > >> -pd > >> > >> > On 6 Feb 2020, at 01:37 , Marna Wagley <marna.wagley at gmail.com> > wrote: > >> > > >> > daT<-structure(list(id = structure(c(1L, 2L, 2L, 3L, 4L, 5L, 5L, 6L > >> > ), .Label = c("id1", "id2", "id4", "id5", "id6", "id7"), class > >> "factor"), > >> > ObsSite = structure(c(4L, 4L, 4L, 2L, 3L, 1L, 1L, 3L), .Label > >> > c("site1", > >> > "site4", "site5", "site7"), class = "factor"), ObsDate > >> > structure(c(4L, > >> > 8L, 2L, 1L, 5L, 3L, 7L, 6L), .Label = c("05/08/14", "05/17/14", > >> > "05/30/14", "06/13/13", "06/13/14", "06/25/13", "06/28/13", > >> > "07/03/14"), class = "factor")), .Names = c("id", "ObsSite", > >> > "ObsDate"), class = "data.frame", row.names = c(NA, -8L)) > >> > daT > >> > daT$date <- mdy(daT$ObsDate) > >> > >> -- > >> Peter Dalgaard, Professor, > >> Center for Statistics, Copenhagen Business School > >> Solbjerg Plads 3, 2000 Frederiksberg, Denmark > >> Phone: (+45)38153501 > >> Office: A 4.23 > >> Email: pd.mes at cbs.dk Priv: PDalgd at gmail.com > >> > >> > >> > >> > >> > >> > >> > >> > >> > >> > > [[alternative HTML version deleted]] > > ______________________________________________ > R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. >[[alternative HTML version deleted]]