Hi R users, I was trying to create a pivot table for the following data, in which I wanted to put "id" in rows and "ObsSite" in columns and "Obsdate" is in the cells. I used the following code but it took only one date among the two dates. For example, the animal (Id2) which was observed in the site7 two time or days (07/03/14 & 05/17/2014). see below id ObsSite ObsDate id1 site7 06/13/13 id2 site7 07/03/14 id2 site7 05/17/14 id4 site4 05/08/14 id5 site5 06/13/14 id6 site1 05/30/14 id6 site1 06/28/13 id7 site5 06/25/13 I wanted to put both dates in the cell if there is any multiple dates, as similar shown below site1 site4 site5 site7 id1 0 0 0 6/13/13 id2 0 0 0 7/3/2014, 5/17/2014 id4 0 5/8/14 0 0 id5 0 0 6/13/14 0 id6 5/30/2014, 6/28/2013 0 0 0 id7 0 0 6/25/13 0 the code I used is given below but it gave me only one date in that cells. Is there any way to get both dates in these cells? Thanks, ### library(lubridate) daT<-structure(list(id = structure(c(1L, 2L, 2L, 3L, 4L, 5L, 5L, 6L ), .Label = c("id1", "id2", "id4", "id5", "id6", "id7"), class = "factor"), ObsSite = structure(c(4L, 4L, 4L, 2L, 3L, 1L, 1L, 3L), .Label c("site1", "site4", "site5", "site7"), class = "factor"), ObsDate structure(c(4L, 8L, 2L, 1L, 5L, 3L, 7L, 6L), .Label = c("05/08/14", "05/17/14", "05/30/14", "06/13/13", "06/13/14", "06/25/13", "06/28/13", "07/03/14"), class = "factor")), .Names = c("id", "ObsSite", "ObsDate"), class = "data.frame", row.names = c(NA, -8L)) daT daT$date <- mdy(daT$ObsDate) tmp <- split(daT, daT$id) head(tmp) pivotTable <- do.call(rbind, lapply(tmp, function(daT){ tb <- table(daT$ObsSite) idx <- which(tb>0) tb1 <- replace(tb, idx, as.character(daT$date)) })) data.frame(pivotTable) [[alternative HTML version deleted]]
Hello, Function tidyr::pivot_wider is meant for this kind of problem. It is package tidyr's new way of reshaping from long to wide format. QUoting from the help page: Details pivot_wider() is an updated approach to spread(), designed to be both simpler to use and to handle more use cases. We recomend you use pivot_wider() for new code; spread() isn't going away but is no longer under active development. In what follows I have use column 'date' to fill the cells, not column 'ObsDate' like in your question. Just change this and you'll get your expected result. library(dplyr) library(tidyr) library(lubridate) daT %>% arrange(ObsSite) %>% pivot_wider(id_cols = id, names_from = ObsSite, values_from = date, values_fn = list(date = function(x){ paste(x, collapse = ",") })) %>% mutate_all(function(x) ifelse(is.na(x), 0, x)) %>% arrange(id) Hope this helps, Rui Barradas ?s 00:37 de 06/02/20, Marna Wagley escreveu:> Hi R users, > I was trying to create a pivot table for the following data, in which I > wanted to put "id" in rows and "ObsSite" in columns and "Obsdate" is in > the cells. > > I used the following code but it took only one date among the two dates. > For example, the animal (Id2) which was observed in the site7 two time or > days (07/03/14 & 05/17/2014). see below > id ObsSite ObsDate > id1 site7 06/13/13 > id2 site7 07/03/14 > id2 site7 05/17/14 > id4 site4 05/08/14 > id5 site5 06/13/14 > id6 site1 05/30/14 > id6 site1 06/28/13 > id7 site5 06/25/13 > > I wanted to put both dates in the cell if there is any multiple dates, as > similar shown below > > > > > > site1 site4 site5 site7 > id1 0 0 0 6/13/13 > id2 0 0 0 7/3/2014, 5/17/2014 > id4 0 5/8/14 0 0 > id5 0 0 6/13/14 0 > id6 5/30/2014, 6/28/2013 0 0 0 > id7 0 0 6/25/13 0 > > the code I used is given below but it gave me only one date in that cells. > Is there any way to get both dates in these cells? > Thanks, > > ### > library(lubridate) > daT<-structure(list(id = structure(c(1L, 2L, 2L, 3L, 4L, 5L, 5L, 6L > ), .Label = c("id1", "id2", "id4", "id5", "id6", "id7"), class = "factor"), > ObsSite = structure(c(4L, 4L, 4L, 2L, 3L, 1L, 1L, 3L), .Label > c("site1", > "site4", "site5", "site7"), class = "factor"), ObsDate > structure(c(4L, > 8L, 2L, 1L, 5L, 3L, 7L, 6L), .Label = c("05/08/14", "05/17/14", > "05/30/14", "06/13/13", "06/13/14", "06/25/13", "06/28/13", > "07/03/14"), class = "factor")), .Names = c("id", "ObsSite", > "ObsDate"), class = "data.frame", row.names = c(NA, -8L)) > daT > daT$date <- mdy(daT$ObsDate) > tmp <- split(daT, daT$id) > head(tmp) > > pivotTable <- do.call(rbind, lapply(tmp, function(daT){ > tb <- table(daT$ObsSite) > idx <- which(tb>0) > tb1 <- replace(tb, idx, as.character(daT$date)) > })) > > > data.frame(pivotTable) > > [[alternative HTML version deleted]] > > ______________________________________________ > R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. >
There is also> with(daT, tapply(as.character(ObsDate), list(id, ObsSite), function(x)format(list(x))))site1 site4 site5 site7 id1 NA NA NA "06/13/13" id2 NA NA NA "07/03/14, 05/17/14" id4 NA "05/08/14" NA NA id5 NA NA "06/13/14" NA id6 "05/30/14, 06/28/13" NA NA NA id7 NA NA "06/25/13" NA ...with the added bonus that if you leave out the format() business, you get a data structure that doesn't print as nicely, but can be used for further computations:> with(daT, tapply(as.character(ObsDate), list(id, ObsSite), list, simplify=FALSE))site1 site4 site5 site7 id1 NULL NULL NULL List,1 id2 NULL NULL NULL List,1 id4 NULL List,1 NULL NULL id5 NULL NULL List,1 NULL id6 List,1 NULL NULL NULL id7 NULL NULL List,1 NULL> M <- with(daT, tapply(as.character(ObsDate), list(id, ObsSite), list, simplify=FALSE)) > M[["id2", "site7"]][[1]] [1] "07/03/14" "05/17/14" -pd> On 6 Feb 2020, at 01:37 , Marna Wagley <marna.wagley at gmail.com> wrote: > > daT<-structure(list(id = structure(c(1L, 2L, 2L, 3L, 4L, 5L, 5L, 6L > ), .Label = c("id1", "id2", "id4", "id5", "id6", "id7"), class = "factor"), > ObsSite = structure(c(4L, 4L, 4L, 2L, 3L, 1L, 1L, 3L), .Label > c("site1", > "site4", "site5", "site7"), class = "factor"), ObsDate > structure(c(4L, > 8L, 2L, 1L, 5L, 3L, 7L, 6L), .Label = c("05/08/14", "05/17/14", > "05/30/14", "06/13/13", "06/13/14", "06/25/13", "06/28/13", > "07/03/14"), class = "factor")), .Names = c("id", "ObsSite", > "ObsDate"), class = "data.frame", row.names = c(NA, -8L)) > daT > daT$date <- mdy(daT$ObsDate)-- Peter Dalgaard, Professor, Center for Statistics, Copenhagen Business School Solbjerg Plads 3, 2000 Frederiksberg, Denmark Phone: (+45)38153501 Office: A 4.23 Email: pd.mes at cbs.dk Priv: PDalgd at gmail.com
Thank You Profs. Dalgaard and Barradas for the code, both codes worked perfectly for the data and I am going to use it in my big data set. Thanks once again. On Thu, Feb 6, 2020 at 2:02 AM peter dalgaard <pdalgd at gmail.com> wrote:> There is also > > > with(daT, tapply(as.character(ObsDate), list(id, ObsSite), > function(x)format(list(x)))) > site1 site4 site5 site7 > id1 NA NA NA "06/13/13" > id2 NA NA NA "07/03/14, 05/17/14" > id4 NA "05/08/14" NA NA > id5 NA NA "06/13/14" NA > id6 "05/30/14, 06/28/13" NA NA NA > id7 NA NA "06/25/13" NA > > ...with the added bonus that if you leave out the format() business, you > get a data structure that doesn't print as nicely, but can be used for > further computations: > > > with(daT, tapply(as.character(ObsDate), list(id, ObsSite), list, > simplify=FALSE)) > site1 site4 site5 site7 > id1 NULL NULL NULL List,1 > id2 NULL NULL NULL List,1 > id4 NULL List,1 NULL NULL > id5 NULL NULL List,1 NULL > id6 List,1 NULL NULL NULL > id7 NULL NULL List,1 NULL > > M <- with(daT, tapply(as.character(ObsDate), list(id, ObsSite), list, > simplify=FALSE)) > > M[["id2", "site7"]] > [[1]] > [1] "07/03/14" "05/17/14" > > -pd > > > On 6 Feb 2020, at 01:37 , Marna Wagley <marna.wagley at gmail.com> wrote: > > > > daT<-structure(list(id = structure(c(1L, 2L, 2L, 3L, 4L, 5L, 5L, 6L > > ), .Label = c("id1", "id2", "id4", "id5", "id6", "id7"), class > "factor"), > > ObsSite = structure(c(4L, 4L, 4L, 2L, 3L, 1L, 1L, 3L), .Label > > c("site1", > > "site4", "site5", "site7"), class = "factor"), ObsDate > > structure(c(4L, > > 8L, 2L, 1L, 5L, 3L, 7L, 6L), .Label = c("05/08/14", "05/17/14", > > "05/30/14", "06/13/13", "06/13/14", "06/25/13", "06/28/13", > > "07/03/14"), class = "factor")), .Names = c("id", "ObsSite", > > "ObsDate"), class = "data.frame", row.names = c(NA, -8L)) > > daT > > daT$date <- mdy(daT$ObsDate) > > -- > Peter Dalgaard, Professor, > Center for Statistics, Copenhagen Business School > Solbjerg Plads 3, 2000 Frederiksberg, Denmark > Phone: (+45)38153501 > Office: A 4.23 > Email: pd.mes at cbs.dk Priv: PDalgd at gmail.com > > > > > > > > > >[[alternative HTML version deleted]]