Hi R users,
I was trying to create a pivot table for the following data, in which I
wanted to put "id" in rows and "ObsSite" in columns and
"Obsdate" is in
the cells.
I used the following code but it took only one date among the two dates.
For example, the animal (Id2) which was observed in the site7 two time or
days (07/03/14 & 05/17/2014). see below
id ObsSite ObsDate
id1 site7 06/13/13
id2 site7 07/03/14
id2 site7 05/17/14
id4 site4 05/08/14
id5 site5 06/13/14
id6 site1 05/30/14
id6 site1 06/28/13
id7 site5 06/25/13
I wanted to put both dates in the cell if there is any multiple dates, as
similar shown below
site1 site4 site5 site7
id1 0 0 0 6/13/13
id2 0 0 0 7/3/2014, 5/17/2014
id4 0 5/8/14 0 0
id5 0 0 6/13/14 0
id6 5/30/2014, 6/28/2013 0 0 0
id7 0 0 6/25/13 0
the code I used is given below but it gave me only one date in that cells.
Is there any way to get both dates in these cells?
Thanks,
###
library(lubridate)
daT<-structure(list(id = structure(c(1L, 2L, 2L, 3L, 4L, 5L, 5L, 6L
), .Label = c("id1", "id2", "id4",
"id5", "id6", "id7"), class = "factor"),
ObsSite = structure(c(4L, 4L, 4L, 2L, 3L, 1L, 1L, 3L), .Label
c("site1",
"site4", "site5", "site7"), class =
"factor"), ObsDate structure(c(4L,
8L, 2L, 1L, 5L, 3L, 7L, 6L), .Label = c("05/08/14",
"05/17/14",
"05/30/14", "06/13/13", "06/13/14",
"06/25/13", "06/28/13",
"07/03/14"), class = "factor")), .Names =
c("id", "ObsSite",
"ObsDate"), class = "data.frame", row.names = c(NA, -8L))
daT
daT$date <- mdy(daT$ObsDate)
tmp <- split(daT, daT$id)
head(tmp)
pivotTable <- do.call(rbind, lapply(tmp, function(daT){
tb <- table(daT$ObsSite)
idx <- which(tb>0)
tb1 <- replace(tb, idx, as.character(daT$date))
}))
data.frame(pivotTable)
[[alternative HTML version deleted]]
Hello,
Function tidyr::pivot_wider is meant for this kind of problem. It is
package tidyr's new way of reshaping from long to wide format. QUoting
from the help page:
Details
pivot_wider() is an updated approach to spread(), designed to be both
simpler to use and to handle more use cases. We recomend you use
pivot_wider() for new code; spread() isn't going away but is no longer
under active development.
In what follows I have use column 'date' to fill the cells, not column
'ObsDate' like in your question. Just change this and you'll get
your
expected result.
library(dplyr)
library(tidyr)
library(lubridate)
daT %>%
arrange(ObsSite) %>%
pivot_wider(id_cols = id,
names_from = ObsSite,
values_from = date,
values_fn = list(date = function(x){
paste(x, collapse = ",")
})) %>%
mutate_all(function(x) ifelse(is.na(x), 0, x)) %>%
arrange(id)
Hope this helps,
Rui Barradas
?s 00:37 de 06/02/20, Marna Wagley escreveu:> Hi R users,
> I was trying to create a pivot table for the following data, in which I
> wanted to put "id" in rows and "ObsSite" in columns
and "Obsdate" is in
> the cells.
>
> I used the following code but it took only one date among the two dates.
> For example, the animal (Id2) which was observed in the site7 two time or
> days (07/03/14 & 05/17/2014). see below
> id ObsSite ObsDate
> id1 site7 06/13/13
> id2 site7 07/03/14
> id2 site7 05/17/14
> id4 site4 05/08/14
> id5 site5 06/13/14
> id6 site1 05/30/14
> id6 site1 06/28/13
> id7 site5 06/25/13
>
> I wanted to put both dates in the cell if there is any multiple dates, as
> similar shown below
>
>
>
>
>
> site1 site4 site5 site7
> id1 0 0 0 6/13/13
> id2 0 0 0 7/3/2014, 5/17/2014
> id4 0 5/8/14 0 0
> id5 0 0 6/13/14 0
> id6 5/30/2014, 6/28/2013 0 0 0
> id7 0 0 6/25/13 0
>
> the code I used is given below but it gave me only one date in that cells.
> Is there any way to get both dates in these cells?
> Thanks,
>
> ###
> library(lubridate)
> daT<-structure(list(id = structure(c(1L, 2L, 2L, 3L, 4L, 5L, 5L, 6L
> ), .Label = c("id1", "id2", "id4",
"id5", "id6", "id7"), class = "factor"),
> ObsSite = structure(c(4L, 4L, 4L, 2L, 3L, 1L, 1L, 3L), .Label >
c("site1",
> "site4", "site5", "site7"), class =
"factor"), ObsDate > structure(c(4L,
> 8L, 2L, 1L, 5L, 3L, 7L, 6L), .Label = c("05/08/14",
"05/17/14",
> "05/30/14", "06/13/13", "06/13/14",
"06/25/13", "06/28/13",
> "07/03/14"), class = "factor")), .Names =
c("id", "ObsSite",
> "ObsDate"), class = "data.frame", row.names = c(NA,
-8L))
> daT
> daT$date <- mdy(daT$ObsDate)
> tmp <- split(daT, daT$id)
> head(tmp)
>
> pivotTable <- do.call(rbind, lapply(tmp, function(daT){
> tb <- table(daT$ObsSite)
> idx <- which(tb>0)
> tb1 <- replace(tb, idx, as.character(daT$date))
> }))
>
>
> data.frame(pivotTable)
>
> [[alternative HTML version deleted]]
>
> ______________________________________________
> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
There is also> with(daT, tapply(as.character(ObsDate), list(id, ObsSite), function(x)format(list(x))))site1 site4 site5 site7 id1 NA NA NA "06/13/13" id2 NA NA NA "07/03/14, 05/17/14" id4 NA "05/08/14" NA NA id5 NA NA "06/13/14" NA id6 "05/30/14, 06/28/13" NA NA NA id7 NA NA "06/25/13" NA ...with the added bonus that if you leave out the format() business, you get a data structure that doesn't print as nicely, but can be used for further computations:> with(daT, tapply(as.character(ObsDate), list(id, ObsSite), list, simplify=FALSE))site1 site4 site5 site7 id1 NULL NULL NULL List,1 id2 NULL NULL NULL List,1 id4 NULL List,1 NULL NULL id5 NULL NULL List,1 NULL id6 List,1 NULL NULL NULL id7 NULL NULL List,1 NULL> M <- with(daT, tapply(as.character(ObsDate), list(id, ObsSite), list, simplify=FALSE)) > M[["id2", "site7"]][[1]] [1] "07/03/14" "05/17/14" -pd> On 6 Feb 2020, at 01:37 , Marna Wagley <marna.wagley at gmail.com> wrote: > > daT<-structure(list(id = structure(c(1L, 2L, 2L, 3L, 4L, 5L, 5L, 6L > ), .Label = c("id1", "id2", "id4", "id5", "id6", "id7"), class = "factor"), > ObsSite = structure(c(4L, 4L, 4L, 2L, 3L, 1L, 1L, 3L), .Label > c("site1", > "site4", "site5", "site7"), class = "factor"), ObsDate > structure(c(4L, > 8L, 2L, 1L, 5L, 3L, 7L, 6L), .Label = c("05/08/14", "05/17/14", > "05/30/14", "06/13/13", "06/13/14", "06/25/13", "06/28/13", > "07/03/14"), class = "factor")), .Names = c("id", "ObsSite", > "ObsDate"), class = "data.frame", row.names = c(NA, -8L)) > daT > daT$date <- mdy(daT$ObsDate)-- Peter Dalgaard, Professor, Center for Statistics, Copenhagen Business School Solbjerg Plads 3, 2000 Frederiksberg, Denmark Phone: (+45)38153501 Office: A 4.23 Email: pd.mes at cbs.dk Priv: PDalgd at gmail.com
Thank You Profs. Dalgaard and Barradas for the code, both codes worked perfectly for the data and I am going to use it in my big data set. Thanks once again. On Thu, Feb 6, 2020 at 2:02 AM peter dalgaard <pdalgd at gmail.com> wrote:> There is also > > > with(daT, tapply(as.character(ObsDate), list(id, ObsSite), > function(x)format(list(x)))) > site1 site4 site5 site7 > id1 NA NA NA "06/13/13" > id2 NA NA NA "07/03/14, 05/17/14" > id4 NA "05/08/14" NA NA > id5 NA NA "06/13/14" NA > id6 "05/30/14, 06/28/13" NA NA NA > id7 NA NA "06/25/13" NA > > ...with the added bonus that if you leave out the format() business, you > get a data structure that doesn't print as nicely, but can be used for > further computations: > > > with(daT, tapply(as.character(ObsDate), list(id, ObsSite), list, > simplify=FALSE)) > site1 site4 site5 site7 > id1 NULL NULL NULL List,1 > id2 NULL NULL NULL List,1 > id4 NULL List,1 NULL NULL > id5 NULL NULL List,1 NULL > id6 List,1 NULL NULL NULL > id7 NULL NULL List,1 NULL > > M <- with(daT, tapply(as.character(ObsDate), list(id, ObsSite), list, > simplify=FALSE)) > > M[["id2", "site7"]] > [[1]] > [1] "07/03/14" "05/17/14" > > -pd > > > On 6 Feb 2020, at 01:37 , Marna Wagley <marna.wagley at gmail.com> wrote: > > > > daT<-structure(list(id = structure(c(1L, 2L, 2L, 3L, 4L, 5L, 5L, 6L > > ), .Label = c("id1", "id2", "id4", "id5", "id6", "id7"), class > "factor"), > > ObsSite = structure(c(4L, 4L, 4L, 2L, 3L, 1L, 1L, 3L), .Label > > c("site1", > > "site4", "site5", "site7"), class = "factor"), ObsDate > > structure(c(4L, > > 8L, 2L, 1L, 5L, 3L, 7L, 6L), .Label = c("05/08/14", "05/17/14", > > "05/30/14", "06/13/13", "06/13/14", "06/25/13", "06/28/13", > > "07/03/14"), class = "factor")), .Names = c("id", "ObsSite", > > "ObsDate"), class = "data.frame", row.names = c(NA, -8L)) > > daT > > daT$date <- mdy(daT$ObsDate) > > -- > Peter Dalgaard, Professor, > Center for Statistics, Copenhagen Business School > Solbjerg Plads 3, 2000 Frederiksberg, Denmark > Phone: (+45)38153501 > Office: A 4.23 > Email: pd.mes at cbs.dk Priv: PDalgd at gmail.com > > > > > > > > > >[[alternative HTML version deleted]]