Hi Jim,
The data set is correct. I took two readings from "SITE A" within a short time interval, so I want to take the first value when values are repeated within the same "timeGroup". I therefore want the following:

FinalData1
     B1  B2
id_X "A" "B"
id_Y "A" "B"

Thanks,

On Tue, May 1, 2018 at 4:05 PM, Jim Lemon <drjimlemon at gmail.com> wrote:
> Hi Marna,
> I think this is due to having three rows for id_X and only two for
> id_Y. The function creates a data frame with enough columns to hold
> the greatest number of values for each ID variable. Notice that the
> SITE_n columns contain three values for id_X (A, A, B) and two for
> id_Y (A, B, NA) as there was no third occasion of measurement for the
> latter. Even though there are only two _values_ for SITE, there must
> be enough space for three. In your desired output, SITE for the second
> occasion of measurement is wrong (it should be "A"), and for the third
> occasion it is unknown. Even if there was only one value for SITE in
> the original data frame, it should be repeated for the correct number
> of observations. I think you may be mixing up case ID with location of
> observation.
>
> Jim
>
> On Wed, May 2, 2018 at 8:48 AM, Marna Wagley <marna.wagley at gmail.com> wrote:
> > Hi Jim,
> > Thank you very much for your suggestions. I used it but it gave me three
> > sites. But actually I do have only two sites "Id_X" and "Id_y" . In fact
> > "A" is repeated two times for "Id_X". If it is repeated, I would like to
> > take the first one among many repeated values.
> >
> > dat<-structure(list(ID = structure(c(1L, 1L, 1L, 2L, 2L), .Label = c("id_X",
> > "id_Y"), class = "factor"), EventDate = structure(c(4L, 5L, 2L,
> > 3L, 1L), .Label = c("9/15/16", "9/15/17", "9/7/16", "9/8/16",
> > "9/9/16"), class = "factor"), timeGroup = structure(c(1L, 1L,
> > 2L, 1L, 2L), .Label = c("B1", "B2"), class = "factor"), SITE = structure(c(1L,
> > 1L, 2L, 1L, 2L), .Label = c("A", "B"), class = "factor")), .Names = c("ID",
> > "EventDate", "timeGroup", "SITE"), class = "data.frame", row.names = c(NA,
> > -5L))
> >
> > library(prettyR)
> > stretch_df(dat,idvar="ID",to.stretch=c("EventDate","SITE"))
> >
> >     ID timeGroup EventDate_1 EventDate_2 EventDate_3 SITE_1 SITE_2 SITE_3
> > 1 id_X        B1      9/8/16      9/9/16     9/15/17      A      A      B
> > 2 id_Y        B1      9/7/16     9/15/16        <NA>      A      B   <NA>
> >
> > Basically I am looking for like following table
> >
> >     ID timeGroup EventDate_1 EventDate_2 EventDate_3 SITE_1 SITE_2
> > 1 id_X        B1      9/8/16      9/9/16     9/15/17      A      B
> > 2 id_Y        B1      9/7/16     9/15/16        <NA>      A      B
> >
> > Thanks
> >
> > On Tue, May 1, 2018 at 3:32 PM, Jim Lemon <drjimlemon at gmail.com> wrote:
> >> Hi Marna,
> >> Try this:
> >>
> >> library(prettyR)
> >> stretch_df(dat,idvar="ID",to.stretch=c("EventDate","SITE"))
> >>
> >> Jim
> >>
> >> On Wed, May 2, 2018 at 8:24 AM, Marna Wagley <marna.wagley at gmail.com> wrote:
> >> > Hi R user,
> >> > I was trying to convert a long matrix to wide? I have an example and
> >> > would like to get a table (FinalData1):
> >> >
> >> > FinalData1
> >> >      B1  B2
> >> > id_X "A" "B"
> >> > id_Y "A" "B"
> >> >
> >> > but I got the following table using the following code.
> >> >
> >> > FinalData1
> >> >      B1  B2
> >> > id_X "A" "A"
> >> > id_Y "A" "B"
> >> >
> >> > the code and the example data I used are given below. Is there any
> >> > suggestions to fix the problem?
> >> >
> >> > dat<-structure(list(ID = structure(c(1L, 1L, 1L, 2L, 2L), .Label = c("id_X",
> >> > "id_Y"), class = "factor"), EventDate = structure(c(4L, 5L, 2L,
> >> > 3L, 1L), .Label = c("9/15/16", "9/15/17", "9/7/16", "9/8/16",
> >> > "9/9/16"), class = "factor"), timeGroup = structure(c(1L, 1L,
> >> > 2L, 1L, 2L), .Label = c("B1", "B2"), class = "factor"), SITE = structure(c(1L,
> >> > 1L, 2L, 1L, 2L), .Label = c("A", "B"), class = "factor")), .Names = c("ID",
> >> > "EventDate", "timeGroup", "SITE"), class = "data.frame", row.names = c(NA,
> >> > -5L))
> >> >
> >> > tmp <- split(dat, dat$ID)
> >> > tmp1 <- do.call(rbind, lapply(tmp, function(dat){
> >> >   tb <- table(dat$timeGroup)
> >> >   idx <- which(tb>0)
> >> >   tb1 <- replace(tb, idx, as.character(dat$SITE))
> >> > }))
> >> >
> >> > tmp1
> >> > FinalData<-print(tmp1, quote=FALSE)
> >> >
> >> > [[alternative HTML version deleted]]
> >> >
> >> > ______________________________________________
> >> > R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> >> > https://stat.ethz.ch/mailman/listinfo/r-help
> >> > PLEASE do read the posting guide
> >> > http://www.R-project.org/posting-guide.html
> >> > and provide commented, minimal, self-contained, reproducible code.
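The request above amounts to keeping the first SITE reading per ID and timeGroup before pivoting. A minimal base R sketch of that idea follows. It rebuilds the example data with plain character columns for brevity and assumes the rows are already in chronological order within each ID (true for the posted example, but an assumption in general). The name `first_rows` is illustrative only.

```r
# Example data from the thread, rebuilt with character columns
dat <- data.frame(
  ID        = c("id_X", "id_X", "id_X", "id_Y", "id_Y"),
  EventDate = c("9/8/16", "9/9/16", "9/15/17", "9/7/16", "9/15/16"),
  timeGroup = c("B1", "B1", "B2", "B1", "B2"),
  SITE      = c("A", "A", "B", "A", "B"),
  stringsAsFactors = FALSE
)

# Keep only the first row seen for each ID/timeGroup combination
# (assumption: rows are already sorted by date within each ID)
first_rows <- dat[!duplicated(dat[c("ID", "timeGroup")]), ]

# Cross-tabulate SITE by ID (rows) and timeGroup (columns)
FinalData1 <- with(first_rows,
                   tapply(SITE, list(ID, timeGroup), function(x) x[1]))
print(FinalData1)
```

For the posted data this yields "A" under B1 and "B" under B2 for both id_X and id_Y. If the rows might not be in date order, they would need sorting first.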
Hi Marna,
This is a condition that the function cannot handle. It would be possible to reformat the result based on the time intervals, but the stretch_df function doesn't try to interpret the values, just stretches them out to a wide format.

Jim

On Wed, May 2, 2018 at 9:16 AM, Marna Wagley <marna.wagley at gmail.com> wrote:
> Hi Jim,
> The data set is correct. I took two readings from the "SITE A" within a
> short time interval, therefore I want to take the first value if there are
> repeated within a same group of "timeGroup".
> Therefore I wanted following
>
> FinalData1
>
>      B1  B2
> id_X "A" "B"
> id_Y "A" "B"
>
> thanks,
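One way to do the reformatting Jim mentions is to drop the repeated readings before widening, so that the wide step never sees them. The sketch below uses base R's `reshape()` in place of `stretch_df`, which is a swapped-in technique and not what was proposed in the thread. The names `when`, `first_only` and `wide` are illustrative. Dates are parsed only to sort the rows, on the assumption that "first" means earliest EventDate.

```r
# Example data from the thread, rebuilt with character columns
dat <- data.frame(
  ID        = c("id_X", "id_X", "id_X", "id_Y", "id_Y"),
  EventDate = c("9/8/16", "9/9/16", "9/15/17", "9/7/16", "9/15/16"),
  timeGroup = c("B1", "B1", "B2", "B1", "B2"),
  SITE      = c("A", "A", "B", "A", "B"),
  stringsAsFactors = FALSE
)

# Parse dates only for sorting; the original strings are kept in the data
when <- as.Date(dat$EventDate, format = "%m/%d/%y")
dat  <- dat[order(dat$ID, dat$timeGroup, when), ]

# Keep the earliest reading for each ID/timeGroup combination
first_only <- dat[!duplicated(dat[c("ID", "timeGroup")]), ]

# Widen: one row per ID, one EventDate_* and SITE_* column per timeGroup
wide <- reshape(first_only, idvar = "ID", timevar = "timeGroup",
                v.names = c("EventDate", "SITE"),
                direction = "wide", sep = "_")
rownames(wide) <- NULL
print(wide)
```

Because the columns are keyed by timeGroup and not by occasion number, id_X and id_Y end up with the same set of columns even though id_X had an extra reading.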
Here is a stab in the dark. I agree with Jim that the description of the problem is hard to follow. The original posting being in HTML format did not help.

#########
library(dplyr)
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#>     filter, lag
#> The following objects are masked from 'package:base':
#>
#>     intersect, setdiff, setequal, union
library(tidyr)

# indenting was just a side-effect of me cleaning up the HTML mess
dat <- structure( list( ID = structure( c( 1L, 1L, 1L, 2L, 2L)
                                      , .Label = c("id_X","id_Y")
                                      , class = "factor"
                                      )
                      , EventDate = structure( c( 4L, 5L, 2L
                                                , 3L, 1L )
                                             , .Label = c( "9/15/16"
                                                         , "9/15/17"
                                                         , "9/7/16"
                                                         , "9/8/16"
                                                         , "9/9/16" )
                                             , class = "factor"
                                             )
                      , timeGroup = structure( c( 1L, 1L, 2L, 1L, 2L)
                                             , .Label = c("B1", "B2")
                                             , class = "factor"
                                             )
                      , SITE = structure( c( 1L, 1L, 2L, 1L, 2L)
                                        , .Label = c("A", "B" )
                                        , class = "factor"
                                        )
                      )
                , .Names = c( "ID", "EventDate"
                            , "timeGroup", "SITE")
                , class = "data.frame"
                , row.names = c(NA, -5L)
                )

dat2 <- (   dat
        %>% mutate( EventDate = as.Date( as.character( EventDate )
                                       , format = "%m/%d/%y"
                                       )
                  )
        %>% arrange( ID, timeGroup, EventDate )
        %>% group_by( ID, timeGroup )
        %>% top_n( 1, EventDate )
        %>% ungroup
        )
dat2
#> # A tibble: 4 x 4
#>   ID    EventDate  timeGroup SITE
#>   <fct> <date>     <fct>     <fct>
#> 1 id_X  2016-09-09 B1        A
#> 2 id_X  2017-09-15 B2        B
#> 3 id_Y  2016-09-07 B1        A
#> 4 id_Y  2016-09-15 B2        B

dat3a <- (   dat2
         %>% mutate( timeGroup = paste( "EventDate"
                                      , timeGroup
                                      , sep="_"
                                      )
                   )
         %>% select( ID, timeGroup, EventDate )
         %>% spread( timeGroup, EventDate )
         )
dat3a
#> # A tibble: 2 x 3
#>   ID    EventDate_B1 EventDate_B2
#>   <fct> <date>       <date>
#> 1 id_X  2016-09-09   2017-09-15
#> 2 id_Y  2016-09-07   2016-09-15

dat3b <- (   dat2
         %>% mutate( timeGroup = paste( "SITE"
                                      , timeGroup
                                      , sep = "_"
                                      )
                   )
         %>% select( ID, timeGroup, SITE )
         %>% spread( timeGroup, SITE )
         )
dat3b
#> # A tibble: 2 x 3
#>   ID    SITE_B1 SITE_B2
#>   <fct> <fct>   <fct>
#> 1 id_X  A       B
#> 2 id_Y  A       B

dat4 <- (   dat3a
        %>% left_join( dat3b, by = "ID" )
        )
dat4
#> # A tibble: 2 x 5
#>   ID    EventDate_B1 EventDate_B2 SITE_B1 SITE_B2
#>   <fct> <date>       <date>       <fct>   <fct>
#> 1 id_X  2016-09-09   2017-09-15   A       B
#> 2 id_Y  2016-09-07   2016-09-15   A       B
#########

On Wed, 2 May 2018, Jim Lemon wrote:
> Hi Marna,
> This is a condition that the function cannot handle. It would be
> possible to reformat the result based on the time intervals, but the
> stretch_df function doesn't try to interpret the values, just
> stretches them out to a wide format.
>
> Jim

---------------------------------------------------------------------------
Jeff Newmiller                        The     .....       .....  Go Live...
DCN:<jdnewmil at dcn.davis.ca.us>        Basics: ##.#.       ##.#.  Live Go...
                                      Live:   OO#.. Dead: OO#..  Playing
Research Engineer (Solar/Batteries            O.O#.       #.O#.  with
/Software/Embedded Controllers)               .OO#.       .OO#.  rocks...1k
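A note on the selection step in the pipeline above: `top_n( 1, EventDate )` keeps the row with the largest EventDate in each group, which is why id_X / B1 shows 2016-09-09. If "first" is meant as the earliest reading, one possible variation (a sketch, not part of the posted solution) is to pass a negative `n`, which `top_n()` documents as selecting the bottom rows. The name `dat2_first` is illustrative, and the data are rebuilt with character columns for brevity.

```r
library(dplyr)

# Example data from the thread, rebuilt with character columns
dat <- data.frame(
  ID        = c("id_X", "id_X", "id_X", "id_Y", "id_Y"),
  EventDate = c("9/8/16", "9/9/16", "9/15/17", "9/7/16", "9/15/16"),
  timeGroup = c("B1", "B1", "B2", "B1", "B2"),
  SITE      = c("A", "A", "B", "A", "B"),
  stringsAsFactors = FALSE
)

dat2_first <- dat %>%
  mutate(EventDate = as.Date(EventDate, format = "%m/%d/%y")) %>%
  group_by(ID, timeGroup) %>%
  top_n(-1, EventDate) %>%   # negative n keeps the smallest EventDate per group
  ungroup()
print(dat2_first)
```

With this change id_X / B1 keeps the 2016-09-08 reading. Note that `top_n()` returns all tied rows, so two readings on the same date in one group would both survive; the SITE columns in this example come out the same either way.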