Heh heh. Well, "intuitiveness" is in the mind of the intuiter. ;-)
One might even say that Jeff's and John's solutions were the most
"intuitive", as they involved nothing more than the "straightforward"
application of standard base R functionality. (Do note the scare quotes
around 'straightforward'.) Of course, other factors may well be decisive,
such as efficiency, generalizability to the *real* problem and data, and
so forth.

Best to all,
Bert

On Tue, Jun 21, 2022 at 10:50 AM Rui Barradas <ruipbarradas at sapo.pt> wrote:
> Hello,
>
> pivot_longer is a tidyr function, not a dplyr one. I find its syntax
> very intuitive. Here is a solution.
>
>
> x <- "Time_stamp      P1A0B0D    P190-90D
> 'Jun-10 10:34'  -0.000208   -0.000195
> 'Jun-10 10:51'  -0.000228   -0.000188
> 'Jun-10 11:02'  -0.000234   -0.000204
> 'Jun-10 11:17'  -0.00022    -0.000205
> 'Jun-10 11:25'  -0.000238   -0.000195"
> df1 <- read.table(textConnection(x), header = TRUE, check.names = FALSE)
>
> suppressPackageStartupMessages({
>   library(dplyr)
>   library(tidyr)
> })
>
> df1 %>%
>   pivot_longer(
>     cols = -Time_stamp,      # or starts_with("P1")
>     names_to = "Location",
>     values_to = "Measurement"
>   ) %>%
>   arrange(desc(Location), Time_stamp)
> #> # A tibble: 10 × 3
> #>    Time_stamp   Location Measurement
> #>    <chr>        <chr>          <dbl>
> #>  1 Jun-10 10:34 P1A0B0D    -0.000208
> #>  2 Jun-10 10:51 P1A0B0D    -0.000228
> #>  3 Jun-10 11:02 P1A0B0D    -0.000234
> #>  4 Jun-10 11:17 P1A0B0D    -0.00022
> #>  5 Jun-10 11:25 P1A0B0D    -0.000238
> #>  6 Jun-10 10:34 P190-90D   -0.000195
> #>  7 Jun-10 10:51 P190-90D   -0.000188
> #>  8 Jun-10 11:02 P190-90D   -0.000204
> #>  9 Jun-10 11:17 P190-90D   -0.000205
> #> 10 Jun-10 11:25 P190-90D   -0.000195
>
> Hope this helps,
>
> Rui Barradas
>
> At 17:22 on 21/06/2022, Thomas Subia wrote:
> > Colleagues:
> >
> > The header of my data set is:
> > Time_stamp      P1A0B0D     P190-90D
> > Jun-10 10:34  -0.000208   -0.000195
> > Jun-10 10:51  -0.000228   -0.000188
> > Jun-10 11:02  -0.000234   -0.000204
> > Jun-10 11:17  -0.00022    -0.000205
> > Jun-10 11:25  -0.000238   -0.000195
> >
> > I want my data set to resemble:
> >
> > Time_stamp      Location    Measurement
> > Jun-10 10:34  P1A0B0D   -0.000208
> > Jun-10 10:51  P1A0B0D   -0.000228
> > Jun-10 11:02  P1A0B0D   -0.000234
> > Jun-10 11:17  P1A0B0D   -0.00022
> > Jun-10 11:25  P1A0B0D   -0.000238
> > Jun-10 10:34  P190-90D  -0.000195
> > Jun-10 10:51  P190-90D  -0.000188
> > Jun-10 11:02  P190-90D  -0.000204
> > Jun-10 11:17  P190-90D  -0.000205
> > Jun-10 11:25  P190-90D  -0.000195
> >
> > I need some advice on how to do this using dplyr.
> >
> > V/R
> > Thomas Subia
> >
> > FM Industries, Inc. - NGK Electronics, USA | www.fmindustries.com
> > 221 Warren Ave, Fremont, CA 94539
> >
> > "In God we trust; all others must bring data."
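(For illustration only: one minimal base R sketch in the spirit of the "straightforward" base R approach Bert refers to, assuming the df1 created by read.table() in the quoted message; it is not necessarily Jeff's or John's actual code.)

    ## Assumes df1 as read in the quoted message. stack() pairs each value
    ## with the name of the column it came from ('values' / 'ind').
    long <- cbind(
      Time_stamp = rep(df1$Time_stamp, times = ncol(df1) - 1L),
      stack(df1, select = -Time_stamp)
    )
    names(long)[names(long) == "values"] <- "Measurement"
    names(long)[names(long) == "ind"]    <- "Location"
    long <- long[c("Time_stamp", "Location", "Measurement")]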
Hello,

Right, intuitive is (very) relative. I was thinking of the base function
stats::reshape. Its main difficulty, imho, is that a single function handles
reshaping in both directions, to wide and to long formats. Compared to it,
the tidyr::pivot_* functions are (much?) easier to understand.

Here is a stats::reshape solution.

df_long <- reshape(
  data = df1,
  idvar = "Time_stamp",
  varying = list(2:3),
  v.names = "Measurement",
  direction = "long")
# reshape() codes the new 'time' column as 1, 2, ... in the order of the
# varying columns, so index the column names in that same order.
df_long$time <- names(df1)[-1][df_long$time]
names(df_long)[2] <- "Location"
df_long
#>                  Time_stamp Location Measurement
#> Jun-10 10:34.1 Jun-10 10:34  P1A0B0D   -0.000208
#> Jun-10 10:51.1 Jun-10 10:51  P1A0B0D   -0.000228
#> Jun-10 11:02.1 Jun-10 11:02  P1A0B0D   -0.000234
#> Jun-10 11:17.1 Jun-10 11:17  P1A0B0D   -0.000220
#> Jun-10 11:25.1 Jun-10 11:25  P1A0B0D   -0.000238
#> Jun-10 10:34.2 Jun-10 10:34 P190-90D   -0.000195
#> Jun-10 10:51.2 Jun-10 10:51 P190-90D   -0.000188
#> Jun-10 11:02.2 Jun-10 11:02 P190-90D   -0.000204
#> Jun-10 11:17.2 Jun-10 11:17 P190-90D   -0.000205
#> Jun-10 11:25.2 Jun-10 11:25 P190-90D   -0.000195

Hope this helps,

Rui Barradas

At 19:25 on 21/06/2022, Bert Gunter wrote:
> Heh heh. Well, "intuitiveness" is in the mind of the intuiter. ;-)
> One might even say that Jeff's and John's solutions were the most
> "intuitive", as they involved nothing more than the "straightforward"
> application of standard base R functionality. (Do note the scare quotes
> around 'straightforward'.) Of course, other factors may well be decisive,
> such as efficiency, generalizability to the *real* problem and data, and
> so forth.
>
> Best to all,
> Bert
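(An aside on reshape() covering both directions: a minimal sketch going back from long to wide, assuming the df_long produced above is available; the argument choices are illustrative and not part of Rui's posted solution.)

    ## Sketch only: the same base function going back the other way.
    ## reshape() left a "reshapeLong" attribute on df_long so it could
    ## auto-invert; drop it, since the long data was modified afterwards,
    ## and spell the arguments out instead.
    attr(df_long, "reshapeLong") <- NULL

    df_wide <- reshape(
      df_long,
      idvar     = "Time_stamp",
      timevar   = "Location",
      v.names   = "Measurement",
      direction = "wide"
    )

    ## The spread columns come back as "Measurement.P1A0B0D" and
    ## "Measurement.P190-90D"; strip the prefix to recover the original names.
    names(df_wide) <- sub("^Measurement\\.", "", names(df_wide))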
Bert and Others,

Now that newer versions of R support a reasonable pipeline method, I think
there may be more interest in using functions designed to be easy to use in
pipelines, including wrappers that simply re-arrange the argument order of
existing functions so that the first argument is the one passed along the
pipeline.

When people say "dplyr" now, it is indeed a specific package, but some use it
to mean something more like the "tidyverse" group of packages that are meant
to operate well together, and that group includes the "tidyr" package.

The intuitively OBVIOUS solution using base R that is shown is actually a bit
restricted and does not trivially scale up to deal with lots more columns that
are to be consolidated, perhaps in multiple batches and based on things like
suffixes in the names and so on, which the tidyverse functions are able to
handle. And if it matters, you may want to keep the order of the rows
relatively intact, and the solution offered does not.

But packages like dplyr are not a full solution, and most people would be
better off learning all about what core R offers and only supplementing it
here and there with selected packages. If you ever have to read code others
wrote or modified, ...

In any case, THIS forum seems dedicated to a purpose that precludes more than
an aside about packages. Very little that is in packages cannot, in theory, be
done using mostly regular R, but I am not sure that is any longer true, or
wise. Many packages re-write R functionality as something like much faster
code in C or C++, or make use of R code that is more efficient than you might
cobble together on your own. Some of it is also very general and allows
programming at higher levels of abstraction, and I specifically include the
pipeline methods (now also in R itself) as such a level of abstraction.

The topic, loosely, was how to transform your data.frame (or equivalent) from
what some call WIDE form to LONG form. That is often done in pipelines where,
after some steps, the resulting data has to be transformed before being given
to a program like one doing graphics with ggplot(), and no amount of lecturing
suggesting we use native R graphics for everything will in the slightest bit
convince me.

So the supplied method, unless suitably placed in a function that takes a
data.frame as a first argument and returns the modified new one as a result,
will only help for some purposes and be a pain for others, as you pause and
leave the pipeline to make the change and then ...

As was said, intuitive is fairly meaningless, as my personal intuition often
intuits multiple ways of looking at something, each one being its own
intuitive way, and the task often is simply to pick one based on additional
factors. It may be intuitively obvious to do it the shortest and easiest way
imaginable, but also obvious, if you will need this again, to make it properly
commented and documented and even at times to do more error checking or handle
more general tasks ...

At MY stage I think I know enough, but I also see no reason to waste lots of
time doing things in many steps, with lots of possible mistakes on my part,
when a few well-coordinated and tested packages make it easy.

To each their own. But I am NOT suggesting this forum should change; there are
others that can accommodate people. And there are way more packages out there
than most of us are even aware exist!
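(A minimal sketch of the pipeline-oriented style described above, assuming R >= 4.1 for the native |> pipe plus the tidyr and ggplot2 packages and the df1 from earlier in the thread; the wrapper name melt_measurements() and the plot are purely illustrative, not anything posted in this thread.)

    ## Sketch only: assumes R >= 4.1 (native |> pipe), tidyr, ggplot2,
    ## and the df1 read in earlier in the thread.
    library(tidyr)
    library(ggplot2)

    ## A tiny data.frame-first wrapper of the kind mentioned above; the
    ## name melt_measurements() is made up purely for illustration.
    melt_measurements <- function(data, id = "Time_stamp") {
      pivot_longer(
        data,
        cols      = -all_of(id),
        names_to  = "Location",
        values_to = "Measurement"
      )
    }

    ## Used as one step of a pipeline feeding a plot:
    df1 |>
      melt_measurements() |>
      ggplot(aes(Time_stamp, Measurement, colour = Location, group = Location)) +
      geom_line() +
      geom_point()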
-----Original Message-----
From: Bert Gunter <bgunter.4567 at gmail.com>
To: Rui Barradas <ruipbarradas at sapo.pt>
Cc: r-help at r-project.org <r-help at r-project.org>; Thomas Subia <thomas.subia at fmindustries.com>
Sent: Tue, Jun 21, 2022 2:25 pm
Subject: Re: [R] Dplyr question

Heh heh. Well, "intuitiveness" is in the mind of the intuiter. ;-)
One might even say that Jeff's and John's solutions were the most
"intuitive", as they involved nothing more than the "straightforward"
application of standard base R functionality. (Do note the scare quotes
around 'straightforward'.) Of course, other factors may well be decisive,
such as efficiency, generalizability to the *real* problem and data, and
so forth.

Best to all,
Bert