Dear useRs,

I'm new to the tidyverse world and I need some help on basic things.

I have the following tibble:

  mytbl <- structure(list(files = c("a", "b", "c", "d", "e", "f"), prop = 1:6),
                     row.names = c(NA, -6L),
                     class = c("tbl_df", "tbl", "data.frame"))

I want to subset the rows with "a" in the column "files", and keep only that column.

So I did:

  myfile <- mytbl %>%
    filter(grepl("a", files)) %>%
    select(files)

It works, but I believe there must be an easier way to combine filter() and select(), right?

Thank you!
Ivan

-- 
Dr. Ivan Calandra
TraCEr, laboratory for Traceology and Controlled Experiments
MONREPOS Archaeological Research Centre and
Museum for Human Behavioural Evolution
Schloss Monrepos
56567 Neuwied, Germany
+49 (0) 2631 9772-243
https://www.researchgate.net/profile/Ivan_Calandra
Inline

----- Original Message -----
> From: "Ivan Calandra" <calandra at rgzm.de>
> To: "R-help" <r-help at r-project.org>
> Sent: Wednesday, 19 August, 2020 16:56:32
> Subject: [R] combine filter() and select()

> So I did:
> myfile <- mytbl %>%
>   filter(grepl("a", files)) %>%
>   select(files)
>
> It works, but I believe there must be an easier way to combine filter()
> and select(), right?

I would write

  mytbl %>%
    filter(grepl("a", files)) %>%
    select(files) -> myfile

as I like to keep a sort of "top to bottom and left to right" flow when writing in the tidyverse dialect of R, but that's really not important.

Apart from that, I think what you've done is "proper tidyverse". To me another difference between the dialects is that classical R often seems to put value on, and make it easy to do, things with incredibly few characters. I think that for the people who are brilliant at that sort of coding, and there are many on this list, that sort of code is also easy to read. I know that Chinese is easy to read if you grew up on it, but to a bear of little brain like me, the much more verbose style of tidyverse repays typing time with readability when I come back to my code and, though I have little experience of this yet, when I read other people's code.

What did you think wasn't "easy" about what you wrote?

Very best (all),

Chris

> ______________________________________________
> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

-- 
Chris Evans <chris at psyctc.org>
Visiting Professor, University of Sheffield <chris.evans at sheffield.ac.uk>
The whole point of dplyr primitives is to support data frames... that is, lists of columns. When you pare your data frame down to one column you are almost certainly using the wrong tool for the job.

So, sure, your code works... and it even does what you wanted in the dplyr style, but what a pointless exercise.

  grep("a", mytbl$files, value = TRUE)

On August 19, 2020 7:56:32 AM PDT, Ivan Calandra <calandra at rgzm.de> wrote:
> So I did:
> myfile <- mytbl %>%
>   filter(grepl("a", files)) %>%
>   select(files)
>
> It works, but I believe there must be an easier way to combine filter()
> and select(), right?
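To make the base R suggestion above concrete, here is a minimal sketch. It assumes a plain data frame named mydf, a hypothetical stand-in for mytbl built with data.frame() so that no package is needed. Note that the column has to be spelled out in full as files: unlike plain data frames, tibbles do not partial-match column names with $.

```r
# Hypothetical stand-in for the tibble in the thread, built as a plain
# data frame so the example runs in base R only.
mydf <- data.frame(files = c("a", "b", "c", "d", "e", "f"),
                   prop = 1:6,
                   stringsAsFactors = FALSE)

# grep(value = TRUE) returns the matching elements themselves,
# as a character vector rather than a data frame.
as_vector <- grep("a", mydf$files, value = TRUE)

# Logical indexing with grepl() is an equivalent way to get that vector.
same_vector <- mydf$files[grepl("a", mydf$files)]

print(as_vector)
print(same_vector)
```

Both forms give a bare character vector, which is the point being made: once only one column is wanted, a vector operation may be all that is needed.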
Dear Chris,

I didn't think about having the assignment at the end as you showed; it indeed fits the pipe workflow better.

By "easy", I actually meant shorter. As you said, in base R, I usually do that in 1 line, so I was hoping to do the same in tidyverse. But I'm glad to hear that I'm using tidyverse the proper way :)

Best regards,
Ivan

On 19/08/2020 19:21, Chris Evans wrote:
> I would write
>
>   mytbl %>%
>     filter(grepl("a", files)) %>%
>     select(files) -> myfile
>
> as I like to keep a sort of "top to bottom and left to right" flow when
> writing in the tidyverse dialect of R but that's really not important.
>
> Apart from that I think what you've done is "proper tidyverse".
>
> What did you think wasn't "easy" about what you wrote?
Hi Jeff,

The code you show is exactly what I usually do, in base R; but I wanted to play with tidyverse to learn it (and also understand when it makes sense and when it doesn't).

And yes, of course, in the example I gave, I end up with a 1-cell tibble, which could be better extracted as a length-1 vector. But my real goal is not to end up with a single value or even a single column. I just thought that simplifying my example was the best approach to ask for advice.

But thank you for letting me know that what I'm doing is pointless!

Ivan

On 19/08/2020 19:27, Jeff Newmiller wrote:
> The whole point of dplyr primitives is to support data frames... that is,
> lists of columns. When you pare your data frame down to one column you are
> almost certainly using the wrong tool for the job.
>
> So, sure, your code works... and it even does what you wanted in the dplyr
> style, but what a pointless exercise.
>
>   grep("a", mytbl$files, value = TRUE)
On Wed, Aug 19, 2020 at 10:03 AM Ivan Calandra <calandra at rgzm.de> wrote:
> So I did:
> myfile <- mytbl %>%
>   filter(grepl("a", files)) %>%
>   select(files)
>
> It works, but I believe there must be an easier way to combine filter()
> and select(), right?

Not in the tidyverse. As others have mentioned, both [ and subset() in base R allow you to simultaneously subset rows and columns, but there's no single verb in the tidyverse that does both. This is somewhat informed by the observation that in data frames, unlike matrices, rows and columns are not exchangeable, and you typically want to express subsetting in rather different ways.

Hadley

-- 
http://hadley.nz
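For reference, the two base R forms mentioned above could look roughly like this. It is only a sketch: mydf is a hypothetical plain data frame standing in for mytbl, so that the example needs no packages.

```r
# Hypothetical stand-in for the tibble in the thread (plain data frame).
mydf <- data.frame(files = c("a", "b", "c", "d", "e", "f"),
                   prop = 1:6,
                   stringsAsFactors = FALSE)

# `[` takes the row condition and the column in a single call.
# drop = FALSE keeps the result a one-column data frame instead of
# collapsing it to a vector.
by_bracket <- mydf[grepl("a", mydf$files), "files", drop = FALSE]

# subset() does the same with a row condition plus the select argument;
# column names can be used unquoted inside the call.
by_subset <- subset(mydf, grepl("a", files), select = files)

print(by_bracket)
print(by_subset)
```

On a plain data frame, drop = FALSE is what keeps the single column from being simplified to a vector; with a tibble, [ returns a tibble by default, so the argument is not needed there.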