I have a tibble that has a large number of variables and because I'm partial lazy (and I really want to know how to do this), I would like find out if it possible to partial column matching with the mutate function in the tidyverse. I have a tibble with the following> gross_test <- gross_df %>%+ select(Job,Mfg_Labor_Hrs_Planned,Mfg_Labor_Hrs_Actual,Eng_Labor_Hrs_Planned,Eng_Labor_Hrs_Actual)> gross_test# A tibble: 6 x 5 Job Mfg_Labor_Hrs_Pla~ Mfg_Labor_Hrs_Ac~ Eng_Labor_Hrs_Pl~ Eng_Labor_Hrs_Ac~ * <dbl> <dbl> <dbl> <dbl> <dbl> 1 9892 950. 1082. 133 302. 2 9893 950. 1082. 133 302. 3 9652 950. 1082. 133 302. 4 9894 950. 1082. 133 302. 5 9652 950. 1082. 133 302. 6 9894 950. 1082. 133 302.>The column names follow a pattern of (.+)_Planned and (.+)_Actual and there are a large number of them. What I would like to do is the following: mutate(Mfg_Labor_Hrs_Diff= Mfg_Labor_Hrs_Actual- Mfg_Labor_Hrs_Planned) using a regex. Something like the following mutate( $1_Diff = $1_Actual - $1_Planned) Where $1 was Mfg_Labor_Hrs. This would iterate over the entire set of columns doing Mfg_Labor_Hrs and then Eng_Labor_Hrs, etc. Is this even possible or will I need to explicitly write out the mutate for each combination? Thanks for looking at this. Shawn Way, P.E. ???
Hi Shawn, Here is a solution using base R (no dplyr). The only regex appears in the statement to get the common prefixes. colsPrefixes <- sub("_Planned$","",colnames(gross_test)[ grep("_Planned$",colnames(gross_test))]) f <- function(s) { gross_test[,paste(s,"Diff",sep="_")] <<- gross_test[,paste(s,"Actual",sep="_")] - gross_test[,paste(s,"Planned",sep="_")] } tmpOut <- sapply(colsPrefixes,f) HTH, Eric On Tue, Aug 6, 2019 at 3:34 PM Shawn Way <SWay at meco.com> wrote:> I have a tibble that has a large number of variables and because I'm > partial lazy (and I really want to know how to do this), I would like find > out if it possible to partial column matching with the mutate function in > the tidyverse. > > I have a tibble with the following > > > gross_test <- gross_df %>% > + > select(Job,Mfg_Labor_Hrs_Planned,Mfg_Labor_Hrs_Actual,Eng_Labor_Hrs_Planned,Eng_Labor_Hrs_Actual) > > gross_test > # A tibble: 6 x 5 > Job Mfg_Labor_Hrs_Pla~ Mfg_Labor_Hrs_Ac~ Eng_Labor_Hrs_Pl~ > Eng_Labor_Hrs_Ac~ > * <dbl> <dbl> <dbl> <dbl> > <dbl> > 1 9892 950. 1082. 133 > 302. > 2 9893 950. 1082. 133 > 302. > 3 9652 950. 1082. 133 > 302. > 4 9894 950. 1082. 133 > 302. > 5 9652 950. 1082. 133 > 302. > 6 9894 950. 1082. 133 > 302. > > > > The column names follow a pattern of (.+)_Planned and (.+)_Actual and > there are a large number of them. What I would like to do is the following: > > mutate(Mfg_Labor_Hrs_Diff= Mfg_Labor_Hrs_Actual- Mfg_Labor_Hrs_Planned) > > using a regex. Something like the following > > mutate( $1_Diff = $1_Actual - $1_Planned) > > Where $1 was Mfg_Labor_Hrs. This would iterate over the entire set of > columns doing Mfg_Labor_Hrs and then Eng_Labor_Hrs, etc. > > Is this even possible or will I need to explicitly write out the mutate > for each combination? > > Thanks for looking at this. > > Shawn Way, P.E. > > > > ______________________________________________ > R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. >[[alternative HTML version deleted]]
Frankly, this is great. I don?t really care if it base or tidy, I just need it to work. Thank you kindly! Shawn Way, PE From: Eric Berger <ericjberger at gmail.com> Sent: Tuesday, August 06, 2019 8:30 AM To: Shawn Way <SWay at meco.com> Cc: r-help at r-project.org Subject: Re: [R] Using Partial Column Matching for Mutate ** External Email **: This email originated from outside of the organization. Do not click links or open attachments unless you recognize the sender and know the content is safe. Hi Shawn, Here is a solution using base R (no dplyr). The only regex appears in the statement to get the common prefixes. colsPrefixes <- sub("_Planned$","",colnames(gross_test)[ grep("_Planned$",colnames(gross_test))]) f <- function(s) { gross_test[,paste(s,"Diff",sep="_")] <<- gross_test[,paste(s,"Actual",sep="_")] - gross_test[,paste(s,"Planned",sep="_")] } tmpOut <- sapply(colsPrefixes,f) HTH, Eric On Tue, Aug 6, 2019 at 3:34 PM Shawn Way <SWay at meco.com<mailto:SWay at meco.com>> wrote: I have a tibble that has a large number of variables and because I'm partial lazy (and I really want to know how to do this), I would like find out if it possible to partial column matching with the mutate function in the tidyverse. I have a tibble with the following> gross_test <- gross_df %>%+ select(Job,Mfg_Labor_Hrs_Planned,Mfg_Labor_Hrs_Actual,Eng_Labor_Hrs_Planned,Eng_Labor_Hrs_Actual)> gross_test# A tibble: 6 x 5 Job Mfg_Labor_Hrs_Pla~ Mfg_Labor_Hrs_Ac~ Eng_Labor_Hrs_Pl~ Eng_Labor_Hrs_Ac~ * <dbl> <dbl> <dbl> <dbl> <dbl> 1 9892 950. 1082. 133 302. 2 9893 950. 1082. 133 302. 3 9652 950. 1082. 133 302. 4 9894 950. 1082. 133 302. 5 9652 950. 1082. 133 302. 6 9894 950. 1082. 133 302.>The column names follow a pattern of (.+)_Planned and (.+)_Actual and there are a large number of them. What I would like to do is the following: mutate(Mfg_Labor_Hrs_Diff= Mfg_Labor_Hrs_Actual- Mfg_Labor_Hrs_Planned) using a regex. Something like the following mutate( $1_Diff = $1_Actual - $1_Planned) Where $1 was Mfg_Labor_Hrs. This would iterate over the entire set of columns doing Mfg_Labor_Hrs and then Eng_Labor_Hrs, etc. Is this even possible or will I need to explicitly write out the mutate for each combination? Thanks for looking at this. Shawn Way, P.E. ______________________________________________ R-help at r-project.org<mailto:R-help at r-project.org> mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]]