Colleagues,

Here is my dataset.

Serial  Measurement  Meas_test  Serial_test
1       17           fail       fail
1       16           pass       fail
2       12           pass       pass
2       8            pass       pass
2       10           pass       pass
3       19           fail       fail
3       13           pass       pass

If a measurement is less than or equal to 16, then Meas_test is pass. Else Meas_test is fail. This is easy to code.

Serial_test is a pass when all of the Meas_test are pass for a given serial. Else Serial_test is a fail. I'm at a loss to figure out how to do this in R.

Some guidance would be appreciated.

All the best,
Thomas Subia

On Sat, 21 Mar 2020 20:01:30 -0700
Thomas Subia via R-help <r-help at r-project.org> wrote:

> Serial_test is a pass, when all of the Meas_test are pass for a given
> serial. Else Serial_test is a fail.

Use by/tapply in base R or dplyr::group_by if you prefer tidyverse packages.

--
Best regards,
Ivan
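
For readers who want to see the tapply suggestion spelled out, here is a minimal sketch. It assumes the data frame is called d (a name chosen here for illustration) and holds the first two columns of the posted data; it is one possible reading of the hint, not necessarily what was intended.

```r
# Hypothetical data frame built from the first two columns of the posted data
d <- data.frame(Serial = c(1, 1, 2, 2, 2, 3, 3),
                Measurement = c(17, 16, 12, 8, 10, 19, 13))

# Per-row test: pass when the measurement is at most 16
d$Meas_test <- ifelse(d$Measurement <= 16, "pass", "fail")

# One logical per serial: TRUE only if every row of that serial passes
ok <- tapply(d$Meas_test == "pass", d$Serial, all)

# Look each row's serial up in the per-serial result (names of ok are the serials)
d$Serial_test <- unname(ifelse(ok[as.character(d$Serial)], "pass", "fail"))

print(d)
```

The lookup by as.character(d$Serial) works because tapply names its result by the grouping values.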

Here's a very "step by step" example with dplyr, as I'm trying to teach myself the Tidyverse way of being.

library(dplyr)

# Serial Measurement Meas_test Serial_test
# 1      17          fail      fail
# 1      16          pass      fail
# 2      12          pass      pass
# 2      8           pass      pass
# 2      10          pass      pass
# 3      19          fail      fail
# 3      13          pass      pass

dat <- as.data.frame(list(Serial = c(1, 1, 2, 2, 2, 3, 3),
                          Measurement = c(17, 16, 12, 8, 10, 19, 13),
                          Meas_test = c("fail", "pass", "pass", "pass", "pass", "fail", "pass")))

# Count the failures per serial, then turn "any failure" into a pass/fail factor
dat %>%
  group_by(Serial) %>%
  summarise(Serial_test = sum(Meas_test == "fail")) %>%
  mutate(Serial_test = if_else(Serial_test > 0, 1, 0),
         Serial_test = factor(Serial_test,
                              levels = 0:1,
                              labels = c("pass", "fail"))) -> groupedDat

# Attach the per-serial result to every row
dat %>%
  left_join(groupedDat, by = "Serial") # add -> dat to the end to pipe to dat

Gives:

  Serial Measurement Meas_test Serial_test
1      1          17      fail        fail
2      1          16      pass        fail
3      2          12      pass        pass
4      2           8      pass        pass
5      2          10      pass        pass
6      3          19      fail        fail
7      3          13      pass        fail

It would be easier for us if you used dput() to share your data, but thanks for the minimal example!

Chris

--
Chris Evans <chris at psyctc.org>
Visiting Professor, University of Sheffield <chris.evans at sheffield.ac.uk>
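
The summarise-then-join steps above can probably be collapsed into a single grouped mutate. The following is a sketch under that assumption, not the method posted above; it also computes Meas_test from Measurement instead of typing it in, and the object name result is made up for illustration.

```r
library(dplyr)

# Hypothetical input: only the two raw columns of the posted data
dat <- data.frame(Serial = c(1, 1, 2, 2, 2, 3, 3),
                  Measurement = c(17, 16, 12, 8, 10, 19, 13))

result <- dat %>%
  # Per-row test: pass when the measurement is at most 16
  mutate(Meas_test = if_else(Measurement <= 16, "pass", "fail")) %>%
  group_by(Serial) %>%
  # Within each serial, all() yields one value that mutate recycles to every row
  mutate(Serial_test = if_else(all(Meas_test == "pass"), "pass", "fail")) %>%
  ungroup()

print(result)
```

Because mutate keeps every row, no join is needed; the trade-off is that there is no separate one-row-per-serial table like groupedDat.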

On 22/03/20 4:01 pm, Thomas Subia via R-help wrote:

> If a measurement is less than or equal to 16, then Meas_test is pass. Else
> Meas_test is fail
> This is easy to code.
>
> Serial_test is a pass, when all of the Meas_test are pass for a given
> serial. Else Serial_test is a fail.
> I'm at a loss to figure out how to do this in R.
>
> Some guidance would be appreciated.

In future, please present your data using dput(); it makes life much easier for those trying to help you. Your data are really the first two columns of what you presented; the last two columns are your desired output.

Let "X" be these first two columns. Define

foo <- function (X) {
    a <- with(X, Measurement <= 16)
    a <- ifelse(a, "pass", "fail")
    b <- with(X, tapply(Measurement, Serial, function(x){all(x <= 16)}))
    i <- match(X$Serial, names(b))
    b <- ifelse(b[i], "pass", "fail")
    data.frame(Meas_test = a, Serial_test = b)
}

foo(X) gives:

  Meas_test Serial_test
1      fail        fail
2      pass        fail
3      pass        pass
4      pass        pass
5      pass        pass
6      fail        fail
7      pass        fail

If you want input and output combined, in the way that you presented your data, use cbind(X, foo(X)).

cheers,

Rolf Turner

--
Honorary Research Fellow
Department of Statistics
University of Auckland
Phone: +64-9-373-7599 ext. 88276
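
Since dput() is requested more than once in this thread, here is a short sketch of what that looks like in practice. The data frame name X follows the message above; the names txt and X2 are made up here to show that the printed text rebuilds the same object.

```r
# The first two columns of the posted data, as suggested above
X <- data.frame(Serial = c(1, 1, 2, 2, 2, 3, 3),
                Measurement = c(17, 16, 12, 8, 10, 19, 13))

# Print an R expression that recreates X; this text is what you paste into an email
dput(X)

# Round trip: capture that text and evaluate it to rebuild the data frame
txt <- paste(capture.output(dput(X)), collapse = "\n")
X2 <- eval(parse(text = txt))

# Check that the rebuilt object matches the original
print(isTRUE(all.equal(X, X2)))
```

A reader can paste the dput() output straight into an R session, which avoids retyping a table and guessing column types.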

Another possible approach is to use split -> lapply -> rbind, which I often find to be conceptually simpler:

d <- data.frame(Serial = c(1, 1, 2, 2, 2, 3, 3),
                Measurement = c(17, 16, 12, 8, 10, 19, 13))

dlist <- split(d, d$Serial)
dlist <- lapply(dlist, within,
{
    Serial_test <- if (all(Measurement <= 16)) "pass" else "fail"
    Meas_test <- ifelse(Measurement <= 16, "pass", "fail")
})
do.call(rbind, dlist)

-Deepayan
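
One more base R variant, offered as a sketch and not taken from any of the replies: ave() applies a function within groups and returns a vector as long as its input, so neither the match() lookup nor the split/rbind step is needed. The column name Serial_ok is made up for illustration.

```r
d <- data.frame(Serial = c(1, 1, 2, 2, 2, 3, 3),
                Measurement = c(17, 16, 12, 8, 10, 19, 13))

# Per-row test: pass when the measurement is at most 16
d$Meas_test <- ifelse(d$Measurement <= 16, "pass", "fail")

# ave() runs all() on each serial's rows and recycles the single result
# back to every row of that serial (Serial_ok is a hypothetical helper column)
d$Serial_ok <- ave(d$Measurement <= 16, d$Serial, FUN = all)
d$Serial_test <- ifelse(d$Serial_ok, "pass", "fail")

print(d)
```

Unlike do.call(rbind, dlist), this keeps the original row order and row names, which may matter if the serials are not already sorted.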