Colleagues,

Here is my dataset.

Serial  Measurement  Meas_test  Serial_test
1       17           fail       fail
1       16           pass       fail
2       12           pass       pass
2       8            pass       pass
2       10           pass       pass
3       19           fail       fail
3       13           pass       pass

If a measurement is less than or equal to 16, then Meas_test is pass. Else
Meas_test is fail.
This is easy to code.

Serial_test is a pass, when all of the Meas_test are pass for a given
serial. Else Serial_test is a fail.
I'm at a loss to figure out how to do this in R.

Some guidance would be appreciated.

All the best,
Thomas Subia
On Sat, 21 Mar 2020 20:01:30 -0700
Thomas Subia via R-help <r-help at r-project.org> wrote:

> Serial_test is a pass, when all of the Meas_test are pass for a given
> serial. Else Serial_test is a fail.

Use by/tapply in base R or dplyr::group_by if you prefer tidyverse
packages.

--
Best regards,
Ivan
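As a rough illustration of the tapply() route suggested above, here is a minimal sketch. It assumes the data frame is rebuilt by hand from the posted table (the name d and the helper variable serial_ok are made up for this example), and it is only one of several ways to do the lookup.

```r
# Assumed reconstruction of the posted data (first two columns only).
d <- data.frame(Serial = c(1, 1, 2, 2, 2, 3, 3),
                Measurement = c(17, 16, 12, 8, 10, 19, 13))

# Row-level test: pass when the measurement is at most 16.
d$Meas_test <- ifelse(d$Measurement <= 16, "pass", "fail")

# tapply() returns one logical per serial: TRUE when every measurement passes.
serial_ok <- tapply(d$Measurement <= 16, d$Serial, all)

# Look the per-serial result up by name to get one value per row;
# as.vector() drops the names and dim that the lookup carries along.
ok_by_row <- as.vector(serial_ok[as.character(d$Serial)])
d$Serial_test <- ifelse(ok_by_row, "pass", "fail")
d
```

Note that under the stated rule the last row of serial 3 comes out as "fail", because serial 3 contains a measurement of 19.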
Here's a very "step by step" example with dplyr, as I'm trying
to teach myself the Tidyverse way of being.
library(dplyr)
# Serial Measurement Meas_test Serial_test
# 1 17 fail fail
# 1 16 pass fail
# 2 12 pass pass
# 2 8 pass pass
# 2 10 pass pass
# 3 19 fail fail
# 3 13 pass pass
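dat <- as.data.frame(list(Serial = c(1, 1, 2, 2, 2, 3, 3),
                          Measurement = c(17, 16, 12, 8, 10, 19, 13),
                          Meas_test = c("fail", "pass",
                                        "pass", "pass", "pass", "fail",
                                        "pass")))
# Count the failures per serial, then turn the count into a pass/fail factor.
# The right assignment must sit on the same line as the end of the pipeline;
# a line that starts with -> is a syntax error.
dat %>%
  group_by(Serial) %>%
  summarise(Serial_test = sum(Meas_test == "fail")) %>%
  mutate(Serial_test = if_else(Serial_test > 0, 1, 0),
         Serial_test = factor(Serial_test,
                              levels = 0:1,
                              labels = c("pass", "fail"))) -> groupedDat
# Join the per-serial result back onto every row.
dat %>%
  left_join(groupedDat, by = "Serial") # add -> dat to the end to pipe to dat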
Gives:
  Serial Measurement Meas_test Serial_test
1      1          17      fail        fail
2      1          16      pass        fail
3      2          12      pass        pass
4      2           8      pass        pass
5      2          10      pass        pass
6      3          19      fail        fail
7      3          13      pass        fail
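For comparison, the summarise-then-join steps above can probably be collapsed into a single grouped mutate(). This is only a sketch under the same assumptions about the data; the object name result is invented here, and it returns character columns in a tibble rather than a factor in a plain data frame.

```r
library(dplyr)

# Assumed reconstruction of the first two columns of the posted data.
dat <- data.frame(Serial = c(1, 1, 2, 2, 2, 3, 3),
                  Measurement = c(17, 16, 12, 8, 10, 19, 13))

result <- dat %>%
  # Row-level test first.
  mutate(Meas_test = if_else(Measurement <= 16, "pass", "fail")) %>%
  group_by(Serial) %>%
  # Inside a grouped mutate(), all() sees one serial at a time and the
  # single "pass"/"fail" value is recycled over that serial's rows.
  mutate(Serial_test = if (all(Meas_test == "pass")) "pass" else "fail") %>%
  ungroup()
result
```

Keeping everything in one pipeline avoids the intermediate groupedDat object, at the cost of being less "step by step".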
It would be easier for us if you used dput() to share your data, but thanks
for the minimal example!
Chris
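For readers unfamiliar with the dput() request above, a minimal sketch of what it involves follows. The data frame d is an assumed reconstruction of the posted table, and the round trip through capture.output() is only there to show that the printed text recreates the object.

```r
# Assumed reconstruction of the first two columns of the posted data.
d <- data.frame(Serial = c(1, 1, 2, 2, 2, 3, 3),
                Measurement = c(17, 16, 12, 8, 10, 19, 13))

# dput() prints an R expression that recreates d; that text is what
# gets pasted into the email.
dput(d)

# Round trip: evaluating the dput() text should rebuild an equivalent object.
txt <- capture.output(dput(d))
d2 <- eval(parse(text = txt))
all.equal(d, d2)
```

The advantage over a pasted table is that column types and names survive exactly, so helpers do not have to retype the data.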
----- Original Message -----
> From: "Ivan Krylov" <krylov.r00t at gmail.com>
> To: "Thomas Subia via R-help" <r-help at r-project.org>
> Cc: "Thomas Subia" <tgs77m at yahoo.com>
> Sent: Sunday, 22 March, 2020 07:24:15
> Subject: Re: [R] Grouping Question
> On Sat, 21 Mar 2020 20:01:30 -0700
> Thomas Subia via R-help <r-help at r-project.org> wrote:
>
>> Serial_test is a pass, when all of the Meas_test are pass for a given
>> serial. Else Serial_test is a fail.
>
> Use by/tapply in base R or dplyr::group_by if you prefer tidyverse
> packages.
>
> --
> Best regards,
> Ivan
>
> ______________________________________________
> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
--
Chris Evans <chris at psyctc.org> Visiting Professor, University of
Sheffield <chris.evans at sheffield.ac.uk>
On 22/03/20 4:01 pm, Thomas Subia via R-help wrote:

> Colleagues,
>
> Here is my dataset.
>
> Serial Measurement Meas_test Serial_test
> 1 17 fail fail
> 1 16 pass fail
> 2 12 pass pass
> 2 8 pass pass
> 2 10 pass pass
> 3 19 fail fail
> 3 13 pass pass
>
> If a measurement is less than or equal to 16, then Meas_test is pass. Else
> Meas_test is fail
> This is easy to code.
>
> Serial_test is a pass, when all of the Meas_test are pass for a given
> serial. Else Serial_test is a fail.
> I'm at a loss to figure out how to do this in R.
>
> Some guidance would be appreciated.

In future, please present your data using dput(); makes life much easier
for those trying to help you. Your data are really the first two
columns of what you presented --- the last two columns are your desired
output.

Let "X" be these first two columns. Define

foo <- function (X) {
    a <- with(X,Measurement <= 16)
    a <- ifelse(a,"pass","fail")
    b <- with(X,tapply(Measurement,Serial,function(x){all(x<=16)}))
    i <- match(X$Serial,names(b))
    b <- ifelse(b[i],"pass","fail")
    data.frame(Meas_test=a,Serial_test=b)
}

foo(X) gives:

>   Meas_test Serial_test
> 1      fail        fail
> 2      pass        fail
> 3      pass        pass
> 4      pass        pass
> 5      pass        pass
> 6      fail        fail
> 7      pass        fail

If you want input and output combined, as in the way that you presented
your data use cbind(X,foo(X)).

cheers,

Rolf Turner

--
Honorary Research Fellow
Department of Statistics
University of Auckland
Phone: +64-9-373-7599 ext. 88276
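The tapply() plus match() steps in foo() compute one result per serial and then expand it back to the rows. As a possible shortcut, base R's ave() does both in one call. The sketch below is not Rolf's function; it is an alternative under the assumption that X holds the first two columns, with ok, serial_ok and out as made-up names.

```r
# Assumed reconstruction of "X", the first two columns of the posted data.
X <- data.frame(Serial = c(1, 1, 2, 2, 2, 3, 3),
                Measurement = c(17, 16, 12, 8, 10, 19, 13))

# Row-level logical: TRUE when the measurement passes.
ok <- X$Measurement <= 16

# ave() applies all() within each Serial and recycles the single result
# over that group's rows, so the output has one value per row of X.
serial_ok <- ave(ok, X$Serial, FUN = all)

out <- cbind(X,
             Meas_test = ifelse(ok, "pass", "fail"),
             Serial_test = ifelse(serial_ok, "pass", "fail"))
out
```

Because ave() returns a vector aligned with the input rows, no name matching is needed, which may be easier to read when the grouping variable is not sorted.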
Another possible approach is to use split -> lapply -> rbind, which I
often find to be conceptually simpler:
d <- data.frame(Serial = c(1, 1, 2, 2, 2, 3, 3),
Measurement = c(17, 16, 12, 8, 10, 19, 13))
dlist <- split(d, d$Serial)
dlist <- lapply(dlist, within,
{
Serial_test <- if (all(Measurement <= 16)) "pass" else
"fail"
Meas_test <- ifelse(Measurement <= 16, "pass",
"fail")
})
do.call(rbind, dlist)
-Deepayan
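One detail of the split -> lapply -> rbind pattern above that can surprise people: do.call(rbind, dlist) builds row names from the list names plus the original row names. The sketch below repeats the same steps on the same assumed data and then resets the row names; the name res is invented for this example. As far as I know, within() adds newly created columns in reverse order of assignment, which would explain why Serial_test is assigned before Meas_test, but treat that as an assumption to check against ?within.

```r
d <- data.frame(Serial = c(1, 1, 2, 2, 2, 3, 3),
                Measurement = c(17, 16, 12, 8, 10, 19, 13))

# One data frame per serial.
dlist <- split(d, d$Serial)

# Add both test columns inside each piece.
dlist <- lapply(dlist, within, {
    Serial_test <- if (all(Measurement <= 16)) "pass" else "fail"
    Meas_test <- ifelse(Measurement <= 16, "pass", "fail")
})

# Stack the pieces back together, then drop the group-prefixed row names
# so the rows are simply numbered 1 to 7 again.
res <- do.call(rbind, dlist)
rownames(res) <- NULL
res
```

Note that the stacked result is ordered by serial, so if the input were not already sorted, the rows would come back in a different order than they went in.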
On Sun, Mar 22, 2020 at 12:29 PM Rolf Turner <r.turner at auckland.ac.nz> wrote:
>
> In future, please present your data using dput(); makes life much easier
> for those trying to help you. Your data are really the first two
> columns of what you presented --- the last two columns are your desired
> output.
>
> Let "X" be these first two columns. Define
>
> foo <- function (X) {
> a <- with(X,Measurement <= 16)
> a <- ifelse(a,"pass","fail")
> b <- with(X,tapply(Measurement,Serial,function(x){all(x<=16)}))
> i <- match(X$Serial,names(b))
> b <- ifelse(b[i],"pass","fail")
> data.frame(Meas_test=a,Serial_test=b)
> }
>
> foo(X) gives:
>
> > Meas_test Serial_test
> > 1 fail fail
> > 2 pass fail
> > 3 pass pass
> > 4 pass pass
> > 5 pass pass
> > 6 fail fail
> > 7 pass fail
>
> If you want input and output combined, as in the way that you presented
> your data use cbind(X,foo(X)).
>
> cheers,
>
> Rolf Turner
>
> --
> Honorary Research Fellow
> Department of Statistics
> University of Auckland
> Phone: +64-9-373-7599 ext. 88276