Hi all,

I have some data to be screened based on the recording flag (obs). Some families recorded properly (1) and others did not (0).

0 = improper
1 = proper

The recording period starts in week 1. Not all families start recording observations properly in the same week.

DF2 <- read.table(header=TRUE, text='family time obs
A WEEK1 0
A WEEK1 0
A WEEK1 0
A WEEK2 1
A WEEK2 0
A WEEK3 1
A WEEK3 0
B WEEK1 1
B WEEK1 0
B WEEK1 1
B WEEK2 0
B WEEK2 0
B WEEK3 1
B WEEK3 0
C WEEK3 0
C WEEK3 0
C WEEK4 1
C WEEK4 1')

For example, in week 1 all records of family "A" are 0 (improper), but starting in week 2 they have proper (1) records as well.

Then I create a table that shows the ratio of proper records to total records for each family within each week. If the ratio is zero and there are no prior proper recordings for that family, then I want to delete those records. However, once a family has started showing proper records ("1"), I want to keep its records even if the ratio in a subsequent week is 0 (for example, the week 2 records for family B).

Here is the summary table:

     WEEK1  WEEK2  WEEK3  WEEK4
A    0      0.5    0.5    .
B    0.67   0      0.5    .
C    .      .      0      1

From the above table:

For A - I want to exclude all records of week 1 and keep the rest, because they were not recording properly.
For B - Keep all records, as they started recording properly from the beginning.
For C - Keep only the week 4 records, because all of those records are 1's.

The final, desired result will be

A WEEK2 1
A WEEK2 0
A WEEK3 1
A WEEK3 0
B WEEK1 1
B WEEK1 0
B WEEK1 1
B WEEK2 0
B WEEK2 0
B WEEK3 1
B WEEK3 0
C WEEK4 1
C WEEK4 1

and the summary table then looks as follows:

     WEEK1  WEEK2  WEEK3  WEEK4
A    .      0.5    0.5    .
B    0.67   0      0.5    .
C    .      .      .      1

Thank you in advance

Something like this?

> DF2.agg <- aggregate(DF2$obs, DF2[, c("family", "time")], mean)
> DF2.tbl <- xtabs(x~family+time, DF2.agg)
> DF2.tbl
      time
family WEEK1 WEEK2 WEEK3 WEEK4
     A  0.00  0.50  0.50  0.00
     B  0.67  0.00  0.50  0.00
     C  0.00  0.00  0.00  1.00

You can get closer to the output in your example with this:

> suppressWarnings(as.table(formatC(DF2.tbl, digits=2, width=4, zero.print=".")))
      time
family WEEK1 WEEK2 WEEK3 WEEK4
     A .     0.5   0.5   .
     B 0.67  .     0.5   .
     C .     .     .     1

-------------------------------------
David L Carlson
Department of Anthropology
Texas A&M University
College Station, TX 77840-4352

-----Original Message-----
From: R-help [mailto:r-help-bounces at r-project.org] On Behalf Of Val
Sent: Wednesday, March 15, 2017 5:41 PM
To: r-help at R-project.org (r-help at r-project.org) <r-help at r-project.org>
Subject: [R] screen

______________________________________________
R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
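
One point about the xtabs() table above: it prints 0 both for a family/week pair with no records at all (C in WEEK1) and for a pair whose records are all improper (A in WEEK1). If that distinction matters, a minimal sketch using base R's tapply(), which leaves NA where a pair has no records, might look like the following. The name `ratio` is only illustrative and does not appear elsewhere in the thread.

```r
# Same example data as in the original post
DF2 <- read.table(header = TRUE, text = '
family time obs
A WEEK1 0
A WEEK1 0
A WEEK1 0
A WEEK2 1
A WEEK2 0
A WEEK3 1
A WEEK3 0
B WEEK1 1
B WEEK1 0
B WEEK1 1
B WEEK2 0
B WEEK2 0
B WEEK3 1
B WEEK3 0
C WEEK3 0
C WEEK3 0
C WEEK4 1
C WEEK4 1')

# Mean of obs per family and week; pairs without any records become NA
# instead of 0 (`ratio` is a hypothetical name used for this sketch)
ratio <- with(DF2, tapply(obs, list(family = family, time = time), mean))

# Show NA cells as "." as in the summary table of the original post
print(round(ratio, 2), na.print = ".")
```

Here a true zero ratio stays 0 and only the missing combinations are shown as ".", which seems closer to the first summary table in the question.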

Sorry. I focused on the table and not the record selection. Given the table, this seems to be what you are looking for, but there may be an easier way:

> keep <- which(t(apply(DF2.tbl, 1, cumsum)) > .000001, arr.ind=TRUE)
> keep <- keep[order(keep[, 1], keep[, 2]), ]
> keep # These are the records you want to keep
  family time
A      1    2
A      1    3
A      1    4
B      2    1
B      2    2
B      2    3
B      2    4
C      3    4

# Now turn keep into a data.frame with factors: family and time
# so it matches DF2
> rownames(keep) <- NULL
> keep <- data.frame(keep)
> keep$family <- factor(keep$family, labels=levels(DF2$family))
> keep$time <- factor(keep$time, labels=levels(DF2$time))
> keep
  family  time
1      A WEEK2
2      A WEEK3
3      A WEEK4
4      B WEEK1
5      B WEEK2
6      B WEEK3
7      B WEEK4
8      C WEEK4

> DF2.new <- merge(DF2, keep)
> DF2.new
   family  time obs
1       A WEEK2   0
2       A WEEK2   1
3       A WEEK3   1
4       A WEEK3   0
5       B WEEK1   0
6       B WEEK1   1
7       B WEEK1   1
8       B WEEK2   0
9       B WEEK2   0
10      B WEEK3   1
11      B WEEK3   0
12      C WEEK4   1
13      C WEEK4   1

-------------------------------------
David L Carlson
Department of Anthropology
Texas A&M University
College Station, TX 77840-4352

-----Original Message-----
From: R-help [mailto:r-help-bounces at r-project.org] On Behalf Of David L Carlson
Sent: Thursday, March 16, 2017 9:01 AM
To: Val <valkremk at gmail.com>; r-help at R-project.org (r-help at r-project.org) <r-help at r-project.org>
Subject: Re: [R] screen
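
The same selection can also be sketched directly on the rows, without building the table first. This is an alternative to the cumsum()/merge() steps above, not the method posted in the thread, and it rests on two assumptions: the week labels always have the form WEEK<number>, and a family with no proper record at all should lose every row. The names `week`, `proper_week` and `first_proper` are illustrative. It may also be worth knowing that since R 4.0.0 read.table() no longer creates factors by default, so levels(DF2$family) is NULL there unless stringsAsFactors=TRUE is passed; the sketch below does not rely on factors.

```r
# Same example data as in the original post, read as plain character columns
DF2 <- read.table(header = TRUE, stringsAsFactors = FALSE, text = '
family time obs
A WEEK1 0
A WEEK1 0
A WEEK1 0
A WEEK2 1
A WEEK2 0
A WEEK3 1
A WEEK3 0
B WEEK1 1
B WEEK1 0
B WEEK1 1
B WEEK2 0
B WEEK2 0
B WEEK3 1
B WEEK3 0
C WEEK3 0
C WEEK3 0
C WEEK4 1
C WEEK4 1')

# Numeric week index (assumes labels look like "WEEK<number>"),
# so that WEEK10 would be ordered after WEEK2
week <- as.integer(sub("^WEEK", "", DF2$time))

# Week number for proper records, Inf for improper ones
proper_week <- ifelse(DF2$obs == 1, week, Inf)

# First week with a proper record in each family, repeated on every row;
# a family that never records properly gets Inf and loses all its rows
first_proper <- ave(proper_week, DF2$family, FUN = min)

# Keep every record from the first proper week onwards
DF2.new <- DF2[week >= first_proper, ]
rownames(DF2.new) <- NULL
DF2.new
```

With the example data this keeps the same 13 records as the merge() result above, in the original row order. Comparing numeric week indices instead of factor levels avoids relying on the alphabetical order of the week labels.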