Dear r-users,

I have this data:

structure(list(STUDENT_ID = structure(c(1L, 1L, 1L, 1L, 1L, 1L,
2L, 2L, 2L, 2L, 2L), .Label = c("AA15285", "AA15286"), class = "factor"),
    COURSE_CODE = structure(c(1L, 2L, 5L, 6L, 7L, 8L, 2L, 3L,
    4L, 5L, 6L), .Label = c("BAA1113", "BAA1322", "BAA2113",
    "BAA2513", "BAA2713", "BAA2921", "BAA4273", "BAA4513"), class = "factor"),
    PO1M = c(155.7, 48.9, 83.2, NA, NA, NA, 48.05, 68.4, 41.65,
    82.35, NA), PO1T = c(180, 70, 100, NA, NA, NA, 70, 100, 60,
    100, NA), PO2M = c(NA, NA, NA, 37, NA, NA, NA, NA, NA, NA,
    41), PO2T = c(NA, NA, NA, 50, NA, NA, NA, NA, NA, NA, 50),
    X = c(NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA), X.1 = c(NA,
    NA, NA, NA, NA, NA, NA, NA, NA, NA, NA)), .Names = c("STUDENT_ID",
"COURSE_CODE", "PO1M", "PO1T", "PO2M", "PO2T", "X", "X.1"), class = "data.frame", row.names = c(NA,
-11L))

I want to combine the rows with the same Student ID and add up all the values for PO1M,
PO1T,...,PO2T obtained by the same ID.

How do I do that?
Thank you for any help given.

-- 
Roslinazairimah Zakaria
Tel: +609-5492370; Fax No.: +609-5492766
Email: roslinazairimah at ump.edu.my; roslinaump at gmail.com
Faculty of Industrial Sciences & Technology
University Malaysia Pahang
Lebuhraya Tun Razak, 26300 Gambang, Pahang, Malaysia

On 10/12/2018 12:12 AM, roslinazairimah zakaria wrote:
> I want to combine the same Student ID and add up all the values for PO1M,
> PO1T,...,PO2T obtained by the same ID.
>
> How do I do that?
> Thank you for any help given.

dat <- structure(list(STUDENT_ID = structure(c(1L, 1L, 1L, 1L, 1L, 1L,
2L, 2L, 2L, 2L, 2L), .Label = c("AA15285", "AA15286"), class = "factor"),
    COURSE_CODE = structure(c(1L, 2L, 5L, 6L, 7L, 8L, 2L, 3L,
    4L, 5L, 6L), .Label = c("BAA1113", "BAA1322", "BAA2113",
    "BAA2513", "BAA2713", "BAA2921", "BAA4273", "BAA4513"), class = "factor"),
    PO1M = c(155.7, 48.9, 83.2, NA, NA, NA, 48.05, 68.4, 41.65,
    82.35, NA), PO1T = c(180, 70, 100, NA, NA, NA, 70, 100, 60,
    100, NA), PO2M = c(NA, NA, NA, 37, NA, NA, NA, NA, NA, NA,
    41), PO2T = c(NA, NA, NA, 50, NA, NA, NA, NA, NA, NA, 50),
    X = c(NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA), X.1 = c(NA,
    NA, NA, NA, NA, NA, NA, NA, NA, NA, NA)), .Names = c("STUDENT_ID",
"COURSE_CODE", "PO1M", "PO1T", "PO2M", "PO2T", "X", "X.1"), class = "data.frame", row.names = c(NA,
-11L))

# I assume you would like to add up the values with na.rm = TRUE
sumFn <- function(x) sum(x, na.rm = TRUE)

# see ?aggregate
aggregate(dat[, c("PO1M", "PO1T", "PO2M", "PO2T")],
          by = dat["STUDENT_ID"],
          FUN = sumFn)

# if you have largish or large data
library(data.table)
dat2 <- as.data.table(dat)
dat2[, lapply(.SD, sumFn),
     by = STUDENT_ID,
     .SDcols = c("PO1M", "PO1T", "PO2M", "PO2T")]

Regards,
Denes

Hi Denes,

It works perfectly as I want!
Thanks a lot.

On Fri, Oct 12, 2018 at 6:29 AM Dénes Tóth <toth.denes at kogentum.hu> wrote:
> # I assume you would like to add up the values with na.rm = TRUE
> [...]
> Regards,
> Denes

-- 
Roslinazairimah Zakaria
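
The aggregate() approach from Denes's reply can also be written with the formula interface. The following is a minimal sketch, assuming a compact re-creation of the posted data (COURSE_CODE and the empty X columns are left out, and the name `res` is only illustrative). The one non-obvious part is `na.action = na.pass`: with the default `na.omit`, every row of this data set would be dropped, because each row is NA in either the PO1 or the PO2 columns.

```r
# Compact re-creation of the posted data (an assumption for illustration;
# COURSE_CODE and the all-NA columns X and X.1 are omitted).
dat <- data.frame(
  STUDENT_ID = factor(rep(c("AA15285", "AA15286"), times = c(6, 5))),
  PO1M = c(155.7, 48.9, 83.2, NA, NA, NA, 48.05, 68.4, 41.65, 82.35, NA),
  PO1T = c(180, 70, 100, NA, NA, NA, 70, 100, 60, 100, NA),
  PO2M = c(NA, NA, NA, 37, NA, NA, NA, NA, NA, NA, 41),
  PO2T = c(NA, NA, NA, 50, NA, NA, NA, NA, NA, NA, 50)
)

# cbind() on the left-hand side aggregates several columns at once.
# na.rm = TRUE is passed on to sum(); na.action = na.pass keeps rows with NAs.
res <- aggregate(cbind(PO1M, PO1T, PO2M, PO2T) ~ STUDENT_ID,
                 data = dat, FUN = sum, na.rm = TRUE, na.action = na.pass)
print(res)
```

Note that sum(x, na.rm = TRUE) returns 0 for a group whose values are all NA, so a 0 in the result may mean "no marks recorded" rather than a true zero.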

> I want to combine the same Student ID and add up all the values for PO1M,
> PO1T,...,PO2T obtained by the same ID.
>
> How do I do that?
> Thank you for any help given

# load data
# Enter dataframe by hand
dat <- structure(list(STUDENT_ID = structure(c(1L, 1L, 1L, 1L, 1L, 1L,
2L, 2L, 2L, 2L, 2L), .Label = c("AA15285", "AA15286"), class = "factor"),
    COURSE_CODE = structure(c(1L, 2L, 5L, 6L, 7L, 8L, 2L, 3L,
    4L, 5L, 6L), .Label = c("BAA1113", "BAA1322", "BAA2113",
    "BAA2513", "BAA2713", "BAA2921", "BAA4273", "BAA4513"), class = "factor"),
    PO1M = c(155.7, 48.9, 83.2, NA, NA, NA, 48.05, 68.4, 41.65,
    82.35, NA), PO1T = c(180, 70, 100, NA, NA, NA, 70, 100, 60,
    100, NA), PO2M = c(NA, NA, NA, 37, NA, NA, NA, NA, NA, NA,
    41), PO2T = c(NA, NA, NA, 50, NA, NA, NA, NA, NA, NA, 50),
    X = c(NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA), X.1 = c(NA,
    NA, NA, NA, NA, NA, NA, NA, NA, NA, NA)), .Names = c("STUDENT_ID",
"COURSE_CODE", "PO1M", "PO1T", "PO2M", "PO2T", "X", "X.1"), class = "data.frame", row.names = c(NA,
-11L))

# Create sums by student ID
library(dplyr)
dat %>%
  group_by(STUDENT_ID) %>%
  summarize(sum.PO1M = sum(PO1M, na.rm = TRUE),
            sum.PO1T = sum(PO1M, na.rm = TRUE),
            sum.PO2M = sum(PO1M, na.rm = TRUE),
            sum.PO2T = sum(PO1M, na.rm = TRUE))

On 10/11/2018 5:12 PM, roslinazairimah zakaria wrote:
> I want to combine the same Student ID and add up all the values for PO1M,
> PO1T,...,PO2T obtained by the same ID.
>
> How do I do that?
> Thank you for any help given.

oops! Forgot to clean up after my cut and paste. Solution with dplyr looks like this:

# Create sums by student ID
library(dplyr)
dat %>%
  group_by(STUDENT_ID) %>%
  summarize(sum.PO1M = sum(PO1M, na.rm = TRUE),
            sum.PO1T = sum(PO1T, na.rm = TRUE),
            sum.PO2M = sum(PO2M, na.rm = TRUE),
            sum.PO2T = sum(PO2T, na.rm = TRUE))
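
Writing one sum() per column by hand is where the cut-and-paste slip crept in. A possible way to avoid repeating the column names is across(), sketched below under two assumptions: dplyr 1.0.0 or later is installed (across() does not exist in earlier versions), and the data is a compact re-creation of the posted data frame. The `.names` pattern is chosen only to reproduce the sum.* column names used above.

```r
library(dplyr)

# Compact re-creation of the posted data (an assumption for illustration;
# COURSE_CODE and the all-NA columns X and X.1 are omitted).
dat <- data.frame(
  STUDENT_ID = factor(rep(c("AA15285", "AA15286"), times = c(6, 5))),
  PO1M = c(155.7, 48.9, 83.2, NA, NA, NA, 48.05, 68.4, 41.65, 82.35, NA),
  PO1T = c(180, 70, 100, NA, NA, NA, 70, 100, 60, 100, NA),
  PO2M = c(NA, NA, NA, 37, NA, NA, NA, NA, NA, NA, 41),
  PO2T = c(NA, NA, NA, 50, NA, NA, NA, NA, NA, NA, 50)
)

# Apply the same NA-ignoring sum to every column from PO1M through PO2T,
# naming each result column "sum.<original name>".
res <- dat %>%
  group_by(STUDENT_ID) %>%
  summarise(across(PO1M:PO2T, ~ sum(.x, na.rm = TRUE), .names = "sum.{.col}"))
print(res)
```

Because the column range is stated once, adding further PO columns to the data would not require touching the summarise() call, as long as they fall inside the selected range.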

Here is a base R solution:

"dat" is the data frame as in Robert's solution.

> aggregate(dat[,3:6], by= dat[1], FUN = sum, na.rm = TRUE)
  STUDENT_ID   PO1M PO1T PO2M PO2T
1    AA15285 287.80  350   37   50
2    AA15286 240.45  330   41   50

Cheers,
Bert

Bert Gunter

"The trouble with having an open mind is that people keep coming along and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )

On Mon, Oct 15, 2018 at 6:42 PM Robert Baer <rbaer at atsu.edu> wrote:
> oops! Forgot to clean up after my cut and paste. Solution with dplyr
> looks like this:
> [...]
>
> ______________________________________________
> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
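
Another base R option for grouped sums is rowsum(), which is specialised for exactly this operation. The sketch below assumes a compact re-creation of the posted data; unlike aggregate(), rowsum() returns the group labels as row names rather than as a STUDENT_ID column.

```r
# Compact re-creation of the posted data (an assumption for illustration;
# COURSE_CODE and the all-NA columns X and X.1 are omitted).
dat <- data.frame(
  STUDENT_ID = factor(rep(c("AA15285", "AA15286"), times = c(6, 5))),
  PO1M = c(155.7, 48.9, 83.2, NA, NA, NA, 48.05, 68.4, 41.65, 82.35, NA),
  PO1T = c(180, 70, 100, NA, NA, NA, 70, 100, 60, 100, NA),
  PO2M = c(NA, NA, NA, 37, NA, NA, NA, NA, NA, NA, 41),
  PO2T = c(NA, NA, NA, 50, NA, NA, NA, NA, NA, NA, 50)
)

# Sum the four numeric columns within each student, ignoring NAs.
res <- rowsum(dat[, c("PO1M", "PO1T", "PO2M", "PO2T")],
              group = dat$STUDENT_ID, na.rm = TRUE)
print(res)
```

If the ID is needed as an ordinary column afterwards, something like `data.frame(STUDENT_ID = rownames(res), res, row.names = NULL)` would restore it.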