Muhuri, Pradip (SAMHSA/CBHSQ)
2014-Dec-01 01:45 UTC
[R] R dplyr solution vs. Base R solution for the select column total
Hi Boris,

Sorry for not being explicit when replying to your first email. I wanted to say it does not work when row-binding. I want the following output.

Thanks,
Pradip

1      1  3
2      2  4
Total     7

################### Below is the console ##########
> test <- data.frame(first=c(1,2), second=c(3,4))
> test
  first second
1     1      3
2     2      4
>
> sum(test$second)
[1] 7
>
> rbind(test, sum(test$second))
  first second
1     1      3
2     2      4
3     7      7

Pradip K. Muhuri, PhD
SAMHSA/CBHSQ
1 Choke Cherry Road, Room 2-1071
Rockville, MD 20857
Tel: 240-276-1070
Fax: 240-276-1260

-----Original Message-----
From: Boris Steipe [mailto:boris.steipe at utoronto.ca]
Sent: Sunday, November 30, 2014 5:51 PM
To: Muhuri, Pradip (SAMHSA/CBHSQ)
Cc: r-help at r-project.org
Subject: Re: [R] R dplyr solution vs. Base R solution for the select column total

No it doesn't ... consider:

test <- data.frame(first=c(1,2), second=c(3,4))
test
  first second
1     1      3
2     2      4

sum(test$second)
[1] 7

On Nov 30, 2014, at 3:48 PM, Muhuri, Pradip (SAMHSA/CBHSQ) <Pradip.Muhuri at samhsa.hhs.gov> wrote:

> Hi Boris,
>
> That gives me the total for each of the 6 columns of the data frame. I want the column sum just for the last column.
>
> Thanks,
>
> Pradip Muhuri
>
> -----Original Message-----
> From: Boris Steipe [mailto:boris.steipe at utoronto.ca]
> Sent: Sunday, November 30, 2014 12:50 PM
> To: Muhuri, Pradip (SAMHSA/CBHSQ)
> Cc: r-help at r-project.org
> Subject: Re: [R] R dplyr solution vs. Base R solution for the select column total
>
> try:
>
> sum(test$count)
>
> B.
>
> On Nov 30, 2014, at 12:01 PM, Muhuri, Pradip (SAMHSA/CBHSQ) <Pradip.Muhuri at samhsa.hhs.gov> wrote:
>
>> Hello,
>>
>> I am looking for a dplyr or base R solution for the column total - JUST FOR THE LAST COLUMN in the example below. The following code works, giving me the total for each column - This is not exactly what I want.
>> rbind(test, colSums(test))
>>
>> I only want the total for the very last column. I am struggling with this part of the code:
>> rbind(test, c("Total", colSums(test, ...)))
>> I have searched for a solution on Stack Overflow. I found some mutate() code for the cumsum but no luck for the select column total. Is there a dplyr solution for the select column total?
>>
>> Any hints will be appreciated.
>>
>> Thanks,
>>
>> Pradip Muhuri
>>
>> ####### The following is from the console - the R script with reproducible example is also appended.
>>
>>    mrjflag cocflag inhflag halflag oidflag count
>> 1        0       0       0       0       0   256
>> 2        0       0       0       1       1   256
>> 3        0       0       1       0       1   256
>> 4        0       0       1       1       1   256
>> 5        0       1       0       0       1   256
>> 6        0       1       0       1       1   256
>> 7        0       1       1       0       1   256
>> 8        0       1       1       1       1   256
>> 9        1       0       0       0       1   256
>> 10       1       0       0       1       1   256
>> 11       1       0       1       0       1   256
>> 12       1       0       1       1       1   256
>> 13       1       1       0       0       1   256
>> 14       1       1       0       1       1   256
>> 15       1       1       1       0       1   256
>> 16       1       1       1       1       1   256
>> 17       8       8       8       8      15  4096
>>
>> ####################### below is the reproducible example ########################
>> library(dplyr)
>> # generate data
>> dlist <- rep( list( 0:1 ), 4 )
>> data <- do.call(expand.grid, dlist)
>> data$id <- 1:nrow(data)
>> names(data) <- c('mrjflag', 'cocflag', 'inhflag', 'halflag')
>>
>> # mutate a column and then summarize
>> test <- data %>%
>>   mutate(oidflag= ifelse(mrjflag==1 | cocflag==1 | inhflag==1 | halflag==1, 1, 0)) %>%
>>   group_by(mrjflag,cocflag, inhflag, halflag, oidflag) %>%
>>   summarise(count=n()) %>%
>>   arrange(mrjflag,cocflag, inhflag, halflag, oidflag)
>>
>> # This works, giving me the total for each column - This is not what I exactly want.
>> rbind(test, colSums(test))
>>
>> # I only want the total for the very last column
>> rbind(test, c("Total", colSums(test, ...)))
>>
>> Pradip K. Muhuri, PhD
>> SAMHSA/CBHSQ
>> 1 Choke Cherry Road, Room 2-1071
>> Rockville, MD 20857
>> Tel: 240-276-1070
>> Fax: 240-276-1260
>>
>> [[alternative HTML version deleted]]
>>
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
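For the select-column total asked about in the original post, a minimal sketch follows. It assumes a small stand-in data frame (toy, a hypothetical name) instead of the real summary table, and a dplyr version that provides bind_rows(). It is one possible approach, not something proposed in the thread.

```r
library(dplyr)

# Hypothetical stand-in for the summarised table in the original post.
toy <- data.frame(flag = c(0, 1), count = c(256, 256))

# Base R: total of the last column only.
last_total <- sum(toy[[ncol(toy)]])

# dplyr: compute the total as its own one-row table, then stack it.
# bind_rows() fills the columns missing from the total row with NA.
with_total <- bind_rows(toy, summarise(toy, count = sum(count)))
```

The filler cells in the total row come out as NA, so every column keeps its numeric type.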
Jeff Newmiller
2014-Dec-01 01:52 UTC
[R] R dplyr solution vs. Base R solution for the select column total
Seems like you have what you want, unless you meant to show something different than you did show.

---------------------------------------------------------------------------
Jeff Newmiller                        The     .....       .....  Go Live...
DCN:<jdnewmil at dcn.davis.ca.us>        Basics: ##.#.       ##.#.  Live Go...
                                      Live:   OO#.. Dead: OO#..  Playing
Research Engineer (Solar/Batteries            O.O#.       #.O#.  with
/Software/Embedded Controllers)               .OO#.       .OO#.  rocks...1k
---------------------------------------------------------------------------
Sent from my phone. Please excuse my brevity.
Duncan Murdoch
2014-Dec-01 02:15 UTC
[R] R dplyr solution vs. Base R solution for the select column total
On 30/11/2014, 8:45 PM, Muhuri, Pradip (SAMHSA/CBHSQ) wrote:
> Hi Boris,
>
> Sorry for not being explicit when replying to your first email. I wanted to say it does not work when row-binding. I want the following output. Thanks, Pradip
>
> 1      1  3
> 2      2  4
> Total     7

You are mixing up the computation of results with the presentation of them. That's the spreadsheet way of thinking, and it's okay for simple things like this, but gets really bogged down when the computations get hard.

In R you can do it, and it's not too hard:

test <- data.frame(first=c(1,2), second=c(3,4))
total <- c("", sum(test$second))
rbind(test, Total=total)

but this isn't a really sensible thing to do: you can't work with that final result at all. It makes more sense to leave it in the original form, and then think about how you want to present it, and write a function that displays the result, with nice formatting, etc. That probably won't happen in the R console, you should be using Sweave or knitr or some other package for presentation of the results.

Duncan Murdoch
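One way to read the advice above is to keep the numeric data frame as it is and build a character copy only at display time. The sketch below uses a hypothetical helper, with_total_row(), and assumes that format() on a data frame returns character columns, as it does in current R. In a real report, Sweave or knitr would take the place of the print() call.

```r
test <- data.frame(first = c(1, 2), second = c(3, 4))

# Hypothetical helper: returns a character copy meant only for display.
with_total_row <- function(df, col) {
  shown <- format(df)                      # character copy of every column
  total <- rep("", ncol(df))               # blanks everywhere ...
  total[match(col, names(df))] <- format(sum(df[[col]]))  # ... except one cell
  rbind(shown, Total = total)              # named argument sets the row name
}

shown <- with_total_row(test, "second")
print(shown)                  # presentation only
col_mean <- mean(test$second) # the data itself is still numeric
```

Because the total row lives only in the display copy, later computations on test are unaffected.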
Muhuri, Pradip (SAMHSA/CBHSQ)
2014-Dec-01 02:29 UTC
[R] R dplyr solution vs. Base R solution for the select column total
Hi Duncan,

Thank you for sending your solution. Below is another way.

Pradip

> test <- data.frame(first=c(1,2), second=c(3,4))
> total <- c("", sum(test$second))
> rbind(test, Total=total)
      first second
1         1      3
2         2      4
Total            7
> rbind(test, c("Total", colSums(test[,2, drop=FALSE])))
  first second
1     1      3
2     2      4
3 Total      7
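The drop=FALSE in the second call matters because colSums() only accepts objects with at least two dimensions. A short sketch of the difference on the same toy data (the names one_col, col_total and vec_total are only illustrative):

```r
test <- data.frame(first = c(1, 2), second = c(3, 4))

# Without drop = FALSE, test[, 2] collapses to a plain vector,
# which colSums() rejects. Keeping it a one-column data frame works.
one_col <- test[, 2, drop = FALSE]
col_total <- colSums(one_col)   # named numeric vector, name "second"

# For a single column, sum() on the vector gives the same number.
vec_total <- sum(test[, 2])
```

For one column the two routes agree, so sum() is usually the simpler choice.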
Boris Steipe
2014-Dec-01 02:42 UTC
[R] R dplyr solution vs. Base R solution for the select column total
What do you think should be in the empty cells? Zero? NA? Empty strings? There can't just be nothing...

Here's an example with empty strings "" as the filler element - but do consider carefully what Duncan wrote.

test <- data.frame(first=c(1,2), second=c(3,4))
typeof(test[1,1])  # double

# rbind() a vector that repeats the "empty" element one-less-than-ncol() times,
# and has the column sum as its last element.
test <- rbind(test, c(rep("", ncol(test)-1), sum(test$second)))
test
  first second
1     1      3
2     2      4
3            7

# but...!
typeof(test[1,1])  # character!
typeof(test[2,2])  # also character!

By adding characters to your columns, you cast all of your data into character type! If you want to *do* anything with the number, you'll need to cast it back to numeric. Or use 0 or NA as the filler element.

# (run this on the original, all-numeric test)
test <- rbind(test, c(rep(NA, ncol(test)-1), sum(test$second)))

But anyway ... as others have said, you may want to reconsider the logic of your approach.

B.
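The two filler choices can be compared side by side. This is a sketch on the same toy data, with with_na and with_blank as illustrative names. The conversion step at the end is one way to cast back to numeric, not necessarily the one used on the real data set.

```r
test <- data.frame(first = c(1, 2), second = c(3, 4))

# NA as the filler: the bound vector is numeric, so the columns stay numeric.
with_na <- rbind(test, c(rep(NA, ncol(test) - 1), sum(test$second)))
typeof(with_na$second)   # "double"

# "" as the filler: the bound vector is character, so every column
# is converted to character by rbind().
with_blank <- rbind(test, c(rep("", ncol(test) - 1), sum(test$second)))
typeof(with_blank$second)   # "character"

# Cast each column back to numeric; the "" cells become NA.
with_blank[] <- lapply(with_blank, as.numeric)
```

After the conversion both data frames hold the same values, which suggests using NA from the start when the total row has to live in the data.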
Muhuri, Pradip (SAMHSA/CBHSQ)
2014-Dec-01 03:08 UTC
[R] R dplyr solution vs. Base R solution for the select column total
Hi Boris,

Excellent point. Yes, I want to convert it to the numeric type. Your code has worked out well on the real data set. The issue is resolved. Thanks so much for your help!

Pradip