Bruce Ratner PhD
2017-Mar-31 16:20 UTC
[R] Taking the sum of only some columns of a data frame
Hi R'ers:
Given a data.frame of five columns and ten rows. I would like to take the sum of, say, the first and third columns only. For the remaining columns, I do not want any calculations, thus rendering their "values" on the "total" row blank. The sum/total row is to be combined to the original data.frame, yielding a data.frame with five columns and eleven rows.

Thanks, in advance.
Bruce
______________
Bruce Ratner PhD
The Significant Statistician?
Doran, Harold
2017-Mar-31 16:33 UTC
[R] Taking the sum of only some columns of a data frame
I do not believe this can be done in one step

dat <- data.frame(matrix(rnorm(50), 5))
pos <- c(1,3)
res <- apply(dat[, pos], 2, sum)
x <- numeric(5)
x[pos] <- res
rbind(dat,x)

______________________________________________
R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Doran, Harold
2017-Mar-31 16:35 UTC
[R] Taking the sum of only some columns of a data frame
Apologies, my code below has an error that recycles the vector x. Hopefully, the concept is clear.

-----Original Message-----
From: R-help [mailto:r-help-bounces at r-project.org] On Behalf Of Doran, Harold
Sent: Friday, March 31, 2017 12:34 PM
To: 'Bruce Ratner PhD' <br at dmstat1.com>; r-help at r-project.org
Subject: Re: [R] Taking the sum of only some columns of a data frame

I do not believe this can be done in one step

dat <- data.frame(matrix(rnorm(50), 5))
pos <- c(1,3)
res <- apply(dat[, pos], 2, sum)
x <- numeric(5)
x[pos] <- res
rbind(dat,x)
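For reference, the recycling arises because matrix(rnorm(50), 5) yields 5 rows and 10 columns, so the length-5 vector x is reused across the 10 columns in rbind(). A minimal corrected sketch follows, assuming the intended shape is ten rows by five columns and that NA (rather than 0) should mark the columns that are not summed. The seed is an addition for reproducibility only and does not appear in the original post.

```r
set.seed(1)   # assumption: seed added only so the example is reproducible
dat <- data.frame(matrix(rnorm(50), nrow = 10, ncol = 5))  # 10 rows, 5 columns
pos <- c(1, 3)                      # columns to total
x <- rep(NA_real_, ncol(dat))       # one entry per column, NA where no sum is wanted
x[pos] <- colSums(dat[, pos])       # fill in the two sums
res <- rbind(dat, x)                # append the total row
dim(res)                            # 11 rows, 5 columns
```

Sizing x from ncol(dat) rather than a hard-coded 5 keeps the total row in step with the data frame, which is what avoids the recycling.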
Doran, Harold
2017-Mar-31 17:06 UTC
[R] Taking the sum of only some columns of a data frame
Let's keep r-list on the email per typical protocol. Apply is a function in base R, so you don't need to install it.

-----Original Message-----
From: Bruce Ratner PhD [mailto:br at dmstat1.com]
Sent: Friday, March 31, 2017 1:06 PM
To: Doran, Harold <HDoran at air.org>
Subject: Re: [R] Taking the sum of only some columns of a data frame

Hey Harold:
Thanks for quick reply. But, I can't install "apply." Is there anything you can suggest to get my install of apply on R 3.3.3, or a work around of your original answer?
Thanks, so much.
Bruce
______________
Bruce Ratner PhD
The Significant Statistician?

> On Mar 31, 2017, at 12:33 PM, Doran, Harold <HDoran at air.org> wrote:
>
> apply
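A quick way to confirm the point about base R is to ask R where apply is defined. The sketch below is illustrative only; the small matrix m and the name col_totals are made up for the example.

```r
# apply lives in the base package, which is attached in every R session
is.function(apply)
environmentName(environment(apply))   # "base"

m <- matrix(1:6, nrow = 2)            # columns are (1,2), (3,4), (5,6)
col_totals <- apply(m, 2, sum)        # 3 7 11
col_totals
```

Because apply ships with base R, no install.packages() call is involved; the function is available as soon as R starts.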
William Dunlap
2017-Mar-31 17:19 UTC
[R] Taking the sum of only some columns of a data frame
> dat <- data.frame(Group=LETTERS[1:5], X=1:5, Y=11:15)
> pos <- c(2,3)
> rbind(dat, Sum=lapply(seq_len(ncol(dat)), function(i) if (i %in% pos) sum(dat[,i]) else NA_real_))
    Group  X  Y
1       A  1 11
2       B  2 12
3       C  3 13
4       D  4 14
5       E  5 15
Sum  <NA> 15 65
> str(.Last.value)
'data.frame': 6 obs. of 3 variables:
 $ Group: Factor w/ 5 levels "A","B","C","D",..: 1 2 3 4 5 NA
 $ X    : int 1 2 3 4 5 15
 $ Y    : int 11 12 13 14 15 65

Bill Dunlap
TIBCO Software
wdunlap tibco.com
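The same idea can be wrapped in a small function. This is a sketch of a variation, not the code from the message above: add_total_row and its label argument are hypothetical names, and the all-NA template row is a swapped-in technique chosen so that every column keeps its own type.

```r
# Hypothetical helper (not from the thread): append a total row for the columns in pos
add_total_row <- function(dat, pos, label = "Sum") {
  total <- dat[NA_integer_, , drop = FALSE]   # one all-NA row with the same column types
  total[pos] <- lapply(dat[pos], sum)         # fill in sums for the selected columns
  rownames(total) <- label
  rbind(dat, total)
}

dat <- data.frame(Group = LETTERS[1:5], X = 1:5, Y = 11:15)
res <- add_total_row(dat, pos = c(2, 3))
res
```

Starting from an all-NA row means the non-summed columns get a missing value of the right type without any per-column branching.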
William Michels
2017-Mar-31 17:46 UTC
[R] Taking the sum of only some columns of a data frame
I'm sure there are more efficient ways, but this works:

> test1 <- matrix(runif(50), nrow=10, ncol=5)
> ## test1 <- as.data.frame(test1)
> test1 <- rbind(test1, NA)
> test1[11, c(1,3)] <- colSums(test1[1:10,c(1,3)])
> test1

HTH, Bill.

William Michels, Ph.D.
William Michels
2017-Mar-31 20:05 UTC
[R] Taking the sum of only some columns of a data frame
Again, you should always copy the R-help list on replies to your OP.

The short answer is you **shouldn't** replace NAs with blanks in your matrix or dataframe. NA is the proper designation for those cell positions. Replacing NA with a "blank" in a dataframe will convert that column to a "character" mode, precluding further numeric manipulation of those columns.

Consider your workflow: are you trying to export a table? If so, take a look at installing pander (see 'missing' argument on webpage below):

https://cran.r-project.org/web/packages/pander/README.html

Finally, please review the Introductory PDF, available here:

https://cran.r-project.org/doc/manuals/R-intro.pdf

HTH, Bill.

William Michels, Ph.D.

On Fri, Mar 31, 2017 at 11:21 AM, BR_email <br at dmstat1.com> wrote:
> William:
> How can I replace the "NAs" with blanks?
> Bruce
>
> Bruce Ratner, Ph.D.
> The Significant Statistician?
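The coercion described in this message is easy to see on a small vector, and base R can already write blanks for NA at export time. The sketch below is illustrative: the objects x, x_blank, totals and csv_lines are made-up names, and write.csv with na = "" stands in for whichever export function is actually used.

```r
x <- c(15, NA, 65)
x_blank <- x
x_blank[is.na(x_blank)] <- ""        # assigning a string coerces the whole vector
class(x)                             # "numeric"
class(x_blank)                       # "character"

# Blanks only in exported output: the na argument of write.csv controls how NA is written
totals <- data.frame(X = c(1, 2, NA), Y = c(NA, 5, 6))
csv_lines <- capture.output(write.csv(totals, na = "", row.names = FALSE))
csv_lines
```

The data frame totals stays numeric throughout; only the text that leaves R shows empty cells.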
Jeff Newmiller
2017-Mar-31 21:49 UTC
[R] Taking the sum of only some columns of a data frame
You can also look at the knitr-RMarkdown work flow, or the knitr-latex work flow. In both of these it is reasonable to convert your data frame to a temporary character-only form purely for output purposes. However, one can usually use an existing function to push your results out without damaging your working data.

It is important to separate your data from your output because mixing results (totals) with data makes using the data further extremely difficult. Mixing them is one of the major flaws of the spreadsheet model of computation, and it causes problems there as well as in R.
--
Sent from my phone. Please excuse my brevity.
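One way to picture the temporary character-only form is a display copy built just before printing. This is a base R sketch under stated assumptions, not a knitr workflow: the example data and the names totals, with_total and display are hypothetical.

```r
dat <- data.frame(X = c(1, 2, 3), Y = c(10, 20, 30), Z = c(0.5, 1.5, 2.5))
pos <- c(1, 3)                                   # columns that get a total

# Totals live in their own object; dat itself is never modified
totals <- rep(NA_real_, ncol(dat))
totals[pos] <- colSums(dat[, pos])

# Hypothetical display step: character-only copy with "" in place of NA
with_total <- rbind(dat, Total = totals)
display <- as.data.frame(
  lapply(with_total, function(col) ifelse(is.na(col), "", format(col))),
  stringsAsFactors = FALSE
)
rownames(display) <- rownames(with_total)
print(display)
```

The object display exists only for output and can be discarded afterwards, while dat remains numeric and free of summary rows.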
Mathew Guilfoyle
2017-Mar-31 23:18 UTC
[R] Taking the sum of only some columns of a data frame
This does the summation you want in one line:

#create example data and column selection
d = as.data.frame(matrix(rnorm(50),ncol=5))
cols = c(1,3)

#sum selected columns and put results in new row
d[nrow(d)+1,cols] = colSums(d[,cols])

However, I would agree with the sentiments that this is a bad idea; far better to have the summed values stored in a new object, leaving the original data table untainted.
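The alternative recommended here, keeping the sums in their own object, might look like the sketch below. The seed and the names totals and d_with_total are additions for illustration and are not part of the original message.

```r
set.seed(42)   # assumption: seed added for reproducibility
d <- as.data.frame(matrix(rnorm(50), ncol = 5))
cols <- c(1, 3)

totals <- colSums(d[, cols])          # named vector of sums, stored separately
d_with_total <- d                     # work on a copy so d stays untouched
d_with_total[nrow(d) + 1, cols] <- totals
totals
```

With this arrangement d still has ten rows of pure data, and the eleven-row version is produced only when a combined table is actually needed.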