Hi,

I have a big dataframe as follows

109ABC 109XYZ 18ABC 18XYZ 22XYZ 23ABC 25ABC 25XYZ 30ABC 31XYZ 32ABC
32XYZ 34DCM 34XYZ 36ABC 36SUR 38DCM 38XYZ 39DCM 39SUR 41DCM 41SUR
42DCM 42SUR 46SUR 52DCM 64ABC 64XYZ 65ABC 65XYZ 66ABC 66XYZ 67XYZ
68ABC 68SUR 70MES 70SUR 72ABC 72XYZ 76ABC 76XYZ 82ABC 85ABC POV
Cluster_1 17 1 3 10 14 5 2 2 1 1 1 2 2 TT:61
Cluster_2 1 4 20 6 5 3 6 9 9 6 10 1 3 1 4 TT:88
Cluster_3 3 3 6 4 17 17 18 13 17 19 22 11 5 21 8 5 18 4 7 9 TT:227
........

I want to get two columns, i.e., one is to sum columns for all including ABC for each row and the other is to sum columns for all including XYZ for each row.

Is there some help? Thank you!

Dawn

[[alternative HTML version deleted]]
Please read the Posting Guide before posting again. Pay particular attention to the guidance to post using plain text. Also use the dput function to provide your sample data so we can more accurately start from where you are starting and give more targeted answers.

You can use indexing to select the subset of columns to work with, and the rowSums function to do the calculations. Something like

    dta$abc <- rowSums( dta[ , grep( "ABC", names( dta ) ) ] )

Note that grep is case-sensitive by default, so the pattern has to match the capitalization used in the column names.

---------------------------------------------------------------------------
Jeff Newmiller
DCN: <jdnewmil at dcn.davis.ca.us>
Research Engineer (Solar/Batteries/Software/Embedded Controllers)
---------------------------------------------------------------------------
Sent from my phone. Please excuse my brevity.

On July 9, 2015 10:12:30 AM PDT, Dawn <dawn1313 at gmail.com> wrote:
>I want to get two columns, i.e., one is to sum columns for all including
>ABC for each row and the other is to sum columns for all including XYZ
>for each row.
>
>Is there some help? Thank you!
>Dawn
>
>______________________________________________
>R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
>https://stat.ethz.ch/mailman/listinfo/r-help
>PLEASE do read the posting guide
>http://www.R-project.org/posting-guide.html
>and provide commented, minimal, self-contained, reproducible code.
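The indexing-plus-rowSums suggestion above can be tried on a small example. This is only a sketch: the data frame `dat`, its values, and the new column names `ABC_sum` and `XYZ_sum` are invented for illustration and are not the original poster's data.

```r
# Toy data frame; column names mimic the pattern in the post, values are made up.
# check.names = FALSE keeps names such as "109ABC" that start with a digit.
dat <- data.frame(
  "109ABC" = c(17, 1, 3),
  "109XYZ" = c(1, 4, 3),
  "18ABC"  = c(3, 20, 6),
  "34DCM"  = c(10, 6, 4),
  check.names = FALSE,
  row.names = c("Cluster_1", "Cluster_2", "Cluster_3")
)

# Locate the columns by name before adding anything, so that the new
# columns (whose names also contain "ABC"/"XYZ") are not matched.
abc_cols <- grep("ABC", names(dat), fixed = TRUE)
xyz_cols <- grep("XYZ", names(dat), fixed = TRUE)

# drop = FALSE keeps a data frame even when only one column matches;
# without it rowSums() would receive a plain vector and fail.
abc_sum <- rowSums(dat[, abc_cols, drop = FALSE])
xyz_sum <- rowSums(dat[, xyz_cols, drop = FALSE])

dat$ABC_sum <- abc_sum
dat$XYZ_sum <- xyz_sum
print(dat)
```

Computing the column positions once, before the assignment, is a deliberate choice here: rerunning the grep after `ABC_sum` has been added would include that column in the next sum.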
Hello,

Please use ?dput to give a data example. As posted, it is completely unreadable. If your data.frame is named 'dat', use

    dput(head(dat, 30))  # paste the output of this in your mail

And don't post in HTML; use plain text only, as the posting guide says.

Rui Barradas

On 09-07-2015 18:12, Dawn wrote:
> I want to get two columns, i.e., one is to sum columns for all including
> ABC for each row and the other is to sum columns for all including XYZ for
> each row.
>
> Is there some help? Thank you!
> Dawn
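To illustrate why dput output is preferred over pasted tables, here is a rough sketch of the round trip. The data frame below is invented, and the capture/parse step only simulates what a reader does when pasting the posted text into their own session.

```r
# Made-up data frame with awkward column names and missing values.
dat <- data.frame(
  "109ABC" = c(17, NA, 3),
  "109XYZ" = c(1, 4, NA),
  check.names = FALSE,
  row.names = c("Cluster_1", "Cluster_2", "Cluster_3")
)

# dput() prints an R expression that recreates the object, including
# column names, row names and NA values.
dput(head(dat, 30))

# Simulate a reader pasting that text back into R: capture the printed
# expression as text, then parse and evaluate it.
txt  <- paste(capture.output(dput(dat)), collapse = "\n")
dat2 <- eval(parse(text = txt))

# The rebuilt object should match the original.
print(all.equal(dat, dat2))
```

Unlike a table pasted into an email, the expression survives line wrapping and keeps missing cells in the right columns.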
Hi Dawn,

Your data are a bit messed up, but try the following:

    rowSums(dat[,grep("ABC",names(dat),fixed=TRUE)],na.rm=TRUE)
    rowSums(dat[,grep("XYZ",names(dat),fixed=TRUE)],na.rm=TRUE)

Since you asked for a sum for each row, rowSums is the function to use; colSums would return one total per column instead. I'm assuming that you want to discard the NA values.

Jim

On Fri, Jul 10, 2015 at 6:52 AM, Rui Barradas <ruipbarradas at sapo.pt> wrote:
> Please use ?dput to give a data example.
>
> On 09-07-2015 18:12, Dawn wrote:
>> I want to get two columns, i.e., one is to sum columns for all including
>> ABC for each row and the other is to sum columns for all including XYZ
>> for each row.
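The effect of discarding NA values, as assumed in the reply above, can be seen in a short sketch. The data frame and its values are invented for illustration, with NA standing in for the empty cells in the posted table.

```r
# Made-up data with missing cells.
dat <- data.frame(
  "109ABC" = c(17, NA, 3),
  "18ABC"  = c(NA, 20, 6),
  "109XYZ" = c(1, 4, NA),
  check.names = FALSE,
  row.names = c("Cluster_1", "Cluster_2", "Cluster_3")
)

# Keep only the ABC columns; drop = FALSE guards the single-column case.
abc <- dat[, grep("ABC", names(dat), fixed = TRUE), drop = FALSE]

row_na  <- rowSums(abc)                # NA for any row with a missing cell
row_tot <- rowSums(abc, na.rm = TRUE)  # missing cells ignored, one sum per row
col_tot <- colSums(abc, na.rm = TRUE)  # by contrast, one sum per column

print(row_na)
print(row_tot)
print(col_tot)
```

Whether ignoring missing cells is appropriate depends on what an empty cell means in the real data; treating it as zero is an assumption, not something the thread confirms.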