I'm teaching myself R for data preparation. I found an MIT course on data preparation that uses Python, but I'm using R to learn. The first exercise is to prepare data from a database of the contributions made to candidates for U.S. president. The format of the database is described at ftp://ftp.fec.gov/FEC/Presidential_Map/2012/DATA_DICTIONARIES/CONTRIBUTOR_FORMAT.txt

How can I print, in R, a table showing in how many states President Obama is the top candidate (by total amount of donations received)?

I tried tapply, but I don't understand how to work with more than one grouping variable. Could anyone help me get further with my studies?

--
View this message in context: http://r.789695.n4.nabble.com/How-to-use-tapply-with-more-than-one-variables-grouped-tp4646948.html
Sent from the R help mailing list archive at Nabble.com.
PIKAL Petr
2012-Oct-22 09:00 UTC
[R] How to use tapply with more than one variables grouped
Hi

> I try using tapply method but, i dont understand how to working with
> more than one variable grouped. Could anyone help me in advance of the
> studies?

How did you use tapply? Did you read its help page? It points to ?aggregate, which may be what you are looking for.

Regards
Petr
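As a rough illustration of the ?aggregate suggestion, here is a minimal sketch. The column names (cand_nm, contbr_st, contb_receipt_amt) are the ones used later in this thread; the six records themselves are made up for the example and are not taken from the FEC file.

```r
# Made-up toy records (assumption); only the column names come from the thread.
contrib <- data.frame(
  cand_nm           = c("Obama", "Obama", "Michele", "Michele", "Obama", "Doug"),
  contbr_st         = c("NY", "NY", "NY", "AR", "AR", "AR"),
  contb_receipt_amt = c(400, 200, 60, 500, 45, 250),
  stringsAsFactors  = FALSE
)

# Sum the amounts for every (candidate, state) pair that occurs in the data.
# The result is a data frame in "long" form, one row per pair.
totals <- aggregate(contb_receipt_amt ~ cand_nm + contbr_st,
                    data = contrib, FUN = sum)
totals
```

Unlike tapply with two factors, this keeps only the combinations that actually occur, so there are no NA cells to deal with.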
PIKAL Petr
2012-Oct-23 14:29 UTC
[R] How to use tapply with more than one variables grouped
Hi

and what is wrong?

Petr

> -----Original Message-----
> From: r-help-bounces at r-project.org On Behalf Of noobmin
> Sent: Tuesday, October 23, 2012 2:52 PM
> To: r-help at r-project.org
> Subject: Re: [R] How to use tapply with more than one variables grouped
>
> I used these commands previously:
>
> data <- read.csv("test.csv")
> tbl <- data.frame(tapply(data$contb_receipt_amt, list(data$cand_nm, data$contbr_st), sum))
> tbl
>           AL  AR  CA  NY
> Doug     250 250 250  NA
> Jennifer  20 340 300 100
> Michele  250 500 250  60
> Obama     15  45 520 600
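For readers without test.csv, the same tapply call can be tried on a small in-memory data frame. This is only a sketch: the records below are invented for illustration, and only the column names match the ones quoted above.

```r
# Invented toy records (assumption); column names as in the quoted commands.
contrib <- data.frame(
  cand_nm           = c("Obama", "Obama", "Michele", "Michele", "Obama", "Doug"),
  contbr_st         = c("NY", "NY", "NY", "AR", "AR", "AR"),
  contb_receipt_amt = c(400, 200, 60, 500, 45, 250),
  stringsAsFactors  = FALSE
)

# Passing a list of two grouping vectors makes tapply return a matrix:
# rows are candidates, columns are states, cells are summed amounts.
# Combinations with no records (here Doug in NY) come back as NA.
tbl <- tapply(contrib$contb_receipt_amt,
              list(contrib$cand_nm, contrib$contbr_st),
              sum)
tbl
```

The result is already a matrix with row and column names, so wrapping it in data.frame() is optional.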
To build this example I reduced the number of records drastically. In the original database there are 48 000 candidates and dozens of states, so there is no way to analyze the data visually, and I would not post 400 MB of tables here. But based on the example, how could I list the states where Obama received more contributions?
I meant the states where Obama has a higher value than the other candidates. Looking at the NY column, Obama has the highest value, so in that state he wins. Looking at the AR column, Michele wins. I JUST want to list the states where Obama wins. Thank you!

This seems to work, I just do not understand why you used a threshold. I will study your solution, thanks again!
Hi,

Your question is not clear. Suppose you want to find the highest two contributions for each candidate:

dat1<-read.table(text="
          AL  AR  CA  NY
Doug     250 250 250  NA
Jennifer  20 340 300 100
Michele  250 500 250  60
Obama     15  45 520 600
",header=TRUE,stringsAsFactors=FALSE,sep="")
res1<-unlist(lapply(split(dat1,rownames(dat1)),function(x) tail(apply(x,1,sort),2)))
nam1<-unlist(lapply(lapply(split(dat1,rownames(dat1)),function(x) tail(apply(x,1,sort),2)),function(x) dimnames(x)[1]),use.names=F)
names(res1)<-paste(names(res1),nam1,sep="_")
names(res1)<-gsub("\\d+","",names(res1))
res1
#    Doug_AR     Doug_CA Jennifer_CA Jennifer_AR  Michele_CA  Michele_AR
#        250         250         300         340         250         500
#   Obama_CA    Obama_NY
#        520         600

#Contribution for Obama
res1[grep("Obama",names(res1))]
#Obama_CA Obama_NY
#     520      600

A.K.

----- Original Message -----
From: noobmin <pseudovoid at hotmail.com>
To: r-help at r-project.org
Sent: Tuesday, October 23, 2012 12:48 PM
Subject: Re: [R] How to use tapply with more than one variables grouped
Hi,

I think I understand what you meant. This will output all the states where the contribution for Obama is higher than for all the other candidates.

dat1<-read.table(text="
          AL  AR  CA  NY
Doug     250 250 250  NA
Jennifer  20 340 300 100
Michele  250 500 250  60
Obama     15  45 520 600
",header=TRUE,stringsAsFactors=FALSE,sep="")
res<-unlist(lapply(apply(dat1,2,function(x) x[!is.na(x)]),function(x) x[all(x["Obama"]>x[names(x)!=names(x)[grep("Obama",names(x))]])]))
res[grep("Obama",names(res))]
#CA.Obama NY.Obama
#     520      600

A.K.

----- Original Message -----
From: noobmin <pseudovoid at hotmail.com>
To: r-help at r-project.org
Sent: Tuesday, October 23, 2012 3:00 PM
Subject: Re: [R] How to use tapply with more than one variables grouped
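A shorter route to the same answer, offered only as a sketch and not as the method used in the replies above, is to ask for the top candidate in each state column with which.max and then keep the states where that name is Obama. The matrix below retypes the example table from this thread; the names winner and obama_states are introduced here for illustration.

```r
# The example table from this thread, as a matrix with dimnames.
tbl <- matrix(
  c(250, 250, 250,  NA,
     20, 340, 300, 100,
    250, 500, 250,  60,
     15,  45, 520, 600),
  nrow = 4, byrow = TRUE,
  dimnames = list(c("Doug", "Jennifer", "Michele", "Obama"),
                  c("AL", "AR", "CA", "NY"))
)

# For each state (column), the name of the candidate with the largest total.
# which.max() ignores NA and returns the FIRST maximum when there is a tie.
winner <- apply(tbl, 2, function(x) names(which.max(x)))
winner

# States where Obama is the top candidate, and how many there are.
obama_states <- names(winner)[winner == "Obama"]
obama_states
length(obama_states)
```

On this table the states are CA and NY, matching the result above. One difference to keep in mind: on a tie, which.max picks the candidate listed first, whereas the strict comparison in the reply above counts a state for Obama only when he is strictly ahead of everyone else.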