Jingxia Lin
2013-Dec-29 13:40 UTC
[R] counts and percentage of multiple categorical columns in R
Dear R helpers,

I have a data sheet (“milk”) with four types of milk from five brands (A, B, C, D, E). Each column shows the brand that each customer chose for that type of milk. The data sheet looks like the one below. You can see that for some types of milk, some brands are never chosen.

fatfreemilk fatmilk halfmilk 2fatmilk
A           A       A        A
A           B       B        A
B           A       A        A
C           C       C        C
D           A       A        A
A           E       A        E
C           A       B        A
A           A       A        A
A           B       B        A
A           A       B        E

I want to summarize each column so that for each type of milk, I know the counts and percentages of the brands chosen. I tried "summary" in R, but the result is not shown nicely. How can I display the result in a way like below?

            A     B     C     D     E
fatfreemilk 6(60) 1(10) 2(20) 1(10) 0(0)
fatmilk     6(60) 2(20) 1(10) 0(0)  1(10)
halfmilk    5(50) 4(40) 1(10) 0(0)  0(0)
2fatmilk    7(70) 0(0)  1(10) 0(0)  2(20)

Thank you!
Bert Gunter
2013-Dec-29 18:28 UTC
[R] counts and percentage of multiple categorical columns in R
Is this homework? (We generally don't do homework here).

However, hint: ?table and links therein.

Also, as you can see below, post in plain text, not HTML, which is stripped and can lead to hard-to-read gobbledygook.

Cheers,
Bert

Gunter
Genentech Nonclinical Biostatistics
(650) 467-7374

"Data is not information. Information is not knowledge. And knowledge is certainly not wisdom."
H. Gilbert Welch

On Sun, Dec 29, 2013 at 5:40 AM, Jingxia Lin <jingxia08 at gmail.com> wrote:
> Dear R helpers,
>
> I have a data sheet (?milk?) with four types of milk from five brands (A,
> B, C, D, E), the column shows the brands that each customer chose for each
> type of the milk they bought.
> [...]
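For readers following the ?table hint, here is a minimal base R sketch of the counting step only. It assumes the data are read into a data frame called `milk` (the name used in the question); the vector `brands` is an illustrative assumption listing all five brands, so that brands nobody chose still appear with a count of zero.

```r
# Data from the original question; check.names = FALSE keeps the
# column name "2fatmilk" from being changed to "X2fatmilk".
milk <- read.table(text = "fatfreemilk fatmilk halfmilk 2fatmilk
A A A A
A B B A
B A A A
C C C C
D A A A
A E A E
C A B A
A A A A
A B B A
A A B E", header = TRUE, stringsAsFactors = FALSE, check.names = FALSE)

brands <- c("A", "B", "C", "D", "E")  # assumed full set of brands

# Tabulate each column against the same fixed levels, then transpose so
# that rows are milk types and columns are brands.
counts <- t(sapply(milk, function(x) table(factor(x, levels = brands))))
counts
```

Without the fixed factor levels, table() would drop brands that never occur in a column, and the per-column results would no longer line up.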
Hi,

Try:

# check.names=FALSE keeps the column name "2fatmilk" unchanged
dat1 <- read.table(text="fatfreemilk fatmilk halfmilk 2fatmilk
A A A A
A B B A
B A A A
C C C C
D A A A
A E A E
C A B A
A A A A
A B B A
A A B E",sep="",header=TRUE,stringsAsFactors=FALSE,check.names=FALSE)
dat2 <- dat1
dat2$id <- 1:nrow(dat2)
library(reshape2)
# long format, then count each brand (value) within each milk type (variable)
res <- dcast(melt(dat2,id.var="id")[,-1],variable~value,length)
row.names(res) <- res[,1]
res1 <- res[,-1]
res2 <- as.matrix(res1)
# append the row percentage in parentheses to each count
res2[] <- paste0(res2,paste0("(",(res2/rowSums(res2))*100),")")
as.data.frame(res2)
#                A     B     C     D     E
#fatfreemilk 6(60) 1(10) 2(20) 1(10)  0(0)
#fatmilk     6(60) 2(20) 1(10)  0(0) 1(10)
#halfmilk    5(50) 4(40) 1(10)  0(0)  0(0)
#2fatmilk    7(70)  0(0) 1(10)  0(0) 2(20)

A.K.
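The percentages above come out as whole numbers because every column has exactly 10 responses. With other sample sizes the unrounded values would print as long decimals. The following is a base R sketch of the same count(percent) formatting with rounding added. It is not part of the thread: `count_pct`, its `digits` argument, and the made-up data frame `toy` are all hypothetical names used for illustration.

```r
# Hypothetical helper (not from the thread): for each column of df, count
# the occurrences of each level and append the rounded row percentage.
count_pct <- function(df, levels, digits = 0) {
  # rows = columns of df, columns = levels; fixed levels keep zero counts
  counts <- t(sapply(df, function(x) table(factor(x, levels = levels))))
  pct <- round(100 * counts / rowSums(counts), digits)
  out <- counts
  out[] <- paste0(counts, "(", pct, ")")  # keeps the matrix shape and names
  as.data.frame(out, stringsAsFactors = FALSE)
}

# Made-up example with three rows, so percentages are not whole numbers
toy <- data.frame(x = c("A", "A", "B"),
                  y = c("B", "B", "B"),
                  stringsAsFactors = FALSE)
count_pct(toy, levels = c("A", "B"), digits = 1)
```

The sketch assumes at least two levels and no column that is entirely missing; a column with no valid responses would give a zero row sum and NaN percentages.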