Luma R
2011-May-19 09:13 UTC
[R] trouble with summary tables with several variables using aggregate function
Dear all,

I am having trouble creating summary tables using the aggregate function.

Given the following table:

Var1  Var2  Var3  dummy
S1    T1    I     1
S1    T1    I     1
S1    T1    D     1
S1    T1    D     1
S1    T2    I     1
S1    T2    I     1
S1    T2    D     1
S1    T2    D     1
S2    T1    I     1
S2    T1    I     1
S2    T1    D     1
S2    T1    D     1
S2    T2    I     1
S2    T2    I     1
S2    T2    I     1
S2    T2    I     1

I want to create a summary table that shows, for each combination of Var1 and Var2, the number of cells with Var3 = D and with Var3 = I:

Var1  Var2  Var3(D)  Var3(I)
S1    T1    2        2
S1    T2    2        2
S2    T1    2        2
S2    T2    0        4

However, if I do:

Count.Cells = aggregate(dummy ~ Var1 + Var2 + Var3, FUN = 'sum')

I get:

Var1  Var2  Var3  Count of Resp
S1    T1    D     2
S1    T1    I     2
S1    T2    D     2
S1    T2    I     2
S2    T1    D     2
S2    T1    I     2
S2    T2    I     4

Is there a way to get a separate column for each Var3 level?

Thank you for any help you can give!
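For anyone wanting to reproduce the example, the table in the post can be entered as a data frame. This is only a sketch: the name d and the explicit data = argument are assumptions, since the post does not show how the data were loaded.

```r
# Rebuild the 16-row example table (the name `d` is an assumption)
d <- data.frame(
  Var1  = rep(c("S1", "S2"), each = 8),
  Var2  = rep(rep(c("T1", "T2"), each = 4), times = 2),
  Var3  = c(rep(c("I", "I", "D", "D"), times = 3), rep("I", 4)),
  dummy = 1
)

# Same aggregate call as in the post, with the data frame passed explicitly
Count.Cells <- aggregate(dummy ~ Var1 + Var2 + Var3, data = d, FUN = sum)
Count.Cells
```

The result is in long format with one row per observed Var1/Var2/Var3 combination, which is why the S2/T2/D combination, having no rows in the data, does not appear at all.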
Phil Spector
2011-May-19 18:10 UTC
[R] trouble with summary tables with several variables using aggregate function
Luma -

If I understand you correctly, I think the easiest way to get what you want is to use the reshape function on the output from aggregate:

> reshape(Count.Cells, idvar=c('Var1','Var2'), timevar='Var3', direction='wide')
  Var1 Var2 dummy.D dummy.I
1   S1   T1       2       2
2   S2   T1       2       2
3   S1   T2       2       2
7   S2   T2      NA       4

- Phil Spector
  Statistical Computing Facility
  Department of Statistics
  UC Berkeley
  spector at stat.berkeley.edu
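The NA in the S2/T2 row of the reshape output appears because that combination has no D rows, so aggregate returned nothing for it. If a 0 is wanted there, as in the table originally requested, one possible follow-up is sketched below. The data frame name d and the helper name count.cols are assumptions made for illustration only.

```r
# Rebuild the example data (the name `d` is an assumption)
d <- data.frame(
  Var1  = rep(c("S1", "S2"), each = 8),
  Var2  = rep(rep(c("T1", "T2"), each = 4), times = 2),
  Var3  = c(rep(c("I", "I", "D", "D"), times = 3), rep("I", 4)),
  dummy = 1
)
Count.Cells <- aggregate(dummy ~ Var1 + Var2 + Var3, data = d, FUN = sum)

# Long to wide: one column per level of Var3
wide <- reshape(Count.Cells, idvar = c("Var1", "Var2"),
                timevar = "Var3", direction = "wide")

# Replace NA with 0 in the count columns only (hypothetical helper name)
count.cols <- grep("^dummy\\.", names(wide), value = TRUE)
for (col in count.cols) {
  wide[[col]][is.na(wide[[col]])] <- 0
}
wide
```

Restricting the replacement to the dummy.* columns leaves the grouping columns untouched, which matters if they are factors.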
Dennis Murphy
2011-May-19 19:29 UTC
[R] trouble with summary tables with several variables using aggregate function
Hi:

The dummy column really isn't necessary. Here's another way to get the result you want. Let d be the name of your example data frame.

d <- d[, 1:3]
(dtable <- as.data.frame(ftable(d, row.vars = c(1, 2))))
  Var1 Var2 Var3 Freq
1   S1   T1    D    2
2   S2   T1    D    2
3   S1   T2    D    2
4   S2   T2    D    0
5   S1   T1    I    2
6   S2   T1    I    2
7   S1   T2    I    2
8   S2   T2    I    4

An alternative to the reshape() function is the reshape2 package, which has a function dcast() that allows you to rearrange the data frame as you desire.

library(reshape2)
dcast(dtable, Var1 + Var2 ~ Var3)
Using Freq as value column: use value_var to override.
  Var1 Var2 D I
1   S1   T1 2 2
2   S1   T2 2 2
3   S2   T1 2 2
4   S2   T2 0 4

HTH,
Dennis
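The message printed by dcast() in this reply refers to value_var, which is the spelling used by the reshape2 version current at the time. Later reshape2 releases name the argument value.var, and passing it explicitly suppresses the message. A minimal sketch, assuming reshape2 is installed and using table() in place of ftable() to build the frequency table (a substitution for illustration, not the method shown in the reply):

```r
library(reshape2)

# Rebuild the example data without the dummy column (the name `d` is an assumption)
d <- data.frame(
  Var1 = rep(c("S1", "S2"), each = 8),
  Var2 = rep(rep(c("T1", "T2"), each = 4), times = 2),
  Var3 = c(rep(c("I", "I", "D", "D"), times = 3), rep("I", 4))
)

# Count every Var1/Var2/Var3 combination, including the empty ones
dtable <- as.data.frame(table(d))

# Name the value column explicitly so dcast() does not have to guess it
wide <- dcast(dtable, Var1 + Var2 ~ Var3, value.var = "Freq")
wide
```

Because table() counts every combination of levels, the empty S2/T2/D cell comes out as 0 rather than NA.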