Luma R
2011-May-19 09:13 UTC
[R] trouble with summary tables with several variables using aggregate function
Dear all,
I am having trouble creating summary tables using the aggregate function.
Given the following table:
Var1 Var2 Var3 dummy
S1 T1 I 1
S1 T1 I 1
S1 T1 D 1
S1 T1 D 1
S1 T2 I 1
S1 T2 I 1
S1 T2 D 1
S1 T2 D 1
S2 T1 I 1
S2 T1 I 1
S2 T1 D 1
S2 T1 D 1
S2 T2 I 1
S2 T2 I 1
S2 T2 I 1
S2 T2 I 1
I want to create a summary table that shows, for each combination of Var1
and Var2, the number of cells with Var3=D and with Var3=I:
Var1 Var2 Var3(D) Var3(I)
S1 T1 2 2
S1 T2 2 2
S2 T1 2 2
S2 T2 0 4
However, if I do:
Count.Cells = aggregate(dummy ~ Var1 + Var2 + Var3, FUN = 'sum')
I get:
Var1 Var2 Var3 Count of Resp
S1 T1 D 2
S1 T1 I 2
S1 T2 D 2
S1 T2 I 2
S2 T1 D 2
S2 T1 I 2
S2 T2 I 4
Is there a way to get different columns for each Var3 level?
Thank you for any help you can give!
Phil Spector
2011-May-19 18:10 UTC
[R] trouble with summary tables with several variables using aggregate function
Luma -
If I understand you correctly, I think the easiest way
to get what you want is to use the reshape function on
the output from aggregate:
> reshape(Count.Cells, idvar = c('Var1', 'Var2'), timevar = 'Var3', direction = 'wide')
Var1 Var2 dummy.D dummy.I
1 S1 T1 2 2
2 S2 T1 2 2
3 S1 T2 2 2
7 S2 T2 NA 4
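
The NA in the last row appears because aggregate() returns no row for the empty S2/T2/D combination, so reshape() has nothing to put in that cell. As a minimal sketch, assuming the example data is rebuilt by hand in a data frame called d (a name not used in the original post), the whole sequence with the NA replaced by 0 could look like this:

```r
# Rebuild the example data (d is an assumed name for the poster's table)
d <- data.frame(
  Var1  = rep(c("S1", "S2"), each = 8),
  Var2  = rep(rep(c("T1", "T2"), each = 4), times = 2),
  Var3  = c(rep(c("I", "I", "D", "D"), times = 3), rep("I", 4)),
  dummy = 1
)

# Long-format counts: one row per observed Var1/Var2/Var3 combination
Count.Cells <- aggregate(dummy ~ Var1 + Var2 + Var3, data = d, FUN = sum)

# Wide format: one column per level of Var3
wide <- reshape(Count.Cells, idvar = c("Var1", "Var2"),
                timevar = "Var3", direction = "wide")

# Combinations that never occurred come back as NA; treat them as zero counts
count_cols <- c("dummy.D", "dummy.I")
wide[count_cols][is.na(wide[count_cols])] <- 0
wide
```

Passing data = d explicitly avoids relying on the columns being visible in the workspace.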
- Phil Spector
Statistical Computing Facility
Department of Statistics
UC Berkeley
spector at stat.berkeley.edu
Dennis Murphy
2011-May-19 19:29 UTC
[R] trouble with summary tables with several variables using aggregate function
Hi:

The dummy column really isn't necessary. Here's another way to get the
result you want. Let d be the name of your example data frame.

d <- d[, 1:3]
(dtable <- as.data.frame(ftable(d, row.vars = c(1, 2))))
  Var1 Var2 Var3 Freq
1   S1   T1    D    2
2   S2   T1    D    2
3   S1   T2    D    2
4   S2   T2    D    0
5   S1   T1    I    2
6   S2   T1    I    2
7   S1   T2    I    2
8   S2   T2    I    4

An alternative to the reshape() function is the reshape2 package, which
has a function dcast() that allows you to rearrange the data frame as
you desire.

library(reshape2)
dcast(dtable, Var1 + Var2 ~ Var3)
Using Freq as value column: use value_var to override.
  Var1 Var2 D I
1   S1   T1 2 2
2   S1   T2 2 2
3   S2   T1 2 2
4   S2   T2 0 4

HTH,
Dennis
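
For readers without reshape2 installed, the same idea (count without a dummy column, then spread Var3 into columns) can be sketched in base R alone. This is only an illustration under assumptions: d is rebuilt by hand here, and table() is swapped in for ftable(). Because table() keeps empty combinations as zero counts, no NA shows up in the wide result. The optional dcast() call names the value column through value.var, which is the argument name in current reshape2 releases (the message quoted above comes from an older version).

```r
# Rebuild the example data without the dummy column (d is an assumed name)
d <- data.frame(
  Var1 = rep(c("S1", "S2"), each = 8),
  Var2 = rep(rep(c("T1", "T2"), each = 4), times = 2),
  Var3 = c(rep(c("I", "I", "D", "D"), times = 3), rep("I", 4))
)

# Cross-tabulate all three variables; empty cells are kept with Freq = 0
counts <- as.data.frame(table(d))

# Base R: spread Var3 into columns Freq.D and Freq.I
wide <- reshape(counts, idvar = c("Var1", "Var2"),
                timevar = "Var3", direction = "wide")
wide

# Optional: the reshape2 route, run only if the package is available
if (requireNamespace("reshape2", quietly = TRUE)) {
  wide2 <- reshape2::dcast(counts, Var1 + Var2 ~ Var3, value.var = "Freq")
  print(wide2)
}
```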