I keep expecting R to have something analogous to the =count function in
Excel, but I can't find anything. I simply want to count the data for a
given category.
I've been using the ddply() function in the plyr package to summarize
means and standard deviations of my data, with this code:
ddply(NZ_Conifers, .(ElevCat, DataSource, SizeClass), summarise,
      avgDensity=mean(Density), sdDensity=sd(Density), n=sum(Density))
and that gives me results that look like this:
  ElevCat DataSource SizeClass avgDensity  sdDensity           n
1   Elev1        FIA    Class1   38.67768 46.6673478   734.87598
2   Elev1        FIA    Class2   27.34096 23.3232470   820.22879
3   Elev1        FIA    Class3   15.38758  0.7088432    76.93790
4   Elev1        VTM    Class1   66.37897 70.2050817 24958.49284
5   Elev1        VTM    Class2   39.40786 34.9343269 11782.95152
6   Elev1        VTM    Class3   21.17839 12.3487600  1461.30895
But, instead of "sum(Density)", I'd really like counts of "Density", so
that I know the sample size of each group. Any suggestions?
--
Christopher R. Dolanc
Post-doctoral Researcher
University of California, Davis
I think you are looking for the function called length(). I cannot recreate
your output, since I don't know what is in NZ_Conifers, but with the built-in
dataset mtcars I get:
> ddply(mtcars, .(cyl,gear,carb), summarize, MeanWt=mean(wt), N=length(wt))
   cyl gear carb  MeanWt N
1    4    3    1 2.46500 1
2    4    4    1 2.07250 4
3    4    4    2 2.68375 4
4    4    5    2 1.82650 2
5    6    3    1 3.33750 2
6    6    4    4 3.09375 4
7    6    5    6 2.77000 1
8    8    3    2 3.56000 4
9    8    3    3 3.86000 3
10   8    3    4 4.68580 5
11   8    5    4 3.17000 1
12   8    5    8 3.57000 1
> with(mtcars, sum(cyl==8 & gear==3 & carb==4)) # output line 10
[1] 5
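
One caveat worth sketching: length() counts every element, including NA, whereas Excel's COUNT skips empty cells. If Density can be missing, sum(!is.na(x)) may be closer to what is wanted. The example below is only an illustration on a copy of mtcars; the NA is injected artificially and the names cars, grp, n_rows and n_obs are made up for this sketch.

```r
# Work on a copy of the built-in mtcars data so the original is untouched.
cars <- mtcars

# Artificially set one weight to NA in the cyl=8, gear=3, carb=4 group
# (illustration only; mtcars itself has no missing values).
first_row <- which(cars$cyl == 8 & cars$gear == 3 & cars$carb == 4)[1]
cars$wt[first_row] <- NA

grp <- subset(cars, cyl == 8 & gear == 3 & carb == 4)

n_rows <- length(grp$wt)        # counts all rows in the group, NA included
n_obs  <- sum(!is.na(grp$wt))   # counts only the non-missing weights

print(c(n_rows = n_rows, n_obs = n_obs))
```

Inside ddply the same idea would read N=sum(!is.na(wt)) in place of N=length(wt).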
If all you want is the count of things in various categories, you can use table
instead of ddply and length:
> with(mtcars, table(cyl, gear, carb))
, , carb = 1

   gear
cyl 3 4 5
  4 1 4 0
  6 2 0 0
  8 0 0 0

, , carb = 2

   gear
cyl 3 4 5
  4 0 4 2
  6 0 0 0
  8 4 0 0

...
Using ftable on table's output gives a nicer looking printout, but table's
output is easier to use in a program.
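
As a rough sketch of what "easier to use in a program" can look like, the table can be indexed by level name or converted to a long data frame with as.data.frame(), which puts the counts in a column named Freq. The variable names counts, counts_df, nonzero and flat are just for this example.

```r
# Three-way contingency table of counts from the built-in mtcars data.
counts <- with(mtcars, table(cyl, gear, carb))

# Index a single cell by level name (levels are character strings).
n_8_3_4 <- counts["8", "3", "4"]

# Long format: one row per cyl/gear/carb combination, counts in 'Freq'.
counts_df <- as.data.frame(counts)

# Keep only the combinations that actually occur in the data.
nonzero <- counts_df[counts_df$Freq > 0, ]

# ftable() flattens the same table into a compact two-dimensional printout.
flat <- ftable(counts)
print(flat)
```

The nonzero rows correspond to the groups that ddply reports, since ddply only returns combinations present in the data.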
Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com
> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org]
> On Behalf Of Christopher R. Dolanc
> Sent: Thursday, April 05, 2012 12:16 PM
> To: r-help at r-project.org
> Subject: [R] count() function
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
On Apr 5, 2012, at 3:15 PM, Christopher R. Dolanc wrote:

> I keep expecting R to have something analogous to the =count
> function in Excel, but I can't find anything. I simply want to count
> the data for a given category.
>
> I've been using the ddply() function in the plyr package to
> summarize means and st dev of my data, with this code:

Color me puzzled. The plyr package _has_ a count function.

David Winsemius, MD
West Hartford, CT
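
For reference, a minimal sketch of the plyr count function mentioned above, run on mtcars because NZ_Conifers is not available here. It assumes plyr is installed; the name group_counts is made up for this example. count() returns one row per observed combination, with the number of rows in a column named freq.

```r
# Run only if plyr is installed; the guard is part of this sketch.
if (requireNamespace("plyr", quietly = TRUE)) {
  # One row per observed cyl/gear/carb combination, row counts in 'freq'.
  group_counts <- plyr::count(mtcars, vars = c("cyl", "gear", "carb"))
  print(group_counts)

  # The analogous call for the original data would presumably be
  #   plyr::count(NZ_Conifers, c("ElevCat", "DataSource", "SizeClass"))
  # (untested, since that data set is not included in the thread).
}
```

Note that count() tallies rows, so like length() it does not skip rows where the measured variable is NA.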