thr3ads.net - R help - [R] aggregating columns in a data frame in different ways [Apr 2006]

kavaumail-r at yahoo.com

2006-Apr-28 22:19 UTC

[R] aggregating columns in a data frame in different ways

I would like to use aggregate() to combine statistics
for several days in a data frame. My data frame looks
similar to this:

   date        type  count  value
1  2006-04-01     A     10   99.6
2  2006-04-01     B      4   33.2
3  2006-04-02     A     22   43.2
4  2006-04-02     B      8   44.9
5  2006-04-03     A     12   12.4
6  2006-04-03     B     14   18.5

('date' is a factor, and my actual data frame has
about 100 different 'types', not just two)

I would like to sum up the 'counts' per 'type', and
get an average of the 'values' per 'type'. In other
words, I would like my results to look like this:

   type  count  value
1  A     44     51.73333
2  B     26     32.2

The way I'm doing this now is to tear the table apart
into its individual columns, then apply aggregate() to
each column individually (using the 'type' column for
the 'by' parameter), and finally putting everything
back together, like this:

> A.count = aggregate(A$count, list(type=A$type), sum)
> A.value = aggregate(A$value, list(type=A$type), mean)
> B = data.frame(type=A.count$type, count=A.count$x, value=A.value$x)

My actual table is a bit more involved than in this
simple example, however, so this becomes quite
tedious.

I am hoping that there is a simpler way for doing
this, for example by providing different FUN
parameters for each column to the aggregate()
function.

I would appreciate any suggestions.
Thanks
Klaus

jim holtman

2006-Apr-28 22:29 UTC


Does this do what you want?

> x
        date type count value
1 2006-04-01    A    10  99.6
2 2006-04-01    B     4  33.2
3 2006-04-02    A    22  43.2
4 2006-04-02    B     8  44.9
5 2006-04-03    A    12  12.4
6 2006-04-03    B    14  18.5
> y <- lapply(split(1:nrow(x), x$type), function(.ind){
+     c(count=sum(x$count[.ind]), value=mean(x$value[.ind]))
+ })
> do.call('rbind', y)
  count    value
A    44 51.73333
B    26 32.20000




--
Jim Holtman
Cincinnati, OH
+1 513 646 9390 (Cell)
+1 513 247 0281 (Home)

What is the problem you are trying to solve?


Gabor Grothendieck

2006-Apr-28 22:57 UTC


Here are three possibilities:

1. aggregate on the columns that you want to sum and aggregate on
the columns that you want to average and then merge them:

By <- A[, 2, drop = FALSE]
merge(aggregate(A[, 3, drop = FALSE], By, sum),
     aggregate(A[, 4, drop = FALSE], By, mean))

2. use by:

f <- function(x) with(x, c(count = sum(count), value = mean(value)))
do.call("rbind", by(A[, 3:4], A[, 2, drop = FALSE], f))

3. use summaryBy in the doBy package picking off the appropriate
columns in the output:

library(doBy)
summaryBy(. ~ type, A[, -1], FUN = c(sum, mean))[, c(1, 2, 5)]
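
For anyone who wants to try this without the original data, here is a minimal sketch that rebuilds the example data frame and generalizes possibility 1 to "one function per column". The helper name aggregate_each and its funs argument are made up for illustration; they are not part of base R or of any package mentioned in this thread.

```r
# Example data from the original question ('date' kept as a factor).
A <- data.frame(
  date  = factor(rep(c("2006-04-01", "2006-04-02", "2006-04-03"), each = 2)),
  type  = factor(rep(c("A", "B"), times = 3)),
  count = c(10, 4, 22, 8, 12, 14),
  value = c(99.6, 33.2, 43.2, 44.9, 12.4, 18.5)
)

# Hypothetical helper (not part of base R): 'funs' is a named list that maps
# a column name to the summary function to apply to that column.
aggregate_each <- function(data, by, funs) {
  pieces <- lapply(names(funs), function(col) {
    # One aggregate() call per column, each with its own function.
    aggregate(data[, col, drop = FALSE], data[, by, drop = FALSE], funs[[col]])
  })
  # Merge the per-column results on the grouping column(s).
  Reduce(function(a, b) merge(a, b, by = by), pieces)
}

B <- aggregate_each(A, by = "type", funs = list(count = sum, value = mean))
print(B)
```

Adding another column then only means adding another entry to the funs list, e.g. funs = list(count = sum, value = mean, other = max) for a column called 'other' (also hypothetical).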



Alexander Nervedi

2006-Apr-29 00:12 UTC


[R] cloud() works but wireframe() is blank

I must be making a ridiculously silly omission.
When I run the following I get the cloud plot OK, but I can't figure out what
I am missing when I call wireframe.

Any help would be appreciated.

library(lattice)  # provides wireframe() and cloud()

x <- runif(100)
y <- rnorm(100)
z <- runif(100)

temp <- data.frame(x, y, z)
wireframe(x ~ y * z, temp)
cloud(x ~ y * z, temp)
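
One likely explanation, offered as a sketch and not as a confirmed answer from this thread: wireframe() draws a surface and expects its data to lie on a regular grid, while cloud() only scatters points. Random x, y and z values therefore give cloud() something to draw but leave wireframe() with no surface. The grid and the surface function below are invented purely for illustration.

```r
library(lattice)  # provides wireframe() and cloud()

# Regular grid of (y, z) locations; every y value is paired with every z value.
grid <- expand.grid(y = seq(-2, 2, length.out = 25),
                    z = seq(0, 1, length.out = 25))

# Hypothetical surface, chosen only so there is something to plot.
grid$x <- with(grid, exp(-y^2) * z)

# Same formula as in the question, but the data now form a grid.
p <- wireframe(x ~ y * z, data = grid)
print(p)
```

If the real data are scattered measurements, they would first have to be interpolated onto such a grid before wireframe() can use them.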
