thr3ads.net - R help - [R] subtotal, submean, aggregate [Feb 2006]

Patrick Giraudoux

2006-Feb-26 13:24 UTC

[R] subtotal, submean, aggregate

Dear All,

I would like to make partial sums (or means or any other function) of 
the values in intervals along a sequence (spatial transect) where groups 
are defined.

For instance:

habitats<-rep(c("meadow","forest","meadow","pasture"),c(10,5,12,6))
observations<-rpois(length(habitats),2)
transect<-data.frame(observations=observations,habitats=habitats)

aggregate() is not suitable for my purpose because I want a result 
respecting the order of the habitats encountered although they may have 
the same name (and not pooling each group on each level of the factor 
created). For instance, the output of the ideal function 
mynicefunction() would be something like:

mynicefunction(transect$observations, by=list(transect$habitats),sum)
meadow     16
forest      9
meadow     21
pasture    17

and not

aggregate(transect$observations,by=list(transect$habitats),sum)
  Group.1  x
1  forest  9
2  meadow 37
3 pasture 17

Has anybody heard of such a function already written in R? If not, any 
idea how to write it simply and elegantly?

Cheers,

Patrick Giraudoux

Gabor Grothendieck

2006-Feb-26 14:08 UTC

[R] subtotal, submean, aggregate

Create another variable that gives the run number and aggregate on
both the habitat and run number removing the run number after
aggregating:

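# Run number: increments each time the habitat changes along the transect.
# factor() keeps this working whether habitats is stored as character or factor.
runno <- cumsum(c(TRUE, diff(as.numeric(factor(transect[,2]))) != 0))
aggregate(transect[,1], list(obs = transect[,2], runno = runno), sum)[,-2]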

This does not give the same as your example but I think there are some
errors in your example output.
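
To make the run-number idea easier to check by hand, here is a minimal sketch on a small deterministic transect. The counts are made up for illustration (they are not the poster's rpois() draw), and the change points are found by comparing each habitat with the previous one, which is an alternative to the diff() call above rather than the method used in the reply.

```r
# Toy transect with fixed, made-up counts so the sums can be verified by hand
habitats <- rep(c("meadow", "forest", "meadow", "pasture"), c(3, 2, 2, 1))
observations <- c(1, 2, 3, 4, 5, 6, 7, 8)

# TRUE wherever the habitat differs from the previous position;
# cumsum() turns those change points into a run counter
runno <- cumsum(c(TRUE, habitats[-1] != habitats[-length(habitats)]))

# Aggregate on habitat and run number together, then keep transect order
res <- aggregate(observations, list(habitat = habitats, runno = runno), sum)
res <- res[order(res$runno), c("habitat", "x")]
print(res)
```

Sorting explicitly on runno before dropping it avoids relying on the row order that aggregate() happens to return.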

Roger Bivand

2006-Feb-26 14:18 UTC

[R] subtotal, submean, aggregate

I got as far as:

rle.habs <- rle(habitats)
habitats1 <- rep(make.names(rle.habs$values, unique=TRUE), rle.habs$lengths)
aggregate(observations,by=list(habitats1),sum)

making an extra habitats vector with a unique label for each run. 

Since I don't know your seed, the results are not the same, but rle() is 
quite good for runs.
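
One caveat with the snippet above is that aggregate() orders its output by group label, so the runs would not necessarily come back in transect order. A minimal sketch of an order-preserving variant built on rle() follows. The helper name run_aggregate() and its arguments are hypothetical, chosen here for illustration, and the data are a made-up deterministic transect rather than the original rpois() draw.

```r
# Hypothetical helper: apply FUN to each consecutive run of identical groups
run_aggregate <- function(x, groups, FUN, ...) {
  runs <- rle(as.character(groups))
  # One integer id per run, repeated over the length of that run
  run_id <- rep(seq_along(runs$lengths), runs$lengths)
  # tapply() returns results in increasing run_id order, i.e. transect order
  values <- tapply(x, run_id, FUN, ...)
  data.frame(group = runs$values, value = as.vector(values),
             stringsAsFactors = FALSE)
}

# Toy transect with fixed, made-up counts
habitats <- rep(c("meadow", "forest", "meadow", "pasture"), c(3, 2, 2, 1))
observations <- c(1, 2, 3, 4, 5, 6, 7, 8)

out <- run_aggregate(observations, habitats, sum)
print(out)
```

Because the grouping key is the integer run id rather than a habitat label, repeated habitat names stay separate and any summary function (mean, median, and so on) can be passed as FUN.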

Roger
-- 
Roger Bivand
Economic Geography Section, Department of Economics, Norwegian School of
Economics and Business Administration, Helleveien 30, N-5045 Bergen,
Norway. voice: +47 55 95 93 55; fax +47 55 95 95 43
e-mail: Roger.Bivand at nhh.no
