thr3ads.net - R help - [R] data manipulation involving aggregate [May 2009]


Simon Pickett

2009-May-29 14:27 UTC

[R] data manipulation involving aggregate

hi all,

I often have a data frame like this example

data.frame(sq=c(1,1,1,2,2,3,3,3,3),area=c(1,2,3,1,2,3,1,2,3),habitat=c("garden","garden","pond","field","garden","river","garden","field","field"))

For each "sq" I have multiple "habitat"s, each with an associated "area".

I want to aggregate the data frame so that for each "sq" I have a
column of all possible "habitat"s and another column holding the
summed area for each "habitat". If a certain habitat doesn't exist
in that square I want a zero, like this:

data.frame(sq=rep(1:3,each=4),area.sum=c(3,3,0,0,2,0,1,0,1,0,5,3),habitat=rep(c("garden","pond","field","river"),times=3))

Is there an elegant, efficient way of doing this? My solution involves lots of
intermediate aggregated data frames, one for each habitat, then a series of
merges onto a bigger data frame.

Thanks peeps and have a good weekend,

Simon.





Dr. Simon Pickett
Research Ecologist
Land Use Department
Terrestrial Unit
British Trust for Ornithology
The Nunnery
Thetford
Norfolk
IP242PU
01842750050


Gabor Grothendieck

2009-May-29 14:46 UTC



Try this (with your example data frame stored as DF):

> as.data.frame.table(xtabs(area ~ habitat + sq, DF), responseName = "area.sum")[c(2:3, 1)]
   sq area.sum habitat
1   1        0   field
2   1        3  garden
3   1        3    pond
4   1        0   river
5   2        1   field
6   2        2  garden
7   2        0    pond
8   2        0   river
9   3        5   field
10  3        1  garden
11  3        0    pond
12  3        3   river
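
For anyone who wants to run this end to end, here is a minimal self-contained sketch. It assumes the example data frame from the question is stored under the name DF, as the call above expects. The object names tab, res, sums, grid and alt are only illustrative. The second half is one possible alternative, not part of the reply above: a single aggregate() followed by one merge() onto the full sq-by-habitat grid, in place of one aggregated data frame per habitat.

```r
# Example data from the question, stored as DF (the name the call above assumes)
DF <- data.frame(
  sq      = c(1, 1, 1, 2, 2, 3, 3, 3, 3),
  area    = c(1, 2, 3, 1, 2, 3, 1, 2, 3),
  habitat = c("garden", "garden", "pond", "field", "garden",
              "river", "garden", "field", "field")
)

# xtabs() sums 'area' over every habitat x sq combination;
# combinations that never occur in the data come out as 0.
tab <- xtabs(area ~ habitat + sq, data = DF)

# Flatten the table to long format and put the columns in the requested order.
# Note that 'sq' and 'habitat' are factors in the result.
res <- as.data.frame(tab, responseName = "area.sum")[c("sq", "area.sum", "habitat")]
print(res)

# Alternative sketch: sum the observed combinations once, then merge the sums
# onto the full grid of squares and habitats and fill the gaps with 0.
sums <- aggregate(area ~ sq + habitat, data = DF, FUN = sum)
grid <- expand.grid(sq = unique(DF$sq), habitat = unique(DF$habitat),
                    stringsAsFactors = FALSE)
alt <- merge(grid, sums, all.x = TRUE)   # joins on the shared columns sq, habitat
alt$area[is.na(alt$area)] <- 0
print(alt)
```

Both results have one row per square and habitat (12 rows here), and the summed areas add up to the total of the original area column.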


