Hi,
I have to admit that I don't fully understand the aggregate function,
but I think it should help me with what I want to do.
xveg is a data.frame with location, species, and the total for each species. Each
location is repeated once for every species present at that location. For each
location I want to find out which species has the maximum total, so I've
tried different ways to do it using aggregate.
loc <- c(rep("L1", 3), rep("L2", 5), rep("L3", 2))
sp <- c("a", "b", "c", "a", "d", "b", "e", "c", "b", "d")
tot <- c(20, 60, 40, 15, 25, 10, 30, 20, 68, 32)
xveg <- data.frame(loc, sp, tot)
result desired:
L1 b
L2 e
L3 b
sp_maj <- aggregate(xveg[,2], list(xveg[,1]), function(x)
levels(x)[which.max(table(x))])
This is wrong because it gives the first species name in each level of location,
so I get a, a, b as species instead of b, e, b.
I've tried a few other aggregate commands, all with wrong results.
I would appreciate any help.
Thanks,
Monica
Dear Monica,
Try this:
xveg[with(xveg, tot %in% tapply(tot,loc,max)),]
HTH,
Jorge
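One caveat with the `%in%` approach, sketched here under assumptions: `%in%` tests each total against the pooled set of per-location maxima. A row can therefore match the maximum of a different location. The data frame `xveg2` below is a hypothetical variant of the thread's data (one extra species "f" at L1), and `ave()` is one possible way to keep the comparison within each location. It is not something proposed in the thread.

```r
# Hypothetical variant of the thread's data: L1 gains a row whose total (30)
# equals the maximum of L2, although it is not the maximum within L1.
loc <- c(rep("L1", 4), rep("L2", 5), rep("L3", 2))
sp  <- c("a", "b", "c", "f", "a", "d", "b", "e", "c", "b", "d")
tot <- c(20, 60, 40, 30, 15, 25, 10, 30, 20, 68, 32)
xveg2 <- data.frame(loc, sp, tot)

# %in% compares against the pooled maxima (60, 30, 68), so the L1 row for
# species "f" is selected along with the three true maxima.
pooled <- xveg2[with(xveg2, tot %in% tapply(tot, loc, max)), ]

# ave() returns the group maximum aligned with every row, so each total is
# compared only with the maximum of its own location.
within_loc <- xveg2[xveg2$tot == ave(xveg2$tot, xveg2$loc, FUN = max), ]

print(pooled)
print(within_loc)
```

On the data originally posted no total coincides with another location's maximum, so both forms select the same three rows there.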
I don't have an easy solution with aggregate, because the function in
aggregate needs to return a scalar.
But the following should work:
do.call("rbind", lapply(split(xveg, xveg$loc), function(x)
x[which.max(x$tot), ]))
   loc sp tot
L1  L1  b  60
L2  L2  e  30
L3  L3  b  68
-Christos
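Since the function passed to aggregate has to return a scalar, one way to still use aggregate is to let it compute only the per-location maximum and then look the species up in a second step. The following is a minimal sketch of that idea, not a method proposed in the thread. It assumes the formula interface of `aggregate()` is available, and the names `max_tot` and `sp_maj` are illustrative.

```r
loc <- c(rep("L1", 3), rep("L2", 5), rep("L3", 2))
sp  <- c("a", "b", "c", "a", "d", "b", "e", "c", "b", "d")
tot <- c(20, 60, 40, 15, 25, 10, 30, 20, 68, 32)
xveg <- data.frame(loc, sp, tot)

# Step 1: aggregate() returns one scalar per location, the maximum total.
max_tot <- aggregate(tot ~ loc, data = xveg, FUN = max)

# Step 2: merge() joins on the shared columns (loc and tot), which brings
# back the species that holds the maximum in each location.
sp_maj <- merge(max_tot, xveg)

print(sp_maj)
```

If two species tie for the maximum in a location, the merge keeps both rows, whereas `which.max` keeps only the first.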
> -----Original Message-----
> From: r-help-bounces at r-project.org
> [mailto:r-help-bounces at r-project.org] On Behalf Of Monica Pisica
> Sent: Thursday, February 12, 2009 1:58 PM
> To: R help project
> Subject: [R] Aggregrate function
It does, and you get exactly what Monica wanted if you take out the "sp" and just return the whole thing. Thanks.
On Thu, Feb 12, 2009 at 5:52 PM, David Winsemius wrote:
> aggregate and by are convenience functions of tapply.
>
> Consider this alternate solution:
>
> xveg[which(xveg$tot %in% with(xveg, tapply(tot, loc, max))),"sp"]
>
> It uses tapply to find the maximums by loc(ations) and then goes
> back into xveg to find the corresponding sp(ecies). You should do
> testing to see whether the handling of ties agrees with your needs.
>
> --
> David Winsemius
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
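To make the point about ties concrete, here is a small sketch on made-up data. The data frame `xtie` is hypothetical (species "a" and "b" tie at L1), and the `ave()` comparison is an assumption of this example, not something posted in the thread. It contrasts a `which.max` selection, which keeps only the first maximum per location, with an equality test against the per-location maximum, which keeps every tied row.

```r
# Hypothetical data with a tie at L1: species "a" and "b" both total 60.
xtie <- data.frame(
  loc = c("L1", "L1", "L1", "L2", "L2"),
  sp  = c("a", "b", "c", "a", "d"),
  tot = c(60, 60, 40, 15, 25)
)

# which.max() returns the index of the first maximum only, so one row
# per location survives.
first_only <- do.call("rbind", lapply(split(xtie, xtie$loc), function(x)
  x[which.max(x$tot), ]))

# Comparing each total with its own location's maximum keeps all tied rows.
all_ties <- xtie[xtie$tot == ave(xtie$tot, xtie$loc, FUN = max), ]

print(first_only)
print(all_ties)
```

Which behaviour is right depends on whether a location with tied species should report one of them or all of them.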