Hi,

I have to admit that I don't fully understand the aggregate function, but I think it should help me with what I want to do.

xveg is a data.frame with location, species, and total for the species. Each location is repeated, once for every species present at that location. For each location I want to find out which species has the maximum total, so I've tried different ways to do it using aggregate.

    loc <- c(rep("L1", 3), rep("L2", 5), rep("L3", 2))
    sp <- c("a", "b", "c", "a", "d", "b", "e", "c", "b", "d")
    tot <- c(20, 60, 40, 15, 25, 10, 30, 20, 68, 32)
    xveg <- data.frame(loc, sp, tot)

Result desired:

    L1 b
    L2 e
    L3 b

    sp_maj <- aggregate(xveg[,2], list(xveg[,1]), function(x) levels(x)[which.max(table(x))])

This is wrong because it gives the first species name in each level of location, so I get a, a, b as species instead of b, e, b.

I've tried a few other aggregate commands, all with wrong results.

I will appreciate any help,

Thanks,

Monica
Dear Monica,

Try this:

    xveg[with(xveg, tot %in% tapply(tot,loc,max)),]

HTH,

Jorge

On Thu, Feb 12, 2009 at 1:58 PM, Monica Pisica <pisicandru@hotmail.com> wrote:
> For each location i want to find out which species has the maximum total ...
> so i've tried different ways to do it using aggregate.

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
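One thing to check with the `%in%` one-liner: it compares every total against the maxima of all locations, so a row can slip through if its total happens to equal the maximum of a different location. The sketch below shows this on a modified copy of the data (the changed value in `xveg2` is made up for illustration, not from the original post) and compares each row only with its own location's maximum by using `ave()`.

```r
# Same data as in the original post
loc <- c(rep("L1", 3), rep("L2", 5), rep("L3", 2))
sp  <- c("a", "b", "c", "a", "d", "b", "e", "c", "b", "d")
tot <- c(20, 60, 40, 15, 25, 10, 30, 20, 68, 32)
xveg <- data.frame(loc, sp, tot)

# ave() returns the group maximum aligned with each row, so every total
# is compared only with the maximum of its own location
top <- xveg[xveg$tot == ave(xveg$tot, xveg$loc, FUN = max), ]

# Hypothetical variation: species "a" at L1 gets a total of 30,
# which is the L2 maximum but not the L1 maximum
xveg2 <- xveg
xveg2$tot[1] <- 30

# The %in% version now also returns the L1/"a" row
leaky <- xveg2[with(xveg2, tot %in% tapply(tot, loc, max)), ]

# The per-group comparison still returns one row per location
strict <- xveg2[xveg2$tot == ave(xveg2$tot, xveg2$loc, FUN = max), ]
```

On the original data both approaches give the same three rows; they only differ when totals coincide across locations.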
I don't have an easy solution with aggregate, because the function in aggregate needs to return a scalar. But the following should work:

    do.call("rbind", lapply(split(xveg, xveg$loc), function(x) x[which.max(x$tot), ]))

       loc sp tot
    L1  L1  b  60
    L2  L2  e  30
    L3  L3  b  68

-Christos

> -----Original Message-----
> From: r-help-bounces at r-project.org
> [mailto:r-help-bounces at r-project.org] On Behalf Of Monica Pisica
> Sent: Thursday, February 12, 2009 1:58 PM
> To: R help project
> Subject: [R] Aggregrate function
>
> For each location i want to find out which species has the
> maximum total ... so i've tried different ways to do it using aggregate.
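If aggregate is still preferred, one possible route is to let it compute only the scalar it is good at (the maximum total per location) and then merge that back onto the original data to pick up the species. This is a sketch under the assumption that the formula interface of `aggregate` is available in the installed R version; the name `res` is just illustrative.

```r
# Same data as in the original post
loc <- c(rep("L1", 3), rep("L2", 5), rep("L3", 2))
sp  <- c("a", "b", "c", "a", "d", "b", "e", "c", "b", "d")
tot <- c(20, 60, 40, 15, 25, 10, 30, 20, 68, 32)
xveg <- data.frame(loc, sp, tot)

# Step 1: one scalar per location, the maximum total
max_tot <- aggregate(tot ~ loc, data = xveg, FUN = max)

# Step 2: merge on the shared columns (loc and tot) to recover
# the species that carries that maximum
res <- merge(max_tot, xveg)
```

Note that a tie within a location would produce more than one row for that location, unlike the `which.max` version, which keeps only the first.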
It does, and you get exactly what Monica wanted if you take out the "sp" and just return the whole thing. Thanks.

On Thu, Feb 12, 2009 at 5:52 PM, David Winsemius wrote:
> aggregate and by are convenience functions of tapply.
>
> Consider this alternate solution:
>
> xveg[which(xveg$tot %in% with(xveg, tapply(tot, loc, max))),"sp"]
>
> It uses tapply to find the maximums by loc(ations) and then goes
> back into xveg to find the corresponding sp(ecies). You should do
> testing to see whether the handling of ties agrees with your needs.
>
> --
> David Winsemius
>
> On Feb 12, 2:56 pm, "Christos Hatzis" <christos.hat... at nuverabio.com>
> wrote:
>> I don't have an easy solution with aggregate, because the function in
>> aggregate needs to return a scalar.
>> But the following should work:
>>
>> do.call("rbind", lapply(split(xveg, xveg$loc), function(x)
>> x[which.max(x$tot), ]))
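On the point about ties, a small check along these lines may help decide which behaviour is wanted. The data frame `tied` is invented for illustration (it is not from the thread): two species share the maximum at L1.

```r
# Hypothetical data with a tie at L1: species "a" and "b" both total 60
tied <- data.frame(loc = c("L1", "L1", "L2"),
                   sp  = c("a", "b", "c"),
                   tot = c(60, 60, 10))

# which.max() returns the position of the first maximum only,
# so the split/lapply approach keeps one row per location
first_only <- do.call("rbind",
                      lapply(split(tied, tied$loc),
                             function(x) x[which.max(x$tot), ]))

# Comparing each total with its own location's maximum keeps all tied rows
all_ties <- tied[tied$tot == ave(tied$tot, tied$loc, FUN = max), ]
```

Here `first_only` reports only species "a" for L1, while `all_ties` reports both "a" and "b".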