thr3ads.net - R help - [R] retrieve most abundant species by sample unit [Nov 2005]

Graham Watt-Gremm

2005-Nov-08 23:46 UTC

[R] retrieve most abundant species by sample unit

Hi R-users:
[R 2.2 on OSX 10.4.3]
I have a (sparse) vegetation data frame with 500 rows (sampling  
units) and 177 columns (plant species) where the data represent %  
cover. I need to summarize the cover data by returning the names of  
the most dominant and the second most dominant species per plot. I  
reduced the data frame to omit cover below 5%; this is what it looks  
like stacked. I have experimented with tapply(), by(), and some  
functions mentioned in archived postings, but I haven't seen anything  
that answers to this directly. Does anybody have any ideas?

      OBJECTID       PolygonID SpeciesCod AbundanceP
1       15006     ANT-CBG-rr1     Leymol    5.00000
3       15008     ANT-CBG-rr1     Ambcha    5.00000
5       15010      ANT-ESH-27     Atrpat   20.00000
6       15011      ANT-ESH-27     Ambcha   10.00000
11      15016      ANT-ESH-28     Salvir   20.00000
14      15019      ANT-ESH-28     Atrpat    5.00000
18      15023 ANT-POR-Rubarm5     Rubarm   60.00000
19      15024 ANT-POR-Rubarm5     Hedhel   40.00000
25      15030      ECO-CBG-A2     Griint    5.00000
27      15032      ECO-CBG-A2     Anngra    5.00000
38      15043      ECO-CBG-A4     Sperub   50.00000

Regards,
Graham Watt-Gremm

Adaikalavan Ramasamy

2005-Nov-09 00:52 UTC

Your example does not appear to match your description of the problem.

If you have a 500x177 matrix and want to find the largest and
second largest entry in each row, you can try something like

 m <- matrix( sample( 101:115 ), ncol=3 )

      [,1] [,2] [,3]
 [1,]  102  112  110 
 [2,]  111  106  104
 [3,]  108  101  103
 [4,]  114  115  105
 [5,]  113  107  109

 t( apply( m, 1, function(x){ 
           r <- rank(-x); c( which(r==1), which(r==2) ) } ) )

      [,1] [,2]
 [1,]    2    3
 [2,]    1    2
 [3,]    1    3
 [4,]    2    1 
 [5,]    1    3
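
One caveat with the rank() approach: with its default ties.method = "average", tied values share an averaged rank, so a row whose two largest covers are equal has no entry with rank exactly 1 and nothing is returned for it. A minimal sketch that sidesteps this by using order() instead, on a made-up matrix (the values and the species names spA to spD are hypothetical, not from the thread):

```r
# Hypothetical 3 x 4 cover matrix with a tie in the first row.
m <- matrix(c(10, 10,  0, 5,
               0, 30, 20, 0,
               7,  0,  0, 0),
            nrow = 3, byrow = TRUE,
            dimnames = list(NULL, c("spA", "spB", "spC", "spD")))

# order() leaves ties in their original position, so every row
# yields exactly two column indices.
top2 <- t(apply(m, 1, function(x) order(x, decreasing = TRUE)[1:2]))
top2

# Translate the column indices into species names.
top2_names <- cbind(first  = colnames(m)[top2[, 1]],
                    second = colnames(m)[top2[, 2]])
top2_names
```

Note that a row with only one non-zero species still gets a "second" column index here, so you may want to check the cover value behind it.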

This relies on the fact that all entries in a column always refer to the
same species. If you have stacked data (especially where the species
appear in an irregular manner), then it becomes slightly trickier to
find an elegant solution.
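
For the stacked layout shown in the original post, one possible sketch is to sort by plot and descending cover, then take the first two species per plot. The data frame name veg_long is an assumption; the column names and the rows are copied from the posted example (only a subset of the rows is used):

```r
# A few rows from the original post, rebuilt as a data frame.
veg_long <- data.frame(
  PolygonID  = c("ANT-CBG-rr1", "ANT-CBG-rr1", "ANT-ESH-27", "ANT-ESH-27",
                 "ANT-ESH-28", "ANT-ESH-28", "ECO-CBG-A4"),
  SpeciesCod = c("Leymol", "Ambcha", "Atrpat", "Ambcha",
                 "Salvir", "Atrpat", "Sperub"),
  AbundanceP = c(5, 5, 20, 10, 20, 5, 50),
  stringsAsFactors = FALSE
)

# Sort by plot, then by descending cover within each plot.
o <- order(veg_long$PolygonID, -veg_long$AbundanceP)
sorted <- veg_long[o, ]

# Per plot, keep the first and second species; the second is NA
# when a plot has only one species left after filtering.
dominants <- t(sapply(split(sorted$SpeciesCod, sorted$PolygonID),
                      function(s) c(first = s[1], second = s[2])))
dominants
```

Ties (such as the two 5% species in ANT-CBG-rr1) are resolved by whichever row comes first in the input, so they deserve a manual check.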

Regards, Adai




Dave Roberts

2005-Nov-09 17:02 UTC

Graham,

It's relatively easy to do, especially the first one.

Let's suppose your veg data frame is called veg:

 > dom1 <- apply(veg,1,which.max)

returns a vector with the column number of the species with the highest 
abundance for each sample (if there are ties, it returns the first one).
If you're concerned about ties, you can check to see how many there are 
with

for (i in 1:nrow(veg)) print(sum(veg[i,]==veg[i,dom1[i]]))

There may be ways to eliminate the for loop, but this works.
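
As a rough sketch of one loop-free way to count ties, you can compare each row against its own maximum and sum the matches. The small veg data frame below is a hypothetical stand-in (species names spA to spD are invented), not the real 500 x 177 data:

```r
# Hypothetical stand-in for veg: 3 plots x 4 species, % cover.
veg <- data.frame(spA = c(10, 0, 7), spB = c(10, 30, 0),
                  spC = c(0, 20, 0), spD = c(5, 0, 0))

m <- as.matrix(veg)
row_max <- apply(m, 1, max)        # highest cover in each plot

# A length-nrow vector is recycled down the columns of a matrix,
# so each cell is compared with the maximum of its own row.
n_ties <- rowSums(m == row_max)    # number of species at that maximum
n_ties
```

A count greater than 1 flags a plot where the dominant species is tied.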

If you want the names of the species, rather than column number, you can do

 > names(veg)[dom1]

which will return the species names (assuming they are the column names 
of the data.frame).  Now to get the next most abundant species, zero out 
the dominant species and repeat

 > tmp <- veg
 > for (i in 1:nrow(veg)) tmp[i,dom1[i]] <- 0
 > dom2 <- apply(tmp,1,which.max)
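
The zero-out step can also be written without a loop by using matrix indexing. The sketch below assumes a small hypothetical veg (invented species names) and additionally masks plots that have no second species, since which.max() returns 1 for a row that is all zeros:

```r
# Hypothetical stand-in for veg: 3 plots x 4 species, % cover.
veg <- data.frame(spA = c(10, 0, 7), spB = c(10, 30, 0),
                  spC = c(0, 20, 0), spD = c(5, 0, 0))
m <- as.matrix(veg)
rows <- seq_len(nrow(m))

dom1 <- apply(m, 1, which.max)     # column of the dominant species
tmp <- m
tmp[cbind(rows, dom1)] <- 0        # zero out each plot's dominant species
dom2 <- apply(tmp, 1, which.max)   # column of the next most abundant

# Plots with nothing left after zeroing have no second species.
dom2[tmp[cbind(rows, dom2)] == 0] <- NA

res <- data.frame(first  = colnames(m)[dom1],
                  second = colnames(m)[dom2],
                  stringsAsFactors = FALSE)
res
```

Whether a missing second species should be NA or something else is a design choice; NA just makes those plots easy to find afterwards.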

HTH Dave R


-- 
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
David W. Roberts                                     office 406-994-4548
Professor and Head                                      FAX 406-994-3190
Department of Ecology                         email droberts at montana.edu
Montana State University
Bozeman, MT 59717-3460
