Can anyone point me in the right direction of figuring out what downweight() is doing?

I am using vegan to perform CCA on diatom assemblage data. I have a lot of rare species, so I want to reduce the influence of rare species in my CCA. I have read that some authors reduce rare species by only including species with an abundance of at least 1% in at least one sample (other authors use 5% as a rule, but this removes at least half my species). If I code it as follows:

    cca(downweight(diatoms, fraction=5) ~ ., env)

It is clearly not removing these species entirely from analysis, as some authors suggest. So I am wondering: what is downweight() doing exactly? I assume it is somehow ranking the species and reducing their abundance values based on their rank, but I'm not entirely sure and can't seem to figure out how to look at the code (R novice here). Nor can I find a clear description within the documentation (although I may be looking in all the wrong places).

So, my inclination is to remove species that are very rare (max abundance < 1%) prior to the CCA and then use the downweight function (fraction = 5?) in my CCA (as above). This way, I can include most of my species, but overall still reduce the impact of rare species.

Any advice is appreciated. Thanks!

--
View this message in context: http://r.789695.n4.nabble.com/ordination-in-vegan-what-does-downweight-do-tp4010352p4010352.html
Sent from the R help mailing list archive at Nabble.com.

kelsmith <kelsmith <at> usgs.gov> writes:

>
> Can anyone point me in the right direction of figuring out what downweight()
> is doing?
>
> I am using vegan to perform CCA on diatom assemblage data. I have a lot of
> rare species, so I want to reduce the influence of rare species in my CCA. I
> have read that some authors reduce rare species by only including species
> with an abundance of at least 1% in at least one sample (other authors use
> 5% as a rule, but this removes at least half my species). If I code it as
> follows:
>
> cca(downweight(diatoms, fraction=5) ~ ., env)
>
> It is clearly not removing these species entirely from analysis, as some
> authors suggest. So I am wondering: what is downweight() doing exactly? I
> assume it is somehow ranking the species and reducing their abundance values
> based on their rank, but I'm not entirely sure and can't seem to figure out
> how to look at the code (R novice here). Nor can I find a clear description
> within the documentation (although I may be looking in all the wrong
> places).
>
> So, my inclination is to remove species that are very rare (max abundance <
> 1%) prior to the CCA and then use the downweight function (fraction = 5?) in
> my CCA (as above). This way, I can include most of my species, but overall
> still reduce the impact of rare species.
>

Dear "kelsmith",

First a question: how do you *know* that rare species influence the result? I know many people *believe* that rare species have an unduly high influence in CCA, but that is only ecological folklore with no empirical basis. You could have a look at the influence: run ordinations with all species and without rare species, and compare the resulting ordinations using the procrustes() function in vegan. If the site/lake/sample/core/river results are very different, then rare species indeed were influential (use plot() so that you don't get misled by numbers).
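
A minimal sketch of that comparison follows. It is only an illustration: the built-in vegan data sets varespec and varechem stand in for the poster's diatoms and env, and the cutoff (maximum abundance of at least 1 in some sample) is an assumed rarity rule, not a recommendation.

```r
library(vegan)

# Built-in example data stand in for the poster's `diatoms` and `env`
data(varespec)   # community matrix (sites x species)
data(varechem)   # environmental variables for the same sites

# Assumed rarity rule for illustration only: keep species whose
# maximum abundance reaches 1 in at least one sample
common <- varespec[, apply(varespec, 2, max) >= 1, drop = FALSE]

# Same constrained ordination with all species and with common species only
ord_all    <- cca(varespec ~ ., data = varechem)
ord_common <- cca(common ~ ., data = varechem)

# Procrustes rotation of one set of site scores onto the other
pro <- procrustes(ord_all, ord_common, symmetric = TRUE)
pro          # prints the Procrustes sum of squares
plot(pro)    # arrows show how far each site moves between the two ordinations
```

Short arrows in the plot would suggest that dropping the rare species barely changed the site configuration. protest() in vegan gives a permutation test of the same comparison.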

Rare species often are extreme in CA-family results, but that does not mean that they influenced the results. CA-family methods are weighted ordination methods, and for species the weights are their total abundances (marginal totals). For rare species the weights are low. The species are blown to the outskirts of the ordination after the ordination rotation, and they do not often influence that rotation very much.

I see that I neglected to explain downweighting in detail in the vegan documentation. You must read the reference cited there (Hill & Gauch): they explain the procedure (at least vaguely: you must read the code to see how it is actually implemented). However, the principle is simple: as I wrote above, rare species have low weights, and downweighting makes these weights even smaller. It does not remove the species, but it makes them even less influential.

In vegan, downweighting was implemented for decorana, because it was part of Mark Hill's original decorana code. However, it was implemented in R (instead of the original FORTRAN) and made independent of decorana so that it could be used with other functions. This was done to serve those people who wanted to use it outside its original context (greetings to Oslo!).

If you do not know what downweighting does, I suggest you don't use it. I also suggest that you check the influence of rare species before you remove them from your data.

Cheers, Jari Oksanen

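
To see the real code, load vegan and type downweight without parentheses at the R prompt; R then prints the function's source. The sketch below only illustrates the principle described above, in base R. It is written from one reading of the procedure and is an assumption, not vegan's own implementation: the helper name downweight_sketch, the use of Hill's N2 as the rarity measure, and the toy data are all hypothetical.

```r
# Hypothetical helper illustrating the principle; NOT vegan's own code.
# Assumption: rarity is measured by Hill's N2 (effective number of
# occurrences), and species whose N2 falls below max(N2) / fraction are
# scaled down in proportion to N2. Nothing is removed.
downweight_sketch <- function(veg, fraction = 5) {
  veg <- as.matrix(veg)
  tot <- colSums(veg)
  # Hill's N2 per species; the small constant avoids division by zero
  n2 <- tot^2 / (colSums(veg^2) + 1e-10)
  cutoff <- max(n2) / fraction
  # Weight 1 for common species, below 1 for rare ones
  w <- ifelse(n2 < cutoff, n2 / cutoff, 1)
  out <- sweep(veg, 2, w, "*")
  attr(out, "weights") <- w
  out
}

# Toy data: 10 samples, one common, one intermediate, one rare species
toy <- cbind(common = rep(10, 10),
             medium = c(2, 2, 2, 2, 0, 0, 0, 0, 0, 0),
             rare   = c(5, rep(0, 9)))

dw <- downweight_sketch(toy, fraction = 5)
attr(dw, "weights")   # per-species multipliers
colSums(toy)          # species weights in CA before downweighting
colSums(dw)           # and after: the rare species' total shrinks further
```

In this toy case the rare species keeps its column but its abundances are roughly halved, while the other two are left untouched, which is the behaviour the original poster observed: the species stay in the analysis with less influence.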
On Mon, 2011-11-07 at 10:24 -0800, kelsmith wrote:
> Can anyone point me in the right direction of figuring out what downweight()
> is doing?
>
> I am using vegan to perform CCA on diatom assemblage data. I have a lot of
> rare species, so I want to reduce the influence of rare species in my CCA. I
> have read that some authors reduce rare species by only including species
> with an abundance of at least 1% in at least one sample (other authors use
> 5% as a rule, but this removes at least half my species).

That is not what downweight() is for. If you want this sort of selection, see chooseTaxa() in my analogue package

    diat.sel <- chooseTaxa(diatoms, max.abun = 1, type = "AND")

or, if proportions not percent

    diat.sel <- chooseTaxa(diatoms, max.abun = 0.01, type = "AND")

This sort of indexing is trivial (but I made a typo in the current version so type = "OR" won't work) so you can study the code of analogue:::chooseTaxa.default once I fix the CRAN version or on R-Forge now:

https://r-forge.r-project.org/scm/viewvc.php/pkg/R/chooseTaxa.R?view=markup&root=analogue

Jari has addressed the other part of your question.

Jari also mentioned the issue about whether you should be removing or downweighting rare species. Many people, especially diatomists, do this for practical purposes in a routine fashion because their data sets are especially speciose and have a large proportion of low abundance taxa. As a general matter of routine practice, I don't think this is a very good way of working, especially as we have no good ecological grounds for doing so and who knows what information these species could be telling us if we just listened to them instead of discarding them.

HTH

G

-- 
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
 Dr. Gavin Simpson             [t] +44 (0)20 7679 0522
 ECRC, UCL Geography,          [f] +44 (0)20 7679 0565
 Pearson Building,             [e] gavin.simpsonATNOSPAMucl.ac.uk
 Gower Street, London          [w] http://www.ucl.ac.uk/~ucfagls/
 UK. WC1E 6BT.                 [w] http://www.freshwaters.org.uk
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
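
For readers who want to see why this sort of indexing is called trivial, here is a base R sketch of a maximum-abundance selection. It is not the analogue code; the toy diatoms data frame, the taxon names, and the 1% cutoff are made up for illustration.

```r
# Hypothetical toy data: 4 samples x 3 taxa, abundances in percent
diatoms <- data.frame(taxonA = c(40, 55, 30, 60),
                      taxonB = c(0.5, 0, 0.8, 0.2),
                      taxonC = c(0, 3, 0, 0))

# Maximum abundance of each taxon across all samples
max_abun <- sapply(diatoms, max)

# Keep taxa that reach at least 1% in at least one sample
keep <- max_abun >= 1
diat_sel <- diatoms[, keep, drop = FALSE]

names(diat_sel)   # taxa that survive the cutoff
```

Here taxonB never reaches 1% and is dropped, while taxonC is kept on the strength of a single sample. With proportions instead of percentages the cutoff would be 0.01.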