Hi,

There's a very good ordination web page by Mike Palmer aimed at ecologists (and since you have a species x site matrix, I'm assuming that describes you) at
http://ordination.okstate.edu/

My recommendation is generally nonmetric multidimensional scaling (principal coordinates analysis is a metric scaling ordination), with a dissimilarity metric that doesn't consider joint absences, for example Bray-Curtis/Sorensen.

Treating absent species as missing data is not a good idea, because while it may not be possible to say that they are truly missing from that site (depending on taxa and sampling methods), you at least know they aren't common at that site. Ecological data are messy enough without discarding information!

There are several R packages that may be helpful, including ecodist and vegan.

Sarah

On 4/1/07, Milton Cezar Ribeiro <milton_ruser at yahoo.com.br> wrote:
> Dear R-gurus
>
> I have a data.frame with abundance data for species and sites which looks like:
> mydf<-data.frame(
> sp1=sample(0:10,5,replace=T),
> sp2=sample(0:20,5,replace=T),
> sp3=sample(0:4,5,replace=T),
> sp4=sample(0:2,5,replace=T))
> rownames(mydf)<-paste("sites",1:5,sep="")
>
> I would like to run an ordination analysis of these data, and my worry is about the "zeros" (absence of species) in the matrix. From what I have read (Gotelli - A Primer of Ecological Statistics, 2004), when I have abundance data I can't compute Euclidean distances, because the zeros mean absence of the species and not a zero count. Gotelli suggests doing a "principal coordinates analysis". I would like to hear what you think about this, and which packages and functions are best for computing my distance matrices and doing my ordination analysis. Can I treat zero as NA in my data.frame? Is there a good PDF book about multivariate analysis for abundance data available on the web?
>
> Kind regards
>
> Miltinho
> Brazil

-- 
Sarah Goslee
http://www.functionaldiversity.org
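As an illustration of the Bray-Curtis suggestion above, here is a minimal base-R sketch. The function name bray_curtis and the comm table are made up for this example (comm stands in for mydf, with fixed counts so the numbers are reproducible). In practice vegan's vegdist(comm, method = "bray") computes this index and metaMDS() runs the NMDS. A species absent from both sites of a pair adds nothing to either sum, so appending a species that is absent everywhere should leave every dissimilarity unchanged.

```r
# Hypothetical site x species table with fixed counts (stand-in for mydf).
comm <- data.frame(
  sp1 = c(3, 0, 7, 1, 0),
  sp2 = c(12, 5, 0, 0, 20),
  sp3 = c(0, 4, 1, 0, 2),
  sp4 = c(1, 0, 0, 2, 0),
  row.names = paste("sites", 1:5, sep = "")
)

# Bray-Curtis dissimilarity between all pairs of rows (sites):
# sum(|x_i - x_j|) / sum(x_i + x_j). Assumes no site is completely empty.
bray_curtis <- function(x) {
  x <- as.matrix(x)
  n <- nrow(x)
  d <- matrix(0, n, n, dimnames = list(rownames(x), rownames(x)))
  for (i in seq_len(n)) {
    for (j in seq_len(n)) {
      d[i, j] <- sum(abs(x[i, ] - x[j, ])) / sum(x[i, ] + x[j, ])
    }
  }
  as.dist(d)
}

d <- bray_curtis(comm)

# Add a species that is absent from every site (joint absences only).
d_extra <- bray_curtis(cbind(comm, sp5 = 0))

print(round(d, 2))
# TRUE: the all-zero species changes nothing.
print(isTRUE(all.equal(as.vector(d), as.vector(d_extra))))
```

Sites 4 and 5 share no species in this table, so their dissimilarity is the maximum value of 1.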
There are many ways to do this, really. For example, if you use constrained (~ canonical) correspondence analysis, the distance measure between sites is Chi-square and absences are not informative to the analysis. Or you can use an ecological distance measure (similarity indices like Soerensen, Bray-Curtis, Jaccard, and others) and perform principal coordinates analysis (= classical, or metric, multidimensional scaling), etc.

Read the documentation and tutorials for the packages vegan, ade4 and labdsv. You might start your search at the page of Jari Oksanen:
http://cc.oulu.fi/~jarioksa/softhelp/vegan.html
or the one from Dave Roberts:
http://ecology.msu.montana.edu/labdsv/R/

The vegan tutorial was useful for me to learn to use vegan:
http://cc.oulu.fi/~jarioksa/opetus/metodi/vegantutor.pdf

If you need more in-depth mathematical details, you should take a look at Daniel Chessel's site:
http://pbil.univ-lyon1.fr/R/perso/pagechessel.html
There are plenty of PDFs available for download (however, some are suited for beginners, others require more background knowledge).

Be warned: there is a large variety of techniques for multivariate analysis with different properties and weaknesses, and sometimes the most popular analyses are not the most appropriate. Be sure of what you want and what you are doing before you perform the analysis; the interpretation will depend on the techniques applied.

I personally find that ade4 implements many different techniques but is poorly documented, and some functionality is somehow "hidden", while vegan provides more information about the functions and is perfect for getting started. I haven't used labdsv yet.

Hope this helps,
JR

On Sun, 01-04-2007 at 09:20 -0700, Milton Cezar Ribeiro wrote:
> [...]
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
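The "ecological distance plus principal coordinates" route mentioned above can be sketched in base R alone. This is only an illustration under stated assumptions: the comm table is hypothetical (fixed counts standing in for mydf), and it relies on dist(method = "binary") in the stats package giving the Jaccard dissimilarity on presence/absence data. The packages named in the message provide more indices and more complete ordination tools.

```r
# Hypothetical site x species table with fixed counts (stand-in for mydf).
comm <- data.frame(
  sp1 = c(3, 0, 7, 1, 0),
  sp2 = c(12, 5, 0, 0, 20),
  sp3 = c(0, 4, 1, 0, 2),
  sp4 = c(1, 0, 0, 2, 0),
  row.names = paste("sites", 1:5, sep = "")
)

# Reduce abundances to presence/absence (1 = present, 0 = absent).
pa <- (as.matrix(comm) > 0) * 1

# Jaccard dissimilarity: species found in only one of two sites, divided by
# species found in at least one of them. Joint absences are ignored.
d_jac <- dist(pa, method = "binary")

# Principal coordinates analysis (classical metric scaling) on two axes.
pcoa <- cmdscale(d_jac, k = 2, eig = TRUE)

print(round(d_jac, 2))
print(round(pcoa$points, 3))  # site scores on the first two axes
```

Sites 2 and 5 contain the same set of species here, so their Jaccard dissimilarity is 0 even though their abundances differ; an abundance-based index such as Bray-Curtis would separate them.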
On Sun, 2007-04-01 at 09:20 -0700, Milton Cezar Ribeiro wrote:
> [...]

In addition to the other suggestions, there is a Task View on CRAN for the topic of Environmetrics. This has a section describing the various ordination techniques available in R as well as functions to calculate distance/dissimilarity matrices:
http://cran.r-project.org/src/contrib/Views/Environmetrics.html

G

-- 
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
Gavin Simpson                 [t] +44 (0)20 7679 0522
ECRC                          [f] +44 (0)20 7679 0565
UCL Department of Geography
Pearson Building              [e] gavin.simpsonATNOSPAMucl.ac.uk
Gower Street
London, UK                    [w] http://www.ucl.ac.uk/~ucfagls/
WC1E 6BT                      [w] http://www.freshwaters.org.uk/
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
Milton Cezar Ribeiro <milton_ruser <at> yahoo.com.br> writes:
> [...]

Other people have already suggested what to do with these data and where to find PDF texts. I only comment on some points raised in the original question.

Firstly, Euclidean distance is quite OK with zeros, or at least as good as any other normal dissimilarity index is with zeros. Euclidean distance on non-transformed data is poor for other reasons (it takes squared differences, emphasizing abundance, and even when two sites have nothing in common, Euclidean distance varies with total abundances). Using Principal Co-ordinates analysis does not change this, since it can also be run with Euclidean distances. However, there are many packages providing "better" dissimilarity indices or transformations that make Euclidean distances more useful (such as the Hellinger transformation).

Another question is more abstract: indeed, you may regard most zeros as missing data. The species probably could occur at your sample site, more or less, but it was too scarce to be observed. How to do this in practice is the tricky issue. You cannot simply change zeros to NA, since then the dissimilarities (if they don't fail) will really give a special significance to these cells. Regarding them as zeros certainly makes more sense than removing *pairs* of data where a species is NA in one site and present in another. There are ways to have something like handling zeros as missing values of various degrees(!), but my decency prohibits me from writing about these methods.

cheers, jari oksanen
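Two of the points above can be checked with a few lines of base R. This is a sketch on made-up data: the three sites A, B and C are hypothetical, and the Hellinger transformation is written out by hand as the square root of each site's relative abundances (vegan's decostand(x, method = "hellinger") is the packaged equivalent). Site A shares no species with B or C, and C is B with twenty times the abundance.

```r
# Three hypothetical sites; A has no species in common with B or C.
x <- rbind(
  A = c(sp1 = 10, sp2 = 5, sp3 = 0,  sp4 = 0),
  B = c(sp1 = 0,  sp2 = 0, sp3 = 2,  sp4 = 1),
  C = c(sp1 = 0,  sp2 = 0, sp3 = 40, sp4 = 20)
)

# Raw Euclidean distance: A-B and A-C differ widely although both pairs
# have nothing in common, because the distance tracks total abundance.
raw_d <- dist(x)

# Hellinger transformation: square root of row-wise relative abundances.
# Each row then has unit length, so sites with no shared species are
# always sqrt(2) apart, whatever their totals.
hel <- sqrt(x / rowSums(x))
hel_d <- dist(hel)

# Recoding zeros as NA: dist() drops every species with an NA in either
# site, so A-B and A-C have no species left to compare and come out as NA.
x_na <- x
x_na[x_na == 0] <- NA
na_d <- dist(x_na)

print(round(raw_d, 2))
print(round(hel_d, 3))
print(na_d)
```

With the zeros recoded, only the B-C pair still has a defined distance, which is one way the dissimilarities can "fail" as described above.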