Zhang Jian
2007-Jul-07 22:18 UTC
[R] How to calculate the index "the number of species combinations"?
I want to analyze the co-occurrence of some species. In some papers, the authors say that the index "the number of species combinations (COMBO)" is a good index. I tried to calculate the index in R, but I cannot get the right value. I think that I do not understand the concept of the index because my English is not good.

The concept:

*The number of species combinations*
This index scans the columns of the presence-absence matrix and keeps track of the number of unique species combinations that are represented in different sites. For an assemblage of n species, there are 2n possible species combinations, including the combination of no species being present (Pielou and Pielou 1968). In most real matrices, the number of sites (= columns) is usually substantially less than 2n, which places an upper bound on the number of species combinations that can be found in both the observed and the simulated matrices.

Presence-absence data (each row represents a different species and each column represents a different site; a "1" indicates that a species is present at a particular site, and a "0" indicates that it is absent from that site):

Species Cuba Hispaniola Jamaica Puerto_Rico Guadeloupe Martinique Dominica St._Lucia Barbados St._Vincent Grenada Antigua St._Croix Grand_Cayman St._Kitts Barbuda Montserrat St._Martin St._Thomas
Carduelis_dominicensis 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Loxia_leucoptera 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Volatinia_jacarina 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0
Sporophila_nigricollis 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0
Melopyrrha_nigra 1 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0
Loxigilla_portoricensis 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Loxigilla_violacea 0 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Loxigilla_noxis 0 0 0 0 1 1 1 1 1 1 1 1 0 0 1 1 1 1 0
Melanospiza_richardsoni 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0
Tiara_olivacea 1 1 1 1 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0
Tiara_bicolor 0 1 1 1 1 1 1 1 1 1 1 1 1 0 1 1 1 1 1
Tiara_canora 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Loxipasser_anoxanthus 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Saltator_albicollis 0 0 0 0 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0
Torreornis_inexpectata 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Ammodramus_savannarum 0 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Zonotrichia_capensis 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

About the data: I calculated the index "COMBO" with other software, and the value of the index is 10. Can you help me calculate it, or give me a detailed explanation of the concept of the index?
Sarah Goslee
2007-Jul-07 22:56 UTC
[R] How to calculate the index "the number of species combinations"?
It should be the number of unique sites. In this case, that is the number of unique columns in the data frame. See ?unique. (Interestingly, the usual convention is that species are columns and sites are rows.)

For your sample data you only see 10 of the 2^17 possible combinations of 17 species (not 2n).

Sarah

--
Sarah Goslee
http://www.functionaldiversity.org
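A minimal sketch of the suggestion above, assuming a presence-absence matrix with species in rows and sites in columns as in the original post. The toy matrix `S` and its labels are made up for illustration; `unique()` with `MARGIN = 2` keeps the distinct columns of a matrix, so counting them gives the number of species combinations actually observed.

```r
# Hypothetical toy data: 3 species (rows) x 5 sites (columns).
# matrix() fills column by column, so each line below is one site.
S <- matrix(c(1, 0, 0,   # site1
              1, 0, 0,   # site2: same combination as site1
              0, 1, 1,   # site3
              0, 0, 0,   # site4: no species present
              0, 1, 1),  # site5: same combination as site3
            nrow = 3,
            dimnames = list(paste0("sp", 1:3), paste0("site", 1:5)))

# Number of distinct columns = number of species combinations observed.
combo <- ncol(unique(S, MARGIN = 2))

# Upper bound: no more combinations than sites, and no more than 2^n.
upper <- min(ncol(S), 2^nrow(S))

print(combo)  # 3
print(upper)  # 5
```

If the data follow the more common layout (sites in rows, species in columns), the same count would be `nrow(unique(S))`.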
(Ted Harding)
2007-Jul-07 23:46 UTC
[R] How to calculate the index "the number of species combin
On 07-Jul-07 22:18:42, Zhang Jian wrote:
> I want to analyze the co-occurrence of some species. In some
> papers, the authors said that the index"the number of species
> combinations (COMBO)" is a good index. I try to calculate the
> index by R language. But I can not get the right value. I think
> that I do not understand the concept of the index because my
> english is not good.
>
> The concept:
> *The number of species combinations *This index scans the
> columns of the presence-absence matrix and keeps track of the
> number of unique species combinations that are represented in
> different sites. For an assemblage of n species, there are 2n

[I think this should be 2^n]

> possible species combinations, including the combination of no
> species being present (Pielou and Pielou 1968). In most real
> matrices, the number of sites (= columns) is usually substantially
> less than 2n, which places an upper bound on the number of species
> combinations that can be found in both the observed and the
> simulated matrices.

English good or bad, I found the above description not easy to follow! But, since I could see only one thing it could mean if it was intended to give a unique definition, I decided to test on your data the hypothesis that "Species Combinations" means the number of distinct columns in the data matrix.
I took your species-incidence data (not reproduced here) and converted it to a matrix S of 0s and 1s with 19 columns and 17 rows (though the following would work just as well if it were a data frame):

  S
     V1 V2 V3 V4 V5 V6 V7 V8 V9 V10 V11 V12 V13 V14 V15 V16 V17 V18 V19
  1   0  1  0  0  0  0  0  0  0   0   0   0   0   0   0   0   0   0   0
  2   0  1  0  0  0  0  0  0  0   0   0   0   0   0   0   0   0   0   0
  3   0  0  0  0  0  0  0  0  0   0   1   0   0   0   0   0   0   0   0
  4   0  0  0  0  0  0  0  0  0   0   1   0   0   0   0   0   0   0   0
  5   1  0  0  0  0  0  0  0  0   0   0   0   0   1   0   0   0   0   0
  6   0  0  0  1  0  0  0  0  0   0   0   0   0   0   0   0   0   0   0
  7   0  1  1  0  0  0  0  0  0   0   0   0   0   0   0   0   0   0   0
  8   0  0  0  0  1  1  1  1  1   1   1   1   0   0   1   1   1   1   0
  9   0  0  0  0  0  0  0  1  0   0   0   0   0   0   0   0   0   0   0
  10  1  1  1  1  0  0  0  0  0   0   0   0   0   1   0   0   0   0   0
  11  0  1  1  1  1  1  1  1  1   1   1   1   1   0   1   1   1   1   1
  12  1  0  0  0  0  0  0  0  0   0   0   0   0   0   0   0   0   0   0
  13  0  0  1  0  0  0  0  0  0   0   0   0   0   0   0   0   0   0   0
  14  0  0  0  0  1  1  1  1  0   0   0   0   0   0   0   0   0   0   0
  15  1  0  0  0  0  0  0  0  0   0   0   0   0   0   0   0   0   0   0
  16  0  1  1  1  0  0  0  0  0   0   0   0   0   0   0   0   0   0   0
  17  0  1  0  0  0  0  0  0  0   0   0   0   0   0   0   0   0   0   0

There are 10 different columns in there, as can be found by

  t(unique(t(S)))
     V1 V2 V3 V4 V5 V8 V9 V11 V13 V14
  1   0  1  0  0  0  0  0   0   0   0
  2   0  1  0  0  0  0  0   0   0   0
  3   0  0  0  0  0  0  0   1   0   0
  4   0  0  0  0  0  0  0   1   0   0
  5   1  0  0  0  0  0  0   0   0   1
  6   0  0  0  1  0  0  0   0   0   0
  7   0  1  1  0  0  0  0   0   0   0
  8   0  0  0  0  1  1  1   1   0   0
  9   0  0  0  0  0  1  0   0   0   0
  10  1  1  1  1  0  0  0   0   0   1
  11  0  1  1  1  1  1  1   1   1   0
  12  1  0  0  0  0  0  0   0   0   0
  13  0  0  1  0  0  0  0   0   0   0
  14  0  0  0  0  1  1  0   0   0   0
  15  1  0  0  0  0  0  0   0   0   0
  16  0  1  1  1  0  0  0   0   0   0
  17  0  1  0  0  0  0  0   0   0   0

Of the "missing" columns, it can be seen that V6 and V7 are the same as V5; that V10, V12, V15, V16, V17 and V18 are the same as V9; and that V19 is the same as V13. Hence the interpretation that "COMBO" means the number of distinct columns is confirmed.

If that is really the case, then a very simple R function can be written for it:

  COMBO <- function(S) { ncol(t(unique(t(S)))) }
  COMBO(S)
  [1] 10

> About the data, I calculated the index "COMBO" by other software.
> The value of the index is 10.
> Can you help me to calculate or give me some detalied explain about
> the concept of the index?

See above! I hope it helps.

Ted.

--------------------------------------------------------------------
E-Mail: (Ted Harding) <ted.harding at nessie.mcc.ac.uk>
Fax-to-email: +44 (0)870 094 0861
Date: 08-Jul-07 Time: 00:46:08
------------------------------ XFMail ------------------------------
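For anyone who wants to reproduce this from the named data rather than the V1..V19 matrix, here is a hedged sketch. It assumes the table from the original post is pasted in as text; the helper `site_groups()` is a made-up name for illustration, not a function from any package. It uses `unique(S, MARGIN = 2)`, which should be equivalent to the `t(unique(t(S)))` form above for a matrix.

```r
# Presence-absence data copied from the original post
# (rows = species, columns = sites).
txt <- "Species Cuba Hispaniola Jamaica Puerto_Rico Guadeloupe Martinique Dominica St._Lucia Barbados St._Vincent Grenada Antigua St._Croix Grand_Cayman St._Kitts Barbuda Montserrat St._Martin St._Thomas
Carduelis_dominicensis 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Loxia_leucoptera 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Volatinia_jacarina 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0
Sporophila_nigricollis 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0
Melopyrrha_nigra 1 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0
Loxigilla_portoricensis 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Loxigilla_violacea 0 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Loxigilla_noxis 0 0 0 0 1 1 1 1 1 1 1 1 0 0 1 1 1 1 0
Melanospiza_richardsoni 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0
Tiara_olivacea 1 1 1 1 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0
Tiara_bicolor 0 1 1 1 1 1 1 1 1 1 1 1 1 0 1 1 1 1 1
Tiara_canora 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Loxipasser_anoxanthus 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Saltator_albicollis 0 0 0 0 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0
Torreornis_inexpectata 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Ammodramus_savannarum 0 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Zonotrichia_capensis 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0"

# First column becomes the row names; keep the site names unchanged.
S <- as.matrix(read.table(text = txt, header = TRUE, row.names = 1,
                          check.names = FALSE))

# COMBO as interpreted in this thread: the number of distinct columns.
COMBO <- function(S) ncol(unique(S, MARGIN = 2))
combo <- COMBO(S)

# Hypothetical helper: list which sites share each species combination.
site_groups <- function(S) {
  key <- apply(S, 2, paste, collapse = "")  # one 0/1 string per site
  unname(split(colnames(S), key))
}
groups <- site_groups(S)

print(combo)   # 10
print(groups)
```

Printing `groups` shows which islands have identical species lists, which is a quick way to check the count by eye.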