Hello Gilson,
On 06/05/2013, at 05:34 AM, Gilson Carvalho wrote:
> Dear all,
>
> Does anyone know why the results of BIOENV (PRIMER v. 6.1.15) are
> different from those of bioenv() + mantel() in vegan? Not the Spearman
> correlation, but the pseudo-p value.
>
> I know that the bioenv() + mantel() approach is biased. So how does BIOENV
> (PRIMER) end up with larger (permuted) p values?
>
I cannot give a firm answer, because I know only half of the problem: I have
never used PRIMER nor seen any version of its manual, and I can only answer for
vegan.
I take your message to mean that PRIMER has a permutation test for BIOENV. Vegan
has no such test, so the two cannot be compared directly. It appears that you
tried bioenv() + mantel() in vegan, and you say you know that it is biased. It
certainly is biased, so I cannot understand why you are surprised to get
biased results. I don't know PRIMER, but chances are that it does things
correctly and gives unbiased results. In that case you should see exactly the
kind of bias you observed: P-values that are too low (too significant) in
bioenv() + mantel() in vegan. (There are also some technical points that you
must take care of -- more at the bottom of the message.)
The bias in bioenv() + mantel() arises because you select variables in bioenv()
to maximize the correlation, and then treat these selected variables in
mantel() as if they had been chosen a priori (not selected). The selection
procedure must be a part of assessing the significance. It may be so in
PRIMER, but I don't know.
I tested this in vegan using the varespec and varechem data. Here bioenv()
selected a five-variable model (N P Al Mn Baresoil), which I then fed into
mantel(). In addition, I made a permutation test for bioenv(): I permuted the
data, reran bioenv() to select the best set of variables for each permutation,
and collected the maximum correlation from each run. In mantel() with the fixed,
pre-selected set of variables, the five-number summary of the 999 permuted
correlations was Min = -0.267, 1st Qu = -0.053, Median = 0.000, 3rd Qu = 0.052,
Max = 0.278. With exactly the same permutations, bioenv() gave the five-number
summary -0.006 (min), 0.129, 0.179 (median), 0.225, 0.423 (max), or nearly 0.2
units higher. Because bioenv() selects the variables to maximize the
correlation in every permutation, its null values are much higher (median 0.18
in bioenv() versus 0.00 in mantel()). Consequently, the P-values can be much
higher (less significant) in a correctly performed bioenv() test.
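A permutation test of this kind could be sketched roughly as follows. This is
only an illustration and not the exact script behind the numbers above: the
helper best_r_permuted() is a hypothetical name, the six-variable candidate set
and the 99 permutations are assumptions chosen to keep the run time short, and
the sites of the environmental table are shuffled as one plausible way of
permuting the data.

```r
library(vegan)

data(varespec)
data(varechem)

# Assumed candidate set: a small subset of varechem so the sketch runs fast.
# The test described above searched a larger set of variables.
env <- varechem[, c("N", "P", "Al", "Mn", "Baresoil", "pH")]

# Observed best-subset correlation (bioenv() defaults: Spearman, Bray-Curtis).
obs <- bioenv(varespec, env)
obs_r <- obs$models[[obs$whichbest]]$est

# Hypothetical helper: shuffle the sites of the environmental table, repeat
# the whole variable selection, and return the best correlation of that run.
best_r_permuted <- function(comm, env) {
  shuffled <- env[sample(nrow(env)), , drop = FALSE]
  fit <- bioenv(comm, shuffled)
  fit$models[[fit$whichbest]]$est
}

set.seed(1)
nperm <- 99  # kept small for illustration only
null_r <- replicate(nperm, best_r_permuted(varespec, env))

# One-sided permutation P-value with the selection step inside the test.
p_selected <- (sum(null_r >= obs_r) + 1) / (nperm + 1)

fivenum(null_r)
p_selected
```

The point of the design is that the variable selection is repeated inside every
permutation, so the null distribution consists of maximized correlations, just
like the observed statistic.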
BTW, if you do this test with bioenv(), I really hope you have a PC with a
multicore CPU. I used parallel processing with eight cores and it was still
really slow (it felt like 30 min, although I didn't check the timing).
This was the bias and how it works, and this alone is sufficient to explain
large differences.
Then some technical details -- you must be careful when comparing the models and
setting up your analyses:
(1) The bioenv paper (Clarke & Ainsworth, Mar Ecol Prog Ser 92, 205-219; 1993)
also introduces a new correlation-like measure that we have not implemented in
vegan::bioenv(). You must be careful to use the same correlation in both tests.
(2) vegan::bioenv() defaults to Spearman correlation, but vegan::mantel()
defaults to Pearson. You must be careful to use the same in both.
(3) vegan::mantel() only performs one-sided tests, whereas some implementations
use two-sided tests. This cannot be changed in vegan, but you must be careful
here. The P-values can be higher (less significant) with two-sided tests.
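Points (2) and (3) can be illustrated with a short sketch. It assumes that the
five selected variables are used as a fixed set and that the environmental
distances are scaled Euclidean, which is my understanding of the bioenv()
default. The two-sided P-value is computed by hand from the stored permutation
statistics, using one common definition of a two-sided test; it is not an
option of mantel().

```r
library(vegan)

data(varespec)
data(varechem)

# Fixed set of variables, treated here as if it had been chosen a priori.
env <- varechem[, c("N", "P", "Al", "Mn", "Baresoil")]

comm_dist <- vegdist(varespec, method = "bray")  # default index in bioenv()
env_dist <- dist(scale(env))                     # scaled Euclidean distances

# mantel() defaults to Pearson, so Spearman must be requested explicitly
# to match the bioenv() default.
m <- mantel(comm_dist, env_dist, method = "spearman", permutations = 999)

m$statistic  # Mantel correlation for the fixed set
m$signif     # one-sided P-value as reported by mantel()

# Two-sided P-value computed by hand from the permuted statistics
# (an assumed definition: compare absolute values).
p_two_sided <- (sum(abs(m$perm) >= abs(m$statistic)) + 1) / (length(m$perm) + 1)
p_two_sided
```

Neither P-value here accounts for the variable selection, so both are subject
to the bias described earlier in this message.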
> Actually, how is the permutation test in BIOENV (PRIMER) conducted? The user
> guide does not make it clear.
>
I think it is best to look at the source code to see how things are really done.
This is easy in vegan which is open-source. I don't know about PRIMER.
Cheers, Jari Oksanen
--
Jari Oksanen, Dept Biology, Univ Oulu, 90014 Finland