thr3ads.net - R help - [R] What is the formula of Pseudo-F statistic in capscale in vegan? [Dec 2013]

If this information is useful, please help other people find it:
Share via:

Kristen Ross

2013-Dec-17 04:56 UTC

[R] What is the formula of Pseudo-F statistic in capscale in vegan?

Dear R-help,

We are conducting a distance-based redundancy analysis using capscale and
then testing for statistical significance for six terms in the model for the
constrained ordination using anova.cca in the vegan package. The
significance test is sequential, i.e., testing for significance of a term
only after accounting for all preceding terms. Could someone please provide
us with either the actual formula for the pseudo-F statistic or a reference
that provides this formula?

We ask because we are doing the exact same analysis in another program
(using distLM in PRIMER 6 v 6.1.13 and PERMANOVA+ v 1.0.3 from PRIMER-E),
but we are getting very different pseudo-F ratios despite specifying the
exact same order of model terms, using the same Bray-Curtis distance measure
(distance matrices produced by the two programs are the same), and using the
same sequential test for significance. Below is a table displaying the order
of the model terms and the pseudo-F values computed by R and by PRIMER (we
also ran the same analysis in CANOCO which showed the same results in
pseudo-F values as PRIMER). We have not been able to figure out why we get
very different pseudo-F values, leading us to believe that R calculates
pseudo-F values differently than PRIMER for the sequential tests.
Furthermore, constrained ordination outputs, i.e., eigenvalues, proportion
of variability explained by each constrained axes, etc. appear to be
identical between the two programs.  Below we provide (1) the table showing
the different pseudo-F values; (2) the formula used by PRIMER to calculate
pseudo-F values that we think is the same being used in R, but need
confirmation; and (3) the R code used for this analysis.  

(1) Table


Model

Social Variables

PRIMER pseudo-F

R pseudo-F


SEQUENTIAL TESTS

GroupSize

1.1904

1.5528


 

Board

1.5079

1.8872


 

MtgStyle

1.1326

1.4007


 

DmStyle

1.0971

1.3437


 

DifView

1.4892

1.7299


 

VolAuton

2.2923

2.2925

 

(2) pseudo-F formula

We know that PRIMER uses the following formula to calculate the pseudo-F for
a sequential test of significance (equation 4.3, Anderson, Gorley, and
Clarke 2008, Chapter 4. Pg. 129, and based on pseudo-F equation in Legendre
and Anderson (1999), Ecological Monographs vol. 69):

 

F= (SSFull - SSReduced)/(qFull-qReduced)

        (SSTotal-SSFull)/(N - qFull - 1)

 (3) R code

## creating Bray-Curtis of Biodiversity data

H.BC <- vegdist(H.Full [,14:211], "bray")

 

## Distance based redundancy analysis (dbRDA)

m1<-capscale(H.BC ~ GroupSize + Board + MtgStyle + DmStyle + DifView +
VolAuton, SScomp [,14:19], distance = "euclidean", add = TRUE)

### NOTE: pseudo-F values are the same with or without correcting for
negative eigenvalues (although they are different from other programs).

 

## Sequential test for terms

anova.cca(m1, by="terms", perm.max=1000, permu=500) 

 

Thank you very much for any help or insights that anyone can provide,

Kristen

 

Kristen A. Ross, PhD

Post Doctoral Researcher 

University of Illinois at Chicago

kristenrossUIC@gmail.com

          and

Research Associate

Green Mountain College

One Brennan Circle

Poultney, VT 05764

 


	[[alternative HTML version deleted]]

Jari Oksanen

2013-Dec-17 09:10 UTC

head link

[R] What is the formula of Pseudo-F statistic in capscale in vegan?

Dear Kristen Ross,

Kristen Ross <guayabitogirl <at> gmail.com> writes:
>tion; and (3) the R code used for this analysis.  
Sorry that I have to remove most of your original message: gmane won't
allow me to post if I add too little compared to the cited text.

This is now wild guessing, since there is nothing I would be able to
reproduce, in particular as I cannot afford buying PRIMER licence and
cannot even see its manual. I have one guess, though. Look at the first
and last item of your table:> 
> (1) Table
> 
>
> 
> PRIMER pseudo-F
> 
> R pseudo-F
> 
> SEQUENTIAL TESTS
> 
> GroupSize
> 
> 1.1904
> 
> 1.5528
>
...> VolAuton
> 
> 2.2923
> 
> 2.2925
> 
I am not quite sure how to read this table, but I *assume* that one
of the numbers comes PRIMER and one from vegan:::capscale (this whole
answer is based on that assumption). As you see, the first two numbers
are very different, and the last two numerically fairly equal. I cannot
decipher the PRIMER formula for pseudo-F you give below, but I guess that
PRIMER changes the denominator of the pseudo-F at every step. I guess
it uses residual SS and residual df after the current term, and that 
would include variation explained by later variables in the sequential
tests as well as their df's. In vegan we always use the same denominator
in all cases with the same degrees of freedom. This denominator, or
scale, is the residual variation and residual df's after all 
explanatory variables (constraints). The vegan way is similar to the
one you get from ordinary anova of lm(). We refuse (and have refused
earlier) to implement anything else. However, using
add1(<capscale-result>,
..., test = "perm") can give you tests that pretend that later 
variables in the sequence are not in the model. 

> (2) pseudo-F formula
> 
> We know that PRIMER uses the following formula to calculate the pseudo-F
for
> a sequential test of significance (equation 4.3, Anderson, Gorley, and
> Clarke 2008, Chapter 4. Pg. 129, and based on pseudo-F equation in Legendre
> and Anderson (1999), Ecological Monographs vol. 69):
> 
> F= (SSFull - SSReduced)/(qFull-qReduced)
> 
>         (SSTotal-SSFull)/(N - qFull - 1)
> 
>  (3) R code
> 
> ## creating Bray-Curtis of Biodiversity data
> 
> H.BC <- vegdist(H.Full [,14:211], "bray")
> 
> ## Distance based redundancy analysis (dbRDA)
> 
> m1<-capscale(H.BC ~ GroupSize + Board + MtgStyle + DmStyle + DifView +
> VolAuton, SScomp [,14:19], distance = "euclidean", add = TRUE)
> 
> ### NOTE: pseudo-F values are the same with or without correcting for
> negative eigenvalues (although they are different from other programs).
>If you here claim that the pseudo-F values are the same in
vegan:::capscale with add=FALSE and add=TRUE, I claim that you are wrong,
or that you made an error. One source of error may be that in the
example above you set 'distance = "euclidean"' in which case
'add'
argument has no effect since you have no negative eigenvalues with
Euclidean distances. However, the example above should *still* give
you negative eigenvalues and change in pseudo-F, because you did two
contradictory things: you supplied non-Euclidean dissimilarities in
input, and you asked for Euclidean distances in the command. In this case,
the 'distance' argument will be silently ignored and input
dissimilarities
will be used. Please check this.

I have no idea what you mean with "correcting" for negative
eigenvalues.
You can have transformation that removes them, but I cannot see 
how that would be a "correction".

Cheers, Jari Oksanen

Seemingly Similar Threads

Search for more possibly parallel threads

R help - Dec 2013 - What is the formula of Pseudo-F statistic in capscale in vegan?

[R] What is the formula of Pseudo-F statistic in capscale in vegan?

[R] What is the formula of Pseudo-F statistic in capscale in vegan?

Seemingly Similar Threads