Dear lmer-ers,

My thanks to all of you who are sharing your trials and tribulations publicly.

I was hoping to elicit some feedback on my thoughts on denominator degrees of freedom for F ratios in mixed models. These thoughts and practices result from my reading of previous postings by Doug Bates and others.

- I start by assuming that the appropriate denominator degrees of freedom lies between n - p and n - q, where n = number of observations, p = number of fixed effects (rank of model matrix X), and q = rank of Z:X.
- I then conclude that good estimates of P values on the F ratios lie between 1 - pf(F.ratio, numDF, n-p) and 1 - pf(F.ratio, numDF, n-q).
- I further surmise that the latter of these (1 - pf(F.ratio, numDF, n-q)) is the more conservative estimate.

When I use these criteria and compare my "ANOVA" table to the results of an analysis of Helmert contrasts using an MCMC sample with highest posterior density intervals, I find that my conclusions (e.g. factor A, with three levels, has a "significant effect" on the response variable) are qualitatively the same.

Comments?

Hank

Dr. Hank Stevens, Assistant Professor
338 Pearson Hall
Botany Department
Miami University
Oxford, OH 45056
Office: (513) 529-4206
Lab: (513) 529-4262
FAX: (513) 529-4243
http://www.cas.muohio.edu/~stevenmh/
http://www.muohio.edu/ecology/
http://www.muohio.edu/botany/
"E Pluribus Unum"
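[Editorial note: Hank's bracketing idea can be sketched numerically in base R. The values below are invented for illustration; in practice F.ratio and numDF would come from the anova() table and n, p, q from the model matrices.]

```r
## Sketch of the p-value bracket described above (hypothetical numbers).
F.ratio <- 4.2   # observed F statistic (made up)
numDF   <- 2     # numerator df, e.g. a 3-level factor
n <- 60          # number of observations
p <- 4           # rank of the fixed-effects model matrix X
q <- 12          # rank of cbind(Z, X)

p.liberal      <- 1 - pf(F.ratio, numDF, n - p)  # larger ddf, smaller p
p.conservative <- 1 - pf(F.ratio, numDF, n - q)  # smaller ddf, larger p
c(lower = p.liberal, upper = p.conservative)
```

If both bounds fall on the same side of your alpha, the qualitative conclusion does not depend on the unknown denominator df.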
Thanks for your summary, Hank.

On 9/7/06, Martin Henry H. Stevens <hstevens at muohio.edu> wrote:
> Dear lmer-ers,
> My thanks to all of you who are sharing your trials and tribulations
> publicly.
> I was hoping to elicit some feedback on my thoughts on denominator
> degrees of freedom for F ratios in mixed models. These thoughts and
> practices result from my reading of previous postings by Doug Bates
> and others.
> - I start by assuming that the appropriate denominator degrees of
> freedom lies between n - p and n - q, where n = number of observations,
> p = number of fixed effects (rank of model matrix X), and q = rank of Z:X.

I agree with this, but the opinion is by no means universal. Initially I misread the statement because I usually write the number of columns of Z as q.

It is not easy to assess the rank of Z:X numerically. In many cases one can reason what it should be from the form of the model, but a general procedure to assess the rank of a matrix, especially a sparse matrix, is difficult.

An alternative which can easily be calculated is n - t, where t is the trace of the 'hat matrix'. The function 'hatTrace' applied to a fitted lmer model evaluates this trace (conditional on the estimates of the relative variances of the random effects).

> - I then conclude that good estimates of P values on the F ratios lie
> between 1 - pf(F.ratio, numDF, n-p) and 1 - pf(F.ratio, numDF, n-q).
> - I further surmise that the latter of these (1 - pf(F.ratio, numDF,
> n-q)) is the more conservative estimate.
>
> When I use these criteria and compare my "ANOVA" table to the results
> of an analysis of Helmert contrasts using an MCMC sample with highest
> posterior density intervals, I find that my conclusions (e.g. factor A,
> with three levels, has a "significant effect" on the response variable)
> are qualitatively the same.
> Comments?

I would be happy to re-institute p-values for fixed effects in the summary and anova methods for lmer objects, using a denominator degrees of freedom based on the trace of the hat matrix or the rank of Z:X, if others will volunteer to respond to the "these answers are obviously wrong because they don't agree with <whatever> and the idiot who wrote this software should be thrashed to within an inch of his life" messages. I don't have the patience.
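[Editorial note: the hat-trace denominator Doug describes can be sketched as follows. hatTrace() was a function in the lme4 of 2006; this sketch assumes the hatvalues() method that current lme4 provides for merMod objects, whose sum gives the same trace -- worth checking against your installed version.]

```r
## Sketch: denominator df from the trace of the hat matrix, using the
## sleepstudy data that ships with lme4. Assumes hatvalues() works for
## merMod objects (modern replacement for the old hatTrace()).
library(lme4)

m <- lmer(Reaction ~ Days + (Days | Subject), data = sleepstudy)
n <- nobs(m)
t <- sum(hatvalues(m))   # trace of the conditional hat matrix
ddf <- n - t             # proposed denominator degrees of freedom

F.ratio <- anova(m)["Days", "F value"]
p.hat <- 1 - pf(F.ratio, 1, ddf)
c(n = n, trace = t, ddf = ddf, p = p.hat)
```

Because the trace is computed conditionally on the estimated variance ratios, ddf falls between the n - q and n - p extremes discussed earlier in the thread.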
Dear list,

I have written functions to perform simulation-based tests of H0: Var(random effects) = 0, beta_foo = 0 in linear mixed models, based on the exact distribution of the LR and restricted LR test statistics (see the research presented in Crainiceanu, Ruppert: "Likelihood ratio tests in linear mixed models with one variance component", J. R. Statist. Soc. B (2004), 66, Part 1, pp. 165-185). They are about 15-20 times faster than the parametric bootstrap.

At the moment, the exact distributions are only easily simulated for the case of a single variance component/random effect and i.i.d. errors; feasible approximations for the "multivariate" case are currently being investigated and will be implemented soon.

The syntax looks something like this:

#begin code:
data(sleepstudy)
summary(sleepstudy)
#Effect of sleep deprivation on reaction time
xyplot(Reaction ~ Days | Subject, data = sleepstudy)

m <- lmer(Reaction ~ Days + (Days - 1 | Subject), data = sleepstudy)
#random slopes, but no random intercept
#doesn't make much sense, but it's just an example
summary(m)

#test for individual heterogeneity based on RLRT
#(no restrictions on fixed effects under H0)
#H0: lambda = Var(RandomSlopes)/Var(error) == 0  <==>  Var(RandomSlopes) == 0
t3 <- RLRT1SimTest(m, lambda0 = 0, seed = 5, nsim = 10000)
#will produce output:
#H0: lambda = 0 ; p-value = 0
# observed lambda = 0.06259639

#test for influence of Days based on LRT
#(restriction on fixed effects: beta_Days == 0)
m0 <- lm(Reaction ~ 1, data = sleepstudy)
t4 <- LRT1SimTest(m, m0, seed = 10, nsim = 10000)
#will produce output:
#Model under H0: Reaction ~ 1 ;
#Model under HA: Reaction ~ Days + (Days - 1 | Subject) ;
# p-value = 0
# observed lambda = 0.06259639
#end code

If you are interested in using these functions I'll be glad to send them to you. Be aware, however, that you can only use them for testing "one random effect" vs. "no random effect" in a model with i.i.d. errors! The plan is to put them in a package beginning next year and use them as a basis for an (exact) anova.lmer() method.

Greetings,
Fabian Scheipl
--
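[Editorial note: for readers finding this thread later, the functions described above were subsequently released on CRAN as the RLRsim package, so the example translates roughly as below. Function names are per RLRsim (exactRLRT corresponds to RLRT1SimTest, exactLRT to LRT1SimTest); exact output formatting differs from the sketch in the mail.]

```r
## Sketch: the two tests above as they later appeared in RLRsim.
## Assumes the lme4 and RLRsim packages are installed.
library(lme4)
library(RLRsim)

data(sleepstudy, package = "lme4")
m <- lmer(Reaction ~ Days + (Days - 1 | Subject), data = sleepstudy)

## RLRT of H0: Var(random slopes) = 0, fixed effects unrestricted
t3 <- exactRLRT(m, seed = 5, nsim = 10000)

## LRT of H0: beta_Days = 0 (and no random slopes)
m0 <- lm(Reaction ~ 1, data = sleepstudy)
t4 <- exactLRT(m, m0, seed = 10, nsim = 10000)

c(RLRT.p = t3$p.value, LRT.p = t4$p.value)
```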
A Wiki entry is an excellent idea. I am happy to try to help. An account of mcmcsamp() might be a very useful part of the Wiki. My limited investigations suggest that once the data start to overwhelm the prior (maybe ~3 df for an effect that is of interest), the posterior distribution that it gives provides a very good approximation to the sampling distribution.

I have been meaning to put aside time to try to work out, with the help of a colleague here at ANU, how the Kenward & Roger (Biometrics, 1997) approximation might be implemented in lmer, but it hasn't yet happened and is unlikely to do so for a while.

John Maindonald
email: john.maindonald@anu.edu.au
phone: +61 2 (6125)3473
fax: +61 2 (6125)5549
Centre for Mathematics & Its Applications,
Room 1194, John Dedman Mathematical Sciences Building (Building 27)
Australian National University, Canberra ACT 0200.

On 10 Sep 2006, at 8:00 PM, r-help-request@stat.math.ethz.ch wrote:
> From: Spencer Graves <spencer.graves@pdf.com>
> Date: 10 September 2006 4:54:50 PM
> To: Andrew Robinson <A.Robinson@ms.unimelb.edu.au>
> Cc: Douglas Bates <bates@stat.wisc.edu>, R-Help <r-help@stat.math.ethz.ch>
> Subject: Re: [R] Conservative "ANOVA tables" in lmer
>
> Hi, Doug, et al.:
> I'll volunteer to do the same, which is an extension of much of what
> I've been doing for R Help for a while now.
> Regarding writing a FAQ, what about a Wiki entry (and maybe ultimately
> a vignette)? This thread provides notes around which such could be
> built. Another piece might be an example from Scheffé (1958), which I
> sent as a reply to an earlier comment on this thread (foolishly sent
> without reducing the "cc" list, which means it "awaits moderator
> approval"). Each time a question of this nature arises, someone checks
> the Wiki, edits or adds something to it if necessary, then replies to
> the list with a reference to the appropriate Wiki entry.
> Spencer Graves
>
> Andrew Robinson wrote:
>> On Thu, Sep 07, 2006 at 07:59:58AM -0500, Douglas Bates wrote:
>>
>>> I would be happy to re-institute p-values for fixed effects in the
>>> summary and anova methods for lmer objects using a denominator
>>> degrees of freedom based on the trace of the hat matrix or the rank
>>> of Z:X if others will volunteer to respond to the "these answers are
>>> obviously wrong because they don't agree with <whatever> and the
>>> idiot who wrote this software should be thrashed to within an inch
>>> of his life" messages. I don't have the patience.
>>
>> This seems to be more than fair to me. I'll volunteer to help explain
>> why the anova.lmer() output doesn't match SAS, etc. Is it worth
>> putting a caveat in the output and the help files? Is it even worth
>> writing a FAQ about this?
>>
>> Cheers
>>
>> Andrew
[snip]

Douglas Bates wrote:
> Hmm - I'm not sure what confidence interval and what number of levels
> you mean there, so I can't comment on that method.
>
> Suppose we go back to Spencer's example and consider if there is a
> significant effect for the Nozzle factor. That is equivalent to the
> hypothesis H_0: beta_2 = beta_3 = 0 versus the general alternative. A
> "p-value" could be formulated from an MCMC sample if we assume that
> the marginal distribution of the parameter estimates for beta_2 and
> beta_3 has roughly elliptical contours, and you can evaluate that by,
> say, examining a hexbin plot of the values in the MCMC sample. One
> could take the ellipses as defined by the standard errors and
> estimated correlation or, probably better, by the observed standard
> deviations and correlations in the MCMC sample. Then determine the
> proportion of (beta_2, beta_3) pairs in the sample that fall outside
> the ellipse that is centered at the estimates, has that eccentricity
> and scaling, and passes through (0, 0). That would be an empirical
> p-value for the test.
>
> I would recommend calculating this for a couple of samples to check on
> the reproducibility.

Here is another thought for an empirical p-value that may be easier to compute and would require fewer assumptions: take the proportion of MCMC samples that fall into each quadrant (++, +-, -+, --) and use the smallest of these proportions as the p-value (or the smallest out of a subset of the quadrants for a one-sided style test).

Think of it this way: if the smallest proportion is greater than alpha, then any closed curve (ellipse, polygon, even a concave polygon) that includes a 1-alpha proportion of the points would need to include points from all four quadrants, and therefore any convex curve would have to include (0, 0), which is consistent with the null hypothesis.
On the other hand, if there is a quadrant that contains fewer than an alpha proportion of the points, then there exists at least one confidence region (possibly concave) that contains a 1-alpha proportion of the points and excludes (0, 0) and that entire quadrant, which is consistent with the alternative that at least one of the betas differs from 0.

A more conservative p-value would be to take the minimum proportion and multiply it by 4 (or 2^p for p simultaneous tests), which is the same idea as multiplying by 2 for a two-sided univariate test, and assumes that the confidence regions would exclude similar proportions of points in each direction (central confidence regions rather than minimum-length or other confidence regions). This seems to me like it would be overconservative in some cases: since all the proportions must sum to 1, we don't really have 4 degrees of freedom, and a smaller adjustment factor may still be correct and less conservative.

Some simulations would be a good idea to see if the plain minimum is too liberal and how conservative the other approach is for common situations.

These are just my first thoughts on the matter; I have not tested anything, so any comments or other discussion of this idea are welcome.

--
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.snow at intermountainmail.org
(801) 408-8111
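[Editorial note: both empirical p-values discussed above can be tried with base R alone. The bivariate normal draws below are a made-up stand-in for the (beta_2, beta_3) columns of a real mcmcsamp() result.]

```r
## Sketch: Doug's ellipse method and Greg's quadrant method, applied to
## a simulated stand-in for an MCMC sample of (beta_2, beta_3).
set.seed(1)
n.samp <- 10000
b2 <- rnorm(n.samp, mean = 0.5)                         # fake posterior draws
b3 <- 0.3 * b2 + sqrt(1 - 0.3^2) * rnorm(n.samp, 0.2)   # correlated with b2
B  <- cbind(b2, b3)

## 1. Ellipse method: proportion of draws whose Mahalanobis distance
##    from the sample centre exceeds that of the null point (0, 0).
ctr <- colMeans(B)
S   <- cov(B)
d2      <- mahalanobis(B, ctr, S)
d2.null <- mahalanobis(c(0, 0), ctr, S)
p.ellipse <- mean(d2 > d2.null)

## 2. Quadrant method: smallest of the four quadrant proportions,
##    optionally multiplied by 4 for the conservative version.
quad <- table(sign(b2), sign(b3))
p.quad.min  <- min(quad) / n.samp
p.quad.cons <- min(4 * p.quad.min, 1)

c(ellipse = p.ellipse, quadrant = p.quad.min, conservative = p.quad.cons)
```

Whether the plain minimum is too liberal and the four-fold version too conservative is exactly the simulation question raised above.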