Raldo Kruger
2009-Aug-28 10:46 UTC
[R] Help with glmer {lme4} function: how to return F or t statistics instead of z statistics.
Hi,

I'm new to R and GLMMs, and I've been unable to find the answers to my questions by trawling through the R help archives. I'm hoping someone here can help me.

I'm running an analysis on seedling survival (count data, Poisson distribution) on restoration sites. My main interest is in determining whether additions of nutrients (N) and water-absorbing polymer gel (G) to the soil substrate contribute positively to seedling survival over a 3-year period (for simplicity I'm using just 3 time periods, each in the same season of 3 successive years).

Fixed factors: Nutrients (0 and 1), Gel (0 and 1)
Random factors: Site (4 non-replicate sites), Year (3 time periods)
Response variable: Seedling numbers (counts) per 0.25 m² plot

According to the decision tree on page 131 of Bolker et al. (2008, in TREE; thanks, very useful paper!), most of my data sets should be analysed with a Laplace or GHQ model and a Wald t or F statistic (since the data are non-normal, can't be transformed to normality, have a mean < 5, have fewer than 3 random effects, and are overdispersed).

I'm using the glmer {lme4} function, since it allows for Laplace or GHQ as well as more than one random factor (glmmML {glmmML} and glmmPQL {MASS} apparently do not), as follows:

> m1 <- glmer(Seedlings~N*G*(1|Year)*(1|Site), data=ex5m, family=poisson(link="log"))

My questions are:

1) The model returns z values, and I'm unable to find an argument in the function that changes this to return a t or F value (as Bolker et al. suggest I should use for my data).

2) I'm unsure what the AIC or QAIC value means, other than knowing that it should be as low as possible. Is there a rule of thumb for what is a good AIC value? Mine are in the region of 2230.

3) The default for the nAGQ argument in glmer {lme4} is 1, which uses the Laplace approximation. When nAGQ > 1, it uses the GHQ method, but I'm unsure how to determine the correct number of Gauss-Hermite points to enter in the argument when using this method. How is this determined?

4) Some of my data sets have means > 5 and are also overdispersed, and according to Bolker et al. should be analysed using a GLMM with PQL and a Wald t or F. However, glmmPQL {MASS} does not accept more than one random factor, and I have two, so how do I deal with that?

5) Lastly, what does the "1" imply in the random factor term, e.g. (1|Site), and how does this affect the analysis?

Many thanks,
Raldo Kruger
MSc student
University of Cape Town
South Africa
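For reference, a minimal sketch of how a model of this kind is conventionally written in lme4, with the two random intercepts added as separate terms rather than multiplied into the fixed effects. The simulated data frame, the effect sizes, the seed, and the Plot column are assumptions made only so that the example runs; they are not the original ex5m data, and the result says nothing about the real seedling counts.

```r
# Minimal sketch: simulated data stand in for the poster's ex5m data frame.
library(lme4)

set.seed(2009)

# Hypothetical design: 4 sites x 3 years x 4 treatment combinations x 5 plots
ex5m <- expand.grid(N = c(0, 1), G = c(0, 1),
                    Site = factor(1:4), Year = factor(1:3),
                    Plot = 1:5)

# Simulated random intercepts for Site and Year (assumed values)
site_eff <- rnorm(4, mean = 0, sd = 0.5)
year_eff <- rnorm(3, mean = 0, sd = 0.3)

# Log-linear mean with assumed treatment effects
eta <- 0.5 + 0.3 * ex5m$N + 0.2 * ex5m$G +
  site_eff[ex5m$Site] + year_eff[ex5m$Year]
ex5m$Seedlings <- rpois(nrow(ex5m), lambda = exp(eta))

# Random intercepts enter additively; "1" denotes the intercept, so
# (1 | Site) lets the baseline log count vary from site to site.
m1 <- glmer(Seedlings ~ N * G + (1 | Year) + (1 | Site),
            data = ex5m, family = poisson(link = "log"))

# Fixed-effect table; for a Poisson fit the test column is "z value"
print(coef(summary(m1)))
```

Written this way, the fit still reports Wald z statistics, as described in question 1; the sketch only illustrates the formula layout and where the "1" in (1|Site) sits.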
Bert Gunter
2009-Aug-28 15:38 UTC
[R] Help with glmer {lme4} function: how to return F or t statistics instead of z statistics.
R-help is for help on the use of R, not primarily for statistics advice (although this does sometimes occur). Most of your questions are about statistics, so you should probably consult a local statistician for help or post on a more suitable list, perhaps R-sig-mixed-models:

https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models

Cheers,
Bert Gunter
Genentech Nonclinical Statistics