Björn Stollenwerk
2008-Nov-19 12:55 UTC
[R] F-Tests in generalized linear mixed models (GLMM)
Hi! I would like to perform an F-Test over more than one variable within a generalized mixed model with Gamma-distribution and log-link function. For this purpose, I use the package mgcv. Similar tests may be done using the function "anova", as for example in the case of a normal distributed response. However, if I do so, the error message "error in eval(expr, envir, enclos) : object "fixed" not found" occures. Does anyone konw why, or how to fix the problem? To illustrate the problem, I send the output of a simulated example. Thank you very much in advance. Best regards, Bj?rn Example: > # simulation of data > n <- 300 > x1 <- sample(c(T,F), n, replace=TRUE) > x2 <- rnorm(n) > random1 <- sample(c("level1","level2","level3"), n, replace=TRUE) > true.lp <- 5 + 1.1*x1 + 0.16 * x2 > mu <- exp(true.lp) > sigma <- mu * 1 > a <- mu^2/sigma^2 > s <- sigma^2/mu > y <- rgamma(n, shape=a, scale=s) > > library(mgcv) > > # a mixed model without Gamma-distribution and without log-link works as follows: > glmm1 <- gamm(y ~ x1 + x2, random=list(random1 = ~1)) > glmm2 <- gamm(y ~ 1, random=list(random1 = ~1)) > > anova(glmm1$lme) numDF denDF F-value p-value X 3 295 103.4730 <.0001 > anova(glmm2$lme, glmm1$lme) Model df AIC BIC logLik Test L.Ratio p-value glmm2$lme 1 3 4340.060 4351.172 -2167.030 glmm1$lme 2 5 4292.517 4311.036 -2141.258 1 vs 2 51.54367 <.0001 > > # a linear model also works, though no p-value is reported > glm1 <- gam(y ~ x1 + x2) > glm2 <- gam(y ~ 1) > anova(glm1) Family: gaussian Link function: identity Formula: y ~ x1 + x2 Parametric Terms: df F p-value x1 1 45.58 7.69e-11 x2 1 13.96 0.000224 > anova(glm2, glm1) Analysis of Deviance Table Model 1: y ~ 1 Model 2: y ~ x1 + x2 Resid. Df Resid. Dev Df Deviance 1 299 33024943 2 297 27811536 2 5213407 > > # general linear models (GLM) with Gamma and log-link don't work > glm1.gamma <- gam(y ~ x1 + x2, family=Gamma(link="log")) > glm2.gamma <- gam(y ~ 1, family=Gamma(link="log")) > anova(glm1.gamma) Family: Gamma Link function: log Formula: y ~ x1 + x2 Parametric Terms: df F p-value x1 1 59.98 1.52e-13 x2 1 16.06 7.78e-05 > anova(glm2.gamma, glm1.gamma) Analysis of Deviance Table Model 1: y ~ 1 Model 2: y ~ x1 + x2 Resid. Df Resid. Dev Df Deviance 1 299 413.59 2 297 343.90 2 69.69 > > # neither do general linear mixed models (GLMM) > > glm1.gamma <- gamm(y ~ x1 + x2, random=list(random1 = ~1), family=Gamma(link="log")) Maximum number of PQL iterations: 20 iteration 1 > glm2.gamma <- gamm(y ~ 1, random=list(random1 = ~1), family=Gamma(link="log")) Maximum number of PQL iterations: 20 iteration 1 > summary(glm1.gamma$lme) Linear mixed-effects model fit by maximum likelihood Data: data AIC BIC logLik 847.722 866.241 -418.861 Random effects: Formula: ~1 | random1 (Intercept) Residual StdDev: 2.954058e-05 0.9775214 Variance function: Structure: fixed weights Formula: ~invwt Fixed effects: list(fixed) Value Std.Error DF t-value p-value X(Intercept) 5.066376 0.08363392 295 60.57801 0e+00 Xx1TRUE 0.884486 0.11421762 295 7.74387 0e+00 Xx2 0.234537 0.05851689 295 4.00802 1e-04 Correlation: X(Int) X1TRUE Xx1TRUE -0.733 Xx2 -0.008 0.085 Standardized Within-Group Residuals: Min Q1 Med Q3 Max -1.0207671 -0.6911364 -0.2899184 0.3665161 4.9603830 Number of Observations: 300 Number of Groups: 3 > summary(glm1.gamma$gam) Family: Gamma Link function: log Formula: y ~ x1 + x2 Parametric coefficients: Estimate Std. Error t value Pr(>|t|) (Intercept) 5.06638 0.08363 60.578 < 2e-16 *** x1TRUE 0.88449 0.11422 7.744 1.53e-13 *** x2 0.23454 0.05852 4.008 7.75e-05 *** --- Signif. codes: 0 ?***? 0.001 ?**? 0.01 ?*? 0.05 ?.? 0.1 ? ? 1 R-sq.(adj) = 0.171 Scale est. = 0.95555 n = 300 > > anova(glm1.gamma$lme) numDF denDF F-value p-value X 3 295 3187.192 <.0001 > anova(glm2.gamma$lme, glm1.gamma$lme) Fehler in eval(expr, envir, enclos) : objekt "fixed" nicht gefunden >
On Wed, Nov 19, 2008 at 6:55 AM, Bj?rn Stollenwerk <bjoern.stollenwerk at helmholtz-muenchen.de> wrote:> Hi!> I would like to perform an F-Test over more than one variable within a > generalized mixed model with Gamma-distribution > and log-link function.Are you using the phrase "F-Test" as a general term for model comparison techniques like the analysis of variance or as a specific type of test based on ratios of mean squares and the F distribution? If the latter then you may need to reconsider your question. The F statistic is derived from a normal (i.e. Gaussian) distribution of the response vector. In certain balanced cases it can also be applied to linear mixed models. As far as I know there is not a derivation of the F statistic from a Gamma distribution, either with or without random effects.> For this purpose, I use the package mgcv. > Similar tests may be done using the function "anova", as for example in the > case of a normal > distributed response. However, if I do so, the error message > "error in eval(expr, envir, enclos) : object "fixed" not found" occures. > Does anyone konw why, or how to fix the problem? To illustrate the problem, > I send the output of a simulated example. > Thank you very much in advance. > > Best regards, Bj?rn > > Example: > >> # simulation of data >> n <- 300 >> x1 <- sample(c(T,F), n, replace=TRUE) >> x2 <- rnorm(n) >> random1 <- sample(c("level1","level2","level3"), n, replace=TRUE) >> true.lp <- 5 + 1.1*x1 + 0.16 * x2 >> mu <- exp(true.lp) >> sigma <- mu * 1 >> a <- mu^2/sigma^2 >> s <- sigma^2/mu >> y <- rgamma(n, shape=a, scale=s) >> >> library(mgcv) >> >> # a mixed model without Gamma-distribution and without log-link works as >> follows: >> glmm1 <- gamm(y ~ x1 + x2, random=list(random1 = ~1)) >> glmm2 <- gamm(y ~ 1, random=list(random1 = ~1)) >> >> anova(glmm1$lme) > numDF denDF F-value p-value > X 3 295 103.4730 <.0001 >> anova(glmm2$lme, glmm1$lme) > Model df AIC BIC logLik Test L.Ratio p-value > glmm2$lme 1 3 4340.060 4351.172 -2167.030 > glmm1$lme 2 5 4292.517 4311.036 -2141.258 1 vs 2 51.54367 <.0001 >> >> # a linear model also works, though no p-value is reported >> glm1 <- gam(y ~ x1 + x2) >> glm2 <- gam(y ~ 1) >> anova(glm1) > > Family: gaussian > Link function: identity > > Formula: > y ~ x1 + x2 > > Parametric Terms: > df F p-value > x1 1 45.58 7.69e-11 > x2 1 13.96 0.000224 > >> anova(glm2, glm1) > Analysis of Deviance Table > > Model 1: y ~ 1 > Model 2: y ~ x1 + x2 > Resid. Df Resid. Dev Df Deviance > 1 299 33024943 > 2 297 27811536 2 5213407 >> >> # general linear models (GLM) with Gamma and log-link don't work >> glm1.gamma <- gam(y ~ x1 + x2, family=Gamma(link="log")) >> glm2.gamma <- gam(y ~ 1, family=Gamma(link="log")) >> anova(glm1.gamma) > > Family: Gamma > Link function: log > > Formula: > y ~ x1 + x2 > > Parametric Terms: > df F p-value > x1 1 59.98 1.52e-13 > x2 1 16.06 7.78e-05 > >> anova(glm2.gamma, glm1.gamma) > Analysis of Deviance Table > > Model 1: y ~ 1 > Model 2: y ~ x1 + x2 > Resid. Df Resid. Dev Df Deviance > 1 299 413.59 > 2 297 343.90 2 69.69 >> >> # neither do general linear mixed models (GLMM) >> >> glm1.gamma <- gamm(y ~ x1 + x2, random=list(random1 = ~1), >> family=Gamma(link="log")) > > Maximum number of PQL iterations: 20 > iteration 1 >> glm2.gamma <- gamm(y ~ 1, random=list(random1 = ~1), >> family=Gamma(link="log")) > > Maximum number of PQL iterations: 20 > iteration 1 >> summary(glm1.gamma$lme) > Linear mixed-effects model fit by maximum likelihood > Data: data > AIC BIC logLik > 847.722 866.241 -418.861 > > Random effects: > Formula: ~1 | random1 > (Intercept) Residual > StdDev: 2.954058e-05 0.9775214 > > Variance function: > Structure: fixed weights > Formula: ~invwt > Fixed effects: list(fixed) > Value Std.Error DF t-value p-value > X(Intercept) 5.066376 0.08363392 295 60.57801 0e+00 > Xx1TRUE 0.884486 0.11421762 295 7.74387 0e+00 > Xx2 0.234537 0.05851689 295 4.00802 1e-04 > Correlation: > X(Int) X1TRUE > Xx1TRUE -0.733 > Xx2 -0.008 0.085 > > Standardized Within-Group Residuals: > Min Q1 Med Q3 Max > -1.0207671 -0.6911364 -0.2899184 0.3665161 4.9603830 > > Number of Observations: 300 > Number of Groups: 3 >> summary(glm1.gamma$gam) > > Family: Gamma > Link function: log > > Formula: > y ~ x1 + x2 > > Parametric coefficients: > Estimate Std. Error t value Pr(>|t|) > (Intercept) 5.06638 0.08363 60.578 < 2e-16 *** > x1TRUE 0.88449 0.11422 7.744 1.53e-13 *** > x2 0.23454 0.05852 4.008 7.75e-05 *** > --- > Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 > > > R-sq.(adj) = 0.171 Scale est. = 0.95555 n = 300 >> >> anova(glm1.gamma$lme) > numDF denDF F-value p-value > X 3 295 3187.192 <.0001 >> anova(glm2.gamma$lme, glm1.gamma$lme) > Fehler in eval(expr, envir, enclos) : objekt "fixed" nicht gefunden >> > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. >