Dear Pedro,
> -----Original Message-----
> From: r-help-bounces at stat.math.ethz.ch
> [mailto:r-help-bounces at stat.math.ethz.ch] On Behalf Of Pedro de Barros
> Sent: Tuesday, November 08, 2005 9:47 AM
> To: r-help at stat.math.ethz.ch
> Subject: [R] Interpretation of output from glm
> Importance: High
>
> I am fitting a logistic model to binary data. The response
> variable is a factor (0 or 1) and all predictors are
> continuous variables. The main predictor is LT (I expect a
> logistic relation between LT and the probability of being
> mature) and the others are variables I expect to modify this relation.
>
> I want to test whether all predictors contribute significantly to
> the fit or not. I fit the full model and get these results:
>
> > summary(HMMaturation.glmfit.Full)
>
> Call:
> glm(formula = Mature ~ LT + CondF + Biom + LT:CondF + LT:Biom,
> family = binomial(link = "logit"), data = HMIndSamples)
>
> Deviance Residuals:
>     Min       1Q   Median       3Q      Max
> -3.0983  -0.7620   0.2540   0.7202   2.0292
>
> Coefficients:
>               Estimate Std. Error z value Pr(>|z|)
> (Intercept) -8.789e-01  3.694e-01  -2.379  0.01735 *
> LT           5.372e-02  1.798e-02   2.987  0.00281 **
> CondF       -6.763e-02  9.296e-03  -7.275 3.46e-13 ***
> Biom        -1.375e-02  2.005e-03  -6.856 7.07e-12 ***
> LT:CondF     2.434e-03  3.813e-04   6.383 1.74e-10 ***
> LT:Biom      7.833e-04  9.614e-05   8.148 3.71e-16 ***
> ---
> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
>
> (Dispersion parameter for binomial family taken to be 1)
>
> Null deviance: 10272.4 on 8224 degrees of freedom
> Residual deviance: 7185.8 on 8219 degrees of freedom
> AIC: 7197.8
>
> Number of Fisher Scoring iterations: 8
>
> However, when I run anova on the fit, I get
>
> > anova(HMMaturation.glmfit.Full, test='Chisq')
> Analysis of Deviance Table
>
> Model: binomial, link: logit
>
> Response: Mature
>
> Terms added sequentially (first to last)
>
>
>          Df Deviance Resid. Df Resid. Dev P(>|Chi|)
> NULL                      8224    10272.4
> LT        1   2873.8      8223     7398.7       0.0
> CondF     1      0.1      8222     7398.5       0.7
> Biom      1      0.2      8221     7398.3       0.7
> LT:CondF  1    142.1      8220     7256.3 9.413e-33
> LT:Biom   1     70.4      8219     7185.8 4.763e-17
> Warning message:
> fitted probabilities numerically 0 or 1 occurred in: method(x
> = x[, varseq <= i, drop = FALSE], y = object$y, weights =
> object$prior.weights,
>
>
> I am having a little difficulty interpreting these results.
> The result from the fit tells me that all predictors are
> significant, while the anova indicates that besides LT (the
> main variable), only the interaction of the other terms is
> significant, but the main effects are not.
> I believe that in the first output (on the glm object), the
> significance of all terms is calculated considering each of
> them alone in the model (i.e. removing all other terms), while
> the anova output is (as it says) considering the sequential
> addition of the terms.
>
> So, there are 2 questions:
> a) Can I tell that the interactions are significant, but not
> the main effects?
In a model with this structure, the "main effects" are slopes at the
origin (i.e., where the other variables in the product terms are 0), and
can't meaningfully be interpreted as main effects. (Are there even any data
near the origin?)
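To spell this out, here is the linear predictor of your model and the slopes it implies. The symbols eta and beta_0, ..., beta_5 are my notation (they are not in the output); the numbers are the estimates from your summary().

```latex
\begin{aligned}
\eta &= \beta_0 + \beta_1\,LT + \beta_2\,CondF + \beta_3\,Biom
        + \beta_4\,(LT \times CondF) + \beta_5\,(LT \times Biom) \\[4pt]
\frac{\partial \eta}{\partial CondF} &= \beta_2 + \beta_4\,LT
   \;\approx\; -0.06763 + 0.002434\,LT \\[4pt]
\frac{\partial \eta}{\partial Biom} &= \beta_3 + \beta_5\,LT
   \;\approx\; -0.01375 + 0.0007833\,LT \\[4pt]
\frac{\partial \eta}{\partial LT} &= \beta_1 + \beta_4\,CondF + \beta_5\,Biom
\end{aligned}
```

So the CondF coefficient is the CondF slope only at LT = 0, and the LT coefficient is the LT slope only at CondF = Biom = 0. On these estimates the CondF slope changes sign near LT = 0.06763/0.002434, or about 27.8, and the Biom slope near LT = 0.01375/0.0007833, or about 17.6 (in whatever units LT is measured).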
> b) Is it legitimate to consider a model where the interactions are
> considered, but not the main effects CondF and Biom?
Generally, no. Such a model is interpretable, but it places strange
constraints on the regression surface -- namely, that the CondF and Biom
slopes are 0 at the origin (here, where LT = 0).
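A small simulation may make the constraint concrete. This is only a sketch under stated assumptions: the data are simulated, the coefficient values are made up, and LTc and the model names are hypothetical; only the variable names Mature, LT and CondF come from your model. It fits the model with and without the CondF main effect, once with LT as measured and once with LT shifted to its mean.

```r
# Simulated stand-in data: names mirror the thread, values are invented.
set.seed(1)
n     <- 2000
LT    <- runif(n, 10, 60)
CondF <- rnorm(n, 100, 15)
eta   <- -6 + 0.15 * LT - 0.03 * CondF + 0.001 * LT * CondF
d     <- data.frame(Mature = rbinom(n, 1, plogis(eta)), LT = LT, CondF = CondF)
d$LTc <- d$LT - mean(d$LT)   # LT measured from its mean instead of from 0

# Full models: CondF main effect included
full_raw <- glm(Mature ~ LT  + CondF + LT:CondF,  family = binomial, data = d)
full_ctr <- glm(Mature ~ LTc + CondF + LTc:CondF, family = binomial, data = d)

# Reduced models: CondF enters only through the interaction,
# which forces the CondF slope to be 0 where LT (or LTc) is 0
red_raw <- glm(Mature ~ LT  + LT:CondF,  family = binomial, data = d)
red_ctr <- glm(Mature ~ LTc + LTc:CondF, family = binomial, data = d)

devs <- c(full_raw = deviance(full_raw), full_ctr = deviance(full_ctr),
          red_raw  = deviance(red_raw),  red_ctr  = deviance(red_ctr))
print(round(devs, 2))

# Likelihood-ratio test of the CondF main effect, given the other terms
print(anova(red_raw, full_raw, test = "Chisq"))
```

The two full models span the same column space, so their deviances should agree up to convergence tolerance; the two reduced models are different models, so their fit depends on where the origin of LT happens to be. As a side note, summary() reports a Wald test for each coefficient with all the other terms in the model, whereas anova() adds terms sequentially, so the two tables answer different questions.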
None of this is specific to logistic regression -- it applies to
generalized linear models in general, including linear models.
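For the linear-model case, the same point can be checked directly. Again a sketch on simulated data: x, z, y and xc are hypothetical names, and the coefficients used to generate y are arbitrary. Centring x leaves the interaction coefficient alone but changes the z "main effect" into the z slope at the mean of x.

```r
# Simulated data for an ordinary linear model with a product term
set.seed(2)
n <- 500
x <- runif(n, 10, 60)
z <- rnorm(n, 100, 15)
y <- 1 + 0.5 * x - 0.3 * z + 0.01 * x * z + rnorm(n)

fit_raw <- lm(y ~ x * z)     # z coefficient = slope of z where x = 0
xc      <- x - mean(x)
fit_ctr <- lm(y ~ xc * z)    # z coefficient = slope of z where x = mean(x)

b <- coef(fit_raw)
# Slope of z at the mean of x, computed from the raw fit
slope_at_mean <- unname(b["z"] + b["x:z"] * mean(x))

z_slopes <- c(raw_z         = unname(b["z"]),
              centred_z     = unname(coef(fit_ctr)["z"]),
              slope_at_mean = slope_at_mean)
print(z_slopes)
```

The centred z coefficient and slope_at_mean should coincide, while raw_z differs from both; the fitted surface itself is the same in the two parameterizations.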
I hope this helps,
John