Hello all,
I have the following glm model:
f1 <- as.formula(paste("factor(y.fondi) ~ ",
                       "flgsess + segmeta2 + udm + zona.geo + ultimo.prod.",
                       " + flg.a2 + flg.d.na2 + flg.v2 + flg.cc2",
                       " + (flg.a1 + flg.d.na1 + flg.v1 + flg.cc1)^2",
                       " + flg.a2:flg.d.na2 + flg.a2:flg.v2 + flg.a2:flg.cc2",
                       " + flg.d.na2:flg.v2 + flg.v2:flg.cc2",
                       sep = ""))
g1 <- glm(f1,family=binomial,data=camp.lavoro.meno.na)
The variables are all factors:
- y.fondi takes value 0 or 1;
- flgsess has 2 levels;
- segmeta2 has 4 levels;
- udm has 6 levels;
- zona.geo has 5 levels;
- ultimo.prod. has 4 levels;
- flg.a1, flg.d.na1, flg.v1, flg.cc1, flg.a2, flg.d.na2, flg.v2, flg.cc2 are 8 factors that take values 0 or 1.
The number of observations is 1390: 259 with y.fondi = 1 and 1131 with y.fondi = 0.
The summary of the model is:
> summary(g1)
Call:
glm(formula = f1, family = binomial, data = camp.lavoro.meno.na)
Deviance Residuals:
    Min       1Q   Median       3Q      Max
-2.8955  -0.3586  -0.2692  -0.1642   2.9133
Coefficients:
                     Estimate Std. Error z value Pr(>|z|)
(Intercept)           -2.7647     0.7523  -3.675 0.000238 ***
...                       ...        ...     ...      ...
flg.a21                0.7898     0.4948   1.596 0.110475
flg.d.na21             0.2097     0.7336   0.286 0.774963
flg.v21                0.3928     0.5257   0.747 0.454994
flg.cc21              -0.8547     1.4954  -0.572 0.567625
flg.a11                0.7051     0.4889   1.442 0.149221
flg.d.na11             1.3582     0.5429   2.502 0.012353 *
flg.v11                2.2596     0.5079   4.449 8.62e-06 ***
flg.cc11              -3.3658     8.5259  -0.395 0.693014
flg.a21:flg.d.na21    -6.9392    26.5432  -0.261 0.793760
flg.a21:flg.v21       -1.4355     4.0963  -0.350 0.726005
flg.a21:flg.cc21      -6.0460    72.4807  -0.083 0.933521
flg.d.na21:flg.v21    -2.4347     2.9045  -0.838 0.401888
flg.v21:flg.cc21      11.7232    72.4814   0.162 0.871510
flg.a11:flg.d.na11    -8.3843    30.4660  -0.275 0.783162 !!!!
flg.a11:flg.v11        6.5067    39.2569   0.166 0.868356
flg.a11:flg.cc11      13.5596    19.4693   0.696 0.486140 !!!!
flg.d.na11:flg.v11    -0.7143     1.2673  -0.564 0.573013
flg.d.na11:flg.cc11   12.0653    15.3880   0.784 0.432997
flg.v11:flg.cc11       6.2648     8.5808   0.730 0.465331 !!!!

Signif. codes: 0 `***' 0.001 `**' 0.01 `*' 0.05 `.' 0.1 ` ' 1
(Dispersion parameter for binomial family taken to be 1)
Null deviance: 1336.79 on 1389 degrees of freedom
Residual deviance: 576.08 on 1354 degrees of freedom
AIC: 648.08
Number of Fisher Scoring iterations: 8
If I instead test the three marked interactions with anova, I obtain:
> g1.1 <- update(g1,~.-flg.a1:flg.d.na1,data=camp.lavoro.meno.na)
> anova(g1.1,g1,test="Chisq")
Analysis of Deviance Table
  Resid. Df Resid. Dev Df Deviance P(>|Chi|)
1      1355     578.49
2      1354     576.08  1     2.41      0.12
> g1.1 <- update(g1,~.-flg.a1:flg.cc1,data=camp.lavoro.meno.na)
> anova(g1.1,g1,test="Chisq")
Analysis of Deviance Table
  Resid. Df Resid. Dev Df Deviance P(>|Chi|)
1      1355     580.77
2      1354     576.08  1     4.69      0.03
> g1.1 <- update(g1,~.-flg.v1:flg.cc1,data=camp.lavoro.meno.na)
> anova(g1.1,g1,test="Chisq")
Analysis of Deviance Table
  Resid. Df Resid. Dev Df Deviance P(>|Chi|)
1      1355     578.01
2      1354     576.08  1     1.94      0.16
Why do I obtain these differences between the p-values in the summary and those from anova?
Many thanks for any help,
Simona
You need to look up the Hauck-Donner phenomenon in MASS (4th, 3rd or 2nd edition). In short, Wald tests of binomial or Poisson glms are highly unreliable: a moderate p-value can indicate either no effect or a very large effect.

I suspect your model is in fact partially separable (that is, it can fit parts of the data exactly), since those are large coefficients for indicator variables. Try reducing the tolerance in glm.control (add epsilon=1e-10) and see if the coefficients change a lot.

On Thu, 8 May 2003, Simona Avanzo wrote:

> [...]
> Why I obtain these differences?

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
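
The effect described in the reply can be reproduced on a small made-up data set. The sketch below is illustrative only and is not the poster's data: the data frame d and the names fit0, fit1, fit2, wald_p and lrt_p are all hypothetical. Every observation with x = 1 has y = 1, so the fit is partially separable. The Wald test from summary() and the likelihood-ratio test from anova() then disagree sharply, and tightening epsilon in glm.control moves the coefficient.

```r
# Hypothetical toy data (not from the thread): x is a 0/1 factor and every
# observation with x = 1 has y = 1, so the maximum-likelihood estimate of
# the x coefficient is infinite (partial separation).
d <- data.frame(
  y = c(rep(0, 40), rep(1, 10), rep(1, 8)),
  x = factor(c(rep(0, 50), rep(1, 8)))
)

fit0 <- glm(y ~ 1, family = binomial, data = d)  # model without x
fit1 <- glm(y ~ x, family = binomial, data = d)  # model with x

# Wald test: estimate / standard error, as reported by summary().
# The standard error blows up, so z is near 0 and the p-value is near 1.
wald_p <- coef(summary(fit1))["x1", "Pr(>|z|)"]

# Likelihood-ratio test: drop in deviance on 1 degree of freedom,
# the same comparison that anova(fit0, fit1, test = "Chisq") prints.
lrt_p <- pchisq(deviance(fit0) - deviance(fit1), df = 1, lower.tail = FALSE)

# Refit with a tighter convergence tolerance, as suggested in the reply.
# If the estimate keeps growing, it had not really converged to a finite value.
fit2 <- update(fit1, control = glm.control(epsilon = 1e-10, maxit = 100))

print(c(wald = wald_p, lrt = lrt_p))
print(rbind(default = coef(fit1), tight = coef(fit2)))
print(anova(fit0, fit1, test = "Chisq"))
```

In this toy case the Wald p-value is close to 1 while the likelihood-ratio p-value is very small, and the x1 estimate grows when the tolerance is tightened. Applied to the model in the question, the analogous check would be update(g1, control = glm.control(epsilon = 1e-10)) followed by a comparison of coef() between the two fits.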