I have fitted a number of models with receipt of social assistance
(toim1) during a year (values 0 or 1) as the response and a number of
covariates. The data include sampling weights, which I use in the models.
Using the exact same data, glm() under 1.0.1 and 1.1.0 gives different
results in many (but not all) of the models. I re-installed 1.0.1 to check
this, and I found no mention in the NEWS file for 1.1.0 of a change that
would account for it.
I show the function calls and summary() results below for each version,
using the model that only allows the probability of receipt to vary
by year (the factor Vuosi1 below). The information given by version
(for 1.0.1 here) is
> version
         _
platform i686-unknown-linux
arch     i686
os       linux
system   i686, linux
status
major    1
minor    0.1
year     2000
month    April
day      14
language R
This is what R 1.0.1 gives:
> glm.toim.0 <- glm(toim1 ~ Vuosi1,
+ data = Data, #subset = Vuosi1 == "1993",
+ family=binomial("probit"),
+ na.action = na.omit , weights = pko1, model = FALSE)
> summary(glm.toim.0)
Call:
glm(formula = toim1 ~ Vuosi1, family = binomial("probit"), data = Data,
    weights = pko1, na.action = na.omit, model = FALSE)
Deviance Residuals:
   Min      1Q  Median      3Q     Max
-27.05   -9.11   -7.31   -5.26  118.85
Coefficients:
            Estimate Std. Error z value Pr(>|z|)
(Intercept) -1.50660    0.00104 -1447.4   <2e-16 ***
Vuosi11994   0.09489    0.00143    66.2   <2e-16 ***
Vuosi11995   0.10624    0.00143    74.3   <2e-16 ***
Vuosi11996   0.13442    0.00142    94.8   <2e-16 ***
Vuosi11997   0.12126    0.00142    85.3   <2e-16 ***
---
Signif. codes:  0 `***' 0.001 `**' 0.01 `*' 0.05 `.' 0.1 ` ' 1
(Dispersion parameter for binomial family taken to be 1)
Null deviance: 9540324 on 40869 degrees of freedom
Residual deviance: 9529262 on 40865 degrees of freedom
AIC: 9529272
Number of Fisher Scoring iterations: 4
In R 1.1.0, I get
> glm.toim.0 <- glm(toim1 ~ Vuosi1,
+ data = Data, #subset = Vuosi1 == "1993",
+ family=binomial("probit"),
+ na.action = na.omit , weights = pko1, model = FALSE)
Warning message:
fitted probabilities numerically 0 or 1 occurred in: (if (is.empty.model(mt))
glm.fit.null else glm.fit)(x = X, y = Y,
> summary(glm.toim.0)
Call:
glm(formula = toim1 ~ Vuosi1, family = binomial("probit"), data = Data,
    weights = pko1, na.action = na.omit, model = FALSE)
Deviance Residuals:
      Min         1Q     Median         3Q        Max
-1.35e-06  -4.76e-07  -3.82e-07  -2.75e-07   4.54e+02
Coefficients:
             Estimate Std. Error   z value Pr(>|z|)
(Intercept) -3.91e+15   3.61e+04 -1.08e+11   <2e-16 ***
Vuosi11994   1.18e+14   5.11e+04  2.30e+09   <2e-16 ***
Vuosi11995   1.33e+14   5.11e+04  2.60e+09   <2e-16 ***
Vuosi11996   1.72e+14   5.10e+04  3.36e+09   <2e-16 ***
Vuosi11997   1.53e+14   5.10e+04  3.01e+09   <2e-16 ***
---
Signif. codes:  0 `***' 0.001 `**' 0.01 `*' 0.05 `.' 0.1 ` ' 1
(Dispersion parameter for binomial family taken to be 1)
Null deviance: 9540324 on 40869 degrees of freedom
Residual deviance: 98216118 on 40865 degrees of freedom
AIC: 98216128
Number of Fisher Scoring iterations: 4
The (unweighted) empirical pattern is
> table(Vuosi1, toim1)
      toim1
Vuosi1    0   1
  1993 8206 429
  1994 6524 383
  1995 8564 494
  1996 6716 369
  1997 8756 429
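
As a cross-check on the unweighted pattern, here is a minimal sketch that uses only the counts in the table above. With a single factor as the only covariate, an unweighted probit fit reproduces the observed proportion in each year, so the coefficients can be computed by hand with qnorm() and compared with a grouped glm() fit. The object names (n0, n1, by.hand, fit) are made up for this illustration, and since the sampling weights pko1 are not used, the numbers are not expected to match the weighted estimates shown earlier.

```r
# Counts copied from the unweighted table(Vuosi1, toim1) output.
years <- c("1993", "1994", "1995", "1996", "1997")
n0 <- c(8206, 6524, 8564, 6716, 8756)   # toim1 == 0
n1 <- c(429, 383, 494, 369, 429)        # toim1 == 1
names(n0) <- names(n1) <- years

# Sanity check: the total should equal the null degrees of freedom + 1.
stopifnot(sum(n0 + n1) == 40869 + 1)

# Observed proportion receiving assistance in each year, on the probit scale.
p <- n1 / (n0 + n1)
z <- qnorm(p)

# Treatment contrasts: intercept = 1993, other years as differences from 1993.
by.hand <- c(z[1], z[-1] - z[1])

# The same model fitted to the grouped counts, without sampling weights.
Vuosi1 <- factor(years)
fit <- glm(cbind(n1, n0) ~ Vuosi1, family = binomial("probit"))

print(cbind(by.hand = by.hand, glm = coef(fit)))
```

If the two columns agree, the unweighted fit is behaving as expected, and any discrepancy between versions would have to come from the way the weights enter the fit.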
It would be helpful to understand what is going on.
While the R FAQ kind of warns against this (i.e., trying to explain
the problem rather than describe it accurately), I can add that when I
make the same function call without the weights, the results from the
two versions are identical.
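
A minimal sketch of that kind of comparison, on synthetic data only (the real Data and pko1 are not reproduced here, so Sim, year, y, w and w1 are all hypothetical stand-ins). It fits the same probit model with large weights, with the same weights rescaled to mean 1, and with no weights. Rescaling weights by a constant should leave the point estimates unchanged, so a disagreement between the first two fits would point at the numerical handling of the weights rather than at the data. This is offered as a diagnostic idea, not as an explanation of the difference between 1.0.1 and 1.1.0.

```r
# Synthetic stand-in for the real data; values are made up for illustration.
set.seed(1)
n <- 2000
Sim <- data.frame(
  year = factor(sample(1993:1997, n, replace = TRUE)),
  y    = rbinom(n, 1, 0.05),          # rare binary outcome
  w    = round(runif(n, 50, 500))     # large, sampling-weight-like values
)
Sim$w1 <- Sim$w / mean(Sim$w)         # same weights rescaled to mean 1

# Same call with raw weights, rescaled weights, and no weights.
fit.raw <- glm(y ~ year, family = binomial("probit"),
               data = Sim, weights = w)
# Current versions of R warn about non-integer weights in binomial fits;
# the warning is suppressed here because it is expected.
fit.scaled <- suppressWarnings(
  glm(y ~ year, family = binomial("probit"), data = Sim, weights = w1)
)
fit.unw <- glm(y ~ year, family = binomial("probit"), data = Sim)

print(cbind(raw        = coef(fit.raw),
            scaled     = coef(fit.scaled),
            unweighted = coef(fit.unw)))
```

The standard errors of the raw and rescaled fits will differ, since they depend on the total weight, but the coefficient columns for those two fits should agree.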
Regards,
Markus
--
Markus Jantti | Department of Statistics
markus.jantti at abo.fi | Abo Akademi University
http://www.abo.fi/~mjantti | FIN 20500 Turku, Finland
358-9-643 747 (Home/Voice) | 358-2-2154 161 (Office/Voice)
| 358-2-2154 677 (Office/Fax)
PGP public key: http://www.abo.fi/~mjantti/pubring.asc