On Tue, Jun 8, 2010 at 7:10 AM, Enrico Colosimo <enricoc57 at gmail.com> wrote:
> Hello,
> I am having some trouble running a very simple
> example. I am running a logistic regression entering the SAME data set
> in two different forms and getting different values for the deviance
> residual.
>
> Just look at this naive data set:
>
> ===============================================================
> # 1- Entering as a Bernoulli data set
> y<-c(1,0,1,1,0)
> x<-c(2,2,5,5,8)
> ajust1<-glm(y~x,family=binomial(link="logit"))
> ajust1
> #
> Coefficients:
> (Intercept)            x
>      1.3107      -0.2017
>
> Degrees of Freedom: 4 Total (i.e. Null);  3 Residual
> Null Deviance:      6.73
> Residual Deviance: 6.491        AIC: 10.49
> #
> # 2- Entering as Binomial data set
> #
> ysim<-c(1,2,0)
> ynao<-c(1,0,1)
> x<-c(2,5,8)
> dados<-cbind(ysim,ynao,x)
> dados<-as.data.frame(dados)
> attach(dados)
> ajust2<-glm(as.matrix(dados[,c(1,2)])~x,family=binomial, data=dados)
> summary(ajust2)
> #
> Coefficients:
> (Intercept)            x
>      1.3107      -0.2017
>
> Degrees of Freedom: 2 Total (i.e. Null);  1 Residual
> Null Deviance:      3.958
> Residual Deviance: 3.718        AIC: 9.104
> ================================================================
>
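> It seems that there is a problem with the first fitting!!!

In what way? Notice that the estimates of the coefficients are the
same in the two fits, and the difference between the null deviance and
the residual deviance is the same up to rounding (6.73 - 6.491 = 0.239
for the first fit, 3.958 - 3.718 = 0.240 for the second). If you are
worried about the deviance in the first fit being greater than the
deviance in the second fit, that is because of the definition of the
deviance: it is measured relative to a saturated model, and the
saturated model for the grouped (binomial) data is not the same as the
one for the individual (Bernoulli) responses. The deviance for the
binomial fit is therefore a shifted version of the deviance from the
Bernoulli fit, with a shift that does not depend on the coefficients.
That is why the null deviance is also reported: differences in deviance
between nested models agree across the two fits.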
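
If it helps, here is a small sketch that refits both forms of the data and compares them directly. The object names (fit_bern, fit_binom, grouped, shift_resid, shift_null) are mine, not from the original post. For these data the shift should work out to 4*log(2), about 2.773, because the only group whose saturated fit is not perfect is x = 2 (1 success out of 2); that value is my own calculation under the usual definition of the binomial deviance, so check it against what R prints.

```r
# Ungrouped (Bernoulli) layout: one row per observation
y <- c(1, 0, 1, 1, 0)
x <- c(2, 2, 5, 5, 8)
fit_bern <- glm(y ~ x, family = binomial)

# Grouped (binomial) layout: successes (ysim) and failures (ynao) per distinct x
grouped <- data.frame(ysim = c(1, 2, 0), ynao = c(1, 0, 1), x = c(2, 5, 8))
fit_binom <- glm(cbind(ysim, ynao) ~ x, family = binomial, data = grouped)

# Coefficient estimates side by side (they agree up to convergence tolerance)
print(cbind(bernoulli = coef(fit_bern), binomial = coef(fit_binom)))

# Shift between the two deviances, for the fitted model and for the null model
shift_resid <- deviance(fit_bern) - deviance(fit_binom)
shift_null  <- fit_bern$null.deviance - fit_binom$null.deviance
print(c(shift_resid = shift_resid, shift_null = shift_null, four_log2 = 4 * log(2)))

# Drop in deviance due to x, which is what a likelihood ratio test uses
print(c(bernoulli = fit_bern$null.deviance - deviance(fit_bern),
        binomial  = fit_binom$null.deviance - deviance(fit_binom)))
```

Passing cbind(ysim, ynao) with data = grouped also avoids the attach() call and the as.matrix() indexing, which are not needed for the fit.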