Dear all,

I am running a logistic regression and this is the output:

glm(formula = educationUniv ~ brncntr, family = binomial)

Deviance Residuals:
    Min       1Q   Median       3Q      Max
-0.8825  -0.7684  -0.7684   1.5044   1.6516

Coefficients:
            Estimate Std. Error z value Pr(>|z|)
(Intercept) -1.06869    0.01155 -92.487   <2e-16 ***
brncntrNo    0.32654    0.03742   8.726   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 49363  on 42969  degrees of freedom
Residual deviance: 49289  on 42968  degrees of freedom
AIC: 49293

I thought that the residuals should all be restricted to the range 0 to 1 (since I am predicting a binary outcome). I read many posts on this list and I realized that there are four(!?) different types of residuals. I need a simple account of these four types of residuals; if anyone can help, that would be great.

residuals(glm1, "response")
residuals(glm1, "pearson")
residuals(glm1, "deviance")
residuals(glm1, "working") - especially this one confuses me a lot!

What is the "working" option and how is this different?

Thank you
Jason

Dr. Iasonas Lamprianou

Assistant Professor (Educational Research and Evaluation)
Department of Education Sciences
European University-Cyprus
P.O. Box 22006
1516 Nicosia
Cyprus
Tel.: +357-22-713178
Fax: +357-22-590539

Honorary Research Fellow
Department of Education
The University of Manchester
Oxford Road, Manchester M13 9PL, UK
Tel. 0044 161 275 3485
iasonas.lamprianou at manchester.ac.uk
Dear Iasonas,

> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On
> Behalf Of Iasonas Lamprianou
> Sent: August-20-10 5:55 AM
> To: r-help at r-project.org
> Subject: [R] Deviance Residuals
>
> [...]
>
> I thought that the residuals should all be restricted in the range 0 to 1
> (since I am predicting a binary outcome). I read many posts on this list and
> I realized that there are four(!?) different types of residuals. I need a
> simple account of these four types of residuals, if anyone can help it will
> be great.
>
> residuals(glm1, "response")

Residuals on the scale of the response, y - E(y); in a binary logistic regression, y is 0 or 1 and E(y) is the fitted probability of a 1. As it turns out, response residuals aren't terribly useful for a logit model.

> residuals(glm1, "pearson")

Components of the Pearson goodness-of-fit statistic.

> residuals(glm1, "deviance")

Components of the residual deviance for the model.

> residuals(glm1, "working") - especially this one confuses me a lot!

Residuals from the final weighted-least-squares regression of the IWLS procedure used to fit the model; useful, for example, for detecting nonlinearity.

> What is the "working" option and how is this different?

See above.
I hope this helps,
 John

--------------------------------
John Fox
Senator William McMaster Professor of Social Statistics
Department of Sociology
McMaster University
Hamilton, Ontario, Canada
web: socserv.mcmaster.ca/jfox
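The four definitions above can be checked by hand. The following is a minimal sketch on simulated data, not the original survey: the sample size, the group probabilities, and the variable values are made up for illustration, and only the model form (a binary outcome regressed on a two-level factor) follows the thread. It also shows that response residuals lie between -1 and 1 rather than 0 and 1, and that the other three types are rescaled versions that are not bounded that way.

```r
# Simulated stand-in for educationUniv ~ brncntr (all values hypothetical)
set.seed(1)
n <- 500
brncntr <- factor(sample(c("Yes", "No"), n, replace = TRUE),
                  levels = c("Yes", "No"))
y <- rbinom(n, size = 1, prob = ifelse(brncntr == "No", 0.32, 0.26))

glm1 <- glm(y ~ brncntr, family = binomial)
mu <- unname(fitted(glm1))   # fitted probabilities, E(y)

# response: y - E(y)
r_response <- y - mu
# pearson: response residual divided by the binomial standard deviation
r_pearson <- (y - mu) / sqrt(mu * (1 - mu))
# deviance: signed square root of each observation's deviance contribution
r_deviance <- sign(y - mu) *
  sqrt(-2 * (y * log(mu) + (1 - y) * log(1 - mu)))
# working: response residual divided by d(mu)/d(eta) = mu * (1 - mu) (logit link)
r_working <- (y - mu) / (mu * (1 - mu))

# Compare each hand computation with residuals()
same <- function(a, type) {
  isTRUE(all.equal(a, unname(residuals(glm1, type = type))))
}
checks <- c(response = same(r_response, "response"),
            pearson  = same(r_pearson,  "pearson"),
            deviance = same(r_deviance, "deviance"),
            working  = same(r_working,  "working"))
print(checks)

# The squared deviance residuals add up to the residual deviance
print(c(sum_sq = sum(r_deviance^2), deviance = deviance(glm1)))
range(r_response)
```

The squared Pearson residuals sum in the same way to the Pearson chi-square statistic, which is what "components of" means in the descriptions above.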
On Aug 20, 2010, at 5:54 AM, Iasonas Lamprianou wrote:

> Dear all,
>
> I am running a logistic regression and this is the output:
>
> [...]
>
> I thought that the residuals should all be restricted in the range 0
> to 1 (since I am predicting a binary outcome).

The internal regression calculations are done on the log-odds scale, so the working residuals are on that scale. Those are stored in the glm.obj as the "residuals" item. I believe that if you tried mean(glm.obj$residuals) you should get 0. Presumably the deviance residuals are offered in preference to the working residuals because the deviance residual's use as an influence measure is made readily interpretable by reference to chi-square statistics. Page 205 of the Hastie and Pregibon citation has all the definitions.

-- 
David.

> I read many posts on this list and I realized that there are
> four(!?) different types of residuals. I need a simple account of
> these four types of residuals, if anyone can help it will be great.
>
> residuals(glm1, "response")
> residuals(glm1, "pearson")
> residuals(glm1, "deviance")
> residuals(glm1, "working") - especially this one confuses me a lot!
>
> What is the "working" option and how is this different?
>
> Thank you
> Jason
>
> Dr. Iasonas Lamprianou

-- 
David Winsemius, MD
West Hartford, CT
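A rough sketch of the points about the "residuals" item and the log-odds scale, again on simulated data (the predictor x and the coefficients used to generate y are assumptions made for illustration). With a logit link, each working residual times its working weight mu * (1 - mu) equals y - mu, so in a model with an intercept it is the weighted sum of the working residuals that is zero at convergence. The plain mean is also zero in the model from the original post, where the only predictor is a two-level factor and the fitted probability is constant within each group, but it need not be zero with a continuous predictor.

```r
set.seed(2)
n <- 400
x <- rnorm(n)                              # hypothetical continuous predictor
y <- rbinom(n, size = 1, prob = plogis(-1 + 0.8 * x))
glm.obj <- glm(y ~ x, family = binomial)

# The "residuals" component holds the working residuals
stored_is_working <- isTRUE(all.equal(glm.obj$residuals,
                                      residuals(glm.obj, type = "working")))

mu <- fitted(glm.obj)
w  <- mu * (1 - mu)                        # working weights for the logit link
weighted_mean   <- sum(w * glm.obj$residuals) / sum(w)
unweighted_mean <- mean(glm.obj$residuals) # not guaranteed to be zero here

# Working response on the log-odds scale: linear predictor + working residual.
# A weighted least-squares fit of it reproduces the glm coefficients.
z   <- glm.obj$linear.predictors + glm.obj$residuals
wls <- lm(z ~ x, weights = w)

print(stored_is_working)
print(c(weighted = weighted_mean, unweighted = unweighted_mean))
print(rbind(glm = coef(glm.obj), wls = coef(wls)))
```

The last comparison is the "final weighted-least-squares regression of the IWLS procedure" in concrete form: the working residuals are simply the residuals of that regression.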
Dear Dr. Iasonas,

Perhaps the following reference could also be useful:

Dunn, P.K., Smyth, G.K., 1996. Randomized quantile residuals. J. Comput. Graph. Stat. 5, 236-244.

An introduction to this residual definition and a link to the paper can be found here:
http://www.statsci.org/smyth/pubs/residual.html

This type of residual can be easily computed in R using the qresiduals function provided by the statmod package. For a binomial response, the code would be as follows:

qres.binom(glm.obj)

I hope this also helps,

Renzo Tascheri O.
Marine Biologist
Department of Living Marine Resources Assessment
Instituto de Fomento Pesquero
Avda. Blanco 839
Valparaíso - Chile
Fono: (56)-32-2151617
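A short usage sketch for the call above, assuming the statmod package is installed (the block skips the computation otherwise) and using simulated data rather than the original survey. Because these residuals are randomized, a seed is set so that repeated runs give the same values.

```r
set.seed(3)
n <- 300
x <- rnorm(n)                              # hypothetical predictor
y <- rbinom(n, size = 1, prob = plogis(-1 + 0.8 * x))
glm.obj <- glm(y ~ x, family = binomial)

qres <- NULL
if (requireNamespace("statmod", quietly = TRUE)) {
  # Randomized quantile residuals for a binomial fit
  qres <- statmod::qres.binom(glm.obj)
  print(summary(qres))
}
```

If the fitted model is adequate, these residuals should look approximately standard normal, so a normal Q-Q plot (qqnorm(qres); qqline(qres)) is a natural next step.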