J.T.Kent@leeds.ac.uk
2001-Nov-16 10:35 UTC
[Rd] pearson residuals in glm for binomial response (PR#1175)
R version 1.3.0 OS: SunOS 5.7, but I think the same problem occurs with Windows An incorrect formula seems to be used to calculate the pearson residuals for a generalized linear model with a binomial response. Here is a simple program which gives (a) the pearson residuals calculated directly, (b) the pearson residuals from glm, and (c) the deviance residuals from glm. The first and last columns are quite close. The middle differs from the first by a factor of n (n=10 here). This problem did not occur with an earlier version of R I tried, version 0.64.2. Thank you. Professor John T. Kent tel (direct) (44) 113-233-5103 Department of Statistics tel(secretary) (44) 113-233-5101 University of Leeds fax (44) 113-233-5090 Leeds LS2 9JT, England e-mail: J.T.Kent@leeds.ac.uk Input: x_1:4 # regressor variable y_c(2,6,7,8) # response binomial counts n_rep(10,4) # number of binomial trials ym_cbind(y,n-y) # response variable as a matrix glm1_glm(ym~x,binomial) # fit a generalized linear model f_fitted(glm1) rp1_(y-n*f)/sqrt(n*f*(1-f)) # direct calculation of pearson residuals rp2_residuals(glm1,type="pearson") # should be pearson residuals rd_residuals(glm1) # deviance residuals cbind(rp1,rp2,rd) # note second column is wrong, in this case by factor of 10 Output: rp1 rp2 rd [1,] -0.56502992 -0.056500696 -0.58467936 [2,] 0.73739407 0.073739452 0.73886417 [3,] 0.05266796 0.005266533 0.05279174 [4,] -0.38314283 -0.038307944 -0.37010220>-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.- r-devel mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html Send "info", "help", or "[un]subscribe" (in the "body", not the subject !) To: r-devel-request@stat.math.ethz.ch _._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
Peter Dalgaard BSA
2001-Nov-16 11:23 UTC
[Rd] pearson residuals in glm for binomial response (PR#1175)
J.T.Kent@leeds.ac.uk writes:> R version 1.3.0 > OS: SunOS 5.7, but I think the same problem occurs with Windows > > An incorrect formula seems to be used to calculate the pearson residuals > for a generalized linear model with a binomial response. Here is a > simple program which gives (a) the pearson residuals calculated directly, > (b) the pearson residuals from glm, and (c) the deviance residuals from > glm. The first and last columns are quite close. The middle differs from > the first by a factor of n (n=10 here). This problem did not occur with > an earlier version of R I tried, version 0.64.2.I almost said that your version is old and that you forgot to check the bug repository for prior reports (PR#1123 was also on Pearson residuals). However, the problem persists in the development version. The issue is that we have (in 1.4.0pre) pearson = (y - mu)/sqrt(wts * (object$family$variance)(mu)) However, the y and mu that we use are divided by n:> glm1$prior.weights[1] 10 10 10 10> glm1$fitted.values[1] 0.2802480 0.4834715 0.6923131 0.8439674> glm1$family$variance(glm1$fitted.values)[1] 0.2017091 0.2497268 0.2130157 0.1316864> glm1$y[1] 0.2 0.6 0.7 0.8 I've stuck my foot in it with this stuff before, but should the denominator not be sqrt(object$family$variance(mu)/wts) ?? (If the weights are large, the variance should be smaller, not larger). Could someone please ram a wooden stake through this one and prevent it from haunting us again? -- O__ ---- Peter Dalgaard Blegdamsvej 3 c/ /'_ --- Dept. of Biostatistics 2200 Cph. N (*) \(*) -- University of Copenhagen Denmark Ph: (+45) 35327918 ~~~~~~~~~~ - (p.dalgaard@biostat.ku.dk) FAX: (+45) 35327907 -.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.- r-devel mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html Send "info", "help", or "[un]subscribe" (in the "body", not the subject !) To: r-devel-request@stat.math.ethz.ch _._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
Thomas Lumley
2001-Nov-17 18:40 UTC
[Rd] pearson residuals in glm for binomial response (PR#1175)
On Fri, 16 Nov 2001 J.T.Kent@leeds.ac.uk wrote:> R version 1.3.0 > OS: SunOS 5.7, but I think the same problem occurs with Windows > > An incorrect formula seems to be used to calculate the pearson residuals > for a generalized linear model with a binomial response. Here is a > simple program which gives (a) the pearson residuals calculated directly, > (b) the pearson residuals from glm, and (c) the deviance residuals from > glm. The first and last columns are quite close. The middle differs from > the first by a factor of n (n=10 here). This problem did not occur with > an earlier version of R I tried, version 0.64.2. >Yes, it is a bug, resulting from an incorrect fix to an earlier bug that had the sign wrong for decreasing link functions like the reciprocal. It's already been reported and fixed. -thomas -.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.- r-devel mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html Send "info", "help", or "[un]subscribe" (in the "body", not the subject !) To: r-devel-request@stat.math.ethz.ch _._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._