ripley@stats.ox.ac.uk
2000-Dec-20 10:59 UTC
[Rd] glm gives incorrect results for zero-weight cases (PR#780)
Using zero-weight values in glm returns incorrect fitted values and
linear predictors, the ninth value in the following.

> example(glm)
> fit <- glm(counts ~ outcome + treatment, family = poisson(),
+            data=d.AD, weights=c(rep(1,8), 0))
> fit$linear.predictor
       1        2        3        4        5        6        7        8
2.989646 2.535391 2.862201 2.989646 2.535391 2.862201 3.145992 2.691737
       9
2.493205
> predict(fit, d.AD)
       1        2        3        4        5        6        7        8
2.989646 2.535391 2.862201 2.989646 2.535391 2.862201 3.145992 2.691737
       9
3.018547
> fitted(fit)
       1        2        3        4        5        6        7        8
19.87864 12.62136 17.50000 19.87864 12.62136 17.50000 23.24272 14.75728
       9
12.10000
> predict(fit, d.AD, type="response")
[1] 19.87864 12.62136 17.50000 19.87864 12.62136 17.50000 23.24272 14.75728
[9] 20.46154

The reason is obvious: glm.fit only ever updates eta[good], and
zero-weight values are not `good'. So eta[weights == 0] is stuck at the
initial values.

There are two possible fixes:

1) Update eta after the final fit, and then mu. Out of range values
could then be NA (although it looks like predict.glm does not check).

2) Update all eta and hence mu values during the iterations. This will
apply the constraints on eta/mu at zero-weight points too, and so might
be different.

I am inclined to think that 2) is right, and that adding points with zero
weight to the fit is not the same as omitting them.

Opinions?

--please do not edit the information below--

Version:
 platform = sparc-sun-solaris2.6
 arch = sparc
 os = solaris2.6
 system = sparc, solaris2.6
 status =
 major = 1
 minor = 2.0
 year = 2000
 month = 12
 day = 15
 language = R

Search Path:
 .GlobalEnv, package:ctest, Autoloads, package:base

--
Brian D. Ripley,                  ripley@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272860 (secr)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595

-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-devel mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !)  To: r-devel-request@stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
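A minimal sketch of a check for the discrepancy reported above. It rebuilds the d.AD data that example(glm) creates, so that the snippet is self-contained, gives case 9 zero weight, and compares the stored linear predictors with those recomputed by predict(). The names w, eta_stored, eta_pred and zero_wt_consistent are illustrative and do not come from the report. Note that the stored value 2.493205 in the report equals log(12.1), which is consistent with the Poisson family's starting value mustart = y + 0.1 for the count 12. Whether the two vectors agree on a given R version depends on how glm.fit treats zero-weight cases there, so no particular result is claimed.

```r
# Data as created by example(glm) (Dobson's randomized controlled trial),
# built explicitly so the sketch does not depend on running the example.
d.AD <- data.frame(
  counts    = c(18, 17, 15, 20, 10, 20, 25, 13, 12),
  outcome   = gl(3, 1, 9),
  treatment = gl(3, 3)
)

# Zero weight on the ninth case, as in the report.
w <- c(rep(1, 8), 0)
fit <- glm(counts ~ outcome + treatment, family = poisson(),
           data = d.AD, weights = w)

# Linear predictors stored by the fit versus those recomputed from the
# final coefficients; these should agree for every case if the
# zero-weight case is updated along with the others.
eta_stored <- unname(fit$linear.predictors)
eta_pred   <- unname(predict(fit, newdata = d.AD))

zero_wt_consistent <- isTRUE(all.equal(eta_stored, eta_pred))
print(cbind(eta_stored, eta_pred))
print(zero_wt_consistent)
```

If zero_wt_consistent is FALSE, the ninth row of the printed matrix shows the stale value next to the one implied by the fitted coefficients.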
Peter Dalgaard BSA
2000-Dec-20 12:37 UTC
[Rd] glm gives incorrect results for zero-weight cases (PR#780)
ripley@stats.ox.ac.uk writes:

> The reason is obvious: glm.fit only ever updates eta[good], and
> zero-weight values are not `good'. So eta[weights == 0] is stuck at the
> initial values.
>
> There are two possible fixes:
>
> 1) Update eta after the final fit, and then mu. Out of range values
> could then be NA (although it looks like predict.glm does not check).
>
> 2) Update all eta and hence mu values during the iterations. This will
> apply the constraints on eta/mu at zero-weight points too, and so might
> be different.
>
> I am inclined to think that 2) is right, and that adding points with zero
> weight to the fit is not the same as omitting them.
>
> Opinions?

Just for clarification: This applies only to cases where the
parametrization is non-canonical, e.g. additive models with Poisson
response, right? And essentially the issue is that if you have a model
like

    lambda = a + b x

and you put in a zero-weight observation with x = 0, then that should
effectively constrain a to be positive.

That does make quite good sense, yes.

--
   O__  ---- Peter Dalgaard             Blegdamsvej 3
  c/ /'_ --- Dept. of Biostatistics     2200 Cph. N
 (*) \(*) -- University of Copenhagen   Denmark      Ph: (+45) 35327918
~~~~~~~~~~ - (p.dalgaard@biostat.ku.dk)             FAX: (+45) 35327907
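The constraint argument for lambda = a + b x can be illustrated with a small sketch. All numbers here (x, w, a, b) are made up for illustration, and the sketch only evaluates the family's own validity check on two sets of points; it does not show what glm.fit itself does, which is the question under discussion. With an identity link, a candidate with a < 0 gives a valid Poisson mean at every positively weighted point but not at the zero-weight point x = 0.

```r
# Poisson family with identity link, i.e. lambda = a + b * x.
fam <- poisson(link = "identity")

# Hypothetical design: the first point has x = 0 and zero weight.
x <- c(0, 1, 2, 3, 4)
w <- c(0, 1, 1, 1, 1)

# Hypothetical candidate coefficients with a negative intercept.
a <- -0.5
b <- 2

# Means implied by the candidate at every point (identity link, so mu = eta).
mu <- fam$linkinv(a + b * x)

# Validity checked only where weights are positive (option 1 behaviour
# during iteration) versus at all points (option 2 behaviour).
valid_weighted_only <- fam$validmu(mu[w > 0])
valid_all_points    <- fam$validmu(mu)

print(c(weighted_only = valid_weighted_only, all_points = valid_all_points))
```

Under option 2 the candidate would be rejected because the mean at x = 0 is negative, which is the sense in which the zero-weight observation constrains a to be positive.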