A.Robinson at ms.unimelb.edu.au
2006-Oct-24 22:25 UTC
[Rd] Cook's Distance in GLM (PR#9316)
Hi Community,

I'm trying to reconcile Cook's Distances computed in glm. The
following snippet of code shows that the Cook's Distances contours on
the plot of Residuals v Leverage do not seem to be the same as the
values produced by cooks.distance() or in the Cook's Distance against
observation number plot.

counts <- c(18,17,15,20,10,20,25,13,12)
outcome <- gl(3,1,9)
treatment <- gl(3,3)
d.AD <- data.frame(treatment, outcome, counts)
glm.D93 <- glm(counts ~ outcome + treatment, family=poisson())

opar <- par(mfrow=c(2,1))
plot(glm.D93, which=c(4,5))
par(opar)

cooks.distance(glm.D93)

The difference is reasonably moderate in this case. My suspicions
were aroused by a case in which the plot showed five or six points
greater than 1, none of which could be identified in the output of the
function.

> version
               _
platform       i386-unknown-freebsd6.1
arch           i386
os             freebsd6.1
system         i386, freebsd6.1
status         Patched
major          2
minor          4.0
year           2006
month          10
day            03
svn rev        39576
language       R
version.string R version 2.4.0 Patched (2006-10-03 r39576)

Cheers

Andrew
--
Andrew Robinson
Department of Mathematics and Statistics    Tel: +61-3-8344-9763
University of Melbourne, VIC 3010 Australia Fax: +61-3-8344-4599
http://www.ms.unimelb.edu.au/~andrewpr
http://blogs.mbs.edu/fishing-in-the-bay/
A.Robinson at ms.unimelb.edu.au writes:

> Hi Community,
>
> I'm trying to reconcile Cook's Distances computed in glm. The
> following snippet of code shows that the Cook's Distances contours on
> the plot of Residuals v Leverage do not seem to be the same as the
> values produced by cooks.distance() or in the Cook's Distance against
> observation number plot.
>
> counts <- c(18,17,15,20,10,20,25,13,12)
> outcome <- gl(3,1,9)
> treatment <- gl(3,3)
> d.AD <- data.frame(treatment, outcome, counts)
> glm.D93 <- glm(counts ~ outcome + treatment, family=poisson())
>
> opar <- par(mfrow=c(2,1))
> plot(glm.D93, which=c(4,5))
> par(opar)
>
> cooks.distance(glm.D93)
>
> The difference is reasonably moderate in this case. My suspicions
> were aroused by a case in which the plot showed five or six points
> greater than 1, none of which could be identified in the output of the
> function.

Hmm, yes. A good guess is that the contour levels need to be modified
by a dispersion factor. The plot is much more consistent with

cooks.distance(glm.D93, dispersion=5.129/4)

--
   O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark          Ph:  (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)                  FAX: (+45) 35327907
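
A minimal sketch of the comparison suggested above, for anyone who wants to reproduce it. It assumes that the 5.129/4 in the reply is the residual deviance of glm.D93 divided by its residual degrees of freedom; the reply does not say so explicitly, although the numbers match. The names phi_hat, cd_default and cd_scaled are introduced here for illustration only.

```r
# Data and model exactly as posted in the original message
counts <- c(18,17,15,20,10,20,25,13,12)
outcome <- gl(3,1,9)
treatment <- gl(3,3)
glm.D93 <- glm(counts ~ outcome + treatment, family=poisson())

# Assumption: the suggested 5.129/4 is residual deviance / residual df
phi_hat <- deviance(glm.D93) / df.residual(glm.D93)

# Default call: for the poisson family the dispersion is taken as 1
cd_default <- cooks.distance(glm.D93)

# Same distances with the dispersion supplied explicitly
cd_scaled <- cooks.distance(glm.D93, dispersion = phi_hat)

# Side-by-side view of the two sets of values
print(round(cbind(default = cd_default, scaled = cd_scaled), 3))

# The dispersion enters the denominator of Cook's distance, so the
# two vectors should differ by exactly the factor phi_hat
print(all.equal(cd_default, cd_scaled * phi_hat))
```

If the guess in the reply is right, the cd_scaled column is the one to read against the contours in the Residuals v Leverage panel, while cd_default is what cooks.distance(glm.D93) returns on its own.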