Heather.Turner@warwick.ac.uk
2005-Dec-06 14:52 UTC
[Rd] standardized residuals (rstandard & plot.lm) (PR#8367)
Full_Name: Heather Turner Version: 2.2.0 OS: Windows XP Submission from: (NULL) (137.205.240.44) Standardized residuals as calculated by rstandard.lm, rstandard.glm and plot.lm are Inf/NaN rather than zero when the un-standardized residuals are zero. This causes plot.lm to break when calculating 'ylim' for any of the plots of standardized residuals. Example: "occupationalStatus" <- structure(as.integer(c(50, 16, 12, 11, 2, 12, 0, 0, 19, 40, 35, 20, 8, 28, 6, 3, 26, 34, 65, 58, 12, 102, 19, 14, 8, 18, 66, 110, 23, 162, 40, 32, 7, 11, 35, 40, 25, 90, 21, 15, 11, 20, 88, 183, 46, 554, 158, 126, 6, 8, 23, 64, 28, 230, 143, 91, 2, 3, 21, 32, 12, 177, 71, 106) ), .Dim = as.integer(c(8, 8)), .Dimnames structure(list(origin = c("1", "2", "3", "4", "5", "6", "7", "8"), destination = c("1", "2", "3", "4", "5", "6", "7", "8")), .Names = c("origin", "destination")), class = "table") Diag <- as.factor(diag(1:8)) Rscore <- scale(as.numeric(row(occupationalStatus)), scale = FALSE) Cscore <- scale(as.numeric(col(occupationalStatus)), scale = FALSE) Uniform <- glm(Freq ~ origin + destination + Diag + Rscore:Cscore, family = poisson, data = occupationalStatus) residuals(Uniform)[as.logical(diag(8))] #zero/near-zero rstandard(Uniform)[as.logical(diag(8))] #mostly Inf/NaN plot(Uniform) #breaks on qqnorm plot (or any 'which' > 1) This could be fixed by replacing standardized residuals with zero where the hat value is one, e.g. rstandard.glm <- function (model, infl = lm.influence(model, do.coef = FALSE), ...) { res <- infl$wt.res hat <- infl$hat ifelse(hat == 1, 0, res/sqrt(summary(model)$dispersion * (1 - infl$hat))) } etc.
ripley@stats.ox.ac.uk
2005-Dec-06 16:10 UTC
[Rd] standardized residuals (rstandard & plot.lm) (PR#8367)
Curiously, I was just looking at that, since I believe the answer should be NaN, and some optimizing compilers/fast BLASes are not giving that. (There's an example in reg-test-3.R.) So I think we need to return NaN when hat is within rounding error of 1. My take is that plot.lm should handle this: you will see most but not all cases have na.rm=TRUE in calculating ylim, but as Inf is theoretically impossible it has not been considered. Note that plot.lm does not use rstandard and so needs a separate fix. Thanks for the report On Tue, 6 Dec 2005 Heather.Turner at warwick.ac.uk wrote:> Full_Name: Heather Turner > Version: 2.2.0 > OS: Windows XP > Submission from: (NULL) (137.205.240.44) > > > Standardized residuals as calculated by rstandard.lm, rstandard.glm and plot.lm > are Inf/NaN rather than zero when the un-standardized residuals are zero. This > causes plot.lm to break when calculating 'ylim' for any of the plots of > standardized residuals. Example: > > "occupationalStatus" <- > structure(as.integer(c(50, 16, 12, 11, 2, 12, 0, 0, 19, 40, 35, > 20, 8, 28, 6, 3, 26, 34, 65, 58, 12, 102, 19, 14, 8, > 18, 66, 110, 23, 162, 40, 32, 7, 11, 35, 40, 25, 90, > 21, 15, 11, 20, 88, 183, 46, 554, 158, 126, 6, 8, > 23, > 64, 28, 230, 143, 91, 2, 3, 21, 32, 12, 177, 71, > 106) > ), .Dim = as.integer(c(8, 8)), .Dimnames > structure(list(origin = c("1", "2", "3", "4", "5", "6", "7", > "8"), > destination = c("1", "2", "3", "4", "5", "6", "7", > "8")), .Names = c("origin", "destination")), > class = "table") > Diag <- as.factor(diag(1:8)) > Rscore <- scale(as.numeric(row(occupationalStatus)), scale = FALSE) > Cscore <- scale(as.numeric(col(occupationalStatus)), scale = FALSE) > Uniform <- glm(Freq ~ origin + destination + Diag + > Rscore:Cscore, family = poisson, data = occupationalStatus) > residuals(Uniform)[as.logical(diag(8))] #zero/near-zero > rstandard(Uniform)[as.logical(diag(8))] #mostly Inf/NaN > plot(Uniform) #breaks on qqnorm plot (or any 'which' > 1) > > This could be fixed by replacing standardized residuals with zero where the hat > value is one, e.g. > rstandard.glm <- function (model, > infl = lm.influence(model, do.coef = FALSE), > ...) { > res <- infl$wt.res > hat <- infl$hat > ifelse(hat == 1, 0, res/sqrt(summary(model)$dispersion * (1 - > infl$hat))) > } > etc. > > ______________________________________________ > R-devel at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-devel > >-- Brian D. Ripley, ripley at stats.ox.ac.uk Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UK Fax: +44 1865 272595
Possibly Parallel Threads
- standardized residuals (rstandard & plot.lm) (PR#8468)
- rstandard.glm() in base/R/lm.influence.R
- Help RFM analysis in R (i want a code where i can define my own breaks instead of system defined breaks used in auto_RFM package)
- Help RFM analysis in R (i want a code where i can define my own breaks instead of system defined breaks used in auto_RFM package)
- Help RFM analysis in R (i want a code where i can define my own breaks instead of system defined breaks used in auto_RFM package)