John Maindonald
2003-May-21 01:41 UTC
[Rd] predict.lm(); zero weights; follow-on to PR#3043
I have now looked at implications for zero weights The code (lines 45-47) if (type != "terms") { if (missing(newdata)) XRinv <- qr.Q(object$qr)[, p1, drop = FALSE] requires the more extensive change: if (missing(newdata)&(is.null(w)|all(w>0))){ XRinv <- if(is.null(w)) qr.Q(object$qr)[, p1, drop = FALSE] else qr.Q(object$qr)[, p1, drop = FALSE]/sqrt(w)} The issue here is that the object returned by qr.Q(object$qr) has only as many rows as there are non-zero values of w. So the easy way to fix the matter is that taken above, to revert to the more straightforward but probably lengthier calculation Rinv <- qr.solve(qr.R(object$qr)[p1, p1]) XRinv <- X[, piv] %*% Rinv I was aware that this was probably an issue when I submitted the earlier bug report, but did not not want to delay reporting that issue. SEs of prediction values (i.e., predicted value for a new observation) seem to me an ambiguous notion when there are weights. As it stands the code effectively assumes a weight of one for SEs of prediction values. I guess this is reasonable. Here is code that exercises this patch: "xy0" <- structure(list(x = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10), y = c(1.036, 1.883, 3.11, 3.148, 4.226, 6.291, 7.312, 7.796, 9.563, 9.563), w = c(0.067, 0, 1.588, 0.239, 0.015, 0.938, 0.041, 0.473, 0.483, 0.044)), .Names = c("x", "y", "w"), class = "data.frame", row.names = c("1", "2", "3", "4", "5", "6", "7", "8", "9", "10")) > xy0.lm <- lm(y ~ x, weights=w, data=xy0) > my.predict(xy0.lm, se=T)$se 1 2 3 4 5 6 7 8 0.2485517 0.2052760 0.1666100 0.1365279 0.1215779 0.1272119 0.1511453 0.1864598 9 10 0.2279254 0.2727509 John Maindonald email: john.maindonald@anu.edu.au phone : +61 2 (6125)3473 fax : +61 2(6125)5549 Centre for Bioinformation Science, Room 1194, John Dedman Mathematical Sciences Building (Building 27) Australian National University, Canberra ACT 0200.