russell-lenth at uiowa.edu
2008-Apr-23 17:05 UTC
[Rd] poly() can exceed degree k - 1 for k distinct points (PR#11251)
The poly() function can create more variables than can be fitted when there are replicated values. In the example below, 'x' has only 5 distinct values, but I can apparently fit a 12th-degree polynomial with no error messages or even nonzero coefficients: R> x = rep(1:5,3) R> y = rnorm(15) R> lm(y ~ poly(x, 12)) Call: lm(formula = y ~ poly(x, 12)) Coefficients: (Intercept) poly(x, 12)1 poly(x, 12)2 poly(x, 12)3 -0.27442 0.35822 -0.26412 2.11780 poly(x, 12)4 poly(x, 12)5 poly(x, 12)6 poly(x, 12)7 1.83117 -0.09260 -0.48572 1.94030 poly(x, 12)8 poly(x, 12)9 poly(x, 12)10 poly(x, 12)11 -0.88297 -1.04556 0.74289 -0.01422 poly(x, 12)12 -0.46548 If I try the same with raw=TRUE, only a 4th-degree polynomial is obtained: R> lm(y ~ poly(x, 12, raw=TRUE)) Call: lm(formula = y ~ poly(x, 12, raw = TRUE)) Coefficients: (Intercept) poly(x, 12, raw = TRUE)1 9.7527 -22.0971 poly(x, 12, raw = TRUE)2 poly(x, 12, raw = TRUE)3 15.3293 -4.1005 poly(x, 12, raw = TRUE)4 poly(x, 12, raw = TRUE)5 0.3686 NA poly(x, 12, raw = TRUE)6 poly(x, 12, raw = TRUE)7 NA NA poly(x, 12, raw = TRUE)8 poly(x, 12, raw = TRUE)9 NA NA poly(x, 12, raw = TRUE)10 poly(x, 12, raw = TRUE)11 NA NA poly(x, 12, raw = TRUE)12 NA I believe that what is needed is a check on the 'rank' result after poly() calls the qr() function. System info: R version: 2.6.2 Windows XP Pro; also get same results on Linux x_64 dual-core system. [I thought I submitted this via the website yesterday, but I can find no trace of it. I apologize if this is a duplicate, but I don't think it is.] -- Russell V. Lenth, Professor Department of Statistics & Actuarial Science (319)335-0814 FAX (319)335-3017 The University of Iowa russell-lenth at uiowa.edu Iowa City, IA 52242 USA http://www.stat.uiowa.edu/~rlenth/