Full_Name: Russell Lenth
Version: 2.6.2
OS: Windows XP Pro
Submission from: (NULL) (128.255.132.36)
The poly() function allows a higher-degree polynomial than it should, when
raw=FALSE.
For example, consider 5 distinct 'x' values, each repeated twice. we
can fit a
polynomial of degree 8:
====R> x = rep(1:5, 2)
R> y = rnorm(10)
R> lm(y ~ poly(x, 8))
Call:
lm(formula = y ~ poly(x, 8))
Coefficients:
(Intercept) poly(x, 8)1 poly(x, 8)2 poly(x, 8)3 poly(x, 8)4 poly(x, 8)5
-0.07790 0.58768 0.30147 -0.18237 0.64779 -0.04444
poly(x, 8)6 poly(x, 8)7 poly(x, 8)8
-0.22694 -0.07500 0.17235
====
If we specify 'raw=TRUE' in the same example, we are limited (as we
should) to a
degree-4 polynomial:
====R> lm(y ~ poly(x, 8, raw=TRUE))
Call:
lm(formula = y ~ poly(x, 8, raw = TRUE))
Coefficients:
(Intercept) poly(x, 8, raw = TRUE)1 poly(x, 8, raw = TRUE)2
7.3958 -14.0150 8.2784
poly(x, 8, raw = TRUE)3 poly(x, 8, raw = TRUE)4 poly(x, 8, raw = TRUE)5
-1.9502 0.1597 NA
poly(x, 8, raw = TRUE)6 poly(x, 8, raw = TRUE)7 poly(x, 8, raw = TRUE)8
NA NA NA
====
We do get an error with an even higher degree:
====R> lm(y ~ poly(x, 10))
Error in poly(x, 10) : 'degree' must be less than number of points
====
Looking at the code for poly(), I think the problem is that it should check the
'rank' result from its call to qr().
[The above results are using Windows XP Pro, but I verified this bug under Linux
as well (x86_64, dual core). It seems pretty obvious to me that this bug is not
platform-dependent.]