Xue, Liangjie
2009-Jan-09 19:41 UTC
[R] question about the scale in ridge regression of the MASS package
Hi,
I am reading the source code for ridge regression (lm.ridge) in the MASS
package, and I found the following piece of code:
X <- model.matrix(Terms, m, contrasts)
n <- nrow(X); p <- ncol(X)
offset <- model.offset(m)
if(!is.null(offset)) Y <- Y - offset
if(Inter <- attr(Terms, "intercept"))
{
    Xm <- colMeans(X[, -Inter])
    Ym <- mean(Y)
    p <- p - 1
    X <- X[, -Inter] - rep(Xm, rep(n, p))
    Y <- Y - Ym
} else Ym <- Xm <- NA
Xscale <- drop(rep(1/n, n) %*% X^2)^0.5  # line 38 of the original code
X <- X/rep(Xscale, rep(n, p))
It uses sqrt(sum((x - xbar)^2)/n) to calculate the scale of each column,
while the scale function provided by R uses sqrt(sum((x - xbar)^2)/(n - 1)),
which is the sample standard deviation of the data. Is this a mistake in the
MASS package, or is there some other reason?
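To make the difference concrete, here is a small check of the two divisors.
The vector x is made up for illustration and is not taken from MASS; as far
as I can tell, the two scales should differ only by the factor
sqrt((n - 1)/n):

```r
# Illustrative data only; x is a made-up vector, not part of MASS.
x <- c(2, 4, 4, 4, 5, 5, 7, 9)
n <- length(x)

# Centre the data first, as the excerpt does when an intercept is present.
xc <- x - mean(x)

# Divisor n, written the same way as the Xscale line in the excerpt.
scale_n <- drop(rep(1/n, n) %*% xc^2)^0.5

# Divisor n - 1: the sample standard deviation, which scale() uses
# for centred columns.
scale_nm1 <- sd(x)

# The ratio of the two should equal sqrt((n - 1)/n).
print(c(divisor_n = scale_n, divisor_n_minus_1 = scale_nm1))
print(all.equal(scale_n, scale_nm1 * sqrt((n - 1) / n)))
```

So for large n the two scalings are nearly identical, and the gap only
matters for small samples.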
Thank you very much.
L.J.