Xue, Liangjie
2009-Jan-09 19:41 UTC
[R] question about the scale in ridge regression of the MASS package
Hi,

I am reading the source code of the ridge regression in the MASS package, and I found the following piece of code:

    X <- model.matrix(Terms, m, contrasts)
    n <- nrow(X); p <- ncol(X)
    offset <- model.offset(m)
    if(!is.null(offset)) Y <- Y - offset
    if(Inter <- attr(Terms, "intercept"))
    {
        Xm <- colMeans(X[, -Inter])
        Ym <- mean(Y)
        p <- p - 1
        X <- X[, -Inter] - rep(Xm, rep(n, p))
        Y <- Y - Ym
    } else Ym <- Xm <- NA
    Xscale <- drop(rep(1/n, n) %*% X^2)^0.5   # line 38 of the original code
    X <- X/rep(Xscale, rep(n, p))

It uses sqrt(sum((x - xbar)^2)/n) to calculate the scale of each column, while the scale function provided by R uses sqrt(sum((x - xbar)^2)/(n - 1)), which is the sample standard deviation of the data.

Is this a mistake in the MASS package, or is there some other reason for the difference?

Thank you very much.

L.J.
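
P.S. To make the discrepancy concrete, here is a minimal sketch on made-up data. The matrix X and the names Xc, scale_n and scale_n1 are my own for illustration and do not come from MASS. If I have read the code correctly, the two scales should differ only by the constant factor sqrt((n - 1)/n):

```r
# Toy data, invented for illustration: 5 observations, 2 predictors
X <- cbind(c(1, 2, 3, 4, 5), c(2, 4, 4, 8, 12))
n <- nrow(X); p <- ncol(X)

# Center the columns the same way the quoted code does
Xc <- X - rep(colMeans(X), rep(n, p))

# Scale as on line 38 of the quoted code: divisor n
scale_n <- drop(rep(1/n, n) %*% Xc^2)^0.5

# Scale as computed by scale() on centered data: divisor n - 1
scale_n1 <- attr(scale(X), "scaled:scale")

# Compare the two, ignoring any names attached to the vectors
same_up_to_factor <- isTRUE(all.equal(unname(scale_n),
                                      unname(scale_n1) * sqrt((n - 1)/n)))
print(rbind(divisor_n = scale_n, divisor_n_minus_1 = scale_n1))
print(same_up_to_factor)
```

With only 5 rows the factor sqrt(4/5) is noticeable, but it approaches 1 as n grows, so for large data sets the two scalings would be nearly identical.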