Jimmy Purnell
2009-Nov-17 18:20 UTC
[R] lm.ridge {MASS} ridge regression calculation question
Why do I get different coefficients for each of these methods?

Y <- matrix(c(136,144,145,169,176),byrow=F,5,1)
X1 <- matrix(c(91,105,109,130,146),byrow=F,5,1)
X2 <- matrix(c(11,13,17,19,23),byrow=F,5,1)
cY <- scale(Y, scale=FALSE)
sX1 <- scale(X1)
sX2 <- scale(X2)
library(MASS)
lm.ridge(cY ~ sX1 + sX2, lambda = 100)

This yields different coefficients than what I get from this, which I thought should be identical (if I set the lambdas to 0 rather than 100 I get identical results):

Xmat <- cbind(sX1, sX2)
Ymat <- cbind(cY)
XXI <- solve(t(Xmat)%*%Xmat + 100*diag(2))
XY <- t(Xmat)%*%Ymat
(ridge.coef <- XXI%*%XY)

Thanks,
Jim
Ravi Varadhan
2009-Nov-17 18:41 UTC
[R] lm.ridge {MASS} ridge regression calculation question
The difference is due to the use of "n" versus "n-1" when calculating the standard deviation for scaling. Try this:

Y <- matrix(c(136,144,145,169,176),byrow=F,5,1)
X1 <- matrix(c(91,105,109,130,146),byrow=F,5,1)
X2 <- matrix(c(11,13,17,19,23),byrow=F,5,1)
cY <- scale(Y, scale=FALSE)
sX1 <- scale(X1) * sqrt(5/4) # multiply by sqrt(n / (n-1))
sX2 <- scale(X2) * sqrt(5/4)
library(MASS)
ans <- lm.ridge(cY ~ sX1 + sX2, lambda = 100)

Xmat <- cbind(sX1, sX2)
Ymat <- cbind(cY)
XXI <- solve(t(Xmat)%*%Xmat + 100*diag(2))
XY <- t(Xmat)%*%Ymat
ridge.coef <- XXI%*%XY
all.equal(as.numeric(ridge.coef), as.numeric(ans$coef))

Ravi.

-----------------------------------------------------------------------------------
Ravi Varadhan, Ph.D.
Assistant Professor, The Center on Aging and Health
Division of Geriatric Medicine and Gerontology
Johns Hopkins University
Ph: (410) 502-2619
Fax: (410) 614-9625
Email: rvaradhan at jhmi.edu
Webpage: http://www.jhsph.edu/agingandhealth/People/Faculty_personal_pages/Varadhan.html
-----------------------------------------------------------------------------------
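The "n" versus "n-1" point can also be checked without hard-coding sqrt(5/4). The following is a minimal sketch, not the MASS implementation. It assumes that lm.ridge centers each predictor and divides it by its root mean square (divisor n) before fitting. The helper scale_n is a hypothetical name introduced here for illustration.

```r
library(MASS)

Y  <- c(136, 144, 145, 169, 176)
X1 <- c(91, 105, 109, 130, 146)
X2 <- c(11, 13, 17, 19, 23)
n  <- length(Y)

# Hypothetical helper: center x, then divide by the root mean square
# (divisor n), which is the scaling lm.ridge is assumed to apply internally.
scale_n <- function(x) {
  xc <- x - mean(x)
  xc / sqrt(sum(xc^2) / length(x))
}

# scale() divides by sd(), which uses n - 1, so the two scalings
# differ by the constant factor sqrt(n / (n - 1)).
ratio <- scale_n(X1) / as.numeric(scale(X1))

cY  <- Y - mean(Y)
sX1 <- scale_n(X1)
sX2 <- scale_n(X2)

# Manual ridge solution: (X'X + lambda * I)^{-1} X'y
lambda <- 100
Xmat   <- cbind(sX1, sX2)
manual <- solve(crossprod(Xmat) + lambda * diag(ncol(Xmat)),
                crossprod(Xmat, cY))

# lm.ridge fit on the same n-scaled predictors
fit  <- lm.ridge(cY ~ sX1 + sX2, lambda = lambda)
same <- all.equal(as.numeric(manual), as.numeric(fit$coef))
print(same)
```

Note that fit$coef holds the coefficients for the internally scaled predictors, while coef(fit) and the print method report them on the scale of the predictors that were supplied. With predictors already scaled by n, fit$scales should be 1 and the two agree.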