Dear all,

For an ordinary ridge regression problem, I followed three different approaches:
1. estimate beta without any standardization
2. estimate standardized beta (standardizing X and y) and then convert back
3. estimate beta using the lm.ridge() function

library(MASS)   # for lm.ridge()

X <- matrix(c(1,2,9,3,2,4,7,2,3,5,9,1), 4, 3)
y <- as.matrix(c(2,3,4,5))

n <- nrow(X)
p <- ncol(X)

# Without standardization
intercept <- rep(1, n)
Xn <- cbind(intercept, X)
K <- diag(c(0, rep(1.5, p)))            # intercept left unpenalized
beta1 <- solve(t(Xn) %*% Xn + K) %*% t(Xn) %*% y
beta1

# With standardization
ys <- scale(y)
Xs <- scale(X)
K <- diag(1.5, p)
bs <- solve(t(Xs) %*% Xs + K) %*% t(Xs) %*% ys
b <- sd(as.vector(y)) * (bs / apply(X, 2, sd))   # convert back to the original scale
intercept <- mean(y) - sum(colMeans(X) * b)
beta2 <- rbind(intercept, b)
beta2

# Using the lm.ridge() function of the MASS package
beta3 <- lm.ridge(y ~ X, lambda = 1.5)
beta3

I'm getting three different results with the approaches described above:

> beta1
                 [,1]
intercept   3.4007944
            0.3977462
            0.2082025
           -0.4829115
> beta2
                 [,1]
intercept   3.3399855
            0.1639469
            0.0262021
           -0.1228987
> beta3
                    X1          X2          X3
3.35158977  0.19460958  0.03152778 -0.15546775

It would be very helpful if anybody could explain why the outputs differ.

I am extremely sorry for my previous anonymous post.

Regards,

-----
Sabyasachi Patra
PhD Scholar
Indian Institute of Technology Kanpur
India
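A likely source of the gap between approaches 2 and 3 is the scaling convention: scale() standardizes with the divisor-(n-1) standard deviation, whereas MASS::lm.ridge centres X and y and scales each column of X by the divisor-n standard deviation, leaving y unscaled. (Standardizing y in approach 2 cancels out exactly when the coefficients are converted back, since the ridge solution is linear in y.) A minimal sketch, assuming that reading of lm.ridge's internals and reusing X, y, n, p from above, which should reproduce beta3:

Xc <- scale(X, center = TRUE, scale = FALSE)   # centre X only
yc <- y - mean(y)                              # centre y; do not scale it
s  <- sqrt(colSums(Xc^2) / n)                  # divisor-n column scales, as in lm.ridge
Xs <- sweep(Xc, 2, s, "/")                     # standardize with those scales
bs <- solve(t(Xs) %*% Xs + diag(1.5, p), t(Xs) %*% yc)
b  <- bs / s                                   # back to the original scale
rbind(intercept = mean(y) - sum(colMeans(X) * b), b)   # should match beta3

Approach 1 differs more fundamentally: the ridge penalty is not invariant to the scale of the columns, so penalizing the raw coefficients with the same lambda is a different optimization problem.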
Sabyasachi Patra wrote:
> [...]

Thanks for re-posting. If you look at the ols function in the Design package and my chapter on maximum likelihood estimation in the book Regression Modeling Strategies, you'll see an alternative approach that doesn't change anything but may be easier to visualize. The standard deviations are put into the penalty directly, so no pre-scaling is required.

Frank
--
Frank E Harrell Jr   Professor and Chair           School of Medicine
                     Department of Biostatistics   Vanderbilt University
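To make that suggestion concrete, here is one minimal sketch of the idea (it is not the Design::ols code itself): fold the column scales into the penalty matrix, with an unpenalized intercept and a penalty of lambda * s_j^2 on slope j. This is algebraically the same as standardizing X first and penalizing each standardized coefficient by lambda, so no pre-scaling is required. Reusing X, y, n from the original post, and the divisor-n scales that lm.ridge uses:

s  <- sqrt(colSums(scale(X, scale = FALSE)^2) / n)   # divisor-n column scales
Xn <- cbind(1, X)                                    # design matrix with intercept
K  <- 1.5 * diag(c(0, s^2))                          # intercept unpenalized; slope j gets lambda * s_j^2
beta <- solve(t(Xn) %*% Xn + K, t(Xn) %*% y)
beta                                                 # agrees with lm.ridge(y ~ X, lambda = 1.5)

Because s_j^2 * b_j^2 = (s_j * b_j)^2, penalizing b_j with weight lambda * s_j^2 penalizes exactly the standardized coefficient, which is why the two formulations give identical fits on the original scale.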