Dear all, I have a question that hopefully is an R question and does not simply arise from my lack of understanding of the LASSO. The code below generates two different sets of relationships between y and X, one in which both variables matter (coefficients .5 each, line 14) and one in which one of the two will be shrunk to zero (coefficients .9 and .01, line 13). Lines 15-18 normalize by demeaning y and creating an orthonormal X matrix. I assign a "budget" of 0.5 to the coefficients (line 21). Line 31 translates this budget to the implied lambda in the Lagrangian form of the Lasso (one potential source of error, but I hope I read JRSSB equation 6 correctly here). In the scenario in which both variables matter, everything works fine: the sum of the coefficients is 0.5 as intended, and the expressions from Tibshirani, eq 6 (lines 24,25), and those from glmnet agree. If, however, line 13 is switched on, so that the second coefficient gets shrunk to 0, the sum of the coefficients no longer equals the budget of 0.5 for either expression (which still agree). Any thoughts on why I only seem to do it right in the case in which there is no shrinkage to 0? Best, Christoph rm(list=ls()) library(glmnet) library(mvtnorm) set.seed(38) N = 50 K = 2 SigCorr=.0 Sigma = matrix(c(1,SigCorr,SigCorr,1),ncol=2) X = rmvnorm(N, mean=rnorm(K), sigma=Sigma, method="chol") u = rnorm(N) y = .9*X[,1]+.01*X[,2]+u #y = .5*X[,1]+.5*X[,2]+u y = y-mean(y) X = scale(X) vX = var(X) X = sqrt(N/(N-1))*X%*%solve(chol(vX)) # generates orthonormal matrix reg = lm(y~X-1) budget = .5 # where LASSO estimates should be according to Tibshirani (JRSSB 1996), eq (6) beta_1_LASSO = max(c(0,budget/2+(reg$coefficients[1]-reg$coefficients[2])/2)) beta_2_LASSO = max(c(0,budget/2-(reg$coefficients[1]-reg$coefficients[2])/2)) beta_1_LASSO beta_2_LASSO beta_1_LASSO+beta_2_LASSO # using glmnet. notice it minimizes 1/2*RSS+penalty, unlike in JRSSB, with implications for relationship betahat and lambda lambda = (reg$coefficients[1]+reg$coefficients[2]-budget)/2 lasso.mod=glmnet(X,y,alpha=1,lambda=lambda) coef(lasso.mod) sum(coef(lasso.mod)) -- Prof. Dr. Christoph Hanck Lehrstuhl f?r ?konometrie Universit?t Duisburg-Essen +49 201 183 2263 christoph.hanck at vwl.uni-due.de<mailto:christoph.hanck at vwl.uni-due.de> www.oek.wiwi.uni-due.de<http://www.oek.wiwi.uni-due.de/> [[alternative HTML version deleted]]