On 03/21/2012 06:30 AM, Vito Muggeo (UniPa) wrote:
> It appears that glmnet(), when "selecting" the covariates entering the
> model, skips from K covariates, say, to K+2 or K+3. Thus 2 or 3
> variables are "added" at the same time and it is not possible to obtain
> a ranking of the covariates according to their importance in the model.
> On the other hand lars() "adds" the covariates one at a time.
> My question is: is it possible to obtain a similar output of lars (in
> terms of order of the variables entering the model) using glmnet()?

glmnet() is based on an iterative coordinate descent algorithm applied
to a grid of lambda values; LARS is a more elegant algorithm and
computes the exact solution path, including the lambda values at which
each variable enters the model. You can get your glmnet solutions to
have higher resolution (more "exact") by using a finer grid. In your
example:
> set.seed(123)
> x=matrix(rnorm(100*20),100,20)
> y=rnorm(100)
> fit1=glmnet(x,y)
> fit1$df
[1] 0 2 4 4 ...
The default is a grid of 100 lambda values. If we use 300 values, the
resolution is finer and we can see the variables enter one at a time:
> fit1=glmnet(x,y,nlambda=300)
> fit1$df
[1] 0 1 1 2 3 3 4 ...
However, it is impossible to know in advance how fine the grid must be
in order to ensure that only one variable enters the model between any
two consecutive lambda values.
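
To read an order of entry off a glmnet fit anyway, one option is to record, for each variable, the first lambda on the grid at which its coefficient is non-zero. The following is only a rough sketch under that assumption; the names beta, first_active, entry_order and ranking are made up for illustration, and variables that enter between the same two grid points will still be tied:

```r
library(glmnet)

set.seed(123)
x <- matrix(rnorm(100 * 20), 100, 20)
y <- rnorm(100)

fit <- glmnet(x, y, nlambda = 300)

# Coefficients without the intercept: one row per variable,
# one column per lambda value on the grid.
beta <- as.matrix(fit$beta)

# For each variable, the index of the first lambda at which its
# coefficient is non-zero (NA if it never enters along the path).
first_active <- apply(beta != 0, 1, function(z) {
  if (any(z)) which(z)[1] else NA_integer_
})

# Variables sorted by the grid point at which they first appear;
# variables that never enter are dropped.
entry_order <- order(first_active, na.last = NA)
ranking <- data.frame(
  variable = entry_order,
  step     = first_active[entry_order],
  lambda   = fit$lambda[first_active[entry_order]]
)
print(ranking)

# TRUE means at least two variables entered at the same grid point,
# so the grid is still too coarse to separate them.
print(any(duplicated(ranking$step)))
```

If the last line prints TRUE, refitting with a larger nlambda may break the ties, but as noted above there is no guarantee for any particular grid.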
--
Patrick Breheny
Assistant Professor
Department of Biostatistics
Department of Statistics
University of Kentucky