thr3ads.net - R help - [R] glmnet() vs. lars() [Mar 2012]

Vito Muggeo (UniPa)

2012-Mar-21 10:30 UTC

[R] glmnet() vs. lars()

Dear all,

It appears that glmnet(), when "selecting" the covariates entering the
model, skips from K covariates, say, to K+2 or K+3. Thus 2 or 3
variables are "added" at the same time, and it is not possible to obtain
a ranking of the covariates according to their importance in the model.
On the other hand, lars() "adds" the covariates one at a time.
My question is: is it possible to obtain output similar to that of lars()
(in terms of the order in which the variables enter the model) using glmnet()?

many thanks,
vito


#Example (from ?glmnet)

library(glmnet)
library(lars)

set.seed(123)
x=matrix(rnorm(100*20),100,20)
y=rnorm(100)
fit1=glmnet(x,y)
fit1$df #no. of covariates in the model at each lambda

#Thus in the "first" model no covariate is included, and in the second
#one 2 covariates (V8 and V20) are included at the same time. Because
#two variables are included at the same time, I do not know which
#variable (among the selected ones) is more important.
#Everything is fine with lars:

o<-lars(x,y)
o$df #the covariates enter one at a time: V8 is "better" than V20
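
Note that o$df only counts the covariates in the model at each step. The sequence of variables itself can be read from the fitted object; the following is a minimal sketch, assuming the `actions` component of a lars fit (a list of moves, with positive indices for variables added and negative indices for variables dropped). The name `moves` is just illustrative.

```r
library(lars)

set.seed(123)
x <- matrix(rnorm(100 * 20), 100, 20)
y <- rnorm(100)

o <- lars(x, y)   # default type is "lasso"

# Flatten the list of moves into one integer vector:
# a positive value k means column k of x was added at that step,
# a negative value -k means column k was dropped.
moves <- unlist(o$actions)
print(moves)

# Order in which each variable first entered the model
first_entry <- unique(moves[moves > 0])
print(first_entry)
```

Because a lasso path can drop a variable and add it again later, `first_entry` records only the first time each column appears.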


-- 
====================================
Vito M.R. Muggeo
Dip.to Sc Statist e Matem `Vianelli'
Università di Palermo
viale delle Scienze, edificio 13
90128 Palermo - ITALY
tel: 091 23895240
fax: 091 485726
http://dssm.unipa.it/vmuggeo

Patrick Breheny

2012-Mar-21 14:26 UTC

[R] glmnet() vs. lars()

On 03/21/2012 06:30 AM, Vito Muggeo (UniPa) wrote:
> It appears that glmnet(), when "selecting" the covariates entering the
> model, skips from K covariates, say, to K+2 or K+3. Thus 2 or 3
> variables are "added" at the same time and it is not possible to obtain
> a ranking of the covariates according to their importance in the model.
> On the other hand lars() "adds" the covariates one at a time.
> My question is: is it possible to obtain a similar output of lars (in
> terms of order of the variables entering the model) using glmnet()?

glmnet() is based on an iterative coordinate descent algorithm applied
to a grid of lambda values; LARS is a more elegant algorithm and
computes exact solutions.  You can get your glmnet solutions to have
higher resolution (more "exact") by using a finer grid.  In your example:

> set.seed(123)
> x=matrix(rnorm(100*20),100,20)
> y=rnorm(100)
> fit1=glmnet(x,y)
> fit1$df
  [1]  0  2  4  4 ...

The default is a grid of 100 lambda values.  If we use 300 values, the 
resolution is finer and we can see the variables enter one at a time:

 > fit1=glmnet(x,y,nlambda=300)
 > fit1$df
   [1]  0  1  1  2  3  3  4  ...

However, it is impossible to know in advance how fine the grid must be 
in order to ensure that only one variable enters the model between any 
two consecutive lambda values.
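
One way to approximate an entry order from glmnet is to record, for each variable, the first lambda on the grid at which its coefficient is nonzero. The following is only a sketch under that assumption: the names `first_idx`, `entry`, and `tied_steps` are illustrative, and the grid size of 300 is simply the value used above, not a guarantee that all ties are resolved.

```r
library(glmnet)

set.seed(123)
x <- matrix(rnorm(100 * 20), 100, 20)
y <- rnorm(100)

fit <- glmnet(x, y, nlambda = 300)

# fit$beta is a (variables x lambdas) sparse matrix;
# fit$lambda is stored in decreasing order.
nonzero <- as.matrix(fit$beta) != 0

# Grid index of the first nonzero coefficient for each variable
# (NA if the variable never enters on this grid).
first_idx <- apply(nonzero, 1, function(z) {
  if (any(z)) which(z)[1] else NA_integer_
})

entry <- data.frame(
  variable = rownames(fit$beta),
  step     = first_idx,
  lambda   = fit$lambda[first_idx]
)
entry <- entry[order(entry$step), ]
print(entry)

# Variables sharing a step entered between the same two grid points,
# so their relative order is unresolved at this resolution.
tied_steps <- unique(entry$step[duplicated(entry$step) & !is.na(entry$step)])
print(tied_steps)
```

If `tied_steps` is not empty, refitting with a larger nlambda (or a custom lambda sequence that is denser around the tied values) may separate those variables, but only by trial and error.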

-- 
Patrick Breheny
Assistant Professor
Department of Biostatistics
Department of Statistics
University of Kentucky
