Hi,
On Sun, Jun 5, 2011 at 9:12 PM, Dae-Jin Lee <lee.daejin at gmail.com> wrote:
> Dear R-users
>
> I'm trying to use the lasso in the lars package for subset regression. I have a
> large matrix of size 1000x100 and my aim is to select a subset of k of the 100
> variables.
>
> Is there any way in lars to fix the number k (i.e. to select the best 10
> variables)?
>
> library(lars)
>
> aa=lars(X,Y,type="lasso",max.steps=200)
>
> plot(aa,plottype="Cp")
> aa$RSS
> which.min(aa$RSS)
> round(aa$beta,2)
>
> aa$beta[which.min(aa$RSS),]    # find which coefficients minimize the RSS
>
> lasso.ind=which((as.vector((aa$beta[which.min(aa$RSS),])))>0)    # index of variables
>
> print(lasso.ind)    # this usually gives more than 10 variables (also depends
> on the max.steps in lars)
First off: I'd suggest using the glmnet package instead of lars.
Setting its `alpha` parameter to 1 will give you the lasso, but you
can also play w/ different values of alpha to see if an
elasticnet-type penalty would be better.
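A minimal sketch of comparing `alpha` values, assuming simulated data in place of the real `X` and `Y` (the sizes match the question, but `beta_true`, the seed, and `alpha = 0.5` are arbitrary illustration choices). Cross-validation with shared folds is one reasonable way to compare penalties, not the only one:

```r
library(glmnet)

set.seed(1)
# Simulated stand-in for the 1000x100 matrix from the question (assumption)
n <- 1000; p <- 100
X <- matrix(rnorm(n * p), n, p)
beta_true <- c(rep(2, 10), rep(0, p - 10))  # hypothetical: 10 real signals
Y <- as.vector(X %*% beta_true + rnorm(n))

fit_lasso <- glmnet(X, Y, alpha = 1)    # alpha = 1: pure lasso penalty
fit_enet  <- glmnet(X, Y, alpha = 0.5)  # 0 < alpha < 1: elastic-net mix

# Use the same fold assignment for both so the CV errors are comparable
foldid <- sample(rep(1:10, length.out = n))
cv_lasso <- cv.glmnet(X, Y, alpha = 1,   foldid = foldid)
cv_enet  <- cv.glmnet(X, Y, alpha = 0.5, foldid = foldid)

# Smallest mean cross-validated error reached along each path
cv_err <- c(lasso = min(cv_lasso$cvm), enet = min(cv_enet$cvm))
print(cv_err)
```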
Now that you are using glmnet, check its `dfmax` and `pmax` arguments.
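As a rough sketch of how `dfmax` could be used to cap the model at k variables: the data below are simulated stand-ins for the real `X` and `Y`, and `k = 10` follows the question. Since the path may not land on exactly k variables, the sketch picks the least-penalized fit whose `df` is at most k rather than relying on where `dfmax` stops the path:

```r
library(glmnet)

set.seed(1)
# Simulated stand-in for the data in the question (assumption)
n <- 1000; p <- 100; k <- 10
X <- matrix(rnorm(n * p), n, p)
Y <- as.vector(X %*% c(rep(2, k), rep(0, p - k)) + rnorm(n))

# dfmax limits how many variables the model may contain along the path
fit <- glmnet(X, Y, alpha = 1, dfmax = k)

# fit$df holds the number of nonzero coefficients at each lambda
idx <- max(which(fit$df <= k))   # smallest lambda with at most k variables

b <- fit$beta[, idx]             # coefficients at that lambda (no intercept)
lasso.ind <- which(b != 0)       # != 0 also keeps negative coefficients
print(unname(lasso.ind))
```

Note that the test here is `!= 0` rather than `> 0`, so variables with negative coefficients are counted as selected too.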
HTH,
-steve
--
Steve Lianoglou
Graduate Student: Computational Systems Biology
 | Memorial Sloan-Kettering Cancer Center
 | Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact