thr3ads.net - R help - [R] Behaviour of dfmax in glmnet [Feb 2019]


Abhishek Ghose

2019-Feb-27 22:56 UTC

[R] Behaviour of dfmax in glmnet

Hi,

I am new to glmnet, so I do not yet understand fully what the various
parameters do. I am trying to build a multinomial classifier which restricts
the number of features used in the model. From reading the docs and some
answers on this forum, I understand dfmax is the way to do it. I played
around with it a bit; I have a couple of questions and would appreciate some
help:

Setup

For a particular dataset, I want to restrict the number of features to 3;
the original data has 126 features. Here's what I run:

library(glmnet)
library(broom)  # tidy() is assumed to come from the broom package

fit <- glmnet(data.matrix(X), data.matrix(y), family='multinomial', dfmax=3)
d <- data.frame(tidy(fit))

This is the value of d (inserting a screenshot since the table columns get
disturbed by the formatting):



My questions about the output:


[1] I see multiple values of lambda in there; it looks like glmnet tries
to fit lambdas that get the number of terms close to dfmax=3. So it's less
like the LARS algorithm (in the sense that we don't move stagewise by adding
variables) and more about finding the lambdas for regularization that lead
to the intended dfmax. Is this right?

[2] I'm guessing alpha plays a role in how close we can get to dfmax. At
alpha=1, where we're doing lasso, it's easier to get close to dfmax than at
alpha=0, where we're doing ridge. Is this understanding correct?

[3] A "neighborhood" of dfmax is the best we can do, it would seem. Or am I
missing a parameter that gets me to the model with exactly dfmax terms?
(FYI: alpha=1 doesn't seem to get me to the precise number of nonzero terms
either, at least on this dataset.)

[4] What does pmax do?
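
One way to look into questions [1] and [3] is to tabulate the number of
selected features at each lambda on the fitted path, and then pick the least
regularized model that still uses at most 3 features. The following is only a
minimal sketch: the data is synthetic (an assumption standing in for the real
X and y, which are not shown here), and the names path, ok and best are made
up for illustration.

```r
library(glmnet)

# Synthetic stand-in for the real data (assumption): 200 rows, 126 features,
# 3 classes that depend on the first three features plus noise.
set.seed(1)
X <- matrix(rnorm(200 * 126), nrow = 200, ncol = 126)
y <- factor(max.col(X[, 1:3] + matrix(rnorm(200 * 3), nrow = 200, ncol = 3)))

fit <- glmnet(X, y, family = "multinomial", dfmax = 3)

# One row per lambda on the fitted path. For multinomial fits, fit$df counts
# the variables with a nonzero coefficient in at least one class.
path <- data.frame(lambda = fit$lambda, df = fit$df)
print(path)

# Among the lambdas whose model uses at most 3 features, take the smallest
# lambda, i.e. the least regularized model within the limit.
ok <- which(fit$df <= 3)
best <- ok[which.min(fit$lambda[ok])]
print(path[best, ])
```

The filter on fit$df is there because the returned path is not guaranteed to
contain only models with at most dfmax features, so the selection is done
explicitly after fitting rather than relying on dfmax alone.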

-------------- next part --------------
A non-text attachment was scrubbed...
Name: dfmax.PNG
Type: image/png
Size: 54147 bytes
Desc: not available
URL: <https://stat.ethz.ch/pipermail/r-help/attachments/20190227/92b3a1a1/attachment.png>
