Robert V (Bob) Sasseen
2009-Oct-14 17:57 UTC
[R] different L2 regularization behavior between lrm, glmnet, and penalized?
The following R code using different packages gives the same results for a simple logistic regression without regularization, but different results with regularization. This may just be a matter of different scaling of the regularization parameters, but if anyone familiar with these packages has insight into why the results differ, I'd appreciate hearing about it. I'm new to R. Thanks. (Version info below. Same results on Windows and Solaris 8, except that I haven't gotten glmnet to compile on the latter.)
Robert V (Bob) Sasseen
sasseen at ai.sri.com
> # Some x values (predictive var).
> x = c(-2, -2, -2, -2, -1, -1, -1, 2, 2, 2, 3, 3, 3, 3)
> # Some z values--the observed outcome.
> # Effect is that for
> # x = -2, p = 1/4;
> # x = -1, p = 1/3;
> # x = 2, p = 2/3;
> # x = 3, p = 3/4.
> z = c(0, 0, 0, 1, 0, 0, 1, 0, 1, 1, 0, 1, 1, 1)
> library(Design)
> g = lrm(z ~ x)
> g$coefficients
Intercept x
-0.2224842 0.4449685
> g = lrm(z ~ x, penalty = 1)
> g$coefficients
Intercept x
-0.1620727 0.3241454
> library(glmnet)
> g = glmnet(cbind(x), cbind(1-z, z), family = "binomial", lambda = 0, standardize = FALSE)
> coef(g)
            1
  -0.2224843
x  0.4449687
> g = glmnet(cbind(x), cbind(1-z, z), family = "binomial", lambda = 1, alpha = 0, standardize = FALSE)
> coef(g)
            1
  -0.1098361
x  0.2196721
> library(penalized)
> fit = penalized(z ~ x)
> coefficients(fit, "all")
(Intercept) x
-0.2224843 0.4449687
> fit = penalized(z ~ x, lambda2 = 1)
> coefficients(fit, "all")
(Intercept) x
-0.2060658 0.4121315
> sessionInfo()
R version 2.9.2 (2009-08-24)
i386-pc-mingw32
locale:
LC_COLLATE=English_United States.1252;LC_CTYPE=English_United States.1252;LC_MONETARY=English_United States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252
attached base packages:
[1] splines stats graphics grDevices utils datasets methods
[8] base
other attached packages:
[1] penalized_0.9-26 glmnet_1.1-3 Matrix_0.999375-30
lattice_0.17-25
[5] Design_2.3-0 Hmisc_3.7-0 survival_2.35-4
loaded via a namespace (and not attached):
[1] cluster_1.12.0 grid_2.9.2
Robert V (Bob) Sasseen
2009-Dec-03 02:24 UTC
[R] different L2 regularization behavior between lrm, glmnet, and penalized?
The author of the penalized package, j.j.goeman at lumc.nl, kindly
replied to my message. He also responded to another question I asked him.
------------------
The differences have to do with different scaling defaults.
lrm by default standardizes the covariates to unit sd before applying
penalization. penalized by default does not do any standardization, but
if asked standardizes to unit second central moment. In your example:
x = c(-2, -2, -2, -2, -1, -1, -1, 2, 2, 2, 3, 3, 3, 3)
z = c(0, 0, 0, 1, 0, 0, 1, 0, 1, 1, 0, 1, 1, 1)
You got:
> g = lrm(z ~ x, penalty = 1)
> g$coefficients
Intercept x
-0.1620727 0.3241454
Now:
> coef(penalized(z ~ x, lambda2 = 1,standardize=T))
(Intercept) x
-0.1651565 0.3303130
is already more similar. To counter the difference between dividing by n
(penalized) or n-1 (lrm), do
> coef(penalized(z ~ x, lambda2 = 14/13,standardize=T))
(Intercept) x
-0.1620727 0.3241454
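For completeness, the 14/13 factor is just the ratio of the two variance estimates; a minimal check (added here, assuming only that R's sd() uses an n-1 denominator while the second central moment divides by n):
> n = length(x)
> sd(x)^2 / mean((x - mean(x))^2)   # variance ratio = n/(n-1) = 14/13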
The glmnet case should be similar, but I don't know the details here. It
seems glmnet uses its own peculiar way of defining the penalty, but some
choice of scaling should be able to bring glmnet in line as well.
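One hedged guess, based on glmnet's documented objective (the log-likelihood is divided by the number of observations and the ridge term is weighted by (1-alpha)/2), would be to rescale lambda by the sample size; the exact factor needed to line it up with lrm or penalized is not verified here:
> # Untested sketch: with the log-likelihood scaled by 1/n, a lambda of order 1/n
> # should correspond roughly to a unit ridge penalty on the unscaled log-likelihood.
> n = length(x)
> g = glmnet(cbind(x), cbind(1-z, z), family = "binomial", lambda = 1/n, alpha = 0, standardize = FALSE)
> coef(g)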
------------------
I have another question about the penalized package. I'm not clear on
the following behavior of penalized on the same data from my original
question:
> fit = penalized(z ~ x, lambda2 = 1)
> penalty(fit)
L1 L2
0.00000000 0.08492619
> fit = penalized(z ~ x, lambda1 = 1, lambda2 = 1)
> penalty(fit)
L1 L2
0.3417345 0.1167825
What is the penalty function showing? I'd naively expect it to show the
same lambda parameters that were given to the penalized() calls. I
haven't been able to find this in any documentation.
------------------
The penalty function returns the evaluated penalty term: lambda1*sum(abs(coef)) for L1; lambda2*sum(coef^2) for L2.
I'll make a note to document this.
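To relate this to the numbers above, one could recompute those terms from the fitted coefficients and compare with penalty(fit). A rough, unverified sketch (the intercept is unpenalized, so it is dropped; whether the reported L2 value carries an additional internal scaling factor is not checked here):
> fit = penalized(z ~ x, lambda1 = 1, lambda2 = 1)
> b = coefficients(fit, "all")
> b = b[names(b) != "(Intercept)"]      # drop the unpenalized intercept
> c(L1 = sum(abs(b)), L2 = sum(b^2))    # lambda1 = lambda2 = 1 here
> penalty(fit)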