In glmnet_1.5 a poor default was set for the argument type, which caused
the program to be very slow, or even to crash, when nvar (p) is very large.
The argument type (now called type.gaussian) has two options,
"covariance" or "naive", and is used for the default
family="gaussian" model (squared error loss).
When type.gaussian="covariance", all inner products between variables
in the active set and all other variables are cached, which can give a
considerable speedup when nobs is large.
However, when nvar is large (>500) the matrix to be stored gets large, and
this strategy becomes counterproductive.
In addition, when nvar is very large, glmnet tries to allocate storage
for this matrix that can exceed the machine's memory.
When type.gaussian="naive", nothing is cached, and inner products
(loop over nobs) are computed whenever needed.
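
To make the difference concrete, here is a minimal Python sketch (it is not
glmnet's own code) of the two ways to obtain the inner product between
variable j and the current residual, which is the quantity a
coordinate-descent update needs. The names dot, naive_gradient and
CovarianceCache are made up for this illustration, and the data are assumed
to be stored as plain lists, one per variable.

```python
def dot(u, v):
    """Inner product of two equal-length sequences (a loop over nobs)."""
    return sum(a * b for a, b in zip(u, v))


def naive_gradient(cols, y, beta, j):
    """'naive' style: rebuild the residual and loop over nobs each time."""
    nobs = len(y)
    resid = [
        y[i] - sum(cols[k][i] * beta[k] for k in range(len(cols)))
        for i in range(nobs)
    ]
    return dot(cols[j], resid)


class CovarianceCache:
    """'covariance' style: cache <x_j, x_k> for every active variable k.

    Uses <x_j, r> = <x_j, y> - sum over active k of <x_j, x_k> * beta_k,
    so no loop over nobs is needed once the inner products are stored.
    """

    def __init__(self, cols, y):
        self.cols = cols
        self.xty = [dot(c, y) for c in cols]  # <x_j, y>, computed once
        self.xtx = {}  # active k -> [<x_j, x_k> for all j]

    def gradient(self, beta, j):
        total = self.xty[j]
        for k, b in enumerate(beta):
            if b == 0.0:
                continue  # inactive variables contribute nothing
            if k not in self.xtx:
                # First time k is active: store one row of nvar values.
                self.xtx[k] = [dot(c, self.cols[k]) for c in self.cols]
            total -= self.xtx[k][j] * b
        return total


# Tiny made-up example: 3 variables, 4 observations.
cols = [[1.0, 2.0, 3.0, 4.0], [0.5, -1.0, 0.0, 2.0], [2.0, 0.0, -1.0, 1.0]]
y = [1.0, 0.0, 2.0, -1.0]
beta = [0.5, 0.0, -0.25]
cache = CovarianceCache(cols, y)
```

Both routes return the same value for every j. The cached route stores one
row of nvar inner products per active variable, which is the storage that
grows with nvar.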
In this minor upgrade, the default is "covariance" if nvar<500,
else it is "naive". We established this rule after conducting
extensive simulations.
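
The rule can be written down in a few lines. The Python sketch below is
only an illustration: the function names are hypothetical, and the storage
estimate assumes, as a rough upper bound, a full nvar by nvar matrix of
8-byte values, which is an assumption and not a statement about how glmnet
allocates memory.

```python
def default_type_gaussian(nvar, threshold=500):
    """Default described in this announcement: cache only when nvar < 500."""
    return "covariance" if nvar < threshold else "naive"


def full_cache_bytes(nvar, bytes_per_value=8):
    """Assumed upper bound: a full nvar x nvar matrix of 8-byte values."""
    return nvar * nvar * bytes_per_value


for nvar in (100, 500, 100000):
    print(nvar, default_type_gaussian(nvar), full_cache_bytes(nvar))
```

Under that assumption, 100000 variables would need about 80 GB for the
cache alone, which shows why "covariance" is a poor choice at that scale.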
In addition, the argument was renamed so as not to collide with the
argument type to cv.glmnet, which is now renamed to type.measure.
In both cases, abbreviations work.
-------------------------------------------------------------------
Trevor Hastie hastie@stanford.edu
Professor, Department of Statistics, Stanford University
Phone: (650) 725-2231 (Statistics) Fax: (650) 725-8977
(650) 498-5233 (Biostatistics) Fax: (650) 725-6951
URL: http://www-stat.stanford.edu/~hastie
address: room 104, Department of Statistics, Sequoia Hall
390 Serra Mall, Stanford University, CA 94305-4065
--------------------------------------------------------------------