Hi,

I am trying to understand the alternative methods that are available for selecting variables in a regression without simply imposing my own bias (having "good judgement"). The methods implemented in leaps and step and stepAIC seem to fall into the general class of stepwise procedures. But these are commonly condemned for inducing overfitting.

In Hastie, Tibshirani and Friedman, "The Elements of Statistical Learning", chapter 3, they describe a number of procedures that seem better. The use of cross-validation in the training stage presumably helps guard against overfitting. They seem particularly favorable to shrinkage through ridge regression, and to the "lasso". This may not be too surprising, given the authorship. Is the lasso "generally accepted" as being a pretty good approach? Has it proved its worth on a variety of problems? Or is it at the "interesting idea" stage? What, if anything, would be widely accepted as being sensible -- apart from having "good judgement"?

In econometrics there is a school (the "LSE methodology") which argues for what amounts to stepwise regressions combined with repeated tests of the properties of the error terms. (It is actually a bit more complex than that.) This has been coded in the program PcGets:
http://www.pcgive.com/pcgets/index.html?content=/pcgets/main.html
If anyone knows how this compares in terms of effectiveness to the methods discussed in Hastie et al., I would really be very interested.

Cheers,
Murray

Murray Z. Frank
B.I. Ghert Family Foundation Professor
Strategy & Business Economics
Faculty of Commerce
University of British Columbia
Vancouver, B.C.
Canada V6T 1Z2

phone: 604-822-8480
fax: 604-822-8477
e-mail: Murray.Frank at commerce.ubc.ca

-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe" (in the "body", not the subject!)
To: r-help-request at stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
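[As a minimal illustration of the stepwise procedures named above, stepAIC() from the MASS package can be run on a standard dataset. The Boston housing data used here is just an example, not something from the thread.]

```r
## Backward AIC-based stepwise selection with MASS::stepAIC().
## Boston is an example dataset shipped with MASS, chosen only
## for illustration.
library(MASS)

full <- lm(medv ~ ., data = Boston)    # start from all 13 predictors
sel  <- stepAIC(full, trace = FALSE)   # drop terms while AIC improves

## Which variables survived the search?
names(coef(sel))
```

The selected model can never have a worse AIC than the starting model, but, as the replies below note, that says nothing about whether it is the "true" or smallest adequate model.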
On Thu, 28 Feb 2002, Frank, Murray wrote:

> Hi,
>
> I am trying to understand the alternative methods that are available
> for selecting variables in a regression without simply imposing my own
> bias (having "good judgement"). The methods implemented in leaps and
> step and stepAIC seem to fall into the general class of stepwise
> procedures. But these are commonly condemned for inducing overfitting.

There are big differences between regression with only continuous variates, and regression involving hierarchies of factors. step/stepAIC include the latter, the rest do not.

A second difference is the purpose of selecting a model. AIC is intended to select a model which is large enough to include the `true' model, and hence to give good predictions. There over-fitting is not a real problem. (There are variations on AIC which do not assume some model considered is true.) This is a different aim from trying to find the `true' model or trying to find the smallest adequate model, both aims for explanation not prediction. AIC is often criticised (`condemned') for not being good at what it does not intend to do. [Sometimes R is, too.]

Shrinkage methods have their advocates for good predictions (including me), but they are a different class of statistical methods, that is *not* regression. They too have issues of selection, usually how much to shrink and often how to calibrate equal shrinkage across predictors. In ridge regression choosing the ridge coefficient is not easy, and depends on the scaling of the variables. In the neural networks field, shrinkage is widely used.

> In Hastie, Tibshirani and Friedman, "The Elements of Statistical
> Learning", chapter 3, they describe a number of procedures that seem
> better. The use of

I think that is a quite selective account.

> cross-validation in the training stage presumably helps guard against
> overfitting. They seem particularly favorable to shrinkage through
> ridge regression, and to the "lasso". This may not be too surprising,
> given the authorship. Is the lasso "generally accepted" as being a
> pretty good approach? Has it proved its worth on a variety of
> problems? Or is it at the "interesting idea" stage? What, if anything,
> would be widely accepted as being sensible -- apart from having "good
> judgement"?

Depends on the aim. If you look at the account in Venables & Ripley you will see many caveats about any automated method: all statistical problems (outside textbooks) come with a context which should be used in selecting variables if the aim is explanation, and perhaps also if it is prediction. You should use what you know about the variables and the possible mechanisms, especially to select derived variables. But generally model averaging (which you have not mentioned, and is for regression a form of shrinkage) seems to have most support for prediction.

> In econometrics there is a school (the "LSE methodology") which argues
> for what amounts to stepwise regressions combined with repeated tests
> of the properties of the error terms. (It is actually a bit more
> complex than that.) This has been coded in the program PcGets:
> http://www.pcgive.com/pcgets/index.html?content=/pcgets/main.html

Lots of hyperbolic claims, no references. But I suspect this is `ex-LSE' methodology, associated with Hendry's group (as PcGive and Ox are), and there is a link to Hendry (who is in Oxford).

> If anyone knows how this compares in terms of effectiveness to the
> methods discussed in Hastie et al., I would really be very interested.

It has a different aim, I believe. Certainly `effectiveness' has to be assessed relative to a clear aim, and simulation studies with true models don't seem to me to have the right aim. Statisticians of the Box/Cox/Tukey generation would say that effectiveness in deriving scientific insights was the real test (and I recall hearing that from those I named).
Chapter 2 of my `Pattern Recognition and Neural Networks' takes a much wider view of the methods available for model selection, and their philosophies. Specifically for regression, you might take a look at Frank Harrell's book.

--
Brian D. Ripley, ripley at stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272860 (secr)
Oxford OX1 3TG, UK Fax: +44 1865 272595
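[The point above, that the ridge coefficient is not easy to choose and depends on the scaling of the variables, can be seen with lm.ridge() from MASS, which standardises the predictors internally. The dataset and lambda grid here are arbitrary, chosen only for illustration.]

```r
## Ridge regression via MASS::lm.ridge() over a grid of lambda values.
library(MASS)

rr <- lm.ridge(medv ~ ., data = Boston, lambda = seq(0, 20, by = 0.5))

## select() reports lambda as chosen by the HKB and L-W estimators and
## by generalised cross-validation -- three criteria, three answers.
select(rr)

## Coefficients shrink toward zero as lambda grows.
matplot(rr$lambda, t(rr$coef), type = "l",
        xlab = "lambda", ylab = "standardised coefficients")
```

That the different criteria typically disagree on lambda illustrates why the choice is considered part of the selection problem rather than a solved step.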
Thanks for the most informative and helpful feedback. Professor Ripley wrote (most of his message has been edited out):

> There are big differences between regression with only continuous
> variates, and regression involving hierarchies of factors. step/stepAIC
> include the latter, the rest do not.

In much of Venables and Ripley, bootstrapping keeps popping up. Is there a reason not to run step/stepAIC repeatedly on bootstrapped samples from the original data? On the face of it, bootstrapping seems intuitively appealing in this context. (Would some form of cross-validation on subsamples be better?)

> But generally model averaging (which you have not mentioned, and is for
> regression a form of shrinkage) seems to have most support for
> prediction.

What do you mean by model averaging? It does not seem to match the discussion of model selection that I found in Venables and Ripley (i.e. pages 186-188).

> Lots of hyperbolic claims, no references. But I suspect this is
> `ex-LSE' methodology, associated with Hendry's group (as PcGive and Ox
> are), and there is a link to Hendry (who is in Oxford).

Quite right. It is the Hendry group. As far as I can figure out, the main specific references are to:

Hoover, K. D., and Perez, S. J. (1999). Data mining reconsidered: encompassing and the general-to-specific approach to specification search. Econometrics Journal, 2, 167-191.

Hoover, K. D., and Perez, S. J. (2001). Truth and robustness in cross-country growth regressions. Unpublished paper, Economics Department, University of California, Davis.

> It has a different aim, I believe. Certainly `effectiveness' has to be
> assessed relative to a clear aim, and simulation studies with true
> models don't seem to me to have the right aim.

As suggested, the Hoover and Perez papers are basically simulation studies where finding a true model was the aim.
The working paper on growth regressions tries to go further, and seems to have reasonable-sounding economic conclusions.

> Statisticians of the Box/Cox/Tukey generation would say that
> effectiveness in deriving scientific insights was the real test (and I
> recall hearing that from those I named).

It is hard to argue with that claim. But it is equally hard to see it as complete. How do we define "scientific insight"? Or is it one of those cases of: "I don't know how to define it, but I know it when I see it"?

Murray Z. Frank
B.I. Ghert Family Foundation Professor
Strategy & Business Economics
Faculty of Commerce
University of British Columbia
Vancouver, B.C.
Canada V6T 1Z2

phone: 604-822-8480
fax: 604-822-8477
e-mail: Murray.Frank at commerce.ubc.ca
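[The bootstrap question raised above -- running step/stepAIC repeatedly on resampled data -- can be sketched as follows. This tabulates how often each variable is selected across resamples, which measures selection stability; it does not by itself cure the biases of stepwise selection. The dataset and number of resamples are arbitrary choices for illustration.]

```r
## Rerun stepAIC() on bootstrap resamples and count how often each
## predictor survives the search.
library(MASS)
set.seed(42)

B <- 50  # number of bootstrap resamples (arbitrary)
picked <- unlist(lapply(seq_len(B), function(b) {
    boot <- Boston[sample(nrow(Boston), replace = TRUE), ]
    fit  <- stepAIC(lm(medv ~ ., data = boot), trace = FALSE)
    attr(terms(fit), "term.labels")   # names of the selected variables
}))

## Selection frequency of each predictor across the B resamples
sort(table(picked) / B, decreasing = TRUE)
```

Predictors selected in nearly every resample are comparatively stable choices; those selected in roughly half the resamples are the ones where the stepwise answer depends heavily on the particular sample drawn.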