This news item in a data mining newsletter makes various claims for a technique called "Reduced Error Logistic Regression": http://www.kdnuggets.com/news/2007/n08/12i.html

In brief, are these (ambitious) claims justified, and if so, has this technique been implemented in R (or does anyone have any plans to do so)?

Tim C
I don't know about the claims, but I do know about this:

> Recent News: January 31, 2007. St. Louis, MO - Rice Analytics applied
> for a U.S. patent this week on a generalized form of Reduced Error
> Logistic Regression. This generalized form allows repeated measures,
> multilevel, and survival designs that include individual level
> estimates. None of these capabilities were possible with the
> previously disclosed formulation, which also had limited application
> because it could only be applied to models where all variables had no
> missing observations.

This is a very bad trend in science and statistics, IMHO.

-Roy M.

On Apr 25, 2007, at 7:29 PM, Tim Churches wrote:

> This news item in a data mining newsletter makes various claims for
> a technique called "Reduced Error Logistic Regression":
> http://www.kdnuggets.com/news/2007/n08/12i.html
>
> In brief, are these (ambitious) claims justified and if so, has
> this technique been implemented in R (or does anyone have any plans
> to do so)?
>
> Tim C
>
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

**********************
"The contents of this message do not reflect any position of the U.S. Government or NOAA."
**********************
Roy Mendelssohn
Supervisory Operations Research Analyst
NOAA/NMFS Environmental Research Division
Southwest Fisheries Science Center
1352 Lighthouse Avenue
Pacific Grove, CA 93950-2097
e-mail: Roy.Mendelssohn at noaa.gov (Note new e-mail address)
voice: (831)-648-9029
fax: (831)-648-8440
www: http://www.pfeg.noaa.gov/

"Old age and treachery will overcome youth and skill."
From what I've read (which isn't much), the idea is to estimate a utility (preference) function for discrete categories, using logistic regression, under the assumption that the residuals of the linear predictor of the utilities are ~ Type I Gumbel. This implies "independence of irrelevant alternatives" in economic jargon, i.e. the utility of choice a versus choice b is independent of the introduction of a third choice c. It also implies homoscedasticity of the errors. The model can be generalized in various ways. If you are willing to introduce extra parameters into the model, such as the parameters of the Gumbel distribution, you may get more precision in the estimates of the utility function. An alternative (without the independence of irrelevant alternatives assumption) is to model the errors as multivariate normal (i.e. use probit regression), which is computationally much more difficult.

Whether it makes substantive sense to use these models outside of "discrete choice" experiments is another question.

Patenting these methods is worrying. There have been a lot of people working on discrete choice experiments over the years. It's hard to believe that a single company could have ownership over an idea that is the result of a collaborative effort such as this.

Cheers,

Simon.

On Thu, 2007-04-26 at 12:29 +1000, Tim Churches wrote:

> This news item in a data mining newsletter makes various claims for a
> technique called "Reduced Error Logistic Regression":
> http://www.kdnuggets.com/news/2007/n08/12i.html
>
> In brief, are these (ambitious) claims justified and if so, has this
> technique been implemented in R (or does anyone have any plans to do
> so)?
> Tim C

--
Simon Blomberg, BSc (Hons), PhD, MAppStat.
Lecturer and Consultant Statistician
Faculty of Biological and Chemical Sciences
The University of Queensland
St. Lucia Queensland 4072
Australia

Room 320, Goddard Building (8)
T: +61 7 3365 2506
email: S.Blomberg1_at_uq.edu.au

The combination of some data and an aching desire for
an answer does not ensure that a reasonable answer can
be extracted from a given body of data. - John Tukey.
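[Editorial aside: Simon's point about Type I Gumbel errors can be illustrated in a few lines of R. The difference of two independent Gumbel utility errors follows a logistic distribution, which is exactly why "choose the alternative with the higher latent utility" reproduces ordinary logistic regression. A minimal simulation sketch, using base R only; the variable names and the true coefficient value are illustrative, not taken from the patent claims:]

```r
set.seed(42)
n    <- 10000
x    <- rnorm(n)   # observed attribute favouring alternative A
beta <- 1.5        # true utility coefficient

# Draw iid Type I Gumbel errors by inverting the CDF F(z) = exp(-exp(-z))
rgumbel <- function(n) -log(-log(runif(n)))

# Latent utilities of the two alternatives
u_a <- beta * x + rgumbel(n)
u_b <- 0        + rgumbel(n)

# Each subject picks the alternative with the higher utility
y <- as.integer(u_a > u_b)

# The Gumbel difference is logistic, so a plain logit recovers beta
fit <- glm(y ~ x, family = binomial(link = "logit"))
coef(fit)   # slope estimate should be close to 1.5
```

Fitting a probit (`link = "probit"`) to the same data would misstate the error distribution, which is the substance of the logit-versus-probit choice Simon describes.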
paulandpen at optusnet.com.au
2007-Apr-26 06:35 UTC
[R] Reduced Error Logistic Regression, and R?
Further to Simon's points, here is what is confusing to me; I highlight the relevant section of the claims below:

"The key assumption concerns 'symmetrical error constraints'. These 'symmetrical error constraints' force a solution where the probabilities of positive and negative error are symmetrical across all cross product sums that are the basis of maximum likelihood logistic regression. As the number of independent variables increases, it becomes more and more likely that this symmetrical assumption is accurate. Because this error component can be reliably estimated and subtracted out with a large enough number of variables, the resulting model parameters are strikingly error-free and do not overfit the data."

Maybe this is a bit old school of me, but isn't the point of model development to generate the most parsimonious model, with the greatest explanatory power from the fewest variables? I can just imagine going to a client, standing in a 'bored' (grin) room for a presentation, and saying: "Hey client, here are the 200 variables that are driving choice behaviour."

I use latent class and Bayes-based approaches because they recover heterogeneity in utility allocation across the sample; that, to me, is the big battle in choice-based analytics. I believe that after a certain point, a heap of predictors becomes meaningless. I can see some of my colleagues adopting this because it is in SAS and makes up for poor design.

Anyway, from a technical point of view, I would have to read a little more about the error they are referring to. Good on them for developing a new technology. Like any algorithm, it will have its strengths and weaknesses and, depending on factors such as usability, will gain some level of acceptance.
Paul

> Simon Blomberg <s.blomberg1@uq.edu.au> wrote:
>
> From what I've read (which isn't much), the idea is to estimate a
> utility (preference) function for discrete categories, using logistic
> regression, under the assumption that the residuals of the linear
> predictor of the utilities are ~ Type I Gumbel. [...]
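[Editorial aside: Paul's worry about piling on predictors is easy to demonstrate in R. Adding pure-noise variables to a logistic regression always improves in-sample fit (deviance can only fall in nested models), yet a penalised criterion such as AIC correctly prefers the sparse model. A small sketch; the sample sizes and variable names are illustrative:]

```r
set.seed(1)
n <- 300
p <- 30
x <- matrix(rnorm(n * p), n, p)      # 30 predictors, only the first matters
y <- rbinom(n, 1, plogis(x[, 1]))    # outcome driven by X1 alone

dat   <- data.frame(y = y, x)        # columns auto-named X1, X2, ..., X30
small <- glm(y ~ X1, data = dat, family = binomial)
big   <- glm(y ~ .,  data = dat, family = binomial)

# In-sample deviance always improves as junk predictors are added...
c(small = deviance(small), big = deviance(big))

# ...but AIC, which penalises the 29 extra parameters, prefers parsimony
c(small = AIC(small), big = AIC(big))
```

This is the "most parsimonious model" point in miniature: the apparent gain from the noise predictors is exactly the kind of error that a claim of parameters which "do not overfit the data" would have to explain away.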