Christoph Lehman had problems with separated data in two-class logistic
regression.

One useful little trick is to penalize the logistic regression using a
quadratic penalty on the coefficients. I am sure there are functions in the R
contributed libraries to do this; otherwise it is easy to achieve via IRLS
using ridge regressions. Then even though the data are separated, the
penalized log-likelihood has a unique maximum. One intriguing feature is that
as the penalty parameter goes to zero, the solution converges to the SVM
solution, i.e. the optimal separating hyperplane; see
http://www-stat.stanford.edu/~hastie/Papers/margmax1.ps

--------------------------------------------------------------------
  Trevor Hastie                              hastie@stanford.edu
  Professor, Department of Statistics, Stanford University
  Phone: (650) 725-2231 (Statistics)         Fax: (650) 725-8977
         (650) 498-5233 (Biostatistics)      Fax: (650) 725-6951
  URL:   http://www-stat.stanford.edu/~hastie
  address: room 104, Department of Statistics, Sequoia Hall
           390 Serra Mall, Stanford University, CA 94305-4065
--------------------------------------------------------------------
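[A minimal sketch of the IRLS-with-ridge idea described in the post above.
The function ridge_logit, its argument names and defaults are illustrative
only and not from the original post; each IRLS step is a ridge regression on
the working response, with the intercept left unpenalized.]

ridge_logit <- function(X, y, lambda = 1e-2, maxit = 50, tol = 1e-8) {
  ## X: numeric matrix of covariates (n x k); y: 0/1 response vector
  X <- cbind(1, as.matrix(X))             # prepend an intercept column
  p <- ncol(X)
  beta <- rep(0, p)
  P <- diag(c(0, rep(lambda, p - 1)))     # quadratic penalty; intercept unpenalized
  for (it in seq_len(maxit)) {
    eta <- drop(X %*% beta)
    mu  <- plogis(eta)                    # fitted probabilities
    w   <- mu * (1 - mu)                  # IRLS weights
    z   <- eta + (y - mu) / w             # working response
    ## ridge-regression step: solve (X'WX + P) beta = X'Wz
    beta_new <- solve(crossprod(X, w * X) + P, crossprod(X, w * z))
    if (max(abs(beta_new - beta)) < tol) { beta <- beta_new; break }
    beta <- beta_new
  }
  drop(beta)
}

## e.g.  b <- ridge_logit(X, y, lambda = 1e-3)
## With separated data the penalty term keeps each working ridge regression
## well conditioned even as the unpenalized fit would diverge.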
On Sat, 13 Sep 2003, Trevor Hastie wrote:

> Christoph Lehman had problems with separated data in two-class logistic
> regression.
>
> One useful little trick is to penalize the logistic regression using a
> quadratic penalty on the coefficients. I am sure there are functions in
> the R contributed libraries to do this;

Using nnet/multinom with weight decay does exactly this.

> otherwise it is easy to achieve via IRLS using ridge regressions. Then
> even though the data are separated, the penalized log-likelihood has a
> unique maximum. One intriguing feature is that as the penalty parameter
> goes to zero, the solution converges to the SVM solution - i.e. the
> optimal separating hyperplane; see
> http://www-stat.stanford.edu/~hastie/Papers/margmax1.ps

--
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
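[A minimal illustration of the nnet/multinom suggestion above; the data frame
d and the variables y, x1, x2 are assumed for the example and are not from
the thread.]

library(nnet)
## `decay` is passed through multinom() to nnet() and imposes the quadratic
## (weight-decay) penalty on the coefficients
fit <- multinom(y ~ x1 + x2, data = d, decay = 1e-2)
summary(fit)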
On Sun, 14 Sep 2003 08:17:20 +0100 (BST) Prof Brian Ripley
<ripley at stats.ox.ac.uk> wrote:

> On Sat, 13 Sep 2003, Trevor Hastie wrote:
>
> > Christoph Lehman had problems with separated data in two-class logistic
> > regression.
> >
> > One useful little trick is to penalize the logistic regression using a
> > quadratic penalty on the coefficients. I am sure there are functions in
> > the R contributed libraries to do this;
>
> Using nnet/multinom with weight decay does exactly this.

Also the lrm function in the Design package will do quadratic penalization.

Frank Harrell

> > otherwise it is easy to achieve via IRLS using ridge regressions. Then
> > even though the data are separated, the penalized log-likelihood has a
> > unique maximum. One intriguing feature is that as the penalty parameter
> > goes to zero, the solution converges to the SVM solution - i.e. the
> > optimal separating hyperplane; see
> > http://www-stat.stanford.edu/~hastie/Papers/margmax1.ps

---
Frank E Harrell Jr   Professor and Chair          School of Medicine
                     Department of Biostatistics  Vanderbilt University
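[A minimal illustration of the penalized lrm fit mentioned above; the data
frame d, the variables, and the penalty value 1 are assumed for the example.]

library(Design)
f <- lrm(y ~ x1 + x2, data = d, penalty = 1)   # quadratic (ridge) penalty
f

## The package's pentrace() function can be used to choose the penalty value
## over a grid, e.g. pentrace(f, seq(0, 10, by = 0.5)).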
> in the paper "Avoiding the effects of concurvity in GAM's .." of
> Figueiras et al. (2003) it is mentioned that in GLM collinearity is taken
> into account in the calculation of the s.e.'s but not in GAM (-> results
> in confidence intervals that are too narrow, p-values understated; GAM,
> S-Plus version). I haven't found any references to GAM and concurvity or
> collinearity on the R page. And I wonder if the R version of GAM differs
> in this point.

The penalized regression spline representation means that it's easy to
calculate the `correct' s.e.'s, and this is what is done. The covariance
matrix used is based on a Bayesian model of smoothing, generalized from
Silverman (1985, JRSSB) (and, less closely, Wahba, 1983, JRSSB), so the
s.e.'s are generally a little larger than you'd get if you just pretended
that the GAM was an un-penalized GLM (this widening generally improves CI
performance).

As Thomas Lumley pointed out, the s.e.'s don't take into account smoothing
parameter estimation uncertainty. In simulation studies this uncertainty
seems to have very little effect on the realized coverage probabilities of
confidence intervals that are in some sense `whole model' intervals, but the
performance of CIs for component functions of the GAM can be quite a long
way from nominal. There's a simple, not-very-computer-intensive fix for this
which removes the conditioning on the smoothing parameters and greatly
improves component-wise coverage probabilities... implementation is on my
`to-do' list (might wait to see what the referees say, though!)

Simon

ps. mgcv 0.9 out now! (changes list linked to my www page)

_____________________________________________________________________
> Simon Wood   simon at stats.gla.ac.uk   www.stats.gla.ac.uk/~simon/
> Department of Statistics, University of Glasgow, Glasgow, G12 8QQ
> Direct telephone: (0)141 330 4530    Fax: (0)141 330 4814
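[A minimal illustration of where those penalization-aware s.e.'s show up in
an mgcv fit; the data frame dat and the variables are assumed for the
example. The fitted gam object stores the Bayesian posterior covariance
matrix of the coefficients in its Vp component, and predict() and plot() use
it; note these intervals still condition on the estimated smoothing
parameters, which is exactly the caveat discussed above.]

library(mgcv)
b  <- gam(y ~ s(x1) + s(x2), family = binomial, data = dat)
pr <- predict(b, se.fit = TRUE)   # s.e.'s from the Bayesian covariance b$Vp
plot(b, se = TRUE)                # component-wise +/- 2 s.e. bands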