Frank Gibbons
2003-Aug-21 22:35 UTC
[R] LDA in R: how to extract full equation, especially constant term
Hi,

Having dipped my toe into R a few times over the last year or two, in the last few weeks I've been using it more and more; I'm now a thorough convert. I've just joined the list, because although it's great, I do have this problem...

I'm using linear discriminant analysis for binary classification, and am happy with the classification performance using predict(). What I'd like to do now is extract the equation for this classifier, for use elsewhere (in Perl/Python code).

I know that I can get the means and scaling factors from the predict() object, but I'm having trouble computing the constant term. From reading Venables & Ripley and Hastie/Tibshirani/Friedman, I know the priors play a role in adjusting the "cut-point" from zero (for equally sized classes), based on the relative sizes of the two classes. But when I try to do the computation, I don't get a value that agrees with that returned by predict().

I've seen a post about this problem in the past, but it was never really answered by anyone who was familiar with R/S-PLUS. Can anyone help me with this? I guess I'm really wondering how R is computing the constant term in its discriminant function.

Thanks,

-Frank Gibbons

PhD, Computational Biologist,
Harvard Medical School BCMP/SGM-322, 250 Longwood Ave, Boston MA 02115, USA.
Tel: 617-432-3555 Fax: 617-432-3557
http://llama.med.harvard.edu/~fgibbons
Prof Brian Ripley
2003-Aug-22 06:15 UTC
[R] LDA in R: how to extract full equation, especially constant term
You have the R code: please read it.

Hint: there isn't `an equation'; rather, LDA chooses the largest of several expressions, and those expressions are in all the standard books, including V&R and, in more detail, my PRNN book. For numerical stability reasons the `constants' are adjusted to keep the largest expression finite in computer arithmetic.

--
Brian D. Ripley, ripley at stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595
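
For readers who want to port the classifier, here is a minimal sketch of the "several expressions" in their usual textbook form, written in Python because the question mentions Perl/Python. For class k the score is delta_k(x) = x' S^-1 mu_k - 0.5 mu_k' S^-1 mu_k + log(pi_k), where S is the pooled within-class covariance, mu_k the class mean and pi_k the prior; the predicted class is the one with the largest score. This is an assumption-laden illustration, not a transcription of the MASS source: the function names (fit_lda, posteriors, binary_rule) and the toy data are hypothetical, and the sketch does not reproduce lda()'s internal scaling or centring. Subtracting the largest score before exponentiating is one common way to keep the arithmetic finite.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x

def fit_lda(groups, priors=None):
    """Return per-class coefficient vectors and constants (textbook LDA)."""
    n_total = sum(len(g) for g in groups)
    p = len(groups[0][0])
    if priors is None:  # assumption: default to class proportions
        priors = [len(g) / n_total for g in groups]
    means = [[sum(r[j] for r in g) / len(g) for j in range(p)] for g in groups]
    # Pooled within-class covariance, divisor n - (number of classes).
    S = [[0.0] * p for _ in range(p)]
    for g, m in zip(groups, means):
        for r in g:
            d = [r[j] - m[j] for j in range(p)]
            for a in range(p):
                for b in range(p):
                    S[a][b] += d[a] * d[b] / (n_total - len(groups))
    coefs = [solve(S, m) for m in means]  # S^-1 mu_k
    consts = [-0.5 * dot(w, m) + math.log(pi)
              for w, m, pi in zip(coefs, means, priors)]
    return coefs, consts

def posteriors(x, coefs, consts):
    """Posterior probabilities; the largest score is subtracted for stability."""
    s = [dot(w, x) + c for w, c in zip(coefs, consts)]
    top = max(s)
    e = [math.exp(v - top) for v in s]
    return [v / sum(e) for v in e]

def binary_rule(coefs, consts):
    """Two-class case: one linear rule w.x + c, positive means class 1."""
    w = [a - b for a, b in zip(coefs[1], coefs[0])]
    return w, consts[1] - consts[0]

# Hypothetical toy data: two classes of unequal size in two dimensions.
group0 = [(1, 2), (2, 1), (2, 3), (3, 2)]
group1 = [(5, 6), (6, 5), (6, 7), (7, 6), (6, 6), (6, 6)]
coefs, consts = fit_lda([group0, group1])
w, c = binary_rule(coefs, consts)
print("w =", w, "c =", c)
print("posterior at (4, 4):", posteriors((4, 4), coefs, consts))
```

In the two-class case the difference of the two scores collapses to a single linear rule, and its constant is c = -0.5 (mu_1' S^-1 mu_1 - mu_0' S^-1 mu_0) + log(pi_1 / pi_0). With equal priors the log term vanishes and the cut-point sits at the midpoint of the two class means; unequal priors shift it by log(pi_1 / pi_0).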