thr3ads.net - R help - [R] Cross-validation for Linear Discriminant Analysis [Sep 2004]

Yu Shao

2004-Sep-15 23:28 UTC

[R] Cross-validation for Linear Discriminant Analysis

Hello:

I am new to R and statistics and I have two questions.

First I need help to interpret the cross-validation result from the R
linear discriminant analysis function "lda". I did the following:

lda (group ~ Var1 + Var2, CV=T)

where "CV=T" tells lda to do cross-validation. The output of lda includes
the posterior probabilities, among other things, but I can't find an error
term (like delta returned by cv.glm). My question is how to get such an
error term from the output? Can I just simply calculate the prediction
accuracy using the posterior probabilities from the cross-validation, and
use that to measure the quality of the model?

Another question is more basic: how to determine if a lda model is
significant? (There is no p-value.) Thanks,

Yu Shao

Wadsworth Research Center
Department of Health of New York State
Albany, NY 12208

Prof Brian Ripley

2004-Sep-16 04:50 UTC

[R] Cross-validation for Linear Discriminant Analysis

On Wed, 15 Sep 2004, Yu Shao wrote:
> I am new to R and statistics and I have two questions.
Perhaps then you need to start by explaining why you are using LDA.
Please take a good look at the posting guide.
> First I need help to interpret the cross-validation result from the R
> linear discriminant analysis function "lda". 
You mean Professor Ripley's function lda in package MASS, I guess.
> I did the following:
> 
> lda (group ~ Var1 + Var2, CV=T)
R allows you to use meaningful names, so please do so.
> where "CV=T" tells the lda to do cross-validation. The output of lda are
> the posterior probabilities among other things, but I can't find an error
> term (like delta returned by cv.glm). My question is how to get such an
> error term from the output? Can I just simply calculate the prediction
> accuracy using the posterior probabilities from the cross-validation, and
> use that to measure the quality of the model?
cv.glm as in Dr Canty's package boot?  If you are trying to predict
classifications, LDA is not the right tool, and LOO CV probably is not
either.  There is no unique definition of `error term' (true for cv.glm as
well), and people have written whole books about how to assess
classifiers.  LDA is about `discrimination' not `allocation' in the jargon
used ca 1960.
> Another question is more basic: how to determine if a lda model is
> significant? (There is no p-value.) Thanks,
Please do read the references on the ?lda page.  It's not a useful
question, as LDA is about discriminating between populations and makes the
unrealistic assumption of multivariate normality.  (Analogously for linear
regression, there are ways to test if that is (statistically)
`significant', but knowledgeable users almost never do so.)

Perhaps more realistic advice is to suggest you seek some statistical 
consultancy.

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
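
On the mechanics of the first question: with CV=TRUE, lda in package MASS
returns a list whose components include "class" (the leave-one-out
predicted class) and "posterior" (the leave-one-out posterior
probabilities), so a misclassification rate can be tabulated by hand. The
following is only a minimal sketch under stated assumptions: the built-in
iris data stands in for the poster's data, with Species playing the role
of "group" and two sepal measurements playing the roles of Var1 and Var2.
As the reply above stresses, this is one of several possible error
measures, not a definitive assessment of the classifier.

```r
library(MASS)  # provides lda()

# Assumption: iris is a stand-in for the poster's data set.
fit_cv <- lda(Species ~ Sepal.Length + Sepal.Width, data = iris, CV = TRUE)

# With CV = TRUE the result is a list, not a fitted model object:
#   $class      leave-one-out predicted class for each observation
#   $posterior  leave-one-out posterior probabilities, one column per class
confusion <- table(truth = iris$Species, predicted = fit_cv$class)
print(confusion)

# Leave-one-out misclassification rate: the proportion off the diagonal.
loo_error <- 1 - sum(diag(confusion)) / sum(confusion)
print(loo_error)

# The predicted class is the one with the largest posterior in each row,
# so (barring ties) the same labels can be read off $posterior directly.
map_class <- colnames(fit_cv$posterior)[max.col(fit_cv$posterior,
                                                ties.method = "first")]
```

The posterior matrix also allows measures other than the raw error rate,
for example thresholding on the largest posterior, which is one reason
there is no single "error term" to report.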
