> From: Martin C. Martin
>
> Hi,
>
> I have a bunch of data points x from two classes A & B, and I'm
> creating a classifier. So I have a function f(x) which estimates the
> probability that x is in class A. (I have an equal number of examples
> of each, so p(class) = 0.5.)
>
> One way of seeing how well this does is to compute the error rate on
> the test set, i.e. if f(x)>0.5 call it A, and see how many times I
> misclassify an item. That's what MASS does.
Surely you mean `99% of data miners/machine learners' rather than `MASS'?
> But we should be able to do better: misclassifying should be more of a
> problem if the regression is confident than if it isn't.
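For what it's worth, that error rate is a one-liner in R (the toy data
below is purely illustrative, just to make the snippet self-contained;
`p` and `y` stand in for your test-set estimates of P(class A) and the
true classes, not anything from your code):

p <- runif(50)                           # estimated P(class A)
y <- ifelse(runif(50) < p, "A", "B")     # true classes
mean(ifelse(p > 0.5, "A", "B") != y)     # error rate at the 0.5 cutoff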
>
> How can I show that my f(x) = P(x is in class A) does better
> than chance?
It depends on what you mean by `better'. For some problems, people are
perfectly happy with the misclassification rate. For others, the
estimated probabilities count a lot more. One possibility is to look at
the ROC curve. Another possibility is to look at the calibration curve
(see the MASS book).
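
To make that concrete, here is a rough base-R sketch of both ideas (no
extra packages needed; `p` and `y` are just illustrative names for your
test-set estimates of P(class A) and the true classes):

set.seed(1)                                 # toy data so this runs as-is
p <- runif(200)                             # estimated P(class A)
y <- ifelse(runif(200) < p, "A", "B")       # true classes

## ROC curve: sweep the threshold, record TPR vs FPR at each cutoff
thr <- sort(unique(c(0, p, 1)), decreasing = TRUE)
tpr <- sapply(thr, function(t) mean(p[y == "A"] >= t))
fpr <- sapply(thr, function(t) mean(p[y == "B"] >= t))
plot(fpr, tpr, type = "l", main = "ROC curve")
abline(0, 1, lty = 2)                       # chance line

## Crude calibration plot: bin the predictions, compare the mean
## prediction in each bin with the observed fraction of class A
bins <- cut(p, breaks = seq(0, 1, by = 0.1), include.lowest = TRUE)
plot(tapply(p, bins, mean), tapply(y == "A", bins, mean),
     xlab = "mean predicted P(A)", ylab = "observed fraction of A",
     main = "Calibration")
abline(0, 1, lty = 2)                       # perfect calibration

The ROCR package, among others, will also compute ROC curves (and the
area under them) for you.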
Andy
> Thanks,
> Martin
>