thr3ads.net - R help - [R] e1071 question: what's the definition of performance in tune.* functions? [Jul 2004]

Liaw, Andy

2004-Jul-13 00:55 UTC

[R] e1071 question: what's the definition of performance in tune.* functions?

Basically, the `Details' section of ?tune says it all:

Details:

     As performance measure, the classification error is used for
     classification, and the mean squared error for regression. ...


Andy
> From: Tae-Hoon Chung
> 
> Hi, all;
> 
> Basically, the subject line contains all the information I need to know.
> In the e1071 library, there are functions to tune parameters.
> They return several values, one of which is the performance.
> Does anybody know the "definition" of performance here?
> Is it a percentage of error, just the error rate, or something else?
> 
> Thanks in advance!
> 
> Tae-Hoon Chung, Ph.D
> 
> Post-doctoral Research Fellow
> Molecular Diagnostics and Target Validation Division
> Translational Genomics Research Institute
> 1275 W Washington St, Tempe AZ 85281 USA
> Phone: 602-343-8724
>

Tae-Hoon Chung

2004-Jul-13 01:11 UTC

Thanks Andy, but let me make my question clearer.

When you run tune.*, you get a performance value like 0.7...
If this value is a percentage, the error rate is 0.7%, which is excellent
(of course, we should check whether this is really a case of over-fitting,
but nominally this error rate is great).
However, if the value is a ratio, then 0.7 is poor, because it means
a 70% error rate.
So my question is whether the error rate is reported as a percentage or
as a plain fraction.
One puzzling thing is that tune.* also returns values like 1.2*,
and a ratio larger than 1 makes no sense, right?
Since the definition is not given explicitly anywhere, it is
hard to interpret the result properly.

Thanks in advance;
TH

Tae-Hoon Chung, Ph.D

Post-doctoral Research Fellow
Molecular Diagnostics and Target Validation Division
Translational Genomics Research Institute
1275 W Washington St, Tempe AZ 85281 USA
Phone: 602-343-8724

Liaw, Andy

2004-Jul-13 01:40 UTC

Looking at the body of tune(), it has:

...
                repeat.errors[reps] <- if (is.factor(true.y)) 
                  1 - classAgreement(table(pred, true.y))
                else crossprod(pred - true.y)/length(pred)
...

where classAgreement() is a function defined inside tune() that computes the
fraction of correctly predicted cases.  So it looks like tune() and friends
are returning error rates as fractions, not percentages.
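
As a rough illustration of the two branches in that excerpt, here is a minimal base-R sketch. The helper name perf() and the toy vectors are made up for this example and are not part of e1071; the classification branch uses a plain fraction of matches as a stand-in for classAgreement().

```r
# Hypothetical helper (not e1071 code): mimics the two branches quoted above.
perf <- function(pred, true.y) {
  if (is.factor(true.y)) {
    # classification: 1 - fraction of correctly predicted cases
    1 - mean(as.character(pred) == as.character(true.y))
  } else {
    # regression: mean squared error
    as.numeric(crossprod(pred - true.y)) / length(pred)
  }
}

# Toy classification data: 2 of 5 predictions are wrong
true.cls <- factor(c("a", "a", "b", "b", "b"))
pred.cls <- factor(c("a", "b", "b", "b", "a"))
perf(pred.cls, true.cls)   # 0.4, a fraction, not a percentage

# Toy regression data: one prediction is off by 0.5
true.num <- c(1, 2, 3, 4)
pred.num <- c(1.5, 2, 3, 4)
perf(pred.num, true.num)   # 0.0625, the mean squared error
```

Under these assumptions, a classification result of 0.7 would mean 70% of the cases were misclassified.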

You're right that the fraction shouldn't be larger than 1.  Did you make
sure that tune() sees the data as classification, not regression (i.e., did
you make sure that the class labels given to tune.*() are a factor)?
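
To see why numeric class labels could produce values above 1, here is a small base-R sketch with made-up labels and predictions (it does not call e1071). When the labels are numeric, the regression branch applies and the result is a mean squared error, which has no upper bound of 1.

```r
# Made-up data: three classes coded as the numbers 1, 2, 3
labels <- c(1, 2, 3, 1, 2, 3)
pred   <- c(3, 2, 1, 3, 2, 1)

# Labels left numeric: treated as regression, so the measure is the MSE
mse <- as.numeric(crossprod(pred - labels)) / length(pred)
mse   # about 2.67, larger than 1

# Labels converted to factors: the measure is the misclassification fraction
err <- mean(factor(pred, levels = 1:3) != factor(labels, levels = 1:3))
err   # about 0.67, always between 0 and 1
```

If this is what happened, converting the response with factor() before passing it to tune.*() should bring the reported performance back into the range 0 to 1.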

HTH,
Andy
