Hi Andy,
Thanks so much for your reply!
In the paper "Classification and regression by randomForest", on the
first page, it says that the random forest estimates the importance of a
variable by looking at how much the prediction error increases when the
variable is permuted.
In the help document of randomForest, the importance is measured as the
total decrease in node impurities. Should it be the total *increase* in node
impurities instead?
If it is the total decrease in node impurities, does that contradict the paper?
Also, in fit$importance, what is the meaning of the first two columns?
> fit$importance
                         0           1 MeanDecreaseAccuracy MeanDecreaseGini
CT            0.0022352025 0.003829344         0.0030311246         5.184427
DP            0.0069461974 0.016387520         0.0116650960        15.440624
DY            0.0141150255 0.026031690         0.0200603555        19.901538
FC            0.0024279188 0.005158945         0.0037948155         5.527078
NE            0.0352705133 0.070503233         0.0527718526        46.278504
NW            0.0256059127 0.034433862         0.0299981496        26.440402
QT            0.0037228694 0.008181262         0.0059571350         9.308828
SK            0.0048187014 0.008895719         0.0068609174        10.662129
TA            0.0042134249 0.011746533         0.0079851331        12.878367
WC            0.0177155268 0.014981440         0.0163366320        14.240232
WD            0.0232972311 0.034083695         0.0286702065        25.335182
WG            0.0328547215 0.053142508         0.0429480441        30.663749
WW            0.0093983693 0.006377956         0.0078681474         7.250101
YG            0.0051691399 0.007338639         0.0062618144        11.084111
num_cell      0.0061355526 0.005373049         0.0057463613         5.060577
num_genes     0.0364878788 0.044544488         0.0404558096        32.745034
position      0.0025375614 0.011566496         0.0070255302        10.070505
freq_hypo     0.0008723241 0.001757602         0.0013181209         1.930695
freq_intra    0.0009449492 0.001943090         0.0014431451         2.611950
log_hypo      0.0004514713 0.001366561         0.0009096419         1.736749
acid_per      0.0125815445 0.023360179         0.0179634375        21.131681
base_per      0.0070077737 0.012196570         0.0096129124        13.675893
charge_per    0.0095668425 0.024125997         0.0168345956        20.969665
hydrophob_per 0.0185736697 0.031941513         0.0252200036        25.994903
polar_per     0.0169369327 0.023633413         0.0202776247        20.890415
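
For reference, here is a minimal sketch of how the columns above can be pulled out one at a time. It is only an illustration under assumptions: the built-in iris data stands in for my real data, and the seed is arbitrary. As far as I can tell from the help page for importance(), type = 1 is the permutation-based mean decrease in accuracy, type = 2 is the mean decrease in node impurity (Gini index for classification), and the leading columns named after the class labels are class-specific versions of the type = 1 measure.

```r
# Sketch only: iris is a stand-in data set, not the data behind the table above.
library(randomForest)

set.seed(1)  # arbitrary seed, only for reproducibility of the sketch
fit <- randomForest(Species ~ ., data = iris, importance = TRUE)

# Unscaled matrix with the same layout as fit$importance above:
# one column per class, then MeanDecreaseAccuracy and MeanDecreaseGini.
print(colnames(fit$importance))

# type = 1: permutation measure (mean decrease in accuracy), left unscaled.
perm <- importance(fit, type = 1, scale = FALSE)

# type = 2: mean decrease in node impurity (Gini index for classification).
gini <- importance(fit, type = 2)

# Class-specific permutation measure for a single class label.
setosa <- importance(fit, type = 1, class = "setosa", scale = FALSE)

print(cbind(perm, gini, setosa))
```

With scale = FALSE the type = 1 values should match the MeanDecreaseAccuracy column of fit$importance; the default scale = TRUE divides the permutation measures by their standard errors.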
On Thu, Apr 29, 2010 at 5:22 AM, Liaw, Andy <andy_liaw@merck.com> wrote:
> Please see the "Detail" section of the help page for the importance()
> function in the randomForest package, and let me know which part of it you
> do not understand.
>
> For boosting, you need to read its documentation and decide for yourself if
> its importance measure is at all comparable to the two in RF.
>
> Andy
>
> ------------------------------
> *From:* Changbin Du [mailto:changbind@gmail.com]
> *Sent:* Wednesday, April 28, 2010 8:58 PM
> *To:* Liaw, Andy
> *Cc:* r-help@r-project.org
> *Subject:* variable importance in Random Forest
>
> Hi, Dear Andy,
>
> I ran randomForest in R and got the following results for variable
> importance:
>
> What is the meaning of MeanDecreaseAccuracy and MeanDecreaseGini?
>
> I found they are raw values; they are not scaled to 1, right?
>
> Which column is most similar to the variable rel.influence in Boosting?
>
> Thanks so much!
>
>
>
> > fit$importance
>                          0           1 MeanDecreaseAccuracy MeanDecreaseGini
> CT            0.0022352025 0.003829344         0.0030311246         5.184427
> DP            0.0069461974 0.016387520         0.0116650960        15.440624
> DY            0.0141150255 0.026031690         0.0200603555        19.901538
> FC            0.0024279188 0.005158945         0.0037948155         5.527078
> NE            0.0352705133 0.070503233         0.0527718526        46.278504
> NW            0.0256059127 0.034433862         0.0299981496        26.440402
> QT            0.0037228694 0.008181262         0.0059571350         9.308828
> SK            0.0048187014 0.008895719         0.0068609174        10.662129
> TA            0.0042134249 0.011746533         0.0079851331        12.878367
> WC            0.0177155268 0.014981440         0.0163366320        14.240232
> WD            0.0232972311 0.034083695         0.0286702065        25.335182
> WG            0.0328547215 0.053142508         0.0429480441        30.663749
> WW            0.0093983693 0.006377956         0.0078681474         7.250101
> YG            0.0051691399 0.007338639         0.0062618144        11.084111
> num_cell      0.0061355526 0.005373049         0.0057463613         5.060577
> num_genes     0.0364878788 0.044544488         0.0404558096        32.745034
> position      0.0025375614 0.011566496         0.0070255302        10.070505
> freq_hypo     0.0008723241 0.001757602         0.0013181209         1.930695
> freq_intra    0.0009449492 0.001943090         0.0014431451         2.611950
> log_hypo      0.0004514713 0.001366561         0.0009096419         1.736749
> acid_per      0.0125815445 0.023360179         0.0179634375        21.131681
> base_per      0.0070077737 0.012196570         0.0096129124        13.675893
> charge_per    0.0095668425 0.024125997         0.0168345956        20.969665
> hydrophob_per 0.0185736697 0.031941513         0.0252200036        25.994903
> polar_per     0.0169369327 0.023633413         0.0202776247        20.890415
>
> --
> Sincerely,
> Changbin
> --