thr3ads.net - R help - [R] mx2 contingency tables or (2^(m-1)-1)'s 2x2 contingency tables in the context of feature selection for random forest [Sep 2006]

Weiwei Shi

2006-Sep-28 17:52 UTC

[R] mx2 contingency tables or (2^(m-1)-1)'s 2x2 contingency tables in the context of feature selection for random forest

Dear Listers:

I have a feature selection problem involving categorical variables
for random forest.

Suppose I have a multi-level categorical variable A with m=3 levels
(red, green, and blue), and the target is a binary class label.

I want to evaluate how well it discriminates between the 2 classes.
We know that rf splits a multi-level categorical variable by
considering all ways of dividing its levels into two groups. Now
suppose I have 1000 such multi-level categorical variables and need
to do some feature selection. I would like to try chi-square tests
(or information gain) for this.
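
To make the information gain option concrete, here is a minimal
sketch in plain Python (standard library only). The function names
and the counts are made up for illustration and are not taken from
any package; each row of the table holds the (class 0, class 1)
counts for one level of A.

```python
import math

def entropy(counts):
    """Shannon entropy in bits of a list of class counts."""
    n = sum(counts)
    return -sum(c / n * math.log2(c / n) for c in counts if c > 0)

def information_gain(table):
    """Information gain of a categorical feature for the class label.

    table: one (n_class0, n_class1) row per level of the feature.
    """
    col_tot = [sum(col) for col in zip(*table)]
    n = sum(col_tot)
    h_target = entropy(col_tot)
    # Entropy of the class label within each level, weighted by level size.
    h_cond = sum(sum(row) / n * entropy(row) for row in table if sum(row) > 0)
    return h_target - h_cond

# Hypothetical counts for A = red, green, blue (rows) by class (columns).
table = [(30, 10), (20, 20), (10, 30)]
ig = information_gain(table)
print(f"information gain of A: {ig:.4f} bits")
```

A feature whose levels carry no class information scores 0, and one
that separates two balanced classes perfectly scores 1 bit.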

To match the splitting method used in rf, I am wondering whether I
should simply use a single mx2 contingency table, or instead build
the 2^(m-1)-1 2x2 contingency tables (one for each way of dividing
the m levels into two groups) and take the best p-value among them as
the measure of A's power. I am fairly sure the latter is very similar
to what rf does. But is the former good enough?
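
The two options can be put side by side in a short sketch. This is
plain Python using only the standard library, with a hand-written
Pearson chi-square (no continuity correction); all function names and
the counts are hypothetical, and it assumes every row and column total
is nonzero. It is meant only to illustrate the comparison, not to
reproduce what rf does internally.

```python
import math
from itertools import combinations

def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variate with integer df >= 1."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        q, k = math.exp(-half), 2              # exact tail for df = 2
    else:
        q, k = math.erfc(math.sqrt(half)), 1   # exact tail for df = 1
    # Recurrence: Q(k + 2) = Q(k) + (x/2)^(k/2) * exp(-x/2) / Gamma(k/2 + 1)
    while k < df:
        q += half ** (k / 2.0) * math.exp(-half) / math.gamma(k / 2.0 + 1.0)
        k += 2
    return q

def pearson_chi2(table):
    """Pearson chi-square statistic and p-value for an r x c table of counts."""
    row_tot = [sum(row) for row in table]
    col_tot = [sum(col) for col in zip(*table)]
    n = sum(row_tot)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_tot[i] * col_tot[j] / n
            stat += (observed - expected) ** 2 / expected
    df = (len(row_tot) - 1) * (len(col_tot) - 1)
    return stat, chi2_sf(stat, df)

def binary_splits(levels):
    """Yield the 2^(m-1) - 1 ways to divide the levels into two non-empty groups."""
    first, rest = levels[0], levels[1:]
    # Keeping the first level on the left avoids counting mirror images twice.
    for k in range(len(rest)):
        for extra in combinations(rest, k):
            left = (first,) + extra
            right = tuple(lv for lv in levels if lv not in left)
            yield left, right

def best_split(counts):
    """counts: level -> (n_class0, n_class1). Return (smallest p, left, right)."""
    best = None
    for left, right in binary_splits(tuple(counts)):
        table = [[sum(counts[lv][c] for lv in group) for c in (0, 1)]
                 for group in (left, right)]
        _, p = pearson_chi2(table)
        if best is None or p < best[0]:
            best = (p, left, right)
    return best

# Made-up counts of (class 0, class 1) for each level of A.
counts = {"red": (30, 10), "green": (20, 20), "blue": (10, 30)}
stat_full, p_full = pearson_chi2([list(counts[lv]) for lv in counts])
p_best, left, right = best_split(counts)
print(f"mx2 table: chi2 = {stat_full:.2f}, p = {p_full:.3g}")
print(f"best 2x2 split: {left} vs {right}, p = {p_best:.3g}")
```

With these made-up counts the mx2 table gives a statistic of 20 on 2
degrees of freedom (p about 4.5e-5), while the best of the three 2x2
tables, red vs. {green, blue}, gives 15 on 1 degree of freedom (p
about 1.1e-4). One caveat when comparing variables with different m:
the smallest of several p-values is not itself a calibrated p-value,
since a variable with more levels gets more tries.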

Thanks.
-- 
Weiwei Shi, Ph.D
Research Scientist
GeneGO, Inc.

"Did you always know?"
"No, I did not. But I believed..."
---Matrix III
