thr3ads.net - R help - [R] Using unbalanced-learning algorithms in the randomForest [May 2014]

Byron Dom

2014-May-16 23:18 UTC

[R] Using unbalanced-learning algorithms in the randomForest

Responding to my own post/question here.

Andy Liaw directed me to this page:
http://grokbase.com/t/r/r-help/05av0aaa2e/r-repost-examples-of-classwt-strata-and-sampsize-i-n-randomforest,
which gives an answer to my question.
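
For readers who cannot reach that page, here is a minimal sketch of how the three arguments named in its title (classwt, strata and sampsize) might be used to approximate the two variants. The data are synthetic, and the minority size, the 1:10 weights and ntree = 200 are illustrative assumptions, not values from the report or from the linked thread. In particular, classwt sets class priors, and whether that matches the report's "weighted RF" exactly is not something this sketch confirms.

```r
library(randomForest)

set.seed(1)

# Synthetic, illustrative two-class data: 950 "common" vs 50 "rare" cases.
n_common <- 950
n_rare   <- 50
y <- factor(rep(c("common", "rare"), times = c(n_common, n_rare)))
x <- data.frame(
  x1 = rnorm(length(y), mean = ifelse(y == "rare", 1.5, 0)),
  x2 = rnorm(length(y))
)

n_min <- min(table(y))  # size of the minority class

# "Balanced RF"-style fit: stratify the per-tree sample by class and draw
# the same number of cases (with replacement, the default) from each class.
# sampsize follows the order of levels(y): "common", then "rare".
rf_bal <- randomForest(x, y, ntree = 200,
                       strata = y,
                       sampsize = c(n_min, n_min))

# "Weighted RF"-style fit (assumption: classwt is the closest available
# knob). The weights follow levels(y) and need not sum to one.
rf_wt <- randomForest(x, y, ntree = 200,
                      classwt = c(1, 10))

# Out-of-bag confusion matrices, including the per-class error column.
print(rf_bal$confusion)
print(rf_wt$confusion)
```

Comparing the class.error column of the two confusion matrices against a plain randomForest(x, y) fit is one way to see whether either option helps the minority class on your own data.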

----------------------------------- original post -----------------------------------
Date: Tue, 6 May 2014 22:54:22 -0700 (PDT)
From: Byron Dom <byron_dom at yahoo.com>
To: "r-help at r-project.org" <r-help at r-project.org>
Subject: [R] Using unbalanced-learning algorithms in the randomForest
    package.
Message-ID:
    <1399442062.12706.YahooMailNeo at web142801.mail.bf1.yahoo.com>
Content-Type: text/plain
In archive: https://stat.ethz.ch/pipermail/r-help/2014-May/374384.html

The following report by the authors of the randomForest package describes two
different algorithm modifications for using random forests to learn classifiers
for "unbalanced" learning problems in which one class is much less
frequent than the other (in 2-class problems). These two variations are called
"balanced RF" and "weighted RF."
http://statistics.berkeley.edu/sites/default/files/tech-reports/666.pdf

Would someone please answer these three questions?
(1) Is it possible to use the R randomForest package to learn random forests
using either of these modified RF-learning algorithms?
(2) If it is possible, how does one do it?
(3) Is there some detailed documentation for running these modified versions?
I've read the R package manual but it's too sketchy. It seems to be
primarily for users who are already familiar with the package and just need to
look up some detail like the name of an argument.
