2012 Mar 03
Strategies to deal with unbalanced classification data in randomForest
...itself when I try to train a random forest
## without accounting for this imbalance
df.rf <- randomForest(cls ~ var1 + var2 + var3, data = df, importance = TRUE)
## Now one option is to down-sample the majority class. However, I
## can't seem to find exactly how to do this. Does this seem correct?
df.rf.downsamp <- randomForest(cls ~ var1 + var2 + var3,
                               data = df, sampsize = c(50, 50), importance = TRUE)
## 50 being the number of observations in the minority class
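As a rough sketch of the down-sampling call above, the per-class sample size can be computed from the data instead of hard-coding 50. The toy data frame below is hypothetical and only stands in for the poster's `df`; the 50/450 class split and the names `n.min` and `samp` are assumptions made for illustration. `strata` is passed explicitly here only to make the stratification visible; the original call omits it.

```r
## Hypothetical toy data standing in for the poster's `df`
## (50 minority rows, 450 majority rows; these counts are an assumption)
set.seed(42)
df <- data.frame(
  cls  = factor(c(rep("minority", 50), rep("majority", 450))),
  var1 = rnorm(500),
  var2 = rnorm(500),
  var3 = rnorm(500)
)

## Size of the smallest class, computed rather than hard-coded
n.min <- min(table(df$cls))

## One entry per class level, in the order of levels(df$cls)
samp <- rep(n.min, nlevels(df$cls))

## Fit only if the randomForest package is installed
if (requireNamespace("randomForest", quietly = TRUE)) {
  df.rf.downsamp <- randomForest::randomForest(
    cls ~ var1 + var2 + var3, data = df,
    strata = df$cls, sampsize = samp, importance = TRUE
  )
  ## Per-class error rates appear in the last column of the confusion matrix
  print(df.rf.downsamp$confusion)
}
```

With this setup each tree is grown on `n.min` rows drawn from every class, so the majority class is down-sampled per tree rather than discarded once up front.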
## The other option, which there seems to be some confusion over, is to
## establish some class weights to balance the error rate. This approach I've mos...