Martin Lam
2006-Feb-09 16:18 UTC
[R] How to calculate the generalization error of random forests?
Hi,

Perhaps this is not the proper place to ask this question, but I am out of options, so I apologize in advance.

I want to know how the generalization error of a random forest (or rather its upper bound?) is determined using the out-of-bag estimate. I read in Breiman's paper that the generalization error is bounded in terms of s and p: p(1-s^2)/s^2. Does s stand for the strength of the individual tree or of the entire ensemble? p stands for the correlation between the trees.

If I have, let's say, built 3 trees in my forest and I know for each tree the instances that were left out during training, how do I calculate s and p, so that I can calculate the error?

Thanks in advance,
Martin
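For what it is worth, here is a minimal sketch of how s and p could be estimated from the left-out instances, following my reading of the out-of-bag estimates in Breiman's paper. There, s is the mean margin of the whole ensemble (the out-of-bag vote share for the true class minus the largest vote share for any other class), and p is the variance of that margin divided by the squared average, over trees, of the standard deviation of each tree's raw margin. The function name `oob_strength_correlation` and its arguments `y`, `preds` and `oob` are hypothetical, chosen only for illustration. The sketch assumes a classification problem with at least two classes, and it is written in Python with the standard library only rather than against any particular random forest package.

```python
from collections import Counter
from math import sqrt


def oob_strength_correlation(y, preds, oob):
    """Estimate strength s, correlation p and the bound p(1-s^2)/s^2.

    y      : list of true class labels, one per training instance
    preds  : preds[k][i] is tree k's predicted label for instance i
    oob    : oob[k] is the set of instance indices left out of tree k
    (All three names are illustrative assumptions, not a library API.)
    """
    n = len(y)
    # Class labels in order of first appearance (keeps ties deterministic).
    classes = list(dict.fromkeys(list(y) + [p for tree in preds for p in tree]))

    # Count out-of-bag votes: only trees that did not train on i vote at i.
    votes = [Counter() for _ in range(n)]
    for k, left_out in enumerate(oob):
        for i in left_out:
            votes[i][preds[k][i]] += 1

    margins = {}  # instance -> estimated margin
    rival = {}    # instance -> most-voted wrong class
    for i in range(n):
        total = sum(votes[i].values())
        if total == 0:
            continue  # instance was never out-of-bag, so it is skipped
        q = {c: votes[i][c] / total for c in classes}
        wrong = [c for c in classes if c != y[i]]
        rival[i] = max(wrong, key=lambda c: q[c])
        margins[i] = q[y[i]] - q[rival[i]]

    m = len(margins)
    s = sum(margins.values()) / m                          # strength
    var_mr = sum(v * v for v in margins.values()) / m - s * s

    # Per tree, the raw margin is +1 (correct), -1 (votes for the rival
    # class) or 0, so its variance is p1 + p2 - (p1 - p2)^2.
    sds = []
    for k, left_out in enumerate(oob):
        idx = list(left_out)
        if not idx:
            continue
        p1 = sum(preds[k][i] == y[i] for i in idx) / len(idx)
        p2 = sum(preds[k][i] == rival[i] for i in idx) / len(idx)
        sds.append(sqrt(p1 + p2 - (p1 - p2) ** 2))
    mean_sd = sum(sds) / len(sds)

    rho = var_mr / mean_sd ** 2 if mean_sd > 0 else float("nan")
    # The bound is only meaningful for positive strength.
    bound = rho * (1 - s * s) / (s * s) if s > 0 else float("nan")
    return s, rho, bound
```

With only 3 trees, each instance is out-of-bag for very few trees, so these estimates will be noisy, and the resulting bound can exceed 1. The plain out-of-bag error rate (the fraction of instances whose out-of-bag majority vote is wrong) is the more direct estimate of the generalization error; the bound mainly shows how strength and correlation trade off.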