MSE is the mean of the squared residuals. For the training data, the OOB
estimate is used (i.e., residual = data - OOB prediction, MSE =
sum(residuals^2) / n, where the OOB prediction is the mean of the
predictions from all trees for which the case is OOB). It is _not_ the
average OOB MSE of the trees in the forest.
I hope there's no question about how the pseudo R^2 is computed on a
test set? If you understand how that's done, I assume the only confusion
is about how the OOB MSE is formed.
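In case a small worked example helps, here is a rough sketch of the bookkeeping described above. It is not the randomForest source: it is plain Python with made-up numbers, and the names (`oob_mse_and_rsq`, `mean_per_tree_oob_mse`, `tree_preds`, `oob`) are hypothetical. It also assumes Var(y) is taken with divisor n and that every case is OOB for at least one tree.

```python
from statistics import mean


def oob_mse_and_rsq(y, tree_preds, oob):
    """Aggregate OOB predictions per case, then compute MSE and pseudo R^2.

    y[i]             : observed response for case i
    tree_preds[t][i] : prediction of tree t for case i
    oob[t][i]        : True when case i was out of bag for tree t
    """
    n = len(y)
    oob_pred = []
    for i in range(n):
        # Only trees that did not see case i contribute to its prediction.
        votes = [preds[i] for preds, mask in zip(tree_preds, oob) if mask[i]]
        oob_pred.append(mean(votes))  # assumes case i is OOB at least once
    residuals = [yi - pi for yi, pi in zip(y, oob_pred)]
    mse = sum(r * r for r in residuals) / n
    y_bar = mean(y)
    var_y = sum((yi - y_bar) ** 2 for yi in y) / n  # divisor n: an assumption
    return mse, 1 - mse / var_y


def mean_per_tree_oob_mse(y, tree_preds, oob):
    """The quantity the MSE is NOT: the average of each tree's own OOB MSE."""
    per_tree = []
    for preds, mask in zip(tree_preds, oob):
        sq = [(yi - pi) ** 2 for yi, pi, m in zip(y, preds, mask) if m]
        per_tree.append(mean(sq))
    return mean(per_tree)


# Made-up data: 4 cases, 3 trees.
y = [1.0, 2.0, 3.0, 4.0]
tree_preds = [
    [2.0, 2.0, 2.0, 2.0],
    [0.0, 2.0, 4.0, 4.0],
    [1.0, 3.0, 3.0, 6.0],
]
oob = [
    [True, True, False, False],
    [True, False, True, True],
    [False, True, True, True],
]

mse, rsq = oob_mse_and_rsq(y, tree_preds, oob)
print(mse, rsq)                                   # aggregated OOB MSE, pseudo R^2
print(mean_per_tree_oob_mse(y, tree_preds, oob))  # a different, larger number
```

With these numbers the aggregated OOB MSE is 0.375, while the average of the per-tree OOB MSEs is about 0.94. The two differ because averaging the predictions first reduces the error before it is squared.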
Best,
Andy
From: Dimitri Liakhovitski
> Dear Random Forests gurus,
>
> I have a question about R^2 provided by randomForest (for regression).
> I don't succeed in finding this information.
>
> In the help file for randomForest under "Value" it says:
>
> rsq: (regression only) - "pseudo R-squared": 1 - mse / Var(y).
>
> Could someone please explain in somewhat more detail how exactly R^2
> is calculated?
> Is "mse" mean squared error for prediction?
> Is "mse" an average of mse's for all trees run on out-of-bag
> holdout samples?
> In other words - is this R^2 based on out-of-bag samples?
>
> Thank you very much for clarification!
>
> --
> Dimitri Liakhovitski
> MarketTools, Inc.
> Dimitri.Liakhovitski at markettools.com
>