Is it expected that predict.randomForest() produces the response vector when given the same data frame as provided to randomForest()? See below.

Thanks.

Rodney

> rf <- randomForest (Species ~ ., iris)
> pc <- predict (rf, iris)
> confusion (pc, iris$Species)
            true
object       setosa versicolor virginica
  setosa         50          0         0
  versicolor      0         50         0
  virginica       0          0        50
attr(,"error")
[1] 0
> confusion (rf$predicted, iris$Species)
            true
object       setosa versicolor virginica
  setosa         50          0         0
  versicolor      0         47         3
  virginica       0          3        47
attr(,"error")
[1] 0.04
> rf$confusion
           setosa versicolor virginica class.error
setosa         50          0         0        0.00
versicolor      0         47         3        0.06
virginica       0          3        47        0.06
If you read the help page for randomForest(), you would expect this behavior. In the "Value" section of that help page, it says:

predicted: the predicted values of the input data based on out-of-bag samples.

Andy

From: Rodney Barnett
>
> Is it expected that predict.randomForest() produces the
> response vector when
> given the same data frame as provided to randomForest()? See below.
>
> Thanks.
>
> Rodney

______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
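To make the help-page wording concrete, here is a minimal sketch (not part of the original exchange) that puts the two kinds of prediction side by side. It assumes the randomForest package is installed; the seed and the variable names oob_pred and insample_pred are my own choices. It relies on predict() returning the stored out-of-bag predictions when called without newdata.

```r
library(randomForest)

set.seed(42)  # arbitrary seed, only so the run is repeatable
rf <- randomForest(Species ~ ., data = iris)

# No newdata: predict() hands back the out-of-bag predictions stored in rf$predicted.
oob_pred <- predict(rf)

# newdata = training data: every tree votes, including trees that saw the row.
insample_pred <- predict(rf, newdata = iris)

# Cross-tabulate each set of predictions against the true species.
print(table(predicted = oob_pred, true = iris$Species))
print(table(predicted = insample_pred, true = iris$Species))

# The out-of-bag predictions should match the component stored on the fit.
print(all(as.character(oob_pred) == as.character(rf$predicted)))
```

The first table is the one to read as an error estimate; the second mostly shows how well the forest reproduces data it has already seen.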
For the sake of anyone else asking this question: as Andy Liaw explained to me, yes, this is expected. predict() uses all the trees in the forest, including the trees that were grown on a bootstrap sample containing the item being predicted, so its results on the training data look better than they should. rf$predicted, by contrast, is based only on out-of-bag samples.

Rodney

-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Rodney Barnett
Sent: Wednesday, October 01, 2008 10:01 AM
To: r-help at r-project.org
Subject: [R] Surprising randomForest Results

Is it expected that predict.randomForest() produces the response vector when given the same data frame as provided to randomForest()? See below.

Thanks.

Rodney
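For anyone who wants to see the tree-level difference directly, the following is a rough sketch (again, not from the original thread) that rebuilds the out-of-bag vote by hand. It assumes the keep.inbag argument of randomForest() and the predict.all argument of predict() behave as documented; tree_votes, is_oob and manual_oob are hypothetical names introduced here. Ties between classes may be broken differently than in the package, so agreement with rf$predicted is expected to be very high but not necessarily exact.

```r
library(randomForest)

set.seed(42)  # arbitrary seed, only so the run is repeatable
# keep.inbag = TRUE records, for each row, which trees had it in their bootstrap sample.
rf <- randomForest(Species ~ ., data = iris, keep.inbag = TRUE)

# One column of class labels per tree (rows = observations).
tree_votes <- predict(rf, newdata = iris, predict.all = TRUE)$individual

# TRUE where the row was NOT used to grow the tree, i.e. it is out-of-bag for that tree.
is_oob <- rf$inbag == 0

# Majority vote per row, counting only the trees for which the row is out-of-bag.
manual_oob <- vapply(seq_len(nrow(iris)), function(i) {
  votes <- tree_votes[i, is_oob[i, ]]
  names(which.max(table(votes)))
}, character(1))

# Share of rows where the hand-built vote agrees with the stored OOB prediction.
print(mean(manual_oob == as.character(rf$predicted)))

# Training-set accuracy using all trees versus only the out-of-bag trees.
print(mean(predict(rf, newdata = iris) == iris$Species))
print(mean(manual_oob == as.character(iris$Species)))
```

Restricting the vote to trees that never saw the row is what removes the optimism from the all-trees prediction on the training data.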