Liaw, Andy
2004-Oct-14 20:28 UTC
[R] random forest problem when calculating variable importance
Are the results dramatically different? The result would be expected to be somewhat different, as setting importance=TRUE would make many calls to the random number generator (for permuting OOB data in each variable), making all but the first tree in the forest different than if importance=FALSE.

Cheers,
Andy

> From: Scott Gilpin
>
> Hi -
>
> When using the randomForest function for regression, I get different
> results for mean-squared error of the predictions depending on whether
> or not I specify to calculate variable importance. There is an
> example below. I looked briefly at the source code, but couldn't find
> anything that would indicate why calculating variable importance would
> (or should) change predictions.
>
> I'm using randomForest version 4.3-3 (the latest from CRAN), and tried
> R 1.9.0, 1.9.1 and 2.0.0 on Windows XP, and R 1.9.1 on solaris 8.
>
> Thanks,
> Scott Gilpin
>
> library(randomForest)
> set.seed(2863)
> x<-matrix(runif(1000),ncol=10)
> colnames(x)<-1:10
> beta<-matrix(c(1,2,3,4,5,0,0,0,0,0),ncol=1)
> y<-drop(x %*% beta + rnorm(100))
> newx<-matrix(runif(1000),ncol=10)
> newy<-drop(newx %*% beta + rnorm(100))
>
> set.seed(2863)
> rf.fit <- randomForest(x=x,y=y,xtest=newx,ytest=newy,importance=F)
> print(rf.fit$test$mse[500])
>
> set.seed(2863)
> rf.fit <- randomForest(x=x,y=y,xtest=newx,ytest=newy,importance=T)
> print(rf.fit$test$mse[500])
>
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide!
> http://www.R-project.org/posting-guide.html
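
The explanation in the reply (extra calls to the random number generator shift everything drawn afterwards) can be illustrated with a small base-R sketch. This is not randomForest's actual code: `grow_toy_forest` and its `permute` argument are hypothetical names made up for the illustration, and each "tree" is represented only by its bootstrap indices. The optional `sample.int()` call stands in for the permutation of OOB data that importance=TRUE is described as triggering.

```r
# Toy stand-in for growing a forest (an assumption for illustration only):
# each "tree" is just the bootstrap indices it would be grown on.
grow_toy_forest <- function(n_trees, n_obs, permute = FALSE) {
  draws <- vector("list", n_trees)
  for (i in seq_len(n_trees)) {
    # Bootstrap sample for tree i: consumes random numbers.
    draws[[i]] <- sample.int(n_obs, n_obs, replace = TRUE)
    if (permute) {
      # Extra RNG call after the tree is "grown", standing in for the
      # OOB permutation step. The result is discarded, but the RNG
      # state still advances.
      sample.int(n_obs)
    }
  }
  draws
}

set.seed(2863)
without_perm <- grow_toy_forest(3, 100, permute = FALSE)

set.seed(2863)
with_perm <- grow_toy_forest(3, 100, permute = TRUE)

# The first tree is drawn before any extra RNG call, so it matches;
# later trees start from a different RNG state, so they do not.
first_same  <- identical(without_perm[[1]], with_perm[[1]])
second_same <- identical(without_perm[[2]], with_perm[[2]])
print(c(first = first_same, second = second_same))
```

With the same seed, the first element should agree between the two runs while the second should not, which mirrors "all but the first tree in the forest" being different. The two fits are then two different random forests, so their test MSEs would be expected to differ by about as much as two fits with different seeds.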