search for: incnodepurity

Displaying 6 results from an estimated 6 matches for "incnodepurity".

2010 May 05
0
Which column in randomForest importances (for regression) is MSE and which IncNodePurity
...n the function randomForest with importance=T. All my variables (predictors and the dependent variable) are numeric. rf<-randomForest(formula, data=mydata, importance=T, etc.) my results object "rf" contains predictor importances: rf$importance I am seeing two columns:
       %IncMSE IncNodePurity
V1 -0.01683558      58.10910
V2  0.04000299      71.27579
V3  0.01974636      67.22586
V4  0.25020393     113.69823
V5  0.03146358      67.11151
V6  0.01717313      66.57246
V7 -0.00500985      62.37103
V8 -0.02862065      66.15369
V9 -0.02431507      54.50013
They seem to be clearly labeled %IncM...
2010 Apr 28
1
Question on: Random Forest Variable Importance for Regression Problems
I am trying to use the package randomForest to perform regression. The variable importance estimates are given as "%IncMSE" and "IncNodePurity". Can anyone explain to me what these refer to and how they are calculated? I found a lot of information on variable importance measures for classification problems, but nothing on regression. Thanks a lot. Mareike
2013 May 17
2
Selecting A List of Columns
...s.rf$importance)
Importance
MSEImportance <- head(Importance[order(Importance$X.IncMSE, decreasing=TRUE),], 3)
MSEVars <- row.names(MSEImportance)
MSEVars <- data.frame(MSEVars, stringsAsFactors = FALSE)
colnames(MSEVars) <- "Vars"
NodeImportance <- head(Importance[order(Importance$IncNodePurity, decreasing=TRUE),], 3)
NodeVars <- row.names(NodeImportance)
NodeVars <- data.frame(NodeVars, stringsAsFactors = FALSE)
colnames(NodeVars) <- "Vars"
ImportantVars <- rbind(MSEVars, NodeVars)
ImportantVars <- unique(ImportantVars)
nrow(ImportantVars)
ImportantVars<-as.character(Imp...
2011 Aug 04
1
randomForest partial dependence plot variable names
...iables in each model, but I am unable to figure out how to enter the variable names into my code. I had originally thought to extract them from the $importance matrix after sorting by metric (e.g. %IncMSE), but the importance matrix is n by 2 - containing only the data for each metric (%IncMSE and IncNodePurity). It is clearly linked to the variable names, but I am unsure how to extract those names for use in scripting. Any assistance would be greatly appreciated as I am currently typing the variable names into each partialPlot call for every model I run.....and that is taking a LONG time. Thanks! [[...
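The importance matrix described in this snippet carries the variable names as its row names, so they can be pulled out with rownames() after ordering by a metric. A minimal sketch in base R, assuming a matrix shaped like rf$importance; the object names imp, ranked and top_vars are made up for illustration, and the values are the first four rows quoted in the first result on this page:

```r
# Mock importance matrix shaped like rf$importance for a regression forest.
# Values copied from the first result on this page (rows V1..V4 only).
imp <- matrix(
  c(-0.01683558,  58.10910,
     0.04000299,  71.27579,
     0.01974636,  67.22586,
     0.25020393, 113.69823),
  ncol = 2, byrow = TRUE,
  dimnames = list(c("V1", "V2", "V3", "V4"),
                  c("%IncMSE", "IncNodePurity"))
)

# The variable names are the row names; sort them by the chosen metric.
ranked <- rownames(imp)[order(imp[, "%IncMSE"], decreasing = TRUE)]

# Keep the two highest-ranked predictors.
top_vars <- head(ranked, 2)
print(top_vars)
```

With a fitted forest rf and its data frame mydata (both assumed here, not shown in the snippet), each name in top_vars could then be handed to partialPlot in a loop. A commonly suggested way to pass the name as a string is do.call("partialPlot", list(x = rf, pred.data = mydata, x.var = v)), though that should be checked against the installed package version.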
2010 May 05
1
randomForest: predictor importance (for regressions)
...o get the "standard errors" of the permutation-based importance measure: rfresult$importanceSD I have 2 questions: 1. Because I am dealing with regressions, I am getting an importance object (rfresult$importance) with two columns, labeled "%IncMSE" (the first column) and "IncNodePurity" (the second column). I assume it's the first one that is the mean decrease in accuracy due to permutation. Am I correct or am I wrong? I am confused because ?randomForest says: "For Regression, the first column is the mean decrease in accuracy and the second the mean decrease in MSE."...
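One way to check which column is which, rather than relying on column position, is to request each measure by type through importance(). A hedged sketch, assuming the randomForest package is installed; mtcars and the object names rf, perm and purity are stand-ins, not the poster's data. Per the package help for importance(), type = 1 is the permutation-based measure and type = 2 is the node-impurity measure (residual sum of squares for regression):

```r
library(randomForest)  # assumes the randomForest package is installed

set.seed(1)
# mtcars is a stand-in dataset, used only to get a regression forest.
rf <- randomForest(mpg ~ ., data = mtcars, importance = TRUE)

# Column labels of the stored importance matrix.
print(colnames(rf$importance))

# Ask for each measure explicitly instead of by column position.
perm   <- importance(rf, type = 1)  # permutation-based (mean decrease in accuracy)
purity <- importance(rf, type = 2)  # total decrease in node impurity

print(colnames(perm))
print(colnames(purity))

# The "standard errors" of the permutation measure, one per predictor.
print(rf$importanceSD)
```

Selecting by type or by column name keeps scripts independent of how the help page words the column order.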
2010 Jul 13
1
question regarding "varImpPlot" results vs. model$importance data on package "RandomForest"
Hi everyone, I have another "Random Forest" package question: - my (presumably incorrect) understanding of the varImpPlot is that it should plot the "% increase in MSE" and "IncNodePurity" exactly as can be found from the "importance" section of the model results. - However, the plot does not, in fact, match the "importance" section of the random forest model. E.g., if you use the example given in the ?randomForest, you will see the plot...
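A likely source of a mismatch like the one described above is scaling: varImpPlot() and importance() have a scale argument that defaults to TRUE, which divides the permutation measure by its standard error, while model$importance stores the unscaled values. Whether this is what the poster ran into cannot be confirmed from the truncated snippet. A minimal sketch, assuming the randomForest package is installed and using mtcars as a stand-in dataset:

```r
library(randomForest)  # assumes the randomForest package is installed

set.seed(1)
# mtcars is a stand-in dataset, not the example from ?randomForest.
rf <- randomForest(mpg ~ ., data = mtcars, importance = TRUE)

# Unscaled permutation importance as stored in the model object.
raw <- rf$importance[, "%IncMSE"]

# The same measure through importance(), with and without scaling.
scaled   <- importance(rf, type = 1, scale = TRUE)[, 1]
unscaled <- importance(rf, type = 1, scale = FALSE)[, 1]

# Unscaled values should agree with the stored matrix; scaled ones should not.
print(isTRUE(all.equal(raw, unscaled)))
print(isTRUE(all.equal(raw, scaled)))
```

Under this reading, varImpPlot(rf, scale = FALSE) would be the call to try when the plot needs to match the stored importance matrix.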