Displaying 6 results from an estimated 6 matches for "incnodepurity".
2010 May 05
0
Which column in randomForest importances (for regression) is MSE and which IncNodePurity
...n the function randomForest with importance=T. All my variables
(predictors and the dependent variable) are numeric.
rf <- randomForest(formula, data=mydata, importance=TRUE)  # plus any other arguments
My results object "rf" contains the predictor importances:
rf$importance
I am seeing two columns:
       %IncMSE  IncNodePurity
V1 -0.01683558       58.10910
V2  0.04000299       71.27579
V3  0.01974636       67.22586
V4  0.25020393      113.69823
V5  0.03146358       67.11151
V6  0.01717313       66.57246
V7 -0.00500985       62.37103
V8 -0.02862065       66.15369
V9 -0.02431507       54.50013
They seem to be clearly labeled %IncM...
2010 Apr 28
1
Question on: Random Forest Variable Importance for Regression Problems
I am trying to use the randomForest package to perform regression.
The variable importance estimates are given as "%IncMSE" and
"IncNodePurity".
Can anyone explain to me what these refer to and how they are calculated?
I found a lot of information on variable importance measures for
classification problems, but nothing on regression.
Thanks a lot.
Mareike
2013 May 17
2
Selecting A List of Columns
...s.rf$importance)
Importance
# Top 3 variables by %IncMSE
MSEImportance <- head(Importance[order(Importance$X.IncMSE, decreasing = TRUE), ], 3)
MSEVars <- row.names(MSEImportance)
MSEVars <- data.frame(MSEVars, stringsAsFactors = FALSE)
colnames(MSEVars) <- "Vars"
# Top 3 variables by IncNodePurity
NodeImportance <- head(Importance[order(Importance$IncNodePurity, decreasing = TRUE), ], 3)
NodeVars <- row.names(NodeImportance)
NodeVars <- data.frame(NodeVars, stringsAsFactors = FALSE)
colnames(NodeVars) <- "Vars"
# Combine both lists and drop duplicates
ImportantVars <- rbind(MSEVars, NodeVars)
ImportantVars <- unique(ImportantVars)
nrow(ImportantVars)
ImportantVars<-as.character(Imp...
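The same top-three selection can be sketched on the importance matrix directly, which keeps the original column name "%IncMSE". This is only an illustration: the values are the ones quoted in the first post on this page, standing in for a fitted model's $importance, and the helper top_n_vars is a hypothetical name, not part of randomForest.

```r
# Importance values copied from the first post on this page (rows V1..V9);
# they stand in for rf$importance from a fitted regression forest.
imp <- matrix(
  c(-0.01683558, 0.04000299, 0.01974636, 0.25020393, 0.03146358,
     0.01717313, -0.00500985, -0.02862065, -0.02431507,
     58.10910, 71.27579, 67.22586, 113.69823, 67.11151,
     66.57246, 62.37103, 66.15369, 54.50013),
  ncol = 2,
  dimnames = list(paste0("V", 1:9), c("%IncMSE", "IncNodePurity"))
)

# Hypothetical helper: names of the n highest-ranked variables for one measure.
top_n_vars <- function(imp, measure, n = 3) {
  ranked <- order(imp[, measure], decreasing = TRUE)
  rownames(imp)[head(ranked, n)]
}

mse_vars  <- top_n_vars(imp, "%IncMSE")        # "V4" "V2" "V5"
node_vars <- top_n_vars(imp, "IncNodePurity")  # "V4" "V2" "V3"

# union() concatenates both vectors and drops duplicates.
important_vars <- union(mse_vars, node_vars)   # "V4" "V2" "V5" "V3"
length(important_vars)                         # 4
```

Working with a plain character vector also avoids the intermediate one-column data frames.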
2011 Aug 04
1
randomForest partial dependence plot variable names
...iables in each model, but I am unable to figure out how to enter the
variable names into my code. I had originally thought to extract them from
the $importance matrix after sorting by metric (e.g. %IncMSE), but the
importance matrix is n by 2 - containing only the data for each metric
(%IncMSE and IncNodePurity). It is clearly linked to the variable names,
but I am unsure how to extract those names for use in scripting. Any
assistance would be greatly appreciated as I am currently typing the
variable names into each partialPlot call for every model I run, and that
is taking a LONG time.
Thanks!
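The variable names are stored as the row names of the importance matrix, so they can be pulled out with rownames() after sorting. A minimal sketch, assuming the importance values quoted in the first post on this page stand in for a model's $importance; the objects rf and mydata in the commented loop are hypothetical placeholders for a fitted forest and its data.

```r
# Importance values copied from the first post on this page (rows V1..V9);
# they stand in for rf$importance from a fitted regression forest.
imp <- matrix(
  c(-0.01683558, 0.04000299, 0.01974636, 0.25020393, 0.03146358,
     0.01717313, -0.00500985, -0.02862065, -0.02431507,
     58.10910, 71.27579, 67.22586, 113.69823, 67.11151,
     66.57246, 62.37103, 66.15369, 54.50013),
  ncol = 2,
  dimnames = list(paste0("V", 1:9), c("%IncMSE", "IncNodePurity"))
)

# Variable names, ordered from most to least important by %IncMSE.
vars <- rownames(imp)[order(imp[, "%IncMSE"], decreasing = TRUE)]
vars  # "V4" "V2" "V5" "V3" "V6" "V7" "V1" "V9" "V8"

# With a fitted model (hypothetical `rf`) and its data (hypothetical `mydata`),
# the names can then drive one partialPlot call per variable:
# for (i in seq_along(vars)) {
#   randomForest::partialPlot(rf, mydata, vars[i], xlab = vars[i])
# }
```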
2010 May 05
1
randomForest: predictor importance (for regressions)
...o get the "standard errors" of the permutation-based importance
measure: rfresult$importanceSD
I have 2 questions:
1. Because I am dealing with regressions, I am getting an importance object
(rfresult$importance) with two columns, labeled "%IncMSE" (the first column)
and "IncNodePurity" (the second column). I assume it's the first one that is
the mean decrease in accuracy due to permutation. Am I correct or am I
wrong? I am confused because ?randomForest says: "For Regression, the first
column is the mean decrease in accuracy and the second the mean decrease in
MSE."...
2010 Jul 13
1
question regarding "varImpPlot" results vs. model$importance data on package "RandomForest"
Hi everyone,
I have another "Random Forest" package question:
- my (presumably incorrect) understanding of the varImpPlot is that it
should plot the "% increase in MSE" and "IncNodePurity" exactly as can be
found from the "importance" section of the model results.
- However, the plot does not, in fact, match the "importance" section
of the random forest model.
E.g., if you use the example given in the ?randomForest, you will see
the plot...