thr3ads.net - similar to: "Question on: Random Forest Variable Importance for Regression Problems"

Displaying 20 results from an estimated 4000 matches similar to: "Question on: Random Forest Variable Importance for Regression Problems"

randomForest partial dependence plot variable names

2011 Aug 04

Hello, I am running randomForest models on a number of species. I would like to be able to automate the printing of dependence plots for the most important variables in each model, but I am unable to figure out how to enter the variable names into my code. I had originally thought to extract them from the $importance matrix after sorting by metric (e.g. %IncMSE), but the importance matrix is n

randomForest: predictor importance (for regressions)

2010 May 05

I have a question about predictor importances in randomForest. Once I've run randomForest and got my object, I get their importances: rfresult$importance. I also get the "standard errors" of the permutation-based importance measure: rfresult$importanceSD. I have 2 questions: 1. Because I am dealing with regressions, I am getting an importance object (rfresult$importance) with two

question regarding "varImpPlot" results vs. model$importance data on package "RandomForest"

2010 Jul 13

Hi everyone, I have another "Random Forest" package question: - my (presumably incorrect) understanding of the varImpPlot is that it should plot the "% increase in MSE" and "IncNodePurity" exactly as can be found from the "importance" section of the model results. - However, the plot does not, in fact, match the "importance"

Selecting A List of Columns

2013 May 17

Dear R Helpers, I need help with a slightly unusual situation in which I am trying to select some columns from a data frame. I know how to use the subset statement with column names as in:
x=as.data.frame(matrix(c(1,2,3, 1,2,3, 1,2,2, 1,2,2, 1,1,1),ncol=3,byrow=T))
all.cols<-colnames(x)
to.keep<-all.cols[1:2]
Kept<-subset(x,select=to.keep)
Kept

Which column in randomForest importances (for regression) is MSE and which IncNodePurity

2010 May 05

I've run the function randomForest with importance=T. All my variables (predictors and the dependent variable) are numeric.
rf<-randomForest(formula, data=mydata, importance=T, etc.)
My results object "rf" contains predictor importances: rf$importance
I am seeing two columns:
       %IncMSE IncNodePurity
V1 -0.01683558      58.10910
V2  0.04000299      71.27579
V3  0.01974636

Random Forest Variable Importance Interpretation

2009 Jun 24

Hi, I am trying to explore the use of random forests for regression to identify the important environmental/microclimate variables involved in predicting the abundance of a species in different habitats. There are approx. 40 variables and between 200 and 500 data points, depending on the dataset. I have successfully used the randomForest package to conduct the analysis and looked at the %IncMSE

randomForest - NaN in %IncMSE

2011 Sep 20

Hi, I am having a problem using varImpPlot in randomForest. I get the error message "Error in plot.window(xlim = xlim, ylim = ylim, log = "") : need finite 'xlim' values". When I print $importance, several variables have NaN under %IncMSE. There are no NaNs in the original data. Can someone help me figure out what is happening here? Thanks!

interpret the importance output?

2012 Aug 27

> importance(rfor.pdp11_t25.comb1,type=1)
          %IncMSE
v1 -0.28956401263
v2  1.92865561147
v3 -0.63443929130
v4  1.58949137047
v5  0.03190940065
I wasn't entirely confident interpreting these results based on the documentation. Could you please interpret them?

Error on random forest variable importance estimates

2010 Aug 06

Hello, I am using the R randomForest package to classify variable stars. I have a training set of 1755 stars described by (too) many variables. Some of these variables are highly correlated. I believe that I understand how randomForest works and how the variable importances are evaluated (through variable permutations). Here are my questions. 1) Variable importance error? Is there any way

variable importance in Random Forest

2010 Apr 29

Hi, Dear Andy, I ran randomForest in R and got the following results for variable importance. What is the meaning of MeanDecreaseAccuracy and MeanDecreaseGini? I found they are raw values; they are not scaled to 1, right? Which column is most similar to the variable rel.influence in Boosting? Thanks so much! > fit$importance 0 1

Variable Importance - Random Forest

2007 Aug 24

Hello, I am trying to explore the use of random forests for classification and am uncertain about the interpretation of the importance measurements. With the option "importance = T" in the randomForest call, the resulting 'importance' element matrix has four columns with the following headings: 0 - mean raw importance score of variable x for class 0 (where

rpart

2004 Jun 04

Hello everyone, I'm a newbie to R and to CART so I hope my questions don't seem too stupid. 1.) My first question concerns the rpart() method. Which method does rpart use in order to get the best split - entropy impurity, Bayes error (min. error) or Gini index? Is there a way to make it use the entropy impurity? The second and third question concern the output of the printcp() function.

Random Forest, Giving More Importance to Some Data

2013 Mar 24

Dear All, I am using randomForest to predict the final selling price of some items. As often happens, I have a lot of (noisy) historical data, but the question is not so much about data cleaning. The dataset for which I need to carry out some predictions consists of fairly recent sales or even some sales that will take place in the near future. As a consequence, historical data should be somehow

Random Forest prediction questions

2010 Mar 01

Hi, I need help with the randomForest prediction. I ran the following code:
> iris.rf <- randomForest(Species ~ ., data=iris,
> importance=TRUE,keep.forest=TRUE, proximity=TRUE)
> pr<-predict(iris.rf,iris,predict.all=T)
> iris.rf$votes[53,]
    setosa versicolor  virginica
 0.0000000  0.8074866  0.1925134
> table(pr$individual[53,])/500
versicolor virginica
0.928

Random Forest Variable Importance

2009 Mar 27

Hello, I have a Random Forest object: iris.rf (importance = TRUE). What is the difference between "iris.rf$importance" and "importance(iris.rf)"? Thank you in advance, Best, Li GUO

Regarding variable importance in the randomForest package

2010 Mar 16

For anyone who is knowledgeable about the randomForest package in R, I have a question: When I look at the variable importance for data, I see that my response variable is included along with my predictor variables. That is, I am getting a MeanDecreaseGini for my response variable, and therefore it seems as though it is being treated as a predictor variable. my code (just in case it helps) :

Random Forest % Variation vs Psuedo-R^2?

2009 Jun 08

Hi all (and Andy!), When running a randomForest run in R, I get the last part of an output (with do.trace=T) that looks like this:
1993 | 0.04606 130.43 |
1994 | 0.04605 130.40 |
1995 | 0.04605 130.43 |
1996 | 0.04605 130.43 |
1997 | 0.04606 130.44 |
1998 | 0.04607 130.47 |
1999 | 0.04606 130.46 |
2000 | 0.04605 130.42 |
With the first column representing the

specifying x-axis scale on random forest variable importance plot

2008 Oct 02

I am new to R and am using the random forest package. Is there a way to specify the x-axis scale range for the variable importance plot? Many thanks. -alison

Random Forest %var(y)

2008 Jul 05

The verbose option gives a display like:
> rf.500 <-
+ randomForest(new.x,trn.y,do.trace=20,ntree=100,nodesize=500,
+ importance=T)
     |      Out-of-bag   |
Tree |      MSE  %Var(y) |
  20 |   0.9279   100.84 |
What is the meaning of %var(y)>100%? I expected that to correspond to a model that was worse than random, but the predictions seem much better than that on

Problem with Random Forest predict

2009 Apr 28

I am trying to run a partialPlot with Random Forest (as I have done many times before). First I run my forest... Cell is a 6 level factor that is the dependent variable - all other variables are predictors, most of these are factors as well. predCell<-randomForest(x=tempdata[-match("Cell",names(tempdata))],y=tempdata$Cell,importance=T) Then I try my partial plot to look at the
