thr3ads.net - similar to: "randomForest: predictor importance (for regressions)"

Displaying 20 results from an estimated 1100 matches similar to: "randomForest: predictor importance (for regressions)"

Which column in randomForest importances (for regression) is MSE and which IncNodePurity

2010 May 05

I've run the function randomForest with importance=T. All my variables (predictors and the dependent variable) are numeric.
    rf<-randomForest(formula, data=mydata, importance=T, etc.)
My results object "rf" contains predictor importances: rf$importance. I am seeing two columns:
           %IncMSE IncNodePurity
    V1 -0.01683558      58.10910
    V2  0.04000299      71.27579
    V3  0.01974636

question regarding "varImpPlot" results vs. model$importance data on package "RandomForest"

2010 Jul 13

Hi everyone, I have another "Random Forest" package question: - my (presumably incorrect) understanding of the varImpPlot is that it should plot the "% increase in MSE" and "IncNodePurity" exactly as can be found from the "importance" section of the model results. - However, the plot does not, in fact, match the "importance"

Error on random forest variable importance estimates

2010 Aug 06

Hello, I am using the R randomForest package to classify variable stars. I have a training set of 1755 stars described by (too) many variables. Some of these variables are highly correlated. I believe that I understand how randomForest works and how the variable importance are evaluated (through variable permutations). Here are my questions. 1) variable importance error? Is there any ways

interpret the importance output?

2012 Aug 27

> importance(rfor.pdp11_t25.comb1,type=1)
           %IncMSE
v1  -0.28956401263
v2   1.92865561147
v3  -0.63443929130
v4   1.58949137047
v5   0.03190940065
I wasn't entirely confident with interpreting these results based on the documentation. Could you please interpret?

Question on: Random Forest Variable Importance for Regression Problems

2010 Apr 28

I am trying to use the package RandomForest to perform regression. The variable importance estimates are given as "%IncMSE" and "IncNodePurity". Can anyone explain to me what these refer to and how they are calculated? I found a lot of information on variable importance measures for classification problems, but nothing on regression. Thanks a lot. Mareike

Random Forests Variable Importance Question

2009 Apr 13

I am trying to use the random forests package for classification in R. The Variable Importance Measures listed are: -mean raw importance score of variable x for class 0 -mean raw importance score of variable x for class 1 -MeanDecreaseAccuracy -MeanDecreaseGini Now I know what these "mean" as in I know their definitions. What I want to know is how to use them. What I am trying to

Increasing the font size on axes in trellis

2010 May 08

Hello, the code below gives me the picture I need - but there is one small thing I can't figure out. The plot has very small tick mark labels for both axes. I don't mean the axis labels - they are both good, but what is shown near the tick marks. Please help me figure out what parameter I should add to make those larger. I tried sticking cex.lab=1.3 in different places but it didn't

Random forests

2007 Dec 18

Dear all, I would like to use a tree regression method to analyze my dataset. I am interested in the fact that random forests creates in-bag and out-of-bag datasets, but I also need an estimate of support for each split. That seems hard to do in random forests since each tree is grown using a subset of the predictor variables. I was thinking of setting mtry = number of predictor variables,

randomForest partial dependence plot variable names

2011 Aug 04

Hello, I am running randomForest models on a number of species. I would like to be able to automate the printing of dependence plots for the most important variables in each model, but I am unable to figure out how to enter the variable names into my code. I had originally thought to extract them from the $importance matrix after sorting by metric (e.g. %IncMSE), but the importance matrix is n

randomForest 4.3-0 released

2004 Jul 08

Dear all, Version 4.3-0 of the randomForest package is now available on CRAN (in source; binaries will follow in due course). There are some interface changes and a few new features, as well as bug fixes. For those who had used previous versions, the important things to note are: 1. there's a namespace now, and 2. some functions have been renamed. The list of changes since 4.0-7 (last

Selecting A List of Columns

2013 May 17

Dear R Helpers, I need help with a slightly unusual situation in which I am trying to select some columns from a data frame. I know how to use the subset statement with column names as in: x=as.data.frame(matrix(c(1,2,3, 1,2,3, 1,2,2, 1,2,2, 1,1,1),ncol=3,byrow=T)) all.cols<-colnames(x) to.keep<-all.cols[1:2] Kept<-subset(x,select=to.keep) Kept

svm in e1071 package: polynomial vs linear kernel

2003 Nov 03

I am trying to understand what is the difference between linear and polynomial kernel: linear: u'*v polynomial: (gamma*u'*v + coef0)^degree It would seem that polynomial kernel with gamma = 1; coef0 = 0 and degree = 1 should be identical to linear kernel, however it gives me significantly different results for very simple data set, with linear kernel

Multivariate Normal: Help wanted!

2011 Aug 30

I have the following function, a MSE calc based on some Multivariate normals: MV.MSE<-function(n,EP,X,S){ (dmvnorm(X,mean=rep(0,2),I+S+EP)-dmvnorm(X,mean=rep(0,2),I+S))^2 + 1/n*(dmvnorm(X,mean=rep(0,2),1+S+EP/2)*det(4*pi*EP)^-.5- (dmvnorm(X,mean=rep(0,2),I+S+EP ))^2)} I can get the MV.MSE for given values of the function e.g

integrate

2007 Aug 22

Hi, I am trying to integrate a function which is approximately constant over the range of the integration. The function is as follows:
> my.fcn = function(mu){
+ m = 1000
+ z = 0
+ z.mse = 0
+ for(i in 1:m){
+ z[i] = rnorm(1, mu, 1)
+ z.mse = z.mse + (z[i] - mu)^2
+ }
+ return(z.mse/m)
+ }
> my.fcn(-10)
[1] 1.021711
> my.fcn(10)
[1] 0.9995235
> my.fcn(-5)
[1] 1.012727
> my.fcn(5)

Random Forests: Question about R^2

2009 Apr 10

Dear Random Forests gurus, I have a question about R^2 provided by randomForest (for regression). I don't succeed in finding this information. In the help file for randomForest under "Value" it says: rsq: (regression only) - "pseudo R-squared'': 1 - mse / Var(y). Could someone please explain in somewhat more detail how exactly R^2 is calculated? Is "mse"

About systemfit package

2012 Nov 13

Dear friends, I have written the following lines in the R console, which already exist in the systemfit PDF file: data( "GrunfeldGreene" ) library( "plm" ) GGPanel <- plm.data( GrunfeldGreene, c( "firm", "year" ) ) greeneSur <- systemfit( invest ~ value + capital, method = "SUR", + data = GGPanel ) greenSur I have obtained the following incomplete

Combination of Bias and MSE ?

2006 Apr 05

Dear R Users, My question is general and not necessarily related to R. Suppose we face a situation in which MSE (Mean Squared Error) shows desired results but Bias shows undesired ones, or the reverse. How can we evaluate the results? And suppose both MSE and Bias are important to us. The exact question is whether there is any combined measure of the two metrics above. Thank you so

Calculation of VCV matrix of estimated coefficient

2024 Sep 04

Hi, I am trying to replicate the R's result for VCV matrix of estimated coefficients from linear model as below data(mtcars) model <- lm(mpg~disp+hp, data=mtcars) model_summ <-summary(model) MSE = mean(model_summ$residuals^2) vcov(model) Now I want to calculate the same thing manually, library(dplyr) X = as.matrix(mtcars[, c('disp', 'hp')] %>% mutate(Intercept =

nnet abstol

2005 Mar 09

Hi, I am using nnet to learn transfer functions. For each transfer function I can estimate the best possible Mean Squared Error (MSE). So, rather than trying to grind the MSE to 0, I would like to use abstol to stop training once the best MSE is reached. Can anyone confirm that the abstol parameter in the nnet function is the MSE, or is it the Sum-of-Squares (SSE)? Best regards, Sam.
