Displaying 20 results from an estimated 1100 matches similar to: "randomForest: predictor importance (for regressions)"
2010 May 05
0
Which column in randomForest importances (for regression) is MSE and which IncNodePurity
I've run the function randomForest with importance=T. All my variables
(predictors and the dependent variable) are numeric.
rf<-randomForest(formula, data=mydata, importance=T, etc.)
my results object "rf" contains predictor importances:
rf$importance
I am seeing two columns:
   %IncMSE     IncNodePurity
V1 -0.01683558 58.10910
V2  0.04000299 71.27579
V3  0.01974636
2010 Jul 13
1
question regarding "varImpPlot" results vs. model$importance data on package "RandomForest"
Hi everyone,
I have another "Random Forest" package question:
- my (presumably incorrect) understanding of the varImpPlot is that it
should plot the "% increase in MSE" and "IncNodePurity" exactly as can be
found from the "importance" section of the model results.
- However, the plot does not, in fact, match the "importance"
2010 Aug 06
1
Error on random forest variable importance estimates
Hello,
I am using the R randomForest package to classify variable stars. I have
a training set of 1755 stars described by (too) many variables. Some of
these variables are highly correlated.
I believe that I understand how randomForest works and how the variable
importance are evaluated (through variable permutations). Here are my
questions.
1) variable importance error? Is there any way
2012 Aug 27
1
interpret the importance output?
> importance(rfor.pdp11_t25.comb1,type=1)
%IncMSE
v1 -0.28956401263
v2 1.92865561147
v3 -0.63443929130
v4 1.58949137047
v5 0.03190940065
I wasn't entirely confident with interpreting these results based on the
documentation.
Could you please interpret?
2010 Apr 28
1
Question on: Random Forest Variable Importance for Regression Problems
I am trying to use the package RandomForest to perform regression.
The variable importance estimates are given as: "%IncMSE" and
"IncNodePurity"
Can anyone explain to me what these refer to and how they are calculated?
I found a lot of information on variable importance measures for
classification problems, but nothing on regression.
Thanks a lot.
Mareike
2009 Apr 13
2
Random Forests Variable Importance Question
I am trying to use the random forests package for classification in R.
The Variable Importance Measures listed are:
-mean raw importance score of variable x for class 0
-mean raw importance score of variable x for class 1
-MeanDecreaseAccuracy
-MeanDecreaseGini
Now I know what these "mean" as in I know their definitions. What I
want to know is how to use them.
What I am trying to
2010 May 08
1
Increasing the font size on axes in trellis
Hello,
the code below gives me the picture I need - but there is one small
thing I can't figure out.
The plot has very small tick mark labels for both axes. I don't mean
the axis labels - they are both good, but what is shown near the tick
marks.
Please help me figure out what parameter I should add to make those
larger. I tried sticking cex.lab=1.3 in different places but it didn't
2007 Dec 18
1
Random forests
Dear all,
I would like to use a tree regression method to analyze my dataset. I
am interested in the fact that random forests creates in-bag and
out-of-bag datasets, but I also need an estimate of support for each
split. That seems hard to do in random forests since each tree is
grown using a subset of the predictor variables.
I was thinking of setting mtry = number of predictor variables,
2011 Aug 04
1
randomForest partial dependence plot variable names
Hello,
I am running randomForest models on a number of species. I would like to be
able to automate the printing of dependence plots for the most important
variables in each model, but I am unable to figure out how to enter the
variable names into my code. I had originally thought to extract them from
the $importance matrix after sorting by metric (e.g. %IncMSE), but the
importance matrix is n
2004 Jul 08
0
randomForest 4.3-0 released
Dear all,
Version 4.3-0 of the randomForest package is now available on CRAN (in
source; binaries will follow in due course). There are some interface
changes and a few new features, as well as bug fixes. For those who had
used previous versions, the important things to note are: 1. there's a
namespace now, and 2. some functions have been renamed. The list of changes
since 4.0-7 (last
2013 May 17
2
Selecting A List of Columns
Dear R Helpers,
I need help with a slightly unusual situation in which I am trying to
select some columns from a data frame. I know how to use the subset
statement with column names as in:
x=as.data.frame(matrix(c(1,2,3,
1,2,3,
1,2,2,
1,2,2,
1,1,1),ncol=3,byrow=T))
all.cols<-colnames(x)
to.keep<-all.cols[1:2]
Kept<-subset(x,select=to.keep)
Kept
2003 Nov 03
1
svm in e1071 package: polynomial vs linear kernel
I am trying to understand what is the difference between linear and
polynomial kernel:
linear: u'*v
polynomial: (gamma*u'*v + coef0)^degree
It would seem that polynomial kernel with gamma = 1; coef0 = 0 and degree
= 1
should be identical to linear kernel, however it gives me significantly
different results for very simple
data set, with linear kernel
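On paper the claim in this snippet holds: substituting gamma = 1, coef0 = 0 and degree = 1 into the polynomial kernel quoted in the post gives back the linear kernel. A short worked check, using only the two formulas from the post (c_0 stands for coef0 and d for degree; these symbols are shorthand introduced here):

```latex
K_{\text{poly}}(u, v) = (\gamma\, u^{\top} v + c_0)^{d}
\quad\Longrightarrow\quad
K_{\text{poly}}(u, v)\big|_{\gamma = 1,\; c_0 = 0,\; d = 1}
  = (1 \cdot u^{\top} v + 0)^{1}
  = u^{\top} v
  = K_{\text{lin}}(u, v)
```

The snippet is cut off before the data and the call are shown, so the reason for the differing fitted results cannot be determined from this listing.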
2011 Aug 30
2
Multivariate Normal: Help wanted!
I have the following function, a MSE calc based on some Multivariate normals:
MV.MSE<-function(n,EP,X,S){
(dmvnorm(X,mean=rep(0,2),I+S+EP)-dmvnorm(X,mean=rep(0,2),I+S))^2
+
1/n*(dmvnorm(X,mean=rep(0,2),1+S+EP/2)*det(4*pi*EP)^-.5-
(dmvnorm(X,mean=rep(0,2),I+S+EP ))^2)}
I can get the MV.MSE for given values of the function e.g
2007 Aug 22
3
integrate
Hi,
I am trying to integrate a function which is approximately constant
over the range of the integration. The function is as follows:
> my.fcn = function(mu){
+ m = 1000
+ z = 0
+ z.mse = 0
+ for(i in 1:m){
+ z[i] = rnorm(1, mu, 1)
+ z.mse = z.mse + (z[i] - mu)^2
+ }
+ return(z.mse/m)
+ }
> my.fcn(-10)
[1] 1.021711
> my.fcn(10)
[1] 0.9995235
> my.fcn(-5)
[1] 1.012727
> my.fcn(5)
2009 Apr 10
1
Random Forests: Question about R^2
Dear Random Forests gurus,
I have a question about R^2 provided by randomForest (for regression).
I don't succeed in finding this information.
In the help file for randomForest under "Value" it says:
rsq: (regression only) - "pseudo R-squared'': 1 - mse / Var(y).
Could someone please explain in somewhat more detail how exactly R^2
is calculated?
Is "mse"
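As a rough illustration of the formula quoted from the help file, the sketch below computes 1 - mse / Var(y) by hand in base R. The vectors y and yhat are made-up values, not output from randomForest, and var() uses the n - 1 denominator; whether the package uses exactly that denominator is not stated in the quoted line, so treat this as an assumption.

```r
# Hypothetical observed values and predictions (made up for illustration only)
y    <- c(3.1, 4.7, 5.2, 6.8, 7.4)
yhat <- c(3.4, 4.5, 5.6, 6.1, 7.9)

# Mean squared error of the predictions
mse <- mean((y - yhat)^2)

# Pseudo R-squared as quoted: 1 - mse / Var(y)
# (var() divides by n - 1; this choice is an assumption of the sketch)
rsq <- 1 - mse / var(y)
rsq
```

A value near 1 means the squared prediction error is small relative to the spread of y; the quantity can go negative when the predictions do worse than simply using the mean of y.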
2012 Nov 13
1
About systemfit package
Dear friends,
I have written the following lines in the R console, which already exist in the pdf
file systemfit:
data( "GrunfeldGreene" )
library( "plm" )
GGPanel <- plm.data( GrunfeldGreene, c( "firm", "year" ) )
greeneSur <- systemfit( invest ~ value + capital, method = "SUR",
data = GGPanel )
greeneSur
I have obtained the following incomplete
2006 Apr 05
1
Combination of Bias and MSE ?
Dear R Users,
My question is overall and not necessarily related to R.
Suppose we face a situation in which MSE (Mean Squared Error) shows desired results but Bias shows undesired ones, or vice versa. How can we evaluate the results? And suppose both MSE and Bias are important to us.
The exact question is whether there is any combined measure of the two metrics above.
Thank you so
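One standard identity bears on this question: for an estimator of a fixed parameter, MSE already combines squared bias and variance. The notation below (theta for the parameter, theta-hat for its estimator) is introduced here and does not come from the post:

```latex
\operatorname{MSE}(\hat{\theta})
  = \mathbb{E}\big[(\hat{\theta} - \theta)^2\big]
  = \operatorname{Var}(\hat{\theta}) + \big(\operatorname{Bias}(\hat{\theta})\big)^2,
\qquad
\operatorname{Bias}(\hat{\theta}) = \mathbb{E}[\hat{\theta}] - \theta
```

So a low MSE alongside a noticeable bias means the variance term is small; whether that trade-off is acceptable depends on the application, which the truncated snippet does not describe.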
2024 Sep 04
2
Calculation of VCV matrix of estimated coefficient
Hi,
I am trying to replicate the R's result for VCV matrix of estimated
coefficients from linear model as below
data(mtcars)
model <- lm(mpg~disp+hp, data=mtcars)
model_summ <-summary(model)
MSE = mean(model_summ$residuals^2)
vcov(model)
Now I want to calculate the same thing manually,
library(dplyr)
X = as.matrix(mtcars[, c('disp', 'hp')] %>% mutate(Intercept =
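The snippet is cut off, so the poster's actual difficulty is unknown. As a minimal sketch of the manual calculation, assuming the usual OLS formula sigma^2 (X'X)^-1 with sigma^2 estimated as the residual sum of squares divided by the residual degrees of freedom (not by n, as mean(residuals^2) would do), the following reproduces vcov() using base R only. The names X, sigma2 and vcv_manual are chosen here for illustration.

```r
data(mtcars)
model <- lm(mpg ~ disp + hp, data = mtcars)

# Design matrix exactly as lm() built it (includes the intercept column)
X <- model.matrix(model)

# Residual variance estimate: RSS / (n - p), not RSS / n
sigma2 <- sum(residuals(model)^2) / df.residual(model)

# Variance-covariance matrix of the coefficient estimates
vcv_manual <- sigma2 * solve(crossprod(X))

# Compare with R's own result (attributes ignored to compare values only)
all.equal(vcv_manual, vcov(model), check.attributes = FALSE)
```

Using model.matrix() avoids building the intercept column by hand and keeps the column order aligned with the coefficients.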
2005 Mar 09
1
nnet abstol
Hi,
I am using nnet to learn transfer functions. For each transfer function I can estimate the best possible Mean Squared Error (MSE). So, rather than trying to grind the MSE to 0, I would like to use abstol to stop training once the best MSE is reached.
Can anyone confirm that the abstol parameter in the nnet function is the MSE, or is it the Sum-of-Squares (SSE)?
Best regards,
Sam.