Displaying 20 results from an estimated 5000 matches similar to: "Extracting the MSE and % Variance from RandomForest"
2004 Apr 05
3
Can't seem to finish a randomForest.... Just goes and goes!
When you have fairly large data, _do not use the formula interface_, as a
couple of copies of the data would be made. Try simply:
Myforest.rf <- randomForest(Mydata[, -46], Mydata[, 46],
                            ntree=100, mtry=7)
[Note that you don't need to set proximity (not proximities) or importance
to FALSE, as that's the default already.]
You might also want to use
2004 Apr 05
2
Can't seem to finish a randomForest.... Just goes and goes!
Alternatively, if you can arrive at a sensible ordering of the levels
you can declare them ordered factors and make the computation feasible
once again.
Bill Venables.
-----Original Message-----
From: r-help-bounces at stat.math.ethz.ch
[mailto:r-help-bounces at stat.math.ethz.ch] On Behalf Of Torsten Hothorn
Sent: Monday, 5 April 2004 4:27 PM
To: David L. Van Brunt, Ph.D.
Cc: R-Help
Subject:
2005 Oct 27
1
Repost: Examples of "classwt", "strata", and "sampsize" in randomForest?
"classwt" in the current version of the randomForest package doesn't work
too well. (It's what was in version 3.x of the original Fortran code by
Breiman and Cutler, not the one in the new Fortran code.) I'd advise
against using it.
"sampsize" and "strata" can be used in conjunction. If "strata" is not
specified, the class labels will be used.
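
A minimal sketch of how "sampsize" and "strata" might be combined for an unbalanced two-class problem. The data, the variable names (x, y, n0, n1) and the per-class sample size of 15 are made up for illustration and are not taken from the thread; "strata = y" is written out only to make explicit what is described above as the default.

```r
library(randomForest)

set.seed(1)
# Hypothetical unbalanced data: 180 cases of class "0", 20 of class "1"
n0 <- 180
n1 <- 20
x <- data.frame(a = c(rnorm(n0), rnorm(n1, mean = 1.5)),
                b = rnorm(n0 + n1))
y <- factor(c(rep("0", n0), rep("1", n1)))

# For each tree, draw 15 cases from each class (one sampsize entry per
# stratum level), so the rare class is not swamped by the common one
rf <- randomForest(x, y, ntree = 200, strata = y, sampsize = c(15, 15))

# Out-of-bag confusion matrix, with per-class error in the last column
print(rf$confusion)
```

Whether balanced sampling actually improves predictions for the rare class depends on the data, so the OOB confusion matrix is worth comparing against a run without "sampsize".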
2009 Apr 10
1
Random Forests: Question about R^2
Dear Random Forests gurus,
I have a question about R^2 provided by randomForest (for regression).
I don't succeed in finding this information.
In the help file for randomForest under "Value" it says:
rsq: (regression only) - "pseudo R-squared": 1 - mse / Var(y).
Could someone please explain in somewhat more detail how exactly R^2
is calculated?
Is "mse"
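
As a rough illustration of the help-file formula quoted above, the sketch below extracts the out-of-bag MSE and pseudo R-squared from a regression forest and recomputes them by hand. The simulated data and the names (x, y, oob_mse, manual_rsq, and so on) are hypothetical. The hand-computed R-squared can differ from the stored one in the third decimal or so; my understanding is that the package uses a variance with divisor n rather than n - 1, but treat that as an assumption.

```r
library(randomForest)

set.seed(42)
# Hypothetical regression data
n <- 200
x <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
y <- 2 * x$x1 - x$x2 + rnorm(n)

rf <- randomForest(x, y, ntree = 300)

# mse and rsq hold one value per number of trees; take the last entry
oob_mse <- rf$mse[rf$ntree]
oob_rsq <- rf$rsq[rf$ntree]

# Recompute from the out-of-bag predictions stored in rf$predicted
manual_mse <- mean((y - rf$predicted)^2)
manual_rsq <- 1 - manual_mse / var(y)

# The "% Var explained" shown by print(rf) is 100 times the final rsq
pct_var_explained <- 100 * oob_rsq
print(c(oob_mse = oob_mse, oob_rsq = oob_rsq,
        manual_mse = manual_mse, manual_rsq = manual_rsq))
```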
2010 May 05
0
Which column in randomForest importances (for regression) is MSE and which IncNodePurity
I've run the function randomForest with importance=T. All my variables
(predictors and the dependent variable) are numeric.
rf<-randomForest(formula, data=mydata, importance=T, etc.)
my results object "rf" contains predictor importances:
rf$importance
I am seeing two columns:
       %IncMSE IncNodePurity
V1 -0.01683558      58.10910
V2  0.04000299      71.27579
V3  0.01974636
2005 Oct 27
1
Repost: Examples of "classwt", "strata", and "sampsize" in randomForest?
Sorry for the repost, but I've really been looking, and can't find any
syntax direction on this issue...
Just browsing the documentation, and searching the list came up short... I
have some unbalanced data and was wondering if, in a "0" v "1"
classification forest, some combo of these options might yield better
predictions when the proportion of one class is low (less
2007 Apr 29
1
randomForest gives different results for formula call v. x, y methods. Why?
Just out of curiosity, I took the default "iris" example in the RF
helpfile...
but seeing the admonition against using the formula interface for large data
sets, I wanted to play around a bit to see how the various options affected
the output. Found something interesting I couldn't find documentation for...
Just like the example...
> set.seed(12) # to be sure I have
2004 May 15
0
" cannot allocate vector of length 1072693248"
Andy;
Well, that about does it....
I'm copying this one back to the list for the benefit of those who may hit
this thread while searching the archives. Your changes to the code run just
fine on my Windows machine, but give the vector length error on the G4
whether I'm using the OS X build of R (as in Raqua) or the X11 build (for
Darwin). It is worth noting that I have nearly twice
2004 May 21
1
Memory Leak in OS X versions? (PR#6903)
Full_Name: David L. Van Brunt
Version: 1.8-1.9 beta
OS: OS X 10.3
Submission from: (NULL) (68.74.58.109)
As posted on R-Help (after which another user replicated the problem):
---------------
This is the conclusion from a prior thread ([R] " cannot allocate vector of
length 1072693248") which ended with no other answer but that there must be
a problem in the OS X version of R, or in
2011 Nov 16
0
problem to tunning RandomForest, an unexpected result
Dear Researchers,
I am using RF (in regression mode) to analyze several metrics extracted from
images. I am tuning RF with a loop over different ranges of mtry, tree
and nodesize, selecting the lowest value of MSE-OOB
mtry from 1 to 5
nodesize from 1 to 10
tree from 1 to 500
using this paper as a reference
Palmer, D. S., O'Boyle, N. M., Glen, R. C., & Mitchell, J. B. O. (2007).
Random Forest Models
2010 Jun 29
1
Model validation and penalization with rms package
I've been using Frank Harrell's rms package to do bootstrap model
validation. Is it the case that the optimum penalization may still
give a model which is substantially overfitted?
I calculated corrected R^2, optimism in R^2, and corrected slope for
various penalties for a simple example:
x1 <- rnorm(45)
x2 <- rnorm(45)
x3 <- rnorm(45)
y <- x1 + 2*x2 + rnorm(45,0,3)
ols0 <- ols(y
2010 May 05
1
randomForest: predictor importance (for regressions)
I have a question about predictor importances in randomForest.
Once I've run randomForest and got my object, I get their importances:
rfresult$importance
I also get the "standard errors" of the permutation-based importance
measure: rfresult$importanceSD
I have 2 questions:
1. Because I am dealing with regressions, I am getting an importance object
(rfresult$importance) with two
2012 Mar 08
2
Regarding randomForest regression
Sir,
This query is related to randomForest regression using R.
I have a dataset called qsar.arff which I use as my training set and
then I run the following function -
rf=randomForest(x=train,y=trainy,xtest=train,ytest=trainy,ntree=500)
where train is a matrix of predictors without the column to be
predicted (the target column), trainy is the target column. I feed the same
data
2012 Oct 11
0
Error with cForest
All --
I have been trying to work with the 'Party' package using R v2.15.1 and have cobbled together a (somewhat) functioning code from examples on the web. I need to run a series of unbiased, conditional, cForest tests on several subsets of data which I have made into a loop. The results ideally will be saved to an output file in matrix form. The two questions regarding the script in
2010 Jul 13
1
question regarding "varImpPlot" results vs. model$importance data on package "RandomForest"
Hi everyone,
I have another "Random Forest" package question:
- my (presumably incorrect) understanding of the varImpPlot is that it
should plot the "% increase in MSE" and "IncNodePurity" exactly as can be
found from the "importance" section of the model results.
- However, the plot does not, in fact, match the "importance"
2009 Apr 07
1
Concern with randomForest
Hi all,
When running a randomForest run using the following command:
forestplas=randomForest(Prev~.,data=plas,ntree=200000)
print(forestplas)
I get the following result:
Call:
randomForest(formula = Prev ~ ., data = plas, ntree = 2e+05,
importance = TRUE)
Type of random forest: regression
Number of trees: 2e+05
No. of variables tried at each split: 5
2007 Oct 11
1
random forest mtry and mse
I have been using random forest on a data set with 226 sites and 36
explanatory variables (continuous and categorical). When I use
"tune.randomforest" to determine the best value to use in "mtry" there
is a fairly consistent and steady decrease in MSE, with the optimum of
"mtry" usually equal to 1. Why would that occur, and what does it
signify? What I would
2008 Sep 09
1
randomForest
I am combining many different random forest objects run on the same data set
using the combine ( ) function. After combining the forests I am not sure
whether the variable importance, local importance, and rsq predictors are
recalculated for the new random forest object or are calculated
individually for each tree ensemble? Is it possible to calculate these
predictors for the new random forest
2008 Jun 15
1
randomForest, 'No forest component...' error while calling Predict()
Dear R-users,
While making a prediction using the randomForest function (package
randomForest) I'm getting the following error message:
"Error in predict.randomForest(model, newdata = CV) : No forest component
in the object"
Here's my complete code. For reproducing this task, please find my 2 data
sets attached ( http://www.nabble.com/file/p17855119/data.rar data.rar ).
2013 Jan 11
0
Error with looping through a list of strings as variables
Dear R users:
I have been trying to figure out how to include string variables in a for
loop to run multiple random forests with little success. The current code
returns the following error:
Error in trafo(data = data, numeric_trafo = numeric_trafo, factor_trafo =
factor_trafo, :
data class character is not supported
In addition: Warning message:
In storage.mode(RET@predict_trafo) <-