similar to: Extracting the MSE and % Variance from RandomForest

Displaying 20 results from an estimated 5000 matches similar to: "Extracting the MSE and % Variance from RandomForest"

2004 Apr 05
3
Can't seem to finish a randomForest.... Just goes and goes!
When you have fairly large data, _do not use the formula interface_, as a couple of copies of the data would be made. Try simply: Myforest.rf <- randomForest(Mydata[, -46], Mydata[,46], ntree=100, mtry=7) [Note that you don't need to set proximity (not proximities) or importance to FALSE, as that's the default already.] You might also want to use
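A minimal sketch of the x/y call recommended above. The built-in iris data stand in for Mydata (an assumption, since the poster's data are not available), and mtry = 2 is chosen only because iris has four predictors. The tree-count argument in the randomForest package is spelled ntree.

```r
library(randomForest)

set.seed(1)
x <- iris[, -5]   # predictors only (stand-in for Mydata[, -46])
y <- iris[, 5]    # response, a factor, so this is a classification forest

# x/y interface: skips the extra data copies made by the formula interface
rf_xy <- randomForest(x, y, ntree = 100, mtry = 2)
print(rf_xy)
```

On small data the two interfaces behave alike; the difference matters mainly for memory use on large data frames.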
2004 Apr 05
2
Can't seem to finish a randomForest.... Just goes and goes!
Alternatively, if you can arrive at a sensible ordering of the levels you can declare them ordered factors and make the computation feasible once again. Bill Venables. -----Original Message----- From: r-help-bounces at stat.math.ethz.ch [mailto:r-help-bounces at stat.math.ethz.ch] On Behalf Of Torsten Hothorn Sent: Monday, 5 April 2004 4:27 PM To: David L. Van Brunt, Ph.D. Cc: R-Help Subject:
2005 Oct 27
1
Repost: Examples of "classwt", "strata", and "sampsize" in randomForest?
"classwt" in the current version of the randomForest package doesn't work too well. (It's what was in version 3.x of the original Fortran code by Breiman and Cutler, not the one in the new Fortran code.) I'd advise against using it. "sampsize" and "strata" can be used in conjunction. If "strata" is not specified, the class labels will be used.
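As a rough illustration of using "sampsize" with "strata", the sketch below builds a hypothetical unbalanced two-class problem from iris (50 cases of class "1" against 100 of class "0") and draws the same number of cases from each class for every tree. The data, the per-class sample size of 30, and the object names are all assumptions made for the example.

```r
library(randomForest)

set.seed(1)
# Hypothetical unbalanced 0/1 response derived from iris
y <- factor(ifelse(iris$Species == "setosa", "1", "0"))
x <- iris[, 1:4]

# Stratify on the class labels and sample 30 cases per class for each tree;
# the sampsize entries follow the order of levels(y)
rf_bal <- randomForest(x, y, strata = y, sampsize = c(30, 30), ntree = 200)
rf_bal$confusion
```

Because the class labels are the default strata for classification, passing strata = y here only makes that choice explicit.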
2009 Apr 10
1
Random Forests: Question about R^2
Dear Random Forests gurus, I have a question about R^2 provided by randomForest (for regression). I don't succeed in finding this information. In the help file for randomForest under "Value" it says: rsq: (regression only) - "pseudo R-squared'': 1 - mse / Var(y). Could someone please explain in somewhat more detail how exactly R^2 is calculated? Is "mse"
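The "1 - mse / Var(y)" definition quoted from the help file can be checked by hand. The sketch below fits a regression forest to mtcars (a stand-in data set) and rebuilds the final pseudo R-squared from the out-of-bag MSE. It assumes Var(y) is taken with denominator n rather than n - 1, which is my reading of the package and may differ between versions; with n - 1 the result shifts slightly.

```r
library(randomForest)

set.seed(1)
x <- mtcars[, -1]
y <- mtcars$mpg
rf <- randomForest(x, y, ntree = 200)

# rf$mse and rf$rsq hold one value per tree count; the last entry uses all trees
mse_final  <- rf$mse[rf$ntree]            # out-of-bag mean squared error
var_y      <- mean((y - mean(y))^2)       # variance with denominator n (assumption)
rsq_manual <- 1 - mse_final / var_y

c(mse         = mse_final,
  pct_var     = 100 * rsq_manual,         # hand-computed "% Var explained"
  pkg_pct_var = 100 * rf$rsq[rf$ntree])   # value stored by the package
```

The last elements of rf$mse and rf$rsq are the figures that print(rf) reports as "Mean of squared residuals" and "% Var explained".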
2010 May 05
0
Which column in randomForest importances (for regression) is MSE and which IncNodePurity
I've run the function randomForest with importance=T. All my variables (predictors and the dependent variable) are numeric. rf<-randomForest(formula, data=mydata, importance=T, etc.) my results object "rf" contains predictor importances: rf$importance I am seeing two columns: %IncMSE IncNodePurity V1 -0.01683558 58.10910 V2 0.04000299 71.27579 V3 0.01974636
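For a regression forest fitted with importance=TRUE, the two columns can be pulled out separately with importance(): type = 1 is the permutation-based measure (labelled %IncMSE) and type = 2 is the decrease in node impurity (IncNodePurity). A small sketch on mtcars, used here only as a stand-in for the poster's data:

```r
library(randomForest)

set.seed(1)
rf <- randomForest(mtcars[, -1], mtcars$mpg, importance = TRUE, ntree = 200)

colnames(rf$importance)   # "%IncMSE" "IncNodePurity"

raw    <- importance(rf, type = 1, scale = FALSE)  # same values as rf$importance[, "%IncMSE"]
scaled <- importance(rf, type = 1)                 # divided by the measure's standard error
purity <- importance(rf, type = 2)                 # IncNodePurity
```

The scaled version is what importance() returns by default, so it will not match the raw numbers stored in rf$importance.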
2005 Oct 27
1
Repost: Examples of "classwt", "strata", and "sampsize" in randomForest?
Sorry for the repost, but I've really been looking, and can't find any syntax direction on this issue... Just browsing the documentation, and searching the list came up short... I have some unbalanced data and was wondering if, in a "0" v "1" classification forest, some combo of these options might yield better predictions when the proportion of one class is low (less
2007 Apr 29
1
randomForest gives different results for formula call v. x, y methods. Why?
Just out of curiosity, I took the default "iris" example in the RF helpfile... but seeing the admonition against using the formula interface for large data sets, I wanted to play around a bit to see how the various options affected the output. Found something interesting I couldn't find documentation for... Just like the example... > set.seed(12) # to be sure I have
2004 May 15
0
" cannot allocate vector of length 1072693248"
Andy; Well, that about does it.... I'm copying this one back to the list for the benefit of those who may hit this thread while searching the archives. Your changes to the code run just fine on my Windows machine, but give the vector length error on the G4 whether I'm using the OS X build of R (as in Raqua) or the X11 build (for Darwin). It is worth noting that I have nearly twice
2004 May 21
1
Memory Leak in OS X versions? (PR#6903)
Full_Name: David L. Van Brunt Version: 1.8-1.9 beta OS: OS X 10.3 Submission from: (NULL) (68.74.58.109) As posted on R-Help (after which another user replicated the problem): --------------- This is the conclusion from a prior thread ([R] " cannot allocate vector of length 1072693248") which ended with no other answer but that there must be a problem in the OS X version of R, or in
2011 Nov 16
0
Problem tuning RandomForest, an unexpected result
Dear Researchers, I am using RF (in regression mode) to analyze several metrics extracted from images. I am tuning RF with a loop over different ranges of mtry, tree and nodesize, keeping the lowest value of MSE-OOB: mtry from 1 to 5, nodesize from 1 to 10, tree from 1 to 500, using this paper as a reference: Palmer, D. S., O'Boyle, N. M., Glen, R. C., & Mitchell, J. B. O. (2007). Random Forest Models
2010 Jun 29
1
Model validation and penalization with rms package
I've been using Frank Harrell's rms package to do bootstrap model validation. Is it the case that the optimum penalization may still give a model which is substantially overfitted? I calculated corrected R^2, optimism in R^2, and corrected slope for various penalties for a simple example: x1 <- rnorm(45) x2 <- rnorm(45) x3 <- rnorm(45) y <- x1 + 2*x2 + rnorm(45,0,3) ols0 <- ols(y
2010 May 05
1
randomForest: predictor importance (for regressions)
I have a question about predictor importances in randomForest. Once I've run randomForest and got my object, I get their importances: rfresult$importance I also get the "standard errors" of the permutation-based importance measure: rfresult$importanceSD I have 2 questions: 1. Because I am dealing with regressions, I am getting an importance object (rfresult$importance) with two
2012 Mar 08
2
Regarding randomForest regression
Sir, This query is related to randomForest regression using R. I have a dataset called qsar.arff which I use as my training set and then I run the following function - rf=randomForest(x=train,y=trainy,xtest=train,ytest=trainy,ntree=500) where train is a matrix of predictors without the column to be predicted (the target column), trainy is the target column. I feed the same data
2012 Oct 11
0
Error with cForest
All -- I have been trying to work with the 'Party' package using R v2.15.1 and have cobbled together a (somewhat) functioning code from examples on the web. I need to run a series of unbiased, conditional, cForest tests on several subsets of data which I have made into a loop. The results ideally will be saved to an output file in matrix form. The two questions regarding the script in
2010 Jul 13
1
question regarding "varImpPlot" results vs. model$importance data on package "RandomForest"
Hi everyone, I have another "Random Forest" package question: - my (presumably incorrect) understanding of the varImpPlot is that it should plot the "% increase in MSE" and "IncNodePurity" exactly as can be found from the "importance" section of the model results. - However, the plot does not, in fact, match the "importance"
2009 Apr 07
1
Concern with randomForest
Hi all, When running a randomForest run using the following command: forestplas=randomForest(Prev~.,data=plas,ntree=200000) print(forestplas) I get the following result: Call: randomForest(formula = Prev ~ ., data = plas, ntree = 2e+05, importance = TRUE) Type of random forest: regression Number of trees: 2e+05 No. of variables tried at each split: 5
2007 Oct 11
1
random forest mtry and mse
I have been using random forest on a data set with 226 sites and 36 explanatory variables (continuous and categorical). When I use "tune.randomforest" to determine the best value to use in "mtry" there is a fairly consistent and steady decrease in MSE, with the optimum of "mtry" usually equal to 1. Why would that occur, and what does it signify? What I would
2008 Sep 09
1
randomForest
I am combining many different random forest objects run on the same data set using the combine ( ) function. After combining the forests I am not sure whether the variable importance, local importance, and rsq predictors are recalculated for the new random forest object or are calculated individually for each tree ensemble? Is it possible to calculate these predictors for the new random forest
2008 Jun 15
1
randomForest, 'No forest component...' error while calling Predict()
Dear R-users, While making a prediction using the randomForest function (package randomForest) I'm getting the following error message: "Error in predict.randomForest(model, newdata = CV) : No forest component in the object" Here's my complete code. For reproducing this task, please find my 2 data sets attached ( http://www.nabble.com/file/p17855119/data.rar data.rar ).
2013 Jan 11
0
Error with looping through a list of strings as variables
Dear R users: I have been trying to figure out how to include string variables in a for loop to run multiple random forests with little success. The current code returns the following error: Error in trafo(data = data, numeric_trafo = numeric_trafo, factor_trafo = factor_trafo, : data class character is not supported In addition: Warning message: In storage.mode(RET@predict_trafo) <-