Displaying 20 results from an estimated 10000 matches similar to: "More digits in prediction using random forest object"
2012 Dec 03
2
Different results from randomForest with the test option and using the predict function
Hello R Gurus,
I am perplexed by the different results I obtained when I ran code like
this:
set.seed(100)
test1<-randomForest(BinaryY~., data=Xvars, trees=51, mtry=5, seed=200)
predict(test1, newdata=cbind(NewBinaryY, NewXs), type="response")
and this code:
set.seed(100)
test2<-randomForest(BinaryY~., data=Xvars, trees=51, mtry=5, seed=200,
xtest=NewXs, ytest=NewBinaryY)
The
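A side note and a minimal sketch for putting the two calls on the same footing: trees= and seed= are not randomForest arguments (the number of trees is set with ntree= and the RNG with set.seed()), so the calls above were most likely run with the default ntree=500, and when xtest is supplied the forest is not kept unless keep.forest=TRUE. The sketch assumes Xvars holds only the predictors and that BinaryY/NewBinaryY are factor responses:
library(randomForest)
set.seed(100)
fit1 <- randomForest(x = Xvars, y = BinaryY, ntree = 51, mtry = 5)
p1 <- predict(fit1, newdata = NewXs, type = "response")

set.seed(100)
fit2 <- randomForest(x = Xvars, y = BinaryY, ntree = 51, mtry = 5,
                     xtest = NewXs, ytest = NewBinaryY)
p2 <- fit2$test$predicted   # test-set predictions made while the forest grew

table(p1, p2)               # compare the two sets of predicted classes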
2008 May 21
1
How to use the classwt parameter in randomForest
Hi,
I am trying to model a dataset with the response variable Y, which has
6 levels {Great, Greater, Greatest, Weak, Weaker, Weakest}, and
predictor variables X (a mix of continuous and factor variables), using
random forests in R. The variable Y acts like an ordinal variable, but
I recoded it as a factor variable.
I ran a simulation and got an OOB error rate estimate of 60%. I validated
against some
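A minimal sketch of passing classwt, assuming the data sit in a frame dat with the 6-level factor Y (the weights below are purely illustrative); a stratified sampsize, also sketched, is a common alternative for unbalanced classes:
library(randomForest)
# order of the weights must follow levels(dat$Y)
wts <- c(Great = 1, Greater = 1, Greatest = 1,
         Weak = 2, Weaker = 3, Weakest = 5)
set.seed(1)
fit <- randomForest(Y ~ ., data = dat, classwt = wts, ntree = 500)

# alternative: draw the same number of cases from every class for each tree
xvars <- dat[, setdiff(names(dat), "Y")]
n.min <- min(table(dat$Y))
fit2 <- randomForest(xvars, dat$Y, strata = dat$Y,
                     sampsize = rep(n.min, nlevels(dat$Y)), ntree = 500)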
2008 Mar 07
2
error in random forest
Hi,
I get the following error when I try to predict the probabilities of a
test sample:
Error in predict.randomForest(fit.EBA.OM.rf.50, x.OM, type = "prob") :
New factor levels not present in the training data
I have about 630 predictor variables in the dataset x.OM (25 are factor
variables and the rest are continuous). Any ideas on
how to track this down?
Thank you,
Nagu
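One common cause and a hedged sketch of a fix, assuming the training frame is available under the hypothetical name train: a factor column in x.OM contains a level that never occurred in training, so forcing every shared factor to use the training levels (unseen values become NA, which can then be imputed or dropped) lets predict() run:
library(randomForest)
fac    <- names(train)[sapply(train, is.factor)]
common <- intersect(fac, names(x.OM))
for (v in common) {
  x.OM[[v]] <- factor(x.OM[[v]], levels = levels(train[[v]]))
}
colSums(is.na(x.OM[common]))   # columns where unseen levels became NA
# na.roughfix() from randomForest imputes the NAs by column medians/modes
pred <- predict(fit.EBA.OM.rf.50, na.roughfix(x.OM), type = "prob")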
2008 Feb 25
1
To get more digits in precision of predict function of randomForests
Hi,
I am using randomForest for a classification problem. The predict
function in the randomForest library, when asked to return the
probabilities, has a precision of two digits after the decimal point. I need
at least four digits of precision for the predicted probabilities. How
do I achieve this?
Thank you,
Nagu
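A short sketch of the two things that usually matter here, assuming a fitted classification forest fit and a test frame newdat (hypothetical names): the matrix returned by predict(..., type = "prob") holds the raw vote fractions, so extra digits are usually only lost in printing, and the underlying granularity is 1/ntree, so more trees give finer probabilities:
library(randomForest)
p <- predict(fit, newdat, type = "prob")   # numeric matrix of vote fractions
print(p[1:5, ], digits = 6)                # the digits are there, just not printed by default

# the granularity itself is 1/ntree, so e.g. ntree = 10000 gives 4-digit steps
fit.big <- randomForest(Y ~ ., data = traindat, ntree = 10000)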
2010 Mar 01
1
Random Forest prediction questions
Hi,
I need help with randomForest prediction. I ran the following code:
> iris.rf <- randomForest(Species ~ ., data=iris,
> importance=TRUE,keep.forest=TRUE, proximity=TRUE)
> pr<-predict(iris.rf,iris,predict.all=T)
> iris.rf$votes[53,]
setosa versicolor virginica
0.0000000 0.8074866 0.1925134
> table(pr$individual[53,])/500
versicolor virginica
0.928
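A brief sketch of why the two fractions differ: $votes for case 53 is computed only from the trees for which that case was out of bag, while predict() on the training data (with or without predict.all) runs the case down all ntree trees, including those it helped build. Comparing against the OOB prediction keeps things on the same footing:
library(randomForest)
set.seed(1)
iris.rf <- randomForest(Species ~ ., data = iris, keep.forest = TRUE)
iris.rf$votes[53, ]                         # OOB vote fractions
predict(iris.rf)[53]                        # no newdata: OOB predicted class

pr <- predict(iris.rf, iris, predict.all = TRUE)
table(pr$individual[53, ]) / iris.rf$ntree  # fractions over ALL trees, in-bag included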
2010 Oct 22
2
Random Forest AUC
Guys,
I used random forests with a couple of data sets I had to predict a binary
response. In all the cases, the AUC on the training set comes out to be 1.
Is this always the case with random forests? Can someone please clarify
this?
I have given a simple example, first using logistic regression and then
using random forests to explain the problem. AUC of the random forest is
coming out to be
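A minimal sketch of the usual explanation, with hypothetical names (frame dat, binary factor y) and pROC used only as one convenient AUC implementation: scoring the training data runs every case through all trees, including the trees it helped build, so the in-sample AUC is typically near 1; the out-of-bag votes stored in the object give the honest number:
library(randomForest)
library(pROC)
set.seed(1)
fit <- randomForest(y ~ ., data = dat, ntree = 500)

p.train <- predict(fit, dat, type = "prob")[, 2]   # in-sample: optimistic
auc(roc(dat$y, p.train))

p.oob <- fit$votes[, 2]                            # out-of-bag: honest
auc(roc(dat$y, p.oob))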
2012 May 11
2
Random forests prediction
Hi all,
I have a strange problem when applying RF in R.
I have a set of variables with which I obtain an AUC of 0.67.
I also have a second set of variables that gives an AUC of 0.57.
When I merge the first and second sets of variables, the AUC becomes 0.64.
Shouldn't the prediction become better as I add variables that do
have some predictive power?
This is even more strange as the AUC
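One thing worth checking, sketched below with hypothetical names (x1 and x2 for the two predictor sets, y for the binary response): with the default mtry = floor(sqrt(p)), adding many weaker variables dilutes the candidates tried at each split, and re-tuning mtry on the merged set sometimes recovers the lost AUC:
library(randomForest)
x.all <- cbind(x1, x2)                 # merged predictor sets
set.seed(1)
fit <- tuneRF(x.all, y, ntreeTry = 500, stepFactor = 1.5,
              improve = 0.01, doBest = TRUE)   # returns the forest at the best mtry
fit$mtry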
2008 Feb 25
1
Running randomForests on large datasets
Hi,
I am trying to run randomForest on a dataset of size 500000 x 650, and
R pops up a memory allocation error. Are there any better ways to deal
with large datasets in R? For example, S-PLUS had something like the
bigData library.
Thank you,
Nagu
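Two workarounds commonly used with the CRAN randomForest package, sketched with hypothetical objects x (predictor matrix) and y (response): cap the per-tree sample with sampsize, or grow smaller forests on row subsets and combine() them:
library(randomForest)
set.seed(1)
# 1) each tree sees only a 50,000-row bootstrap sample
fit <- randomForest(x, y, ntree = 200, sampsize = 50000, nodesize = 20)

# 2) grow forests on disjoint row subsets and merge them
rows   <- split(sample(nrow(x)), rep(1:4, length.out = nrow(x)))
chunks <- lapply(rows, function(r)
  randomForest(x[r, ], y[r], ntree = 50, nodesize = 20))
fit.all <- do.call(combine, unname(chunks))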
2012 May 23
1
Random Forest Classification_ForestCombination
Hello,
I am aware of the fact that the combine() function in the Random Forest package of R is meant to combine forests built from the same training set, but is there any way to combine trees built on different training sets? Both the training datasets used contain the same variables and classes, but their sizes are different.
Thanks
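For what it's worth, a hedged sketch with hypothetical frames train1/train2 (identical columns and factor levels) and a held-out frame holdout: combine() will mechanically merge forests grown on different training sets, but the OOB-based summaries (err.rate, confusion, etc.) of the merged object are dropped or no longer meaningful, so judge the combined forest on independent data:
library(randomForest)
set.seed(1)
rf1 <- randomForest(Class ~ ., data = train1, ntree = 250)
rf2 <- randomForest(Class ~ ., data = train2, ntree = 250)
rf.both <- combine(rf1, rf2)          # 500 trees; OOB summaries are not carried over
p <- predict(rf.both, newdata = holdout, type = "response")
table(p, holdout$Class)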
2009 Apr 10
1
Random Forests: Question about R^2
Dear Random Forests gurus,
I have a question about R^2 provided by randomForest (for regression).
I don't succeed in finding this information.
In the help file for randomForest under "Value" it says:
rsq: (regression only) "pseudo R-squared": 1 - mse / Var(y).
Could someone please explain in somewhat more detail how exactly R^2
is calculated?
Is "mse"
2012 Apr 10
1
Help predicting random forest-like data
Hi,
I have been using some code for multivariate random forests. The output
from this code is a list object with all the same values as from
randomForest, but the model object is, of course, not of the class
randomForest. So, I was hoping to modify the code for predict.randomForest
to work for predicting the multivariate model to new data. This is my
first attempt at modifying code from a
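A minimal sketch of the usual starting point for this kind of modification (names are hypothetical; how far it gets depends on how closely the list mimics a real randomForest object, in particular its $forest component, which is what predict() walks):
library(randomForest)
getAnywhere("predict.randomForest")     # inspect the method to be adapted

# start from a copy, edit it for the multivariate object, and dispatch on a new class
predict.mvRF <- randomForest:::predict.randomForest
# ... edit predict.mvRF so it indexes the multivariate $forest correctly ...
# class(mv.fit) <- "mvRF"
# predict(mv.fit, newdata)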
2009 Apr 13
2
Random Forests Variable Importance Question
I am trying to use the random forests package for classification in R.
The Variable Importance Measures listed are:
-mean raw importance score of variable x for class 0
-mean raw importance score of variable x for class 1
-MeanDecreaseAccuracy
-MeanDecreaseGini
Now I know what these "mean" as in I know their definitions. What I
want to know is how to use them.
What I am trying to
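A short sketch of how these measures are typically put to work, with hypothetical names (frame dat, factor response y): rank the predictors, inspect the plot, and, if a leaner model is the goal, refit on the top-ranked variables and check that the OOB error does not degrade:
library(randomForest)
set.seed(1)
fit <- randomForest(y ~ ., data = dat, importance = TRUE, ntree = 500)
imp <- importance(fit, type = 1)        # permutation-based MeanDecreaseAccuracy
imp[order(imp, decreasing = TRUE), , drop = FALSE]
varImpPlot(fit)

top <- rownames(imp)[order(imp, decreasing = TRUE)][1:10]   # keep the 10 strongest
fit.small <- randomForest(dat[, top], dat$y, ntree = 500)
c(full  = fit$err.rate[fit$ntree, "OOB"],
  small = fit.small$err.rate[fit.small$ntree, "OOB"])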
2010 Apr 09
1
Question on implementing Random Forests scoring
So I've been working with random forests (the R library is randomForest) and I'm
curious whether random forests could be applied to classifying on a real-time
basis. For instance, let's say I've scored fraud from a group of
transactions. If I want to score any new incoming transactions for fraud,
could random forests be used in that context? Linear regression is nice in
that it is very easy to
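For what it's worth, a sketch of the usual pattern with hypothetical names (historical frame with a factor column fraud): fit the forest offline, persist it, and score each incoming transaction by running it down the stored trees, which is fast enough for many near-real-time settings:
library(randomForest)
set.seed(1)
fit <- randomForest(fraud ~ ., data = historical, ntree = 500)
saveRDS(fit, "fraud_rf.rds")

# later, inside the scoring process
fit <- readRDS("fraud_rf.rds")
score_one <- function(txn) {                        # txn: one-row data frame with
  predict(fit, newdata = txn, type = "prob")[, 2]   # the same predictor columns
}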
2004 Mar 02
1
some questions regarding random forest
Hi,
I had two questions regarding random forests for regression.
1) I have read the original paper by Breiman as well as a paper
discussing an application of random forests, and it appears that one
of the nice features of this technique is good predictive ability.
However, I have some data with which I have generated a linear model
using lm(). I can get an RMS error of 0.43 and an R^2 of
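One point worth keeping in mind when making this comparison, sketched with hypothetical names (frame dat, numeric response y): the MSE/R^2 that randomForest reports are out-of-bag, i.e. already validated, so a fair comparison pits them against a cross-validated RMS/R^2 for lm() rather than the in-sample fit:
library(randomForest)
set.seed(1)
rf <- randomForest(y ~ ., data = dat, ntree = 500)
c(rmse.oob = sqrt(rf$mse[rf$ntree]), rsq.oob = rf$rsq[rf$ntree])

k <- 10; fold <- sample(rep(1:k, length.out = nrow(dat)))   # 10-fold CV for lm
pred <- numeric(nrow(dat))
for (i in 1:k) {
  m <- lm(y ~ ., data = dat[fold != i, ])
  pred[fold == i] <- predict(m, dat[fold == i, ])
}
c(rmse.cv = sqrt(mean((dat$y - pred)^2)),
  rsq.cv  = 1 - mean((dat$y - pred)^2) / var(dat$y))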
2008 Jan 31
1
random forest and vegetation data
Hi there,
I am an environmental studies masters student trying to get my thesis out the door. I am also a newbie at trees in general, but I like what I see in the literature about the random forest algorithm. I think I get the general gist of things, but even after reading up I'm unclear about how I could be getting the results I'm seeing. I obviously am missing something about how the split
2009 Jun 24
1
Random Forest Variable Importance Interpretation
Hi
I am trying to explore the use of random forests for regression to
identify the important environmental/microclimate variables involved in
predicting the abundance of a species in different habitats; there are
approx. 40 variables and between 200 and 500 data points, depending on the
dataset. I have successfully used the randomForest package to conduct
the analysis and looked at the %IncMSE
2008 Jul 05
1
Random Forest %var(y)
The verbose option gives a display like:
> rf.500 <-
+ randomForest(new.x,trn.y,do.trace=20,ntree=100,nodesize=500,
+ importance=T)
     |      Out-of-bag   |
Tree |      MSE  %Var(y) |
  20 |   0.9279   100.84 |
What is the meaning of %Var(y) > 100%? I expected that to correspond to a
model that was worse than random, but the predictions seem much better than
that on
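For reference, the %Var(y) column in this trace is 100 * (OOB MSE) / Var(y), i.e. 100 * (1 - rsq), so a value above 100 simply means that, at that point in the run, the out-of-bag MSE still exceeds the variance of the response (the pseudo R^2 is negative). A quick check against the fitted object, reusing the names from the post:
library(randomForest)
rf <- randomForest(new.x, trn.y, ntree = 100, nodesize = 500, do.trace = 20)
tail(100 * (1 - rf$rsq), 1)   # matches (approximately) the last %Var(y) printed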
2009 Apr 20
1
Random Forests: Predictor importance for Regression Trees
Hello!
I think I am relatively clear on how predictor importance (the first
one) is calculated by Random Forests for a Classification tree:
Importance of predictor P1 when the response variable is categorical:
1. For out-of-bag (oob) cases, randomly permute their values on
predictor P1 and then put them down the tree
2. For a given tree, subtract the number of votes for the correct
class in the
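For intuition only, a rough whole-data version of the same permutation idea (the real measure works tree by tree on each tree's own OOB cases and then averages; here one predictor is permuted across the full frame and the drop in accuracy is measured, assuming a fitted classification forest fit on a frame dat with response column y and predictor P1, all hypothetical names):
library(randomForest)
perm_drop <- function(fit, dat, yname, pname) {
  base <- mean(predict(fit, dat) == dat[[yname]])   # accuracy with intact data
  d2 <- dat
  d2[[pname]] <- sample(d2[[pname]])                # break the link with the response
  perm <- mean(predict(fit, d2) == d2[[yname]])
  base - perm                                       # accuracy lost by permuting pname
}
perm_drop(fit, dat, "y", "P1")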
2009 Jun 08
1
Random Forest % Variation vs Pseudo-R^2?
Hi all (and Andy!),
When running randomForest in R (with do.trace=T), the last part of the
output looks like this:
1993 | 0.04606 130.43 |
1994 | 0.04605 130.40 |
1995 | 0.04605 130.43 |
1996 | 0.04605 130.43 |
1997 | 0.04606 130.44 |
1998 | 0.04607 130.47 |
1999 | 0.04606 130.46 |
2000 | 0.04605 130.42 |
With the first column representing the
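For reference, the second and third columns of this trace are the running OOB MSE and %Var(y), where %Var(y) = 100 * MSE / Var(y). The pseudo R-squared reported by the package is 1 - MSE / Var(y), so the conversion is simply rsq = 1 - %Var(y) / 100. For the values shown:
1 - 130.43 / 100   # = -0.3043: an OOB fit (so far) worse than predicting mean(y)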
2006 Apr 05
2
Multivariate linear regression
Hi,
I am working on a multivariate linear regression of the form y = Ax.
I am seeing a great deal of dispersion in y with respect to x. For example, the
correlations between y and x are very small, even after using some
typical transformations such as log and power.
I tried simple linear regression, robust regression, and the ace and
avas packages in R (or S-PLUS). I didn't see an improvement in the fit
and
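A minimal sketch of the kind of comparison described, with hypothetical objects (numeric predictor matrix x, response vector y); MASS supplies rlm() and the acepack package supplies ace()/avas(), whose estimated transformations are often the most informative output here:
library(MASS)
library(acepack)
fit.ols <- lm(y ~ x)
fit.rob <- rlm(y ~ x)
summary(fit.ols)$r.squared      # baseline fit quality

a <- ace(x, y)                  # optimal nonparametric transformations
plot(y, a$ty)                   # suggested transformation of the response
plot(x[, 1], a$tx[, 1])         # ...and of the first predictor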