Displaying 20 results from an estimated 10000 matches similar to: "More digits in prediction using random forest object"
2012 Dec 03
2
Different results from randomForest with the test option and using the predict function
Hello R Gurus,
I am perplexed by the different results I obtained when I ran code like
this:
set.seed(100)
test1<-randomForest(BinaryY~., data=Xvars, trees=51, mtry=5, seed=200)
predict(test1, newdata=cbind(NewBinaryY, NewXs), type="response")
and this code:
set.seed(100)
test2<-randomForest(BinaryY~., data=Xvars, trees=51, mtry=5, seed=200,
xtest=NewXs, ytest=NewBinaryY)
The
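A side note and a minimal sketch for putting the two calls on the same footing: trees= and seed= are not randomForest arguments (the number of trees is set with ntree= and the RNG with set.seed()), so the calls above were most likely run with the default ntree=500, and when xtest is supplied the forest is not kept unless keep.forest=TRUE. The sketch assumes Xvars holds only the predictors and that BinaryY/NewBinaryY are factor responses:
library(randomForest)
set.seed(100)
fit1 <- randomForest(x = Xvars, y = BinaryY, ntree = 51, mtry = 5)
p1 <- predict(fit1, newdata = NewXs, type = "response")

set.seed(100)
fit2 <- randomForest(x = Xvars, y = BinaryY, ntree = 51, mtry = 5,
                     xtest = NewXs, ytest = NewBinaryY)
p2 <- fit2$test$predicted   # test-set predictions made while the forest grew

table(p1, p2)               # compare the two sets of predicted classes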
2008 May 21
1
How to use the classwt parameter in randomForest
Hi,
I am trying to model a dataset with the response variable Y, which has
6 levels {Great, Greater, Greatest, Weak, Weaker, Weakest}, and
predictor variables X (a mix of continuous and factor variables), using
random forests in R. The variable Y acts like an ordinal variable, but
I recoded it as a factor variable.
I ran a simulation and got an OOB error rate estimate of 60%. I validated
against some
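A minimal sketch of passing classwt, assuming the data sit in a frame dat with the 6-level factor Y (the weights below are purely illustrative); a stratified sampsize, also sketched, is a common alternative for unbalanced classes:
library(randomForest)
# order of the weights must follow levels(dat$Y)
wts <- c(Great = 1, Greater = 1, Greatest = 1,
         Weak = 2, Weaker = 3, Weakest = 5)
set.seed(1)
fit <- randomForest(Y ~ ., data = dat, classwt = wts, ntree = 500)

# alternative: draw the same number of cases from every class for each tree
xvars <- dat[, setdiff(names(dat), "Y")]
n.min <- min(table(dat$Y))
fit2 <- randomForest(xvars, dat$Y, strata = dat$Y,
                     sampsize = rep(n.min, nlevels(dat$Y)), ntree = 500)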
2008 Mar 07
2
error in random forest
Hi,
I get the following error when I try to predict the probabilities of a
test sample:
Error in predict.randomForest(fit.EBA.OM.rf.50, x.OM, type = "prob") :
New factor levels not present in the training data
I have about 630 predictor variables in the dataset x.OM (25 are factor
variables and the rest are continuous). Any ideas on
how to track this down?
Thank you,
Nagu
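One common cause and a hedged sketch of a fix, assuming the training frame is available under the hypothetical name train: a factor column in x.OM contains a level that never occurred in training, so forcing every shared factor to use the training levels (unseen values become NA, which can then be imputed or dropped) lets predict() run:
library(randomForest)
fac    <- names(train)[sapply(train, is.factor)]
common <- intersect(fac, names(x.OM))
for (v in common) {
  x.OM[[v]] <- factor(x.OM[[v]], levels = levels(train[[v]]))
}
colSums(is.na(x.OM[common]))   # columns where unseen levels became NA
# na.roughfix() from randomForest imputes the NAs by column medians/modes
pred <- predict(fit.EBA.OM.rf.50, na.roughfix(x.OM), type = "prob")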
2008 Feb 25
1
To get more digits in precision of predict function of randomForests
Hi,
I am using randomForest for a classification problem. The predict
function in the randomForest library, when asked to return the
probabilities, has a precision of two digits after the decimal point. I need
at least four digits of precision for the predicted probabilities. How
do I achieve this?
Thank you,
Nagu
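A short sketch of the two things that usually matter here, assuming a fitted classification forest fit and a test frame newdat (hypothetical names): the matrix returned by predict(..., type = "prob") holds the raw vote fractions, so extra digits are usually only lost in printing, and the underlying granularity is 1/ntree, so more trees give finer probabilities:
library(randomForest)
p <- predict(fit, newdat, type = "prob")   # numeric matrix of vote fractions
print(p[1:5, ], digits = 6)                # the digits are there, just not printed by default

# the granularity itself is 1/ntree, so e.g. ntree = 10000 gives 4-digit steps
fit.big <- randomForest(Y ~ ., data = traindat, ntree = 10000)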
2010 Mar 01
1
Random Forest prediction questions
Hi,
I need help with randomForest prediction. I ran the following code:
> iris.rf <- randomForest(Species ~ ., data=iris,
> importance=TRUE,keep.forest=TRUE, proximity=TRUE)
> pr<-predict(iris.rf,iris,predict.all=T)
> iris.rf$votes[53,]
setosa versicolor virginica
0.0000000 0.8074866 0.1925134
> table(pr$individual[53,])/500
versicolor virginica
0.928
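A brief sketch of why the two fractions differ: $votes for case 53 is computed only from the trees for which that case was out of bag, while predict() on the training data (with or without predict.all) runs the case down all ntree trees, including those it helped build. Comparing against the OOB prediction keeps things on the same footing:
library(randomForest)
set.seed(1)
iris.rf <- randomForest(Species ~ ., data = iris, keep.forest = TRUE)
iris.rf$votes[53, ]                         # OOB vote fractions
predict(iris.rf)[53]                        # no newdata: OOB predicted class

pr <- predict(iris.rf, iris, predict.all = TRUE)
table(pr$individual[53, ]) / iris.rf$ntree  # fractions over ALL trees, in-bag included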
2010 Oct 22
2
Random Forest AUC
Guys,
I used random forests with a couple of data sets I had to predict a binary
response. In all the cases, the AUC on the training set comes out to be 1.
Is this always the case with random forests? Can someone please clarify
this?
I have given a simple example, first using logistic regression and then
using random forests to explain the problem. AUC of the random forest is
coming out to be
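A minimal sketch of the usual explanation, with hypothetical names (frame dat, binary factor y) and pROC used only as one convenient AUC implementation: scoring the training data runs every case through all trees, including the trees it helped build, so the in-sample AUC is typically near 1; the out-of-bag votes stored in the object give the honest number:
library(randomForest)
library(pROC)
set.seed(1)
fit <- randomForest(y ~ ., data = dat, ntree = 500)

p.train <- predict(fit, dat, type = "prob")[, 2]   # in-sample: optimistic
auc(roc(dat$y, p.train))

p.oob <- fit$votes[, 2]                            # out-of-bag: honest
auc(roc(dat$y, p.oob))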
2012 May 11
2
Random forests prediction
Hi all,
I have a strange problem when applying RF in R.
I have a set of variables with which I obtain an AUC of 0.67.
I also have a second set of variables that gives an AUC of 0.57.
When I merge the first and second sets of variables, the AUC becomes 0.64.
Shouldn't the prediction become better as I add variables that do
have some predictive power?
This is even more strange as the AUC
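One thing worth checking, sketched below with hypothetical names (x1 and x2 for the two predictor sets, y for the binary response): with the default mtry = floor(sqrt(p)), adding many weaker variables dilutes the candidates tried at each split, and re-tuning mtry on the merged set sometimes recovers the lost AUC:
library(randomForest)
x.all <- cbind(x1, x2)                 # merged predictor sets
set.seed(1)
fit <- tuneRF(x.all, y, ntreeTry = 500, stepFactor = 1.5,
              improve = 0.01, doBest = TRUE)   # returns the forest at the best mtry
fit$mtry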
2008 Feb 25
1
Running randomForests on large datasets
Hi,
I am trying to run randomForest on a dataset of size 500000 x 650, and
R pops up a memory allocation error. Are there any better ways to deal
with large datasets in R? For example, S-PLUS had something like the
bigData library.
Thank you,
Nagu
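Two workarounds commonly used with the CRAN randomForest package, sketched with hypothetical objects x (predictor matrix) and y (response): cap the per-tree sample with sampsize, or grow smaller forests on row subsets and combine() them:
library(randomForest)
set.seed(1)
# 1) each tree sees only a 50,000-row bootstrap sample
fit <- randomForest(x, y, ntree = 200, sampsize = 50000, nodesize = 20)

# 2) grow forests on disjoint row subsets and merge them
rows   <- split(sample(nrow(x)), rep(1:4, length.out = nrow(x)))
chunks <- lapply(rows, function(r)
  randomForest(x[r, ], y[r], ntree = 50, nodesize = 20))
fit.all <- do.call(combine, unname(chunks))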
2012 May 23
1
Random Forest Classification_ForestCombination
Hello,
I am aware of the fact that the combine() function in the Random Forest package of R is meant to combine forests built from the same training set, but is there any way to combine trees built on different training sets? Both the training datasets used contain the same variables and classes, but their sizes are different.
Thanks
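For what it's worth, a hedged sketch with hypothetical frames train1/train2 (identical columns and factor levels) and a held-out frame holdout: combine() will mechanically merge forests grown on different training sets, but the OOB-based summaries (err.rate, confusion, etc.) of the merged object are dropped or no longer meaningful, so judge the combined forest on independent data:
library(randomForest)
set.seed(1)
rf1 <- randomForest(Class ~ ., data = train1, ntree = 250)
rf2 <- randomForest(Class ~ ., data = train2, ntree = 250)
rf.both <- combine(rf1, rf2)          # 500 trees; OOB summaries are not carried over
p <- predict(rf.both, newdata = holdout, type = "response")
table(p, holdout$Class)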
2009 Apr 10
1
Random Forests: Question about R^2
Dear Random Forests gurus,
I have a question about R^2 provided by randomForest (for regression).
I don't succeed in finding this information.
In the help file for randomForest under "Value" it says:
rsq: (regression only) "pseudo R-squared": 1 - mse / Var(y).
Could someone please explain in somewhat more detail how exactly R^2
is calculated?
Is "mse"
2012 Apr 10
1
Help predicting random forest-like data
Hi,
I have been using some code for multivariate random forests. The output
from this code is a list object with all the same values as from
randomForest, but the model object is, of course, not of the class
randomForest. So, I was hoping to modify the code for predict.randomForest
to work for predicting the multivariate model to new data. This is my
first attempt at modifying code from a
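A minimal sketch of the usual starting point for this kind of modification (names are hypothetical; how far it gets depends on how closely the list mimics a real randomForest object, in particular its $forest component, which is what predict() walks):
library(randomForest)
getAnywhere("predict.randomForest")     # inspect the method to be adapted

# start from a copy, edit it for the multivariate object, and dispatch on a new class
predict.mvRF <- randomForest:::predict.randomForest
# ... edit predict.mvRF so it indexes the multivariate $forest correctly ...
# class(mv.fit) <- "mvRF"
# predict(mv.fit, newdata)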
2009 Apr 13
2
Random Forests Variable Importance Question
I am trying to use the random forests package for classification in R.
The Variable Importance Measures listed are:
-mean raw importance score of variable x for class 0
-mean raw importance score of variable x for class 1
-MeanDecreaseAccuracy
-MeanDecreaseGini
Now I know what these "mean" as in I know their definitions. What I
want to know is how to use them.
What I am trying to
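A short sketch of how these measures are typically put to work, with hypothetical names (frame dat, factor response y): rank the predictors, inspect the plot, and, if a leaner model is the goal, refit on the top-ranked variables and check that the OOB error does not degrade:
library(randomForest)
set.seed(1)
fit <- randomForest(y ~ ., data = dat, importance = TRUE, ntree = 500)
imp <- importance(fit, type = 1)        # permutation-based MeanDecreaseAccuracy
imp[order(imp, decreasing = TRUE), , drop = FALSE]
varImpPlot(fit)

top <- rownames(imp)[order(imp, decreasing = TRUE)][1:10]   # keep the 10 strongest
fit.small <- randomForest(dat[, top], dat$y, ntree = 500)
c(full  = fit$err.rate[fit$ntree, "OOB"],
  small = fit.small$err.rate[fit.small$ntree, "OOB"])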
2010 Apr 09
1
Question on implementing Random Forests scoring
So I've been working with random forests (the R library is randomForest) and I'm
curious whether random forests could be applied to classifying on a real-time
basis. For instance, let's say I've scored fraud from a group of
transactions. If I want to score any new incoming transactions for fraud,
could random forests be used in that context? Linear regression is nice in
that it is very easy to
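For what it's worth, a sketch of the usual pattern with hypothetical names (historical frame with a factor column fraud): fit the forest offline, persist it, and score each incoming transaction by running it down the stored trees, which is fast enough for many near-real-time settings:
library(randomForest)
set.seed(1)
fit <- randomForest(fraud ~ ., data = historical, ntree = 500)
saveRDS(fit, "fraud_rf.rds")

# later, inside the scoring process
fit <- readRDS("fraud_rf.rds")
score_one <- function(txn) {                        # txn: one-row data frame with
  predict(fit, newdata = txn, type = "prob")[, 2]   # the same predictor columns
}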
2004 Mar 02
1
some questions regarding random forest
Hi,
I had two questions regarding random forests for regression.
1) I have read the original paper by Breiman as well as a paper
discussing an application of random forests, and it appears that one
of the nice features of this technique is good predictive ability.
However, I have some data with which I have generated a linear model
using lm(). I can get an RMS error of 0.43 and an R^2 of
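One point worth keeping in mind when making this comparison, sketched with hypothetical names (frame dat, numeric response y): the MSE/R^2 that randomForest reports are out-of-bag, i.e. already validated, so a fair comparison pits them against a cross-validated RMS/R^2 for lm() rather than the in-sample fit:
library(randomForest)
set.seed(1)
rf <- randomForest(y ~ ., data = dat, ntree = 500)
c(rmse.oob = sqrt(rf$mse[rf$ntree]), rsq.oob = rf$rsq[rf$ntree])

k <- 10; fold <- sample(rep(1:k, length.out = nrow(dat)))   # 10-fold CV for lm
pred <- numeric(nrow(dat))
for (i in 1:k) {
  m <- lm(y ~ ., data = dat[fold != i, ])
  pred[fold == i] <- predict(m, dat[fold == i, ])
}
c(rmse.cv = sqrt(mean((dat$y - pred)^2)),
  rsq.cv  = 1 - mean((dat$y - pred)^2) / var(dat$y))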
2008 Jan 31
1
random forest and vegetation data
Hi there,
I am an environmental studies masters student trying to get my thesis out the door. I am also a newbie at trees in general, but I like what I see in the literature about the random forest algorithm. I think I get the general gist of things, but even after reading up I'm unclear about how I could be getting the results I'm seeing. I obviously am missing something about how the split
2009 Jun 24
1
Random Forest Variable Importance Interpretation
Hi
I am trying to explore the use of random forests for regression to
identify the important environmental/microclimate variables involved in
predicting the abundance of a species in different habitats; there are
approx. 40 variables and between 200 and 500 data points, depending on the
dataset. I have successfully used the randomForest package to conduct
the analysis and looked at the %IncMSE
2008 Jul 05
1
Random Forest %var(y)
The verbose option gives a display like:
> rf.500 <-
+ randomForest(new.x,trn.y,do.trace=20,ntree=100,nodesize=500,
+ importance=T)
     |      Out-of-bag   |
Tree |      MSE  %Var(y) |
  20 |   0.9279   100.84 |
What is the meaning of %Var(y) > 100%? I expected that to correspond to a
model that was worse than random, but the predictions seem much better than
that on
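For reference, the %Var(y) column in this trace is 100 * (OOB MSE) / Var(y), i.e. 100 * (1 - rsq), so a value above 100 simply means that, at that point in the run, the out-of-bag MSE still exceeds the variance of the response (the pseudo R^2 is negative). A quick check against the fitted object, reusing the names from the post:
library(randomForest)
rf <- randomForest(new.x, trn.y, ntree = 100, nodesize = 500, do.trace = 20)
tail(100 * (1 - rf$rsq), 1)   # matches (approximately) the last %Var(y) printed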
2009 Apr 20
1
Random Forests: Predictor importance for Regression Trees
Hello!
I think I am relatively clear on how predictor importance (the first
one) is calculated by Random Forests for a Classification tree:
Importance of predictor P1 when the response variable is categorical:
1. For out-of-bag (oob) cases, randomly permute their values on
predictor P1 and then put them down the tree
2. For a given tree, subtract the number of votes for the correct
class in the
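For intuition only, a rough whole-data version of the same permutation idea (the real measure works tree by tree on each tree's own OOB cases and then averages; here one predictor is permuted across the full frame and the drop in accuracy is measured, assuming a fitted classification forest fit on a frame dat with response column y and predictor P1, all hypothetical names):
library(randomForest)
perm_drop <- function(fit, dat, yname, pname) {
  base <- mean(predict(fit, dat) == dat[[yname]])   # accuracy with intact data
  d2 <- dat
  d2[[pname]] <- sample(d2[[pname]])                # break the link with the response
  perm <- mean(predict(fit, d2) == d2[[yname]])
  base - perm                                       # accuracy lost by permuting pname
}
perm_drop(fit, dat, "y", "P1")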
2009 Jun 08
1
Random Forest % Variation vs Pseudo-R^2?
Hi all (and Andy!),
When running randomForest in R (with do.trace=T), the last part of the
output looks like this:
1993 | 0.04606 130.43 |
1994 | 0.04605 130.40 |
1995 | 0.04605 130.43 |
1996 | 0.04605 130.43 |
1997 | 0.04606 130.44 |
1998 | 0.04607 130.47 |
1999 | 0.04606 130.46 |
2000 | 0.04605 130.42 |
With the first column representing the
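For reference, the second and third columns of this trace are the running OOB MSE and %Var(y), where %Var(y) = 100 * MSE / Var(y). The pseudo R-squared reported by the package is 1 - MSE / Var(y), so the conversion is simply rsq = 1 - %Var(y) / 100. For the values shown:
1 - 130.43 / 100   # = -0.3043: an OOB fit (so far) worse than predicting mean(y)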
2006 Apr 05
2
Multivariate linear regression
Hi,
I am working on a multivariate linear regression of the form y = Ax.
I am seeing a great deal of dispersion in y with respect to x. For example, the
correlations between y and x are very small, even after using some
typical transformations such as log and power.
I tried simple linear regression, robust regression, and the ace and
avas packages in R (or S-PLUS). I didn't see an improvement in the fit
and
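A minimal sketch of the kind of comparison described, with hypothetical objects (numeric predictor matrix x, response vector y); MASS supplies rlm() and the acepack package supplies ace()/avas(), whose estimated transformations are often the most informative output here:
library(MASS)
library(acepack)
fit.ols <- lm(y ~ x)
fit.rob <- rlm(y ~ x)
summary(fit.ols)$r.squared      # baseline fit quality

a <- ace(x, y)                  # optimal nonparametric transformations
plot(y, a$ty)                   # suggested transformation of the response
plot(x[, 1], a$tx[, 1])         # ...and of the first predictor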