Displaying 20 results from an estimated 2000 matches similar to: "How to use classwt parameter option in RandomForest"
2007 Jan 28
2
help with RandomForest classwt option
Hello there,
I am working on an extremely unbalanced two-class classification problem. I
want to use "classwt" together with "down sampling". By checking the rfNews()
in R, it looks like classwt is not working yet. Then I looked at the
software from Salford. I did not find the down sampling option. I am
wondering if you have any experience dealing with this problem. Do you
2005 Oct 27
1
Repost: Examples of "classwt", "strata", and "sampsize" in randomForest?
Sorry for the repost, but I've really been looking, and can't find any
syntax direction on this issue...
Just browsing the documentation, and searching the list came up short... I
have some unbalanced data and was wondering if, in a "0" v "1"
classification forest, some combo of these options might yield better
predictions when the proportion of one class is low (less
2005 Oct 27
1
Repost: Examples of "classwt", "strata", and "sampsize" in randomForest?
"classwt" in the current version of the randomForest package doesn't work
too well. (It's what was in version 3.x of the original Fortran code by
Breiman and Cutler, not the one in the new Fortran code.) I'd advise
against using it.
"sampsize" and "strata" can be used in conjunction. If "strata" is not
specified, the class labels will be used.
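As a hedged illustration of the reply above, here is a minimal sketch of down-sampling with "sampsize" and "strata". It assumes the randomForest package is installed (the fit is skipped otherwise). The toy data frame d, its column names, and the choice of drawing as many cases per class as the smaller class contains are assumptions made for illustration, not something stated in the thread.

```r
# Toy unbalanced two-class data (hypothetical; for illustration only).
set.seed(1)
n <- 2000
d <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
d$y <- factor(ifelse(runif(n) < 0.08, "1", "0"), levels = c("0", "1"))

# Draw the same number of cases from each class for every tree.
# Here: the size of the smaller class (an assumed, common choice).
n_min <- min(table(d$y))
samp  <- rep(n_min, nlevels(d$y))

if (requireNamespace("randomForest", quietly = TRUE)) {
  fit <- randomForest::randomForest(
    x = d[, c("x1", "x2")], y = d$y,
    strata = d$y,      # stratify the per-tree sample by class label
    sampsize = samp,   # one entry per stratum, in factor-level order
    ntree = 200
  )
  print(fit$confusion)
}
```

Passing strata explicitly is redundant here, since the class labels are used when it is omitted, but it makes the intent visible.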
2005 Oct 25
0
Examples of "classwt", "strata", and "sampsize" in randomForest?
Just browsing the documentation, and searching the list came up short... I
have some unbalanced data and was wondering if, in a "0" v "1" classification
forest, these options might yield better predictions when the proportion
of one class is low (less than 10% in a sample of 2,000 observations).
Not sure how to specify these terms... from the docs, we have:
classwt: Priors
2008 Feb 25
1
Running randomForests on large datasets
Hi,
I am trying to run randomForests on a dataset of size 500000 x 650, and
R reports a memory allocation error. Are there any better ways to deal
with large datasets in R? For example, Splus had something like the
bigData library.
Thank you,
Nagu
2008 Feb 25
1
To get more digits in precision of predict function of randomForests
Hi,
I am using randomForests for a classification problem. The predict
function in the randomForest library, when asked to return the
probabilities, has precision of two digits after the decimal. I need
at least four digits of precision for the predicted probabilities. How
do I achieve this?
Thank you,
Nagu
2004 Jan 20
1
random forest question
Hi,
here are three results of random forest (version 4.0-1).
The results seem to be more or less the same which is strange because I
changed the classwt.
I hoped that for example classwt=c(0.45,0.1,0.45) would result in fewer
cases classified as class 2. Did I understand something wrong?
Christian
x1rf <- randomForest(x=as.data.frame(mfilters[cvtrain,]),
2011 Sep 13
1
class weights with Random Forest
Hi All,
I am looking for a reference that explains how the randomForest function in
the randomForest package uses the classwt parameter. Here:
http://tolstoy.newcastle.edu.au/R/e4/help/08/05/12088.html
Andy Liaw suggests not using classwt. And according to:
http://r.789695.n4.nabble.com/R-help-with-RandomForest-classwt-option-td817149.html
it has "not been implemented" as of 2007.
2008 Mar 07
2
error in random forest
Hi,
I get the following error when I try to predict the probabilities of a
test sample:
Error in predict.randomForest(fit.EBA.OM.rf.50, x.OM, type = "prob") :
New factor levels not present in the training data
I have about 630 predictor variables in the dataset x.OM (25 factor
variables and the remaining are continuous variables). Any ideas on
how to trace it?
Thank you,
Nagu
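One way to trace this kind of error, sketched under assumptions, is to compare the levels of each factor column in the training and test data. The data frames train and test and the helper unseen_levels below are hypothetical stand-ins, not objects from the post. Only base R is used.

```r
# Hypothetical training and test data with one factor predictor each.
train <- data.frame(f = factor(c("a", "b", "a", "c")), z = c(1, 2, 3, 4))
test  <- data.frame(f = factor(c("a", "d", "b")),      z = c(5, 6, 7))

# For every factor column in the training data, list the values that
# occur in the test data but were never seen in training.
unseen_levels <- function(train, test) {
  fac <- names(train)[vapply(train, is.factor, logical(1))]
  out <- lapply(fac, function(v) {
    setdiff(unique(as.character(test[[v]])), levels(train[[v]]))
  })
  names(out) <- fac
  out[lengths(out) > 0]
}

bad <- unseen_levels(train, test)
print(bad)

# Re-code the offending test factors on the training levels; unseen values
# become NA and must be handled (dropped, imputed, or merged) before predicting.
for (v in names(bad)) {
  test[[v]] <- factor(as.character(test[[v]]), levels = levels(train[[v]]))
}
```

With several hundred predictors, the names of the returned list point directly at the columns that need attention.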
2010 May 05
1
What is the default nPerm for regression in randomForest?
Could not find it in ?randomForest.
Thank you for your help!
--
Dimitri Liakhovitski
Ninah.com
Dimitri.Liakhovitski at ninah.com
2012 Mar 03
0
Strategies to deal with unbalanced classification data in randomForest
Hello all,
I have become somewhat confused with options available for dealing
with a highly unbalanced data set (10000 in one class, 50 in the
other). As a summary I am unsure:
a) if I am performing the two class weighting methods properly,
b) if the data are too unbalanced and whether this type of analysis is
appropriate, and
c) if there is any interaction between the weighting for class
imbalances
2006 Apr 05
2
Multivariate linear regression
Hi,
I am working on a multivariate linear regression of the form y = Ax.
I am seeing a great dispersion of y w.r.t x. For example, the
correlations between y and x are very small, even after using some
typical transformations like log, power.
I tried with simple linear regression, robust regression and ace and
avas package in R (or splus). I didn't see an improvement in the fit
and
2008 Mar 11
1
More digits in prediction using random forest object
I need to get more digits in predicting a test sample with a random
forests object. Format or options(digits=) do nothing. Any ideas?
Thank you,
Nagu
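A possible explanation, offered as an assumption rather than a confirmed answer: class probabilities predicted by a forest are typically fractions of tree votes, so their resolution is limited by the number of trees and not by print formatting. The base-R sketch below simulates vote counts (it does not call randomForest) to show the effect. The function vote_fraction and the ntree values are illustrative only.

```r
# Simulated vote fractions for one class (not a real forest; illustration only).
set.seed(42)
vote_fraction <- function(ntree, n_cases = 5, p = 0.3141) {
  rbinom(n_cases, size = ntree, prob = p) / ntree
}

p100   <- vote_fraction(100)     # multiples of 1/100
p10000 <- vote_fraction(10000)   # multiples of 1/10000

print(p100)
print(format(p10000, nsmall = 4))
```

If this is the cause, growing more trees would be the way to obtain finer-grained probabilities, at the cost of longer run time.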
2006 Jan 25
1
imbalanced classes
Hi Andy,
I know this topic has been discussed before on the R-help, but I was
wondering if you could offer some advice specific to my application.
I'm using the R random forest package to compare two classes of data;
the number of cases in each class is relatively low, 28 in class 1 and 9
in class 2. I'd really like to use the R environment to analyze this data,
however I'm finding it
2003 Aug 20
2
RandomForest
Hello,
When I plot or look at the error rate vector for a random forest
(rf$err.rate), it looks like a descending function, except for the first few
points of the vector, whose error rates are lower (sometimes much lower)
than the general level of error rates for a forest with that number of trees
once the error rates stop descending. Does it mean that there is a tree(s)
(that is built the first in
2007 Jun 06
0
Question on RandomForest in unsupervised mode
Hi,
I attempted to run the randomForest() function on a dataset without
predefined classes. According to the manual, running randomForest
without a response variable/class labels should result in the
function assuming you are running in unsupervised mode. In this case,
I understand that my data is all assigned to one class whereas a
second synthetic class is made up, which is assigned
2002 Sep 25
5
CART vs. Random Forest
According to Dr. Breiman, the RF should be a more accurate
method than a single tree. However, the performance of each
method seems to depend on the proportion of the outcome variable
in my case. My data set is a typical classification problem
(predict bad guys). When I ran both of them with different
proportions of the outcome variable (there's a criterion to measure
the degree of bad behavior), I
2003 Dec 03
1
Error in randomForest.default(m, y, ...) : negative length vectors are not allowed
Christian --
You don't provide enough information (like a call) to answer this. I
suspect, though, that you may be subsetting in a way that passes
randomForest no data.
I'm not aware offhand of an easy way to get this error from randomForest. I
tried creating some data superficially similar to yours to see whether
something would break if there were only a single value in the variable
2007 Feb 05
0
random forest proximities
Good Day,
I'm using the randomForest package to perform a classification. If I
supply weights to the optional classwt argument, are proximity values
computed as a weighted average? I understand that the forest will
possibly change as a function of the particular weights I supply.
Thanks in advance.
Mike
Michael Fugate
Los Alamos National Laboratory
Mail Stop MS-F600,
Los Alamos, NM
2004 May 12
1
Random Forest with highly imbalanced data
Hi group,
I am trying to do a RF with approx 250,000
cases. My objective is to determine the risk factors
of a person being readmitted to hospital (response=1)
or else (response=0). Only 10%, or 25,000 cases were
readmitted. I've heard about down-sampling and class
weight approach and am wondering if R can do it. Even
some reference to articles will help.
From the statistical point