Displaying 17 results from an estimated 17 matches for "classwt".
2008 May 21
1
How to use classwt parameter option in RandomForest
...actor variables using
random forests in R. The variable Y acts like an ordinal variable, but
I recoded it as factor variable.
I ran a simulation and got OOB estimate of error rate 60%. I validated
against some external datasets and got about 59% misclassification
error. I would like to tinker with the classwt option in the function
randomForest to see if I can get better performance from the model. My
confusion arises from how to define these weights. If I say, classwt =
c(3,6,9,1,2,3), how exactly do the levels get weighted? If this is a 6x6
matrix, I can put a number in each cell to adjust the weights. How...
2007 Jan 28
2
help with RandomForest classwt option
Hello there,
I am working on an extremely unbalanced two-class classification problem. I
want to use "classwt" together with "down sampling". By checking the rfNews()
in R, it looks like classwt is not working yet. Then I looked at the
software from Salford. I did not find the down sampling option. I am
wondering if you have any experience to deal with this problem. Do you know
any method o...
2004 Jan 20
1
random forest question
Hi,
here are three results of random forest (version 4.0-1).
The results seem to be more or less the same which is strange because I
changed the classwt.
I hoped that for example classwt=c(0.45,0.1,0.45) would result in fewer
cases classified as class 2. Did I understand something wrong?
Christian
x1rf <- randomForest(x=as.data.frame(mfilters[cvtrain,]),
y=as.factor(traingroups),
xtest=as.data.frame(m...
2005 Oct 27
1
Repost: Examples of "classwt", "strata", and "sampsize" in randomForest?
...data and was wondering if, in a "0" v "1"
classification forest, some combo of these options might yield better
predictions when the proportion of one class is low (less than 10% in a
sample of 2,000 observations).
Not sure how to specify these terms... from the docs, we have:
classwt: Priors of the classes. Need not add up to one. Ignored for
regression.
So is this something like "... classwt=c(.90,.10)" ? I didn't see the syntax
demonstrated. Similar for "strata" and "sampsize" though there is a default
for sampsize that makes sense... not su...
2005 Oct 27
1
Repost: Examples of "classwt", "strata", and "sampsize" in randomForest?
"classwt" in the current version of the randomForest package doesn't work
too well. (It's what was in version 3.x of the original Fortran code by
Breiman and Cutler, not the one in the new Fortran code.) I'd advise
against using it.
"sampsize" and "strata" can be use...
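As a hedged illustration of the "strata" and "sampsize" syntax asked about in this thread (not taken from the truncated reply): a minimal sketch of stratified down-sampling, assuming a two-class factor response. The toy data and the names x, y, n_min are hypothetical; strata and sampsize are documented arguments of randomForest(), and sampsize takes one entry per stratum.

```r
# Hypothetical toy data: roughly 10% positives in 2,000 rows (not from the thread).
set.seed(1)
n <- 2000
y <- factor(rbinom(n, 1, 0.1), levels = c(0, 1))
x <- data.frame(x1 = rnorm(n) + (y == "1"), x2 = rnorm(n))

# One sample size per class, in the order of levels(y): for every tree,
# draw as many cases from each class as the minority class contains.
n_min <- min(table(y))
sampsize <- rep(n_min, nlevels(y))

# Fit only if the randomForest package is installed.
if (requireNamespace("randomForest", quietly = TRUE)) {
  rf <- randomForest::randomForest(x, y, strata = y, sampsize = sampsize,
                                   ntree = 200)
  print(rf$confusion)
}
```

Each tree then sees a balanced sample, while the out-of-bag confusion matrix is still computed on the original, unbalanced cases.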
2005 Oct 25
0
Examples of "classwt", "strata", and "sampsize" in randomForest?
...unbalanced data and was wondering if, in a "0" v "1" classification
forest, these options might yield better predictions when the proportion
of one class is low (less than 10% in a sample of 2,000 observations).
Not sure how to specify these terms... from the docs, we have:
classwt: Priors of the classes. Need not add up to one. Ignored for
regression.
So is this something like "... classwt=c(.90,.10)" ? I didn't see the syntax
demonstrated. Similar for "strata" and "sampsize" though there is a default
for sampsize that makes sense... not su...
2011 Sep 13
1
class weights with Random Forest
Hi All,
I am looking for a reference that explains how the randomForest function in
the randomForest package uses the classwt parameter. Here:
http://tolstoy.newcastle.edu.au/R/e4/help/08/05/12088.html
Andy Liaw suggests not using classwt. And according to:
http://r.789695.n4.nabble.com/R-help-with-RandomForest-classwt-option-td817149.html
it has "not been implemented" as of 2007. However it improved classif...
2010 Mar 08
0
error when using svm routine: Error in if (any(co)) { : missing value where TRUE/FALSE needed
Hi,
I met with this error message with the following data set. Do you know how
to resolve it? Thanks.
> data<-read.table("c://temp3//abc.csv", sep = ",", header=T)
> classwt<-c( 0.5806452, 0.4193548)
> y<-data[,1]
> x<-data[,2:ncol(data)]
> print(y)
[1] 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 1 1 1 1 1
[36] 1 1 1 1 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2
> print(x)
rs2289472 rs1551398 rs7927894
1 CT AA...
2006 Jan 25
1
imbalanced classes
...t package to compare two classes of data,
the number of cases in each class relatively low, 28 in class 1 and 9
in class 2. I'd really like to use R environment to analyze this data,
however I'm finding it difficult to put much trust in the results of
my analysis. As you've stated, the classwt variables do not do much,
and I've tried working with the cutoff and sampsize variables as
well, with limited success in balancing error rates between the two
classes.
It was unclear to me how to use the cutoff parameter correctly. If
you have any recommendations here, it would be appreciat...
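For orientation on the cutoff question (this is not part of the truncated post): the randomForest documentation describes cutoff as a vector with one entry per class, where the winning class is the one with the largest ratio of vote proportion to cutoff, and the default is 1/k for k classes. The base-R sketch below only mimics that rule on made-up vote fractions; predict_with_cutoff and the votes matrix are hypothetical names, not package functions.

```r
# Hypothetical vote fractions for three cases (rows) over two classes (columns).
votes <- matrix(c(0.80, 0.20,
                  0.65, 0.35,
                  0.40, 0.60),
                ncol = 2, byrow = TRUE,
                dimnames = list(NULL, c("1", "2")))

# Pick, for each case, the class with the largest votes/cutoff ratio.
predict_with_cutoff <- function(votes, cutoff) {
  ratio <- sweep(votes, 2, cutoff, "/")  # divide each column by its cutoff
  colnames(votes)[max.col(ratio, ties.method = "first")]
}

predict_with_cutoff(votes, c(0.5, 0.5))  # default 1/k: plain majority vote
predict_with_cutoff(votes, c(0.7, 0.3))  # lower cutoff makes class "2" easier to win
```

With 28 cases in class 1 and 9 in class 2, lowering the cutoff for class 2 in this way trades class-1 errors for class-2 errors; the values above are illustrative only.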
2002 Sep 25
5
CART vs. Random Forest
According to Dr. Breiman, RF should be a more accurate
method than a single tree. However, the performance of each
method seems to depend on the proportion of the outcome variable
in my case. My data set is a typical classification problem
(predict bad guys). When I ran both of them with different
proportions of the outcome variable (there's a criterion to measure
the degree of bad behavior), I
2004 May 12
1
Random Forest with highly imbalanced data
Hi group,
I am trying to do a RF with approx 250,000
cases. My objective is to determine the risk factors
of a person being readmitted to hospital (response=1)
or else (response=0). Only 10%, or 25,000 cases were
readmitted. I've heard about down-sampling and class
weight approach and am wondering if R can do it. Even
some reference to articles will help.
From the statistical point
2007 Feb 05
0
random forest proximities
Good Day,
I'm using the randomForest package to perform a classification. If I
supply weights to the optional classwt argument, are proximity values
computed as a weighted average? I understand that the forest will
possibly change as a function of the particular weights I supply.
Thanks in advance.
Mike
Michael Fugate
Los Alamos National Laboratory
Mail Stop MS-F600,
Los Alamos, NM
87545
(505) 667-0398
2010 May 05
1
What is the default nPerm for regression in randomForest?
Could not find it in ?randomForest.
Thank you for your help!
--
Dimitri Liakhovitski
Ninah.com
Dimitri.Liakhovitski at ninah.com
2012 Oct 17
0
How to optimize or build a better random forest?
...sibsp pclass2 pclass3 sexmale
"factor" "numeric" "integer" "factor" "factor" "factor"
> sapply(split(train,train$survived),function(x) dim(x)[1])
  0   1
549 342
> rf <- randomForest(train[,-1], train[,1], ntree=10000,classwt=c(549/891,342/891),importance=TRUE,do.trace=FALSE)
OOB estimate of error rate: 17.73%
Confusion matrix:
    0   1 class.error
0 500  49  0.08925319
1 109 233  0.31871345
2012 Mar 03
0
Strategies to deal with unbalanced classification data in randomForest
...ate. This approach I've mostly drawn from here:
## http://stat-www.berkeley.edu/users/breiman/RandomForests/cc_home.htm#balance
## This might not be appropriate, however; as of September it looks
## like Breiman's method wasn't used in R
df.rf.weights<-randomForest(cls~var1+var2+var3, data=df,classwt=c(1,
600), importance=TRUE)
## Nevertheless, what I am concerned about is the effect an
## unbalanced data set has on my randomForest model
## For example:
par(mfrow=c(1,3))
plot(df.rf)
plot(df.rf.downsamp)
plot(df.rf.weights)
presents three very different scenarios and I am having trouble resolvin...
2003 Aug 20
2
RandomForest
Hello,
When I plot or look at the error rate vector for a random forest
(rf$err.rate) it looks like a descending function, except that the first few
points of the vector have error rates lower (sometimes much lower)
than the general level the error rate settles at for a forest with that many trees
once the error rates stop descending. Does it mean that there is a tree(s)
(that is built the first in
2004 Nov 04
4
highly biased PCA data?
Hello, supposing that I have two or three clear categories for my data,
let's say pet preference across fish, cat, dog. Let's say most people rate
their preference as being mostly one of the categories.
I want to do pca on the data to see three 'groups' of people, one group
for fish, one for cat and one for dog. I would like to see the odd person
who likes both or all three in the