similar to: class weights with Random Forest

Displaying 20 results from an estimated 4000 matches similar to: "class weights with Random Forest"

2004 Jan 20
1
random forest question
Hi, here are three results of random forest (version 4.0-1). The results seem to be more or less the same which is strange because I changed the classwt. I hoped that for example classwt=c(0.45,0.1,0.45) would result in fewer cases classified as class 2. Did I understand something wrong? Christian x1rf <- randomForest(x=as.data.frame(mfilters[cvtrain,]),
2005 Oct 27
1
Repost: Examples of "classwt", "strata", and "sampsize" in randomForest?
"classwt" in the current version of the randomForest package doesn't work too well. (It's what was in version 3.x of the original Fortran code by Breiman and Cutler, not the one in the new Fortran code.) I'd advise against using it. "sampsize" and "strata" can be used in conjunction. If "strata" is not specified, the class labels will be used.
2005 Oct 27
1
Repost: Examples of "classwt", "strata", and "sampsize" in randomForest?
Sorry for the repost, but I've really been looking, and can't find any syntax direction on this issue... Just browsing the documentation, and searching the list came up short... I have some unbalanced data and was wondering if, in a "0" v "1" classification forest, some combo of these options might yield better predictions when the proportion of one class is low (less
2006 Jan 25
1
imbalanced classes
Hi Andy, I know this topic has been discussed before on R-help, but I was wondering if you could offer some advice specific to my application. I'm using the R random forest package to compare two classes of data; the number of cases in each class is relatively low, 28 in class 1 and 9 in class 2. I'd really like to use the R environment to analyze these data; however, I'm finding it
2007 Jan 28
2
help with RandomForest classwt option
Hello there, I am working on an extremely unbalanced two-class classification problem. I want to use "classwt" together with down-sampling. From checking rfNews() in R, it looks like classwt is not working yet. Then I looked at the software from Salford. I did not find the down-sampling option. I am wondering if you have any experience dealing with this problem. Do you
2004 May 12
1
Random Forest with highly imbalanced data
Hi group, I am trying to do a RF with approx 250,000 cases. My objective is to determine the risk factors of a person being readmitted to hospital (response=1) or else (response=0). Only 10%, or 25,000 cases, were readmitted. I've heard about the down-sampling and class-weight approaches and am wondering if R can do it. Even some reference to articles will help. From the statistical point
2008 May 21
1
How to use classwt parameter option in RandomForest
Hi, I am trying to model a dataset with the response variable Y, which has 6 levels { Great, Greater, Greatest, Weak, Weaker, Weakest}, and predictor variables X, with continuous and factor variables using random forests in R. The variable Y acts like an ordinal variable, but I recoded it as factor variable. I ran a simulation and got OOB estimate of error rate 60%. I validated against some
2005 Oct 25
0
Examples of "classwt", "strata", and "sampsize" in randomForest?
Just browsing the documentation, and searching the list came up short... I have some unbalanced data and was wondering if, in a "0" v "1" classification forest, these options might yield better predictions when the proportion of one class is low (less than 10% in a sample of 2,000 observations). Not sure how to specify these terms... from the docs, we have: classwt: Priors
2008 Mar 09
1
sampsize in Random Forests
Hi all, I have a dataset where each point is assigned to a class A, B, C, or D. Each point is also assigned to a study site. Each study site is coded with a number ranging between 1-100. This information is stored in the vector studySites. I want to run randomForests using stratified sampling, so I chose the option strata = factor(studySites) But I am not sure how to control the number of
2006 Nov 13
1
random forest regression
Dear all, I am doing a regression in randomForest, using the option "sampsize" to reduce the number of records used to produce the randomForest object. The manual says "For classification, if sampsize is a vector of the length the number of strata, then sampling is stratified by strata, and the elements of sampsize indicate the numbers to be drawn from the strata". I need my
2010 Jul 20
1
Random Forest - Strata
Hi all, I have been struggling to get "strata" in randomForest to work for this. Can I make randomForest, for each of its trees, use ALL samples from some strata to build the tree, while leaving other strata TOTALLY untouched as OOB? E.g. in the example below, how can I tell RF, for tree 1 in the forest, to use only Site A and B to build the tree, while using the WHOLE Site C data for the oob error
2007 Aug 24
2
Variable Importance - Random Forest
Hello, I am trying to explore the use of random forests for classification and am uncertain about the interpretation of the importance measurements. When using the option "importance = T" in the randomForest call, the resulting 'importance' element matrix has four columns with the following headings: 0 - mean raw importance score of variable x for class 0 (where
2012 Mar 03
0
Strategies to deal with unbalanced classification data in randomForest
Hello all, I have become somewhat confused by the options available for dealing with a highly unbalanced data set (10000 in one class, 50 in the other). In summary, I am unsure: a) if I am performing the two class weighting methods properly, b) if the data are too unbalanced and whether this type of analysis is appropriate, and c) if there is any interaction between the weighting for class imbalances
2010 May 05
1
What is the default nPerm for regression in randomForest?
Could not find it in ?randomForest. Thank you for your help! -- Dimitri Liakhovitski Ninah.com Dimitri.Liakhovitski at ninah.com
2008 Sep 27
1
Variable Importance Measure in Package RandomForest
Hi, I've a question about the RandomForest package. The package allows the extraction of a variable importance measure. As far as I could see from the documentation, the computation is based on the Gini index. Do you know if this extraction can be also based on other criteria? In particular, I'm interested in the info gain criterion. Best regards, Chris --
2004 Jan 12
0
new version of randomForest (4.0-7)
Dear R users, I've just released a new version of randomForest (available on CRAN now). This version contains quite a number of new features and bug fixes compared to versions prior to 4.0-x (and a few more since 4.0-1). For those not familiar with randomForest, it's an ensemble classifier/regression tool. Please see http://www.math.usu.edu/~adele/forests/ for more detailed information,
2009 Sep 24
3
pipe data from plot(). was: ROCR.plot methods, cross validation averaging
All, I'm trying again with a slightly more generic version of my first question. I can extract the plotted values from hist(), boxplot(), and even plot.randomForest(). Observe: # get some data dat <- rnorm(100) # grab histogram data hdat <- hist(dat) hdat #provides details of the hist output #grab boxplot data bdat <- boxplot(dat) bdat #provides details of the boxplot
2004 Jul 08
0
randomForest 4.3-0 released
Dear all, Version 4.3-0 of the randomForest package is now available on CRAN (in source; binaries will follow in due course). There are some interface changes and a few new features, as well as bug fixes. For those who had used previous versions, the important things to note are: 1. there's a namespace now, and 2. some functions have been renamed. The list of changes since 4.0-7 (last