thr3ads.net - similar to: "How to use classwt parameter option in RandomForest"

Displaying 20 results from an estimated 2000 matches similar to: "How to use classwt parameter option in RandomForest"

2007 Jan 28

help with RandomForest classwt option

Hello there, I am working on an extremely unbalanced two-class classification problem. I want to use "classwt" together with "down sampling". By checking rfNews() in R, it looks like classwt is not working yet. Then I looked at the software from Salford, but I did not find the down sampling option. I am wondering if you have any experience dealing with this problem. Do you

2005 Oct 27

Repost: Examples of "classwt", "strata", and "sampsize" in randomForest?

Sorry for the repost, but I've really been looking, and can't find any syntax direction on this issue... Just browsing the documentation, and searching the list came up short... I have some unbalanced data and was wondering if, in a "0" v "1" classification forest, some combo of these options might yield better predictions when the proportion of one class is low (less

2005 Oct 27

Repost: Examples of "classwt", "strata", and "sampsize" in randomForest?

"classwt" in the current version of the randomForest package doesn't work too well. (It's what was in version 3.x of the original Fortran code by Breiman and Cutler, not the one in the new Fortran code.) I'd advise against using it. "sampsize" and "strata" can be used in conjunction. If "strata" is not specified, the class labels will be used.

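As a rough illustration of the advice above (this is not code from the thread), "sampsize" and "strata" might be combined as follows to down-sample the majority class within each tree. The simulated data and the names x, y and n_min are assumptions made only for this sketch; it also assumes the randomForest package is installed.

```r
# Minimal sketch: per-tree balanced down-sampling with sampsize/strata.
# The data below are simulated purely for illustration.
library(randomForest)

set.seed(1)
n <- 2000
x <- data.frame(a = rnorm(n), b = rnorm(n))
# Roughly 10% of the cases end up in the rare class "1"
y <- factor(ifelse(x$a + rnorm(n) > 1.8, "1", "0"), levels = c("0", "1"))

n_min <- min(table(y))  # number of cases in the rare class

rf <- randomForest(x = x, y = y,
                   ntree = 200,
                   strata = y,                # stratify each tree's sample by class
                   sampsize = rep(n_min, 2))  # one entry per class, in factor-level order

print(rf$confusion)  # out-of-bag confusion matrix with per-class error
```

Since "strata" defaults to the class labels, passing strata = y here is only for explicitness. Each tree sees the same number of cases from both classes, so the majority class is down-sampled per tree rather than once for the whole forest.
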
2005 Oct 25

Examples of "classwt", "strata", and "sampsize" in randomForest?

Just browsing the documentation, and searching the list came up short... I have some unbalanced data and was wondering if, in a "0" v "1" classification forest, these options might yield better predictions when the proportion of one class is low (less than 10% in a sample of 2,000 observations). Not sure how to specify these terms... from the docs, we have: classwt: Priors

2008 Feb 25

Running randomForests on large datasets

Hi, I am trying to run randomForests on a dataset of size 500000 x 650 and R throws a memory allocation error. Are there any better ways to deal with large datasets in R? For example, Splus had something like the bigData library. Thank you, Nagu

2008 Feb 25

To get more digits in precision of predict function of randomForests

Hi, I am using randomForests for a classification problem. The predict function in the randomForest library, when asked to return the probabilities, has precision of two digits after the decimal. I need at least four digits of precision for the predicted probabilities. How do I achieve this? Thank you, Nagu

2004 Jan 20

random forest question

Hi, here are three results of random forest (version 4.0-1). The results seem to be more or less the same which is strange because I changed the classwt. I hoped that for example classwt=c(0.45,0.1,0.45) would result in fewer cases classified as class 2. Did I understand something wrong? Christian x1rf <- randomForest(x=as.data.frame(mfilters[cvtrain,]),

2011 Sep 13

class weights with Random Forest

Hi All, I am looking for a reference that explains how the randomForest function in the randomForest package uses the classwt parameter. Here: http://tolstoy.newcastle.edu.au/R/e4/help/08/05/12088.html Andy Liaw suggests not using classwt. And according to: http://r.789695.n4.nabble.com/R-help-with-RandomForest-classwt-option-td817149.html it has "not been implemented" as of 2007.

2008 Mar 07

error in random forest

Hi, I get the following error when I try to predict the probabilities of a test sample: Error in predict.randomForest(fit.EBA.OM.rf.50, x.OM, type = "prob") : New factor levels not present in the training data I have about 630 predictor variables in the dataset x.OM (25 factor variables and the remaining are continuous variables). Any ideas on how to trace it? Thank you, Nagu

2010 May 05

What is the default nPerm for regression in randomForest?

Could not find it in ?randomForest. Thank you for your help! -- Dimitri Liakhovitski Ninah.com Dimitri.Liakhovitski at ninah.com

2012 Mar 03

Strategies to deal with unbalanced classification data in randomForest

Hello all, I have become somewhat confused by the options available for dealing with a highly unbalanced data set (10000 in one class, 50 in the other). As a summary, I am unsure: a) if I am performing the two class weighting methods properly, b) if the data are too unbalanced and whether this type of analysis is appropriate, and c) if there is any interaction between the weighting for class imbalances

2006 Apr 05

Multivariate linear regression

Hi, I am working on a multivariate linear regression of the form y = Ax. I am seeing a great dispersion of y w.r.t x. For example, the correlations between y and x are very small, even after using some typical transformations like log, power. I tried with simple linear regression, robust regression and ace and avas package in R (or splus). I didn't see an improvement in the fit and

2008 Mar 11

More digits in prediction using random forest object

I need to get more digits in predicting a test sample with a random forests object. Format or options(digits=) do nothing. Any ideas? Thank you, Nagu

2006 Jan 25

imbalanced classes

Hi Andy, I know this topic has been discussed before on R-help, but I was wondering if you could offer some advice specific to my application. I'm using the R random forest package to compare two classes of data, with the number of cases in each class relatively low: 28 in class 1 and 9 in class 2. I'd really like to use the R environment to analyze this data, however I'm finding it

2003 Aug 20

RandomForest

Hello, When I plot or look at the error rate vector for a random forest (rf$err.rate), it looks like a descending function, except that the first few points of the vector have error rates lower (sometimes much lower) than the level at which the error rates eventually stop descending for a forest with that number of trees. Does it mean that there is a tree(s) (that is built the first in

2007 Jun 06

Question on RandomForest in unsupervised mode

Hi, I attempted to run the randomForest() function on a dataset without predefined classes. According to the manual, running randomForest without a response variable/class labels should result in the function assuming you are running in unsupervised mode. In this case, I understand that my data is all assigned to one class whereas a second synthetic class is made up, which is assigned

2002 Sep 25

CART vs. Random Forest

According to Dr. Breiman, RF should be a more accurate method than a single tree. However, the performance of each method seems to depend on the proportion of the outcome variable in my case. My data set is a typical classification problem (predict bad guys). When I ran both of them with different proportions of the outcome variable (there's a criterion to measure the degree of bad behavior), I

2003 Dec 03

Error in randomForest.default(m, y, ...) : negative length vectors are not allowed

Christian -- You don't provide enough information (like a call) to answer this. I suspect, though, that you may be subsetting in a way that passes randomForest no data. I'm not aware offhand of an easy way to get this error from randomForest. I tried creating some data superficially similar to yours to see whether something would break if there were only a single value in the variable

2007 Feb 05

random forest proximities

Good Day, I'm using the randomForest package to perform a classification. If I supply weights to the optional classwt argument, are proximity values computed as a weighted average? I understand that the forest will possibly change as a function of the particular weights I supply. Thanks in advance. Mike Michael Fugate Los Alamos National Laboratory Mail Stop MS-F600, Los Alamos, NM

2004 May 12

Random Forest with highly imbalanced data

Hi group, I am trying to do a RF with approx 250,000 cases. My objective is to determine the risk factors of a person being readmitted to hospital (response=1) or else (response=0). Only 10%, or 25,000 cases were readmitted. I've heard about down-sampling and class weight approach and am wondering if R can do it. Even some reference to articles will help. From the statistical point
