similar to: How to use classwt parameter option in RandomForest

Displaying 20 results from an estimated 2000 matches similar to: "How to use classwt parameter option in RandomForest"

2007 Jan 28
2
help with RandomForest classwt option
Hello there, I am working on an extremely unbalanced two-class classification problem. I want to use "classwt" together with "down sampling". By checking the rfNews() in R, it looks like classwt is not working yet. Then I looked at the software from Salford. I did not find the down sampling option. I am wondering if you have any experience dealing with this problem. Do you
2005 Oct 27
1
Repost: Examples of "classwt", "strata", and "sampsize" in randomForest?
Sorry for the repost, but I've really been looking, and can't find any syntax direction on this issue... Just browsing the documentation, and searching the list came up short... I have some unbalanced data and was wondering if, in a "0" v "1" classification forest, some combo of these options might yield better predictions when the proportion of one class is low (less
2005 Oct 27
1
Repost: Examples of "classwt", "strata", and "sampsize" in randomForest?
"classwt" in the current version of the randomForest package doesn't work too well. (It's what was in version 3.x of the original Fortran code by Breiman and Cutler, not the one in the new Fortran code.) I'd advise against using it. "sampsize" and "strata" can be use in conjunction. If "strata" is not specified, the class labels will be used.
2005 Oct 25
0
Examples of "classwt", "strata", and "sampsize" in randomForest?
Just browsing the documentation, and searching the list came up short... I have some unbalanced data and was wondering if, in a "0" v "1" classification forest, these options might yield better predictions when the proportion of one class is low (less than 10% in a sample of 2,000 observations). Not sure how to specify these terms... from the docs, we have: classwt: Priors
2008 Feb 25
1
Running randomForests on large datasets
Hi, I am trying to run randomForests on a dataset of size 500000X650 and R reports a memory allocation error. Are there any better ways to deal with large datasets in R? For example, Splus had something like the bigData library. Thank you, Nagu
2008 Feb 25
1
To get more digits in precision of predict function of randomForests
Hi, I am using randomForests for a classification problem. The predict function in the randomForest library, when asked to return the probabilities, has precision of two digits after the decimal. I need at least four digits of precision for the predicted probabilities. How do I achieve this? Thank you, Nagu
2004 Jan 20
1
random forest question
Hi, here are three results of random forest (version 4.0-1). The results seem to be more or less the same which is strange because I changed the classwt. I hoped that for example classwt=c(0.45,0.1,0.45) would result in fewer cases classified as class 2. Did I understand something wrong? Christian x1rf <- randomForest(x=as.data.frame(mfilters[cvtrain,]),
2011 Sep 13
1
class weights with Random Forest
Hi All, I am looking for a reference that explains how the randomForest function in the randomForest package uses the classwt parameter. Here: http://tolstoy.newcastle.edu.au/R/e4/help/08/05/12088.html Andy Liaw suggests not using classwt. And according to: http://r.789695.n4.nabble.com/R-help-with-RandomForest-classwt-option-td817149.html it has "not been implemented" as of 2007.
2008 Mar 07
2
error in random forest
Hi, I get the following error when I try to predict the probabilities of a test sample: Error in predict.randomForest(fit.EBA.OM.rf.50, x.OM, type = "prob") : New factor levels not present in the training data I have about 630 predictor variables in the dataset x.OM (25 factor variables and the remaining are continuous variables). Any ideas on how to trace it? Thank you, Nagu
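One way to start tracing an error like this, offered as a sketch rather than as the answer given in the thread: compare the levels of each factor column in the training and test data and list any level that appears only in the test data. The data frames train and test below are made-up stand-ins for the poster's data, and only base R is used.

```r
# Hypothetical stand-ins for the training and test predictors.
train <- data.frame(f = factor(c("a", "b", "a")),
                    g = factor(c("x", "y", "x")),
                    v = c(1, 2, 3))
test  <- data.frame(f = factor(c("a", "c")),
                    g = factor(c("x", "y")),
                    v = c(4, 5))

# Names of the factor columns in the training data.
fac_cols <- names(train)[sapply(train, is.factor)]

# For each factor column, the levels present in test but absent from train.
new_levels <- lapply(setNames(fac_cols, fac_cols), function(nm) {
  setdiff(levels(test[[nm]]), levels(train[[nm]]))
})

# Keep only the columns that actually have unseen levels.
new_levels <- new_levels[lengths(new_levels) > 0]
print(new_levels)
```

With about 25 factor variables this narrows the search to the columns whose test levels were never seen in training. It does not cover every cause, for example a column that is a factor in one data frame and character in the other.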
2010 May 05
1
What is the default nPerm for regression in randomForest?
Could not find it in ?randomForest. Thank you for your help! -- Dimitri Liakhovitski Ninah.com Dimitri.Liakhovitski at ninah.com
2012 Mar 03
0
Strategies to deal with unbalanced classification data in randomForest
Hello all, I have become somewhat confused by the options available for dealing with a highly unbalanced data set (10000 in one class, 50 in the other). As a summary I am unsure: a) if I am performing the two class weighting methods properly, b) if the data are too unbalanced and that this type of analysis is appropriate and c) if there is any interaction between the weighting for class imbalances
2006 Apr 05
2
Multivariate linear regression
Hi, I am working on a multivariate linear regression of the form y = Ax. I am seeing a great dispersion of y w.r.t x. For example, the correlations between y and x are very small, even after using some typical transformations like log, power. I tried with simple linear regression, robust regression and ace and avas package in R (or splus). I didn't see an improvement in the fit and
2008 Mar 11
1
More digits in prediction using random forest object
I need to get more digits in predicting a test sample with a random forests object. Format or options(digits=) do nothing. Any ideas? Thank you, Nagu
2006 Jan 25
1
imbalanced classes
Hi Andy, I know this topic has been discussed before on the R-help, but I was wondering if you could offer some advice specific to my application. I'm using the R random forest package to compare two classes of data, the number of cases in each class relatively low, 28 in class 1 and 9 in class 2. I'd really like to use R environment to analyze this data, however I'm finding it
2003 Aug 20
2
RandomForest
Hello, When I plot or look at the error rate vector for a random forest (rf$err.rate) it looks like a descending function except for the first few points of the vector, which have error rate values lower (sometimes much lower) than the general level of error rates for a forest with such a number of trees when the error rates stop descending. Does it mean that there is a tree(s) (that is built the first in
2007 Jun 06
0
Question on RandomForest in unsupervised mode
Hi, I attempted to run the randomForest() function on a dataset without predefined classes. According to the manual, running randomForest without a response variable/class labels should result in the function assuming you are running in unsupervised mode. In this case, I understand that my data is all assigned to one class whereas a second synthetic class is made up, which is assigned
2002 Sep 25
5
CART vs. Random Forest
According to Dr. Breiman, the RF should be a more accurate method than a single tree. However, the performance of each method seems to depend on the proportion of the outcome variable in my case. My data set is a typical classification problem (predict bad guys). When I ran both of them with different proportions of outcome variables (there's a criterion to measure the degree of bad behavior), I
2003 Dec 03
1
Error in randomForest.default(m, y, ...) : negative length vectors are not allowed
Christian -- You don't provide enough information (like a call) to answer this. I suspect, though, that you may be subsetting in a way that passes randomForest no data. I'm not aware offhand of an easy way to get this error from randomForest. I tried creating some data superficially similar to yours to see whether something would break if there were only a single value in the variable
2007 Feb 05
0
random forest proximities
Good Day, I'm using the randomForest package to perform a classification. If I supply weights to the optional classwt argument are proximity values computed as a weighted average? I understand that the forest will possibly change as a function of the particular weights I supply. Thanks in advance. Mike Michael Fugate Los Alamos National Laboratory Mail Stop MS-F600, Los Alamos, NM
2004 May 12
1
Random Forest with highly imbalanced data
Hi group, I am trying to do a RF with approx 250,000 cases. My objective is to determine the risk factors of a person being readmitted to hospital (response=1) or else (response=0). Only 10%, or 25,000 cases were readmitted. I've heard about down-sampling and class weight approach and am wondering if R can do it. Even some reference to articles will help. From the statistical point