thr3ads.net - similar to: "sampsize in Random Forests"

Displaying 20 results from an estimated 2000 matches similar to: "sampsize in Random Forests"

2006 Nov 13

random forest regression

Dear all, I am doing a regression in randomForest, using the option "sampsize" to reduce the number of records used to produce the randomForest object. The manual says "For classification, if sampsize is a vector of the length the number of strata, then sampling is stratified by strata, and the elements of sampsize indicate the numbers to be drawn from the strata". I need my

2005 Oct 27

Repost: Examples of "classwt", "strata", and "sampsize" in randomForest?

"classwt" in the current version of the randomForest package doesn't work too well. (It's what was in version 3.x of the original Fortran code by Breiman and Cutler, not the one in the new Fortran code.) I'd advise against using it. "sampsize" and "strata" can be used in conjunction. If "strata" is not specified, the class labels will be used.

2005 Oct 27

Repost: Examples of "classwt", "strata", and "sampsize" in randomForest?

Sorry for the repost, but I've really been looking, and can't find any syntax direction on this issue... Just browsing the documentation, and searching the list came up short... I have some unbalanced data and was wondering if, in a "0" v "1" classification forest, some combo of these options might yield better predictions when the proportion of one class is low (less
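As a rough sketch of the syntax asked about in this thread: the reply quoted above says "sampsize" and "strata" can be used together, and that the class labels are used when "strata" is not given. The example below assumes the randomForest package is installed. The data frame dat, its columns x1, x2 and y, and the per-class sample size of 40 are all hypothetical, invented purely for illustration.

```r
library(randomForest)  # assumption: the randomForest package is installed

set.seed(42)

# Hypothetical imbalanced two-class data: 950 cases of "0", 50 cases of "1".
n0 <- 950
n1 <- 50
dat <- data.frame(
  x1 = c(rnorm(n0, mean = 0), rnorm(n1, mean = 1.5)),
  x2 = c(rnorm(n0, mean = 0), rnorm(n1, mean = 1.5)),
  y  = factor(c(rep("0", n0), rep("1", n1)))
)

# For every tree, draw 40 cases from each class, so the majority class
# is down-sampled. The elements of sampsize follow the order of the
# factor levels of strata. strata is passed explicitly here for clarity.
fit <- randomForest(y ~ ., data = dat, ntree = 200,
                    strata = dat$y, sampsize = c(40, 40))

# Out-of-bag confusion matrix, with per-class error in the last column.
print(fit$confusion)
```

Whether balanced per-class draws like this actually improve predictions for the rare class depends on the data, so the out-of-bag confusion matrix is worth comparing against a fit without "sampsize".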

2007 Dec 18

Random forests

Dear all, I would like to use a tree regression method to analyze my dataset. I am interested in the fact that random forests creates in-bag and out-of-bag datasets, but I also need an estimate of support for each split. That seems hard to do in random forests since each tree is grown using a subset of the predictor variables. I was thinking of setting mtry = number of predictor variables,

2007 Jan 29

comparing random forests and classification trees

Hi, I have done an analysis using 'rpart' to construct a classification tree. I want to retain the output in tree form so that it is easily interpretable. However, I also want to compare the 'accuracy' of the tree to a Random Forest to estimate how much predictive ability is lost by using one simple tree. My understanding is that the error automatically displayed by the two

2013 Feb 13

CARET: Any way to access other tuning parameters?

The documentation for caret::train shows a list of parameters that one can tune for each classification/regression method. For example, for the method randomForest one can tune mtry in the call to train. But the function call to train random forests in the original package has many other parameters, e.g. sampsize, maxnodes, etc. Is there **any** way to access these parameters using train

2011 Sep 13

class weights with Random Forest

Hi All, I am looking for a reference that explains how the randomForest function in the randomForest package uses the classwt parameter. Here: http://tolstoy.newcastle.edu.au/R/e4/help/08/05/12088.html Andy Liaw suggests not using classwt. And according to: http://r.789695.n4.nabble.com/R-help-with-RandomForest-classwt-option-td817149.html it has "not been implemented" as of 2007.

2006 Jan 25

imbalanced classes

Hi Andy, I know this topic has been discussed before on R-help, but I was wondering if you could offer some advice specific to my application. I'm using the R random forest package to compare two classes of data. The number of cases in each class is relatively low: 28 in class 1 and 9 in class 2. I'd really like to use the R environment to analyze this data, however I'm finding it

2010 Jul 20

Random Forest - Strata

Hi all, I have struggled to get "strata" in randomForest to work for this. Can I make randomForest, for each of its trees, use ALL samples from some strata to build the tree, while leaving other strata TOTALLY untouched as OOB? E.g. in the example below, how can I tell RF, for tree 1 in the forest, to use only Site A and B to build the tree, while using the WHOLE Site C data for the oob error

2005 Oct 25

Examples of "classwt", "strata", and "sampsize" in randomForest?

Just browsing the documentation, and searching the list came up short... I have some unbalanced data and was wondering if, in a "0" v "1" classification forest, these options might yield better predictions when the proportion of one class is low (less than 10% in a sample of 2,000 observations). Not sure how to specify these terms... from the docs, we have: classwt: Priors

2009 Sep 24

pipe data from plot(). was: ROCR.plot methods, cross validation averaging

All, I'm trying again with a slightly more generic version of my first question. I can extract the plotted values from hist(), boxplot(), and even plot.randomForest(). Observe:
# get some data
dat <- rnorm(100)
# grab histogram data
hdat <- hist(dat)
hdat # provides details of the hist output
# grab boxplot data
bdat <- boxplot(dat)
bdat # provides details of the boxplot

2011 Dec 15

Random Forest Reading N/A's, I don't see them

After checking the original data in Excel for blanks and running summary(cm3) to identify any null values in my data, I'm unable to identify any instances. Yet when I attempted to use the data in Random Forest, I get the following error. Is there something that Random Forest is reading as null which is not actually null? Is there a better way to check for this? > library(randomForest) >

2009 Mar 20

randomForest

Hi! I am working with random forests in R. Is there a way to sample a fixed number of rows from a dataset for use with different trees in a random forest? To be more clear, my data set contains 1500 rows, and I am growing 500 trees. Is it possible to sample only 500 rows of data from the data set and use them for different trees in the forest? I mean each tree of the forest should use

2007 Jan 28

help with RandomForest classwt option

Hello there, I am working on an extremely unbalanced two-class classification problem. I want to use "classwt" together with "down sampling". From checking rfNews() in R, it looks like classwt is not working yet. Then I looked at the software from Salford. I did not find the down sampling option. I am wondering if you have any experience dealing with this problem. Do you

2012 May 05

No Data in randomForest predict

I would like to ask a general question about the randomForest predict function and how it handles No Data values. I understand that you can omit No Data values while developing the randomForest object, but how does it handle No Data in the prediction phase? I would like the output to be NA if any (not just all) of the input data have an NA value. It is not clear to me if this is the default or

2012 Dec 03

How do I make R randomForest model size smaller?

I've been training randomForest models on 7 million rows of data (41 features). Here's an example call: myModel <- randomForest(RESPONSE~., data=mydata, ntree=50, maxnodes=30) I thought surely with only 50 trees and 30 terminal nodes that the memory footprint of "myModel" would be small. But it's 65 megs in a dump file. The object seems to be holding all sorts of

2011 Mar 07

use "caret" to rank predictors by random forest model

Hi, I'm using the package "caret" to rank predictors using a random forest model and draw a predictor importance plot. I used the commands below: rf.fit<-randomForest(x,y,ntree=500,importance=TRUE) ## "x" is a matrix whose columns are predictors, "y" is a binary response vector ## Then I got the ranked predictors by ranking

2005 Sep 08

Re-evaluating the tree in the random forest

Dear mailing list members, I was wondering if there was a way to re-evaluate the instances of a tree (in the forest) again after I have manually changed a splitpoint (or split variable) of a decision node. Here's an illustration:
library("randomForest")
forest.rf <- randomForest(formula = Species ~ ., data = iris, do.trace = TRUE, ntree = 3, mtry = 2, norm.votes = FALSE)
# I am

2004 Jan 20

random forest question

Hi, here are three results of random forest (version 4.0-1). The results seem to be more or less the same which is strange because I changed the classwt. I hoped that for example classwt=c(0.45,0.1,0.45) would result in fewer cases classified as class 2. Did I understand something wrong? Christian x1rf <- randomForest(x=as.data.frame(mfilters[cvtrain,]),

2010 Dec 11

randomForest: help with combine() function

I've built two RF objects (RF1 and RF2) and have tried to combine them, but I get the following error: Error in rf$votes + ifelse(is.na(rflist[[i]]$votes), 0, rflist[[i]]$votes) : non-conformable arrays In addition: Warning message: In rf$oob.times + rflist[[i]]$oob.times : longer object length is not a multiple of shorter object length Both RF models use the same variables, although
