Displaying 20 results from an estimated 11000 matches similar to: "Stratified Sampling with randomForest Regression"
2008 Mar 09
1
sampsize in Random Forests
Hi all,
I have a dataset where each point is assigned to a class A, B, C, or
D. Each point is also assigned to a study site. Each study site is
coded with a number ranging between 1-100. This information is stored
in the vector studySites.
I want to run randomForests using stratified sampling, so I chose the option
strata = factor(studySites)
But I am not sure how to control the number of
2005 Oct 27
1
Repost: Examples of "classwt", "strata", and "sampsize" in randomForest?
"classwt" in the current version of the randomForest package doesn't work
too well. (It's what was in version 3.x of the original Fortran code by
Breiman and Cutler, not the one in the new Fortran code.) I'd advise
against using it.
"sampsize" and "strata" can be used in conjunction. If "strata" is not
specified, the class labels will be used.
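As a minimal sketch of the point above (using "sampsize" together with "strata"), the call below draws a fixed number of cases from each stratum for every tree. The built-in iris data, the per-stratum count of 20, and ntree = 100 are assumptions chosen only for illustration; the sampsize entries are assumed to follow the order of the factor levels of strata.

```r
library(randomForest)

set.seed(1)

# iris has three classes of 50 rows each; the class labels serve as strata here.
rf <- randomForest(
  x        = iris[, 1:4],
  y        = iris$Species,
  strata   = iris$Species,    # stratification factor (illustrative choice)
  sampsize = c(20, 20, 20),   # cases drawn per stratum for each tree
  ntree    = 100
)

print(rf)
```

For a separate grouping variable such as a study-site code, the same pattern would presumably apply with strata set to that factor and sampsize given one entry per level.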
2013 Jan 10
1
SRS, Stratified, and Cluster sampling
Hi,
Has anyone done (or know of) any nice R activities that help introductory
students ( and teachers :) ) better understand the concepts of simple vs
stratified vs cluster sampling?
Any links?
David
2009 Jun 18
1
Stratified random sampling?
Rers:
What is the preferred library/function for doing stratified random
sampling from a dataset, given I want to control the number of samples
(rather than the proportion of samples) per stratum? Thanks!
--j
--
Jonathan A. Greenberg, PhD
Postdoctoral Scholar
Center for Spatial Technologies and Remote Sensing (CSTARS)
University of California, Davis
One Shields Avenue
The Barn, Room
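One way to approach the question in this post, drawing a fixed count rather than a proportion from each stratum, is sketched below in base R. The data frame df, its stratum column, and the counts in n_per_stratum are hypothetical names and values made up for illustration.

```r
set.seed(42)

# Hypothetical data: 60 rows in three strata of unequal size.
df <- data.frame(
  id      = 1:60,
  stratum = rep(c("A", "B", "C"), times = c(30, 20, 10))
)

# Desired number of sampled rows per stratum (assumed values).
n_per_stratum <- c(A = 5, B = 5, C = 3)

# For each stratum, pick row positions without replacement.
# Indexing with sample.int() avoids the sample() surprise when a
# stratum contains a single row.
idx <- unlist(lapply(names(n_per_stratum), function(s) {
  rows <- which(df$stratum == s)
  rows[sample.int(length(rows), n_per_stratum[[s]])]
}))

samp <- df[idx, ]
table(samp$stratum)
```

Because sampling is without replacement, each requested count has to be no larger than the size of its stratum.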
2006 Nov 13
1
random forest regression
Dear all,
I am doing a regression in randomForest, using the option "sampsize" to reduce
the number of records used to produce the randomForest object.
The manual says "For classification, if sampsize is a vector of the length
the number of strata, then sampling is stratified by strata, and the
elements of sampsize indicate the numbers to be drawn from the strata". I
need my
2004 Jul 08
0
randomForest 4.3-0 released
Dear all,
Version 4.3-0 of the randomForest package is now available on CRAN (in
source; binaries will follow in due course). There are some interface
changes and a few new features, as well as bug fixes. For those who had
used previous versions, the important things to note are: 1. there's a
namespace now, and 2. some functions have been renamed. The list of changes
since 4.0-7 (last
2007 Jan 28
2
help with RandomForest classwt option
Hello there,
I am working on an extremely unbalanced two-class classification problem. I
want to use "classwt" together with "down sampling". By checking the rfNews()
in R, it looks like classwt is not working yet. Then I looked at the
software from Salford. I did not find the down sampling option. I am
wondering if you have any experience to deal with this problem. Do you
2013 Apr 26
1
Stratified Random Sampling Proportional to Size
Hello R Experts,
I kindly request your assistance on figuring out how to get a stratified random sampling proportional to 100.
Below is my R code showing what I did and the error I'm getting with sampling::strata
# FIRST I summarized count of records by the two variables I want to use as strata
library(RODBC)
library(sqldf)
library(sampling)
#After establishing connection I query the data
2011 Mar 10
1
ANOVA for stratified cox regression
This is a follow-up to a query that was posted regarding some problems that
emerge when running anova analyses for cox models, posted by Matthias Gondan:
Matthias Gondan wrote:
> Dear List,
>
> I have tried a stratified Cox Regression, it is working fine, except for
> the "Anova"-Tests:
>
> Here the commands (should work out of the box):
>
2009 Mar 20
2
randomForest
Hi!
I am dealing with random forest using R.
Is there a way to sample a fixed number of rows from a dataset for use with
different trees in randomForest?
To be more clear, my data set contains 1500 rows, and I am growing 500 trees
in Random Forest
Is it possible to sample only 500 rows of data from the data set and use it
for different trees in the forest. I mean each tree of the forest should use
2010 Jul 14
1
randomForest outlier return NA
Dear R-users,
I have a problem with randomForest{outlier}.
After running the following code ( that produces a silly data set and builds
a model with randomForest ):
#######################
library(randomForest)
set.seed(0)
## build data set
X <- rbind( matrix( runif(n=400,min=-1,max=1), ncol = 10 ) ,
rep(1,times= 10 ) )
Y <- matrix( nrow = nrow(X), ncol = 1)
for( i in (1:nrow(X))){
2010 May 10
2
Installing randomForest on Ubuntu Errors
Hello,
I've tried to install randomForest on an Ubuntu 8.04 Hardy Heron system.
I've repeatedly rec'd the error:
> install.packages("randomForest", dependencies = TRUE)
ERROR: compilation failed for package 'randomForest'
** Removing '/home/admuser/R/i486-pc-linux-gnu-library/2.6/randomForest'
The downloaded packages are in
2012 Jan 25
1
Error in predict.randomForest ... subscript out of bounds with NULL name in X
RF trains fine with X, but fails on prediction
> library(randomForest)
> chirps <- c(20,16.0,19.8,18.4,17.1,15.5,14.7,17.1,15.4,16.2,15,17.2,16,17,14.1)
> temp <- c(88.6,71.6,93.3,84.3,80.6,75.2,69.7,82,69.4,83.3,78.6,82.6,80.6,83.5,76.3)
> X <- cbind(1,chirps)
> rf <- randomForest(X, temp)
> yp <- predict(rf, X)
Error in predict.randomForest(rf, X) : subscript
2011 Jan 20
1
randomForest: too many elements specified?
I am getting "Error in matrix(0, n, n) : too many elements specified"
while building a randomForest model, which looks like a memory allocation
error.
Software versions are: randomForest 4.5-25, R version 2.7.1
Dataset is big (~90K rows, ~200 columns), but this is on a big machine (
~120G RAM)
and I call randomForest like this: randomForest(x,y)
i.e. in supervised mode and not requesting
2005 Mar 25
3
Stratified bootstrap question
Dear experts,
I am asking for help with a question regarding a stratified bootstrap.
My dataset is a longitudinal dataset (3 measurements per person at year
1, 4 and 7) composed of multiple clinic centers and multiple participants
within each clinic. It has missing values.
I want to do a bootstrap to find the standard errors and confidence
intervals for my variance components. My model is a
2009 Apr 07
1
Concern with randomForest
Hi all,
When running randomForest using the following command:
forestplas=randomForest(Prev~.,data=plas,ntree=200000)
print(forestplas)
I get the following result:
Call:
randomForest(formula = Prev ~ ., data = plas, ntree = 2e+05,
importance = TRUE)
Type of random forest: regression
Number of trees: 2e+05
No. of variables tried at each split: 5
2008 Jul 20
1
confusion matrix in randomForest
I have a question on the output generated by randomForest in classification
mode, specifically, the confusion matrix. The confusion matrix lists the
various classes and how the forest classified each one, plus the
classification error. Are these numbers essentially averages over all the
trees in the forest? If so, is there a way I can get the standard deviation
values out of the randomForest,
2012 May 05
1
No Data in randomForest predict
I would like to ask a general question about the randomForest predict
function and how it handles No Data values. I understand that you can omit
No Data values while developing the randomForest object, but how does it
handle No Data in the prediction phase? I would like the output to be NA
if any (not just all) of the input data have an NA value. It is not clear
to me if this is the default or
2012 Nov 22
1
Partial dependence plot in randomForest package (all flat responses)
Hi,
I'm trying to make a partial plot with package randomForest in R. After I
build my random forest object I type
partialPlot(data.rforest, pred.data=act2, x.var=centroid, "C")
where data.rforest is my randomForest object, act2 is the original dataset,
centroid is one of the predictors and C is one of the classes in my response
variable.
Whatever predictor or response class I