similar to: Stratified Sampling with randomForest Regression

Displaying 20 results from an estimated 11000 matches similar to: "Stratified Sampling with randomForest Regression"

2008 Mar 09
1
sampsize in Random Forests
Hi all, I have a dataset where each point is assigned to a class A, B, C, or D. Each point is also assigned to a study site. Each study site is coded with a number ranging between 1-100. This information is stored in the vector studySites. I want to run randomForest using stratified sampling, so I chose the option strata = factor(studySites). But I am not sure how to control the number of
2005 Oct 27
1
Repost: Examples of "classwt", "strata", and "sampsize" in randomForest?
"classwt" in the current version of the randomForest package doesn't work too well. (It's what was in version 3.x of the original Fortran code by Breiman and Cutler, not the one in the new Fortran code.) I'd advise against using it. "sampsize" and "strata" can be used in conjunction. If "strata" is not specified, the class labels will be used.
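A minimal sketch of the "sampsize" plus "strata" combination described in the snippet above, assuming the randomForest package is installed. The iris data set and the per-class counts are illustrative choices, not taken from the original post. strata is passed explicitly here, although for classification the class labels would be used anyway.

```r
library(randomForest)

set.seed(42)  # make the sketch reproducible

# iris: 150 rows, three classes of 50 rows each (illustrative data only)
rf <- randomForest(
  x = iris[, 1:4],
  y = iris$Species,
  strata = iris$Species,     # stratify the per-tree sample by class
  sampsize = c(20, 20, 20),  # rows drawn from each stratum for every tree
  ntree = 100
)

print(rf)
```

With these settings each tree is grown on 60 rows, 20 from each class, rather than on a bootstrap sample of all 150 rows.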
2013 Jan 10
1
SRS, Stratified, and Cluster sampling
Hi, Has anyone done (or know of) any nice R activities that help introductory students ( and teachers :) ) better understand the concepts of simple vs stratified vs cluster sampling? Any links? David -- View this message in context: http://r.789695.n4.nabble.com/SRS-Stratified-and-Cluster-sampling-tp4655099.html Sent from the R help mailing list archive at Nabble.com.
2009 Jun 18
1
Stratified random sampling?
Rers: What is the preferred library/function for doing stratified random sampling from a dataset, given I want to control the number of samples (rather than the proportion of samples) per stratum? Thanks! --j -- Jonathan A. Greenberg, PhD Postdoctoral Scholar Center for Spatial Technologies and Remote Sensing (CSTARS) University of California, Davis One Shields Avenue The Barn, Room
2006 Nov 13
1
random forest regression
Dear all, I am doing a regression in randomForest, using the option "sampsize" to reduce the number of records used to produce the randomForest object. The manual says "For classification, if sampsize is a vector of the length the number of strata, then sampling is stratified by strata, and the elements of sampsize indicate the numbers to be drawn from the strata". I need my
2004 Jul 08
0
randomForest 4.3-0 released
Dear all, Version 4.3-0 of the randomForest package is now available on CRAN (in source; binaries will follow in due course). There are some interface changes and a few new features, as well as bug fixes. For those who had used previous versions, the important things to note are: 1. there's a namespace now, and 2. some functions have been renamed. The list of changes since 4.0-7 (last
2007 Jan 28
2
help with RandomForest classwt option
Hello there, I am working on an extremely unbalanced two-class classification problem. I want to use "classwt" together with "down sampling". Checking rfNews() in R, it looks like classwt is not working yet. Then I looked at the software from Salford. I did not find the down sampling option. I am wondering if you have any experience dealing with this problem. Do you
2013 Apr 26
1
Stratified Random Sampling Proportional to Size
Hello R Experts, I kindly request your assistance in figuring out how to get a stratified random sample proportional to 100. Below is my R code showing what I did and the error I'm getting with sampling::strata # FIRST I summarized count of records by the two variables I want to use as strata library(RODBC) library(sqldf) library(sampling) #After establishing connection I query the data
2011 Mar 10
1
ANOVA for stratified cox regression
This is a follow-up to a query that was posted regarding some problems that emerge when running anova analyses for cox models, posted by Mathias Gondan: Matthias Gondan wrote: > Dear List, > > I have tried a stratified Cox Regression, it is working fine, except for > the "Anova"-Tests: > > Here the commands (should work out of the box): > >
2009 Mar 20
2
randomForest
Hi! I am working with random forests in R. Is there a way to sample a fixed number of rows from a dataset for use with the different trees in a random forest? To be more clear, my data set contains 1500 rows, and I am growing 500 trees in the random forest. Is it possible to sample only 500 rows of data from the data set and use them for the different trees in the forest? I mean each tree of the forest should use
2010 Jul 14
1
randomForest outlier return NA
Dear R-users, I have a problem with randomForest{outlier}. After running the following code ( that produces a silly data set and builds a model with randomForest ): ####################### library(randomForest) set.seed(0) ## build data set X <- rbind( matrix( runif(n=400,min=-1,max=1), ncol = 10 ) , rep(1,times= 10 ) ) Y <- matrix( nrow = nrow(X), ncol = 1) for( i in (1:nrow(X))){
2010 May 10
2
Installing randomForest on Ubuntu Errors
Hello, I've tried to install randomForest on an Ubuntu 8.04 Hardy Heron system. I've repeatedly received the error: > install.packages("randomForest", dependencies = TRUE) ERROR: compilation failed for package 'randomForest' ** Removing '/home/admuser/R/i486-pc-linux-gnu-library/2.6/randomForest' The downloaded packages are in
2012 Jan 25
1
Error in predict.randomForest ... subscript out of bounds with NULL name in X
RF trains fine with X, but fails on prediction > library(randomForest) > chirps <- c(20,16.0,19.8,18.4,17.1,15.5,14.7,17.1,15.4,16.2,15,17.2,16,17,14.1) > temp <- c(88.6,71.6,93.3,84.3,80.6,75.2,69.7,82,69.4,83.3,78.6,82.6,80.6,83.5,76.3) > X <- cbind(1,chirps) > rf <- randomForest(X, temp) > yp <- predict(rf, X) Error in predict.randomForest(rf, X) : subscript
2011 Jan 20
1
randomForest: too many elements specified?
I am getting "Error in matrix(0, n, n) : too many elements specified" while building a randomForest model, which looks like a memory allocation error. Software versions are: randomForest 4.5-25, R version 2.7.1. The dataset is big (~90K rows, ~200 columns), but this is on a big machine (~120G RAM) and I call randomForest like this: randomForest(x,y) i.e. in supervised mode and not requesting
2005 Mar 25
3
Stratified bootstrap question
Dear experts, I am asking for help with a question regarding stratified bootstrap. My dataset is a longitudinal dataset (3 measurements per person at year 1, 4 and 7) composed of multiple clinic centers and multiple participants within each clinic. It has missing values. I want to do a bootstrap to find the standard errors and confidence intervals for my variance components. My model is a
2009 Apr 07
1
Concern with randomForest
Hi all, When running a randomForest run using the following command: forestplas=randomForest(Prev~.,data=plas,ntree=200000) print(forestplas) I get the following result: Call: randomForest(formula = Prev ~ ., data = plas, ntree = 2e+05, importance = TRUE) Type of random forest: regression Number of trees: 2e+05 No. of variables tried at each split: 5
2008 Jul 20
1
confusion matrix in randomForest
I have a question on the output generated by randomForest in classification mode, specifically, the confusion matrix. The confusion matrix lists the various classes and how the forest classified each one, plus the classification error. Are these numbers essentially averages over all the trees in the forest? If so, is there a way I can get the standard deviation values out of the randomForest,
2012 May 05
1
No Data in randomForest predict
I would like to ask a general question about the randomForest predict function and how it handles No Data values. I understand that you can omit No Data values while developing the randomForest object, but how does it handle No Data in the prediction phase? I would like the output to be NA if any (not just all) of the input data have an NA value. It is not clear to me if this is the default or
2012 Nov 22
1
Partial dependence plot in randomForest package (all flat responses)
Hi, I'm trying to make a partial plot with package randomForest in R. After I create my random forest object, I type partialPlot(data.rforest, pred.data=act2, x.var=centroid, "C") where data.rforest is my randomForest object, act2 is the original dataset, centroid is one of the predictors and C is one of the classes in my response variable. Whatever predictor or response class I