thr3ads.net - similar to: "randomForest and missing data"

Displaying 20 results from an estimated 5000 matches similar to: "randomForest and missing data"

2007 Jan 04

importing timestamp data into R

I have a set of timestamp data that I have in a text file that I would like to import into R for analysis. The timestamps are formatted as follows: DT_1,DT_2 [2006/08/10 21:12:14 ],[2006/08/10 21:54:00 ] [2006/08/10 20:42:00 ],[2006/08/10 22:48:00 ] [2006/08/10 20:58:00 ],[2006/08/10 21:39:00 ] [2006/08/04 12:15:24 ],[2006/08/04 12:20:00 ] [2006/08/04 12:02:00 ],[2006/08/04 14:20:00 ] I can get
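One way to read bracketed timestamps like those quoted above is sketched here in base R. It assumes the file holds one comma-separated pair per line under a `DT_1,DT_2` header, which the flattened snippet suggests but does not confirm. The helper `parse_stamp` and the UTC time zone are illustrative choices, not part of the original post.

```r
# Hypothetical sample mirroring the bracketed format quoted in the post.
lines <- c(
  "DT_1,DT_2",
  "[2006/08/10 21:12:14 ],[2006/08/10 21:54:00 ]",
  "[2006/08/10 20:42:00 ],[2006/08/10 22:48:00 ]"
)

# Read both columns as plain character strings first.
raw <- read.csv(text = lines, stringsAsFactors = FALSE)

# Strip the square brackets and stray whitespace, then parse the date-time.
# parse_stamp is an illustrative helper; the time zone is an assumption.
parse_stamp <- function(x) {
  cleaned <- trimws(gsub("\\[|\\]", "", x))
  as.POSIXct(cleaned, format = "%Y/%m/%d %H:%M:%S", tz = "UTC")
}

stamps <- data.frame(lapply(raw, parse_stamp))

# Elapsed minutes between the two columns.
elapsed_min <- as.numeric(difftime(stamps$DT_2, stamps$DT_1, units = "mins"))
```

For a real file, `read.csv("path/to/file.txt", stringsAsFactors = FALSE)` would replace the `text = lines` call; the path is a placeholder.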

2013 Jan 28

RandomForest and Missing Values

Dear All, I would like to use a randomForest algorithm on a dataset. The set is not particularly large/difficult to handle, but it has some missing values (both factors and numerical values). According to what I found https://stat.ethz.ch/pipermail/r-help/2005-September/078880.html https://stat.ethz.ch/pipermail/r-help/2007-January/123117.html the randomForest package has a problem with missing

2007 Aug 10

rfImpute

I am having trouble with the rfImpute function in the randomForest package. Here is a sample... clunk.roughfix<-na.roughfix(clunk) > > clunk.impute<-rfImpute(CONVERT~.,data=clunk) ntree OOB 1 2 300: 26.80% 3.83% 85.37% ntree OOB 1 2 300: 18.56% 5.74% 51.22% Error in randomForest.default(xf, y, ntree = ntree, ..., do.trace = ntree, : NA not

2011 Dec 02

Imputing data

So I have a very big matrix of about 900 by 400 and there are a couple of NA in the list. I have used the following functions to impute the missing data data(pc) pc.na<-pc pc.roughfix <- na.roughfix(pc.na) pc.narf <- randomForest(pc.na, na.action=na.roughfix) yet it does not replace the NA in the list. Presently I want to replace the NA with maybe the mean of the rows or columns or
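The column-mean replacement the poster asks about can be sketched in base R, without the randomForest package. The toy matrix `m` below is hypothetical and stands in for the 900 by 400 matrix mentioned in the post.

```r
# Hypothetical toy matrix with one NA in each column.
m <- matrix(c(1, NA, 3,
              4, 5, NA), nrow = 3)

# Column means computed over the observed (non-NA) entries only.
col_means <- colMeans(m, na.rm = TRUE)

# Row/column positions of every NA, as a two-column index matrix.
idx <- which(is.na(m), arr.ind = TRUE)

# Fill each NA with the mean of the column it sits in.
filled <- m
filled[idx] <- col_means[idx[, "col"]]
```

Swapping `colMeans` for `rowMeans` and `idx[, "col"]` for `idx[, "row"]` gives the row-mean variant.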

2003 Aug 26

rfImpute (for randomForest) crashed

In trying to execute this line in R (Version 1.7.1 (2003-06-16), under windows XP pro), with the randomForest library (about two weeks old) loaded, the program crashed: bost4rf <- rfImpute(TargetDensity~.,data=bost4rf0) Specifically, an XP dialog box popped up, saying "R for windows GUI front-end has encountered a problem and needs to close." That was the dialog saying asking whether I

2010 Jun 30

anyone know why package "RandomForest" na.roughfix is so slow??

Hi all, I am using the package "random forest" for random forest predictions. I like the package. However, I have fairly large data sets, and it can often take *hours* just to go through the "na.roughfix" call, which simply goes through and cleans up any NA values to either the median (numerical data) or the most frequent occurrence (factors). I am going to start
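The rule described in the post (median for numeric columns, most frequent level for factors) can be illustrated with a base-R sketch. This is not the package's own implementation; `rough_fix` and the toy data frame are hypothetical names used only for illustration.

```r
# Illustrative re-implementation of the median/mode rule described above.
# rough_fix is a hypothetical helper, not the randomForest function.
rough_fix <- function(df) {
  for (col in names(df)) {
    x <- df[[col]]
    if (!anyNA(x)) next                      # nothing to fill in this column
    if (is.numeric(x)) {
      x[is.na(x)] <- median(x, na.rm = TRUE) # numeric: column median
    } else if (is.factor(x)) {
      counts <- table(x)                     # table() ignores NA by default
      x[is.na(x)] <- names(counts)[which.max(counts)]  # factor: modal level
    } else {
      stop("column '", col, "' is neither numeric nor factor")
    }
    df[[col]] <- x
  }
  df
}

# Hypothetical toy data with one NA per column.
toy <- data.frame(
  size  = c(1, NA, 3, 10),
  color = factor(c("red", "blue", NA, "red"))
)
fixed <- rough_fix(toy)
```

Each column is handled with a single vectorized assignment, so the cost in this sketch grows with the number of columns rather than the number of missing cells.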

2003 Aug 05

na.action in randomForest --- Summary

A few days ago I asked whether there were options other than na.action=na.fail for the R port of Breiman's randomForest; the function's help page did not say anything about other options. I have since discovered that a pdf document called "The randomForest Package" and made available by Andy Liaw (who made the tool available in R---thank you) does discuss an option. It is an implementation of

2004 Mar 31

help with the usage of "randomForest"

Dear all, Can anybody give me a hint about the following error message I got when using randomForest? I have a two-class classification problem. The data file "sample" is:

----------------------------------------------------------
  udomain.edu udomain.hcs hpclass
1      1.0000           1     not
2          NA           2     not
3          NA         0.8     not
4          NA         0.2      hp
5          NA         0.9      hp
------------------------------------------------------------

The

2006 Mar 29

missing value replacement for test data in random forest

Hi, In R, how can I do missing value replacement for test data in random forest in the way Breiman described? Thanks in advance, iris

2012 Mar 26

NA in R package randomForest

I have a question regarding NA in randomForest (in R). I have a dataset which includes both numerical and non-numerical variables, and the data includes some NA. I tried to use na.roughfix but then I get an error message "na.roughfix only works for numeric or factor". I also tried rfImpute but this does not work either because I have some NA in my response variable. Does anyone have som

2005 Sep 12

[handling] Missing [values in randomForest]

Hi Jan-Paul, You definitely want to be careful with na.omit in randomForest -- that wipes out any row with even one NA. If NAs are sprawled throughout your dataset, na.omit might end up killing a lot of rows. Here's my usual MO for missing values: 1) "impute" in Hmisc fills in gaps with the mean, median, most common value, etc. 2) rfImpute: fits a forest on the rows available and
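The warning about na.omit can be seen on a small example. The data frame below is hypothetical: only one cell in five is missing, yet most rows are dropped because the NAs fall in different rows.

```r
# Hypothetical toy data: three NAs, each in a different row and column.
toy <- data.frame(
  a = c(1, NA, 3, 4, 5),
  b = c(NA, 2, 3, 4, 5),
  c = c(1, 2, 3, NA, 5)
)

# Share of individual cells that are missing.
cell_na <- mean(is.na(toy))

# na.omit drops every row that contains at least one NA.
kept <- na.omit(toy)

# Share of rows lost to listwise deletion.
row_loss <- 1 - nrow(kept) / nrow(toy)
```

Comparing `cell_na` with `row_loss` before fitting gives a quick sense of whether dropping rows is affordable or whether imputation is worth the effort.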

2009 May 08

Error while using rfImpute

Dear Administrator, I am using Linux (SuSE 10.2). While attempting rfImpute, I am getting the following error message: > Members <- rfImpute(Status ~ ., data = Members) Error in .C("classRF", x = x, xdim = as.integer(c(p, n)), y = as.integer(y), : C symbol name "classRF" not in DLL for package "randomForest". I need help sorting out the above error.

2010 Feb 28

Gradient Boosting Trees with correlated predictors in gbm

Dear R users, I’m trying to understand how correlated predictors impact the Relative Importance measure in Stochastic Boosting Trees (J. Friedman). As Friedman described, “…with single decision trees (referring to Breiman’s CART algorithm), the relative importance measure is augmented by a strategy involving surrogate splits intended to uncover the masking of influential variables by others

2008 May 05

Problems using rfImpute

Hello R-user! I am running R 2.7.0 on a Power Book (Tiger). (I am still an R and statistics beginner.) I tried rfImpute (randomForest) and, as far as I understand, it should replace NAs using a proximity matrix: > set.seed(100000) > Subset5Imputed<-rfImpute(Sex~., data=Subset5) ntree OOB 1 2 300: 11.78% 12.36% 11.21% ntree OOB 1 2 300: 12.07% 12.64%

2001 Aug 02

Missing value in Rpart

Hi all, Our understanding of how classification trees in Rpart treat missing values is that if the variable is ordinal (continuous), Rpart, by default, imputes a value for missing. How do we build the classification tree and tell Rpart not to impute? That is, what command is used to turn off the imputation? Also, if we do get true missing, how does classification tree analysis in Rpart treat missing when

2009 Mar 11

problem with rfImpute (package randomForest)

Hello everybody, this is my first request about R, so I am sorry if I have sent it to the wrong address or if I am not very clear. My problem is about the use of rfImpute from the randomForest package. I am interested in imputation of missing values and I read that randomForest can do it. So I wrote the following code: set.seed(100); library(mlbench) library(randomForest) data(BreastCancer)

2004 Dec 10

predict.randomForest

I have a data.frame with a series of variables tagged to a binary response ('present'/'absent'). I am trying to use randomForest to predict present/absent in a second dataset. After a lot of fiddling (using two data frames, making sure data types are the same, lots of testing with data that works such as data(iris)) I've settled on combining all my data into one data.frame

2004 Jan 07

Questions on RandomForest

Hi everyone, Many thanks to Andy and Matthew for their help with my earlier questions. By sampling only a small segment of an image, I can now successfully segment the image into several classes with RandomForest. I still have some questions about it: 1. What is the internal component classifier in RandomForest? Is it the CART implemented in the rpart package? 2. I use training samples to predict new samples. But

2004 Jul 26

installing problems repeated.tgz linux

Hi, I tried several possibilities and looked in the archive, but did not succeed in installing J. Lindsey's useful "library repeated" on my Linux machine (SuSE 9.0 with kernel 2.6.7, R 1.9.1). P.S. On Windows it works fine. Many thanks for help, Christian chris at linux:/space/downs> R CMD INSTALL - l /usr/lib/R/library repeated WARNING: invalid package '-' WARNING:

2011 Sep 07

randomForest memory footprint

Hello, I am attempting to train a random forest model using the randomForest package on 500,000 rows and 8 columns (7 predictors, 1 response). The data set is the first block of data from the UCI Machine Learning Repo dataset "Record Linkage Comparison Patterns" with the slight modification that I dropped two columns with lots of NA's and I used knn imputation to fill in other gaps.
