similar to: Regarding variable importance in the randomForest package

Displaying 20 results from an estimated 2000 matches similar to: "Regarding variable importance in the randomForest package"

2013 Oct 15
1
randomForest: Numeric deviation between 32/64 Windows builds
Dear R Developers, I'm using the great randomForest package (4.6-7) for many projects and recently stumbled upon a problem when I wrote unit tests for one of my projects: on Windows, there are small numeric deviations between the 32-bit and 64-bit versions of R, which doesn't seem to be a problem on Linux or Mac. R64 on Windows produces the same results as R64/R32 on Linux or Mac: >
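A minimal sketch (not from the post) of how such cross-build checks can be made tolerant of small floating-point differences; the stored reference object pred_ref is hypothetical:

library(randomForest)
set.seed(42)
rf <- randomForest(Species ~ ., data = iris, ntree = 100)
pred <- predict(rf, iris, type = "prob")
# pred_ref <- readRDS("pred_reference.rds")            # saved from the other build
# isTRUE(all.equal(pred, pred_ref, tolerance = 1e-8))  # tolerate tiny 32/64-bit drift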
2010 Apr 29
1
variable importance in Random Forest
Hi, Dear Andy, I ran randomForest in R and got the following results for variable importance: What is the meaning of MeanDecreaseAccuracy and MeanDecreaseGini? I found they are raw values; they are not scaled to 1, right? Which column is most similar to the variable rel.influence in boosting? Thanks so much! > fit$importance 0 1
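For readers landing here: MeanDecreaseAccuracy is the permutation (out-of-bag) importance and MeanDecreaseGini is the total decrease in node impurity from splits on that variable; neither is scaled to sum to 1. A minimal sketch on built-in data:

library(randomForest)
set.seed(1)
rf <- randomForest(Species ~ ., data = iris, importance = TRUE, ntree = 500)
rf$importance                 # raw values, as printed in the post above
importance(rf, type = 1)      # MeanDecreaseAccuracy, scaled by its standard error
importance(rf, type = 2)      # MeanDecreaseGini (never scaled)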
2005 Mar 23
1
Gini's Importance Value Variable = Inf
Hi All, In the script below, the importance measure for column 4 (i.e. MeanDecreaseGini) indicated "Inf" for V7. Running the getTree command showed that "V7" had been selected at least twice in one of the trees of the random forest, so the "Inf" value was not generated as a result of dividing the sum of the decreases by 0. Any suggestions on what may be causing the
2007 Aug 24
2
Variable Importance - Random Forest
Hello, I am trying to explore the use of random forests for classification and am uncertain about the interpretation of the importance measurements. With the option "importance = T" in the randomForest call, the resulting 'importance' element matrix has four columns with the following headings: 0 - mean raw importance score of variable x for class 0 (where
2010 May 04
1
randomforests - how to classify
Hi, I'm experimenting with random forests and want to perform a binary classification task. I've tried some of the sample code in the help files and things run, but I get a message to the effect of 'you don't have very many unique values in the target - are you sure you want to do regression?' (sorry, I don't know the exact message, but R is busy now so I can't check). In
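The usual cause of that message is a numeric 0/1 response, which randomForest treats as regression; a minimal sketch (made-up data and column names) of coercing the target to a factor to get classification:

library(randomForest)
set.seed(1)
dat <- data.frame(x1 = rnorm(100), x2 = rnorm(100))
dat$y <- factor(ifelse(dat$x1 + dat$x2 > 0, "present", "absent"))  # factor, not 0/1 numeric
rf <- randomForest(y ~ ., data = dat, ntree = 500)
rf$type                       # "classification"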
2004 Dec 10
1
predict.randomForest
I have a data.frame with a series of variables tagged to a binary response ('present'/'absent'). I am trying to use randomForest to predict present/absent in a second dataset. After a lot of fiddling (using two data frames, making sure data types are the same, lots of testing with data that works such as data(iris)) I've settled on combining all my data into one data.frame
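A frequent stumbling block with two data sets is factor predictors whose levels differ between training and prediction data; a minimal sketch (hypothetical columns) of aligning them before predict():

library(randomForest)
set.seed(1)
train <- data.frame(soil = factor(sample(c("clay", "sand", "loam"), 200, replace = TRUE)),
                    elev = runif(200))
train$presence <- factor(ifelse(train$elev > 0.5, "present", "absent"))
newdat <- data.frame(soil = c("clay", "sand"), elev = c(0.2, 0.9))
newdat$soil <- factor(newdat$soil, levels = levels(train$soil))  # same levels as training data
rf <- randomForest(presence ~ ., data = train, ntree = 200)
predict(rf, newdata = newdat)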
2003 Apr 21
2
randomForest crash?
I am attempting to use randomForest to look for interesting genes in microarray data with 216 genes, 2 classes and 52 samples. My data.frame is 52x217, with the last column, V217, being the class (1 or 2). When I try lung.rf <- randomForest(V217 ~ ., data=tlSA216cda, importance= TRUE, proximity = TRUE) the GUI crashes. I am running R-1.6.2 under Windows 98, and most
2008 Mar 09
1
sampsize in Random Forests
Hi all, I have a dataset where each point is assigned to a class A, B, C, or D. Each point is also assigned to a study site. Each study site is coded with a number ranging from 1 to 100. This information is stored in the vector studySites. I want to run randomForest using stratified sampling, so I chose the option strata = factor(studySites). But I am not sure how to control the number of
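When strata is supplied, sampsize can be a vector with one entry per stratum giving the number of points drawn from each; a minimal sketch with made-up data:

library(randomForest)
set.seed(1)
n <- 500
dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n),
                  class = factor(sample(c("A", "B", "C", "D"), n, replace = TRUE)))
studySites <- factor(sample(1:10, n, replace = TRUE))      # hypothetical site labels
rf <- randomForest(x = dat[, c("x1", "x2")], y = dat$class,
                   strata = studySites,
                   sampsize = rep(5, nlevels(studySites)),  # 5 points from each site per tree
                   ntree = 500)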
2005 Mar 22
2
Error: Can not handle categorical predictors with more than 32 categories.
Hi All, My question is in regards to an error generated when using randomForest in R. Is there a special way to format the data in order to avoid this error, or am I completely confused on what the error implies? "Error in randomForest.default(m, y, ...) : Can not handle categorical predictors with more than 32 categories." This is generated from the command line: >
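The limit applies per factor predictor; one common workaround (a sketch with a hypothetical 60-level factor, not the poster's data) is to collapse rare levels into an "other" category before fitting:

library(randomForest)
set.seed(1)
x <- data.frame(store = factor(sample(paste0("S", 1:60), 1000, replace = TRUE)),
                sales = rnorm(1000))
keep <- names(sort(table(x$store), decreasing = TRUE))[1:31]   # keep the 31 most frequent levels
x$store <- factor(ifelse(as.character(x$store) %in% keep, as.character(x$store), "other"))
nlevels(x$store)                                               # now at most 32
rf <- randomForest(sales ~ store, data = x, ntree = 200)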
2008 Jun 15
1
randomForest, 'No forest component...' error while calling Predict()
Dear R-users, While making a prediction using the randomForest function (package randomForest) I'm getting the following error message: "Error in predict.randomForest(model, newdata = CV) : No forest component in the object" Here's my complete code. For reproducing this task, please find my 2 data sets attached (http://www.nabble.com/file/p17855119/data.rar).
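That error appears when the fitted object was built with keep.forest = FALSE, which is the default whenever xtest/ytest are supplied; a minimal sketch on built-in data:

library(randomForest)
set.seed(1)
idx <- sample(nrow(iris), 100)
model <- randomForest(Species ~ ., data = iris[idx, ],
                      xtest = iris[-idx, 1:4], ytest = iris[-idx, 5],
                      keep.forest = TRUE)        # without this, predict() has no forest to use
predict(model, newdata = iris[-idx, ])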
2011 Feb 15
1
[slightly OT] predict.randomForest and type="prob"
Dear all , I would like to use the function randomForest to predict the probability of relocation failure of a GPS collar as a function of several environmental variables x (both factor and numeric: slope, vegetation, etc.) on a given area. The response variable y is thus success (0)/failure(1) of the relocation, and the sampling unit is the pixel of a raster map. My aim is to build a map
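A minimal sketch (made-up data, not the poster's raster) of getting per-class probabilities from a classification forest; the column named "1" holds the predicted probability of failure:

library(randomForest)
set.seed(1)
dat <- data.frame(slope = runif(300),
                  veg = factor(sample(c("forest", "open"), 300, replace = TRUE)))
dat$fail <- factor(ifelse(dat$slope + rnorm(300, sd = 0.2) > 0.6, 1, 0))
rf <- randomForest(fail ~ slope + veg, data = dat, ntree = 500)
p <- predict(rf, newdata = dat, type = "prob")
head(p[, "1"])                # probability of relocation failure, one value per pixel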
2005 Aug 15
2
randomForest Error passing string argument
I'm attempting to pass a string argument into the function randomForest but I get an error: state <- paste(list("fruit ~", "apples+oranges+blueberries", "data=fruits.data, mtry=2, do.trace=100, na.action=na.omit, keep.forest=TRUE"), sep= " ", collapse="") model.rf <- randomForest(state) Error in if (n==0) stop ("data(x) has 0
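A sketch of one way around this (the hypothetical fruits.data below stands in for the poster's data): build only the formula from a string with as.formula() and pass the remaining arguments to randomForest() directly rather than inside the string:

library(randomForest)
set.seed(1)
fruits.data <- data.frame(fruit = factor(sample(c("good", "bad"), 100, replace = TRUE)),
                          apples = rnorm(100), oranges = rnorm(100), blueberries = rnorm(100))
fml <- as.formula(paste("fruit ~", "apples + oranges + blueberries"))
model.rf <- randomForest(fml, data = fruits.data, mtry = 2, do.trace = 100,
                         na.action = na.omit, keep.forest = TRUE)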
2005 Jul 21
4
RandomForest question
Hello, I'm trying to find out the optimal value of mtry (the number of variables tried at each split) for a randomForest classification. The classification is binary and there are 32 explanatory variables (mostly factors with up to 4 levels each, but also some numeric variables) and 575 cases. I've seen that although there are only 32 explanatory variables, the best classification performance is reached when
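A minimal sketch (built-in data, not the poster's 32 variables) of tuneRF(), which steps mtry up and down from a starting value and compares out-of-bag error:

library(randomForest)
set.seed(1)
tuneRF(x = iris[, 1:4], y = iris$Species,
       mtryStart = 2, stepFactor = 1.5, improve = 0.01, ntreeTry = 500)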
2010 Nov 10
2
randomForest can not handle categorical predictors with more than 32 categories
I received this error: "Error in randomForest.default(m, y, ...) : Can not handle categorical predictors with more than 32 categories." using the code below: library(randomForest) library(MASS) memory.limit(size=12999) x <- read.csv("D:/train_store_title_view.csv", header=TRUE) x <- na.omit(x) set.seed(131) sales.rf <- randomForest(sales ~ ., data=x, mtry=3, importance=TRUE) My
2013 Feb 03
3
RandomForest, Party and Memory Management
Dear All, For a data mining project, I am relying heavily on the RandomForest and Party packages. Due to the large size of the data set, I often have memory problems (in particular with the Party package; RandomForest seems to use less memory). I really have two questions at this point: 1) Please see how I am using the Party and RandomForest packages. Any comment is welcome and useful.
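A sketch (not from the thread) of common memory-saving options for randomForest on large data: skip the n-by-n proximity matrix and grow the forest in smaller runs that are combined afterwards; the simulated data below stand in for the real set:

library(randomForest)
set.seed(1)
big <- data.frame(matrix(rnorm(20000 * 10), ncol = 10))
big$y <- factor(sample(c("a", "b"), 20000, replace = TRUE))
rf1 <- randomForest(y ~ ., data = big, ntree = 100, proximity = FALSE)
rf2 <- randomForest(y ~ ., data = big, ntree = 100, proximity = FALSE)
rf  <- combine(rf1, rf2)      # 200 trees total, grown in two smaller runs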
2008 Feb 25
1
To get more digits in precision of predict function of randomForests
Hi, I am using randomForest for a classification problem. The predict function in the randomForest library, when asked to return the probabilities, has a precision of two digits after the decimal point. I need at least four digits of precision for the predicted probabilities. How do I achieve this? Thank you, Nagu
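The probabilities returned by predict(..., type = "prob") are vote fractions out of ntree, so their resolution is 1/ntree; a sketch (built-in data) of getting finer-grained values by growing more trees or working with raw vote counts:

library(randomForest)
set.seed(1)
rf <- randomForest(Species ~ ., data = iris, ntree = 10000)
p <- predict(rf, iris, type = "prob")                       # resolution of 1/10000
v <- predict(rf, iris, type = "vote", norm.votes = FALSE)   # raw vote counts per class
print(head(p), digits = 6)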
2008 Dec 26
2
about randomForest
Hello, I want to use randomForest to classify a matrix which is 331030 x 42; the last column is the class signal. I use Memebers.rf<-randomForest(class~.,data=Memebers,proximity=TRUE,mtry=6,ntree=200) which gave me the error "matrix(0,n,n): too many elements specified". Then I use: Memebers.rf<-randomForest(class~.,data=Memebers,importance=TRUE,proximity=TRUE) which told me the error is
2010 Jul 14
1
randomForest outlier return NA
Dear R-users, I have a problem with outlier{randomForest}. After running the following code (which produces a silly data set and builds a model with randomForest): ####################### library(randomForest) set.seed(0) ## build data set X <- rbind( matrix( runif(n=400,min=-1,max=1), ncol = 10 ) , rep(1,times= 10 ) ) Y <- matrix( nrow = nrow(X), ncol = 1) for( i in (1:nrow(X))){
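For reference, outlier() works from the proximity matrix, so the forest has to be grown with proximity = TRUE (or a proximity matrix passed directly); a minimal sketch with made-up data:

library(randomForest)
set.seed(0)
X <- matrix(runif(400, min = -1, max = 1), ncol = 10)
y <- factor(rep(c("a", "b"), each = 20))
rf <- randomForest(X, y, proximity = TRUE)
head(outlier(rf))                      # uses rf$proximity internally
head(outlier(rf$proximity, cls = y))   # equivalent call on the matrix itself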
2012 Mar 23
1
Memory limits for MDSplot in randomForest package
Hello, I am struggling to produce an MDS plot using the randomForest package with a moderately large data set. My data set has one categorical response variable, 7 predictor variables and just under 19000 observations. That means my proximity matrix is approximately 19000 by 19000, which is quite large. To train a random forest on this large a dataset I have to use my institution's high
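A sketch (not from the thread) of one workaround: MDSplot() runs cmdscale() on the full n-by-n proximity matrix, so computing proximities and the plot on a random subsample keeps the memory footprint manageable; built-in data stand in for the real set:

library(randomForest)
set.seed(1)
idx <- sample(nrow(iris), 100)        # with real data, e.g. a few thousand rows
sub <- iris[idx, ]
rf  <- randomForest(Species ~ ., data = sub, proximity = TRUE)
MDSplot(rf, fac = sub$Species, k = 2)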
2003 Apr 12
5
rpart vs. randomForest
Greetings. I'm trying to determine whether to use rpart or randomForest for a classification tree. Has anybody tested their efficacy formally? I've run both, and the confusion matrix for rf beats rpart. I've been looking at the rf help page and am unable to figure out how to extract the tree. But more than that, I'm looking for a more comprehensive user's guide for randomForest, including
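A minimal sketch (built-in data) of pulling an individual tree out of a fitted forest with getTree(); labelVar = TRUE shows split variables by name:

library(randomForest)
set.seed(1)
rf <- randomForest(Species ~ ., data = iris, ntree = 50)
head(getTree(rf, k = 1, labelVar = TRUE))   # first tree as a data frame of split nodes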