Displaying 20 results from an estimated 8000 matches similar to: "Random forests"
2008 Mar 09
1
sampsize in Random Forests
Hi all,
I have a dataset where each point is assigned to a class A, B, C, or
D. Each point is also assigned to a study site. Each study site is
coded with a number from 1 to 100. This information is stored
in the vector studySites.
I want to run randomForests using stratified sampling, so I chose the option
strata = factor(studySites)
But I am not sure how to control the number of
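A minimal sketch of the stratified call being asked about, on simulated data (the site factor and the per-site draw of 20 are made up for illustration): sampsize gets one entry per stratum level and strata names the stratification variable.

library(randomForest)
set.seed(1)
dat <- data.frame(y    = factor(sample(c("A", "B", "C", "D"), 500, replace = TRUE)),
                  x1   = rnorm(500),
                  x2   = rnorm(500),
                  site = factor(sample(1:10, 500, replace = TRUE)))
rf <- randomForest(x = dat[, c("x1", "x2")], y = dat$y,
                   strata   = dat$site,
                   sampsize = rep(20, nlevels(dat$site)))  # draw 20 cases from each site
print(rf)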
2010 Oct 22
2
Random Forest AUC
Guys,
I used Random Forest with a couple of data sets I had, to predict a binary
response. In all the cases, the AUC of the training set comes out to be 1.
Is this always the case with random forests? Can someone please clarify
this?
I have given a simple example, first using logistic regression and then
using random forests to explain the problem. AUC of the random forest is
coming out to be
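For context, a sketch of why this happens, on simulated data: predictions made back onto the training rows use trees that have already seen those rows, while the out-of-bag (OOB) probabilities returned by predict() with no newdata give a more honest AUC (pROC is used here to compute it).

library(randomForest)
library(pROC)
set.seed(1)
x <- data.frame(x1 = rnorm(300), x2 = rnorm(300))
y <- factor(ifelse(x$x1 + rnorm(300) > 0, "yes", "no"))
rf <- randomForest(x, y, ntree = 500)
p_train <- predict(rf, newdata = x, type = "prob")[, "yes"]  # resubstitution fit
p_oob   <- predict(rf, type = "prob")[, "yes"]               # out-of-bag votes
auc(roc(y, p_train))  # close to 1
auc(roc(y, p_oob))    # realistic estimate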
2006 Mar 30
2
Unbalanced Manova
Dear all,
I need to do a Manova but I have an unbalanced design. I have
morphological measurements similar to the iris dataset, but I don't have
the same number of measurements for all species. Does anyone know a
procedure to do Manova with this kind of input in R?
Thank you very much,
Naiara.
--------------------------------------------
Naiara S. Pinto
Ecology, Evolution and Behavior
1
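One way to do this, sketched on an artificially unbalanced subset of the iris data (the subset sizes below are arbitrary): manova() accepts unequal group sizes, though with more than one factor the sequential sums of squares then matter.

iris_unbal <- iris[c(1:50, 51:80, 101:120), ]   # 50 / 30 / 20 observations per species
fit <- manova(cbind(Sepal.Length, Sepal.Width, Petal.Length, Petal.Width) ~ Species,
              data = iris_unbal)
summary(fit, test = "Wilks")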
2002 Apr 02
2
random forests for R
Hi all,
There is now a package available on CRAN that provides an R interface to Leo
Breiman's random forest classifier.
Basically, random forest does the following:
1. Select ntree, the number of trees to grow, and mtry, a number no larger
than the number of variables.
2. For i = 1 to ntree:
3. Draw a bootstrap sample from the data. Call those not in the bootstrap
sample the
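For readers of the archive, a small usage sketch of the interface being announced (iris data; ntree and mtry correspond to step 1 above):

library(randomForest)
set.seed(1)
rf <- randomForest(Species ~ ., data = iris, ntree = 500, mtry = 2)
print(rf)   # the error rate shown is estimated from the out-of-bag cases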
2006 Jan 10
2
reading contigency tables
Hi all,
I need some help using read.ftable to read a contingency table. My columns
are organized as follows:
order--family--species--location--number of individuals
I couldn't figure out how to format the data in my text file so it can be
imported into R; and once that is done, is it possible to convert the
table into a data frame? Any tips would be greatly appreciated!
Thanks a lot,
Naiara.
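A sketch of one common route, assuming a whitespace-separated text file with the five columns listed above (the file name counts.txt is hypothetical): read the raw records with read.table(), cross-tabulate with xtabs(), and convert back with as.data.frame().

counts <- read.table("counts.txt", header = FALSE,
                     col.names = c("order", "family", "species", "location", "n"))
tab <- xtabs(n ~ species + location, data = counts)  # contingency table of counts
as.data.frame(tab)                                   # ...and back to a data frame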
2006 Jan 24
1
polr (MASS)
Hello all,
I am trying to use polr (the ordered logistic model from MASS) but I am
getting the following error message:
Error in if (all(pr > 0)) -sum(wt * log(pr)) else Inf :
missing value where TRUE/FALSE needed
My response variable is a factor with 3 levels and I have 2 independent
variables. I am not sure if I guessed the starting parameters right, which
I imagine could be a source of
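For comparison, a sketch of a polr() call that runs cleanly (the housing data shipped with MASS): the response is an ordered factor and no start values are supplied, so polr computes its own; the error quoted above is often traced to hand-picked starting values or badly scaled predictors.

library(MASS)
fit <- polr(Sat ~ Infl + Type + Cont, weights = Freq, data = housing, Hess = TRUE)
summary(fit)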
2006 Jan 21
1
" 'x' must be numeric"
Hello all,
I am importing data from a txt file and trying to get a histogram, but I get
the message: "Error in hist: 'x' must be numeric".
When I use mode(), R returns "list".
However, when I use str() I get:
'data.frame': 456 obs. of 1 variable:
$ V1: num 0.6344 0.4516 0.0968 0.7634 0.7957 ...
My file consists of one column only (no headers) and I can't figure out
why
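A sketch of the usual fix (the file name values.txt is hypothetical): read.table() returns a data frame, which is a list, so hist() needs the numeric column itself rather than the whole object.

dat <- read.table("values.txt", header = FALSE)
str(dat)        # 'data.frame': ... obs. of 1 variable: $ V1: num ...
hist(dat$V1)    # pass the numeric column, not the data frame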
2007 Oct 11
1
random forest mtry and mse
I have been using random forest on a data set with 226 sites and 36
explanatory variables (continuous and categorical). When I use
"tune.randomForest" to determine the best value to use in "mtry" there
is a fairly consistent and steady decrease in MSE, with the optimum of
"mtry" usually equal to 1. Why would that occur, and what does it
signify? What I would
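A sketch of an alternative check on simulated data of roughly the dimensions described (226 rows, 36 numeric predictors; the signal below is made up): tuneRF() in the randomForest package walks mtry up and down from the default and reports the out-of-bag error at each value, which can be set against the tuning result quoted above.

library(randomForest)
set.seed(1)
x <- data.frame(matrix(rnorm(226 * 36), nrow = 226))
y <- x$X1 + 2 * x$X2 + rnorm(226)
tuneRF(x, y, ntreeTry = 500, stepFactor = 1.5, improve = 0.01)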
2006 Nov 13
1
random forest regression
Dear all,
I am doing a regression in randomForest, using the option "sampsize" to reduce
the number of records used to produce the randomForest object.
The manual says "For classification, if sampsize is a vector of the length
the number of strata, then sampling is stratified by strata, and the
elements of sampsize indicate the numbers to be drawn from the strata". I
need my
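A sketch of how sampsize behaves in regression mode (airquality data): it is a single number giving how many rows each tree is grown on; the per-stratum vector form quoted from the manual applies to classification only.

library(randomForest)
set.seed(1)
aq <- na.omit(airquality)
rf <- randomForest(Ozone ~ ., data = aq,
                   sampsize = 50,        # each tree sees only 50 of the 111 rows
                   ntree = 500)
print(rf)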
2012 Dec 03
2
Different results from randomForest with test option and using predict function
Hello R Gurus,
I am perplexed by the different results I obtained when I ran code like
this:
set.seed(100)
test1 <- randomForest(BinaryY ~ ., data=Xvars, ntree=51, mtry=5, seed=200)
predict(test1, newdata=cbind(NewBinaryY, NewXs), type="response")
and this code:
set.seed(100)
test2 <- randomForest(BinaryY ~ ., data=Xvars, ntree=51, mtry=5, seed=200,
xtest=NewXs, ytest=NewBinaryY)
The
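A reproducible sketch of the two routes being compared, on simulated data (the objects below are made up): fitting with xtest/ytest stores the test-set predictions in the $test$predicted component, and tabulating them against predict() on the same newdata shows exactly where the two disagree.

library(randomForest)
set.seed(1)
train <- data.frame(y = factor(rbinom(200, 1, 0.5)), x1 = rnorm(200), x2 = rnorm(200))
test  <- data.frame(y = factor(rbinom(100, 1, 0.5)), x1 = rnorm(100), x2 = rnorm(100))
set.seed(100)
rf1 <- randomForest(y ~ ., data = train, ntree = 51)
p1  <- predict(rf1, newdata = test)
set.seed(100)
rf2 <- randomForest(y ~ ., data = train, ntree = 51,
                    xtest = test[, c("x1", "x2")], ytest = test$y)
p2  <- rf2$test$predicted
table(p1, p2)   # off-diagonal cells mark the disagreements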
2009 Apr 20
1
Random Forests: Predictor importance for Regression Trees
Hello!
I think I am relatively clear on how predictor importance (the first
one) is calculated by Random Forests for a Classification tree:
Importance of predictor P1 when the response variable is categorical:
1. For out-of-bag (oob) cases, randomly permute their values on
predictor P1 and then put them down the tree
2. For a given tree, subtract the number of votes for the correct
class in the
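Since the question is about the regression analogue, a sketch on the airquality data: with importance = TRUE the permutation measure is reported as %IncMSE, the increase in out-of-bag mean squared error when the predictor is permuted, alongside the impurity-based IncNodePurity.

library(randomForest)
set.seed(1)
aq <- na.omit(airquality)
rf <- randomForest(Ozone ~ ., data = aq, importance = TRUE, ntree = 500)
importance(rf)   # %IncMSE = permutation importance; IncNodePurity = impurity-based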
2005 Oct 04
1
Rcmdr and scatter3d
Hi folks,
I'd like to use scatter3d (which is in R commander) to plot more than one
dataset in the same graph, each dataset with a different color. The kind
of stuff you would do with "hold on" in Matlab.
I read a recent message that was posted to this list with a similar
problem, but I couldn't understand the reply. Could someone give me one
example? How do you plot subgroups
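A sketch of the grouped form, assuming a current setup where scatter3d() lives in the car package (which Rcmdr calls, and which needs rgl for the 3D device): the "| groups" term in the formula colours each subgroup separately, which is the closest analogue of Matlab's "hold on".

library(car)
scatter3d(prestige ~ income + education | type, data = Duncan, surface = FALSE)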
2018 Dec 13
2
Random Forest with a small "n" and many predictors
Hello,
I got started in Machine Learning only recently, and I have a question about my
data sets: the first one has 37 explanatory variables and 116
instances, and the second one 140 explanatory variables and 195 instances. The
first one looks fine to me, since there are 3 times as many cases as
explanatory variables, but I think the second one may pose a problem because
there is almost the same number of
2008 Jul 04
1
syntax for R CMD INSTALL
Dear all,
I am trying to install rgdal from source on a Mac OS 10.4.11. I installed
GDAL and PROJ as frameworks so the installation does not work unless I
explicitly state where the GDAL and PROJ libraries are. I tried:
R CMD INSTALL rgdal_0.5-25
--configure-args=--with-proj-include=/Library/Frameworks/PROJ.framework/unix/include
--with-proj-lib=/Library/Frameworks/PROJ.framework/unix/lib
but I
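A sketch of the usual fix, assuming the source tarball is named rgdal_0.5-25.tar.gz: both configure flags are wrapped in one quoted --configure-args value so the shell hands them to R CMD INSTALL as a single argument.

R CMD INSTALL --configure-args='--with-proj-include=/Library/Frameworks/PROJ.framework/unix/include --with-proj-lib=/Library/Frameworks/PROJ.framework/unix/lib' rgdal_0.5-25.tar.gz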
2007 Jan 29
3
comparing random forests and classification trees
Hi,
I have done an analysis using 'rpart' to construct a Classification Tree. I
want to retain the output in tree form so that it is easily
interpretable. However, I want to compare the 'accuracy' of the tree
to a Random Forest to estimate how much predictive ability is lost by using
one simple tree. My understanding is that the error automatically displayed
by the two
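A sketch of error estimates that are roughly comparable, using the iris data: the cross-validated xerror column from rpart's complexity table against the out-of-bag error of the forest.

library(rpart)
library(randomForest)
set.seed(1)
tree <- rpart(Species ~ ., data = iris, method = "class")
printcp(tree)              # xerror = cross-validated error, scaled by the root-node error
rf <- randomForest(Species ~ ., data = iris, ntree = 500)
rf$err.rate[500, "OOB"]    # out-of-bag error after all 500 trees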
2012 May 11
2
Random forests prediction
Hi all,
I have a strange problem when applying RF in R.
I have a set of variables with which I obtain an AUC of 0.67.
I do have a second set of variables that have an AUC of 0.57.
When I merge the first and second set of variables, the AUC becomes 0.64.
Shouldn't the prediction become better as I add variables that do
have some predictive power?
This is even more strange as the AUC
2005 Sep 08
2
Re-evaluating the tree in the random forest
Dear mailinglist members,
I was wondering if there was a way to re-evaluate the
instances of a tree (in the forest) again after I have
manually changed a splitpoint (or split variable) of a
decision node. Here's an illustration:
library("randomForest")
forest.rf <- randomForest(formula = Species ~ ., data = iris,
                          do.trace = TRUE, ntree = 3, mtry = 2, norm.votes = FALSE)
# I am
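As far as I know the package does not write manual edits back into the forest, but the per-tree split structure can at least be inspected; a sketch continuing from the code above:

tree1 <- getTree(forest.rf, k = 1, labelVar = TRUE)
head(tree1)   # left/right daughter, split var, split point, status, prediction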
2018 Jan 22
2
Random Forests
Many thanks Carlos, as always.
It is odd that I missed it. At the time I looked at all the arguments of
RF, as I always do, but I had forgotten that one. The truth is that it
worked wonderfully, but it seemed strange to me. Although, given that
RFs do not overfit, there is no problem with their trees being as large
as you like. I have tested it with an external data set and it
explains
2005 Jul 21
4
RandomForest question
Hello,
I'm trying to find out the optimal number of variables tried at each split (the mtry parameter) for a randomForest classification. The classification is binary and there are 32 explanatory variables (mostly factors with up to 4 levels each, but also some numeric variables) and 575 cases.
I've seen that although there are only 32 explanatory variables the best classification performance is reached when
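A sketch of a direct comparison on simulated data of roughly these dimensions (575 rows, 32 numeric predictors; the real data also has factors): each candidate mtry is judged by its out-of-bag error, with mtry = 32 corresponding to plain bagging.

library(randomForest)
set.seed(1)
x <- data.frame(matrix(rnorm(575 * 32), nrow = 575))
y <- factor(ifelse(x$X1 + x$X2 + rnorm(575) > 0, "a", "b"))
oob <- sapply(c(3, 6, 12, 24, 32), function(m) {
  rf <- randomForest(x, y, mtry = m, ntree = 500)
  rf$err.rate[500, "OOB"]     # out-of-bag error with all trees grown
})
names(oob) <- c(3, 6, 12, 24, 32)
oob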