similar to: Random forests


2008 Mar 09
1
sampsize in Random Forests
Hi all, I have a dataset where each point is assigned to a class A, B, C, or D. Each point is also assigned to a study site. Each study site is coded with a number ranging between 1-100. This information is stored in the vector studySites. I want to run randomForest using stratified sampling, so I chose the option strata = factor(studySites). But I am not sure how to control the number of
2010 Oct 22
2
Random Forest AUC
Guys, I used Random Forest with a couple of data sets I had to predict for binary response. In all the cases, the AUC of the training set is coming to be 1. Is this always the case with random forests? Can someone please clarify this? I have given a simple example, first using logistic regression and then using random forests to explain the problem. AUC of the random forest is coming out to be
2006 Mar 30
2
Unbalanced Manova
Dear all, I need to do a Manova but I have an unbalanced design. I have morphological measurements similar to the iris dataset, but I don't have the same number of measurements for all species. Does anyone know a procedure to do Manova with this kind of input in R? Thank you very much, Naiara. -------------------------------------------- Naiara S. Pinto Ecology, Evolution and Behavior 1
2002 Apr 02
2
random forests for R
Hi all, There is now a package available on CRAN that provides an R interface to Leo Breiman's random forest classifier. Basically, random forest does the following: 1. Select ntree, the number of trees to grow, and mtry, a number no larger than number of variables. 2. For i = 1 to ntree: 3. Draw a bootstrap sample from the data. Call those not in the bootstrap sample the
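The bootstrap step in the list above can be sketched in base R. This is only an illustration of steps 2 and 3, not the package's implementation: it grows no trees, the seed and the value of ntree are arbitrary, and the names in_bag, oob and oob_fraction are made up for the example. Rows left out of a bootstrap sample are what is usually called the out-of-bag data.

```r
# Sketch of steps 2-3 only: draw one bootstrap sample per tree and
# record which rows were left out of it. No trees are grown here.
set.seed(1)                # arbitrary seed, for reproducibility
n <- nrow(iris)            # iris ships with base R
ntree <- 5                 # illustrative value, not a recommendation

oob_fraction <- numeric(ntree)
for (i in seq_len(ntree)) {
  in_bag <- sample(n, size = n, replace = TRUE)  # bootstrap sample of row indices
  oob <- setdiff(seq_len(n), in_bag)             # rows never drawn for this tree
  oob_fraction[i] <- length(oob) / n             # share of rows left out
}
oob_fraction
```

With sampling with replacement, roughly a third of the rows tend to be left out of each sample, which is why each tree has its own held-out set available for error estimation.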
2006 Jan 10
2
reading contingency tables
Hi all, I need some help using read.ftable to read a contingency table. My columns are organized as follows: order--family--species--location--number of individuals I couldn't figure out how to change the data in my text file so it can be imported into R; and after you do that, is it possible to convert the table into a data frame? Any tips would be greatly appreciated! Thanks a lot, Naiara.
2006 Jan 24
1
polr (MASS)
Hello all, I am trying to use polr (the ordered logistic model from MASS) but I am getting the following error message: Error in if (all(pr > 0)) -sum(wt * log(pr)) else Inf : missing value where TRUE/FALSE needed My response variable is a factor with 3 levels and I have 2 independent variables. I am not sure if I guessed the starting parameters right, which I imagine could be a source of
2006 Jan 21
1
" 'x' must be numeric"
Hello all, I am importing data from a txt file and when I try to get a histogram, I get the message: "Error in hist: 'x' must be numeric". When I use mode, R returns "list". However, when I use str I get: 'data.frame': 456 obs. of 1 variable: $ V1: num 0.6344 0.4516 0.0968 0.7634 0.7957 ... My file consists of one column only (no headers) and I can't figure out why
2007 Oct 11
1
random forest mtry and mse
I have been using random forest on a data set with 226 sites and 36 explanatory variables (continuous and categorical). When I use "tune.randomForest" to determine the best value to use in "mtry" there is a fairly consistent and steady decrease in MSE, with the optimum of "mtry" usually equal to 1. Why would that occur, and what does it signify? What I would
2006 Nov 13
1
random forest regression
Dear all, I am doing a regression in randomForest, using the option "sampsize" to reduce the number of records used to produce the randomForest object. The manual says "For classification, if sampsize is a vector of the length the number of strata, then sampling is stratified by strata, and the elements of sampsize indicate the numbers to be drawn from the strata". I need my
2012 Dec 03
2
Different results from randomForest with test option and using predict function
Hello R Gurus, I am perplexed by the different results I obtained when I ran code like this: set.seed(100) test1<-randomForest(BinaryY~., data=Xvars, trees=51, mtry=5, seed=200) predict(test1, newdata=cbind(NewBinaryY, NewXs), type="response") and this code: set.seed(100) test2<-randomForest(BinaryY~., data=Xvars, trees=51, mtry=5, seed=200, xtest=NewXs, ytest=NewBinaryY) The
2009 Apr 20
1
Random Forests: Predictor importance for Regression Trees
Hello! I think I am relatively clear on how predictor importance (the first one) is calculated by Random Forests for a Classification tree: Importance of predictor P1 when the response variable is categorical: 1. For out-of-bag (oob) cases, randomly permute their values on predictor P1 and then put them down the tree 2. For a given tree, subtract the number of votes for the correct class in the
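The permutation idea described above can be illustrated with a small base R sketch. It makes several assumptions that are not in the post: a linear model stands in for a tree, a random hold-out set stands in for the out-of-bag cases, the response is numeric (so the score is an increase in mean squared error rather than a drop in correct votes), and the names held_out, mse and importance_wt are invented for the example.

```r
# Permutation importance sketch: compare the error on held-out rows
# before and after shuffling one predictor.
set.seed(42)                                  # arbitrary seed
held_out <- sample(nrow(mtcars), 10)          # stand-in for out-of-bag cases
fit <- lm(mpg ~ wt + hp, data = mtcars[-held_out, ])  # stand-in model, not a tree

test <- mtcars[held_out, ]
mse <- function(d) mean((d$mpg - predict(fit, newdata = d))^2)

permuted <- test
permuted$wt <- sample(permuted$wt)            # break the link between wt and mpg

importance_wt <- mse(permuted) - mse(test)    # increase in error caused by shuffling
importance_wt
```

In a forest the same comparison would be made tree by tree on each tree's own out-of-bag cases and then averaged; this sketch does it once for a single model.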
2005 Oct 04
1
Rcmdr and scatter3d
Hi folks, I'd like to use scatter3d (which is in R commander) to plot more than one dataset in the same graph, each dataset with a different color. The kind of stuff you would do with "holdon" in Matlab. I read a recent message that was posted to this list with a similar problem, but I couldn't understand the reply. Could someone give me one example? How do you plot subgroups
2018 Dec 13
2
Random Forest with a small "n" and many predictors
Hello, I recently got started in Machine Learning, and I have a question about my datasets: the first has 37 explanatory variables and 116 instances, and the second has 140 explanatory variables and 195 instances. The first looks fine to me, since there are 3 times as many cases as explanatory variables, but I think the second could be a problem, as it has almost the same number of
2008 Jul 04
1
syntax for R CMD INSTALL
Dear all, I am trying to install rgdal from source on a Mac OS 10.4.11. I installed GDAL and PROJ as frameworks so the installation does not work unless I explicitly state where the GDAL and PROJ libraries are. I tried: R CMD INSTALL rgdal_0.5-25 --configure-args=--with-proj-include=/Library/Frameworks/PROJ.framework/unix/include --with-proj-lib=/Library/Frameworks/PROJ.framework/unix/lib but I
2007 Jan 29
3
comparing random forests and classification trees
Hi, I have done an analysis using 'rpart' to construct a Classification Tree. I want to retain the output in tree form so that it is easily interpretable. However, I also want to compare the 'accuracy' of the tree to a Random Forest to estimate how much predictive ability is lost by using one simple tree. My understanding is that the error automatically displayed by the two
2012 May 11
2
Random forests prediction
Hi all, I have a strange problem when applying RF in R. I have a set of variables with which I obtain an AUC of 0.67. I do have a second set of variables that have an AUC of 0.57. When I merge the first and second set of variables, the AUC becomes 0.64. I would expect the prediction to become better as I add variables that do have some predictive power? This is even more strange as the AUC
2005 Sep 08
2
Re-evaluating the tree in the random forest
Dear mailinglist members, I was wondering if there was a way to re-evaluate the instances of a tree (in the forest) again after I have manually changed a splitpoint (or split variable) of a decision node. Here's an illustration: library("randomForest") forest.rf <- randomForest(formula = Species ~ ., data = iris, do.trace = TRUE, ntree = 3, mtry = 2, norm.votes = FALSE) # I am
2018 Jan 22
2
Random Forests
Thank you very much, Carlos, as always. It is strange that I missed it. At the time I looked at all the RF arguments, as I always do, but I had forgotten that one. The truth is that it was working wonderfully, but it seemed odd to me. Although, given that RFs do not overfit, there is no problem with their trees being as large as you like. I tested it with an external database and it explains
2005 Jul 21
4
RandomForest question
Hello, I'm trying to find out the optimal number of splits (mtry parameter) for a randomForest classification. The classification is binary and there are 32 explanatory variables (mostly factors with each up to 4 levels but also some numeric variables) and 575 cases. I've seen that although there are only 32 explanatory variables the best classification performance is reached when