Displaying 20 results from an estimated 8000 matches similar to: "Random forests"
2008 Mar 09
1
sampsize in Random Forests
Hi all,
I have a dataset where each point is assigned to a class A, B, C, or
D. Each point is also assigned to a study site. Each study site is
coded with a number from 1 to 100. This information is stored
in the vector studySites.
I want to run randomForests using stratified sampling, so I chose the option
strata = factor(studySites)
But I am not sure how to control the number of
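A minimal sketch of the stratified call being asked about, on simulated data (the site factor and the per-site draw of 20 are made up for illustration): sampsize gets one entry per stratum level and strata names the stratification variable.

library(randomForest)
set.seed(1)
dat <- data.frame(y    = factor(sample(c("A", "B", "C", "D"), 500, replace = TRUE)),
                  x1   = rnorm(500),
                  x2   = rnorm(500),
                  site = factor(sample(1:10, 500, replace = TRUE)))
rf <- randomForest(x = dat[, c("x1", "x2")], y = dat$y,
                   strata   = dat$site,
                   sampsize = rep(20, nlevels(dat$site)))  # draw 20 cases from each site
print(rf)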
2010 Oct 22
2
Random Forest AUC
Guys,
I used Random Forest with a couple of data sets I had, to predict a binary
response. In all the cases, the AUC of the training set comes out to be 1.
Is this always the case with random forests? Can someone please clarify
this?
I have given a simple example, first using logistic regression and then
using random forests to explain the problem. AUC of the random forest is
coming out to be
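For context, a sketch of why this happens, on simulated data: predictions made back onto the training rows use trees that have already seen those rows, while the out-of-bag (OOB) probabilities returned by predict() with no newdata give a more honest AUC (pROC is used here to compute it).

library(randomForest)
library(pROC)
set.seed(1)
x <- data.frame(x1 = rnorm(300), x2 = rnorm(300))
y <- factor(ifelse(x$x1 + rnorm(300) > 0, "yes", "no"))
rf <- randomForest(x, y, ntree = 500)
p_train <- predict(rf, newdata = x, type = "prob")[, "yes"]  # resubstitution fit
p_oob   <- predict(rf, type = "prob")[, "yes"]               # out-of-bag votes
auc(roc(y, p_train))  # close to 1
auc(roc(y, p_oob))    # realistic estimate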
2006 Mar 30
2
Unbalanced Manova
Dear all,
I need to do a Manova but I have an unbalanced design. I have
morphological measurements similar to the iris dataset, but I don't have
the same number of measurements for all species. Does anyone know a
procedure to do Manova with this kind of input in R?
Thank you very much,
Naiara.
--------------------------------------------
Naiara S. Pinto
Ecology, Evolution and Behavior
1
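One way to do this, sketched on an artificially unbalanced subset of the iris data (the subset sizes below are arbitrary): manova() accepts unequal group sizes, though with more than one factor the sequential sums of squares then matter.

iris_unbal <- iris[c(1:50, 51:80, 101:120), ]   # 50 / 30 / 20 observations per species
fit <- manova(cbind(Sepal.Length, Sepal.Width, Petal.Length, Petal.Width) ~ Species,
              data = iris_unbal)
summary(fit, test = "Wilks")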
2002 Apr 02
2
random forests for R
Hi all,
There is now a package available on CRAN that provides an R interface to Leo
Breiman's random forest classifier.
Basically, random forest does the following:
1. Select ntree, the number of trees to grow, and mtry, a number no larger
than the number of variables.
2. For i = 1 to ntree:
3. Draw a bootstrap sample from the data. Call those not in the bootstrap
sample the
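For readers of the archive, a small usage sketch of the interface being announced (iris data; ntree and mtry correspond to step 1 above):

library(randomForest)
set.seed(1)
rf <- randomForest(Species ~ ., data = iris, ntree = 500, mtry = 2)
print(rf)   # the error rate shown is estimated from the out-of-bag cases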
2006 Jan 10
2
reading contigency tables
Hi all,
I need some help using read.ftable to read a contingency table. My columns
are organized as follows:
order--family--species--location--number of individuals
I couldn't figure out how to format the data in my text file so it can be
imported into R; and once that is done, is it possible to convert the
table into a data frame? Any tips would be greatly appreciated!
Thanks a lot,
Naiara.
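A sketch of one common route, assuming a whitespace-separated text file with the five columns listed above (the file name counts.txt is hypothetical): read the raw records with read.table(), cross-tabulate with xtabs(), and convert back with as.data.frame().

counts <- read.table("counts.txt", header = FALSE,
                     col.names = c("order", "family", "species", "location", "n"))
tab <- xtabs(n ~ species + location, data = counts)  # contingency table of counts
as.data.frame(tab)                                   # ...and back to a data frame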
2006 Jan 24
1
polr (MASS)
Hello all,
I am trying to use polr (the ordered logistic model from MASS) but I am
getting the following error message:
Error in if (all(pr > 0)) -sum(wt * log(pr)) else Inf :
missing value where TRUE/FALSE needed
My response variable is a factor with 3 levels and I have 2 independent
variables. I am not sure if I guessed the starting parameters right, which
I imagine could be a source of
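For comparison, a sketch of a polr() call that runs cleanly (the housing data shipped with MASS): the response is an ordered factor and no start values are supplied, so polr computes its own; the error quoted above is often traced to hand-picked starting values or badly scaled predictors.

library(MASS)
fit <- polr(Sat ~ Infl + Type + Cont, weights = Freq, data = housing, Hess = TRUE)
summary(fit)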
2006 Jan 21
1
" 'x' must be numeric"
Hello all,
I am importing data from a txt file and trying to get a histogram, but I get
the message: "Error in hist: 'x' must be numeric".
When I use mode(), R returns "list".
However, when I use str() I get:
'data.frame': 456 obs. of 1 variable:
$ V1: num 0.6344 0.4516 0.0968 0.7634 0.7957 ...
My file consists of one column only (no headers) and I can't figure out
why
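A sketch of the usual fix (the file name values.txt is hypothetical): read.table() returns a data frame, which is a list, so hist() needs the numeric column itself rather than the whole object.

dat <- read.table("values.txt", header = FALSE)
str(dat)        # 'data.frame': ... obs. of 1 variable: $ V1: num ...
hist(dat$V1)    # pass the numeric column, not the data frame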
2007 Oct 11
1
random forest mtry and mse
I have been using random forest on a data set with 226 sites and 36
explanatory variables (continuous and categorical). When I use
"tune.randomForest" to determine the best value to use in "mtry" there
is a fairly consistent and steady decrease in MSE, with the optimum of
"mtry" usually equal to 1. Why would that occur, and what does it
signify? What I would
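A sketch of an alternative check on simulated data of roughly the dimensions described (226 rows, 36 numeric predictors; the signal below is made up): tuneRF() in the randomForest package walks mtry up and down from the default and reports the out-of-bag error at each value, which can be set against the tuning result quoted above.

library(randomForest)
set.seed(1)
x <- data.frame(matrix(rnorm(226 * 36), nrow = 226))
y <- x$X1 + 2 * x$X2 + rnorm(226)
tuneRF(x, y, ntreeTry = 500, stepFactor = 1.5, improve = 0.01)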
2006 Nov 13
1
random forest regression
Dear all,
I am doing a regression in randomForest, using the option "sampsize" to reduce
the number of records used to produce the randomForest object.
The manual says "For classification, if sampsize is a vector of the length
the number of strata, then sampling is stratified by strata, and the
elements of sampsize indicate the numbers to be drawn from the strata". I
need my
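A sketch of how sampsize behaves in regression mode (airquality data): it is a single number giving how many rows each tree is grown on; the per-stratum vector form quoted from the manual applies to classification only.

library(randomForest)
set.seed(1)
aq <- na.omit(airquality)
rf <- randomForest(Ozone ~ ., data = aq,
                   sampsize = 50,        # each tree sees only 50 of the 111 rows
                   ntree = 500)
print(rf)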
2012 Dec 03
2
Different results from randomForest with test option and using predict function
Hello R Gurus,
I am perplexed by the different results I obtained when I ran code like
this:
set.seed(100)
test1 <- randomForest(BinaryY ~ ., data=Xvars, ntree=51, mtry=5, seed=200)
predict(test1, newdata=cbind(NewBinaryY, NewXs), type="response")
and this code:
set.seed(100)
test2 <- randomForest(BinaryY ~ ., data=Xvars, ntree=51, mtry=5, seed=200,
xtest=NewXs, ytest=NewBinaryY)
The
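A reproducible sketch of the two routes being compared, on simulated data (the objects below are made up): fitting with xtest/ytest stores the test-set predictions in the $test$predicted component, and tabulating them against predict() on the same newdata shows exactly where the two disagree.

library(randomForest)
set.seed(1)
train <- data.frame(y = factor(rbinom(200, 1, 0.5)), x1 = rnorm(200), x2 = rnorm(200))
test  <- data.frame(y = factor(rbinom(100, 1, 0.5)), x1 = rnorm(100), x2 = rnorm(100))
set.seed(100)
rf1 <- randomForest(y ~ ., data = train, ntree = 51)
p1  <- predict(rf1, newdata = test)
set.seed(100)
rf2 <- randomForest(y ~ ., data = train, ntree = 51,
                    xtest = test[, c("x1", "x2")], ytest = test$y)
p2  <- rf2$test$predicted
table(p1, p2)   # off-diagonal cells mark the disagreements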
2009 Apr 20
1
Random Forests: Predictor importance for Regression Trees
Hello!
I think I am relatively clear on how predictor importance (the first
one) is calculated by Random Forests for a Classification tree:
Importance of predictor P1 when the response variable is categorical:
1. For out-of-bag (oob) cases, randomly permute their values on
predictor P1 and then put them down the tree
2. For a given tree, subtract the number of votes for the correct
class in the
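Since the question is about the regression analogue, a sketch on the airquality data: with importance = TRUE the permutation measure is reported as %IncMSE, the increase in out-of-bag mean squared error when the predictor is permuted, alongside the impurity-based IncNodePurity.

library(randomForest)
set.seed(1)
aq <- na.omit(airquality)
rf <- randomForest(Ozone ~ ., data = aq, importance = TRUE, ntree = 500)
importance(rf)   # %IncMSE = permutation importance; IncNodePurity = impurity-based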
2005 Oct 04
1
Rcmdr and scatter3d
Hi folks,
I'd like to use scatter3d (which is in R commander) to plot more than one
dataset in the same graph, each dataset with a different color. The kind
of stuff you would do with "hold on" in Matlab.
I read a recent message that was posted to this list with a similar
problem, but I couldn't understand the reply. Could someone give me one
example? How do you plot subgroups
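A sketch of the grouped form, assuming a current setup where scatter3d() lives in the car package (which Rcmdr calls, and which needs rgl for the 3D device): the "| groups" term in the formula colours each subgroup separately, which is the closest analogue of Matlab's "hold on".

library(car)
scatter3d(prestige ~ income + education | type, data = Duncan, surface = FALSE)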
2018 Dec 13
2
Random Forest with a small "n" and many predictors
Hello,
I got started in Machine Learning only recently, and I have a question about my
data sets: the first one has 37 explanatory variables and 116
instances, and the second one 140 explanatory variables and 195 instances. The
first one looks fine to me, since there are 3 times as many cases as
explanatory variables, but I think the second one may pose a problem because
there is almost the same number of
2008 Jul 04
1
syntax for R CMD INSTALL
Dear all,
I am trying to install rgdal from source on a Mac OS 10.4.11. I installed
GDAL and PROJ as frameworks so the installation does not work unless I
explicitly state where the GDAL and PROJ libraries are. I tried:
R CMD INSTALL rgdal_0.5-25
--configure-args=--with-proj-include=/Library/Frameworks/PROJ.framework/unix/include
--with-proj-lib=/Library/Frameworks/PROJ.framework/unix/lib
but I
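A sketch of the usual fix, assuming the source tarball is named rgdal_0.5-25.tar.gz: both configure flags are wrapped in one quoted --configure-args value so the shell hands them to R CMD INSTALL as a single argument.

R CMD INSTALL --configure-args='--with-proj-include=/Library/Frameworks/PROJ.framework/unix/include --with-proj-lib=/Library/Frameworks/PROJ.framework/unix/lib' rgdal_0.5-25.tar.gz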
2007 Jan 29
3
comparing random forests and classification trees
Hi,
I have done an analysis using 'rpart' to construct a Classification Tree. I
want to retain the output in tree form so that it is easily
interpretable. However, I want to compare the 'accuracy' of the tree
to a Random Forest to estimate how much predictive ability is lost by using
one simple tree. My understanding is that the error automatically displayed
by the two
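A sketch of error estimates that are roughly comparable, using the iris data: the cross-validated xerror column from rpart's complexity table against the out-of-bag error of the forest.

library(rpart)
library(randomForest)
set.seed(1)
tree <- rpart(Species ~ ., data = iris, method = "class")
printcp(tree)              # xerror = cross-validated error, scaled by the root-node error
rf <- randomForest(Species ~ ., data = iris, ntree = 500)
rf$err.rate[500, "OOB"]    # out-of-bag error after all 500 trees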
2012 May 11
2
Random forests prediction
Hi all,
I have a strange problem when applying RF in R.
I have a set of variables with which I obtain an AUC of 0.67.
I do have a second set of variables that have an AUC of 0.57.
When I merge the first and second set of variables, the AUC becomes 0.64.
Shouldn't the prediction become better as I add variables that do
have some predictive power?
This is even more strange as the AUC
2005 Sep 08
2
Re-evaluating the tree in the random forest
Dear mailinglist members,
I was wondering if there was a way to re-evaluate the
instances of a tree (in the forest) again after I have
manually changed a splitpoint (or split variable) of a
decision node. Here's an illustration:
library("randomForest")
forest.rf <- randomForest(formula = Species ~ ., data = iris,
                          do.trace = TRUE, ntree = 3, mtry = 2, norm.votes = FALSE)
# I am
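As far as I know the package does not write manual edits back into the forest, but the per-tree split structure can at least be inspected; a sketch continuing from the code above:

tree1 <- getTree(forest.rf, k = 1, labelVar = TRUE)
head(tree1)   # left/right daughter, split var, split point, status, prediction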
2018 Jan 22
2
Random Forests
Many thanks Carlos, as always.
It is odd that I missed it. At the time I looked at all the arguments of
RF, as I always do, but I had forgotten that one. The truth is that it
worked wonderfully, but it seemed strange to me. Although, given that
RFs do not overfit, there is no problem with their trees being as large
as you like. I have tested it with an external data set and it
explains
2005 Jul 21
4
RandomForest question
Hello,
I'm trying to find out the optimal number of variables tried at each split (the mtry parameter) for a randomForest classification. The classification is binary and there are 32 explanatory variables (mostly factors with up to 4 levels each, but also some numeric variables) and 575 cases.
I've seen that although there are only 32 explanatory variables the best classification performance is reached when
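A sketch of a direct comparison on simulated data of roughly these dimensions (575 rows, 32 numeric predictors; the real data also has factors): each candidate mtry is judged by its out-of-bag error, with mtry = 32 corresponding to plain bagging.

library(randomForest)
set.seed(1)
x <- data.frame(matrix(rnorm(575 * 32), nrow = 575))
y <- factor(ifelse(x$X1 + x$X2 + rnorm(575) > 0, "a", "b"))
oob <- sapply(c(3, 6, 12, 24, 32), function(m) {
  rf <- randomForest(x, y, mtry = m, ntree = 500)
  rf$err.rate[500, "OOB"]     # out-of-bag error with all trees grown
})
names(oob) <- c(3, 6, 12, 24, 32)
oob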