Displaying 20 results from an estimated 1000 matches similar to: "Random Forest, Giving More Importance to Some Data"
2013 Feb 03
3
RandomForest, Party and Memory Management
Dear All,
For a data mining project, I am relying heavily on the RandomForest and
Party packages.
Due to the large size of the data set, I often have memory problems (in
particular with the Party package; RandomForest seems to use less memory).
I really have two questions at this point
1) Please see how I am using the Party and RandomForest packages. Any
comment is welcome and useful.
2013 Mar 24
3
Parallelizing GBM
Dear All,
I am far from being a guru about parallel programming.
Most of the time, I rely on randomForest for data mining large datasets.
I would also like to try the gradient boosted methods in GBM,
but I need parallelization.
I normally rely on gbm.fit for speed reasons, and I usually call it this
way
gbm_model <- gbm.fit(trainRF,prices_train,
offset = NULL,
misc =
2023 May 09
1
RandomForest tuning the parameters
Hi Sacha,
On second thought, perhaps this is more the direction that you want ...
X2 = cbind(X_train,y_train)
colnames(X2)[3] = "y"
regr2<-randomForest(y~x1+x2, data=X2,maxnodes=10, ntree=10)
regr
regr2
#Make prediction
predictions= predict(regr, X_test)
predictions2= predict(regr2, X_test)
HTH,
Eric
On Tue, May 9, 2023 at 6:40 AM Eric Berger <ericjberger at gmail.com>
2023 May 08
1
RandomForest tuning the parameters
Dear R-experts,
Below is a toy example that produces some error messages, especially at the end of the code (Tuning the parameters). Your help in correcting my R code would be highly appreciated.
#######################################
#libraries
library(lattice)
library(ggplot2)
library(caret)
library(randomForest)
#Data
2018 Mar 29
2
Passing string arguments to a formula
Hello,
I have the following stored in a string:
> parametros
[1] "ntree=10" "ntree=30" "ntree=50" "ntree=100" "ntree=200"
I want to feed these into the model one at a time with a for loop, but I am not sure how: deparse does not work for me, neither does get (obviously, since it is not an object), and I do not know how to do it dynamically
2018 Jan 22
2
Random Forests
Many thanks, Carlos, as always.
It is odd that I missed it. At the time I looked through all the
RF arguments, as I always do, but I had forgotten that one. To be honest,
it was working great, but it seemed strange to me. Although, given that
RFs do not overfit, there is no problem with their trees being as
large as you like. I have tested it with an external database and
it explains
2012 Dec 03
1
How do I make R randomForest model size smaller?
I've been training randomForest models on 7 million rows of data (41
features). Here's an example call:
myModel <- randomForest(RESPONSE~., data=mydata, ntree=50, maxnodes=30)
I thought surely with only 50 trees and 30 terminal nodes that the memory
footprint of "myModel" would be small. But it's 65 megs in a dump file. The
object seems to be holding all sorts of
2012 Oct 22
1
random forest
Hi all,
Can someone tell me the difference between the following two formulas?
1. epiG.rf <-randomForest(gamma~.,data=data, na.action = na.fail,ntree =
300,xtest = NULL, ytest = NULL,replace = T, proximity =F)
2.epiG.rf <-randomForest(gamma~.,data=data, na.action = na.fail,ntree =
300,xtest = NULL, ytest = NULL,replace = T, proximity =F)
2018 Jan 20
2
Random Forests
Yes, Carlos. I do the same, but those same numbers come out huge.
> treesize(RFfit)
[1] 4304 4302 4311 4319 4343 4298 4298 4311 4349 4327 4331 4317
4294 4321 4283 4362
[17] 4300 4330 4266 4331 4308 4352 4294 4315 4372 4349 4331 4347
4329 4348 4298 4335
[33] 4346 4396 4345 4313 4293 4276 4353 4272 4304 4325 4317 4336
4308 4351 4374 4324
[49] 4386 4359 4311 4346 4300
2010 May 25
1
Need Help! Poor performance about randomForest for large data
Hi, dears,
I am processing some data with 60 columns, and 286,730 rows.
Most columns are numerical, and some are categorical.
It turns out that when ntree is set to the default value (500), it says "can
not allocate a vector of 1.1 GB size"; and when I set ntree to a very
small number like 10, it runs for hours.
I use the (x,y) rather than the (formula,data).
2010 Jan 11
1
Help me! using random Forest package, how to calculate Error Rates in the training set ?
I am now learning random forests and using the randomForest package. I can get
the OOB error rate and the test set error rate, and now I want to get the
training set error rate. How can I do that?
pgp.rf<-randomForest(x.tr,y.tr,x.ts,y.ts,ntree=1e3,keep.forest=FALSE,do.trace=1e2)
Using this code I can get the OOB and test set error rates. If I replace x.ts and
y.ts with x.tr and y.tr, respectively, is the error rate
2005 Sep 08
2
Re-evaluating the tree in the random forest
Dear mailinglist members,
I was wondering if there was a way to re-evaluate the
instances of a tree (in the forest) again after I have
manually changed a splitpoint (or split variable) of a
decision node. Here's an illustration:
library("randomForest")
forest.rf <- randomForest(formula = Species ~ ., data
= iris, do.trace = TRUE, ntree = 3, mtry = 2,
norm.votes = FALSE)
# I am
2008 Jun 15
1
randomForest, 'No forest component...' error while calling Predict()
Dear R-users,
While making a prediction using the randomForest function (package
randomForest) I'm getting the following error message:
"Error in predict.randomForest(model, newdata = CV) : No forest component
in the object"
Here's my complete code. For reproducing this task, please find my 2 data
sets attached ( http://www.nabble.com/file/p17855119/data.rar data.rar ).
2010 Jan 15
1
randomForest maxnodes
Has anyone successfully used the maxnodes feature in randomForest? I tried
setting it, but when it is non-NULL I always get back a forest in which all
trees have size 1. I am using a continuous response (regression). Any help
would be appreciated.
Thanks.
2007 Apr 23
6
Random Forest
Hi,
I am trying to print out my confusion matrix after having created my random
forest.
I have put in this command:
fit<-randomForest(MMS_ENABLED_HANDSET~.,data=dat,ntree=500,mtry=14,
na.action=na.omit,confusion=TRUE)
but I can't get it to give me the confusion matrix, anyone know how this
works?
Thanks!
Ruben
2011 Dec 15
2
Random Forest Reading N/A's, I don't see them
After checking the original data in Excel for blanks and running Summary(cm3)
to identify any null values in my data, I'm unable to identify any instances.
Yet when I attempted to use the data in Random Forest, I get the following
error. Is there something that Random Forest is reading as null which is not
actually null? Is there a better way to check for this?
> library(randomForest)
>
2018 Jan 20
2
Random Forests
Thanks Carlos and Javier, ntree is the number of trees and treesize their
respective sizes (number of nodes)
ntree: Number of trees to grow. This should not be set to too small ......
treesize: Size of trees (number of nodes) in an ensemble.
I did set 1000 trees (ntree=1000), yes, but the treesize function gives you the
number of nodes:
treesize(RFfit, terminal=TRUE) gives me a vector of 1000 elements (one
2007 Aug 10
1
rfImpute
I am having trouble with the rfImpute function in the randomForest package.
Here is a sample...
clunk.roughfix<-na.roughfix(clunk)
>
> clunk.impute<-rfImpute(CONVERT~.,data=clunk)
ntree OOB 1 2
300: 26.80% 3.83% 85.37%
ntree OOB 1 2
300: 18.56% 5.74% 51.22%
Error in randomForest.default(xf, y, ntree = ntree, ..., do.trace = ntree,
:
NA not
2008 Jul 05
1
Random Forest %var(y)
The verbose option gives a display like:
> rf.500 <-
+ randomForest(new.x,trn.y,do.trace=20,ntree=100,nodesize=500,
+ importance=T)
| Out-of-bag |
Tree | MSE %Var(y) |
20 | 0.9279 100.84 |
What is the meaning of %var(y)>100%? I expected that to correspond to a
model that was worse than random, but the predictions seem much better than
that on
2008 May 05
1
Problems using rfImpute
Hello R-user!
I am running R 2.7.0 on a Power Book (Tiger). (I am still a beginner in R
and statistics.)
I tried rfImpute (randomForest) and, as far as I understand, it should
replace NAs using a proximity matrix:
> set.seed(100000)
> Subset5Imputed<-rfImpute(Sex~., data=Subset5)
ntree OOB 1 2
300: 11.78% 12.36% 11.21%
ntree OOB 1 2
300: 12.07% 12.64%