Chrysanthi A.
2009-Apr-12 16:26 UTC
[R] Running random forest using different training and testing schemes
Hi,
I would like to run the random forest classification algorithm and check the
accuracy of the prediction under different training and testing schemes, for
example extracting 70% of the samples for training and using the rest for
testing, or using a 10-fold cross-validation scheme.
How can I do that? Is there a function?
Thanks a lot,
Chrysanthi.
[[alternative HTML version deleted]]
Pierre Moffard
2009-Apr-12 17:01 UTC
[R] Re : Running random forest using different training and testing schemes
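Hi Chrysanthi,
check out the randomForest package, with the function randomForest. It has no CV
option as such, but it reports an out-of-bag (OOB) error estimate from the rows
left out of each tree's bootstrap sample, and the package also provides rfcv for
cross-validated feature selection. Sorry for not providing you with a lengthier
response at the moment but I'm rather busy on a project. Let me know if you need
more help.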
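As a rough sketch of what randomForest reports without any explicit train/test split, the following fits a forest and reads off its out-of-bag (OOB) error estimate. The iris data set, the seed, and ntree = 500 are assumptions chosen only for illustration; they are not from the thread.

```r
library(randomForest)

set.seed(3)                          # arbitrary seed, only for reproducibility
fit <- randomForest(Species ~ ., data = iris, ntree = 500)

print(fit)                           # summary includes the OOB error estimate

# err.rate has one row per tree; the last row is the OOB error of the full forest.
oob_error <- fit$err.rate[fit$ntree, "OOB"]
print(oob_error)
```

The OOB estimate is computed from the rows each tree did not see during fitting, so it is often used as a quick stand-in for a held-out test set.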
Also, to split your data into two parts, the training set and the test set, you
can do the following (n is the number of data points; 0.7 gives the 70% training
share you asked about):
n <- nrow(data)
training_indices <- sample(n, round(0.7 * n))   # 70% of the rows, drawn without replacement
test_indices <- setdiff(seq_len(n), training_indices)   # the remaining rows
Then data[training_indices, ] is the training set and data[test_indices, ] is the test set.
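A minimal sketch of the 70%/30% hold-out scheme asked about in the original question, assuming the randomForest package is installed. The iris data set stands in for the poster's data, and the seed is arbitrary; both are assumptions for illustration.

```r
library(randomForest)

set.seed(42)                         # arbitrary seed, only for reproducibility
data <- iris                         # stand-in data; Species is the class label
n <- nrow(data)

# Draw 70% of the row numbers for training; the rest form the test set.
training_indices <- sample(n, round(0.7 * n))
test_indices <- setdiff(seq_len(n), training_indices)

# Fit on the training rows only, then predict the held-out rows.
fit <- randomForest(Species ~ ., data = data[training_indices, ])
pred <- predict(fit, newdata = data[test_indices, ])

# Hold-out accuracy and confusion table
accuracy <- mean(pred == data$Species[test_indices])
print(table(predicted = pred, observed = data$Species[test_indices]))
print(accuracy)
```

A single split gives a noisy estimate on small data sets; repeating it with different seeds and averaging the accuracies is one common way to stabilise it.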
Best,
Pierre
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Pierre Moffard
2009-Apr-12 17:09 UTC
[R] Re : Running random forest using different training and testing schemes
For a single classification tree (the rpart package, not randomForest), you need
to include in your code something like:
tree <- rpart(result ~ ., data = data, control = rpart.control(xval = 10))
Here xval = 10 requests 10-fold CV.
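For a random forest rather than a single tree, 10-fold cross-validation can be written by hand in a few lines. This is only a sketch under stated assumptions: iris stands in for the poster's data, and the seed and the fold assignment scheme are illustrative choices, not something given in the thread.

```r
library(randomForest)

set.seed(1)                          # arbitrary seed, only for reproducibility
data <- iris                         # stand-in data; Species is the class label
n <- nrow(data)
k <- 10

# Assign each row to one of k folds at random (fold sizes differ by at most 1).
fold <- sample(rep(seq_len(k), length.out = n))

fold_accuracy <- numeric(k)
for (i in seq_len(k)) {
  held_out <- fold == i
  # Fit on the other k - 1 folds, predict the held-out fold.
  fit <- randomForest(Species ~ ., data = data[!held_out, ])
  pred <- predict(fit, newdata = data[held_out, ])
  fold_accuracy[i] <- mean(pred == data$Species[held_out])
}

cv_accuracy <- mean(fold_accuracy)   # average accuracy over the k folds
print(cv_accuracy)
```

Each row is predicted exactly once, by a forest that never saw it during fitting.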
Best,
Pierre
Max Kuhn
2009-Apr-12 21:46 UTC
[R] Running random forest using different training and testing schemes
There is also the train function in the caret package. The trainControl function
can be used to try different resampling schemes. There is also a package
vignette with details.
Max
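A short sketch of how train and trainControl could cover both schemes from the original question, assuming the caret and randomForest packages are installed. The iris data set, the seed, and the choice of 25 repeated splits are assumptions for illustration only.

```r
library(caret)   # method = "rf" also needs the randomForest package installed

set.seed(7)      # arbitrary seed, only for reproducibility

# Scheme 1: 10-fold cross-validation
cv_ctrl <- trainControl(method = "cv", number = 10)
rf_cv <- train(Species ~ ., data = iris, method = "rf", trControl = cv_ctrl)

# Scheme 2: repeated random 70%/30% splits (leave-group-out CV), 25 repeats
split_ctrl <- trainControl(method = "LGOCV", p = 0.7, number = 25)
rf_split <- train(Species ~ ., data = iris, method = "rf", trControl = split_ctrl)

# Resampled Accuracy and Kappa for each value of mtry that was tried
print(rf_cv$results)
print(rf_split$results)
```

Note that train also tunes mtry over a small grid by default, so the reported accuracy is per candidate mtry rather than a single number.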