Chrysanthi A.
2009-Apr-12 16:26 UTC
[R] Running random forest using different training and testing schemes
Hi,

I would like to run the random forest classification algorithm and check the accuracy of the predictions under different training and testing schemes: for example, using 70% of the samples for training and the rest for testing, or using a 10-fold cross-validation scheme. How can I do that? Is there a function?

Thanks a lot,
Chrysanthi.
Pierre Moffard
2009-Apr-12 17:01 UTC
[R] Re: Running random forest using different training and testing schemes
Hi Chrysanthi,

Check out the randomForest package and its randomForest function. It reports an out-of-bag (OOB) error estimate by default; as far as I know there is no CV argument on randomForest itself, though the package does include an rfcv function. Sorry for not giving you a lengthier response at the moment, but I'm rather busy on a project. Let me know if you need more help.

Also, to split your data into two parts, the training set and the test set, you can do (n is the number of data points):

    n <- nrow(data)
    # 70% of the rows for training, sampled without replacement
    training_indices <- sample(n, round(0.7 * n))
    # everything else goes to the test set
    test_indices <- setdiff(seq_len(n), training_indices)

Then data[training_indices, ] is the training set and data[test_indices, ] is the test set.

Best,
Pierre
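For the 70%/30% scheme in the question, the split can be combined with a fit and an accuracy check roughly as follows. This is only a sketch: the built-in iris data set stands in for the real data, the seed is arbitrary, and the response column name (Species) is an assumption that has to be replaced with your own.

```r
library(randomForest)

set.seed(42)                      # arbitrary seed so the split is reproducible
data <- iris                      # stand-in data set; replace with your own
n <- nrow(data)

# 70% of the rows for training, the remaining 30% for testing
training_indices <- sample(n, round(0.7 * n))
test_indices <- setdiff(seq_len(n), training_indices)

# Fit the forest on the training rows only
fit <- randomForest(Species ~ ., data = data[training_indices, ])

# Predict the held-out rows and compare with the true classes
pred <- predict(fit, newdata = data[test_indices, ])
accuracy <- mean(pred == data$Species[test_indices])

print(table(predicted = pred, actual = data$Species[test_indices]))
print(accuracy)
```

A single split gives a fairly noisy accuracy estimate on small data sets, so it may be worth repeating it with several seeds and averaging.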
Pierre Moffard
2009-Apr-12 17:09 UTC
[R] Re: Running random forest using different training and testing schemes
You need to include something like this in your code:

    library(rpart)
    tree <- rpart(result ~ ., data, control = rpart.control(xval = 10))

Here xval = 10 requests 10-fold CV. Note that rpart fits a single classification tree rather than a random forest, so this applies to rpart models only.

Best,
Pierre
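Since xval is an rpart setting, one way to get a 10-fold cross-validated accuracy for a random forest is to build the folds by hand. The loop below is a minimal sketch under the same assumptions as before: iris as stand-in data, Species as the response, and an arbitrary seed.

```r
library(randomForest)

set.seed(42)                      # arbitrary seed for reproducible folds
data <- iris                      # stand-in data set; replace with your own
n <- nrow(data)
k <- 10                           # number of folds

# Assign each row to one of k folds at random (fold sizes differ by at most one)
folds <- sample(rep(seq_len(k), length.out = n))

fold_accuracy <- numeric(k)
for (i in seq_len(k)) {
  test_rows <- which(folds == i)

  # Train on the other k - 1 folds, test on fold i
  fit <- randomForest(Species ~ ., data = data[-test_rows, ])
  pred <- predict(fit, newdata = data[test_rows, ])
  fold_accuracy[i] <- mean(pred == data$Species[test_rows])
}

cv_accuracy <- mean(fold_accuracy)
print(fold_accuracy)
print(cv_accuracy)
```

Every row is used for testing exactly once, so cv_accuracy averages over predictions for the whole data set.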
Max Kuhn
2009-Apr-12 21:46 UTC
[R] Running random forest using different training and testing schemes
There is also the train function in the caret package. The trainControl function can be used to try different resampling schemes. There is also a package vignette with details.

Max
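A rough illustration of the caret route, covering both schemes from the question. It assumes the caret and randomForest packages are installed and again uses iris with Species as a placeholder response; the seed and the number of repeated splits (25) are arbitrary choices, not recommendations.

```r
library(caret)   # method = "rf" also needs the randomForest package installed

set.seed(42)     # arbitrary seed so the resampling is reproducible
data <- iris     # stand-in data set; replace with your own

# Scheme 1: 10-fold cross-validation
ctrl_cv <- trainControl(method = "cv", number = 10)
fit_cv <- train(Species ~ ., data = data, method = "rf", trControl = ctrl_cv)

# Scheme 2: repeated 70%/30% training/test splits (leave-group-out CV)
ctrl_split <- trainControl(method = "LGOCV", p = 0.7, number = 25)
fit_split <- train(Species ~ ., data = data, method = "rf", trControl = ctrl_split)

# Resampled accuracy for each candidate value of mtry
print(fit_cv$results)
print(fit_split$results)
```

train also tunes mtry over a small grid by default, so the results table has one row per candidate value rather than a single accuracy figure.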