Elahe chalabi
2017-Aug-23 12:38 UTC
[R] cross validation in random forest using rfcv function
Hi all,

I would like to do cross-validation for a random forest using the rfcv function. The documentation for the package gives its usage as:

rfcv(trainx, trainy, cv.fold=5, scale="log", step=0.5,
     mtry=function(p) max(1, floor(sqrt(p))), recursive=FALSE, ...)

However, I don't know how to build trainx and trainy for my data set, and I could not understand how trainx is built in the documentation example for the iris data set. Here is my data set; I want to use cross-validation to estimate the accuracy of classifying the Alzheimer and Control groups:

str(data)
'data.frame': 499 obs. of 606 variables:
 $ Gender        : int 0 0 0 0 0 1 1 1 1 1 ...
 $ NumOfWords    : num 157 111 163 176 100 124 201 100 76 101
 $ NumofLivings  : int 6 6 9 4 3 5 3 3 4 3 ...
 $ NumofStopWords: num 77 45 87 91 46 64 104 37 32 41 ...
 .
 .
 $ Group         : Factor w/ 2 levels "Alzheimer","Control","Control"..:

So trainy should be data$Group, but what should trainx be? Could anyone help me with this?

Thanks for any help!
Elahe
Elahe chalabi
2017-Aug-23 17:59 UTC
[R] cross validation in random forest using rfcv function
Any responses?
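A minimal sketch of one way to build the two arguments, assuming that every column other than Group is meant to be a predictor. In rfcv, trainx is a matrix or data frame of predictor columns and trainy is the response with one entry per row of trainx, so trainx can be taken as the data frame with the Group column dropped. The data frame `toy` below is a small synthetic stand-in with made-up values (the real data set is not available here), and `acc_all` is a name introduced only for this illustration.

```r
library(randomForest)

set.seed(1)  # make the fold assignment and the forests reproducible

# Synthetic stand-in for the poster's data frame; all values are made up.
n <- 120
toy <- data.frame(
  Gender         = sample(0:1, n, replace = TRUE),
  NumOfWords     = round(runif(n, 50, 250)),
  NumofLivings   = sample(1:10, n, replace = TRUE),
  NumofStopWords = round(runif(n, 20, 120)),
  Group          = factor(sample(c("Alzheimer", "Control"), n, replace = TRUE))
)

# trainx: every column except the response; trainy: the response factor.
trainx <- toy[, setdiff(names(toy), "Group")]
trainy <- toy$Group

# 5-fold cross-validation over a shrinking number of predictors.
cv <- rfcv(trainx, trainy, cv.fold = 5)

# Cross-validated error rate for each number of predictors tried.
print(cv$n.var)
print(cv$error.cv)

# The first entry corresponds to using all predictors, so this is the
# cross-validated accuracy of the full model.
acc_all <- 1 - cv$error.cv[[1]]
print(acc_all)
```

With the real data the same two lines would read `trainx <- data[, setdiff(names(data), "Group")]` and `trainy <- data$Group`. Because the labels in `toy` are random, the accuracy printed here is not meaningful; the sketch only shows the shape of the call.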