J4T5U8
2020-Apr-07 14:22 UTC
[R] Error using R caret package (train) with C5.0 decision tree to do K-fold cross validation
I'm trying to use the caret package to do repeated k-fold cross validation
with C5.0 decision trees.
The following code fits a working C5.0 decision tree (68% accuracy on the
test set, according to the confusion matrix):
> model <- C5.0(as.factor(OneGM) ~ ., data=OneT.train)
> results <- predict(object=model, newdata=OneT.test, type="class")
The caret package code gives these errors:
> train_control <- trainControl(method="repeatedcv", number=10, repeats=10)
> model <- train(as.factor(OneGM) ~ ., data=OneT.train,
+                trControl=train_control, method="C5.0")
Error in na.fail.default(list(`as.factor(OneGM)` = c(1L, 1L, 1L, 1L, :
missing values in object
> model <- train(OneGM ~ ., data=OneT.train, trControl=train_control,
+                method="C5.0")
Error in na.fail.default(list(OneGM = c(FALSE, FALSE, FALSE, FALSE, :
missing values in object
The data is loaded from a .csv file, and OneGM is either TRUE or FALSE (a text
column in the .csv).
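Both errors are raised by na.fail, which suggests that OneT.train contains missing values. As far as I can tell, the formula interface of train() uses na.action = na.fail by default, while the direct C5.0 call does not stop on NAs. The following is only a minimal sketch of how the missing values could be located and the failure reproduced with base R. The data frame toy and its columns x1 and x2 are hypothetical stand-ins for OneT.train, not the real data.

```r
# Toy stand-in for OneT.train; the predictor columns x1 and x2 are hypothetical.
toy <- data.frame(
  OneGM = c(TRUE, FALSE, TRUE, FALSE, TRUE),
  x1    = c(1.2, NA, 3.4, 0.5, 2.2),
  x2    = c("a", "b", NA, "a", "b"),
  stringsAsFactors = TRUE
)

# Count missing values per column to see which variables contain NAs.
na_counts <- colSums(is.na(toy))
print(na_counts)

# model.frame() with na.action = na.fail stops on any NA; this is the same
# kind of failure as the errors reported above.
err <- tryCatch(
  model.frame(OneGM ~ ., data = toy, na.action = na.fail),
  error = function(e) conditionMessage(e)
)
print(err)

# Dropping incomplete rows (one option among several) lets the same call
# succeed, at the cost of discarding data.
toy_complete <- na.omit(toy)
mf <- model.frame(OneGM ~ ., data = toy_complete, na.action = na.fail)
print(nrow(mf))
```

If the same check on OneT.train reports NAs, two options that might be worth trying are passing the na.omit()-filtered data to train(), or passing na.action = na.pass to train() so that the missing values are handed to C5.0 itself. I have not verified either against this data set.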
I would like to use the one-line caret package approach above (which I've
seen used in multiple places), and I'm not looking for solutions that do
cross validation manually.
Thanks for any help.