Jing Liu
2011-May-12 23:04 UTC
[R] Can ROC be used as a metric for optimal model selection for randomForest?
Dear all,

I am using the "caret" package for predictor selection with a randomForest model. The following is the train call:

    rfFit <- train(x = trainRatios, y = trainClass, method = "rf", importance = TRUE, do.trace = 100, keep.inbag = TRUE,
                   tuneGrid = grid, trControl = bootControl, scale = TRUE, metric = "ROC")

I want to use ROC as the metric for variable selection. I know that this works with the logit model, provided that classProbs = TRUE and summaryFunction = twoClassSummary are set in the trainControl function. However, if I do the same with randomForest, I get this warning:

    In train.default(x = trainPred, y = trainDep, method = "rf", :
      The metric "ROC" was not in the result set. Accuracy will be used instead.

Can the ROC metric be used with randomForest? Have I missed something? I would be very grateful if anyone can help!

Best regards,
XiaoLiu
Frank Harrell
2011-May-13 12:11 UTC
[R] Can ROC be used as a metric for optimal model selection for randomForest?
Using anything other than deviance (or likelihood) as the objective function will result in a suboptimal model.

Frank

-----
Frank Harrell
Department of Biostatistics, Vanderbilt University
Max Kuhn
2011-May-13 12:38 UTC
[R] Can ROC be used as a metric for optimal model selection for randomForest?
XiaoLiu,

I can't see the options you used in bootControl. The warning you got is consistent with leaving classProbs and summaryFunction unspecified. Please double-check that you set classProbs = TRUE and summaryFunction = twoClassSummary in trainControl before you ran train.

Max
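For readers following along, here is a minimal sketch of the setup described in this reply. It is not the original poster's code: the toy data, the object names (x, y, ctrl, fit) and the resampling settings are made up for illustration, and it assumes a recent caret release, where tuning grid columns are named without a leading dot (older releases used .mtry). Depending on the caret version, twoClassSummary may also need the pROC package installed.

```r
library(caret)
library(randomForest)

set.seed(1)

# Hypothetical toy data: three numeric predictors and a two-class outcome.
# The factor levels must be valid R names because class probabilities
# are returned in columns named after the levels.
n <- 200
x <- data.frame(a = rnorm(n), b = rnorm(n), c = rnorm(n))
y <- factor(ifelse(x$a + rnorm(n) > 0, "yes", "no"))

# Both settings are needed for "ROC" to appear in the resampling results:
# classProbs = TRUE asks for class probabilities, and twoClassSummary
# computes ROC (area under the curve), Sens and Spec from them.
ctrl <- trainControl(method = "boot", number = 5,
                     classProbs = TRUE,
                     summaryFunction = twoClassSummary)

# Candidate values of mtry (illustrative choice)
grid <- expand.grid(mtry = 1:3)

# ntree is passed through to randomForest; kept small so the sketch runs fast
fit <- train(x = x, y = y, method = "rf", metric = "ROC",
             tuneGrid = grid, trControl = ctrl, ntree = 100)

print(fit$metric)   # metric used to pick the final model
print(fit$results)  # one row per mtry value, with ROC, Sens and Spec columns
```

If the warning about "ROC" not being in the result set still appears with a control object like this, checking that the object passed as trControl is the one that was actually modified is a reasonable next step.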