More information is needed to be sure, but most likely some of the
resampled rpart models produce the same prediction for every hold-out
sample (probably because no viable split was found).

Almost every incarnation of R^2 requires the variance of the
predictions. When that variance is zero, the calculation divides by
zero, which would show up as a missing value in the resampled R^2.
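As a rough illustration of that failure mode, here is a minimal sketch that uses the squared correlation as one common form of R^2. The numbers are made up for the example, and the constant prediction stands in for what a tree with no splits would return.

```r
# Hypothetical hold-out values: the observed outcomes vary, the predictions do not
obs  <- c(10.3, 16.4, 21.0, 33.8)
pred <- rep(mean(obs), length(obs))   # a tree with no splits predicts one value

var(pred)                             # 0: the predictions have no variance

# cor() warns that the standard deviation is zero and returns NA
r2 <- suppressWarnings(cor(pred, obs)^2)
is.na(r2)                             # TRUE: R^2 is undefined for this resample
```

A single resample like this is enough to put a missing value into the resampled performance measures.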

To verify this, try supplying your own summary function (see the
summaryFunction argument in ?trainControl) and put a
print(summary(data$pred)) call in it.
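A minimal sketch of such a summary function, assuming the (data, lev, model) signature described in ?trainControl. The name printingSummary is made up for this example; it prints the hold-out predictions and then defers to caret's defaultSummary for the usual metrics.

```r
library(caret)

# Hypothetical helper: prints the hold-out predictions for each resample,
# then returns caret's default metrics so train() can proceed as usual
printingSummary <- function(data, lev = NULL, model = NULL) {
  # data$pred holds the predictions for the current hold-out set
  print(summary(data$pred))
  defaultSummary(data, lev, model)
}

data(trees)
set.seed(1)  # arbitrary seed so the folds are repeatable
tc  <- trainControl(method = "cv", summaryFunction = printingSummary)
fit <- train(Volume ~ Girth + Height, data = trees,
             method = "rpart", trControl = tc)
```

If the claim holds, some of the printed summaries should show identical Min. and Max. values, meaning every hold-out sample in that resample received the same prediction.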
Max
On Wed, May 16, 2012 at 11:30 AM, Max Kuhn <mxkuhn at gmail.com> wrote:
> On Tue, May 15, 2012 at 5:55 AM, Dominik Bruhn <dominik at dbruhn.de> wrote:
>> Hi,
>> I ran into the following problem when trying to build an rpart model
>> with any resampling method other than LOOCV. Originally, I wanted to use
>> k-fold partitioning, but every method except LOOCV throws the following
>> warning:
>>
>> ----
>> Warning message: In nominalTrainWorkflow(dat = trainData, info = trainInfo,
>> method = method, : There were missing values in resampled performance
>> measures.
>> -----
>>
>> Below are some simplified test cases that reproduce the warning on my
>> system.
>>
>> Question: What does this warning mean? How can I avoid it?
>>
>> System-Information:
>> -----
>> > sessionInfo()
>> R version 2.15.0 (2012-03-30)
>> Platform: x86_64-pc-linux-gnu (64-bit)
>>
>> locale:
>>  [1] LC_CTYPE=en_GB.UTF-8       LC_NUMERIC=C
>>  [3] LC_TIME=en_GB.UTF-8        LC_COLLATE=en_GB.UTF-8
>>  [5] LC_MONETARY=en_GB.UTF-8    LC_MESSAGES=en_GB.UTF-8
>>  [7] LC_PAPER=C                 LC_NAME=C
>>  [9] LC_ADDRESS=C               LC_TELEPHONE=C
>> [11] LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C
>>
>> attached base packages:
>> [1] stats     graphics  grDevices utils     datasets  methods   base
>>
>> other attached packages:
>> [1] rpart_3.1-52   caret_5.15-023 foreach_1.4.0  cluster_1.14.2 reshape_0.8.4
>> [6] plyr_1.7.1     lattice_0.20-6
>>
>> loaded via a namespace (and not attached):
>> [1] codetools_0.2-8 compiler_2.15.0 grid_2.15.0     iterators_1.0.6
>> [5] tools_2.15.0
>> -------
>>
>>
>> Simplified test case I: Throws the warning
>> ---
>> library(caret)
>> data(trees)
>> formula=Volume~Girth+Height
>> train(formula, data=trees, method='rpart')
>> ---
>>
>> Simplified test case II: Every other CV method also throws the warning,
>> for example 'cv':
>> ---
>> library(caret)
>> data(trees)
>> formula=Volume~Girth+Height
>> tc=trainControl(method='cv')
>> train(formula, data=trees, method='rpart', trControl=tc)
>> ---
>>
>> Simplified test case III: The only CV method that works is 'LOOCV':
>> ---
>> library(caret)
>> data(trees)
>> formula=Volume~Girth+Height
>> tc=trainControl(method='LOOCV')
>> train(formula, data=trees, method='rpart', trControl=tc)
>> ---
>>
>>
>> Thanks!
>> --
>> Dominik Bruhn
>> mailto: dominik at dbruhn.de