Rodica Coderie
2015-Feb-16 10:08 UTC
[R] #library(party) - Compare predicted results for ctree
Hello,

I've created a ctree model called fit using 15 input variables for a factor response variable Response (YES/NO).

When I run the following:

    table(predict(fit2), training_data$response)

I get the following result (rows are predictions, columns are observed values):

             NO   YES
    NO    48694   480
    YES       0     0

It appears that the NO responses are predicted with 100% accuracy and the YES responses are predicted with 0% accuracy.

Why is this happening? Is it because of my data, or is it something in the ctree algorithm?

Thanks!
Rodica
Achim Zeileis
2015-Feb-16 10:17 UTC
[R] #library(party) - Compare predicted results for ctree
On Mon, 16 Feb 2015, Rodica Coderie via R-help wrote:

> Hello,
>
> I've created a ctree model called fit using 15 input variables for a
> factor response variable Response (YES/NO).
> When I run the following:
> table(predict(fit2), training_data$response)
> I get the following result:
>
>          NO   YES
> NO    48694   480
> YES       0     0
>
> It appears that the NO responses are predicted with 100% accuracy and
> the YES responses are predicted with 0% accuracy.
>
> Why is this happening? Is it because of my data, or is it something in
> the ctree algorithm?

Your data has less than 1% YES observations (480 of 49,174), and I would guess that the tree cannot separate these in such a way that majority voting within any terminal node gives a YES prediction. You might consider a different cutoff (other than 50%) or downsampling the NO observations.

> Thanks!
> Rodica
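The two suggestions in the reply (a cutoff other than 50%, or downsampling the NO observations) can be sketched in base R as follows. This is only an illustration under stated assumptions: the toy data, the node probabilities, and the helper names classify() and downsample_majority() are all made up here and are not part of party. With a fitted ctree, the per-observation class probabilities would instead come from the model, for example via predict(fit, type = "prob") or treeresponse(fit), whose return format should be checked before use.

```r
# Toy stand-in for the imbalanced problem (all numbers are invented):
# 990 NO and 10 YES observations, i.e. 1% YES.
actual <- factor(rep(c("NO", "YES"), times = c(990, 10)),
                 levels = c("NO", "YES"))

# Hypothetical P(YES) per observation, as two terminal nodes might give:
# node 1 holds 900 NO + 2 YES, node 2 holds 90 NO + 8 YES.
p_node1 <- 2 / 902
p_node2 <- 8 / 98
p_yes <- c(rep(p_node1, 900), rep(p_node2, 90),  # the NO observations
           rep(p_node1, 2),   rep(p_node2, 8))   # the YES observations

# Hypothetical helper: turn P(YES) into a class label at a given cutoff.
classify <- function(p_yes, cutoff = 0.5) {
  factor(ifelse(p_yes > cutoff, "YES", "NO"), levels = c("NO", "YES"))
}

# Default 50% cutoff (majority voting): no node reaches 50% YES,
# so every observation is predicted NO.
tab_default <- table(predicted = classify(p_yes, 0.5), actual = actual)

# Lowered cutoff, here the observed YES prevalence (an arbitrary choice):
# node 2 is now predicted YES, at the cost of false positives.
prevalence <- mean(actual == "YES")
tab_lowered <- table(predicted = classify(p_yes, prevalence), actual = actual)

# Hypothetical helper: keep all minority rows and a random subset of the
# majority rows, with `ratio` majority rows per minority row.
downsample_majority <- function(data, response, ratio = 1) {
  y <- data[[response]]
  counts <- table(y)
  minority <- names(counts)[which.min(counts)]
  minority_idx <- which(y == minority)
  majority_idx <- which(y != minority)
  n_keep <- min(length(majority_idx), ceiling(ratio * length(minority_idx)))
  picked <- majority_idx[sample.int(length(majority_idx), n_keep)]
  data[sort(c(minority_idx, picked)), , drop = FALSE]
}

toy <- data.frame(response = actual, p_yes = p_yes)
set.seed(42)
balanced <- downsample_majority(toy, "response", ratio = 1)

print(tab_default)
print(tab_lowered)
print(table(balanced$response))

# The balanced data could then be used to refit the tree, e.g. (not run):
# fit_bal <- party::ctree(response ~ ., data = balanced)
```

Both approaches trade NO accuracy for YES detection, so the cutoff or the downsampling ratio has to be chosen according to the relative cost of the two kinds of error. Note also that probabilities from a tree fitted on downsampled data no longer reflect the original YES prevalence.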