I have a general question about how to interpret the plotcp() graph.
The cross-validated error ("xerror") typically decreases as the tree
grows: it starts at approximately 1.0 for the root node, drops below the
1SE line, reaches a plateau, and may decrease further when the tree gets
very complex [e.g., Venables & Ripley, 4th ed., p. 260]. The preferred tree
is the smallest one whose xerror lies below the 1SE line.
What happens if plotcp() shows a V-shaped profile? For example, it goes down,
reaches the lowest point at a tree of size 6, then comes back up (see attached
pdf graph). It seems that the predictors (I have a few) do not help the splits.
Would it be reasonable to prune it at size 6? Or perhaps rpart() is not
suitable for this analysis?
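
To make the question concrete, here is a minimal sketch of the two pruning
choices I am weighing: pruning at the minimum xerror versus pruning by the
1SE rule. It uses the kyphosis data shipped with rpart as a stand-in (it is
not my data), and the object names (fit, cptab, pruned_min, pruned_1se) are
only illustrative.

```r
library(rpart)

set.seed(1)  # cross-validation folds are random, so fix the seed

# Grow a deliberately large tree; kyphosis is only a stand-in data set.
fit <- rpart(Kyphosis ~ Age + Number + Start, data = kyphosis,
             method = "class", cp = 0.001)

# Columns of the cp table: CP, nsplit, rel error, xerror, xstd.
# Rows run from the root (largest CP) to the full tree (smallest CP).
# The "size of tree" axis in plotcp() is nsplit + 1.
cptab <- fit$cptable

# Choice 1: the tree with the smallest cross-validated error.
imin <- which.min(cptab[, "xerror"])

# Choice 2 (1SE rule): the smallest tree whose xerror is within one
# standard error of that minimum.
thresh <- cptab[imin, "xerror"] + cptab[imin, "xstd"]
i1se   <- min(which(cptab[, "xerror"] <= thresh))

pruned_min <- prune(fit, cp = cptab[imin, "CP"])
pruned_1se <- prune(fit, cp = cptab[i1se, "CP"])

plotcp(fit)      # dotted line is drawn at min(xerror) + 1 SE
printcp(fit)
```

With a V-shaped profile the two choices may coincide or differ by only a
split or two, which is part of what I am unsure how to interpret.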
Yuelin.
Attachment: Temp1.pdf (application/pdf, 7493 bytes)
<https://stat.ethz.ch/pipermail/r-help/attachments/20080530/6b6b262d/attachment.pdf>