On Mon, 25 Sep 2006, henrigel at gmx.de wrote:
> Dear r-help-list:
>
> If I use the rpart method like
>
> cfit<-rpart(y~.,data=data,...),
>
> what kind of tree is stored in cfit?
> Is it right that this tree is not pruned at all, that it is the full tree?
It is an rpart object. This contains both the tree and the instructions
for pruning it at all values of cp: note that cp is also used in deciding
how large a tree to grow.
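
A minimal sketch of both roles of cp, assuming the kyphosis data that ships
with rpart (not the data from the question) and an arbitrary pruning value
of cp = 0.05 chosen only for illustration:

```r
library(rpart)

# cp controls growth: the default (cp = 0.01) stops splitting earlier
# than cp = 0, which lets the tree grow as far as the other controls allow.
fit_default <- rpart(Kyphosis ~ Age + Number + Start, data = kyphosis,
                     method = "class")
fit_large <- rpart(Kyphosis ~ Age + Number + Start, data = kyphosis,
                   method = "class", control = rpart.control(cp = 0))

# The cp table stored in the object: one row per nested subtree.
fit_large$cptable

# Pruning at a given cp selects one of those subtrees; nothing is refitted.
# The value 0.05 is illustrative only.
pruned <- prune(fit_large, cp = 0.05)
```

So the stored tree is the largest one grown under the cp given at fitting
time, and any smaller subtree can be recovered from it with prune().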
> If so, it's up to me to choose a subtree by using the printcp method.
Or the plotcp method.
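
For instance (a sketch, again assuming the kyphosis data from the rpart
package rather than the poster's data):

```r
library(rpart)

cfit <- rpart(Kyphosis ~ Age + Number + Start, data = kyphosis,
              method = "class")

printcp(cfit)  # table with columns CP, nsplit, rel error, xerror, xstd
plotcp(cfit)   # xerror against cp, with a line 1 SE above the minimum
```

The xerror and xstd columns come from cross-validation, so they change from
run to run unless the random seed is fixed.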
> In the technical report from Atkinson and Therneau "An Introduction to
> recursive partitioning using the rpart routines" from 2000, one can see
> the following table on page 15:
>
>      CP nsplit rel error xerror  xstd
> 1 0.105      0   1.00000 1.0000 0.108
> 2 0.056      3   0.68519 1.1852 0.111
> 3 0.028      4   0.62963 1.0556 0.109
> 4 0.574      6   0.57407 1.0556 0.109
> 5 0.100      7   0.55556 1.0556 0.109
>
> Some lines below it says "We see that the best tree has 5 terminal nodes
> (4 splits)". Why is that, if the xerror is lowest for the tree
> consisting only of the root?
There are *two* reports with that name: this seems to be from minitech.ps.
The choice is explained in the rest of that paragraph (the 1-SE rule was used).
My guess is that the authors excluded the root as not being a tree, but
only they can answer that.
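
The arithmetic can be checked against the table quoted above. This is only
a sketch: one_se_choice is a hypothetical helper written for this message,
not an rpart function, and it implements the usual reading of the 1-SE rule
(the smallest tree whose xerror is within one standard error of the minimum).

```r
# nsplit, xerror and xstd copied from the table quoted above
tab <- data.frame(
  nsplit = c(0, 3, 4, 6, 7),
  xerror = c(1.0000, 1.1852, 1.0556, 1.0556, 1.0556),
  xstd   = c(0.108, 0.111, 0.109, 0.109, 0.109)
)

# Hypothetical helper: number of splits of the smallest tree whose xerror
# is at most min(xerror) plus the xstd of the minimising row.
one_se_choice <- function(tab) {
  best <- which.min(tab$xerror)
  threshold <- tab$xerror[best] + tab$xstd[best]
  tab$nsplit[min(which(tab$xerror <= threshold))]
}

one_se_choice(tab)        # 0: the root wins if it counts as a candidate
one_se_choice(tab[-1, ])  # 4: without the root, 4 splits (5 terminal nodes)
```

With the root excluded the threshold is 1.0556 + 0.109 = 1.1646, which rules
out the 3-split tree (xerror 1.1852) and leaves the 4-split tree as the
smallest candidate. That is consistent with the guess above, but it is not
confirmation of what the authors did.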
--
Brian D. Ripley, ripley at stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595