I asked why rpart is slow.
Patrick Connolly <p.connolly at hortresearch.co.nz> replied:
You could give us an indication of just what you're trying to
do, with what, and to what, so we would be in a position to say what
improvements could be made.
The thing that is chugging away now is

	rpart(rgrp ~ y2 + sex, a.frame, a.frame$wt)

where
	rgrp has 21 levels
	y2 has 561 levels
	sex has 2 levels
	wt has values 1..9
	a.frame has 50,500 cases and other variables

I have written decision tree builders (in fact I've published a paper on
the technique), and I really would expect this to zip through in seconds,
instead of the 4 hours this one has taken so far today (500MHz machine).
Presumably it's something to do with trying to do binary splits and find
good subsets, but I don't *want* binary splits, and I can't figure out
from ?rpart how to tell rpart that I don't want binary splits.
(The idea of trying to find an optimal partition of a set of 561 elements
does not fill me with enthusiasm.)
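
To put a rough number on that: a factor with k unordered levels can be cut
into two non-empty groups in 2^(k-1) - 1 ways. The sketch below only does
that arithmetic. The helper name n_binary_splits is made up for
illustration, and whether rpart really enumerates every subset for a
21-class response is an assumption here, not something confirmed from
?rpart.

```r
# Hypothetical helper: number of ways to split k unordered levels
# into two non-empty groups (left/right order ignored).
n_binary_splits <- function(k) 2^(k - 1) - 1

n_binary_splits(2)    # sex: exactly 1 possible split
n_binary_splits(561)  # y2: on the order of 1e168 candidate subsets
```

If the search is anything like exhaustive, that count alone would account
for hours of run time on the y2 term.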
Is there perhaps an alternative to rpart that does n-way splits instead of
binary splits?
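
For comparison, here is a minimal sketch of what scoring a single n-way
split costs: one weighted cross-tabulation, with one child node per level.
The data frame toy is made-up stand-in data (not a.frame), and Gini
impurity is chosen purely for illustration, since the post does not say
which criterion is wanted. xtabs and rowSums are base R.

```r
# Toy data standing in for a.frame; all values are made up.
set.seed(1)
toy <- data.frame(
  rgrp = factor(sample(letters[1:3], 200, replace = TRUE)),
  y2   = factor(sample(1:10, 200, replace = TRUE)),
  wt   = sample(1:9, 200, replace = TRUE)
)

# One pass over the data: weighted class counts within each y2 level.
counts <- xtabs(wt ~ y2 + rgrp, data = toy)

# Weighted Gini impurity of the n-way split (one child per y2 level).
child_wt   <- rowSums(counts)
child_gini <- 1 - rowSums((counts / child_wt)^2)
split_gini <- sum(child_wt * child_gini) / sum(child_wt)
```

The work is linear in the number of cases plus the size of the table,
which is why an n-way splitter would be expected to finish in seconds
on data of this size.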