Hi,
I am using rpart for my decision stump. I am trying to extract the learned
stump from the output of rpart.
For example:
cntrl <- rpart.control(maxdepth = 1, minsplit = learn - 1,
                       maxsurrogate = 0, usesurrogate = 0, maxcompete = 1,
                       cp = 0, xval = 0)
fit <- rpart(y ~ bx, weights = w / mean(w), control = cntrl)
After some time searching the rpart documentation, experimenting with the
code, and asking questions, I have come to this tentative conclusion:
If my stump is for classification, fit[[1]]$var[[1]] is the selected feature
ID, and fit$split[1,4] is the splitting value.
If my stump is for regression, fit[[1]]$var[[1]] is the selected feature ID,
fit$split[1,4] is the splitting value, and fit[[1]]$yval[[2]] and
fit[[1]]$yval[[3]] are the two constant values for the two leaves.
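
The same pieces can also be read by name instead of by position, which may be
less fragile. The sketch below is only an illustration under assumptions: it
uses the built-in mtcars data as a stand-in for y, bx and w (which are not
shown here), all predictors are numeric, and the names ctrl, frame, split_var,
split_val and leaf_vals are made up for the example. It relies on fit$frame
being the first component of the fit (so fit[[1]] and fit$frame are the same
thing) and on column 4 of fit$splits being named "index".

```r
library(rpart)

# Stand-in data (assumption): mtcars ships with R; mpg is numeric,
# so this fits a regression stump.
ctrl <- rpart.control(maxdepth = 1, minsplit = 2, maxsurrogate = 0,
                      usesurrogate = 0, maxcompete = 0, cp = 0, xval = 0)
fit <- rpart(mpg ~ ., data = mtcars, control = ctrl)

# One row per node, root first; same object as fit[[1]].
frame <- fit$frame

# Name of the variable chosen at the root. as.character() works whether
# the column is stored as a factor or as a character vector.
split_var <- as.character(frame$var[1])

# Threshold of the root split for a numeric predictor (column 4, "index").
split_val <- fit$splits[1, "index"]

# Fitted constants of the terminal nodes, picked by label rather than
# by row position.
leaf_vals <- frame$yval[frame$var == "<leaf>"]
```

Selecting the leaves with frame$var == "<leaf>" avoids hard-coding rows 2 and
3, although for a depth-1 tree the two approaches should agree.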
However, I am stuck again. For example, for my classification stump, here is
the value of fit, where feature No. 59 is selected and split at 0.0065:
> fit
n= 236
node), split, n, deviance, yval
* denotes terminal node
1) root 236 236.0000 -2.831422e-17
2) bx.59< 0.0065 156 0.0000 -1.000000e+00 *
3) bx.59>=0.0065 80 116.7475 5.053073e-01 *
> fit[[1]]$var[[1]]
[1] bx.59
60 Levels: <leaf> bx.1 bx.2 bx.3 bx.4 bx.5 bx.6 bx.7 bx.8 bx.9 bx.10 ... bx.59
If I assign fit[[1]]$var[[1]] to a variable rr, the result is wrong:
> rr <- as.numeric(fit[[1]]$var[[1]])
> rr
[1] 60
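
A note on the 60: the printed value is a factor whose first level is <leaf>,
so bx.59 is the 60th level, and as.numeric() on a factor returns the level
code rather than the number embedded in the label. The sketch below reproduces
this with a hand-built factor (v and feature_id are hypothetical names, and
the levels are assumed to be <leaf> followed by bx.1 through bx.59, as the
"60 Levels" line suggests) and shows one possible way to recover 59 from the
label.

```r
# Hand-built factor mimicking the levels printed above (assumption):
# "<leaf>" first, then bx.1 ... bx.59.
v <- factor("bx.59", levels = c("<leaf>", paste0("bx.", 1:59)))

as.numeric(v)    # 60: position of "bx.59" among the levels, "<leaf>" included
as.character(v)  # "bx.59": the label itself

# Recover the column number from the label, not from the level code.
feature_id <- as.integer(sub("^bx\\.", "", as.character(v)))
feature_id       # 59
```

Subtracting 1 from the level code would give the same answer here, but parsing
the label does not depend on how the levels happen to be ordered.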
Can someone tell me how to correctly get the decision tree info from the
output of rpart?
Thanks and regards!