Tim
2009-May-08 09:30 UTC
[R] Get (feature, threshold) from Output of rpart() for Stump Tree
Hi,

I have a question about how to get some partial information out of the object returned by rpart (the object that is passed as the first argument to predict). For example, in my code I try to learn a stump tree (decision tree of depth 2):

    fit <- rpart(y~bx, weights = w/mean(w), control = cntrl)
    print(fit)
    btest[1,] <- predict(fit, newdata = data.frame(bx))

I found that "fit" is of mode "list" and length 12. If I "print(fit)", I get as output:

    n= 124

    node), split, n, deviance, yval
          * denotes terminal node

    1) root 124 61.54839 0.7096774
      2) bx.21< 13.5 41 40.39024 0.1219512 *
      3) bx.21>=13.5 83 0.00000 1.0000000 *

I don't want the whole output of "print(fit)" but only two pieces of information in it: the "21" in "bx.21", which I believe to be the feature ID of the stump tree, and 13.5, which I believe to be the threshold on that feature. If I am able to get these two out, I will be able to process them further or write them to a file.

Any hint? Thanks and regards!

-Tim
Terry Therneau
2009-May-08 12:05 UTC
[R] Get (feature, threshold) from Output of rpart() for Stump Tree
--- begin included message ---
Hi, I have a question regarding how to get some partial information from the output of rpart, which could be used as the first argument to predict. For example, in my code, I try to learn a stump tree (decision tree of depth 2):

    fit <- rpart(y~bx, weights = w/mean(w), control = cntrl)
--- end inclusion ---

1. For stump trees, you can use the maxdepth option in rpart.control to get a small tree. You also might want to set maxsurrogate=0 for speed.

2. Try help(rpart.object) for more information on what is contained in the returned rpart object. In your case the first row of fit$splits would contain all that you need: its row name is the split variable and its "index" column is the split point.

Terry T.
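The suggestion above can be sketched roughly as follows. This is only an illustration under stated assumptions: the toy data (y, bx, d) is made up to stand in for the poster's variables, and the names feature, feature_id and threshold are hypothetical. It assumes the column names follow the "bx.<number>" pattern seen in the printed tree.

```r
library(rpart)

# Hypothetical toy data standing in for the poster's y and bx
set.seed(1)
bx <- matrix(runif(200 * 3), ncol = 3)
y  <- as.numeric(bx[, 2] > 0.5)          # response depends only on column 2
d  <- data.frame(y = y, bx = bx)         # columns: y, bx.1, bx.2, bx.3

# Stump: at most one split; surrogates switched off as suggested above
fit <- rpart(y ~ ., data = d, method = "anova",
             control = rpart.control(maxdepth = 1, maxsurrogate = 0))

# The first row of fit$splits is the primary split at the root.
# The variable name is the row name; the cut point is the "index" column.
feature   <- rownames(fit$splits)[1]
threshold <- unname(fit$splits[1, "index"])

# Assumes names of the form "bx.<number>"; strip the prefix to get the ID
feature_id <- as.integer(sub("^bx\\.", "", feature))

# Cross-check against the frame component, which also records the root variable
root_var <- as.character(fit$frame$var[1])

cat(feature_id, threshold, "\n")
```

Note that fit$splits[1, ] on its own returns only the numeric columns and drops the row name, so the variable name has to be read from rownames(fit$splits) or from fit$frame$var.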