Hi,

when I have made a decision tree with rpart, is it possible to "apply" this tree to a new set of data in order to find out the distribution of observations? Ideally I would like to plot my original tree, with the counts (at each node) of the new data.

Regards,
Jay
?predict.rpart

Weidong Gu

On Mon, Aug 29, 2011 at 12:49 PM, Jay <josip.2000 at gmail.com> wrote:
> Hi,
>
> when I have made a decision tree with rpart, is it possible to "apply"
> this tree to a new set of data in order to find out the distribution
> of observations? Ideally I would like to plot my original tree, with
> the counts (at each node) of the new data.
>
> Regards,
> Jay
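To make the pointer to predict.rpart a little more concrete, here is a minimal sketch of how it might be used to get the distribution of new observations over the terminal nodes. The iris split and the names new_data, node_fit, new_node and new_counts are only illustrative. Note that predict.rpart has no option that returns node membership directly; overwriting frame$yval with the node numbers is a workaround, not a documented feature.

```r
library(rpart)

# Illustrative split of a built-in data set; any train/new pair with the
# same predictor columns would do.
set.seed(1)
idx      <- sample(nrow(iris), 100)
train    <- iris[idx, ]
new_data <- iris[-idx, ]

fit <- rpart(Species ~ ., data = train, method = "class")

# Ordinary use: predicted class for each new observation.
pred_class <- predict(fit, newdata = new_data, type = "class")

# Workaround to recover node membership: copy the tree and overwrite each
# node's fitted value with its node number (the row names of fit$frame),
# so that type = "vector" returns node numbers instead of predictions.
node_fit <- fit
node_fit$frame$yval <- as.numeric(rownames(node_fit$frame))
new_node <- predict(node_fit, newdata = new_data, type = "vector")

# Distribution of the new observations over the terminal nodes.
leaves     <- rownames(fit$frame)[fit$frame$var == "<leaf>"]
new_counts <- table(factor(new_node, levels = leaves))
print(new_counts)
```

The table is indexed by the node numbers that print(fit) shows, so the counts can be read against the original tree by hand.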
Jay <josip.2000 at gmail.com> wrote:
> When I have made a decision tree with rpart, is it possible to "apply"
> this tree to a new set of data in order to find out the distribution
> of observations? Ideally I would like to plot my original tree, with
> the counts (at each node) of the new data.

Sadly, neither plot.rpart nor rpart.plot supports plotting a tree trained on one set of data while showing results predicted for a new set of data. Page 21 of the vignette for the rpart.plot package has this to say:

"Arguably the most serious limitation of the current implementation is its inability to display results on test data (on the tree derived from the training data)."

One way of implementing this (quite a lot of work) would be to extend the rpart function to include a newdata argument. If given such an argument, rpart would additionally return new.frame, new.where, and new.y fields (corresponding to the existing frame, where, and y fields). The plotting functions could then trivially be extended to use these new fields.
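Short of changing rpart itself, the per-node counts that such a new.frame would carry can be approximated from outside the package. The following is only a sketch under assumptions: new_n and path_to_root are hypothetical names, not part of rpart, and the terminal node of each new observation is obtained with a frame$yval workaround because predict.rpart does not return node membership directly. It relies on rpart's node numbering, in which the children of node k are 2k and 2k + 1.

```r
library(rpart)

set.seed(1)
idx      <- sample(nrow(iris), 100)
fit      <- rpart(Species ~ ., data = iris[idx, ], method = "class")
new_data <- iris[-idx, ]

# Terminal node number for each new observation: overwrite the fitted
# values with node numbers so that type = "vector" returns them.
node_fit <- fit
node_fit$frame$yval <- as.numeric(rownames(node_fit$frame))
new_leaf <- predict(node_fit, newdata = new_data, type = "vector")

# The path from a leaf back to the root (node 1) is found by repeated
# integer division by 2.
path_to_root <- function(node) {
  path <- node
  while (node > 1) {
    node <- node %/% 2
    path <- c(path, node)
  }
  path
}

# new_n is a hypothetical stand-in for the n column of the proposed
# new.frame: the number of new observations passing through each node,
# in the same row order as fit$frame.
visited <- unlist(lapply(new_leaf, path_to_root))
new_n   <- as.vector(table(factor(visited, levels = rownames(fit$frame))))
names(new_n) <- rownames(fit$frame)

# Training counts and new-data counts side by side, one row per node.
print(cbind(train_n = fit$frame$n, new_n = new_n))
```

This gives the numbers for every node, internal and terminal, but drawing them on the tree would still require a plotting function that accepts custom node labels.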