Ah yes, I will have to use the predict function. But the predict function
will not really get me there. To take the example that I have a model
predicting whether or not I will play golf (this is the dependent
variable), there are three independent variables: Humidity (High, Medium,
Low), Pending_Chores (Taxes, None, Laundry, Car Maintenance) and Wind
(High, Low). I would like rules such that any record matching them gets a
statement like (IF humidity = High AND pending_chores = None AND wind =
High THEN there is a 77% probability that play_golf is YES). I was
thinking that random forest would somehow weight the rules across the
collection of trees and give a probability. But if that doesn't make
sense, then can you just tell me how to get the decision rules with one
tree and I will work from that.

Mike

On Wed, Apr 13, 2016 at 4:30 PM, Bert Gunter <bgunter.4567 at gmail.com> wrote:
> I think you are missing the point of random forests. But if you just
> want to predict using the forest, there is a predict() method that you
> can use. Other than that, I certainly don't understand what you mean.
> Maybe someone else might.
>
> Cheers,
> Bert
>
> "The trouble with having an open mind is that people keep coming along
> and sticking things into it."
> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip)
>
> On Wed, Apr 13, 2016 at 2:11 PM, Michael Artz <michaeleartz at gmail.com> wrote:
> > Ok, is there a way to do it with a decision tree? I just need to make
> > the decision rules. Perhaps I can pick one of the trees used with
> > random forest. I am already somewhat familiar with random forest with
> > respect to bagging and feature sampling and getting the mode from the
> > leaf nodes and it being an ensemble technique of many trees. I am just
> > working from the perspective that I need decision rules, and I am
> > working backward from that, and I need to do it in R.
> >
> > On Wed, Apr 13, 2016 at 4:08 PM, Bert Gunter <bgunter.4567 at gmail.com> wrote:
> > >
> > > Nope.
> > >
> > > Random forests are not decision trees -- they are ensembles (forests)
> > > of trees. You need to go back and read up on them so you understand
> > > how they work. The Hastie/Tibshirani/Friedman "The Elements of
> > > Statistical Learning" has a nice explanation, but I'm sure there are
> > > lots of good web resources, too.
> > >
> > > Cheers,
> > > Bert
> > >
> > > On Wed, Apr 13, 2016 at 1:40 PM, Michael Artz <michaeleartz at gmail.com> wrote:
> > > > Hi, I'm trying to get the top decision rules from a decision tree.
> > > > Eventually I would like to do this with R and random forest. There
> > > > has to be a way to output the decision rules of each leaf node in
> > > > an easily readable way. I am looking at the randomForest and rpart
> > > > packages and I don't see anything yet.
> > > > Mike
> > > >
> > > > ______________________________________________
> > > > R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > > > https://stat.ethz.ch/mailman/listinfo/r-help
> > > > PLEASE do read the posting guide
> > > > http://www.R-project.org/posting-guide.html
> > > > and provide commented, minimal, self-contained, reproducible code.
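Both halves of the exchange above can be sketched in a few lines: the predict() method Bert mentions, and the pick-one-tree idea from Michael's reply. This is a minimal illustration, not from the thread itself; it assumes the randomForest package is installed and uses the built-in iris data as a stand-in, since there is no actual golf dataset here.

```r
library(randomForest)

## Stand-in data; the golf example in the thread is hypothetical.
data(iris)
set.seed(1)
rf <- randomForest(Species ~ ., data = iris, ntree = 100)

## type = "prob" returns, per record, the fraction of trees voting for
## each class -- the forest's aggregated probability estimate.
head(predict(rf, newdata = iris, type = "prob"))

## getTree() pulls a single tree out of the ensemble, as Michael proposes;
## each row is a node with its split variable and split point.
getTree(rf, k = 1, labelVar = TRUE)[1:5, ]
```

Note that any single tree in a forest is grown on a bootstrap sample with a restricted choice of split variables, so its rules are noisier than those of a standalone tree fit to the full data.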
It sounds like you want classification or regression trees. rpart does
exactly what you describe.

Here's an overview:
http://www.statmethods.net/advstats/cart.html

But there are a lot of other ways to do the same thing in R, for instance:
http://www.r-bloggers.com/a-brief-tour-of-the-trees-and-forests/

You can get the same kind of information from random forests, but it's
less straightforward. If you want a clear set of rules as in your golf
example, then you need rpart or similar.

Sarah
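Sarah's rpart suggestion can be made concrete. The data below are invented to mirror the golf example (the variable names and probabilities are hypothetical, not from any real dataset); path.rpart() then prints the split conditions leading to each leaf, and predict(..., type = "prob") gives the leaf probabilities.

```r
library(rpart)

## Hypothetical data in the spirit of the golf example from the thread
set.seed(42)
n <- 200
golf <- data.frame(
  humidity       = factor(sample(c("High", "Medium", "Low"), n, replace = TRUE)),
  pending_chores = factor(sample(c("Taxes", "None", "Laundry", "Car Maintenance"),
                                 n, replace = TRUE)),
  wind           = factor(sample(c("High", "Low"), n, replace = TRUE))
)
## Build in a signal so the tree has something to find
golf$play_golf <- factor(ifelse(
  golf$humidity == "High" & golf$wind == "High",
  sample(c("Yes", "No"), n, replace = TRUE, prob = c(0.8, 0.2)),
  sample(c("Yes", "No"), n, replace = TRUE, prob = c(0.4, 0.6))
))

fit <- rpart(play_golf ~ ., data = golf, method = "class",
             control = rpart.control(minsplit = 10, cp = 0.001))

## Leaf nodes are the rows of fit$frame where var == "<leaf>"
leaves <- as.numeric(rownames(fit$frame)[fit$frame$var == "<leaf>"])

## path.rpart() prints the chain of split conditions leading to each leaf
path.rpart(fit, nodes = leaves)

## The fitted class probabilities are constant within a leaf -- these are
## the "77% probability that play_golf is YES" numbers
head(predict(fit, type = "prob"))
```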
That's great that you are familiar, and thanks for responding. Have you
ever done what I am referring to? I have already spent time going
through links and tutorials about decision trees and random forests and
have even used them both before.

Mike
On Wednesday, April 13, 2016, Michael Artz <michaeleartz at gmail.com> wrote:
> That's great that you are familiar, and thanks for responding. Have
> you ever done what I am referring to? I have already spent time going
> through links and tutorials about decision trees and random forests
> and have even used them both before.

Then what specifically is your problem? Both of the tutorials I provided
show worked examples, as does even the help for rpart. If none of those,
or your extensive reading, work for your project you will have to be a
lot more specific about why not.

Sarah

--
Sarah Goslee
http://www.stringpage.com
http://www.sarahgoslee.com
http://www.functionaldiversity.org
On Thu, 14 Apr 2016, Michael Artz wrote:

> Ah yes, I will have to use the predict function. But the predict
> function will not really get me there. To take the example that I have
> a model predicting whether or not I will play golf (this is the
> dependent variable), there are three independent variables:
> Humidity (High, Medium, Low), Pending_Chores (Taxes, None, Laundry,
> Car Maintenance) and Wind (High, Low). I would like rules such that
> any record matching them gets a statement like (IF humidity = High AND
> pending_chores = None AND wind = High THEN there is a 77% probability
> that play_golf is YES).

Although I think that this toy example is not overly useful for
practical illustrations, we have included the standard dataset in the
"partykit" package:

## data
data("WeatherPlay", package = "partykit")

> I was thinking that random forest would somehow weight the rules
> across the collection of trees and give a probability. But if that
> doesn't make sense, then can you just tell me how to get the decision
> rules with one tree and I will work from that.

Then you can learn one tree on this data, e.g., with rpart() or ctree():

## trees
library("rpart")
rp <- rpart(play ~ ., data = WeatherPlay,
  control = rpart.control(minsplit = 5))
library("partykit")
ct <- ctree(play ~ ., data = WeatherPlay,
  minsplit = 5, mincriterion = 0.1)

## visualize via partykit
pr <- as.party(rp)
plot(pr)
plot(ct)

And the partykit package also includes a function to generate a text
representation of the rules, although this is currently not exported:

partykit:::.list.rules.party(pr)
##                             "outlook %in% c(\"overcast\")"
## 4
##  "outlook %in% c(\"sunny\", \"rainy\") & humidity < 82.5"
## 5
## "outlook %in% c(\"sunny\", \"rainy\") & humidity >= 82.5"

partykit:::.list.rules.party(ct)
##                2                3
## "humidity <= 80"  "humidity > 80"

If you do not want a text representation but something else you can
compute on, then look at the source code of
partykit:::.list.rules.party() and try to adapt it to your needs.
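One way to adapt the helper above into the "IF ... THEN p%" table asked about earlier in the thread is to pair each rule string with its leaf's fitted class probability. This is a sketch, assuming partykit is installed; as noted, partykit:::.list.rules.party() is unexported and could change between versions.

```r
library(rpart)
library(partykit)

data("WeatherPlay", package = "partykit")
rp <- rpart(play ~ ., data = WeatherPlay,
  control = rpart.control(minsplit = 5))
pr <- as.party(rp)

## One rule string per leaf, named by the leaf's node id (unexported helper)
rules <- partykit:::.list.rules.party(pr)

## Per-record class probabilities and the leaf each record falls into
probs <- predict(pr, newdata = WeatherPlay, type = "prob")
node  <- predict(pr, newdata = WeatherPlay, type = "node")

## Probabilities are constant within a leaf, so averaging per node id
## recovers the leaf probability for the first response level
lev <- levels(WeatherPlay$play)[1]
p   <- tapply(probs[, lev], node, mean)

## IF <rule> THEN P(play = lev) = prob
data.frame(rule = rules[names(p)], prob = round(p, 2), row.names = NULL)
```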