Muhammad Subianto
2005-Mar-18 17:51 UTC
[R] How to show which variables include in plot of classification tree
Dear all,

For my research, I am learning classification. I have been trying some examples with classification tree packages such as tree and rpart. For instance, the Pima.te dataset has 8 variables (including the class variable, type):

library(rpart)
library(MASS)  # Pima.te is provided by MASS, not datasets
pima.rpart <- rpart(type ~ npreg+glu+bp+skin+bmi+ped+age, data=Pima.te,
                    method='class')
plot(pima.rpart, uniform=TRUE)
text(pima.rpart)
summary(pima.rpart)

In the result, only 5 variables (npreg, glu, bmi, ped, and age) appear in the plot.

Now I have 50 variables in my own dataset, and from the resulting classification tree it is very difficult to tell which variables appear in the plot. Is there any trick to find out which variables are shown in the plot?

Thanks for your help.
Muhammad Subianto
Uwe Ligges
2005-Mar-18 18:45 UTC
[R] How to show which variables include in plot of classification tree
Muhammad Subianto wrote:
> Dear all,
> For my research, I am learning classification.
> I have been trying some examples with classification tree packages such as
> tree and rpart. For instance,
> the Pima.te dataset has 8 variables (including the class variable, type):
>
> library(rpart)
> library(MASS)  # Pima.te is provided by MASS, not datasets
> pima.rpart <- rpart(type ~ npreg+glu+bp+skin+bmi+ped+age, data=Pima.te,
>                     method='class')
> plot(pima.rpart, uniform=TRUE)
> text(pima.rpart)
> summary(pima.rpart)
>
> In the result, only 5 variables (npreg, glu, bmi, ped, and age)
> appear in the plot.
> Now I have 50 variables in my own dataset, and from the resulting
> classification tree it is very difficult to tell which
> variables appear in the plot. Is there any trick to find out which
> variables are shown in the plot?

1. Please read a good book on classification. Also, you might want to take a look at Breiman et al. (1984), cited in ?rpart.

2. rpart does variable selection when growing the tree, so you should not expect to find all 50 variables in the plot. See, e.g., ?rpart.control

3. You have specified the formula "type ~ npreg + glu + bp + skin + bmi + ped + age", so in particular you cannot expect to get more variables than "npreg + glu + bp + skin + bmi + ped + age".

Uwe Ligges

> Thanks for your help.
> Muhammad Subianto
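
As an illustration of point 2, one possible way to list the variables that actually appear in a fitted tree is to read the split variables from the `frame` component of the rpart object. This is only a sketch, not part of the original reply: the helper name `used_vars` is made up for this example, the data come from `MASS::Pima.te`, and the `cp` and `minsplit` values in the second fit are arbitrary choices meant to show that the control settings can change which variables are selected.

```r
library(rpart)
library(MASS)   # provides the Pima.te data frame

# Hypothetical helper: names of the variables used for splits in an rpart fit.
# fit$frame$var holds the split variable of every node; leaves are "<leaf>".
used_vars <- function(fit) {
  v <- as.character(fit$frame$var)
  unique(v[v != "<leaf>"])
}

pima.rpart <- rpart(type ~ npreg + glu + bp + skin + bmi + ped + age,
                    data = Pima.te, method = "class")

# Predictors offered in the formula versus those chosen by the tree
offered <- attr(pima.rpart$terms, "term.labels")
used    <- used_vars(pima.rpart)
unused  <- setdiff(offered, used)

print(used)     # variables shown in the plot
print(unused)   # variables that were offered but never used for a split

# printcp() also reports "Variables actually used in tree construction"
printcp(pima.rpart)

# A larger tree (arbitrary, looser stopping rules) may select other variables
pima.deep <- rpart(type ~ npreg + glu + bp + skin + bmi + ped + age,
                   data = Pima.te, method = "class",
                   control = rpart.control(cp = 0.001, minsplit = 5))
print(used_vars(pima.deep))
```

With 50 predictors the same helper applies unchanged, since it only inspects the fitted object and does not depend on the number of terms in the formula.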