Liaw, Andy
2005-Jan-27 01:42 UTC
[R] how to evaluate the significance of attributes in tree growing
FWIW, I wrote a little function a while ago to extract variable importance as defined in the CART book. It's rather limited: it only works for regression problems, and you need to set maxsurrogate=0 and maxcompete=0. It may (or may not) help you:

varimp.rpart <- function(x) {
    ## Deviance of every internal (non-leaf) node
    dev <- x$frame[, c("var", "dev")]
    dev <- dev[dev$var != "<leaf>", ]
    ## One row per split when maxsurrogate = maxcompete = 0
    improve <- x$splits[, "improve"]
    ## Sum deviance * improvement per variable; [-1] drops the "<leaf>" level
    imp <- tapply(dev[, 2] * improve, dev$var, sum)[-1]
    ## Variables never used in a split come back as NA; report them as 0
    if (any(is.na(imp))) imp[is.na(imp)] <- 0
    imp
}

Here's an example using the Boston housing data:

> library(rpart)
> data(Boston, package="MASS")
> boston.rp <- rpart(medv ~ ., Boston,
+                    control=rpart.control(maxsurrogate=0, maxcompete=0))
> varimp.rpart(boston.rp)
     crim        zn     indus      chas       nox        rm       age       dis
 1136.809     0.000     0.000     0.000     0.000 23825.922     0.000  1544.804
      rad       tax   ptratio     black     lstat
    0.000     0.000     0.000     0.000  7988.955

Both gbm and randomForest have analogous measures.

Andy

> From: WeiWei Shi
>
> Hi, there:
>
> I am wondering if there is a package in R (doing decision trees) which
> can provide some methods to evaluate the significance of attributes. I
> remembered randomForest gives some output like that. Unfortunately my
> current computing env. cannot handle my datasets if I use
> randomForest. So, I am thinking if other packages can do this job or
> not.
>
> Thanks,
>
> Ed
>
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide!
> http://www.R-project.org/posting-guide.html
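
For readers on a more recent rpart: fitted objects now carry a variable.importance component (see ?rpart.object), which did not exist when the function above was written. The sketch below is only an illustration under a few assumptions: it uses mtcars as a small stand-in dataset rather than the Boston data, and it assumes an rpart version new enough to provide variable.importance. With maxsurrogate=0 and maxcompete=0 on a regression tree, the built-in values should match the "deviance times improvement" sum computed by hand.

```r
library(rpart)

# mtcars ships with base R; it is only a small stand-in dataset here.
fit <- rpart(mpg ~ ., data = mtcars,
             control = rpart.control(maxsurrogate = 0, maxcompete = 0))

# Built-in importance (assumes a recent rpart; present only if the tree has splits).
builtin <- fit$variable.importance

# Manual version: node deviance times split improvement, summed per variable.
internal <- fit$frame$var != "<leaf>"
manual <- tapply(fit$frame$dev[internal] * fit$splits[, "improve"],
                 as.character(fit$frame$var[internal]), sum)

print(builtin)
print(manual[names(builtin)])
```

Unlike varimp.rpart, the built-in vector lists only variables that actually contribute, sorted in decreasing order, and when surrogates are enabled it also credits surrogate splits, so the two will generally differ under the default rpart.control settings.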