Johann Hibschman
2011-Sep-14 16:44 UTC
[R] substitute games with randomForest::partialPlot
I'm having trouble calling randomForest::partialPlot programmatically.
It tries to use the name of the (R) variable as the data column name.

Example:

  library(randomForest)
  iris.rf <- randomForest(Species ~ ., data=iris, importance=TRUE, proximity=TRUE)

  partialPlot(iris.rf, iris, Sepal.Width)    # works
  partialPlot(iris.rf, iris, "Sepal.Width")  # works
  (function(var.name) partialPlot(iris.rf, iris, var.name))("Sepal.Width")  # fails

I can get around this with the following hack:

  (function(var.name)
    eval(substitute(partialPlot(iris.rf, iris, var.name),
                    list(var.name=var.name)))
  )("Sepal.Width")  # works

But this seems like a horrible hack. Is there a "nicer" way to do this?

The relevant code is:

  function (x, pred.data, x.var, which.class, w, plot = TRUE, add = FALSE,
      n.pt = min(length(unique(pred.data[, xname])), 51), rug = TRUE,
      xlab = deparse(substitute(x.var)), ylab = "",
      main = paste("Partial Dependence on", deparse(substitute(x.var))),
      ...)
  {
      ...
      x.var <- substitute(x.var)
      xname <- if (is.character(x.var)) x.var else {
          if (is.name(x.var)) deparse(x.var) else {
              eval(x.var)
          }
      }
      xv <- pred.data[, xname]
      ...
  }

Thanks,
Johann
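The behaviour described above can be reproduced without randomForest. The sketch below uses a toy function, resolve_xname (a hypothetical name, not part of any package), that copies only the argument handling quoted from partialPlot. It shows why a plain wrapper sees the symbol var.name, and why do.call, which evaluates its argument list before building the call, may be a tidier alternative to eval(substitute(...)). Whether the same do.call form behaves identically with the real partialPlot is an assumption here and has not been run.

```r
# Toy stand-in for the x.var handling quoted from partialPlot.
# The name resolve_xname is hypothetical; it is not a randomForest function.
resolve_xname <- function(x.var) {
  x.var <- substitute(x.var)          # capture the unevaluated argument
  if (is.character(x.var)) {
    x.var                             # a literal string was passed
  } else if (is.name(x.var)) {
    deparse(x.var)                    # a bare symbol was passed
  } else {
    eval(x.var)                       # anything else is evaluated
  }
}

resolve_xname(Sepal.Width)            # "Sepal.Width"
resolve_xname("Sepal.Width")          # "Sepal.Width"

# A plain wrapper forwards the symbol `var.name`, not its value.
naive <- function(var.name) resolve_xname(var.name)
naive("Sepal.Width")                  # "var.name"

# do.call evaluates list(var.name) first, so the call it builds
# contains the string itself and substitute() sees a character constant.
via_do_call <- function(var.name) do.call(resolve_xname, list(var.name))
via_do_call("Sepal.Width")            # "Sepal.Width"

# Assumed (untested) equivalent for the real function:
#   do.call(partialPlot, list(iris.rf, iris, var.name))
```

With the real partialPlot, the naive wrapper goes on to index pred.data[, "var.name"], which is where the failure in the example comes from.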
Stephen Milborrow
2011-Sep-16 18:43 UTC
[R] substitute games with randomForest::partialPlot
> Johann Hibschman wrote:
> I'm having trouble calling randomForest::partialPlot programmatically.
> It tries to use name of the (R) variable as the data column name.

You may want to consider looking at plotmo (in the plotmo package), which
doesn't have the above issue.

  library(randomForest)
  library(plotmo)
  plotmo.var <- function(var.name)
  {
      plotmo(model, degree1=var.name, degree2=0)
  }
  model <- randomForest(Volume ~ ., data = trees)
  plotmo(model)
  plotmo.var("Girth")

Plotmo can be considered to be a "poor man's partial dependence plot": it
simply holds the not-being-plotted variables at their median values, rather
than averaging over them as in a partial dependence plot. This may seem a bit
simplistic, but an advantage is that plotmo plots are easier to understand
than partial dependence plots (I believe). Or at least it is easier to
understand what a plotmo plot is NOT telling you. For additive models, the
shape of the plotmo curve is identical to the partial dependence curve.

Currently the quickest introduction to plotmo is pages 17-19 of
http://www.milbo.org/rpart-plot/prp.pdf

Regards,
Steve
www.milbo.users.sonic.net
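The difference between the two kinds of curve can be illustrated in base R. The sketch below is a simplified illustration of the idea only, not plotmo's or randomForest's implementation. The helper names partial_dependence and median_slice are hypothetical, and an additive lm fit on the trees data is used so that the claim about additive models can be checked directly.

```r
# Partial dependence: for each grid value, set `var` to that value in every
# row of the data and average the predictions.
partial_dependence <- function(model, data, var, grid) {
  sapply(grid, function(v) {
    d <- data
    d[[var]] <- v
    mean(predict(model, newdata = d))
  })
}

# Median slice (the idea behind a plotmo curve): hold every other
# variable at its median and vary only `var`. Assumes numeric columns.
median_slice <- function(model, data, var, grid) {
  others <- setdiff(names(data), var)
  base <- as.data.frame(lapply(data[others], median))
  d <- base[rep(1, length(grid)), , drop = FALSE]
  d[[var]] <- grid
  as.numeric(predict(model, newdata = d))
}

# An additive model: no interaction between Girth and Height.
fit  <- lm(Volume ~ Girth + Height, data = trees)
grid <- seq(min(trees$Girth), max(trees$Girth), length.out = 20)

pd <- partial_dependence(fit, trees, "Girth", grid)
ms <- median_slice(fit, trees, "Girth", grid)

# For an additive model the two curves differ only by a constant shift,
# so the spread of their difference is zero up to rounding error.
diff(range(pd - ms))
```

For a model with interactions (for example Volume ~ Girth * Height) the difference is in general no longer constant, which is the case where the two plots can tell different stories.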