Tim Howard
2010-Nov-16 12:36 UTC
[R] Force evaluation of variable when calling partialPlot
Greg,

Two thoughts:

1. It might be possible that 'vars' is a reserved word of sorts, and if you change the name of your vector RF might be happier.

2. A way that works for me is to call importance as follows:

    sel.imp <- importance(sel.rf, class=NULL, scale=TRUE, type=NULL)

and then use the 'names' of the imp data frame to be absolutely clear to RF that you are talking about the same variables:

    for (i in 1:length(sel.imp)) {
      partialPlot(sel.rf, xdata, names(sel.imp[i]), which.class=1, xlab=vars[i], main="")
    }

Hope that helps.
Tim Howard

>>> Date: Mon, 15 Nov 2010 12:29:08 -0800 (PST)
>>> From: gdillon <gdillon@fs.fed.us>
>>> To: r-help@r-project.org
>>> Subject: Re: [R] Force evaluation of variable when calling partialPlot

RE: the following original question:

> I'm using the randomForest package and would like to generate partial
> dependence plots, one after another, for a variety of variables:
>
> m <- randomForest( s, ... )
> varnames <- c( "var1", "var2", "var3", "var4" ) # var1..4 are all in data frame s
> for( v in varnames ) {
>   partialPlot( x=m, pred.data=s, x.var=v )
> }
>
> ...but this doesn't work, with partialPlot complaining that it can't
> find the variable "v".

I'm having a very similar problem using the partialPlot function. A simplified version of my code looks like this:

    data.in <- paste(basedir, "/sw_climate_dataframe_", root.name, ".csv", sep="")
    data <- read.csv(data.in)
    vars <- c("sm1", "precip.spring", "tmax.fall", "precip.fall")  # selected variables in data frame
    xdata <- as.data.frame(data[, vars])
    ydata <- data[, 5]
    ntree <- 2000

    rf.pdplots <- function() {
      sel.rf <- randomForest(xdata, ydata, ntree=ntree, keep.forest=TRUE)
      par(family="sans", mfrow=c(2,2), mar=c(4,3,1,2), oma=c(0,3,0,0), mgp=c(2,1,0))
      for (i in 1:length(vars)) {
        print((vars)[i])
        partialPlot(sel.rf, xdata, vars[i], which.class=1, xlab=vars[i], main="")
        mtext("(Logit of probability of high severity)/2", side=2, line=1, outer=TRUE)
      }
    }

    rf.pdplots()

When I run this code, with partialPlot embedded in a function, I get the following error:

    Error in eval(expr, envir, enclos) : object "i" not found

If I just take the code inside the function and run it (not embedded in the function), it runs just fine. Is there some reason why partialPlot doesn't like to be called from inside a function?

Other things I've tried/noticed:

1. If I comment out the line with the partialPlot call (and the next line with mtext), the function runs as expected and prints the variable names one at a time.

2. If the variable i is defined as a number (e.g., 4) in the global environment, then the function will run, the names print out one at a time, and four plots are created. HOWEVER, the plots are all for the last (4th) variable, BUT the x labels actually are different on each plot (i.e., the xlab is actually looping through the four values in vars).

Can anyone help me make sense of this?

Thanks.
-Greg
Greg Dillon
2010-Nov-16 17:24 UTC
[R] Force evaluation of variable when calling partialPlot
Tim,

Thanks for the suggestions. I tried them both, and while neither solved my problem directly, they helped me get to a solution.

What I've realized is that partialPlot, for some reason, always looks to the global environment when evaluating the "x.var" argument. So any variables created within a function (such as "i" in my looping example, or even "sel.imp" when I tried your 2nd suggestion) are not seen. From what I can tell, this only seems to be true for the "x.var" argument. Other arguments seem to be able to take variables from within the function's environment.

My solution was to just modify my original function to kick a copy of "i" out to the global environment, as follows:

    rf.pdplots <- function() {
      sel.rf <- randomForest(xdata, ydata, ntree=ntree, keep.forest=TRUE)
      par(family="sans", mfrow=c(2,2), mar=c(4,3,1,2), oma=c(0,3,0,0), mgp=c(2,1,0))
      for (i in 1:length(vars)) {
        i <<- i  # copy the loop index to the global environment so partialPlot can find it
        print((vars)[i])
        partialPlot(sel.rf, xdata, vars[i], which.class=1, xlab=vars[i], main="")
        mtext("(Logit of probability of high severity)/2", side=2, line=1, outer=TRUE)
      }
    }

Thanks again for the help.
-Greg Dillon
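The symptoms described in this thread are consistent with a function that captures its argument unevaluated (via substitute()) and then evaluates the expression in its own frame rather than in the caller's. The sketch below is not the randomForest source: show_var, loop_direct, and loop_do_call are hypothetical stand-ins written in base R under that assumption. It also illustrates one possible alternative to the `i <<- i` workaround, which is to build the call with do.call() so that vars[i] is already evaluated when the function receives it.

```r
# Hypothetical stand-in for a function that captures its argument
# unevaluated and evaluates it in its own frame (an assumption about
# the mechanism, not the actual partialPlot code).
show_var <- function(x.var) {
  x.var <- substitute(x.var)
  if (is.character(x.var)) {
    x.var
  } else if (is.name(x.var)) {
    deparse(x.var)
  } else {
    # Evaluated here, so variables local to the caller are not visible.
    eval(x.var)
  }
}

vars <- c("sm1", "precip.spring", "tmax.fall", "precip.fall")

# Direct call from inside a function: the expression vars[i] reaches
# show_var unevaluated, and the loop index is local to loop_direct.
loop_direct <- function() {
  out <- character(0)
  for (i in seq_along(vars)) out <- c(out, show_var(vars[i]))
  out
}

# do.call() evaluates vars[i] in this function first, so show_var
# receives a plain character string instead of an expression.
loop_do_call <- function() {
  out <- character(0)
  for (i in seq_along(vars)) out <- c(out, do.call(show_var, list(vars[i])))
  out
}

# Capture the failure of the direct version as a message instead of stopping.
direct <- tryCatch(loop_direct(), error = function(e) conditionMessage(e))
fixed <- loop_do_call()

print(direct)
print(fixed)
```

Under the same assumption, the loop body in rf.pdplots might be written as do.call(partialPlot, list(sel.rf, xdata, vars[i], which.class=1, xlab=vars[i], main="")), which would avoid writing to the global environment. This has not been tested against the data in this thread.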