Tim Howard
2010-Nov-16 12:36 UTC
[R] Force evaluation of variable when calling partialPlot
Greg,
Two thoughts:
1. It might be possible that 'vars' is a reserved word of sorts and if
you change the name of your vector RF might be happier
2. A way that works for me is to call importance as follows:
sel.imp <- importance(sel.rf, class=NULL, scale=TRUE, type=NULL)
and then use the row names of the importance matrix to be absolutely clear to
RF that you are talking about the same variables:
for (i in seq_len(nrow(sel.imp))) {
  partialPlot(sel.rf, xdata, rownames(sel.imp)[i], which.class=1, xlab=vars[i], main="")
}
Hope that helps.
Tim Howard
>>>>>>>>>>>
Date: Mon, 15 Nov 2010 12:29:08 -0800 (PST)
From: gdillon <gdillon@fs.fed.us>
To: r-help@r-project.org
Subject: Re: [R] Force evaluation of variable when calling partialPlot
RE: the following original question:
> I'm using the randomForest package and would like to generate partial
> dependence plots, one after another, for a variety of variables:
>
> m <- randomForest( s, ... )
> varnames <- c( "var1", "var2", "var3", "var4" ) # var1..4 are all in data frame s
> for( v in varnames ) {
>   partialPlot( x=m, pred.data=s, x.var=v )
> }
>
> ...but this doesn't work, with partialPlot complaining that it can't
> find the variable "v".
I'm having a very similar problem using the partialPlot function. A
simplified version of my code looks like this:
data.in <- paste(basedir,"/sw_climate_dataframe_",root.name,".csv",sep="")
data <- read.csv(data.in)
vars <- c("sm1","precip.spring","tmax.fall","precip.fall") # selected variables in data frame
xdata <- as.data.frame(data[,vars])
ydata <- data[,5]
ntree <- 2000
rf.pdplots <- function() {
  sel.rf <- randomForest(xdata,ydata,ntree=ntree,keep.forest=TRUE)
  par(family="sans",mfrow=c(2,2),mar=c(4,3,1,2),oma=c(0,3,0,0),mgp=c(2,1,0))
  for (i in 1:length(vars)) {
    print((vars)[i])
    partialPlot(sel.rf,xdata,vars[i],which.class=1,xlab=vars[i],main="")
    mtext("(Logit of probability of high severity)/2", side=2, line=1, outer=T)
  }
}
rf.pdplots()
When I run this code, with partialPlot embedded in a function, I get the
following error:
Error in eval(expr, envir, enclos) : object "i" not found
If I just take the code inside the function and run it (not embedded in the
function), it runs just fine. Is there some reason why partialPlot doesn't
like to be called from inside a function?
Other things I've tried/noticed:
1. If I comment out the line with the partialPlot call (and the next line
with mtext), the function runs as expected and prints the variable names one
at a time.
2. If the variable i is defined as a number (e.g., 4) in the global
environment, then the function will run, the names print out one at a time,
and four plots are created. HOWEVER, the plots are all for the last (4th)
variable, BUT the x labels actually are different on each plot (i.e., the
xlab is actually looping through the four values in vars).
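One possible explanation for both observations, offered as a sketch rather than a reading of the randomForest source: if partialPlot captures x.var unevaluated with substitute() and only later evaluates it inside its own frame, then "vars" is found in the global environment but a loop index local to the calling function is not. The function capture_var below is hypothetical and only mimics that pattern in base R.

```r
# Hypothetical stand-in for a function that captures its argument
# unevaluated and then evaluates it in its own frame. Assumption: this
# mirrors how partialPlot treats x.var; it is not the package's code.
capture_var <- function(x.var) {
  expr <- substitute(x.var)
  if (is.character(expr)) return(expr)      # a literal string is used as-is
  if (is.name(expr)) return(deparse(expr))  # a bare name becomes its own text
  eval(expr)  # evaluated in capture_var's frame, not in the caller's frame
}

vars <- c("sm1", "precip.spring", "tmax.fall", "precip.fall")

from_function <- function() {
  for (i in seq_along(vars)) {
    print(capture_var(vars[i]))  # 'i' exists only inside from_function
  }
}

# The lookup of 'i' fails unless a global 'i' happens to exist.
result <- tryCatch(from_function(), error = function(e) conditionMessage(e))
print(result)
```

Under the same assumption, a global i set to 4 would make every such call resolve to vars[4], while xlab=vars[i], evaluated the ordinary way in the caller's frame, would still step through the four names. The is.name branch would likewise turn x.var=v into the literal name "v", which matches the complaint in the original question.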
Can anyone help me make sense of this? Thanks.
-Greg
Greg Dillon
2010-Nov-16 17:24 UTC
[R] Force evaluation of variable when calling partialPlot
Tim,
Thanks for the suggestions. I tried them both, and while neither solved my
problem directly, they helped me to get to a solution.
What I've realized is that partialPlot, for some reason, always looks to
the Global environment when evaluating the "x.var" argument. So any
variables created within a function (such as "i" in my looping example, or
even "sel.imp" when I tried your 2nd suggestion) are not seen. From what I
can tell, this only seems to be true for the "x.var" argument. Other
arguments seem to be able to take variables from within the function's
environment/namespace. My solution was to just modify my original function
to kick a copy of "i" out to Global, as follows:
rf.pdplots <- function() {
  sel.rf <- randomForest(xdata,ydata,ntree=ntree,keep.forest=TRUE)
  par(family="sans",mfrow=c(2,2),mar=c(4,3,1,2),oma=c(0,3,0,0),mgp=c(2,1,0))
  for (i in 1:length(vars)) {
    i <<- i  # copy the loop index to Global so partialPlot can find it
    print((vars)[i])
    partialPlot(sel.rf,xdata,vars[i],which.class=1,xlab=vars[i],main="")
    mtext("(Logit of probability of high severity)/2", side=2, line=1, outer=T)
  }
}
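An alternative that avoids writing to the global environment, sketched under the assumption that partialPlot accepts a plain character string for x.var: do.call() evaluates its argument list before making the call, so partialPlot receives the string itself rather than an expression containing "i". The helper name plot_partials and the simulated data are illustrative only and are not from the original code.

```r
library(randomForest)

set.seed(1)
# Simulated stand-in data; only the column names follow the thread.
xdata <- data.frame(sm1 = rnorm(100), precip.spring = rnorm(100),
                    tmax.fall = rnorm(100), precip.fall = rnorm(100))
ydata <- factor(ifelse(xdata$sm1 + rnorm(100) > 0, "high", "low"))
vars <- names(xdata)

# Hypothetical helper: one partial dependence plot per variable name.
plot_partials <- function(rf, x, var.names) {
  for (v in var.names) {
    # The list elements are evaluated here, in this function's frame,
    # so x.var arrives at partialPlot already as a character string.
    do.call("partialPlot",
            list(x = rf, pred.data = x, x.var = v,
                 which.class = "high", xlab = v, main = ""))
  }
}

sel.rf <- randomForest(xdata, ydata, ntree = 200, keep.forest = TRUE)
par(mfrow = c(2, 2))
plot_partials(sel.rf, xdata, vars)
```

Here which.class is given as a class label because the simulated response uses the labels "high" and "low"; with other data it would need to match one of the classes in the fitted forest.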
Thanks again for the help.
-Greg Dillon