Gavin Simpson
2009-Aug-08 18:31 UTC
[Rd] Problem using model.frame with argument subset in own function
Dear List,

I am writing a formula method for a function in a package I maintain. I want the method to return a data.frame that potentially only contains some of the variables in 'data', as specified by the formula.

The problem I am having is in writing the function and wrapping it around model.frame. Consider the following data frame:

    dat <- data.frame(A = runif(10), B = runif(10), C = runif(10))

And the wrapper function:

    foo <- function(formula, data = NULL, ..., subset = NULL,
                    na.action = na.pass) {
        mt <- terms(formula, data = data, simplify = TRUE)
        mf <- model.frame(formula(mt), data = data, subset = subset,
                          na.action = na.action)
        ## real function would do more stuff here and pass mf on to
        ## other functions
        mf
    }

This is how I envisage the function being called. The real world use would have a data.frame with tens or hundreds of components where only a few need to be excluded. Hence wanting formulas of the form below to work.

    foo(~ . - B, data = dat)

The aim is to return only columns A and C in an object returned by model.frame. However, when I run the above, I get the following error:

    > foo(~ A + B, data = dat)
    Error in xj[i] : invalid subscript type 'closure'

I've tracked this down to the line in model.frame.default

    subset <- eval(substitute(subset), data, env)

After evaluating this line, subset contains:

    Browse[1]> subset
    function (x, ...)
    UseMethod("subset")
    <environment: namespace:base>

Not NULL, and hence the error later on when calling the internal model.frame code.

So the question is, what am I doing wrong?

If I leave the subset argument out of the definition of foo and rely upon the default in model.frame.default, the function works as expected.

Perhaps the question should be, how do I modify foo() to allow it to have a formal subset argument, passed to model.frame?

Any other suggestions gratefully accepted.

Thanks in advance,

G
--
Dr. Gavin Simpson             [t] +44 (0)20 7679 0522
ECRC, UCL Geography,          [f] +44 (0)20 7679 0565
Pearson Building,             [e] gavin.simpsonATNOSPAMucl.ac.uk
Gower Street, London          [w] http://www.ucl.ac.uk/~ucfagls/
UK. WC1E 6BT.                 [w] http://www.freshwaters.org.uk
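The lookup described in the message above can be reproduced without model.frame. What follows is only a minimal sketch: `inner` and `wrapper` are hypothetical stand-ins (not functions from any package) for model.frame.default and foo, and `env` is assumed to be the global environment, as it would be for a formula typed at the prompt.

```r
dat <- data.frame(A = 1:3)

## Hypothetical stand-in for the relevant line of model.frame.default:
## the *expression* supplied for 'subset' is recovered with substitute()
## and evaluated in 'data', falling back to 'env'.
inner <- function(data, subset = NULL, env = globalenv()) {
    eval(substitute(subset), data, env)
}

## Hypothetical stand-in for foo(): it forwards its own 'subset' argument.
wrapper <- function(data, subset = NULL) {
    inner(data, subset = subset)
}

## Inside inner(), substitute(subset) is the bare symbol `subset`, not NULL.
## 'dat' has no column of that name, so the lookup continues from 'env'
## along the search path and finds the function base::subset.
res <- wrapper(dat)
is.function(res)
```

The value NULL held by the wrapper's `subset` argument is never consulted, because only the symbol is evaluated, and it is evaluated outside the wrapper's frame.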
Douglas Bates
2009-Aug-09 16:32 UTC
[Rd] Problem using model.frame with argument subset in own function
On Sat, Aug 8, 2009 at 1:31 PM, Gavin Simpson <gavin.simpson at ucl.ac.uk> wrote:
> Dear List,
>
> I am writing a formula method for a function in a package I maintain. I
> want the method to return a data.frame that potentially only contains
> some of the variables in 'data', as specified by the formula.

The usual way to call model.frame (the method that Thomas Lumley has called "the standard, non-standard evaluation") is to match the call to foo, replace the name of the function being called with as.name("model.frame"), and force an evaluation in the parent frame. It looks like

    mf <- match.call()
    if (missing(data)) data <- environment(formula)
    ## evaluate and install the model frame
    m <- match(c("formula", "data", "subset", "weights", "na.action",
                 "offset"), names(mf), 0)
    mf <- mf[c(1, m)]
    mf$drop.unused.levels <- TRUE
    mf[[1]] <- as.name("model.frame")
    fr <- eval(mf, parent.frame())

The point of all of this manipulation is to achieve the kind of result you need, where the subset argument is evaluated in the correct environment.
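Applied to the foo() from the original question, the match.call idiom might look roughly like the sketch below. This is an illustration under assumptions, not code posted in the thread: the simplified formula is written back into the call so that variables removed with `-` are dropped, and `na.pass` is filled in when the caller gives no na.action, to mimic foo's stated default. `set.seed` is added only to make the example reproducible.

```r
set.seed(1)
dat <- data.frame(A = runif(10), B = runif(10), C = runif(10))

foo <- function(formula, data = NULL, ..., subset = NULL,
                na.action = na.pass) {
    ## capture the call to foo() without evaluating its arguments
    mf <- match.call(expand.dots = FALSE)
    ## keep only the arguments that model.frame() understands
    m <- match(c("formula", "data", "subset", "na.action"), names(mf), 0L)
    mf <- mf[c(1L, m)]
    ## assumption: replace the formula by its simplified form, e.g. ~ A + C
    mf$formula <- stats::formula(terms(formula, data = data, simplify = TRUE))
    ## assumption: honour foo's own default when na.action was not supplied
    if (!"na.action" %in% names(mf)) mf$na.action <- quote(na.pass)
    ## turn the call into a call to model.frame() and evaluate it where
    ## foo() was called, so 'subset' is resolved against 'data' there
    mf[[1L]] <- as.name("model.frame")
    eval(mf, parent.frame())
}

res  <- foo(~ . - B, data = dat)
res2 <- foo(~ . - B, data = dat, subset = A > 0.5)
names(res)
```

Because `subset` is never evaluated inside foo(), its NULL default in the formals is harmless: when the caller omits it, the argument is simply absent from the constructed call.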
Greg B. Hill
2009-Sep-09 21:00 UTC
[Rd] Problem using model.frame with argument subset in own function
Gavin,

I ran into the same cryptic "invalid subscript type 'closure'" message in a slightly less complicated scenario, and wanted to post the cause in my case (the root cause is probably the same either way).

Similarly to your case, I was subsetting a data frame. I had a list of variable names corresponding to columns in the frame. Unfortunately the variable name I had assigned to this list, var, coincided with the name of a base package function in R for variance. When I attempted to subset df[, var], I got the 'closure' error message, but if I renamed the list of variable names so the collision didn't occur, e.g. df[, vars] instead of df[, var], it worked as expected.

Sincerely,
Greg B. Hill
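A small sketch of the collision described above, assuming no object named `var` has been assigned in the session, so that the name resolves to the variance function on the search path. The data frame and column names here are made up for illustration.

```r
df <- data.frame(x = 1:3, y = 4:6)
vars <- c("x", "y")

## 'var' is not a user variable here, so it resolves to the function
## stats::var; a function (a closure) is not a valid subscript.
msg <- tryCatch(df[, var], error = function(e) conditionMessage(e))

## With a name that does not collide, the subset works as intended.
ok <- df[, vars]
msg
```

Checking `is.function(var)` before subsetting is a quick way to confirm that a name refers to a function rather than to the intended vector of column names.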