Martin Maechler
2005-Nov-18 20:32 UTC
[Rd] challenge: using 'subset = <computed>' inside function ..
I've been asked by someone else, whom I originally taught ``just work with substitute() and then all will be fine'' ... But it looks to me as if I've been caught out here.

Is it possible to make this work the way we thought it should?

1) Inside a function, say tst(), with a 'formula' and a 'data' argument,
2) call another modeling function using 'subset = <EXPR>' with the *original* data,
3) where <EXPR> is really computed from 'formula' itself.

It would probably be pretty easy to use a modified 'data' (data frame) inside tst() instead of trying to use the original data; but let's assume for the moment that this is not at all wanted.

Here is example code {that fails}, showing several other possibilities that fail as well:

    tst <- function(formula, data, na.action = na.omit)
    {
        stopifnot(inherits(formula,"formula"), length(formula) == 3)
        ## I want to fit a model to those observations that have 'Y > 0'
        ## where 'Y' is the left-hand-side (LHS)
        ## The really natural problem is using 'subset'; since I want to
        ## keep 'data' intact
        ## It's really lm(), glm(), gam(), ... but the problem is with
        ## model.frame:
        cat("subsetting expression: ")
        print(substitute(Y > 0, list(Y = formula[[2]])))# is perfect
        YY <- formula[[2]]
        cat(" or "); print(bquote(.(YY) > 0))
        mf <- model.frame(formula, data=data,
                          subset = bquote(.(YY) > 0),
                          ##or subset = substitute(Y > 0, list(Y = formula[[2]])),
                          ##or subset = eval(substitute(Y > 0, list(Y = formula[[2]]))),
                          ##or subset = as.expression(bquote(.(formula[[2]]) > 0)),
                          ##or subset = bquote(.(formula[[2]]) > 0),
                          na.action = na.action)
        mf
    }

    ## never works
    tst(ncases ~ agegp + alcgp, data = esoph)
    traceback() #--> shows that inside model.frame.default
                #    eval(substitute(subset, ...)) is called as well

----

Happy quizzing..
Martin Maechler, ETH Zurich
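The traceback remark can be reproduced in isolation. What follows is only a minimal sketch, not the source of model.frame.default: toy_frame() and wrapper() are hypothetical names, and toy_frame() is assumed to handle 'subset' the way the traceback suggests, capturing it with substitute() and evaluating it in 'data' with the formula's environment as enclosure. Under that assumption the bquote() call is never evaluated in the wrapper's own frame, so the lookup of the local variable 'YY' fails.

```r
## Hypothetical stand-in for the 'subset' handling: the argument is
## captured unevaluated, then evaluated in 'data', with the formula's
## environment (not the caller's frame) as the enclosure.
toy_frame <- function(formula, data, subset) {
    env <- environment(formula)
    keep <- eval(substitute(subset), data, env)
    data[keep, , drop = FALSE]
}

wrapper <- function(formula, data) {
    YY <- formula[[2]]    # exists only in wrapper()'s frame
    ## The bquote() call is passed on as an unevaluated expression, so
    ## '.(YY)' is resolved later, where 'YY' is not visible.
    toy_frame(formula, data, subset = bquote(.(YY) > 0))
}

d <- data.frame(y = c(-1, 2, 3), x = 1:3)
## Catch the error and keep its message for inspection.
res <- tryCatch(wrapper(y ~ x, d), error = function(e) conditionMessage(e))
print(res)
```

The same reasoning would apply to the commented-out variants that still mention a local name or 'formula' at the point where the captured expression is finally evaluated.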
Bjørn-Helge Mevik
2005-Nov-19 10:03 UTC
[Rd] challenge: using 'subset = <computed>' inside function ..
Hmm.. Maybe I'm overlooking something, but why not use do.call()? For instance

    tst <- function(formula, data, na.action = na.omit)
    {
        stopifnot(inherits(formula,"formula"), length(formula) == 3)
        ## I want to fit a model to those observations that have 'Y > 0'
        ## where 'Y' is the left-hand-side (LHS)
        ## The really natural problem is using 'subset'; since I want to
        ## keep 'data' intact
        ## It's really lm(), glm(), gam(), ... but the problem is with
        ## model.frame:
        cat("subsetting expression: ")
        print(substitute(Y > 0, list(Y = formula[[2]])))# is perfect
        YY <- formula[[2]]
        cat(" or "); print(bquote(.(YY) > 0))
        mf <- do.call("model.frame",
                      list(formula = formula, data = data,
                           subset = bquote(.(YY) > 0),
                           na.action = na.action))
        mf
    }

It seems to work for me:

    > mydata <- data.frame(y = rep(c(-1, 1), each = 5), x = rnorm(10))
    > tst(y ~ x, data = mydata)
    subsetting expression: y > 0
     or y > 0
       y          x
    6  1  0.9568283
    7  1  0.1166081
    8  1 -0.9592458
    9  1 -0.0974119
    10 1  0.2217222

--
Bjørn-Helge Mevik
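The do.call() version works because the argument list is evaluated inside tst() before the call is made, so model.frame receives the finished expression 'y > 0' rather than the unevaluated bquote() call. A sketch of an equivalent formulation builds the whole call with bquote() and evaluates it inside the function. Here tst2 is a hypothetical name and the data are made up for illustration; this is one possible variant, not necessarily the recommended idiom.

```r
tst2 <- function(formula, data, na.action = na.omit) {
    stopifnot(inherits(formula, "formula"), length(formula) == 3)
    YY <- formula[[2]]
    ## Build the complete call first, splicing the computed subset
    ## expression in as a literal argument ...
    cl <- bquote(model.frame(formula, data = data,
                             subset = .(YY) > 0,
                             na.action = na.action))
    ## ... then evaluate it in this frame, where 'formula', 'data' and
    ## 'na.action' are visible.
    eval(cl)
}

mydata <- data.frame(y = rep(c(-1, 1), each = 5), x = 1:10)
mf <- tst2(y ~ x, data = mydata)
print(mf)
```

Unlike do.call(), this keeps 'data' as a symbol in the constructed call instead of inlining the whole data frame, which may matter when the call is stored in a fitted model object by lm() or glm().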