David Winsemius
2014-Sep-06 22:31 UTC
[R] Why does debugging print() change output of function?
The goal: to create a function modeled after `subset` (notorious for its non-standard evaluation) that will take a series of logical tests as unqiuoted expressions to be evaluated in the framework of a dataframe environment and return a dataframe of logicals: mtest.data.frame <- function (x, ..., drop=FALSE) { tests <- list(...); print(tests) r <- if (length(tests)==0) stop("no 'tests'") else { cbind.data.frame( lapply( tests, function(t){ e <- substitute(t) r <- eval(e, x, parent.frame() ) if ( !is.logical(r) ) { stop("'tests' must be logical") } r & !is.na(r) } ) ) } } #-------------- testdata <- structure(list(group1 = structure(1:7, .Label = c("Group A", "Group B", "Group C", "Group D", "Group E", "Group F", "Group G" ), class = "factor"), group2 = structure(c(3L, 3L, 2L, 1L, 1L, 2L, 3L), .Label = c("LS", "SS", "UNC"), class = "factor"), valid1 = structure(c(2L, 1L, NA, 1L, 2L, 2L, 1L), .Label = c("N", "Y"), class = "factor"), valid2 = structure(c(1L, 1L, 2L, 1L, 1L, 2L, 1L), .Label = c("N", "Y"), class = "factor"), valid3 = structure(c(4L, 3L, NA, 2L, 1L, NA, 5L), .Label = c("0.3", "0.7", "1.2", "1.4", "1.7" ), class = "factor"), valid4 = structure(c(2L, 1L, 3L, 4L, 1L, 1L, 5L), .Label = c("0.3", "0.4", "0.53", "0.66", "0.71" ), class = "factor"), valid5 = structure(c(4L, 1L, NA, NA, 3L, NA, 2L), .Label = c("11.2", "11.7", "8.3", "8.5"), class = "factor")), .Names = c("group1", "group2", "valid1", "valid2", "valid3", "valid4", "valid5"), row.names = c(NA, -7L), class = "data.frame") #######> mtest.data.frame(testdata, valid2=="N", valid3 > 1)[[1]] [1] "tests are" [[2]] [1] TRUE TRUE FALSE TRUE TRUE FALSE TRUE [[3]] [1] TRUE TRUE NA FALSE FALSE NA TRUE This actually seemed to be somewhat successful, but when ... Now if I take out the `print()` call for 'tests', I get an different answer:> mtest.data.frame <-+ function (x, ..., drop=FALSE) + { tests <- list(...) + r <- if (length(tests)==0) + stop("no 'tests'") + else { cbind.data.frame( + lapply( tests, function(t){ + e <- substitute(t) + r <- eval(e, x, parent.frame() ) + if ( !is.logical(r) ) { + stop("'tests' must be logical") } + r & !is.na(r) } ) ) + } + }> mtest.data.frame(testdata, valid2=="N", valid3 > 1) > # i.e. no answer-- David Winsemius Alameda, CA, USA
On Sat, 6 Sep 2014 15:31:49 -0700 David Winsemius <dwinsemius at comcast.net> wrote: The only other difference I see is a missing semicolon in the second example, which, in the first precedes your print() instruction. JWDougherty
William Dunlap
2014-Sep-06 23:24 UTC
[R] Why does debugging print() change output of function?
In your first example I get an error: > mtest.data.frame(testdata, valid2=="N", valid3 > 1) Error in mtest.data.frame(testdata, valid2 == "N", valid3 > 1) : object 'valid2' not found I expect the error because list(...) ought to evaluate the ... arguments. Use substitute() to get the unevaluated ... arguments up front and don't use substitute() in the loop over the elements of test. There are several ways to get the unevaluated ... arguments. E.g., f0 <- function(x, ..., drop=FALSE) match.call(expand.dots=FALSE)$... f1 <- function(x, ..., drop=FALSE) substitute(...()) f2 <- function(x, ..., drop=FALSE) as.list(substitute(list(...)))[-1] Your function could be the following, where I also fixed a problem with parent.frame() being called in the wrong scope and improved, IMO, the names on the output data.frame. m2 <- function (x, ..., drop = FALSE, verbose = FALSE) { tests <- substitute(...()) nms <- names(tests) # fix up names, since data.frame makes ugly ones if (is.null(nms)) { names(tests) <- paste0("T", seq_along(tests)) } else if (any(nms == "")) { names(tests)[nms == ""] <- paste0("T", which(nms == "")) } if (verbose) { print(tests) } r <- if (length(tests) == 0) { stop("no 'tests'") } else { enclos <- parent.frame() # evaluate parent.frame() outside of FUN() data.frame(lapply(tests, FUN=function(e) { r <- eval(e, x, enclos) if (!is.logical(r)) { stop("'tests' must be logical") } r & !is.na(r) })) } r } used as:> m2(testdata, group2=="UNC", Eleven.Two=valid5=="11.2")T1 Eleven.Two 1 TRUE FALSE 2 TRUE TRUE 3 FALSE FALSE 4 FALSE FALSE 5 FALSE FALSE 6 FALSE FALSE 7 TRUE FALSE Bill Dunlap TIBCO Software wdunlap tibco.com On Sat, Sep 6, 2014 at 3:31 PM, David Winsemius <dwinsemius at comcast.net> wrote:> The goal: > to create a function modeled after `subset` (notorious for its non-standard evaluation) that will take a series of logical tests as unqiuoted expressions to be evaluated in the framework of a dataframe environment and return a dataframe of logicals: > > > mtest.data.frame <- > function (x, ..., drop=FALSE) > { tests <- list(...); print(tests) > r <- if (length(tests)==0) > stop("no 'tests'") > else { cbind.data.frame( > lapply( tests, function(t){ > e <- substitute(t) > r <- eval(e, x, parent.frame() ) > if ( !is.logical(r) ) { > stop("'tests' must be logical") } > r & !is.na(r) } ) ) > } > } > #-------------- > > testdata <- structure(list(group1 = structure(1:7, .Label = c("Group A", > "Group B", "Group C", "Group D", "Group E", "Group F", "Group G" > ), class = "factor"), group2 = structure(c(3L, 3L, 2L, 1L, 1L, > 2L, 3L), .Label = c("LS", "SS", "UNC"), class = "factor"), valid1 = structure(c(2L, > 1L, NA, 1L, 2L, 2L, 1L), .Label = c("N", "Y"), class = "factor"), > valid2 = structure(c(1L, 1L, 2L, 1L, 1L, 2L, 1L), .Label = c("N", > "Y"), class = "factor"), valid3 = structure(c(4L, 3L, NA, > 2L, 1L, NA, 5L), .Label = c("0.3", "0.7", "1.2", "1.4", "1.7" > ), class = "factor"), valid4 = structure(c(2L, 1L, 3L, 4L, > 1L, 1L, 5L), .Label = c("0.3", "0.4", "0.53", "0.66", "0.71" > ), class = "factor"), valid5 = structure(c(4L, 1L, NA, NA, > 3L, NA, 2L), .Label = c("11.2", "11.7", "8.3", "8.5"), class = "factor")), .Names = c("group1", > "group2", "valid1", "valid2", "valid3", "valid4", "valid5"), row.names = c(NA, > -7L), class = "data.frame") > > ####### > > >> mtest.data.frame(testdata, valid2=="N", valid3 > 1) > [[1]] > [1] "tests are" > > [[2]] > [1] TRUE TRUE FALSE TRUE TRUE FALSE TRUE > > [[3]] > [1] TRUE TRUE NA FALSE FALSE NA TRUE > > This actually seemed to be somewhat successful, but when ... > > Now if I take out the `print()` call for 'tests', I get an different answer: > >> mtest.data.frame <- > + function (x, ..., drop=FALSE) > + { tests <- list(...) > + r <- if (length(tests)==0) > + stop("no 'tests'") > + else { cbind.data.frame( > + lapply( tests, function(t){ > + e <- substitute(t) > + r <- eval(e, x, parent.frame() ) > + if ( !is.logical(r) ) { > + stop("'tests' must be logical") } > + r & !is.na(r) } ) ) > + } > + } >> mtest.data.frame(testdata, valid2=="N", valid3 > 1) >> # i.e. no answer > > -- > > David Winsemius > Alameda, CA, USA > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code.
peter dalgaard
2014-Sep-09 23:07 UTC
[R] Why does debugging print() change output of function?
On 07 Sep 2014, at 00:31 , David Winsemius <dwinsemius at comcast.net> wrote:> The goal: > to create a function modeled after `subset` (notorious for its non-standard evaluation) that will take a series of logical tests as unqiuoted expressions to be evaluated in the framework of a dataframe environment and return a dataframe of logicals:... A belated peep from the author of subset(): Don't! I think we learned the hard way by now that it is much easier to pass unevaluated expressions in the shape of formula objects or maybe expression objects. Lots of pain can be avoided by slipping in a simple "~". -- Peter Dalgaard, Professor, Center for Statistics, Copenhagen Business School Solbjerg Plads 3, 2000 Frederiksberg, Denmark Phone: (+45)38153501 Email: pd.mes at cbs.dk Priv: PDalgd at gmail.com