I recently tried to write a new method for "[", to be applied to data frames, so that the object returned would retain (all) attributes of the columns, including attributes that my code had created. I thrashed around for quite a while, and then got some help from Rui Barradas who showed me how to do it, in the following manner: `[.myclass` <- function(x, i, j, drop = if (missing(i)) TRUE else length(cols) == 1)[{ SaveAt <- lapply(x, attributes) x <- NextMethod() lX <- lapply(names(x),function(nm, x, Sat){ attributes(x[[nm]]) <- Sat[[nm]] x[[nm]]}, x = x, Sat = SaveAt) names(lX) <- names(x) x <- as.data.frame(lX) x } If I set class(X) <- c("myclass",class(X)) and apply "[" to X (e.g. something like X[1:42,]) the attributes are retained as desired. OK. All good. Now we finally come to my question! I want to put this new method into a package that I am building. When I build the package and run R CMD check I get a complaint: ... no visible binding for global variable ?cols? And indeed, there is no such variable. At first I thought that maybe the code should be `[.myclass` <- function(x, i, j, drop = if (missing(i)) TRUE else length(j) == 1)[{ But I looked at "[.data.frame" and it has "cols" too; not "j". So why doesn't "[.data.frame" throw a warning when R gets built? Can someone please explain to me what's going on here? cheers, Rolf P. S. I amended the code for my method, replacing "cols" by "j", and it *seems* to run, and deliver the desired results. (And the package checks, without complaint.) I am nervous, however, that there may be some Trap for Young Players that I don't perceive, lurking about and waiting to cause problems for me. R. -- Honorary Research Fellow Department of Statistics University of Auckland Phone: +64-9-373-7599 ext. 88276
On Sun, 3 Nov 2019 at 22:12, Rolf Turner <r.turner at auckland.ac.nz> wrote:> > > I recently tried to write a new method for "[", to be applied to data > frames, so that the object returned would retain (all) attributes of the > columns, including attributes that my code had created. > > I thrashed around for quite a while, and then got some help from Rui > Barradas who showed me how to do it, in the following manner: > > `[.myclass` <- function(x, i, j, drop = if (missing(i)) TRUE else > length(cols) == 1)[{ > SaveAt <- lapply(x, attributes) > x <- NextMethod() > lX <- lapply(names(x),function(nm, x, Sat){ > attributes(x[[nm]]) <- Sat[[nm]] > x[[nm]]}, x = x, Sat = SaveAt) > names(lX) <- names(x) > x <- as.data.frame(lX) > x > } > > If I set class(X) <- c("myclass",class(X)) and apply "[" to X (e.g. > something like X[1:42,]) the attributes are retained as desired. > > OK. All good. Now we finally come to my question! I want to put this > new method into a package that I am building. When I build the package > and run R CMD check I get a complaint: > > ... no visible binding for global variable ?cols? > > And indeed, there is no such variable. At first I thought that maybe > the code should be > > `[.myclass` <- function(x, i, j, drop = if (missing(i)) TRUE else > length(j) == 1)[{ > > But I looked at "[.data.frame" and it has "cols" too; not "j". > > So why doesn't "[.data.frame" throw a warning when R gets built? > > Can someone please explain to me what's going on here?The thing is... test <- function(x = y * 2) { y <- 1 x } test() # 2 Lazy evaluation magic. I?aki
On 03/11/2019 4:11 p.m., Rolf Turner wrote:> > I recently tried to write a new method for "[", to be applied to data > frames, so that the object returned would retain (all) attributes of the > columns, including attributes that my code had created. > > I thrashed around for quite a while, and then got some help from Rui > Barradas who showed me how to do it, in the following manner: > > `[.myclass` <- function(x, i, j, drop = if (missing(i)) TRUE else > length(cols) == 1)[{ > SaveAt <- lapply(x, attributes) > x <- NextMethod() > lX <- lapply(names(x),function(nm, x, Sat){ > attributes(x[[nm]]) <- Sat[[nm]] > x[[nm]]}, x = x, Sat = SaveAt) > names(lX) <- names(x) > x <- as.data.frame(lX) > x > } > > If I set class(X) <- c("myclass",class(X)) and apply "[" to X (e.g. > something like X[1:42,]) the attributes are retained as desired. > > OK. All good. Now we finally come to my question! I want to put this > new method into a package that I am building. When I build the package > and run R CMD check I get a complaint: > > ... no visible binding for global variable ?cols? > > And indeed, there is no such variable. At first I thought that maybe > the code should be > > `[.myclass` <- function(x, i, j, drop = if (missing(i)) TRUE else > length(j) == 1)[{ > > But I looked at "[.data.frame" and it has "cols" too; not "j". > > So why doesn't "[.data.frame" throw a warning when R gets built? > > Can someone please explain to me what's going on here?Defaults for parameters are evaluated in the evaluation frame of the function, at the time the parameter is first used. If you look at the source for "[.data.frame", you should see that "cols" is defined there as a local variable. The "drop" argument shouldn't be used until it is. (There's a call to "missing(drop)" early in the source that doesn't count: it doesn't evaluate "drop", it just checks whether it is specified by the caller.) Duncan Murdoch
On 4/11/19 10:31 AM, Duncan Murdoch wrote:> On 03/11/2019 4:11 p.m., Rolf Turner wrote: >> >> I recently tried to write a new method for "[", to be applied to data >> frames, so that the object returned would retain (all) attributes of the >> columns, including attributes that my code had created. >> >> I thrashed around for quite a while, and then got some help from Rui >> Barradas who showed me how to do it, in the following manner: >> >> `[.myclass` <- function(x, i, j, drop = if (missing(i)) TRUE else >> length(cols) == 1)[{ >> ???? SaveAt <- lapply(x, attributes) >> ???? x <- NextMethod() >> ???? lX <- lapply(names(x),function(nm, x, Sat){ >> ?????? attributes(x[[nm]]) <- Sat[[nm]] >> ?????? x[[nm]]}, x = x, Sat = SaveAt) >> ???? names(lX) <- names(x) >> ???? x <- as.data.frame(lX) >> ???? x >> } >> >> If I set class(X) <- c("myclass",class(X)) and apply "[" to X (e.g. >> something like X[1:42,]) the attributes are retained as desired. >> >> OK.? All good.? Now we finally come to my question!? I want to put this >> new method into a package that I am building.? When I build the package >> and run R CMD check I get a complaint: >> >> ... no visible binding for global variable ?cols? >> >> And indeed, there is no such variable.? At first I thought that maybe >> the code should be >> >> `[.myclass` <- function(x, i, j, drop = if (missing(i)) TRUE else >> ??????????????????????????????????????? length(j) == 1)[{ >> >> But I looked at "[.data.frame" and it has "cols" too; not "j". >> >> So why doesn't "[.data.frame" throw a warning when R gets built? >> >> Can someone please explain to me what's going on here? > > Defaults for parameters are evaluated in the evaluation frame of the > function, at the time the parameter is first used. > > If you look at the source for "[.data.frame", you should see that "cols" > is defined there as a local variable.? The "drop" argument shouldn't be > used until it is.? (There's a call to "missing(drop)" early in the > source that doesn't count:? it doesn't evaluate "drop", it just checks > whether it is specified by the caller.)OK. As I understand what you're saying, the reason there isn't a "no visible binding" problem in [.data.frame is that "cols" *is* defined in the body of the function. Whereas, in my method, "cols" does not get defined anywhere in the function, and thus triggers the warning. I guess that a workaround would be to do a dummy assignment, like unto cols <- 42 at the start of the code for my method. (a) Are there perils involved with this strategy? (b) Is there anything wrong with my current strategy of replacing drop = if (missing(i)) TRUE else length(cols) == 1) by drop = if (missing(i)) TRUE else length(j) == 1) ??? As I said, this *seems* to work OK, by I cannot work through what the implications might be. Can anyone reassure me? cheers, Rolf -- Honorary Research Fellow Department of Statistics University of Auckland Phone: +64-9-373-7599 ext. 88276
For what it's worth, I don't think this strategy can work in general, because a class might have attributes that depend on its data/contents (e.g. https://vctrs.r-lib.org/articles/s3-vector.html#cached-sum). I don't think these are particularly common in practice, but it's dangerous to assume that you can restore a class simply by restoring its attributes after subsetting. Hadley On Sun, Nov 3, 2019 at 3:11 PM Rolf Turner <r.turner at auckland.ac.nz> wrote:> > > I recently tried to write a new method for "[", to be applied to data > frames, so that the object returned would retain (all) attributes of the > columns, including attributes that my code had created. > > I thrashed around for quite a while, and then got some help from Rui > Barradas who showed me how to do it, in the following manner: > > `[.myclass` <- function(x, i, j, drop = if (missing(i)) TRUE else > length(cols) == 1)[{ > SaveAt <- lapply(x, attributes) > x <- NextMethod() > lX <- lapply(names(x),function(nm, x, Sat){ > attributes(x[[nm]]) <- Sat[[nm]] > x[[nm]]}, x = x, Sat = SaveAt) > names(lX) <- names(x) > x <- as.data.frame(lX) > x > } > > If I set class(X) <- c("myclass",class(X)) and apply "[" to X (e.g. > something like X[1:42,]) the attributes are retained as desired. > > OK. All good. Now we finally come to my question! I want to put this > new method into a package that I am building. When I build the package > and run R CMD check I get a complaint: > > ... no visible binding for global variable ?cols? > > And indeed, there is no such variable. At first I thought that maybe > the code should be > > `[.myclass` <- function(x, i, j, drop = if (missing(i)) TRUE else > length(j) == 1)[{ > > But I looked at "[.data.frame" and it has "cols" too; not "j". > > So why doesn't "[.data.frame" throw a warning when R gets built? > > Can someone please explain to me what's going on here? > > cheers, > > Rolf > > P. S. I amended the code for my method, replacing "cols" by "j", and it > *seems* to run, and deliver the desired results. (And the package > checks, without complaint.) I am nervous, however, that there may be > some Trap for Young Players that I don't perceive, lurking about and > waiting to cause problems for me. > > R. > > -- > Honorary Research Fellow > Department of Statistics > University of Auckland > Phone: +64-9-373-7599 ext. 88276 > > ______________________________________________ > R-devel at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-devel-- http://hadley.nz
On 5/11/19 3:41 AM, Hadley Wickham wrote:> For what it's worth, I don't think this strategy can work in general, > because a class might have attributes that depend on its data/contents > (e.g. https://vctrs.r-lib.org/articles/s3-vector.html#cached-sum). I > don't think these are particularly common in practice, but it's > dangerous to assume that you can restore a class simply by restoring > its attributes after subsetting.You're probably right that there are lurking perils in general, but I am not trying to "restore a class". I simply want to *retain* attributes of columns in a data frame. * I have a data frame X * I attach attributes to certain of its columns; attr(X$melvin,"clyde") <- 42 (I *don't* change the class of X$melvin.) * I form a subset of X: Y <- X[1:100,3:10] * given that "melvin" is amongst columns 3 through 10 of X, I want Y$melvin to retain the attribute "clyde", i.e. I want attr(Y$melvin,"clyde") to return 42 There is almost surely a better approach than the one that I've chosen (isn't there always?) but it seems to work, and the perils certainly are not immediately apparent to me. cheers, Rolf -- Honorary Research Fellow Department of Statistics University of Auckland Phone: +64-9-373-7599 ext. 88276