Kurt Hornik
2022-Sep-17 09:13 UTC
[Rd] [External] Time to drop globalenv() from searches in package code?
>>>>> luke-tierney writes:> On Thu, 15 Sep 2022, Duncan Murdoch wrote: >> The author of this Stackoverflow question >> https://stackoverflow.com/q/73722496/2554330 got confused because a typo in >> his code didn't trigger an error in normal circumstances, but it did when he >> ran his code in pkgdown. >> >> The typo was to use "x" in a test, when the local variable was named ".x". >> There was no "x" defined locally or in the package or its imports, so the >> search got all the way to the global environment and found one. (The very >> confusing part for this user was that it found the right variable.) >> >> This author had suppressed the "R CMD check" check for use of global >> variables. Obviously he shouldn't have done that, but he's working with >> tidyverse NSE, and that causes so many false positives that it is somewhat >> understandable he would suppress one too many. >> >> The pkgdown simulation of code in examples doesn't do perfect mimicry of >> running it at top level; the fake global environment never makes it onto the >> search list. Some might call this a bug, but I'd call it the right search >> strategy. >> >> My suggestion is that the search for variables in package code should never >> get to globalenv(). The chain of environments should stop after handling the >> imports. (Probably base package functions should also be implicitly >> imported, but nothing else.) >>> This was considered and discussed when I added namespaces. Basically > it would mean making the parent of the base namespace environment be > the empty environment instead of the global environment. As a design > this is cleaner, and it would be a one-line change in eval.c. But > there were technical reasons this was not a viable option at the time, > also a few political reasons. The technical reasons mostly had to do > with S3 dispatch.> Changes over the years, mostly from work Kurt has done, to S3 dispatch > for methods defined and registered in packages might make this more > viable in principle, but there would still be a lot of existing code > that would stop working. For example, 'make check' with the one-line > change fails in a base example that defines an S3 method. It might be > possible to fiddle with the dispatch to keep most of that code > working, but I suspect that would be a lot of work. Seeing what it > would take to get 'make check' to succeed would be a first step if > anyone wants to take a crack at it.Luke, Can you please share the one-line change so that I can take a closer look? Best -k>> I suspect this change would reveal errors in lots of packages, but the number >> of legitimate uses of the current search strategy has got to be pretty small >> nowadays, since we've been getting warnings for years about implicit imports >> from other standard packages.> Your definition of 'legitimate' is probably quite similar to mine, but > there is likely to be a small but vocal minority with very different > views :-).> Best,> luke>> Duncan Murdoch >> >> ______________________________________________ >> R-devel at r-project.org mailing list >> https://stat.ethz.ch/mailman/listinfo/r-devel >>> -- > Luke Tierney > Ralph E. Wareham Professor of Mathematical Sciences > University of Iowa Phone: 319-335-3386 > Department of Statistics and Fax: 319-335-3017 > Actuarial Science > 241 Schaeffer Hall email: luke-tierney at uiowa.edu > Iowa City, IA 52242 WWW: http://www.stat.uiowa.edu> ______________________________________________ > R-devel at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-devel
iuke-tier@ey m@iii@g oii uiow@@edu
2022-Sep-17 20:24 UTC
[Rd] [External] Time to drop globalenv() from searches in package code?
On Sat, 17 Sep 2022, Kurt Hornik wrote:>>>>>> luke-tierney writes: > >> On Thu, 15 Sep 2022, Duncan Murdoch wrote: >>> The author of this Stackoverflow question >>> https://stackoverflow.com/q/73722496/2554330 got confused because a typo in >>> his code didn't trigger an error in normal circumstances, but it did when he >>> ran his code in pkgdown. >>> >>> The typo was to use "x" in a test, when the local variable was named ".x". >>> There was no "x" defined locally or in the package or its imports, so the >>> search got all the way to the global environment and found one. (The very >>> confusing part for this user was that it found the right variable.) >>> >>> This author had suppressed the "R CMD check" check for use of global >>> variables. Obviously he shouldn't have done that, but he's working with >>> tidyverse NSE, and that causes so many false positives that it is somewhat >>> understandable he would suppress one too many. >>> >>> The pkgdown simulation of code in examples doesn't do perfect mimicry of >>> running it at top level; the fake global environment never makes it onto the >>> search list. Some might call this a bug, but I'd call it the right search >>> strategy. >>> >>> My suggestion is that the search for variables in package code should never >>> get to globalenv(). The chain of environments should stop after handling the >>> imports. (Probably base package functions should also be implicitly >>> imported, but nothing else.) >>> > >> This was considered and discussed when I added namespaces. Basically >> it would mean making the parent of the base namespace environment be >> the empty environment instead of the global environment. As a design >> this is cleaner, and it would be a one-line change in eval.c. But >> there were technical reasons this was not a viable option at the time, >> also a few political reasons. The technical reasons mostly had to do >> with S3 dispatch. > >> Changes over the years, mostly from work Kurt has done, to S3 dispatch >> for methods defined and registered in packages might make this more >> viable in principle, but there would still be a lot of existing code >> that would stop working. For example, 'make check' with the one-line >> change fails in a base example that defines an S3 method. It might be >> possible to fiddle with the dispatch to keep most of that code >> working, but I suspect that would be a lot of work. Seeing what it >> would take to get 'make check' to succeed would be a first step if >> anyone wants to take a crack at it. > > Luke, > > Can you please share the one-line change so that I can take a closer > look?Index: src/main/envir.c ==================================================================--- src/main/envir.c (revision 82861) +++ src/main/envir.c (working copy) @@ -683,7 +683,7 @@ R_GlobalCachePreserve = CONS(R_GlobalCache, R_NilValue); R_PreserveObject(R_GlobalCachePreserve); #endif - R_BaseNamespace = NewEnvironment(R_NilValue, R_NilValue, R_GlobalEnv); + R_BaseNamespace = NewEnvironment(R_NilValue, R_NilValue, R_EmptyEnv); R_PreserveObject(R_BaseNamespace); SET_SYMVALUE(install(".BaseNamespaceEnv"), R_BaseNamespace); R_BaseNamespaceName = ScalarString(mkChar("base")); ----- For S3 the dispatch will have to be changed to explicitly search .GlobalEnv and parents after the namespace if we don't want to break too much. Another idiom that will be broken is if (require("foo")) bar(...) with bar exported from foo. I don't know if that is already warned about. Moving away from this is arguably good in principle but also probably fairly disruptive. We might need to add some cleaner use-if-available mechanism, or maybe just adjust some checking code. Best, luke> > Best > -k > >>> I suspect this change would reveal errors in lots of packages, but the number >>> of legitimate uses of the current search strategy has got to be pretty small >>> nowadays, since we've been getting warnings for years about implicit imports >>> from other standard packages. > >> Your definition of 'legitimate' is probably quite similar to mine, but >> there is likely to be a small but vocal minority with very different >> views :-). > >> Best, > >> luke > >>> Duncan Murdoch >>> >>> ______________________________________________ >>> R-devel at r-project.org mailing list >>> https://stat.ethz.ch/mailman/listinfo/r-devel >>> > >> -- >> Luke Tierney >> Ralph E. Wareham Professor of Mathematical Sciences >> University of Iowa Phone: 319-335-3386 >> Department of Statistics and Fax: 319-335-3017 >> Actuarial Science >> 241 Schaeffer Hall email: luke-tierney at uiowa.edu >> Iowa City, IA 52242 WWW: http://www.stat.uiowa.edu > >> ______________________________________________ >> R-devel at r-project.org mailing list >> https://stat.ethz.ch/mailman/listinfo/r-devel >-- Luke Tierney Ralph E. Wareham Professor of Mathematical Sciences University of Iowa Phone: 319-335-3386 Department of Statistics and Fax: 319-335-3017 Actuarial Science 241 Schaeffer Hall email: luke-tierney at uiowa.edu Iowa City, IA 52242 WWW: http://www.stat.uiowa.edu
Henrik Bengtsson
2022-Oct-05 19:55 UTC
[Rd] [External] Time to drop globalenv() from searches in package code?
Excluding the global environment, and all its parent environments from the search path that a package sees would be great news, because it would makes things more robust and probably detect a few more bugs out there. In addition to the use case that Duncan mentions, it would also remove the ambiguity that comes from searching attached packages for global/free variables. Speaking of the latter, isn't it time to escalate the following to a WARNING? * checking R code for possible problems ... NOTE my_fcn: no visible global function definition for ?var? Undefined global functions or variables: var Consider adding importFrom("stats", "var") to your NAMESPACE file. There are several package on Bioconductor with such mistakes, and I can imagine there are still some old CRAN package too that haven't gone from "recent" CRAN incoming checks. The problem here is that the intended `var` can be overridden in the global environment, in a third-party attached package, or in any other environment on the search() path. Regarding:> if (require("foo")) > bar(...) > > with bar exported from foo. I don't know if that is already warned > about. Moving away from this is arguably good in principle but also > probably fairly disruptive.This should be coved by 'R CMD check', also without --as-cran; if we use: hello <- function() { msg <- "Hello world!" if (require("tools")) msg <- toTitleCase(msg) message(msg) } we get: * checking dependencies in R code ... NOTE 'library' or 'require' call to ?tools? in package code. Please use :: or requireNamespace() instead. See section 'Suggested packages' in the 'Writing R Extensions' manual.? ... * checking R code for possible problems ... NOTE hello: no visible global function definition for ?toTitleCase? Undefined global functions or variables: toTitleCase This should be enough for a package developer to figure out that the function should be implemented as: hello <- function() { msg <- "Hello world!" if (requireNamespace("tools")) msg <- tools::toTitleCase(msg) message(msg) } which passes 'R CMD check' with all OK. BTW, is there a reason why one cannot import from the 'base' package? If we add, say, importFrom(base, mean) to a package's NAMESPACE file, the package builds fine, but fails to install; $ R --vanilla CMD INSTALL teeny_0.1.0.tar.gz * installing to library ?/home/hb/R/x86_64-pc-linux-gnu-library/4.2-CBI-gcc9? * installing *source* package ?teeny? ... ** using staged installation ** R ** byte-compile and prepare package for lazy loading Error in asNamespace(ns, base.OK = FALSE) : operation not allowed on base namespace Calls: <Anonymous> ... namespaceImportFrom -> importIntoEnv -> getNamespaceInfo -> asNamespace Execution halted ERROR: lazy loading failed for package ?teeny? Is this by design (e.g. more flexibility in maintaining the base-R packages), or due to a technical limitation (e.g. the 'base' package is not fully loaded at this point)? /Henrik On Sat, Sep 17, 2022 at 1:24 PM <luke-tierney at uiowa.edu> wrote:> > On Sat, 17 Sep 2022, Kurt Hornik wrote: > > >>>>>> luke-tierney writes: > > > >> On Thu, 15 Sep 2022, Duncan Murdoch wrote: > >>> The author of this Stackoverflow question > >>> https://stackoverflow.com/q/73722496/2554330 got confused because a typo in > >>> his code didn't trigger an error in normal circumstances, but it did when he > >>> ran his code in pkgdown. > >>> > >>> The typo was to use "x" in a test, when the local variable was named ".x". > >>> There was no "x" defined locally or in the package or its imports, so the > >>> search got all the way to the global environment and found one. (The very > >>> confusing part for this user was that it found the right variable.) > >>> > >>> This author had suppressed the "R CMD check" check for use of global > >>> variables. Obviously he shouldn't have done that, but he's working with > >>> tidyverse NSE, and that causes so many false positives that it is somewhat > >>> understandable he would suppress one too many. > >>> > >>> The pkgdown simulation of code in examples doesn't do perfect mimicry of > >>> running it at top level; the fake global environment never makes it onto the > >>> search list. Some might call this a bug, but I'd call it the right search > >>> strategy. > >>> > >>> My suggestion is that the search for variables in package code should never > >>> get to globalenv(). The chain of environments should stop after handling the > >>> imports. (Probably base package functions should also be implicitly > >>> imported, but nothing else.) > >>> > > > >> This was considered and discussed when I added namespaces. Basically > >> it would mean making the parent of the base namespace environment be > >> the empty environment instead of the global environment. As a design > >> this is cleaner, and it would be a one-line change in eval.c. But > >> there were technical reasons this was not a viable option at the time, > >> also a few political reasons. The technical reasons mostly had to do > >> with S3 dispatch. > > > >> Changes over the years, mostly from work Kurt has done, to S3 dispatch > >> for methods defined and registered in packages might make this more > >> viable in principle, but there would still be a lot of existing code > >> that would stop working. For example, 'make check' with the one-line > >> change fails in a base example that defines an S3 method. It might be > >> possible to fiddle with the dispatch to keep most of that code > >> working, but I suspect that would be a lot of work. Seeing what it > >> would take to get 'make check' to succeed would be a first step if > >> anyone wants to take a crack at it. > > > > Luke, > > > > Can you please share the one-line change so that I can take a closer > > look? > > Index: src/main/envir.c > ==================================================================> --- src/main/envir.c (revision 82861) > +++ src/main/envir.c (working copy) > @@ -683,7 +683,7 @@ > R_GlobalCachePreserve = CONS(R_GlobalCache, R_NilValue); > R_PreserveObject(R_GlobalCachePreserve); > #endif > - R_BaseNamespace = NewEnvironment(R_NilValue, R_NilValue, R_GlobalEnv); > + R_BaseNamespace = NewEnvironment(R_NilValue, R_NilValue, R_EmptyEnv); > R_PreserveObject(R_BaseNamespace); > SET_SYMVALUE(install(".BaseNamespaceEnv"), R_BaseNamespace); > R_BaseNamespaceName = ScalarString(mkChar("base")); > > ----- > > For S3 the dispatch will have to be changed to explicitly search > .GlobalEnv and parents after the namespace if we don't want to break > too much. > > Another idiom that will be broken is > > if (require("foo")) > bar(...) > > with bar exported from foo. I don't know if that is already warned > about. Moving away from this is arguably good in principle but also > probably fairly disruptive. We might need to add some cleaner > use-if-available mechanism, or maybe just adjust some checking code. > > Best, > > luke > > > > > Best > > -k > > > >>> I suspect this change would reveal errors in lots of packages, but the number > >>> of legitimate uses of the current search strategy has got to be pretty small > >>> nowadays, since we've been getting warnings for years about implicit imports > >>> from other standard packages. > > > >> Your definition of 'legitimate' is probably quite similar to mine, but > >> there is likely to be a small but vocal minority with very different > >> views :-). > > > >> Best, > > > >> luke > > > >>> Duncan Murdoch > >>> > >>> ______________________________________________ > >>> R-devel at r-project.org mailing list > >>> https://stat.ethz.ch/mailman/listinfo/r-devel > >>> > > > >> -- > >> Luke Tierney > >> Ralph E. Wareham Professor of Mathematical Sciences > >> University of Iowa Phone: 319-335-3386 > >> Department of Statistics and Fax: 319-335-3017 > >> Actuarial Science > >> 241 Schaeffer Hall email: luke-tierney at uiowa.edu > >> Iowa City, IA 52242 WWW: http://www.stat.uiowa.edu > > > >> ______________________________________________ > >> R-devel at r-project.org mailing list > >> https://stat.ethz.ch/mailman/listinfo/r-devel > > > > -- > Luke Tierney > Ralph E. Wareham Professor of Mathematical Sciences > University of Iowa Phone: 319-335-3386 > Department of Statistics and Fax: 319-335-3017 > Actuarial Science > 241 Schaeffer Hall email: luke-tierney at uiowa.edu > Iowa City, IA 52242 WWW: http://www.stat.uiowa.edu > > ______________________________________________ > R-devel at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-devel