Ben Bolker
2025-Apr-14 21:49 UTC
[Rd] Request for comment: namespace resolution in terms(<formula>, specials=) [<pkg>::<name>, etc.]
I don't have any concerns about these changes, and don't see any need to preserve the old behaviour.

In lme4 and glmmTMB (and now broken out into a separate `reformulas` package), I do this the hard way, walking down the parse trees of formula objects and looking for specials, and not using the functionality here.

Mikael showed how I could use the *new* functionality instead:

https://github.com/bbolker/reformulas/issues/4

but honestly if I were going to change things in `reformulas` it would be in the direction of streamlining and refactoring, not changing the basic approach.

cheers
Ben Bolker

On 2025-04-14 5:43 p.m., Mikael Jagan wrote:
> [CC: maintainers of R packages survival, mgcv, lme4, RItools]
>
> Dear R-devel subscribers,
>
> If you have never used stats:::terms.formula or its 'specials' argument,
> then feel free to stop reading or otherwise review help("terms.formula")
> and help("terms.object").
>
> Folks may have noticed a recent change in R-devel:
>
>     $ svn log -v -r 88066
>     ------------------------------------------------------------------------
>     r88066 | maechler | 2025-03-28 17:04:27 -0400 (Fri, 28 Mar 2025) | 1 line
>     Changed paths:
>        M /trunk/doc/NEWS.Rd
>        M /trunk/src/library/stats/src/model.c
>        M /trunk/tests/reg-tests-1e.R
>
>     terms(<formula>, specials = "<non-syntactic>") now works
>     ------------------------------------------------------------------------
>
> intended to resolve Bug 18568
>
>     https://bugs.r-project.org/show_bug.cgi?id=18568
>
> which pointed out the following undesirable behaviour in R-release:
>
>     > attr(terms(~x1 +  s (x2, f) +  s (x3, g), specials = "s"), "specials")
>     $s
>     [1] 2 3
>
>     > attr(terms(~x1 + `|`(x2, f) + `|`(x3, g), specials = "|"), "specials")
>     $`|`
>     NULL
>
> namely that non-syntactic names like "|" were not supported. Unfortunately,
> the patch (r88066) broke one package on CRAN, RItools, which relied on the
> following
>
>     > attr(terms(~x1 +  mgcv::s (x2, f), specials = "mgcv::s"), "specials")
>     $`mgcv::s`
>     [1] 2
>
>     > attr(terms(~x1 + `mgcv::s`(x2, f), specials = "mgcv::s"), "specials")
>     $`mgcv::s`
>     NULL
>
> whereas in R-devel we see
>
>     > attr(terms(~x1 +  mgcv::s (x2, f), specials = "mgcv::s"), "specials")
>     $`mgcv::s`
>     NULL
>
>     > attr(terms(~x1 + `mgcv::s`(x2, f), specials = "mgcv::s"), "specials")
>     $`mgcv::s`
>     [1] 2
>
> A strict interpretation of 'specials' as a list of *name*s of functions would
> suggest that the old behaviour was "wrong" (and accidental, predating package
> namespaces altogether) and that the new behaviour is "right". After all,
> `mgcv::s` (with backticks) is a name (of type "symbol", class "name") whereas
> mgcv::s (without backticks) is a call (of type "language", class "call").
>
> Martin and I are requesting comments from the community, especially R-core
> members and package authors who use 'specials', on the following:
>
>     1. Should the previous (long standing but undocumented, likely rarely used)
>        behaviour be preserved going forward?
>     2. If we pursue a more *robust* implementation of namespace resolution by
>        stats:::terms.formula, not relying on details of how non-syntactic names
>        are deparsed, then what should that look like?
>
> (I say "likely rarely used" because stats:::terms.formula is called primarily by
>  package *authors* to parse formulas of package *users*. Only a subset of those
>  packages will set 'specials', only a subset of *those* packages will set
>  specials="<pkg>::<name>", and only one such package is known to be broken due
>  to r88066.)
>
> Relevant to (2) is an earlier thread
>
>     https://stat.ethz.ch/pipermail/r-devel/2025-March/083906.html
>
> in which I proposed that we make use of an optional 'package' attribute of
> 'specials', so that
>
>     specials = structure(c("s", "s"), package = c("", "mgcv"))
>
> would match calls s(...) and mgcv::s(...) separately. This attribute would be
> preserved by the 'specials' component of the 'terms' object, e.g.,
>
>     > attr(terms(~x1 + s(x2, f) + mgcv::s(x3, g),
>     +            specials = structure(c("s", "s"), package = c("", "mgcv"))),
>     +      "specials")
>     $s
>     [1] 2
>
>     $s
>     [1] 3
>
>     attr(,"package")
>     [1] ""     "mgcv"
>
> A patch against R-devel (at r88141) implementing this proposal is attached.
>
> Mikael

-- 
Dr. Benjamin Bolker
Professor, Mathematics & Statistics and Biology, McMaster University
Director, School of Computational Science and Engineering
> E-mail is sent at my convenience; I don't expect replies outside of working hours.
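For readers following along, the "walk the parse tree" approach mentioned above can be sketched roughly as follows. This is only an illustration, not the code used in lme4, glmmTMB, or `reformulas`: the helper name `find_specials` is hypothetical, and the sketch assumes a special may appear either bare (`s(...)`) or namespace-qualified (`mgcv::s(...)`). It also shows the name-versus-call distinction from the quoted message.

```r
# Hypothetical helper (not taken from any package): recursively walk a
# formula's parse tree and collect every call whose function is `special`,
# written either bare (s(...)) or namespace-qualified (pkg::s(...)).
# Assumption: calls with empty arguments such as x[, 1] are not handled.
find_specials <- function(expr, special) {
  found <- list()
  if (is.call(expr)) {
    head <- expr[[1L]]
    # Bare form: the function position holds a symbol, e.g. `s`
    is_bare <- is.name(head) && identical(as.character(head), special)
    # Qualified form: the function position is itself a call to `::`
    is_qualified <- is.call(head) &&
      identical(head[[1L]], as.name("::")) &&
      identical(as.character(head[[3L]]), special)
    if (is_bare || is_qualified) {
      found <- list(expr)
    }
    # Recurse into the arguments (everything after the function position)
    for (arg in as.list(expr)[-1L]) {
      found <- c(found, find_specials(arg, special))
    }
  }
  found
}

hits <- find_specials(~ x1 + s(x2, f) + mgcv::s(x3, g), "s")
vapply(hits, deparse, "")
# "s(x2, f)"  "mgcv::s(x3, g)"

# The distinction drawn in the quoted message:
is.name(quote(`mgcv::s`))  # TRUE: with backticks, a single symbol
is.call(quote(mgcv::s))    # TRUE: without backticks, a call to `::`
```

Nothing here is evaluated, so the sketch runs without mgcv installed; it only inspects the unevaluated formula.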
peter dalgaard
2025-Apr-15 08:17 UTC
[Rd] Request for comment: namespace resolution in terms(<formula>, specials=) [<pkg>::<name>, etc.]
I don't seem to have the original post (not in spamfilter either). But generically, I think namespacing specials in formulas is just a Bad Idea. They are syntactic constructs, specifically _not_ function calls, so people are stumbling over formally protecting them from a non-existing scoping issue, then having to undo that for the actual use.

It all came about by someone (I have forgotten the details) having a corporate coding standard mandating namespaces on all function calls and falling over things like strata() in the survival package. Then package author(s) chose to comply rather than explain...

-pd

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd.mes at cbs.dk  Priv: PDalgd at gmail.com
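A small illustration of the point that specials are matched syntactically rather than called, offered as a sketch only. The name `not_a_function` is a made-up placeholder that is assumed not to exist anywhere on the search path; terms() still records it as a special because it only inspects the unevaluated formula.

```r
# `not_a_function` is a hypothetical, deliberately undefined name.
exists("not_a_function")  # FALSE in a session where nothing defines it

# terms() never evaluates the formula, so no "could not find function"
# error is raised; it only matches the name in the function position.
tt <- terms(~ x1 + not_a_function(x2), specials = "not_a_function")
attr(tt, "specials")
# $not_a_function
# [1] 2
```

The reported index is the position of the matching term in the "variables" attribute of the terms object, in the same way as the `$s` results quoted earlier in the thread.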