Ivan Krylov
2022-Jun-22 15:25 UTC
[Rd] stats::getInitial: requires the model to live in the stats namespace or above
Hello R-devel,

Here's a corner case I've stumbled upon recently:

    local({
      # Originally this was a package namespace, but a local
      # environment also leads to failure
      stopifnot(!identical(environment(), globalenv()))

      # Make a self-starting model inside this private environment...
      SSlinear <- selfStart(
        ~ a * x + b,
        function(mCall, data, LHS, ...) {
          xy <- sortedXyData(mCall[['x']], LHS, data)
          setNames(
            coef(lm(y ~ x, xy)),
            mCall[c('b', 'a')]
          )
        },
        c('a', 'b')
      )

      # ...and try to use it
      x <- 1:100
      y <- 100 + 5 * x + rnorm(length(x), sd = 10)
      nls(y ~ SSlinear(x, a, b))
      # error in get('SSlinear'): object not found
    })

As a workaround, I'll just provide the starting values manually, but should this work?

As implemented [1], getInitial requires the model object to live in the stats package namespace or any of its parents, which eventually include the global environment and the attached packages, but not the private namespaces of the packages or any other local environments. This results from the fact that getInitial() uses plain get() in order to resolve the symbol for the self-starting model, and get() defaults to the current environment, which leads to a chain of stats -> imports:stats -> base -> global environment -> attached packages.

It seems easy to suggest get(., envir = environment(object)) as a fix, which would be able to access anything available at the time of creation of the formula. On the other hand, it would break the case when the stats package is not attached to the global environment or the formula environment, which currently works.

--
Best regards,
Ivan

[1] https://github.com/r-devel/r-svn/blob/d43497cbc927e632c6f597fa23001c3f31d4cae6/src/library/stats/R/selfStart.R#L81-L87
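The lookup chain described in this message can be reproduced in isolation. The following is only a minimal sketch: `lookup_plain`, `lookup_from_formula` and `hiddenModelFn` are hypothetical names invented for illustration, and re-parenting a function onto the stats namespace is merely a way to imitate a function defined inside stats, not how getInitial() itself is written.

```r
# Hypothetical helper: plain get(), evaluated as if defined inside stats.
lookup_plain <- function(name) get(name)
environment(lookup_plain) <- asNamespace("stats")

# Hypothetical helper: start the search from the formula's environment.
lookup_from_formula <- function(name, formula) {
  get(name, envir = environment(formula))
}

res <- local({
  # Visible only inside this local environment (hypothetical name).
  hiddenModelFn <- function(x) x + 1
  f <- ~ hiddenModelFn(x)   # the formula records this environment

  list(
    # Searches the function frame, then stats, its imports, base,
    # the global environment and the attached packages; never this
    # local environment, so the lookup is expected to fail.
    plain = tryCatch(lookup_plain("hiddenModelFn"),
                     error = function(e) conditionMessage(e)),
    # Starts in the local environment, so the function is found.
    from_formula = lookup_from_formula("hiddenModelFn", f)
  )
})

is.character(res$plain)        # the plain lookup produced an error message
is.function(res$from_formula)  # the formula-based lookup found the function
```

The second helper succeeds because a formula keeps a reference to the environment in which it was created, which is exactly the environment where the user's model function is defined.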
Bill Dunlap
2022-Jun-22 16:44 UTC
[Rd] stats::getInitial: requires the model to live in the stats namespace or above
Shouldn't the get()'s in stats:::getInitial.formula be looking in the environment of the formula, not the environment of getInitial.formula?

--- selfStart.R (revision 82512)
+++ selfStart.R (working copy)
@@ -78,13 +79,19 @@
     switch (length(object),
             stop("argument 'object' has an impossible length"),
         { # one-sided formula
-            func <- get(as.character(object[[2L]][[1L]]))
+            if (!is.call(object[[2L]])) {
+                stop("Right-hand side of formula is not a call")
+            }
+            func <- get(as.character(object[[2L]][[1L]]), mode="function", envir=environment(object))
             getInitial(func, data,
                        mCall = as.list(match.call(func, call = object[[2L]])),
                        ...)
         },
         { # two-sided formula
-            func <- get(as.character(object[[3L]][[1L]]))
+            if (!is.call(object[[3L]])) {
+                stop("Right-hand side of formula is not a call")
+            }
+            func <- get(as.character(object[[3L]][[1L]]), mode="function", envir=environment(object))
             getInitial(func, data,
                        mCall = as.list(match.call(func, call = object[[3L]])),
                        LHS = object[[2L]], ...)

-Bill
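The lookup proposed in the patch can be tried outside of stats. What follows is a sketch under stated assumptions, not the code that ended up in R: `resolve_model` is a hypothetical helper that only extracts the resolution step (the is.call() check plus get() with mode = "function" and envir = environment(object)), and `mySSfunc` is just a local alias for stats::SSlogis.

```r
# Hypothetical helper mirroring the lookup step of the proposed patch.
resolve_model <- function(object) {
  # The right-hand side is element 2 of a one-sided formula and
  # element 3 of a two-sided one, i.e. always the last element.
  rhs <- object[[length(object)]]
  if (!is.call(rhs)) {
    stop("Right-hand side of formula is not a call")
  }
  # Search for a function of that name, starting where the formula
  # was created rather than inside the stats namespace.
  get(as.character(rhs[[1L]]), mode = "function",
      envir = environment(object))
}

fn <- local({
  mySSfunc <- stats::SSlogis   # exists only in this local environment
  resolve_model(circumference ~ mySSfunc(age, Asym, xmid, scal))
})

inherits(fn, "selfStart")   # the local alias was resolved
```

A formula whose right-hand side is a bare symbol, such as `~ x`, triggers the explicit error from the is.call() check instead of a less informative failure later on.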
Sebastian Meyer
2022-Jun-22 16:47 UTC
[Rd] stats::getInitial: requires the model to live in the stats namespace or above
Thank you, Ivan, for this careful report. You are right, and I actually found that issue myself while working on nlme bugs some months ago, but had decided to postpone working on it until someone reported real problems with how this has been implemented for ages.

Here is my (smaller) example:

    local({
      mySSfunc <- stats::SSlogis
      getInitial(circumference ~ mySSfunc(age, Asym, xmid, scal), Orange)
    })
    Error in get(as.character(object[[3L]][[1L]])) :
      object 'mySSfunc' not found

And also evil code that I had planned as a regression test:

    plogis <- stats::SSlogis
    in2 <- getInitial(circumference ~ plogis(age, Asym, xmid, scal), Orange)
    Error in getInitial.default(func, data, mCall = as.list(match.call(func, :
      no 'getInitial' method found for "function" objects

I had similar thoughts about the "obvious" patch that you describe and also assume a minor slow-down in variable lookup for the standard use case with the pre-defined self-starting functions from stats. However, these problems may not be relevant in practice ... they seem to be less relevant than the bug itself since we now both found it independently and it cannot be worked around. Furthermore, stats is a base package attached by default (but packages like yours could even "Depends: stats" to ensure that self-starting functions from stats are eventually found starting from the formula environment, often the global environment, if not masked).

I'd suggest you add this report to R's Bugzilla so that it can be linked from the NEWS once this gets addressed.

Thanks and best regards,

Sebastian Meyer
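The masking case from the "evil" regression test can be looked at without touching the global environment. This is a rough sketch that assumes a local environment is an acceptable stand-in for the user's workspace; it compares the two lookups directly and does not call getInitial().

```r
res <- local({
  # A user-level alias that shares its name with stats::plogis,
  # the logistic distribution function.
  plogis <- stats::SSlogis
  f <- circumference ~ plogis(age, Asym, xmid, scal)
  name <- as.character(f[[3L]][[1L]])

  # Starting from the stats namespace finds the distribution function,
  # which is a plain function without a getInitial method.
  from_stats <- get(name, envir = asNamespace("stats"))
  # Starting from the formula environment finds the self-starting alias.
  from_formula <- get(name, mode = "function", envir = environment(f))

  c(from_stats   = inherits(from_stats, "selfStart"),
    from_formula = inherits(from_formula, "selfStart"))
})

res
```

The first lookup yields an ordinary function, which is consistent with the "no 'getInitial' method found" error quoted above, while the second one yields the self-starting model the user intended.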