I wish I had started with "I am disappointed that lm() doesn't continue its search for weights into the calling environment" or "the fact that lm() looks only in the formula environment and data frame for weights doesn't seem consistent with how other values are treated."

But I did not. So I do apologize for both that and for negative tone on my part.

Simplified example:

d <- data.frame(x = 1:3, y = c(1, 2, 1))
w <- c(1, 10, 1)
f <- as.formula(y ~ x)
lm(f, data = d, weights = w)  # works

# fails
environment(f) <- baseenv()
lm(f, data = d, weights = w)
# Error in eval(extras, data, env) : object 'w' not found

> On Aug 9, 2020, at 11:56 AM, Duncan Murdoch <murdoch.duncan at gmail.com> wrote:
>
> This is fairly clearly documented in ?lm:
>
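The lookup order discussed in this message (the data frame first, then the environment of the formula) can be illustrated with a small sketch. The helper make_formula() and the objects fit_env, fit_data and fit_unweighted are hypothetical names introduced only for this illustration; they are not part of the original report.

```r
d <- data.frame(x = 1:3, y = c(1, 2, 1))

# A formula created inside a function carries that function's frame
# as its environment (hypothetical helper, for illustration only).
make_formula <- function() {
  w <- c(1, 10, 1)  # reachable by lm() only through environment(f)
  y ~ x
}
f <- make_formula()

# 'w' is not in 'd', so it is looked up in environment(f).
fit_env <- lm(f, data = d, weights = w)

# A column named 'w' in the data is searched before environment(f).
d2 <- cbind(d, w = c(1, 1, 1))
fit_data <- lm(f, data = d2, weights = w)

# With unit weights taken from d2, the fit matches the unweighted one.
fit_unweighted <- lm(y ~ x, data = d)
all.equal(coef(fit_data), coef(fit_unweighted))
```

Neither call looks in the frame that calls lm(); the only places consulted are the data and the formula's environment chain.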
I assume you are concerned about this because the formula is defined in one environment and the model fitting with weights occurs in a separate function. If that is the case then the model fitting function can create a new environment, a child of the formula's environment, add the weights variable to it, and make that the new environment of the formula. (This new environment is only an attribute of the copy of the formula in the model fitting function: it will not affect the formula outside of that function.)

E.g.,

d <- data.frame(x = 1:3, y = c(1, 2, 1))

lmWithWeightsBad <- function(formula, data, weights) {
    lm(formula, data=data, weights=weights)
}
coef(lmWithWeightsBad(y~x, data=d, weights=c(2,5,1))) # lm finds the 'weights' function in package:stats
#Error in model.frame.default(formula = formula, data = data, weights = weights, :
#  invalid type (closure) for variable '(weights)'

lmWithWeightsGood <- function(formula, data, weights) {
    envir <- new.env(parent = environment(formula))
    envir$weights <- weights
    environment(formula) <- envir
    lm(formula, data=data, weights=weights)
}
coef(lmWithWeightsGood(y~x, data=d, weights=c(2,5,1)))
#(Intercept)           x
#  1.2173913   0.2173913

Bill Dunlap
TIBCO Software
wdunlap tibco.com

On Mon, Aug 10, 2020 at 10:43 AM John Mount <jmount at win-vector.com> wrote:
>
> I wish I had started with "I am disappointed that lm() doesn't continue its search for weights into the calling environment" or "the fact that lm() looks only in the formula environment and data frame for weights doesn't seem consistent with how other values are treated."
>
> But I did not. So I do apologize for both that and for negative tone on my part.
>
> ______________________________________________
> R-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
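The parenthetical claim above, that the new environment is attached only to the function's local copy of the formula, can be checked with a short sketch. The wrapper name fit_with_weights and the variables f, env_before and fit are hypothetical; the body follows the same pattern as lmWithWeightsGood.

```r
d <- data.frame(x = 1:3, y = c(1, 2, 1))

# Hypothetical wrapper following the child-environment pattern.
fit_with_weights <- function(formula, data, weights) {
  env <- new.env(parent = environment(formula))
  env$weights <- weights
  environment(formula) <- env  # changes the local copy only
  lm(formula, data = data, weights = weights)
}

f <- y ~ x
env_before <- environment(f)
fit <- fit_with_weights(f, d, c(2, 5, 1))

# The caller's formula still has its original environment.
identical(environment(f), env_before)

# The fitted model's terms, however, keep the child environment,
# including the stored weights.
fit_env <- attr(terms(fit), ".Environment")
exists("weights", envir = fit_env, inherits = FALSE)
```

One consequence worth keeping in mind is that the child environment stays attached to the fitted object for as long as the object lives.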
On 10/08/2020 1:42 p.m., John Mount wrote:
> I wish I had started with "I am disappointed that lm() doesn't continue its search for weights into the calling environment" or "the fact that lm() looks only in the formula environment and data frame for weights doesn't seem consistent with how other values are treated."

Normally searching is done automatically by following a chain of environments. It's easy to add something to the head of the chain (e.g. data), it's hard to add something in the middle or at the end (because the chain ends with emptyenv(), which is not allowed to have a parent).

So I'd suggest using

  environment(f) <- environment()

before calling lm() if you want the calling environment to be in the search. Setting it to baseenv() doesn't really make sense, unless you want to disable all searches except in data, in which case emptyenv() would make more sense (but I haven't tried it, so it might break something).

Duncan Murdoch
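A minimal sketch of this suggestion inside a fitting function follows. The function name fit_local and the weight values are assumptions made for the illustration; only the assignment environment(f) <- environment() comes from the message above.

```r
d <- data.frame(x = 1:3, y = c(1, 2, 1))

# Hypothetical fitting function that owns its weights.
fit_local <- function(f, data) {
  w <- c(1, 10, 1)                 # local to this call
  environment(f) <- environment()  # put this frame in the search
  lm(f, data = data, weights = w)
}

f <- y ~ x
environment(f) <- baseenv()  # the failing setup from the thread
fit <- fit_local(f, d)       # 'w' is now found in fit_local's frame
coef(fit)
```

The trade-off is that the formula's original environment is replaced, so any name that was reachable only through it is no longer found that way.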
Thank you for your suggestion. I do know how to work around the issue. I usually build a fresh environment as a child of the base environment and then insert the weights there. I was just trying to provide an example of the issue.

emptyenv() cannot be used: the eval needs functions from the base environment, and it errors out with "could not find function list" even if weights are not used.

For some applications one doesn't want the formula to carry a non-trivial environment, because that environment is serialized along with the model. Nina Zumel wrote about reference leaks in lm()/glm(), and a good part of that was environments other than global/base (such as those formed when building a formula in a function) capturing references to unrelated structures.

> On Aug 10, 2020, at 11:34 AM, Duncan Murdoch <murdoch.duncan at gmail.com> wrote:
>
> Normally searching is done automatically by following a chain of environments. It's easy to add something to the head of the chain (e.g. data), it's hard to add something in the middle or at the end (because the chain ends with emptyenv(), which is not allowed to have a parent).
>
> So I'd suggest using
>
>   environment(f) <- environment()
>
> before calling lm() if you want the calling environment to be in the search. Setting it to baseenv() doesn't really make sense, unless you want to disable all searches except in data, in which case emptyenv() would make more sense (but I haven't tried it, so it might break something).
>
> Duncan Murdoch
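The workaround described at the top of this message (a fresh environment whose parent is the base environment, holding only the weights) might look like the sketch below. The function name fit_clean and the variable names are hypothetical, and this is only one possible way to write it.

```r
d <- data.frame(x = 1:3, y = c(1, 2, 1))

# Hypothetical helper: attach a minimal environment to the formula.
fit_clean <- function(f, data, w) {
  env <- new.env(parent = baseenv())
  assign("w", w, envir = env)  # the only object the model will capture
  environment(f) <- env
  lm(f, data = data, weights = w)
}

fit <- fit_clean(y ~ x, d, c(1, 10, 1))

# The environment kept by the fitted model holds nothing but 'w'.
kept <- attr(terms(fit), ".Environment")
ls(kept)
identical(parent.env(kept), baseenv())
```

Because the parent is baseenv(), functions from other packages (for example poly() from stats) would not be visible to terms in the formula, so this sketch assumes a formula that uses only data columns and base functions.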