Dear r-devel list members,

On a couple of occasions I've encountered the issue illustrated by the
following examples:

--------- snip -----------

> mod.1 <- lm(Employed ~ GNP.deflator + GNP + Unemployed +
+             Armed.Forces + Population + Year, data=longley)

> mod.2 <- update(mod.1, . ~ . - Year + Year)

> all.equal(mod.1, mod.2)
[1] TRUE
>
> f <- function(mod){
+     subs <- 1:10
+     update(mod, subset=subs)
+     }

> f(mod.1)

Call:
lm(formula = Employed ~ GNP.deflator + GNP + Unemployed + Armed.Forces +
    Population + Year, data = longley, subset = subs)

Coefficients:
 (Intercept)  GNP.deflator           GNP    Unemployed  Armed.Forces
   3.641e+03     8.394e-03     6.909e-02    -3.971e-03    -8.595e-03
  Population          Year
   1.164e+00    -1.911e+00

> f(mod.2)
Error in eval(expr, envir, enclos) : object 'subs' not found

--------- snip -----------

I *almost* understand what's going on -- that is, clearly mod.1 and mod.2, or
the formulas therein, are associated with different environments, but I
don't quite see why.

Anyway, here are two "solutions" that work, but neither is in my view
desirable:

--------- snip -----------

> f1 <- function(mod){
+     assign(".subs", 1:10, envir=.GlobalEnv)
+     on.exit(remove(".subs", envir=.GlobalEnv))
+     update(mod, subset=.subs)
+     }

> f1(mod.1)

Call:
lm(formula = Employed ~ GNP.deflator + GNP + Unemployed + Armed.Forces +
    Population + Year, data = longley, subset = .subs)

Coefficients:
 (Intercept)  GNP.deflator           GNP    Unemployed  Armed.Forces
   3.641e+03     8.394e-03     6.909e-02    -3.971e-03    -8.595e-03
  Population          Year
   1.164e+00    -1.911e+00

> f1(mod.2)

Call:
lm(formula = Employed ~ GNP.deflator + GNP + Unemployed + Armed.Forces +
    Population + Year, data = longley, subset = .subs)

Coefficients:
 (Intercept)  GNP.deflator           GNP    Unemployed  Armed.Forces
   3.641e+03     8.394e-03     6.909e-02    -3.971e-03    -8.595e-03
  Population          Year
   1.164e+00    -1.911e+00

> f2 <- function(mod){
+     env <- new.env(parent=.GlobalEnv)
+     attach(NULL)
+     on.exit(detach())
+     assign(".subs", 1:10, pos=2)
+     update(mod, subset=.subs)
+     }

> f2(mod.1)

Call:
lm(formula = Employed ~ GNP.deflator + GNP + Unemployed + Armed.Forces +
    Population + Year, data = longley, subset = .subs)

Coefficients:
 (Intercept)  GNP.deflator           GNP    Unemployed  Armed.Forces
   3.641e+03     8.394e-03     6.909e-02    -3.971e-03    -8.595e-03
  Population          Year
   1.164e+00    -1.911e+00

> f2(mod.2)

Call:
lm(formula = Employed ~ GNP.deflator + GNP + Unemployed + Armed.Forces +
    Population + Year, data = longley, subset = .subs)

Coefficients:
 (Intercept)  GNP.deflator           GNP    Unemployed  Armed.Forces
   3.641e+03     8.394e-03     6.909e-02    -3.971e-03    -8.595e-03
  Population          Year
   1.164e+00    -1.911e+00

--------- snip -----------

The problem with f1() is that it will clobber a variable named .subs in the
global environment; the problem with f2() is that .subs can be masked by a
variable in the global environment.

Is there a better approach?

Thanks,
 John

--------------------------------
John Fox
Senator William McMaster
  Professor of Social Statistics
Department of Sociology
McMaster University
Hamilton, Ontario, Canada
web: socserv.mcmaster.ca/jfox

Dear all,

A small correction: I had a stray line in my f2(),

    env <- new.env(parent=.GlobalEnv)

left over from yet another attempt; it can simply be removed.

Sorry for the confusion,
 John

> -----Original Message-----
> From: r-devel-bounces at r-project.org [mailto:r-devel-bounces at r-project.org]
> On Behalf Of John Fox
> Sent: January-04-11 4:36 PM
> To: r-devel at r-project.org
> Cc: 'Sanford Weisberg'
> Subject: [Rd] scoping/non-standard evaluation issue
>
> [...]
>
> ______________________________________________
> R-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel

On Jan 4, 2011, at 22:35 , John Fox wrote:

> [...]
>
> The problem with f1() is that it will clobber a variable named .subs in the
> global environment; the problem with f2() is that .subs can be masked by a
> variable in the global environment.
>
> Is there a better approach?

I think the best way would be to modify the environment of the formula.
Something like the below, except that it doesn't actually work...

f3 <- function(mod) {
  f <- formula(mod)
  environment(f) <- e <- new.env(parent=environment(f))
  mod <- update(mod, formula=f)
  evalq(.subs <- 1:10, e)
  update(mod, subset=.subs)
}

The catch is that it is not quite so easy to update the formula of a model.

-- 
Peter Dalgaard
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd.mes at cbs.dk  Priv: PDalgd at gmail.com

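One way to read the catch: update() passes a new formula through update.formula(), whose result takes the environment of the model's existing formula, so the environment e attached to f is lost. A minimal sketch of a variant that avoids this by writing the formula straight into the stored call and re-evaluating it. The name f3b and the direct use of mod$call are illustrative assumptions, not something proposed in the thread:

```r
# Hypothetical variant of f3(); f3b is an illustrative name.
f3b <- function(mod) {
    f <- formula(mod)
    # New environment whose parent is the formula's original environment,
    # so everything that was visible to the formula before still is.
    e <- new.env(parent = environment(f))
    assign(".subs", 1:10, envir = e)
    environment(f) <- e

    cl <- mod$call              # the stored lm() call
    cl$formula <- f             # formula object that carries e
    cl$subset <- quote(.subs)   # resolved through environment(f), i.e. in e
    eval(cl, e)
}

mod.1 <- lm(Employed ~ GNP.deflator + GNP + Unemployed +
            Armed.Forces + Population + Year, data = longley)
mod.2 <- update(mod.1, . ~ . - Year + Year)

fit.1 <- f3b(mod.1)
fit.2 <- f3b(mod.2)
nobs(fit.1)   # number of rows used in the refit
```

Nothing is written to the global environment, and .subs cannot be masked from there because e is searched first. The cost is that the refitted model keeps e alive through its formula.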
On Tue, Jan 4, 2011 at 4:35 PM, John Fox <jfox at mcmaster.ca> wrote:
> [...]
>
> Is there a better approach?
>

I think there is something wrong with R here since the formula in the
call component of mod.1 has a "call" class whereas the corresponding
call component of mod.2 has "formula" class:

> class(mod.1$call[[2]])
[1] "call"
> class(mod.2$call[[2]])
[1] "formula"

If we reset call[[2]] to have "call" class then it works:

> mod.2a <- mod.2
> mod.2a$call[[2]] <- as.call(as.list(mod.2a$call[[2]]))
> f(mod.2a)

Call:
lm(formula = Employed ~ GNP.deflator + GNP + Unemployed + Armed.Forces +
    Population + Year, data = longley, subset = subs)

Coefficients:
 (Intercept)  GNP.deflator           GNP    Unemployed  Armed.Forces    Population          Year
   3.641e+03     8.394e-03     6.909e-02    -3.971e-03    -8.595e-03     1.164e+00    -1.911e+00

-- 
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com

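The class difference goes together with a difference in environments, which is what the lookup of the subset argument depends on. A minimal sketch for inspecting this, assuming mod.1 and mod.2 are built as in the original post. The explanations in the comments are one reading of how update() and lm() behave, not something stated in the thread:

```r
mod.1 <- lm(Employed ~ GNP.deflator + GNP + Unemployed +
            Armed.Forces + Population + Year, data = longley)
mod.2 <- update(mod.1, . ~ . - Year + Year)

# mod.1's call holds an unevaluated call to `~`. It has no environment
# attribute; it only becomes a formula (and picks up an environment)
# in whatever frame the call is re-evaluated, e.g. inside f().
environment(mod.1$call$formula)

# mod.2's call holds a formula object that already carries an environment,
# namely the one in which mod.1's formula was created.
identical(environment(mod.2$call$formula),
          environment(formula(mod.1)))
```

Under this reading, f(mod.2) fails because subs is then looked up from the environment stored in the formula rather than from the frame of f().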
the following seems an easy solution:

f1 <- function(mod){
     subs <- 1:10
     toeval <- quote(update(mod, subset=subs))
     toeval$subset<-subs
     eval(toeval)
    }

f1(mod.2)

When experimenting with this I once had (by mistake):

mod.2 <- lm(update(mod.1, . ~ . - Year + Year))
# instead of just update(this)

... and this helped, too, i.e. f(mod.2) worked.

Best regards,

Kenn

Kenn Konstabel
Department of Chronic Diseases
National Institute for Health Development
Hiiu 42
Tallinn, Estonia

On Tue, Jan 4, 2011 at 11:35 PM, John Fox <jfox@mcmaster.ca> wrote:
> [...]

On Jan 6, 2011, at 13:11 , Kenn Konstabel wrote:

> the following seems an easy solution:
>
> f1 <- function(mod){
>      subs <- 1:10
>      toeval <- quote(update(mod, subset=subs))
>      toeval$subset<-subs
>      eval(toeval)
>     }
>
> f1(mod.2)

Tere, Kenn!

Yes, enforcing pass-by-value by pre-evaluating the argument will certainly
defeat the nonstandard evaluation issues. Another version of the same idea is

eval(bquote(update(mod, subset=.(subs))))

The only thing is that if the argument is ever deparsed, you might get a
messy display. E.g., try

eval(bquote(plot(.(rnorm(20)))))

-pd

-- 
Peter Dalgaard
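
Wrapped in a function, the bquote() idea might look like the sketch below. The name f4 is illustrative, and the example assumes mod.1 and mod.2 as defined in the original post:

```r
mod.1 <- lm(Employed ~ GNP.deflator + GNP + Unemployed +
            Armed.Forces + Population + Year, data = longley)
mod.2 <- update(mod.1, . ~ . - Year + Year)

# f4 is a hypothetical name for a bquote()-based version of f().
f4 <- function(mod) {
    subs <- 1:10
    # .(subs) is replaced by the value of subs before update() is called,
    # so the refit call contains the index vector itself, not the name.
    eval(bquote(update(mod, subset = .(subs))))
}

fit <- f4(mod.2)
nobs(fit)

# The stored call holds the data rather than a symbol, which is why a
# long or random vector would make the printed call unwieldy.
is.symbol(fit$call$subset)
```

Because no name is left to resolve, the environment attached to the formula no longer matters for the subset argument.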