Full_Name: Peter Perkins
Version: 1.2.1
OS: LinuxPPC
Submission from: (NULL) (24.4.89.36)


the lines

    if (missing(data))
        data <- environment(formula)

in glm() seem to contradict the documentation:

    data: an optional data frame containing the variables in the model.
          By default the variables are taken from the environment from
          which `glm' is called.

actually, the near lack of other references to "data" in glm() is not
clear to me, so i may have this wrong.  but a small test seems to bear
out the problem:

test <- function () {
    form <- y ~ x
    fun1 <- function(f) {
        n <- 10
        x <- runif(n, 1, 10)
        y <- rpois(n, x)
        glm(f, family=poisson(link="log"), data=environment())
    }
    fun2 <- function(f) {
        n <- 10
        x <- runif(n, 1, 10)
        y <- rpois(n, x)
        glm(f, family=poisson(link="log"))
    }
    print("fit1")
    print(fun1(form)$call)
    print("fit2")
    print(fun2(form)$call)
}

> test()
[1] "fit1"
glm(formula = f, family = poisson(link = "log"), data = environment())
[1] "fit2"
Error in eval(expr, envir, enclos) : Object "y" not found

-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-devel mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !)  To: r-devel-request@stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
On Fri, 9 Feb 2001 pperkins@ucsd.edu wrote:

>
> the lines
>
>     if (missing(data))
>         data <- environment(formula)
>
> in glm() seem to contradict the documentation:
>
>     data: an optional data frame containing the variables in the model.
>           By default the variables are taken from the environment from
>           which `glm' is called.
>
> actually, the near lack of other references to "data" in glm() is not
> clear to me, so i may have this wrong.  but a small test seems to bear
> out the problem:

The documentation is slightly out of date. The current behaviour is that
the variables are taken from the environment in which the model formula
is defined. This will usually, but not always, be the same thing. This
behaviour is new in 1.2.0, and is designed to give functions like
predict() and update() a fighting chance.

However, the line you quote is in fact irrelevant to this. The variable
`data' that it modifies is not used in fitting the model; it is just
returned as part of the result.

So, you ask, if the `data' variable is never used inside glm(), how does
the data argument have any effect? This is Deep Magic, and probably worth
explaining.

A copy of the call to glm() (or other modelling functions) is grabbed by
match.call() and turned into a call to model.frame(). This call is then
evaluated in the parent environment to create a model frame. The
model.matrix() function is then called to create a design matrix from the
model frame. This is how we fake dynamic scope (using the values of
variables in the parent environment) in a language that really has static
scope.

The moral of this is that it pays to put the variables you want to use in
a data frame. It's much easier to find them that way than by playing
clever games with environments.

	-thomas

Thomas Lumley
Asst. Professor, Biostatistics
tlumley@u.washington.edu
University of Washington, Seattle
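
A rough sketch of the mechanism described in the reply above, for anyone
who wants to experiment. This is not glm()'s actual source: toy_fit() and
the three caller_* functions are hypothetical names made up for the
illustration, and the data values are arbitrary. It assumes a recent R in
which model.frame() looks up variables in the formula's environment when
no data argument is given.

```r
# toy_fit() is a hypothetical stand-in for a modelling function such as glm().
toy_fit <- function(formula, data) {
  # Grab a copy of the call that was made to toy_fit() ...
  mf <- match.call()
  # ... keep only the arguments model.frame() understands ...
  keep <- match(c("formula", "data"), names(mf), 0L)
  mf <- mf[c(1L, keep)]
  # ... turn it into a call to model.frame() ...
  mf[[1L]] <- quote(stats::model.frame)
  # ... and evaluate that call in the caller's environment.
  mf <- eval(mf, parent.frame())
  # Build the design matrix from the model frame.
  X <- model.matrix(attr(mf, "terms"), mf)
  list(frame = mf, design = X)
}

# A formula created at top level; x and y do not exist there.
form <- y ~ x

# Case 1: the formula is created where x and y live, so they are found.
caller_local <- function() {
  x <- 1:5
  y <- c(2, 4, 5, 4, 6)
  toy_fit(y ~ x)
}

# Case 2: the formula was created elsewhere, so the caller's local x and y
# are not visible from the formula's environment. The error is caught and
# its message returned instead.
caller_outer <- function(f) {
  x <- 1:5
  y <- c(2, 4, 5, 4, 6)
  tryCatch(toy_fit(f), error = function(e) conditionMessage(e))
}

# Case 3: the variables travel in a data frame, so it no longer matters
# where the formula was created.
caller_df <- function(f) {
  d <- data.frame(x = 1:5, y = c(2, 4, 5, 4, 6))
  toy_fit(f, data = d)
}

res_local <- caller_local()
res_outer <- caller_outer(form)
res_df    <- caller_df(form)
```

Case 2 is the same situation as fun2() in the original report, and case 3
is the data frame approach recommended in the reply. Inspecting res_outer
shows the "not found" message, while res_local$design and res_df$design
hold the design matrices.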