murdoch at stats.uwo.ca
2006-Nov-03 14:12 UTC
[Rd] [R] difference in using with() and the "data" argument in glm (PR#9338)
I've redirected this reply from r-help to the bugs list.

On 11/3/2006 8:25 AM, vito muggeo wrote:
> Dear all,
> I am dealing with the following (apparently simple problem):
> For some reasons I am interested in passing variables from a dataframe
> to a specific environment, and in fitting a standard glm:
>
> dati<-data.frame(y=rnorm(10),x1=runif(10),x2=runif(10))
> KK<-new.env()
> for(i in 1:ncol(dati)) assign(names(dati[i]),dati[[i]],envir=KK)
> #Now the following two lines work correctly:
> coef(glm(y~x1+x2,data=KK))
> with(KK,coef(glm(y~x1+x2)))
>
> #However if I write the above code inside a function, with() does not appear to work..
>
> ff<-function(Formula,Data,method=1){
> KK<-new.env()
> for(i in 1:ncol(Data)) assign(names(Data[i]),Data[[i]],envir=KK)
> o<-if(method==1) glm(Formula,data=KK) else with(KK,glm(Formula))
> o}
>
> > ff(y~x1+x2,dati,1) #it works
> Call: glm(formula = Formula, data = KK)
> ..[SNIP]..
> > ff(y~x1+x2,dati,2) #it does not
> Error in eval(expr, envir, enclos) : object "y" not found
>
> Could anyone to explain such difference? I believed that
> "with(data,glm(formula))" and "glm(formula,data)" were equivalent.

I think this is a bug in terms.formula. Near the end it has

  environment(terms) <- environment(x)

where x is the formula. Since "y" isn't defined in that environment, it fails. It would work for you with

  environment(terms) <- data

but see below.

A workaround that should work for you is to put

  environment(Formula) <- KK

before the call to glm.

I'm not going to make the patch I suggest above, because I don't think it's consistent with the expected behaviour of glm() in the case where some of the terms in the formula are supposed to come from environment(x), and some from "data".

I don't know how to handle that case properly: I think it requires a different search strategy than R employs (but I might be wrong). This isn't a problem with the workaround I suggested to you, because there the parent of KK is environment(x), but that wouldn't be true in general.

Duncan Murdoch
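
The workaround described above can be sketched as follows. This is only an illustration, not a patch to R: ff_fixed is a hypothetical name for the original ff() reduced to its with() branch, the seed is arbitrary, and the comparison against glm(..., data = dati) is added here just to check that both routes fit the same model.

```r
set.seed(1)  # arbitrary seed, only so the comparison is reproducible
dati <- data.frame(y = rnorm(10), x1 = runif(10), x2 = runif(10))

# ff_fixed is a hypothetical name: the original ff() with method = 2,
# plus the one-line workaround from the message above.
ff_fixed <- function(Formula, Data) {
  KK <- new.env()
  for (nm in names(Data)) assign(nm, Data[[nm]], envir = KK)
  # Workaround: make KK the formula's environment, so that the model
  # variables are looked up in KK instead of where the formula was written.
  environment(Formula) <- KK
  with(KK, glm(Formula))
}

fit_with <- ff_fixed(y ~ x1 + x2, dati)
fit_data <- glm(y ~ x1 + x2, data = dati)

# TRUE if both fits give the same coefficients
coefs_agree <- isTRUE(all.equal(coef(fit_with), coef(fit_data)))
```

Without the environment(Formula) <- KK line, the same call fails with the "object not found" error from the original report, because the formula still points at the environment in which it was typed.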
Gabor Grothendieck
2006-Nov-03 15:34 UTC
[Rd] [R] difference in using with() and the "data" argument in glm (PR#9338)
One thing I noticed is that ?glm does not really specify what happens if you do not give a value for data. Is data then just skipped, so that the search takes place in environment(formula) only, or is it supposed to default to something? Some clarification in ?glm would be helpful.
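
On the question of what happens when data is omitted: current versions of ?glm say that variables not found in data are taken from environment(formula). A small sketch of that lookup, where make_formula is a hypothetical helper and the numbers are made up for illustration:

```r
# make_formula is a hypothetical helper; y and x exist only inside its frame.
make_formula <- function() {
  y <- c(1.2, 2.3, 2.9, 4.1, 5.2)  # made-up response values
  x <- 1:5
  y ~ x  # the formula records this function's frame as its environment
}

f <- make_formula()

# No data argument: the variables are found through environment(f),
# even though this call is not made inside make_formula().
fit <- glm(f)

# Names visible in the formula's environment
env_vars <- ls(environment(f))
```

This is the same mechanism that makes the with() call fail inside a function: the lookup follows the formula's environment, not the environment in which glm() happens to be evaluated.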