> On 4 May 2023, at 10:26, Duncan Murdoch <murdoch.duncan at gmail.com> wrote:
>
> On 04/05/2023 4:05 a.m., Adelchi Azzalini via R-help wrote:
>> Hi. There must be something about the use of update() which I do not grasp,
>> as the next exercise indicates.
>> Suppose that obj is an object returned by a call to lm() or glm().
>> Next, a new variable xf is constructed using the same dataframe used
>> for producing obj. Then
>>     obj$data <- cbind(obj$data, xf=xf)
>>     new.obj <- update(obj, . ~ . + xf)
>> generates
>>     Error in eval(predvars, data, env) : object 'xf' not found
>> Could somebody explain what I got wrong, and how to fix it?
>
> I don't think you should be modifying the obj$data element: as far as I can see, it's not used during the update, which will just re-evaluate the original call to glm(). So you should modify the dataframe that you passed in when creating obj.
>

Thanks, Duncan. What you indicate is surely the ideal route. Unfortunately, in my case this is not feasible, because the construction of xf and the update call are within an iterative procedure where xf is changed at each iteration, so that the steps

    obj$data <- cbind(obj$data, xf=xf)
    new.obj <- update(obj, . ~ . + xf)

must be repeated hundreds of times, each with a different xf.

Adelchi
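Duncan's point that update() re-evaluates the original call, rather than reading obj$data, can be checked by asking update() for the call it would run instead of running it. The following is a minimal sketch and not code from the thread: the data frame somedata, its columns x and y, and the Poisson model are invented for illustration.

```r
# Hypothetical data and model; none of these names come from the original post.
set.seed(1)
somedata <- data.frame(x = rnorm(20))
somedata$y <- rpois(20, exp(0.3 * somedata$x))
obj <- glm(y ~ x, family = poisson, data = somedata)

# The modification attempted in the question: change the copy stored in obj.
xf <- rnorm(20)
obj$data <- cbind(obj$data, xf = xf)

# evaluate = FALSE makes update() return the modified call without running it.
new.call <- update(obj, . ~ . + xf, evaluate = FALSE)
print(new.call)

# The data argument is still the symbol `somedata`, so the refit would look up
# the original data frame, which has no xf column.
print(new.call$data)
print("xf" %in% names(somedata))
```

In this sketch the formula in the returned call gains the xf term, while its data argument still names somedata, so the column added to obj$data plays no part in the refit.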
On 04/05/2023 4:34 a.m., Adelchi Azzalini wrote:
>
>
>> On 4 May 2023, at 10:26, Duncan Murdoch <murdoch.duncan at gmail.com> wrote:
>>
>> On 04/05/2023 4:05 a.m., Adelchi Azzalini via R-help wrote:
>>> Hi. There must be something about the use of update() which I do not grasp,
>>> as the next exercise indicates.
>>> Suppose that obj is an object returned by a call to lm() or glm().
>>> Next, a new variable xf is constructed using the same dataframe used
>>> for producing obj. Then
>>>     obj$data <- cbind(obj$data, xf=xf)
>>>     new.obj <- update(obj, . ~ . + xf)
>>> generates
>>>     Error in eval(predvars, data, env) : object 'xf' not found
>>> Could somebody explain what I got wrong, and how to fix it?
>>
>> I don't think you should be modifying the obj$data element: as far as I can see, it's not used during the update, which will just re-evaluate the original call to glm(). So you should modify the dataframe that you passed in when creating obj.
>>
>
> Thanks, Duncan. What you indicate is surely the ideal route. Unfortunately, in my case this is not feasible, because the construction of xf and the update call are within an iterative procedure where xf is changed at each iteration, so that the steps
>
>     obj$data <- cbind(obj$data, xf=xf)
>     new.obj <- update(obj, . ~ . + xf)
>
> must be repeated hundreds of times, each with a different xf.

Sorry, that doesn't make sense. You didn't show us complete code, but presumably it's preceded by something like this:

    obj <- glm( ..., data = somedata)

So change your modification to this:

    somedata$xf <- xf

That can be done hundreds of times.

This will need to be more elaborate if the function doing the iteration has a copy of obj but doesn't have a copy of somedata, but there are lots of ways to resolve that. Without seeing complete code, I can't recommend which one to use.

Duncan Murdoch
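The suggestion above can be sketched as a loop that overwrites the xf column of the original data frame before each update(). This is only an illustration under stated assumptions: the simulated somedata, the binomial model, the five iterations, and the random xf are placeholders, since the complete code was never posted.

```r
# Hypothetical setup; the real data, model and construction of xf are unknown.
set.seed(42)
n <- 50
somedata <- data.frame(x = rnorm(n))
somedata$y <- rbinom(n, 1, plogis(0.5 * somedata$x))
obj <- glm(y ~ x, family = binomial, data = somedata)

n.iter <- 5                       # stands in for "hundreds" of iterations
deviances <- numeric(n.iter)
for (i in seq_len(n.iter)) {
  xf <- rnorm(n)                  # placeholder for the real construction of xf
  somedata$xf <- xf               # overwrite the column in the data frame glm() was given
  new.obj <- update(obj, . ~ . + xf)   # re-runs glm() with data = somedata
  deviances[i] <- deviance(new.obj)
}
print(deviances)
```

The loop here runs in the same environment that holds somedata, so the re-evaluated call finds the modified data frame. As noted above, code in which the iterating function has obj but not somedata would need a different arrangement.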
G'day Adelchi,

hope all is well with you.

On Thu, 4 May 2023 10:34:00 +0200
Adelchi Azzalini via R-help <r-help at r-project.org> wrote:

> Thanks, Duncan. What you indicate is surely the ideal route.
> Unfortunately, in my case this is not feasible, because the
> construction of xf and the update call are within an iterative
> procedure where xf is changed at each iteration, so that the steps
>
>     obj$data <- cbind(obj$data, xf=xf)
>     new.obj <- update(obj, . ~ . + xf)
>
> must be repeated hundreds of times, each with a different xf.

If memory serves correctly, update() takes the object that is passed to it, looks at the call that created that object, modifies that call according to the additional arguments, and finally executes the modified call.

So there are a lot of manipulations going on in update(). In particular, each use results in a call to lm(), glm() or whatever function was used to create the object. Inside any of these modelling functions a lot of symbolic manipulations/calculations are needed too (parsing the formula, creating the design matrix and response vector from the parsed formula and data frame, checking if weights are used, &c).

If you do essentially the same calculation over and over again, with only minor modifications, all these symbolic manipulations just consume time.

IMHO, you will be better off bypassing update() and using lm.fit() (for which lm() is a nice front-end) or glm.fit() (for which glm() is a nice front-end) directly, or whatever routine does the grunt work of fitting the model to the data in your application (hopefully, the package creator used a setup in which XXX.fit() fits the model and is called by XXX(), which does all the fancy formula handling).

Cheers,

Berwin
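A minimal sketch of the glm.fit() route described above, assuming a binomial model and simulated data (somedata, x, y, the five iterations and the random xf are all invented for illustration). The formula is processed once with model.matrix(), and each iteration only appends the new column to the design matrix.

```r
# Hypothetical setup; the real model and data are not shown in the thread.
set.seed(42)
n <- 50
somedata <- data.frame(x = rnorm(n))
somedata$y <- rbinom(n, 1, plogis(0.5 * somedata$x))

# Do the formula handling once, outside the loop.
X0 <- model.matrix(y ~ x, data = somedata)   # columns "(Intercept)" and "x"
y  <- somedata$y

n.iter <- 5                        # stands in for "hundreds" of iterations
deviances <- numeric(n.iter)
for (i in seq_len(n.iter)) {
  xf <- rnorm(n)                   # placeholder for the real construction of xf
  X  <- cbind(X0, xf = xf)         # append the new column to the fixed design matrix
  fit <- glm.fit(x = X, y = y, family = binomial())
  deviances[i] <- fit$deviance
}
print(fit$coefficients)
print(deviances)
```

One trade-off of this design: glm.fit() returns a plain list rather than an object of class "glm", so methods such as summary() or predict() are not available on fit without further work. Whether that matters depends on what the iteration needs from each fit.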