Thaler, Thorn, LAUSANNE, Applied Mathematics
2011-Aug-02 11:48 UTC
[Rd] update.default: fall back on model.frame in case that the data frame is not in the parent environment
Dear all, Suppose the following code: --------------8<-------------- mm <- function(datf) { lm(y ~ x, data = datf) } mydatf <- data.frame(x = rep(1:2, 10), y = rnorm(20, rep(1:2, 10))) l <- mm(mydatf) -------------->8-------------- If I want to update l now without providing the data argument an error occurs: --------------8<--------------> update(l, . ~ .)Error in inherits(x, "data.frame") : object 'datf' not found -------------->8-------------- and I've to provide the data argument explicitly: --------------8<-------------- update(l, . ~ ., data = mydatf) update(l, . ~ ., data = model.frame(l)) -------------->8-------------- While the first work-around is additionally error prone (what if I change the name of mydatf earlier in the file? In the best case I just get an error if mydatf is not defined), both options are kind of semantically questionable (I do not want to _update_ the data argument of the lm object it should remain untouched). So my suggestion would be that update falls back on the data stored in model.frame in case that the data argument in the lm call cannot be resolved in the parent.frame of update, which can be easily achieved by adding just four lines to update.default: --------------8<-------------- update.default <- function (object, formula., ..., evaluate = TRUE) { call <- object$call if (is.null(call)) stop("need an object with call component") extras <- match.call(expand.dots = FALSE)$... if (!missing(formula.)) call$formula <- update.formula(formula(object), formula.) if (length(extras)) { existing <- !is.na(match(names(extras), names(call))) for (a in names(extras)[existing]) call[[a]] <- extras[[a]] if (any(!existing)) { call <- c(as.list(call), extras[!existing]) call <- as.call(call) } } if (!is.null(call$data)) { if (!exists(as.character(call$data), envir = parent.frame())) call$data <- model.frame(object) } if (evaluate) eval(call, parent.frame()) else call } -------------->8-------------- This is just a quick dirty hack which works fine here (with an ugly drawback that in the standard output of lm I now see the lengthy explicit data.frame statement) but I'm sure there are some cracks out there who could take it over from here and beautify this idea. I don't see any problems with this proposition regarding old code, but if I'm wrong and there are some reasons not to touch update.default in the way I was proposing please let me know. Any other feedback is highly appreciated too. Thanks for sharing your thoughts with me. KR, -Thorn
Duncan Murdoch
2011-Aug-02 13:41 UTC
[Rd] update.default: fall back on model.frame in case that the data frame is not in the parent environment
It looks to me as though your proposal would allow update to remove variables, but would give erroneous results when adding them. For example: mm <- function(datf) { lm(y ~ x, data = datf) } mydatf <- data.frame(x = rep(1:2, 10), y = rnorm(20, rep(1:2, 10)), z = rnorm(20)) l <- mm(mydatf) update(l, . ~ . + z) # This fails, z is not found z <- rnorm(20) update(l, . ~ . + z) # This finds the wrong z, without a warning I'd rather get the "datf not found" error than wrong results. Duncan Murdoch On 02/08/2011 7:48 AM, Thaler, Thorn, LAUSANNE, Applied Mathematics wrote:> Dear all, > > Suppose the following code: > > --------------8<-------------- > mm<- function(datf) { > lm(y ~ x, data = datf) > } > mydatf<- data.frame(x = rep(1:2, 10), y = rnorm(20, rep(1:2, 10))) > > l<- mm(mydatf) > -------------->8-------------- > > If I want to update l now without providing the data argument an error > occurs: > > --------------8<-------------- > > update(l, . ~ .) > Error in inherits(x, "data.frame") : object 'datf' not found > -------------->8-------------- > > and I've to provide the data argument explicitly: > --------------8<-------------- > update(l, . ~ ., data = mydatf) > update(l, . ~ ., data = model.frame(l)) > -------------->8-------------- > > While the first work-around is additionally error prone (what if I > change the name of mydatf earlier in the file? In the best case I just > get an error if mydatf is not defined), both options are kind of > semantically questionable (I do not want to _update_ the data argument > of the lm object it should remain untouched). > > So my suggestion would be that update falls back on the data stored in > model.frame in case that the data argument in the lm call cannot be > resolved in the parent.frame of update, which can be easily achieved by > adding just four lines to update.default: > > --------------8<-------------- > update.default<- function (object, formula., ..., evaluate = TRUE) { > call<- object$call > if (is.null(call)) > stop("need an object with call component") > extras<- match.call(expand.dots = FALSE)$... > if (!missing(formula.)) > call$formula<- update.formula(formula(object), formula.) > if (length(extras)) { > existing<- !is.na(match(names(extras), names(call))) > for (a in names(extras)[existing]) call[[a]]<- extras[[a]] > if (any(!existing)) { > call<- c(as.list(call), extras[!existing]) > call<- as.call(call) > } > } > if (!is.null(call$data)) { > if (!exists(as.character(call$data), envir = parent.frame())) > call$data<- model.frame(object) > } > if (evaluate) > eval(call, parent.frame()) > else call > } > -------------->8-------------- > > This is just a quick dirty hack which works fine here (with an ugly > drawback that in the standard output of lm I now see the lengthy > explicit data.frame statement) but I'm sure there are some cracks out > there who could take it over from here and beautify this idea. > > I don't see any problems with this proposition regarding old code, but > if I'm wrong and there are some reasons not to touch update.default in > the way I was proposing please let me know. Any other feedback is highly > appreciated too. > > Thanks for sharing your thoughts with me. > > KR, > > -Thorn > > ______________________________________________ > R-devel at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-devel