Full_Name: James Signorovitch
Version: 2.2.1
OS: WinXP
Submission from: (NULL) (134.174.182.203)


In the code below, fn1() and fn2() fail with the messages given in the comments.
Strangely, fn2() fails for all data sets I've tried except for those with 100
rows.
The same errors occur if glm() is used in place of lm(), or if R 2.1.1 is used
on a unix system. Thanks for looking into this. JS

fn1 <- function(model, data)
{
    w <- runif(nrow(data));
    print(lm(model, data=data, weights=w));
}

fn2 <- function(model, data)
{
    print(lm(model, data=data, weights=runif(nrow(data))));
}

n = 101;

A <- data.frame(matrix(rnorm(2*n), n, 2));
names(A) <- c("x", "y");

# we can run the command
print(lm(y ~ x, data=A, weights=runif(nrow(A))));

# But fn1() generates the error message
#
# Error in eval(expr, envir, enclos) : object "w" not found
#

fn1(y ~ x, data=A);

# fn2() generates the error message:
# Error in model.frame(formula, rownames, variables, varnames, extras, extranames, :
#        variable lengths differ
#
#
# But fn2() works if n=100

fn2(y ~ x, data=A);

# note: the error in fn1() still occurs if w <- 1:nrow(data);

jsignoro at hsph.harvard.edu writes:

> Full_Name: James Signorovitch
> Version: 2.2.1
> OS: WinXP
> Submission from: (NULL) (134.174.182.203)
>
>
> In the code below, fn1() and fn2() fail with the messages given in the comments.
> Strangely, fn2() fails for all data sets I've tried except for those with 100
> rows.
> The same errors occur if glm() is used in place of lm(), or if R 2.1.1 is used
> on a unix system. Thanks for looking into this. JS
>
> fn1 <- function(model, data)
> {
>     w <- runif(nrow(data));
>     print(lm(model, data=data, weights=w));
> }
>
> fn2 <- function(model, data)
> {
>     print(lm(model, data=data, weights=runif(nrow(data))));
> }
>
> n = 101;
>
> A <- data.frame(matrix(rnorm(2*n), n, 2));
> names(A) <- c("x", "y");
>
> # we can run the command
> print(lm(y ~ x, data=A, weights=runif(nrow(A))));
>
> # But fn1() generates the error message
> #
> # Error in eval(expr, envir, enclos) : object "w" not found
> #
>
> fn1(y ~ x, data=A);
>
> # fn2() generates the error message:
> # Error in model.frame(formula, rownames, variables, varnames, extras,
> extranames, :
> #        variable lengths differ
> #
> #
> # But fn2() works if n=100
>
> fn2(y ~ x, data=A);
>
> # note: the error in fn1() still occurs if w <- 1:nrow(data);
>

This is due to the concept of a model environment (which does not
contain w). This is a deliberate design, not a bug. The rationale is a
bit convoluted, but it ensures that

w <- runif(nrow(A))
fn1(y ~ x + w, data=A)

picks up w from the global environment, not the evaluation frame of
fn1, and the convention is that weight, subset, and offset arguments
are picked from the same environment as the variables in the formula.

--
   O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark          Ph:  (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)                  FAX: (+45) 35327907
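
A minimal sketch of the lookup rule described above, assuming the code is run at top level in a fresh R session with no object called `w` already defined. The names `f`, `res` and `fit` are introduced here for illustration only and are not part of the original report. It shows fn1() failing while no `w` exists where the formula was created, and then silently using an outer `w` (rather than its own local one) once such an object exists.

```r
set.seed(1)
n <- 101
A <- data.frame(x = rnorm(n), y = rnorm(n))

fn1 <- function(model, data) {
  w <- runif(nrow(data))               # lives only in fn1's evaluation frame
  lm(model, data = data, weights = w)  # `w` is looked up in `data`, then in
}                                      # the formula's environment, not here

# The formula is created in the calling environment, so that is its environment.
f <- y ~ x

# No `w` in A or in the formula's environment: the call fails.
res <- tryCatch(fn1(f, data = A), error = function(e) conditionMessage(e))
print(res)

# Define `w` where the formula was created: fn1() now runs, but it uses
# this outer `w`, not the runif() values computed inside fn1.
w <- rep(2, n)
fit <- fn1(f, data = A)
print(all(weights(fit) == 2))
```

The second half is the case that the design is meant to keep consistent: weights are found in the same place as the formula's variables, which may not be the place a caller expects.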

On Thu, 22 Jun 2006, jsignoro at hsph.harvard.edu wrote:

>
> In the code below, fn1() and fn2() fail with the messages given in the comments.
> Strangely, fn2() fails for all data sets I've tried except for those with 100
> rows.

<snip>

> fn1 <- function(model, data)
> {
>     w <- runif(nrow(data));
>     print(lm(model, data=data, weights=w));
> }
>
> fn2 <- function(model, data)
> {
>     print(lm(model, data=data, weights=runif(nrow(data))));
> }

This is the result of an interaction between a (IMO bad) design choice
when lm and glm were first introduced in S-PLUS and a (IMO good) design
choice more recently in R.

The bad design choice was that

   lm(model, data=data, weights=w)

is interpreted more like

   lm(model, data=data, weights=~w)

That is, as far as you can see from the outside, weights=w appears to be an
ordinary argument passed by value but it is interpreted as if it were a
reference by name to the data= argument.

This still wouldn't be too bad, except that if there is no element of
data= called "w", lm() looks further. In S-PLUS it looks in the calling
frame and then in the global workspace. In R it looks at the environment
where the formula was defined. Neither of these is necessarily what you
expect, but people expect a wide range of incompatible things, so this
isn't decisive.

There are at least two ways to get the result you want. The simpler and
cruder way is to make w a column of the data frame. This is inefficient
in memory if data is very large, and requires that you use a name that
doesn't conflict with any variable that you already want in the model, eg.

   data$".weights."<-runif(nrow(data))
   lm(model, data=data,weights=.weights.)

The other approach is to set the environment of the formula to be the
current environment. This will work as long as the formula doesn't refer
to any variables in its original environment

   environment(model)<-environment()
   w<-runif(nrow(data))
   lm(model,data=data, weights=w)

> # But fn2() works if n=100

No, it just looks as though it does. I suspect you have a data frame
called data, with 100 rows, in your workspace. In a clean copy of R I get

> fn2(y ~ x, data=A);
Error in runif(n, min, max) : invalid arguments

 	-thomas

Thomas Lumley			Assoc. Professor, Biostatistics
tlumley at u.washington.edu	University of Washington, Seattle
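
The two workarounds above can be sketched as complete functions. This is only an illustration under assumptions: the function names `fn1_column` and `fn1_env` are hypothetical, the models are returned rather than printed, and the test data mirror the original report.

```r
set.seed(1)
n <- 101
A <- data.frame(x = rnorm(n), y = rnorm(n))

# Workaround 1: store the weights as a column of the data frame, so the
# lookup succeeds in `data` itself. The column name must not clash with
# any variable wanted in the model.
fn1_column <- function(model, data) {
  data$.weights. <- runif(nrow(data))
  lm(model, data = data, weights = .weights.)
}

# Workaround 2: point the formula's environment at this function's frame,
# so `w` is found there. This only changes the local copy of `model`, and
# it breaks formulas that rely on variables from their original environment.
fn1_env <- function(model, data) {
  environment(model) <- environment()
  w <- runif(nrow(data))
  lm(model, data = data, weights = w)
}

fit1 <- fn1_column(y ~ x, data = A)
fit2 <- fn1_env(y ~ x, data = A)

# Both fits carry one weight per row of A.
print(c(length(weights(fit1)), length(weights(fit2))))
```

Which one to prefer depends on the trade-off already described: the column approach copies the data frame, while the environment approach assumes every variable in the formula lives in `data`.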