Ingmar Visser
2006-Mar-23 15:27 UTC
[R] invalid variable type in model.frame within a function
Dear expeRts,

I came across the following error in using model.frame:

# make a data.frame
jet=data.frame(y=rnorm(10),x1=rnorm(10),x2=rnorm(10),rvar=rnorm(10))
# spec of formula
mf1=y~x1+x2
# make the model.frame
mf=model.frame(formula=mf1,data=jet,weights=rvar)

Which gives the desired output:

> mf
            y         x1         x2  (weights)
1   0.8041254  0.1815366  0.4999551  1.4957814
2  -0.2546224  1.9368141 -2.2373186  0.7579341
3   0.8627935 -0.6690416  1.3948077 -0.2107092
4   0.3951245  0.5733776 -1.2926074 -0.3289226
5  -1.4805766 -0.6113256  1.1635959  0.2300376
6  -0.7418800 -0.1610305  0.4057340 -0.2280754
7  -1.1420962 -0.9363492 -0.4811192 -0.9258711
8   0.3507427  1.8744646  1.3227931  0.5292313
9   1.4196519  0.1340283 -1.3970614 -0.7189726
10 -1.0164708 -0.2044681 -0.6825873 -0.1719102

However, doing this inside another function like this:

makemodelframe <- function(formula,data,weights) {
  mf=model.frame(formula=formula,data=data,weights=weights)
  mf
}

produces the following error:

> makemodelframe(mf1,jet,weights=rvar)
Error in model.frame(formula, rownames, variables, varnames, extras, extranames, :
  invalid variable type

Searching the R-help archives I came across bug reports about this but couldn't figure out whether the bug was solved or whether there are work-arounds available.

platform:
platform powerpc-apple-darwin7.9.0
arch     powerpc
os       darwin7.9.0
system   powerpc, darwin7.9.0
status
major    2
minor    2.1
year     2005
month    12
day      20
svn rev  36812
language R

Any hints are welcome,
Best, Ingmar
Gabor Grothendieck
2006-Mar-23 16:46 UTC
[R] invalid variable type in model.frame within a function
Examine the source code of lm to determine the proper way of doing this. Following that gives:

makemodelframe <- function(formula,data,weights) {
  mf <- match.call()
  mf[[1]] <- as.name("model.frame")
  eval(mf, parent.frame())
}
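A minimal sketch of how the match.call() approach above behaves, assuming the same jet and mf1 objects as in the original post (set.seed() is added here only to make the run repeatable). The wrapper rewrites its own call into a call to model.frame and evaluates it in the caller's frame, so rvar is never evaluated as an ordinary argument and is instead looked up in data by model.frame itself.

```r
# Same data and formula as in the original post; the seed is an addition.
set.seed(1)
jet <- data.frame(y = rnorm(10), x1 = rnorm(10), x2 = rnorm(10), rvar = rnorm(10))
mf1 <- y ~ x1 + x2

makemodelframe <- function(formula, data, weights) {
  mf <- match.call()                  # the call as the user wrote it
  mf[[1]] <- as.name("model.frame")   # swap the function being called
  eval(mf, parent.frame())            # evaluate it where the user called us
}

mf <- makemodelframe(mf1, jet, weights = rvar)
names(mf)   # includes a "(weights)" column taken from jet$rvar
```

Note that this only works when formula and data can be found from the caller's frame under the names used in the call, which is the usual situation for interactive use.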
Thomas Lumley
2006-Mar-23 17:17 UTC
[R] invalid variable type in model.frame within a function
On Thu, 23 Mar 2006, Ingmar Visser wrote:

> Dear expeRts,
>
> I came across the following error in using model.frame:
>
> # make a data.frame
> jet=data.frame(y=rnorm(10),x1=rnorm(10),x2=rnorm(10),rvar=rnorm(10))
> # spec of formula
> mf1=y~x1+x2
> # make the model.frame
> mf=model.frame(formula=mf1,data=jet,weights=rvar)
>
> Which gives the desired output:

<output snipped>

> However, doing this inside another function like this:
>
> makemodelframe <- function(formula,data,weights) {
>   mf=model.frame(formula=formula,data=data,weights=weights)
>   mf
> }
>
> produces the following error:
>
>> makemodelframe(mf1,jet,weights=rvar)
> Error in model.frame(formula, rownames, variables, varnames, extras,
> extranames, :
>   invalid variable type
>
> Searching the R-help archives I came across bug reports about this but
> couldn't figure out whether the bug was solved or whether there are
> work-arounds available.

It's not a bug. There have been bug reports about related issues (and also about this issue, but they tend to be marked "not a bug").

If you think about it, how could

  makemodelframe(mf1,jet,weights=rvar)

possibly work? R passes variables by value, so rvar has to be evaluated before the function is called. But rvar is not the name of any global variable (it's just a column in a data frame), so how can R know where to look?

The reason that people think it might work is by analogy with model.frame and the regression commands, where

  model.frame(y~x, data=d, weights=w)

does somehow retrieve d$w as the weight. This analogy tends to override programming common sense and make people believe that R will somehow know where to find the weights.

Now, since model.frame() *does* manage to find the weights, it must be possible, and it is. That doesn't make it a good idea, though. Regression commands and model.frame() do some fairly advanced trickery to make it work. This is documented on developer.r-project.org. I don't think it's a good idea for people to write code like this.
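The failure described above can be reproduced with a short sketch. The explanation in the comments is my reading of model.frame.default and should be treated as an assumption: the extra arguments are captured unevaluated and then evaluated in data and, failing that, in the formula's environment. Inside the wrapper the captured expression is the symbol weights rather than rvar, and when no object of that name exists closer by, the lookup ends at the function stats::weights, which is not a valid variable. The exact wording of the error message may differ between R versions.

```r
set.seed(1)
jet <- data.frame(y = rnorm(10), x1 = rnorm(10), x2 = rnorm(10), rvar = rnorm(10))
mf1 <- y ~ x1 + x2

# The wrapper from the original post.
makemodelframe <- function(formula, data, weights) {
  model.frame(formula = formula, data = data, weights = weights)
}

# Capture the error instead of stopping the script.
res <- tryCatch(makemodelframe(mf1, jet, weights = rvar),
                error = function(e) e)
inherits(res, "error")

# What the symbol `weights` resolves to when looked up in jet and then
# in the formula's environment (assuming stats is attached, the default).
found <- eval(quote(weights), jet, environment(mf1))
is.function(found)
```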
I should admit (especially since it's Lent at the moment, and so is an appropriate time to repent one's past errors) that I lobbied Ross and Robert to make model.frame() work compatibly with S-PLUS in its treatment of weights= arguments (when porting the survival package, nearly ten years ago). They were reluctant at the time, and I now think they were right, although this level of S-PLUS compatibility might have been unavoidable.

I would advise writing your code so that the call looks like

  makemodelframe(mf1,jet,weights=~rvar)

That is, pass all the variables that are going to be evaluated in the data= argument as formulas (or as quoted expressions). This is basically what lme() does, where you supply two formulas and then various other bits and pieces as objects. It is what my survey package does.

Then a user can do

  makemodelframe(mf1,jet,weights=rvar)

if rvar is a variable in the current environment and

  makemodelframe(mf1,jet,weights=~rvar)

if rvar is a variable in the data= argument, and both will work.

There is some discussion of this in a note on "Nonstandard evaluation" on the developer.r-project.org webpage, including a function that will produce a single model frame from multiple formulas.

Now, I think there are some exceptions to this recommendation, and I don't have a very clear definition of them. I think of them as "macro-like" functions that evaluate a supplied expression in some special context. Functions like this in base R include with() and capture.output(), and you will find some more nice simple examples in the mitools package. For these functions it really isn't ambiguous where the evaluation takes place.

A related issue is functions such as the plot() methods that use the unevaluated forms of their arguments as labels. Again, the evaluation of the labels isn't ambiguous, because it doesn't even happen.

With a few exceptions like these, though, I think it's a bad idea to subvert the pass-by-value illusion in R.
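A minimal sketch of the formula-based interface recommended above, under stated assumptions: this is not the code used in lme() or the survey package, and the use of do.call() to hand the already-evaluated weights to model.frame is my own choice for the illustration. A one-sided formula such as ~rvar is evaluated in data (falling back to the formula's environment); anything else is treated as an ordinary value from the caller's environment.

```r
set.seed(1)
jet <- data.frame(y = rnorm(10), x1 = rnorm(10), x2 = rnorm(10), rvar = rnorm(10))
mf1 <- y ~ x1 + x2

makemodelframe <- function(formula, data, weights = NULL) {
  if (inherits(weights, "formula")) {
    # For a one-sided formula ~rvar, element 2 is the right-hand side.
    weights <- eval(weights[[2]], data, environment(weights))
  }
  args <- list(formula = formula, data = data)
  if (!is.null(weights)) args$weights <- weights
  # do.call() inserts the evaluated weights into the call, so model.frame
  # does not have to look anything up by name.
  do.call(stats::model.frame, args)
}

w <- runif(10)                                         # a vector in the calling environment
mf_data <- makemodelframe(mf1, jet, weights = ~rvar)   # column of data
mf_env  <- makemodelframe(mf1, jet, weights = w)       # ordinary variable
```

Both calls follow ordinary evaluation rules: the argument is evaluated once, in the caller, and the formula makes explicit that rvar should be looked up in data.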
This was a lot more than you probably wanted to know, but the alternative answer is the traditional

  "Doctor, it hurts when I do this"
  "Don't do that, then"

-thomas