I want to write a function to take an argument as the response
variable of a linear model, e.g. to do anova's across a list of
variables, something like the following (except, of course, this
doesn't work):

  function(x) { anova(lm(x ~ my.factor,data=my.data)) }

The x in lm() above is getting evaluated at the wrong level.  How
can I make this work?

-- 
Russell Senior         ``The two chiefs turned to each other.
seniorr at aracnet.com   Bellison uncorked a flood of horrible
                         profanity, which, translated meant,
                         `This is extremely unusual.' ''
-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !)  To: r-help-request at stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
On 07/09/02 03:48, Russell Senior wrote:
>
>I want to write a function to take an argument as the response
>variable of a linear model, e.g. to do anova's across a list of
>variables, something like the following (except, of course, this
>doesn't work):
>
>  function(x) { anova(lm(x ~ my.factor,data=my.data)) }
>
>The x in lm() above is getting evaluated at the wrong level.  How
>can I make this work?

I don't know.  But what I do instead is this:

my.coeffs <- matrix(NA,my.number,3) # set up empty matrix, one row per fit
for (i in 1:my.number) {
  # row i holds the three slope coefficients of the i-th model
  my.coeffs[i,] <- as.numeric(try(coef(lm(Y[i]~X1[i]+X2[i]+X3[i]))[2:4]))}

The "try()" is in case it doesn't work for a given value of i
(e.g., too many missing data), and that makes "as.numeric()"
necessary.  The coef() extracts the coefficients, excluding the
constant, which is all I was interested in, but you could store
the whole thing if you use a list instead of my.coeffs.

I would like to know how to do it with a function.

Jon
-- 
Jonathan Baron, Professor of Psychology, University of Pennsylvania
R page:  http://finzi.psych.upenn.edu/
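
[Editorial sketch] The remark above about storing the whole fit in a
list can be illustrated roughly as follows.  This is only a sketch
under stated assumptions: the data frame, the column names y1, y2, y3
and the object names responses, fits, ok and coefs are all made up for
the example and do not come from the original posts.

```r
# Hypothetical data: three response columns and one factor
set.seed(1)
my.data <- data.frame(my.factor = gl(3, 8),
                      y1 = rnorm(24), y2 = rnorm(24), y3 = rnorm(24))
responses <- c("y1", "y2", "y3")

# One list slot per response; each slot ends up holding either the
# fitted model or a "try-error" object if the fit failed
fits <- vector("list", length(responses))
names(fits) <- responses
for (i in seq_along(responses)) {
  y <- my.data[[responses[i]]]   # pull the column out as a plain vector
  fits[[i]] <- try(lm(y ~ my.factor, data = my.data), silent = TRUE)
}

# Keep only the fits that succeeded, then extract what is needed
ok <- !sapply(fits, inherits, what = "try-error")
coefs <- t(sapply(fits[ok], coef))   # one row of coefficients per response
```

Because the whole lm object is kept, anova(), summary() or residuals()
can be applied to fits[ok] later without refitting.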
In response to a query from Russell Senior, Jon Baron writes:

> I would like to know how to do it with a function.

It's not too bad, but a little trickier than most people would
expect.  See the Programmer's Niche article in the latest edition of
R News for a general strategy for handling this kind of problem (at
the risk of touting my own stuff).

Bill Venables
On 9 Jul 2002, Russell Senior wrote:
>
> I want to write a function to take an argument as the response
> variable of a linear model, e.g. to do anova's across a list of
> variables, something like the following (except, of course, this
> doesn't work):
>
>   function(x) { anova(lm(x ~ my.factor,data=my.data)) }
>
> The x in lm() above is getting evaluated at the wrong level.  How
> can I make this work?
>

When I tried this first I thought it did work.  It provides a nice
example of why R does things this way (as well as why it's useful to
give an example in help questions).

Consider a data frame

  my.data<-data.frame(a=rep(0:1,12),b=rep(0:2,8),y=rnorm(24))

and

  f<-function(x) { anova(lm(x ~ a+b,data=my.data)) }

If we generate a new response vector (perhaps for simulations) and
want to regress it on a and b then the function above works nicely

> z<-rnorm(24)
> f(z)
Analysis of Variance Table

Response: x
          Df  Sum Sq Mean Sq F value Pr(>F)
a          1  0.0086  0.0086  0.0078 0.9305
b          1  0.1013  0.1013  0.0921 0.7645
Residuals 21 23.0914  1.0996

Here the response variable is the value of z.

Presumably the reason it `doesn't work' is that the question was
different.  If we want to specify a column in `my.data' as the
response variable we need to pass in the name of the variable and
somehow get that name into the formula.  This can be done with
substitute(), as discussed in more detail in the R Newsletter.  This
case is fairly simple and we can use

  g<-function(x) {
    ff<-eval(substitute(x~a+b))
    anova(lm(ff,data=my.data))
  }

> g(y)
Analysis of Variance Table

Response: y
          Df  Sum Sq Mean Sq F value Pr(>F)
a          1  2.1294  2.1294  2.3335 0.1415
b          1  0.1537  0.1537  0.1685 0.6856
Residuals 21 19.1634  0.9125

which is the correct answer as it matches

> anova(lm(y~a+b,data=my.data))
Analysis of Variance Table

Response: y
          Df  Sum Sq Mean Sq F value Pr(>F)
a          1  2.1294  2.1294  2.3335 0.1415
b          1  0.1537  0.1537  0.1685 0.6856
Residuals 21 19.1634  0.9125

	-thomas

Thomas Lumley			Asst. Professor, Biostatistics
tlumley at u.washington.edu	University of Washington, Seattle
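
[Editorial sketch] For the original goal of running the anova across
a list of variables, one possible variation is to pass each column
name as a character string and build the formula with reformulate()
from the stats package.  This is a different technique from the
substitute() approach above, not a restatement of it.  The helper
name anova_by_name, the extra column z and the object name tables are
hypothetical, introduced only for this illustration.

```r
# Data frame as in the message above, plus a hypothetical second response z
set.seed(1)
my.data <- data.frame(a = rep(0:1, 12), b = rep(0:2, 8),
                      y = rnorm(24), z = rnorm(24))

# Hypothetical helper: 'response' is a column name given as a string
anova_by_name <- function(response, data) {
  # reformulate() builds the formula  <response> ~ a + b  from strings
  ff <- reformulate(c("a", "b"), response = response)
  anova(lm(ff, data = data))
}

# One anova table per response variable, named after the variable
tables <- lapply(c(y = "y", z = "z"), anova_by_name, data = my.data)
```

Since the formula is built from the name, each table is headed by the
actual column name (Response: y, Response: z) rather than by the name
of the function argument.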