Hello, everybody.

I'm putting together some lecture notes and course exercises on R
programming. My plan is to pick some R packages, ask students to read
through the code and see why things work, and maybe make some changes.
As I look for examples, I keep running up against the problem that
packages use coding idioms that are unfamiliar to me.

A difficult thing for me is explaining the scope of variables in R
functions. When should we pass an object to a function, and when should
we let the R system search about for an object? I've been puzzling
through ?environment for quite a while.

Here's an example from one of the packages that I like, called "ltm".
In the function "ltm.fit", the work of calculating estimates is sent to
different functions like "EM", "loglikltm", and "scoreltm". Before
that, this is used:

    environment(EM) <- environment(loglikltm) <- environment(scoreltm) <- environment()

    ## and then EM is called
    res.EM <- EM(betas, constraint, control$iter.em, control$verbose)

I want to make sure I understand this. The environment line gets the
current environment and then assigns it to those 3 functions, right?
All variables and functions that can be accessed from the current
position in the code become available to the functions EM, loglikltm,
and scoreltm.

So, which options should be explicitly inserted into a function call,
and which should be left in the environment for R to find when it needs
them?

1. I *think* that when EM is called, the variables "betas",
"constraint", and "control" are already in the environment.

The EM function is declared like this, using the same words "betas" and
"constraint":

    EM <-
    function (betas, constraint, iter, verbose = FALSE) {

It seems to me that if I wrote the function call like this (leaving out
"betas" and "constraint")

    res.EM <- EM(control$iter.em, control$verbose)

R would run EM and go find "betas" and "constraint" in the environment,
so there was no need to name them as arguments.

2. Is a function like EM allowed to alter objects that it finds through
the environment, ones that are not passed as arguments? I understand
that a function cannot alter an object that is passed explicitly, but
what about the ones it grabs from the environment?

If you have ideas about packages that might be handy teaching
examples, please let me know.

pj
-- 
Paul E. Johnson
Professor, Political Science
1541 Lilac Lane, Room 504
University of Kansas

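A minimal sketch of question 1, using hypothetical stand-in functions (with_formal and without_formal are made up for illustration and are not the ltm code). It suggests that a name declared as a formal argument is not looked up in the enclosing environment when the argument is omitted; the lookup only happens for names that are not formals at all:

```r
# Hypothetical stand-in data; not taken from ltm.
betas <- c(0.5, 1.5)

# 'betas' is a formal argument here, so inside the function the name
# refers to the (possibly missing) argument, not to the outer variable.
with_formal <- function(betas, iter) sum(betas) * iter

# 'betas' is a free variable here, so R looks it up in the enclosure.
without_formal <- function(iter) sum(betas) * iter

# Omitting a formal that has no default is an error once it is used.
err <- tryCatch(with_formal(iter = 2),
                error = function(e) conditionMessage(e))

# The free variable is found in the enclosing environment: 2 * 2 = 4.
ok <- without_formal(iter = 2)
```

Note also that an unnamed call such as EM(control$iter.em, control$verbose) would be matched by position, so the two values would be bound to the first two formals rather than to iter and verbose.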
On Dec 26, 2010, at 22:30, Paul Johnson wrote:

> Hello, everybody.
>
> I'm putting together some lecture notes and course exercises on R
> programming. My plan is to pick some R packages, ask students to read
> through code and see why things work, maybe make some changes. As I
> look for examples, I'm running up against the problem that packages
> use coding idioms that are unfamiliar to me.
>
> A difficult thing for me is explaining scope of variables in R
> functions. When should we pass an object to a function, when should
> we let the R system search about for an object? I've been puzzling
> through ?environment for quite a while.
>
> Here's an example from one of the packages that I like, called "ltm".
> In the function "ltm.fit" the work of calculating estimates is sent to
> different functions like "EM" and "loglikltm" and "scoreltm". Before
> that, this is used:
>
> environment(EM) <- environment(loglikltm) <- environment(scoreltm) <-
> environment()
>
> ##and then EM is called
> res.EM <- EM(betas, constraint, control$iter.em, control$verbose)
>
> I want to make sure I understand this. The environment line gets the
> current environment and then assigns it for those 3 functions, right?
> All variables and functions that can be accessed from the current
> position in the code become available to function EM, loglikltm,
> scoreltm.

Yes. I'm pretty sure that the net effect is the same as redefining the
three functions inside the current function. I.e.,

    g <- function(fee){fee+fie(fum)}
    f <- function(foo){
      environment(g) <- environment()
      fum <- 3.14
      g(foo)
    }

is equivalent to

    g <- function(fee){fee+fie(fum)}
    f <- function(foo){
      g <- function(fee){fee+fie(fum)}
      fum <- 3.14
      g(foo)
    }

since a local copy must be created before the environment of g can be
changed.

> So, which options should be explicitly inserted into a function call,
> which should be left in the environment for R to find when it needs
> them?

First of all, those are arguments, not options. Arguments can be
optional (when there is a default, mostly), but that is something else.
Options are set with, say, options(width=60).

> 1. I *think* that when EM is called, the variables "betas",
> "constraint", and "control" are already in the environment.
>
> The EM function is declared like this, using the same words "betas" and
> "constraint"
>
> EM <-
> function (betas, constraint, iter, verbose = FALSE) {
>
> It seems to me that if I wrote the function call like this (leave out
> "betas" and "constraint")
>
> res.EM <- EM(control$iter.em, control$verbose)
>
> R will run EM and go find "betas" and "constraint" in the environment,
> there was no need to name them as arguments.

Well, only if the call is always EM(betas, constraint, ....). They
could on occasion be matched to something else.

> 2 Is a function like EM allowed to alter objects that it finds through
> the environment, ones that are not passed as arguments? I understand
> that a function cannot alter an object that is passed explicitly, but
> what about the ones it grabs from the environment?

You are "allowed" to alter anything that you can find. Sometimes it is
just a very bad idea and/or bad programming style...

The superassignment operator "<<-" was explicitly designed to allow
modification of objects in the lexical scope of a function, so at least
in some cases it must be considered good style to use it (examples can
be found in the paper by Ihaka and Gentleman on lexical scope, 1996
IIRC). However, some care must be taken; in particular, if you don't
make sure that the object already exists in the appropriate
environment, another object of the same name might get clobbered, e.g.
in the global environment.

Best,
-pd

(& thanks for that KU t-shirt, by the way!)

> If you have ideas about packages that might be handy teaching
> examples, please let me know.
>
> pj
> --
> Paul E. Johnson
> Professor, Political Science
> 1541 Lilac Lane, Room 504
> University of Kansas
>
> ______________________________________________
> R-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel

-- 
Peter Dalgaard
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd.mes at cbs.dk  Priv: PDalgd at gmail.com

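The f/g example above can be made runnable with one addition. This is only a sketch: fie() is not defined anywhere in the thread, so a hypothetical stand-in is supplied here, and defining_env is a name introduced just to record where the top-level g was created.

```r
# Hypothetical helper standing in for the undefined fie() in the example.
fie <- function(x) 2 * x

# Record the environment in which the top-level functions are defined.
defining_env <- environment()

# Defined at top level, so its enclosure is the defining environment.
g <- function(fee) fee + fie(fum)

f <- function(foo) {
  # This assignment makes a local copy of g inside f's frame and sets
  # that copy's enclosure to the frame itself.
  environment(g) <- environment()
  fum <- 3.14
  g(foo)   # the local g finds fum in f's frame
}

result <- f(1)   # 1 + 2 * 3.14 = 7.28

# The top-level g still has its original enclosure.
top_level_unchanged <- identical(environment(g), defining_env)
```

Calling f(1) works even though fum exists only inside f, while the g defined at top level keeps its original environment, which fits the remark that a local copy is created before the environment is changed.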
On 10-12-26 4:30 PM, Paul Johnson wrote:

> Hello, everybody.
>
> I'm putting together some lecture notes and course exercises on R
> programming. My plan is to pick some R packages, ask students to read
> through code and see why things work, maybe make some changes. As I
> look for examples, I'm running up against the problem that packages
> use coding idioms that are unfamiliar to me.
>
> A difficult thing for me is explaining scope of variables in R
> functions. When should we pass an object to a function, when should
> we let the R system search about for an object? I've been puzzling
> through ?environment for quite a while.

Take a look at the Language Definition, not just the ?environment page.

> Here's an example from one of the packages that I like, called "ltm".
> In the function "ltm.fit" the work of calculating estimates is sent to
> different functions like "EM" and "loglikltm" and "scoreltm". Before
> that, this is used:
>
> environment(EM) <- environment(loglikltm) <- environment(scoreltm) <-
> environment()
>
> ##and then EM is called
> res.EM <- EM(betas, constraint, control$iter.em, control$verbose)
>
> I want to make sure I understand this. The environment line gets the
> current environment and then assigns it for those 3 functions, right?
> All variables and functions that can be accessed from the current
> position in the code become available to function EM, loglikltm,
> scoreltm.

That's one way to think of it, but it is slightly more accurate to say
that three new functions are created, whose associated environments are
set to the current environment.

> So, which options should be explicitly inserted into a function call,
> which should be left in the environment for R to find when it needs
> them?

That's a matter of style. I would say that it is usually better style
not to mess around with a function's environment.

> 1. I *think* that when EM is called, the variables "betas",
> "constraint", and "control" are already in the environment.

That need not be true, as long as they are in the environment by the
time EM, loglikltm, and scoreltm are called.

> The EM function is declared like this, using the same words "betas" and
> "constraint"
>
> EM <-
> function (betas, constraint, iter, verbose = FALSE) {
>
> It seems to me that if I wrote the function call like this (leave out
> "betas" and "constraint")
>
> res.EM <- EM(control$iter.em, control$verbose)
>
> R will run EM and go find "betas" and "constraint" in the environment,
> there was no need to name them as arguments.

Including them as arguments means that new local copies will be created
in the evaluation frame.

> 2 Is a function like EM allowed to alter objects that it finds through
> the environment, ones that are not passed as arguments? I understand
> that a function cannot alter an object that is passed explicitly, but
> what about the ones it grabs from the environment?

Yes, it's allowed, but the usual rules of assignment won't do it. Read
about the <<- operator for modifying things that are not local. In
summary:

    beta <- 1

creates or modifies a local variable, while

    beta <<- 1

goes looking for beta in the enclosing environments and modifies the
first one it finds. If it fails to find one, it creates one in the
global environment.

Duncan Murdoch

> If you have ideas about packages that might be handy teaching
> examples, please let me know.
>
> pj
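The difference between <- and <<- summarised above can be sketched with a small closure. All names here (make_counter, bump_local, bump_super, stray, stray_value) are hypothetical and chosen only for illustration:

```r
make_counter <- function() {
  count <- 0                    # lives in make_counter's frame
  bump_local <- function() {
    count <- count + 1          # creates a new local 'count' only
    count
  }
  bump_super <- function() {
    count <<- count + 1         # modifies 'count' in the enclosing frame
    count
  }
  list(bump_local = bump_local,
       bump_super = bump_super,
       current    = function() count)
}

ctr <- make_counter()
local_value  <- ctr$bump_local()   # 1, but only inside bump_local
after_local  <- ctr$current()      # 0, the enclosing count is untouched
first_super  <- ctr$bump_super()
second_super <- ctr$bump_super()
after_super  <- ctr$current()      # 2, the enclosing count was modified

# If no existing variable is found, <<- assigns in the global environment.
stray <- function() {
  stray_value <<- 42
  invisible(NULL)
}
stray()
created_in_global <- exists("stray_value", envir = globalenv(),
                            inherits = FALSE)
```

The last part is the case both replies warn about: because stray_value did not exist anywhere in the enclosing environments, the assignment silently lands in the global environment.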