thr3ads.net - R devel - [Rd] environment question [Dec 2010]


Paul Johnson

2010-Dec-26 21:30 UTC

[Rd] environment question

Hello, everybody.

I'm putting together some lecture notes and course exercises on R
programming.  My plan is to pick some R packages, ask students to read
through code and see why things work, maybe make some changes.  As I
look for examples, I'm running up against the problem that packages
use coding idioms that are unfamiliar to me.

A difficult thing for me is explaining scope of variables in R
functions.  When should we pass an object to a function, when should
we let the R system search about for an object?  I've been puzzling
through ?environment for quite a while.

Here's an example from one of the packages that I like, called "ltm".
In the function "ltm.fit" the work of calculating estimates is sent to
different functions like "EM" and "loglikltm" and "scoreltm".  Before
that, this is used:

environment(EM) <- environment(loglikltm) <- environment(scoreltm) <-
environment()

##and then EM is called
res.EM <- EM(betas, constraint, control$iter.em, control$verbose)

I want to make sure I understand this. The environment line gets the
current environment and then assigns it for those 3 functions, right?
All variables and functions that can be accessed from the current
position in the code become available to function EM, loglikltm,
scoreltm.

So, which options should be explicitly inserted into a function call,
which should be left in the environment for R to find when it needs
them?

1. I *think* that when EM is called, the variables "betas",
"constraint", and "control" are already in the environment.

The EM function is declared like this, using the same names "betas" and
"constraint":

EM <-
function (betas, constraint, iter, verbose = FALSE) {

It seems to me that if I wrote the function call like this (leave out
"betas" and "constraint")

res.EM <- EM(control$iter.em, control$verbose)

R will run EM and go find "betas" and "constraint" in the environment,
so there would be no need to pass them as arguments.


2. Is a function like EM allowed to alter objects that it finds through
the environment, ones that are not passed as arguments? I understand
that a function cannot alter an object that is passed explicitly, but
what about the ones it grabs from the environment?

If you have ideas about packages that might be handy teaching
examples, please let me know.

pj
-- 
Paul E. Johnson
Professor, Political Science
1541 Lilac Lane, Room 504
University of Kansas

peter dalgaard

2010-Dec-26 22:39 UTC


[Rd] environment question

On Dec 26, 2010, at 22:30 , Paul Johnson wrote:
> Hello, everybody.
> 
> I'm putting together some lecture notes and course exercises on R
> programming.  My plan is to pick some R packages, ask students to read
> through code and see why things work, maybe make some changes.  As I
> look for examples, I'm running up against the problem that packages
> use coding idioms that are unfamiliar to me.
> 
> A difficult thing for me is explaining scope of variables in R
> functions.  When should we pass an object to a function, when should
> we let the R system search about for an object?  I've been puzzling
> through ?environment for quite a while.
> 
> Here's an example from one of the packages that I like, called "ltm".
> In the function "ltm.fit" the work of calculating estimates is sent to
> different functions like "EM" and "loglikltm" and "scoreltm".  Before
> that, this is used:
> 
> environment(EM) <- environment(loglikltm) <- environment(scoreltm) <-
> environment()
> 
> ##and then EM is called
> res.EM <- EM(betas, constraint, control$iter.em, control$verbose)
> 
> I want to make sure I understand this. The environment line gets the
> current environment and then assigns it for those 3 functions, right?
> All variables and functions that can be accessed from the current
> position in the code become available to function EM, loglikltm,
> scoreltm.
Yes. I'm pretty sure that the net effect is the same as redefining the three
functions inside the current function. I.e.

g <- function(fee){fee+fie(fum)}
f <- function(foo){
  environment(g) <- environment()
  fum <- 3.14
  g(foo)
}

is equivalent to

g <- function(fee){fee+fie(fum)}
f <- function(foo){
  g <- function(fee){fee+fie(fum)}
  fum <- 3.14
  g(foo)
}

since a local copy must be created before the environment of g can be changed.
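
A runnable sketch of the first form may help when trying this out. It assumes a hypothetical stand-in for fie (which is left undefined in the example above) and only checks that the copy of g made inside f sees fum, while the original g keeps its own environment:

```r
# Hypothetical stand-in for `fie`, which the example above leaves undefined.
fie <- function(x) 2 * x

g <- function(fee) fee + fie(fum)
defining_env <- environment(g)       # where g was originally defined

f <- function(foo) {
  # The replacement call creates a local copy of g in f's frame and sets
  # that copy's environment; the original g is not touched.
  environment(g) <- environment()
  fum <- 3.14
  g(foo)
}

result <- f(1)                       # 1 + 2 * 3.14

# The original g still has its defining environment, so calling it directly
# fails because no `fum` is visible from there.
original_untouched <- identical(environment(g), defining_env)
direct_call_fails <- inherits(try(g(1), silent = TRUE), "try-error")
```
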
> 
> So, which options should be explicitly inserted into a function call,
> which should be left in the environment for R to find when it needs
> them?
First of all, those are arguments, not options. Arguments can be optional (when
there is a default, mostly) but that is something else. Options are set with,
say, options(width=60).
> 
> 1. I *think* that when EM is called, the variables "betas",
> "constraint", and "control" are already in the environment.
> 
> The EM function is declared like this, using the same words "beta" and
> "constraint"
> 
> EM <-
> function (betas, constraint, iter, verbose = FALSE) {
> 
> It seems to me that if I wrote the function call like this (leave out
> "betas" and "constraint")
> 
> res.EM <- EM(control$iter.em, control$verbose)
> 
> R will run EM and go find "betas" and "constraint" in the environment,
> there was no need to name them as arguments.
Well, only if the call is always EM(betas, constraint, ....). They could on
occasion be matched to something else.
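
To see what the shortened call would actually do, here is a small sketch. The EM below is a hypothetical stand-in that only shares the signature quoted above (it is not the ltm code), and the control list is made up. Unnamed arguments are matched by position, so the two values land in "betas" and "constraint"; and as long as "betas" remains a formal argument, leaving it out does not send R looking in the enclosing environment:

```r
# Hypothetical stand-in with the signature quoted above; not the ltm code.
EM <- function(betas, constraint, iter, verbose = FALSE) {
  list(betas = betas, constraint = constraint, iter_missing = missing(iter))
}
control <- list(iter.em = 40, verbose = TRUE)   # made-up values

# Positional matching: the values end up in the first two formals.
matched <- EM(control$iter.em, control$verbose)

# A formal argument that is not supplied is "missing"; using it is an error
# even when a variable of the same name exists outside the function.
EM2 <- function(betas, iter) betas * iter
betas <- 10
msg <- tryCatch(EM2(iter = 2), error = function(e) conditionMessage(e))
```

For the environment lookup to apply, "betas" and "constraint" would have to be dropped from the function's argument list, not just from the call.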

> 
> 
> 2 Is a function like EM allowed to alter objects that it finds through
> the environment, ones that are not passed as arguments? I understand
> that a function cannot alter an object that is passed explicitly, but
> what about the ones it grabs from the environment?
> 
You are "allowed" to alter anything that you can find. Sometimes it is
just a very bad idea, and/or bad programming style...

The superassignment operator "<<-" was explicitly designed to
allow modification of objects in the lexical scope of a function, so at least in
some cases, it must be considered good style to use it (examples can be found in
the paper by Ihaka and Gentleman on lexical scope, 1996 IIRC). However, some
care must be taken; in particular, if you don't make sure that the object
already exists in the appropriate environment, another object of the same name
might get clobbered, e.g. in the global environment.
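
As an illustration of the safe pattern, here is a minimal closure sketch (make_counter and count are hypothetical names). The object is created in the enclosing function first, so "<<-" has something to find there and never reaches the global environment:

```r
make_counter <- function() {
  count <- 0                  # created here first, so `<<-` will find it
  function() {
    count <<- count + 1       # updates make_counter's `count`, not a global
    count
  }
}

counter <- make_counter()
counter()
counter()
third <- counter()            # the same `count` has been incremented 3 times

# No `count` was created in the global environment along the way.
leaked <- exists("count", envir = globalenv(), inherits = FALSE)
```
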

Best,
-pd

(& thanks for that KU t-shirt, by the way!)
-- 
Peter Dalgaard
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd.mes at cbs.dk  Priv: PDalgd at gmail.com

Duncan Murdoch

2010-Dec-27 11:24 UTC


[Rd] environment question

On 10-12-26 4:30 PM, Paul Johnson wrote:
 > Hello, everybody.
 >
 > I'm putting together some lecture notes and course exercises on R
 > programming.  My plan is to pick some R packages, ask students to read
 > through code and see why things work, maybe make some changes.  As I
 > look for examples, I'm running up against the problem that packages
 > use coding idioms that are unfamiliar to me.
 >
 > A difficult thing for me is explaining scope of variables in R
 > functions.  When should we pass an object to a function, when should
 > we let the R system search about for an object?  I've been puzzling
 > through ?environment for quite a while.

Take a look at the Language Definition, not just the ?environment page.

 >
 > Here's an example from one of the packages that I like, called "ltm".
 > In the function "ltm.fit" the work of calculating estimates is sent to
 > different functions like "EM" and "loglikltm" and "scoreltm".  Before
 > that, this is used:
 >
 > environment(EM)<- environment(loglikltm)<- environment(scoreltm)<-
 > environment()
 >
 > ##and then EM is called
 > res.EM<- EM(betas, constraint, control$iter.em, control$verbose)
 >
 > I want to make sure I understand this. The environment line gets the
 > current environment and then assigns it for those 3 functions, right?
 > All variables and functions that can be accessed from the current
 > position in the code become available to function EM, loglikltm,
 > scoreltm.

That's one way to think of it, but it is slightly more accurate to say 
that three new functions are created, whose associated environments are 
set to the current environment.

 >
 > So, which options should be explicitly inserted into a function call,
 > which should be left in the environment for R to find when it needs
 > them?

That's a matter of style.  I would say that it is usually better style 
not to mess around with a function's environment.

 >
 > 1. I *think* that when EM is called, the variables "betas",
 > "constraint", and "control" are already in the environment.

That need not be true, as long as they are in the environment by the 
time EM, loglikltm, scoreltm are called.
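
A small sketch of that timing point, using hypothetical names (show_scale, scale_factor, run): the variable is created after the environment is assigned but before the call, and the lookup still succeeds because it happens when the function runs.

```r
# `scale_factor` is a free variable; it is looked up when show_scale runs.
show_scale <- function() scale_factor * 2

run <- function() {
  environment(show_scale) <- environment()   # scale_factor does not exist yet
  scale_factor <- 21                         # created before the call below
  show_scale()
}

answer <- run()   # 42
```
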

 >
 > The EM function is declared like this, using the same words "beta" and
 > "constraint"
 >
 > EM<-
 > function (betas, constraint, iter, verbose = FALSE) {
 >
 > It seems to me that if I wrote the function call like this (leave out
 > "betas" and "constraint")
 >
 > res.EM<- EM(control$iter.em, control$verbose)
 >
 > R will run EM and go find "betas" and "constraint" in the environment,
 > there was no need to name them as arguments.

Including them as arguments means that new local copies will be created 
in the evaluation frame.
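
A minimal sketch of the consequence, with a hypothetical function modify_arg: changes made to an argument inside the function stay in its evaluation frame, and the caller's object of the same name is left alone.

```r
betas <- c(1, 2, 3)

modify_arg <- function(betas) {
  betas[1] <- 100     # changes only the local `betas` in this frame
  betas
}

returned <- modify_arg(betas)
# returned is 100 2 3; the caller's betas is still 1 2 3
```
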

 >
 >
 > 2 Is a function like EM allowed to alter objects that it finds through
 > the environment, ones that are not passed as arguments? I understand
 > that a function cannot alter an object that is passed explicitly, but
 > what about the ones it grabs from the environment?

Yes it's allowed, but the usual rules of assignment won't do it.  Read 
about the <<- operator for modifying things that are not local.  In
summary:

  beta <- 1

creates a new local variable or modifies an existing one, while

  beta <<- 1

goes looking for beta, starting in the enclosing environment, and modifies
the first one it finds.  If it fails to find one, it creates one in the
global environment.
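
The two assignments can be compared side by side in a short sketch. All names here (beta_val, new_beta, and the three functions) are hypothetical and chosen only for the illustration:

```r
beta_val <- 1

local_assign <- function() {
  beta_val <- 2          # new local variable; the outer beta_val is untouched
  beta_val
}
super_assign <- function() {
  beta_val <<- 3         # finds the outer beta_val and modifies it
  invisible(NULL)
}
create_new <- function() {
  new_beta <<- 1         # nothing to find, so it is created in the global env
  invisible(NULL)
}

local_result <- local_assign()
after_local  <- beta_val      # still 1
super_assign()
after_super  <- beta_val      # now 3
create_new()
created_globally <- exists("new_beta", envir = globalenv(), inherits = FALSE)
```
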

Duncan Murdoch

