Hello,

I am facing a problem with lapply which I *think* may be a bug. This is the most basic function in which I can reproduce it:

    myfun <- function()
    {
        foo = data.frame(1:10,10:1)
        foos = list(foo)
        fooCollumn=2
        cFoo = lapply(foos,subset,select=fooCollumn)
        return(cFoo)
    }

I am building a list of data frames, in each of which I want to keep only column 2 (obviously I would not do it this way in real life, but that's just to demonstrate the bug).

If I execute the commands inline it works, but if I clean my environment, then define the function and then execute:

    > myfun()

I get this error:

    Error in eval(expr, envir, enclos) : object "fooCollumn" not found

while fooCollumn is defined, in the function, right before lapply. In addition, if I define it outside the function and then execute the function:

    > fooCollumn=1
    > myfun()

it works, but uses the value defined in the global environment and not the one defined in the function.

This is with R 2.5.0 on both OS X and Linux (Fedora Core 6).

What did I do wrong? Is this indeed a bug? An intended behavior?

Thanks in advance.

JiHO
---
http://jo.irisson.free.fr/
Dimitris Rizopoulos
2007-May-18 14:50 UTC
[R] lapply not reading arguments from the correct environment
subset() was not defined inside myfun(); try this version instead:

    myfun <- function () {
        foo <- data.frame(1:10, 10:1)
        foos <- list(foo)
        fooCollumn <- 2
        my.subset <- function(...) subset(...)
        cFoo <- lapply(foos, my.subset, select = fooCollumn)
        cFoo
    }

    myfun()

I hope it helps.

Best,
Dimitris

----
Dimitris Rizopoulos
Ph.D. Student
Biostatistical Centre
School of Public Health
Catholic University of Leuven

Address: Kapucijnenvoer 35, Leuven, Belgium
Tel: +32/(0)16/336899
Fax: +32/(0)16/337015
Web: http://med.kuleuven.be/biostat/
     http://www.student.kuleuven.be/~m0390867/dimitris.htm
Thomas Lumley
2007-May-18 15:09 UTC
[R] lapply not reading arguments from the correct environment
On Fri, 18 May 2007, jiho wrote:

> Hello,
>
> I am facing a problem with lapply which I *think* may be a bug.
> This is the most basic function in which I can reproduce it:
>
> myfun <- function()
> {
>     foo = data.frame(1:10,10:1)
>     foos = list(foo)
>     fooCollumn=2
>     cFoo = lapply(foos,subset,select=fooCollumn)
>     return(cFoo)
> }
>
<snip>
> I get this error:
> Error in eval(expr, envir, enclos) : object "fooCollumn" not found
> while fooCollumn is defined, in the function, right before lapply.
<snip>
> This is with R 2.5.0 on both OS X and Linux (Fedora Core 6)
> What did I do wrong? Is this indeed a bug? An intended behavior?

No, it isn't a bug (though it may be confusing).

The problem is that subset() evaluates its "select" argument in an unusual way. Usually the argument would be evaluated inside myfun() and the value passed to lapply(), and everything would work as you expect. subset() bypasses the normal evaluation and explicitly evaluates the "select" argument in the calling frame, i.e., inside lapply(), where fooCollumn is not visible.

You could do

    lapply(foos, function(foo) subset(foo, select=fooCollumn))

capturing fooCollumn by lexical scope. In R this is often a better option than passing extra arguments to lapply (or other functions that take function arguments).

-thomas
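
As a minimal sketch of the point above, the two variants can be put side by side. The names myfun_direct and myfun_closure and the column names a and b are made up for illustration and are not from the original post. The direct call is wrapped in tryCatch() because it is expected to fail in the way described in this thread (unless a fooCollumn happens to exist in the global environment).

```r
# Hypothetical variant 1: pass select= through lapply(), as in the original post.
# subset() evaluates the unevaluated expression `fooCollumn` in its calling
# frame (inside lapply), so the local variable below is not expected to be found.
myfun_direct <- function() {
  foos <- list(data.frame(a = 1:10, b = 10:1))
  fooCollumn <- 2
  # Catch the error and return its message instead of stopping.
  tryCatch(lapply(foos, subset, select = fooCollumn),
           error = function(e) conditionMessage(e))
}

# Hypothetical variant 2: wrap subset() in an anonymous function.
# The anonymous function is created inside myfun_closure(), so fooCollumn
# is reachable by lexical scope when subset() evaluates `select`.
myfun_closure <- function() {
  foos <- list(data.frame(a = 1:10, b = 10:1))
  fooCollumn <- 2
  lapply(foos, function(d) subset(d, select = fooCollumn))
}

err <- myfun_direct()   # error message, or a result if a global fooCollumn exists
res <- myfun_closure()  # list holding one data frame with only the second column
```

The anonymous function costs one extra line but keeps all name lookups inside the function that defines the variables, which is why it behaves the same regardless of what is in the global environment.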
Prof Brian Ripley
2007-May-18 16:05 UTC
[R] lapply not reading arguments from the correct environment
You need to study carefully what the semantics of 'subset' are. The function body of myfun is not in the evaluation environment. (The issue is 'subset', not 'lapply': select is an *expression* and not a value.)

Hint: using subset() programmatically is almost always a mistake. R's subsetting function is '[': subset is a convenience wrapper.

On Fri, 18 May 2007, jiho wrote:

<snip>
> What did I do wrong? Is this indeed a bug? An intended behavior?

It is a bug, in your function.

--
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
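
One way to read the hint about '[' is sketched below. The function name myfun_bracket and the column names a and b are assumptions made for illustration. Because '[' evaluates its index arguments in the ordinary way, fooCollumn is looked up inside the function as a plain value, and it can even be passed as an extra argument through lapply().

```r
# Hypothetical rewrite of the example using `[` instead of subset().
myfun_bracket <- function() {
  foos <- list(data.frame(a = 1:10, b = 10:1))
  fooCollumn <- 2

  # Matrix-style indexing; drop = FALSE keeps the result a data frame.
  by_matrix <- lapply(foos, function(d) d[, fooCollumn, drop = FALSE])

  # List-style indexing with the column index passed through lapply().
  # fooCollumn is an ordinary value here, so it is evaluated in this function.
  by_list <- lapply(foos, `[`, fooCollumn)

  list(by_matrix = by_matrix, by_list = by_list)
}

out <- myfun_bracket()
```

Both forms take the column index as a value, so they also work when the index is computed at run time or stored under a different variable name.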