On Thu, Jan 22, 2015 at 11:44 AM, <luke-tierney at uiowa.edu> wrote:
>
> For default methods there ought to be a way to create those so the
> default method is computed at creation or load time and stored in an
> environment.

We had considered that, but we thought the definition of the function
would be easier to interpret if it explicitly specified the namespace,
instead of using tricks with environments. The same applies for
memoizing the lookup in front of a loop.

The implementation of these functions is almost simpler in C than it
is in R, so there is relatively little risk to this change. But I
agree the benefits are also somewhat minor.

> For other cases if I want to use foo::bar many times, say
> in a loop, I would do
>
>     foo_bar <- foo::bar
>
> and use foo_bar, or something along those lines.
>
> When :: and ::: were introduced they were intended primarily for
> reflection and debugging, so speed was not an issue. ::: is still
> really only reliably usable that way, and making it faster may just
> encourage bad practice. :: is different and there are good arguments
> for using it in code, but I'm not yet seeing good arguments for use in
> ways that would be performance-critical, but I'm happy to be convinced
> otherwise. If there is a need for a faster :: then going to a
> SPECIALSXP is fine; it would also be good to make the byte code
> compiler aware of it, and possibly to work on ways to improve the
> performance further e.g. through cacheing.
>
> Best,
>
> luke
>
>
> On Thu, 22 Jan 2015, Peter Haverty wrote:
>
>> Hi all,
>>
>> When S4 methods are defined on a base function (say, "match"), the
>> function becomes a method with the body "base::match(x,y)". A call to
>> such a function often spends more time doing "::" than in the function
>> itself. I always assumed that "::" was a very low-level thing, but it
>> turns out to be a plain old function defined in base/R/namespace.R.
>> What would you all think about making "::" and ":::" .Primitives? I
>> have submitted some examples, timings, and a patch to the R bug
>> tracker (https://bugs.r-project.org/bugzilla3/show_bug.cgi?id=16134).
>> I'd be very interested to hear your thoughts on the matter.
>>
>> Regards,
>> Pete
>>
>> ____________________
>> Peter M. Haverty, Ph.D.
>> Genentech, Inc.
>> phaverty at gene.com
>>
>> ______________________________________________
>> R-devel at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-devel
>
> --
> Luke Tierney
> Ralph E. Wareham Professor of Mathematical Sciences
> University of Iowa                  Phone:             319-335-3386
> Department of Statistics and        Fax:               319-335-3017
>    Actuarial Science
> 241 Schaeffer Hall                  email:   luke-tierney at uiowa.edu
> Iowa City, IA 52242                 WWW:  http://www.stat.uiowa.edu
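
The hoisting idiom quoted above (`foo_bar <- foo::bar`) can be sketched as follows. This is a minimal illustration rather than code from the thread: the names `base_match`, `x`, `tbl`, `slow`, and `fast` are made up for the example, and it only shows that the two forms return the same result, not how much time is saved.

```r
# Resolve the `::` lookup once, outside the hot path.
base_match <- base::match

x   <- c("b", "d", "a")
tbl <- c("a", "b", "c")

# Per-call lookup: `::` is evaluated again for every element.
slow <- vapply(x, function(el) base::match(el, tbl), integer(1),
               USE.NAMES = FALSE)

# Hoisted lookup: the already-resolved function object is reused.
fast <- vapply(x, function(el) base_match(el, tbl), integer(1),
               USE.NAMES = FALSE)

# Both approaches give the same positions.
identical(slow, fast)
```

The hoisted binding is a snapshot: if the namespace were unloaded and reloaded afterwards, `base_match`-style bindings for other packages would keep pointing at the old function object.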
Hi all,

I use Luke's "::" hoisting trick often. I think it would be fantastic
if the JIT just did that for you.

The main trouble, for me, is in code I don't own. When common
Bioconductor packages are loaded many, many base functions are saddled
with this substantial dispatch and "::" overhead.

While we have the hood up, the parser could help out a bit here too.
It already has special cases for "::" and ":::". Currently you get the
symbols "pkg" and "name" and have to go fishing in the calling
environment for the associated values. It would be nice to have the
parser or JIT rewrite base::match as doubleColon("base","match") or
directly provide the symbols "base" and "match" to the subsequent
code.

I think it's also kind of entertaining that the comments in
base/R/namespace.R note that they are using ":::" for speed purposes
only.

Pete

____________________
Peter M. Haverty, Ph.D.
Genentech, Inc.
phaverty at gene.com

On Thu, Jan 22, 2015 at 12:54 PM, Michael Lawrence <lawrence.michael at gene.com> wrote:
> On Thu, Jan 22, 2015 at 11:44 AM, <luke-tierney at uiowa.edu> wrote:
>>
>> For default methods there ought to be a way to create those so the
>> default method is computed at creation or load time and stored in an
>> environment.
>
> We had considered that, but we thought the definition of the function
> would be easier to interpret if it explicitly specified the namespace,
> instead of using tricks with environments. The same applies for
> memoizing the lookup in front of a loop.
>
> The implementation of these functions is almost simpler in C than it
> is in R, so there is relatively little risk to this change. But I
> agree the benefits are also somewhat minor.
On Thu, 22 Jan 2015, Michael Lawrence wrote:

> On Thu, Jan 22, 2015 at 11:44 AM, <luke-tierney at uiowa.edu> wrote:
>>
>> For default methods there ought to be a way to create those so the
>> default method is computed at creation or load time and stored in an
>> environment.
>
> We had considered that, but we thought the definition of the function
> would be easier to interpret if it explicitly specified the namespace,
> instead of using tricks with environments. The same applies for
> memoizing the lookup in front of a loop.

Interpret in what sense (human reader or R interpreter)? In either
case I'm not convinced.

> The implementation of these functions is almost simpler in C than it
> is in R, so there is relatively little risk to this change. But I
> agree the benefits are also somewhat minor.

I don't disagree, but it remains that even calling the C version has
costs that should not need to be paid. But maybe we can leave that to
the compiler/byte code engine. Optimizing references to symbols
resolved statically to namespaces and imports is on the to-do list,
and with a little care that mechanism should work for foo::bar uses as
well.

Best,

luke

--
Luke Tierney
Ralph E. Wareham Professor of Mathematical Sciences
University of Iowa                  Phone:             319-335-3386
Department of Statistics and        Fax:               319-335-3017
   Actuarial Science
241 Schaeffer Hall                  email:   luke-tierney at uiowa.edu
Iowa City, IA 52242                 WWW:  http://www.stat.uiowa.edu
I tend to use this (in my own internal code *only*):

    exported <- function (pkg) {
      if (pkg == "base") {
        # base is never unloaded, so look names up directly in its namespace
        function (fun) {
          fun <- as.character(substitute(fun))
          res <- .BaseNamespaceEnv[[fun]]
          if (is.null(res))
            stop(fun, " is not found in package base")
          res
        }
      } else {
        # Capture the namespace and its exports table once, in the closure
        ns <- getNamespace(pkg)
        exports <- getNamespaceInfo(ns, "exports")
        function (obj) {
          obj <- as.character(substitute(obj))
          exportedObj <- exports[[obj]]
          if (is.null(exportedObj)) {
            if (is.null(ns[[obj]])) {
              stop(obj, " does not exist in package ", pkg)
            } else {
              stop(obj, " is not exported from package ", pkg)
            }
          }
          ns[[exportedObj]]
        }
      }
    }

    stats <- exported("stats")
    stats(acf)
    stats("[.acf")        # error: not exported
    stats("inexistant")   # error: does not exist
    exported("base")(ls)
    exported("base")(inexistant)   # error: not found

    ## Performance tests for what it's worth
    microbenchmark::microbenchmark(stats::acf,
      (stats <- exported("stats"))(acf), stats(acf))
    microbenchmark::microbenchmark(base::ls,
      (base <- exported("base"))(ls), base(ls), .BaseNamespaceEnv$ls)

So, `::` is slow, and I can get better speed by binding both the
namespace and the exports environments in the `stats` closure. Unless I
am missing something, this is not much of a problem for the base
package, which is never unloaded. Yet .BaseNamespaceEnv$xxx or
baseenv()$xxx does the job faster and more simply.

However, there is a vicious problem with my exported() function, which
is, to say the least, dangerous in the hands of unaware users. Indeed:

    stats <- exported("stats")

creates a new binding to both the namespace and the exports
environments of the stats package. So, if I do
detach("package:stats", unload = TRUE) and then library("stats"), I get
two versions of the package in memory, and my `stats` closure refers to
an outdated version of the package. This is particularly problematic if
the package was recompiled in between (in the context of debugging).

Conclusion: much of the loss of performance in `::` is due to not
caching the environments. This is fully justified to keep the dynamism
of the language at full power and to avoid a messy state of R as
described above... Regarding dynamism, even `stats::acf` remains
debatable.

Moreover, it is possible to do many other crazy things with these
environments once one has got a grip on them. So even getNamespace()
and getNamespaceInfo() are dangerous. Perhaps this should be emphasised
in the ?getNamespace man page?

This is also why the code above is not released in the wild... Well,
now it is :-(

Best,

Philippe

..............................................<°}))><........
 ) ) ) ) )
( ( ( ( (    Prof. Philippe Grosjean
 ) ) ) ) )
( ( ( ( (    Numerical Ecology of Aquatic Systems
 ) ) ) ) )   Mons University, Belgium
( ( ( ( (
..............................................................
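
As a contrast to the caching closure discussed in Philippe's message, here is a minimal sketch of a variant that stores only the package name and resolves the namespace on every call through `getExportedValue()`. The names `exported_fresh` and `stats_get` are hypothetical, introduced only for this illustration. Because nothing but a string is captured, the closure cannot hold on to an outdated namespace, but it presumably also gives up most of the speed gain that motivated the caching version.

```r
# Hypothetical helper: keep only the package name, not the environments.
exported_fresh <- function(pkg) {
  force(pkg)  # evaluate the promise now so the closure holds a plain string
  function(name) getExportedValue(pkg, name)  # namespace resolved per call
}

stats_get <- exported_fresh("stats")

# Same object as the one returned by `::`.
identical(stats_get("acf"), stats::acf)

# For base, the direct environment lookups mentioned in the message
# return the same function object as base::ls.
identical(.BaseNamespaceEnv$ls, base::ls)
identical(baseenv()$ls, base::ls)
```

Note that this variant takes the object name as a character string, unlike the `substitute()`-based interface of the original `exported()`.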
Hi,

On 01/23/2015 07:01 AM, luke-tierney at uiowa.edu wrote:
> On Thu, 22 Jan 2015, Michael Lawrence wrote:
>
>> On Thu, Jan 22, 2015 at 11:44 AM, <luke-tierney at uiowa.edu> wrote:
>>>
>>> For default methods there ought to be a way to create those so the
>>> default method is computed at creation or load time and stored in an
>>> environment.
>>
>> We had considered that, but we thought the definition of the function
>> would be easier to interpret if it explicitly specified the namespace,
>> instead of using tricks with environments. The same applies for
>> memoizing the lookup in front of a loop.
>
> interpret in what sense (human reader or R interpreter)? In either
> case I'm not convinced.

From a developer perspective, especially when debugging, when we do
selectMethod("match", ...) and it turns out that this returns the
default method, it's good to see:

    Method Definition (Class "derivedDefaultMethod"):

    function (x, table, nomatch = NA_integer_, incomparables = NULL,
        ...)
    base::match(x, table, nomatch = nomatch, incomparables = incomparables,
        ...)
    <environment: namespace:BiocGenerics>

    Signatures:
            x           table
    target  "DataFrame" "ANY"
    defined "ANY"       "ANY"

rather than some obscure/uninformative body. I hope we can keep that.

>> The implementation of these functions is almost simpler in C than it
>> is in R, so there is relatively little risk to this change. But I
>> agree the benefits are also somewhat minor.
>
> I don't disagree, but it remains that even calling the C version has
> costs that should not need to be paid. But maybe we can leave that to
> the compiler/byte code engine. Optimizing references to symbols
> resolved statically to name spaces and imports is on the to do list,
> and with a little care that mechanism should work for foo::bar uses as
> well.

That would be great. Thanks!

H.

--
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpages at fredhutch.org
Phone:  (206) 667-5791
Fax:    (206) 667-1319
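
The readability trade-off raised in Hervé's message can be illustrated with a small sketch. It is not how the methods package builds default methods; `make_default_method`, `default`, and `m` are hypothetical names used only here. The function resolves `base::match` once at creation time and stores it in the enclosing environment, which is roughly the approach proposed earlier in the thread, and the example then checks what the printed body reveals.

```r
# Hypothetical factory: resolve the default once and keep it in the
# closure environment instead of writing base::match in the body.
make_default_method <- function() {
  default <- base::match
  function(x, table, nomatch = NA_integer_, incomparables = NULL, ...)
    default(x, table, nomatch = nomatch, incomparables = incomparables, ...)
}

m <- make_default_method()

# The call still works as expected.
m(c("b", "z"), c("a", "b"))

# But the deparsed body only mentions `default`; the reader has to
# inspect environment(m) to learn that it is base::match.
body_text <- paste(deparse(body(m)), collapse = " ")
grepl("base::match", body_text, fixed = TRUE)
identical(environment(m)$default, base::match)
```

Keeping `base::match(...)` spelled out in the body avoids that extra inspection step, at the price of the `::` call on each invocation.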
On 22/01/2015 4:06 PM, Peter Haverty wrote:
> Hi all,
>
> I use Luke's "::" hoisting trick often. I think it would be fantastic
> if the JIT just did that for you.
>
> The main trouble, for me, is in code I don't own. When common
> Bioconductor packages are loaded many, many base functions are saddled
> with this substantial dispatch and "::" overhead.
>
> While we have the hood up, the parser could help out a bit here too.
> It already has special cases for "::" and ":::". Currently you get the
> symbols "pkg" and "name" and have to go fishing in the calling
> environment for the associated values.

I don't think the parser should do this, but it does seem like a
reasonable optimization for the compiler to do.

> It would be nice to have the
> parser or JIT rewrite base::match as doubleColon("base","match") or
> directly provide the symbols "base" and "match" to the subsequent
> code.

Currently the parser provides the expression `::`(base, match), and the
`::` function converts those symbols to character strings "base" and
"match". While the parser could have saved it some work by giving the
expression `::`("base", "match"), I think it's a bad idea to start
messing with things that way. After all, a user could have defined
their own `::` function, and they should get what they typed.

Duncan Murdoch
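
What the parser hands over for `base::match` can be inspected directly. The sketch below is an illustration, not part of the original message; the variable names `e`, `e2`, and `masked` are made up. It checks that the parsed expression is a call to `::` with two symbols, that the string form evaluates to the same function, and that a user-defined `::` sees exactly what was typed.

```r
# The parser produces a call to `::` whose arguments are symbols.
e <- quote(base::match)
is.call(e)
identical(e[[1]], as.name("::"))
is.symbol(e[[2]]) && is.symbol(e[[3]])

# The string form is a different expression that evaluates to the
# same function object.
e2 <- quote(`::`("base", "match"))
identical(eval(e), eval(e2))

# A locally defined `::` receives the unevaluated symbols as typed.
masked <- local({
  `::` <- function(pkg, name)
    paste(deparse(substitute(pkg)), deparse(substitute(name)), sep = "/")
  base::match
})
masked
```

If the parser rewrote the symbols into strings, the user-defined function in the last block would receive character constants instead of the symbols the user wrote, which is the behaviour change the message argues against.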