Peter Langfelder
2013-May-29 22:38 UTC
[Rd] What is preferable - a single large package or a few smaller packages?
Hi all,

I maintain the WGCNA package, which at present has nearly 200 functions. In the future there will be more. Curious whether it would be preferable or useful to split the package into a couple of different ones with different aims. Obviously, when one calls a function in R, package namespaces have to be traversed to find the matching name - does the speed of this depend on how functions are partitioned into packages? Any other considerations?

My knowledge of R internals in this regard is pretty non-existent - thanks for any pointers.

Best,

Peter
Hervé Pagès
2013-May-29 23:48 UTC
[Rd] What is preferable - a single large package or a few smaller packages?
Hi Peter,

On 05/29/2013 03:38 PM, Peter Langfelder wrote:
> Hi all,
>
> I maintain the WGCNA package which at present has nearly 200
> functions. In the future there will be more. Curious whether it would
> be preferable or useful to split the package into a couple different
> ones with different aims. Obviously, when one calls a function in R,
> package name spaces have to be traversed to find the matching name -
> does the speed of this depend on how functions are partitioned into
> packages? Any other considerations?

Other important considerations are maintainability and user-friendliness. If you think the package can keep growing and still remain relatively easy to maintain, then maybe you don't need to split it. But if the package becomes too hard to maintain and/or can naturally be divided into more or less independent departments, and if the end user generally doesn't need all functionalities from all departments for a typical workflow, then you might want to split. That will benefit both the user and you. It will also make it easier to have other people collaborate on the whole thing (if one day you decide you need some help with that).

The impact on the speed of function name lookup would be the last thing I would worry about.

My 2 cents.

H.

--
Hervé Pagès
Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024
E-mail: hpages at fhcrc.org
Phone: (206) 667-5791
Fax: (206) 667-1319
Prof Brian Ripley
2013-May-30 08:01 UTC
[Rd] What is preferable - a single large package or a few smaller packages?
On 29/05/2013 23:38, Peter Langfelder wrote:
> Hi all,
>
> I maintain the WGCNA package which at present has nearly 200
> functions. In the future there will be more. Curious whether it would
> be preferable or useful to split the package into a couple different
> ones with different aims. Obviously, when one calls a function in R,
> package name spaces have to be traversed to find the matching name -
> does the speed of this depend on how functions are partitioned into
> packages? Any other considerations? My knowledge of R internals in
> this regard is pretty non-existent - thanks for any pointers.

Namespace environments are hashed, so lookup is essentially independent of size. And since lazy-loading was introduced, the memory footprint depends far more on what has been used in the session than on the number of functions.

In any case, 200 functions is not a 'large' package. 'stats' has nearly 1100 in its namespace ....

Performance for really large packages was improved to the point of being a non-issue before R 2.0.0.

--
Brian D. Ripley, ripley at stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595
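
The points about namespace size and hashed lookup can be checked informally in an R session. The following is only a minimal sketch under stated assumptions: the environments `small` and `large`, the name `target`, and the loop counts are made up for illustration, and environments created with `new.env(hash = TRUE)` stand in for package namespaces. The counts for 'stats' will vary with the R version, and the timings are rough indications rather than a benchmark.

```r
# Count what lives in the 'stats' namespace (exported or not).
ns <- asNamespace("stats")
obj_names <- ls(ns, all.names = TRUE)
is_fun <- vapply(
  obj_names,
  function(nm) is.function(get(nm, envir = ns, inherits = FALSE)),
  logical(1)
)
n_functions <- sum(is_fun)
cat("objects in the stats namespace:", length(obj_names), "\n")
cat("functions among them:", n_functions, "\n")

# Two hashed environments of very different size (hypothetical stand-ins
# for a small and a large package namespace).
small <- new.env(hash = TRUE)
large <- new.env(hash = TRUE)
assign("target", function() NULL, envir = small)
assign("target", function() NULL, envir = large)
for (i in seq_len(5000)) {
  assign(paste0("fn_", i), function() NULL, envir = large)
}

# Time repeated lookups of the same name in each environment.
n_lookups <- 1e5
t_small <- system.time(
  for (i in seq_len(n_lookups)) get("target", envir = small, inherits = FALSE)
)
t_large <- system.time(
  for (i in seq_len(n_lookups)) get("target", envir = large, inherits = FALSE)
)
cat("elapsed, 1 object:    ", t_small[["elapsed"]], "s\n")
cat("elapsed, 5001 objects:", t_large[["elapsed"]], "s\n")
```

If lookup cost grew noticeably with the number of objects, the second timing would be expected to be much larger than the first; with hashed environments the two are expected to be of similar magnitude.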