arilamstein at gmail.com
2015-Mar-12 15:41 UTC
[Rd] Best way to handle dependency on non-CRAN package / large data package?
I have just written a package called choroplethrZip <https://github.com/arilamstein/choroplethrZip> which contains a shapefile and metadata on US Zip codes. It is currently hosted on GitHub, has a tagged version number (v1.0.0) and passes R CMD check as verified by Travis. My plan is to use this in the next version of my package choroplethr <https://github.com/arilamstein/choroplethr>.

This is exactly what I have done in the past with other map/data packages (notably choroplethrMaps <https://github.com/arilamstein/choroplethrMaps> and choroplethrAdmin1 <https://github.com/arilamstein/choroplethrAdmin1>), and is the architecture that CRAN requested: large data in a separate package, listing it in 'Suggests', and putting code like this where appropriate:

    if (!requireNamespace("choroplethrAdmin1", quietly = TRUE)) {
      stop("Package choroplethrAdmin1 is needed for this function to work. Please install it.", call. = FALSE)
    }

The problem I now face is that choroplethrZip is too large to be hosted on CRAN (~75MB), and I am unclear on the best way to manage this dependency. Presumably I could just change the above message to say

    Please install choroplethrZip by typing:
    library(devtools)
    install_github('arilamstein/choroplethrZip@v1.0.0')

But I don't know if this is the best way to do this, or if there is anything else to consider. I have never had to manage package dependencies outside of CRAN, and have always thought of CRAN as being a "closed ecosystem", where there were not any dependencies outside of CRAN.

Can anyone provide guidance on this?

Thanks.

Ari
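For illustration, the guard described above could be wrapped in a small helper so that the install instructions live in one place. This is only a sketch: `install_hint()` and `require_data_package()` are hypothetical names, not functions from choroplethr, and the repository and tag defaults are assumptions based on the package name and version mentioned in the message.

```r
# Hypothetical helper: build the message shown when the data package is missing.
install_hint <- function(pkg, repo, ref) {
  sprintf(
    paste0(
      "Package %s is needed for this function to work. ",
      "Please install it by typing:\n",
      "  install.packages(\"devtools\")\n",
      "  devtools::install_github(\"%s@%s\")"
    ),
    pkg, repo, ref
  )
}

# Hypothetical guard, intended to be called at the top of any function that
# needs the zip code map data. The defaults are assumptions, not confirmed values.
require_data_package <- function(pkg = "choroplethrZip",
                                 repo = "arilamstein/choroplethrZip",
                                 ref = "v1.0.0") {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    stop(install_hint(pkg, repo, ref), call. = FALSE)
  }
  invisible(TRUE)
}
```

Keeping the message in its own function means the wording can be checked without triggering the error path.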
Dirk Eddelbuettel
2015-Mar-12 16:22 UTC
[Rd] Best way to handle dependency on non-CRAN package / large data package?
On 12 March 2015 at 08:41, arilamstein at gmail.com wrote:
| But I don't know if this is the best way to do this, or if there is
| anything else to consider. I have never had to manage package dependencies
| outside of CRAN, and have always thought of CRAN as being a "closed
| ecosystem", where there were not any dependencies outside of CRAN.
|
| Can anyone provide guidance on this?

drat can help with this problem. Have a look at

    http://dirk.eddelbuettel.com/code/drat.html

as well as my blog and the GitHub repo of drat.

In a nutshell, it creates repositories you can access via update.packages() and install.packages() as if they were CRAN or BioC. It also uses GitHub to automagically provide a repository server via the webserver "embedded" in each GitHub repo (and turned on as soon as you use the gh-pages branch).

Some package authors have turned to using drat to distribute packages (often in addition to CRAN; you can also do it instead of CRAN given a constraint such as the one here). One such package author and I are working on another short blog post detailing just this. If you want, I can send you an 'informal preview' as yet another source of documentation.

Dirk

--
http://dirk.eddelbuettel.com | @eddelbuettel | edd at debian.org
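As a rough sketch of what this looks like from the user's side: drat provides drat::addRepo("account") for registering such a repository, and the base-R version below does the equivalent by hand so that it does not need drat installed. The URL pattern and the account name are assumptions for illustration; no drat repository is known to exist at that address, and the helper names are hypothetical.

```r
# Assumption: the repository is served from the gh-pages branch of a GitHub
# repo named "drat" under the given account.
drat_repo_url <- function(account) {
  sprintf("https://%s.github.io/drat", account)
}

# Append the drat repository to an existing named vector of repositories.
with_drat_repo <- function(account, repos = getOption("repos")) {
  c(repos, stats::setNames(drat_repo_url(account), account))
}

# Hypothetical wrapper, defined but not called here: it needs network access
# and a repository that actually exists at the assumed URL.
install_from_drat <- function(pkg, account) {
  utils::install.packages(pkg, repos = with_drat_repo(account))
}
```

Once the repository is part of `repos`, update.packages() would pick up new versions of the data package the same way it does for CRAN packages.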
arilamstein at gmail.com
2015-Mar-12 16:40 UTC
[Rd] Best way to handle dependency on non-CRAN package / large data package?
Thanks Dirk. I'm looking at it now.

At first glance your documentation brings up a good limitation of simply telling users to type "devtools::install_github()". Namely, what happens when the census bureau updates their shapefiles, and I subsequently decide to update the package? Or if I discover an error in the package and decide to update it? The choroplethr package could have a dependency, and it's not clear how to make that dependency explicit to the user.
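One way the version concern raised here might be made explicit at run time is a minimum-version check next to the requireNamespace() guard. This is a sketch under assumptions: `check_min_version()` is a hypothetical helper, and the package name and version in the usage comment are placeholders taken from the thread.

```r
# Hypothetical helper: stop with a clear message unless `pkg` is installed
# at version `min_version` or later.
check_min_version <- function(pkg, min_version) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    stop(sprintf("Package %s is not installed.", pkg), call. = FALSE)
  }
  installed <- utils::packageVersion(pkg)
  # package_version objects compare directly against version strings.
  if (installed < min_version) {
    stop(
      sprintf(
        "Package %s >= %s is required, but %s is installed. Please update it.",
        pkg, min_version, as.character(installed)
      ),
      call. = FALSE
    )
  }
  invisible(installed)
}

# Usage (placeholder values): check_min_version("choroplethrZip", "1.0.0")
```

A versioned entry in the DESCRIPTION file, such as `Suggests: choroplethrZip (>= 1.0.0)`, states the same constraint declaratively.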