thr3ads.net - R devel - [Rd] Best way to handle dependency on non-CRAN package / large data package? [Mar 2015]


arilamstein at gmail.com

2015-Mar-12 15:41 UTC

[Rd] Best way to handle dependency on non-CRAN package / large data package?

I have just written a package called choroplethrZip
<https://github.com/arilamstein/choroplethrZip> which contains a shapefile
and metadata on US Zip codes. It is currently hosted on github, has a
tagged version number (v1.0.0) and passes R CMD check as verified by
Travis. My plan is to use this in the next version of my package choroplethr
<https://github.com/arilamstein/choroplethr>.

This is exactly what I have done in the past with other map/data packages
(notably choroplethrMaps <https://github.com/arilamstein/choroplethrMaps>
 and choroplethrAdmin1
<https://github.com/arilamstein/choroplethrAdmin1>),
and is the architecture that CRAN requested: large data in a separate
package, listing it in the 'Suggests', and putting code like this where
appropriate:

if (!requireNamespace("choroplethrAdmin1", quietly = TRUE)) {
  stop("Package choroplethrAdmin1 is needed for this function to work. ",
       "Please install it.", call. = FALSE)
}

The problem I now face is that choroplethrZip is too large to be hosted on
CRAN (~75MB), and I am unclear on the best way to manage this dependency.
Presumably I could just change the above message to say

Please install choroplethrZip by typing:
    library(devtools)
    install_github('arilamstein/choroplethrZip@v1.0.0')

But I don't know if this is the best way to do this, or if there is
anything else to consider. I have never had to manage package dependencies
outside of CRAN, and have always thought of CRAN as being a "closed
ecosystem", where there were not any dependencies outside of CRAN.
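
As a minimal sketch of the check-and-message approach described above, the
guard and the install instructions could be wrapped in one small helper. The
function name require_github_package and its arguments are hypothetical and
are not part of choroplethr; only requireNamespace() and stop() are standard R.

```r
# Hypothetical helper (not part of choroplethr): stop with install
# instructions when a suggested, non-CRAN data package is missing.
require_github_package <- function(pkg, repo_ref) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    stop(
      "Package ", pkg, " is needed for this function to work.\n",
      "Please install it by typing:\n",
      "    library(devtools)\n",
      "    install_github('", repo_ref, "')",
      call. = FALSE
    )
  }
  # Return TRUE invisibly so the call can sit at the top of a function body.
  invisible(TRUE)
}

# Intended use inside a function that needs the ZIP code map:
# require_github_package("choroplethrZip", "arilamstein/choroplethrZip@v1.0.0")
```

Pinning the tag in the message (here v1.0.0) at least tells the user which
release the calling package was written against.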

Can anyone provide guidance on this?

Thanks.

Ari


Dirk Eddelbuettel

2015-Mar-12 16:22 UTC


On 12 March 2015 at 08:41, arilamstein at gmail.com wrote:
| But I don't know if this is the best way to do this, or if there is
| anything else to consider. I have never had to manage package dependencies
| outside of CRAN, and have always thought of CRAN as being a "closed
| ecosystem", where there were not any dependencies outside of CRAN.
| 
| Can anyone provide guidance on this?

drat can help with this problem. Have a look at 

     http://dirk.eddelbuettel.com/code/drat.html

as well as my blog and the GitHub repo of drat.

In a nutshell, it creates repositories you can access via update.packages()
and install.packages() as if they were CRAN or BioC.  It also uses GitHub to
automagically provide a repository server via the web server "embedded" in
each GitHub repo (and turned on as soon as you use the gh-pages branch).
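
To illustrate what "as if they were CRAN" might look like from the user's
side, here is a rough sketch, assuming the data package were published to a
drat repository served from GitHub Pages. The helper with_drat_repo is
hypothetical and only imitates the effect of drat::addRepo(); the account
name and resulting URL are assumptions, since no such repository is mentioned
in this thread.

```r
# Hypothetical helper imitating the effect of drat::addRepo(): append a
# GitHub-Pages-hosted drat repository to an existing set of repositories.
with_drat_repo <- function(account, repos = getOption("repos")) {
  # Assumed URL layout for a drat repo served from the account's gh-pages branch.
  drat_url <- paste0("https://", account, ".github.io/drat/")
  repos[account] <- drat_url
  repos
}

# Build the repository list without touching global options.
repos <- with_drat_repo("arilamstein", c(CRAN = "https://cloud.r-project.org"))

# With that in place, the usual tools would apply (not run here):
# install.packages("choroplethrZip", repos = repos)
# update.packages(repos = repos)
```

Passing the result via the repos argument keeps the change local to one call;
drat::addRepo() instead modifies the session's "repos" option.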

Some package authors have turned to using drat to distribute packages (often
in addition to CRAN; you can also do it instead of CRAN given a constraint
such as the one here).  One such package author and I are working on another
short blog post detailing just this.  If you want, I can send you an
'informal preview' as yet another source of documentation.

Dirk

-- 
http://dirk.eddelbuettel.com | @eddelbuettel | edd at debian.org

arilamstein at gmail.com

2015-Mar-12 16:40 UTC


Thanks Dirk. I'm looking at it now.

At first glance your documentation highlights a real limitation of simply
telling users to type "devtools::install_github()". Namely, what happens
when the Census Bureau updates its shapefiles, and I subsequently decide
to update the package? Or if I discover an error in the package and decide
to update it? The choroplethr package could then depend on a particular
version of choroplethrZip, and it's not clear how to make that dependency
explicit to the user.
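
One way such a version dependency could be made explicit at run time is
sketched below, assuming choroplethr wanted to insist on a minimum version of
the data package. The helper name require_min_version is hypothetical;
requireNamespace() and utils::packageVersion() are standard R.

```r
# Hypothetical helper: require that `pkg` is installed at `min_version` or later.
require_min_version <- function(pkg, min_version) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    stop("Package ", pkg, " is not installed.", call. = FALSE)
  }
  installed <- utils::packageVersion(pkg)
  # package_version objects can be compared directly with version strings.
  if (installed < min_version) {
    stop("Package ", pkg, " ", as.character(installed),
         " is installed, but version ", min_version,
         " or later is needed. Please update it.", call. = FALSE)
  }
  invisible(TRUE)
}

# Intended use before touching the ZIP code map:
# require_min_version("choroplethrZip", "1.0.0")
```

A version constraint in the Suggests field of DESCRIPTION, such as
"choroplethrZip (>= 1.0.0)", is another place to record the same requirement.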



