I have encountered an issue while preparing some of the Bioconductor packages for our upcoming release, and Duncan Murdoch suggested that I bring one of the related points up here.

The background is that we are building our packages under Windows with "Rcmd install --build", which turns on the zip options. When the total size of the files in <pkg>\data is over a certain threshold (100k IIRC), they will be zipped into a file named Rdata.zip to save space. So far, so good.

The problem arose in that at least one of the packages in the BioC suite is directly accessing files in the data directory rather than going through the data() command or related functions - and in this case the package authors are doing it in .First.lib(), which causes the package loading to fail.

The issue that Duncan suggested I raise is whether or not it should be considered accepted behaviour for a package author to be accessing files in <pkg>\data (at this time I don't know the reasoning behind the specific example here, I just know that that's what they've done) or if this should be considered a Bad Thing.

Thanks
-Jeff
Jeff Gentry wrote:
>...
>The issue that Duncan suggested I raise is whether or not it should be
>considered accepted behaviour for a package author to be accessing files
>in <pkg>\data (at this time I don't know the reasoning behind the specific
>example here, I just know that that's what they've done) or if this should
>be considered a Bad Thing.
>
I do this in some instances where I have data in a format not supported by data(), for example, database files that are used by package programs or tests (but I have always wondered if it is "the right" thing to do). If it is considered a Bad Thing then there needs to be another location for files like this, perhaps inst/data or something like that.

Paul Gilbert
w.huber@dkfz-heidelberg.de
2003-Oct-24 08:26 UTC
[Rd] Accessing data files w/ --use-zip-data
Hi all,

> The issue that Duncan suggested I raise is whether or not it should be
> considered accepted behaviour for a package author to be accessing files
> in <pkg>\data (at this time I don't know the reasoning behind the specific
> example here, I just know that that's what they've done) or if this should
> be considered a Bad Thing.

Examples are the makecdfenv and matchprobes packages, part of which I wrote. They deal with data formats that are external to R, namely vendor-specific (Affymetrix) file formats, and provide tools for importing these data into R. For the examples / vignettes, I need to provide example data files. Up to now I have been putting them into the data directory and accessing them with

    specializedImportFunction(file.path(.path.package("matchprobes"), "data", "HG-U95Av2_probe_tab.gz"), ...)

Is this considered offensive to the R package structure? Where else should one put data that comes in specialized (not .rda, .txt, .csv) formats?

Thanks
Wolfgang
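For illustration only, here is a minimal sketch of the kind of alternative Paul hints at: keep raw files out of <pkg>\data (which may be zipped into Rdata.zip) and look them up in some other installed directory via system.file(). The helper name find_example_file and the directory name "extdata" are assumptions made for this example, not something any package in this thread is known to use; files placed under inst/<subdir> in the package source are installed to <pkg>/<subdir>.

```r
# Hypothetical helper (not part of any package discussed here): locate a raw
# example file shipped with an installed package, outside the data directory.
# The default 'subdir' is an assumed name chosen for illustration.
find_example_file <- function(pkg, filename, subdir = "extdata") {
  # system.file() returns "" when the file is not found in the package
  path <- system.file(subdir, filename, package = pkg)
  if (!nzchar(path)) {
    stop("file '", filename, "' not found in '", subdir,
         "' of package '", pkg, "'")
  }
  path
}

# Usage against a file every R installation has: the system Rprofile,
# which lives in the R subdirectory of the base package.
rprofile <- find_example_file("base", "Rprofile", subdir = "R")
file.exists(rprofile)
```

A lookup along these lines fails with a clear error message when the file is missing, rather than breaking package loading in .First.lib() because the expected file has been folded into a zip archive.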