What is the benefit of lazyload DB in this circumstance? I don't see it
if your .rda files have one data object each and are compressed.
Do you have a `data/filelist' index in your packages, as suggested by
200update.txt and `Writing R Extensions'? The slow examples I have seen
did not and so were wasting a lot of time preparing indices that could
have been supplied.
The design expectation was that large data packages would supply an index
and not use lazyloading for datasets but use separate compressed dumps for
each object. If there is some reason to change that, please send an RFC
for the requirements and a design.
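
For illustration only, the "separate compressed dumps" layout might be produced along these lines. This is a rough sketch and not taken from any BioC package: the object names and the temporary directory are made up, and it does not attempt to write the index file, whose format should be taken from `Writing R Extensions'.

```r
# Hypothetical objects standing in for large metadata tables.
objs <- list(
  probeIds     = c("p1", "p2", "p3"),
  chromLengths = c(chr1 = 1000L, chr2 = 800L)
)
e <- list2env(objs)

# Stand-in for the package's data/ directory (an assumption for the sketch).
data_dir <- file.path(tempdir(), "data")
dir.create(data_dir, showWarnings = FALSE, recursive = TRUE)

# One compressed .rda file per object, named after the object.
for (nm in names(objs)) {
  save(list = nm, envir = e,
       file = file.path(data_dir, paste0(nm, ".rda")),
       compress = TRUE)
}

# Loading a single file brings back only that one object.
env <- new.env()
loaded <- load(file.path(data_dir, "probeIds.rda"), envir = env)
```

With one object per file, loading a dataset reads only the file it needs, which is the behaviour the design above relies on.
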
Did this not occur during the alpha/beta period for 2.0.0 several months
ago or has something in BioC changed since? (I did ascertain that if
filelist was supplied then the BioC packages installed and loaded quickly
and smoothly.)
On Tue, 8 Feb 2005, James MacDonald wrote:
> Hi all,
>
> Bioconductor has several metaData packages that contain quite large
> data sets. In the past, these data were simply held in the /data
> directory of the package as .rda files and load()ed as needed.
> Converting to using lazy data loading may have memory and performance
> advantages, but for the larger metaData packages the installation is
> painfully slow (it has taken > 30 min to install a large metaData
> package on a PIII, 933 MHz box running Mandrake 9.2). The vast majority
> of the time is spent moving datasets to lazyload DB.
>
> It takes a long time to build the win32 packages as well, but once the
> package is built, the installation is quick, so there is no real problem
> for our end users. So my question is this: is there a mechanism that can
> be used to pre-build the lazyload DB for source packages to decrease the
> installation time for our end users?
--
Brian D. Ripley, ripley@stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595