thr3ads.net - R devel - [Rd] Pre-building lazyload DB [Feb 2005]

James MacDonald

2005-Feb-08 17:29 UTC

[Rd] Pre-building lazyload DB

Hi all,

Bioconductor has several metaData packages that contain quite large
data sets. In the past, these data were simply held in the /data
directory of the package as .rda files and load()ed as needed.
Converting to using lazy data loading may have memory and performance
advantages, but for the larger metaData packages the installation is
painfully slow (it has taken > 30 min to install a large metaData
package on a PIII, 933 MHz box running Mandrake 9.2). The vast majority
of the time is spent moving datasets to lazyload DB.

It takes a long time to build the win32 packages as well, but once the
package is built, the installation is quick, so there is no real problem
for our end users. So my question is this: is there a mechanism that can
be used to pre-build the lazyload DB for source packages to decrease the
installation time for our end users?

Best,

Jim



James W. MacDonald
Affymetrix and cDNA Microarray Core
University of Michigan Cancer Center
1500 E. Medical Center Drive
7410 CCGC
Ann Arbor MI 48109
734-647-5623



Prof Brian Ripley

2005-Feb-08 17:54 UTC

[Rd] Pre-building lazyload DB

What is the benefit of lazyload DB in this circumstance?  I don't see it
if your .rda files have one data object each and are compressed.

Do you have a `data/filelist' index in your packages, as suggested by 
200update.txt and `Writing R Extensions'?  The slow examples I have seen 
did not and so were wasting a lot of time preparing indices that could 
have been supplied.

The design expectation was that large data packages would supply an index 
and not use lazyloading for datasets but use separate compressed dumps for 
each object.  If there is some reason to change that, please send an RFC 
for the requirements and a design.
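
A minimal sketch of the layout described above, with one compressed dump per object plus an index file. The helper name `save_objects_separately` and the example objects are hypothetical, and the one-name-per-line index written here is only an assumption; the format the installer actually expects is the one given in `Writing R Extensions'.

```r
# Hypothetical helper (not part of R): write each element of a named list
# to its own compressed .rda file, then write an index of the object names.
save_objects_separately <- function(objs, data_dir, index_name = "filelist") {
  stopifnot(is.list(objs), !is.null(names(objs)), all(nzchar(names(objs))))
  dir.create(data_dir, recursive = TRUE, showWarnings = FALSE)
  for (nm in names(objs)) {
    # Put the object in a scratch environment so it is saved under its own name.
    env <- new.env()
    assign(nm, objs[[nm]], envir = env)
    save(list = nm, envir = env,
         file = file.path(data_dir, paste0(nm, ".rda")),
         compress = TRUE)
  }
  # ASSUMPTION: one object name per line; check the documented index format.
  writeLines(sort(names(objs)), file.path(data_dir, index_name))
  invisible(file.path(data_dir, paste0(names(objs), ".rda")))
}

# Usage with two small made-up objects, written to a temporary directory.
tmp <- file.path(tempdir(), "pkg", "data")
files <- save_objects_separately(
  list(probe2gene = c(a = "x"), gene2chr = 1:3), tmp)
```

With this layout each object can be load()ed from its own file when needed, so nothing has to be moved into a lazyload DB at install time.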

Did this not occur during the alpha/beta period for 2.0.0 several months 
ago or has something in BioC changed since?  (I did ascertain that if 
filelist was supplied then the BioC packages installed and loaded quickly 
and smoothly.)

On Tue, 8 Feb 2005, James MacDonald wrote:
> Hi all,
>
> Bioconductor has several metaData packages that contain quite large
> data sets. In the past, these data were simply held in the /data
> directory of the package as .rda files and load()ed as needed.
> Converting to using lazy data loading may have memory and performance
> advantages, but for the larger metaData packages the installation is
> painfully slow (it has taken > 30 min to install a large metaData
> package on a PIII, 933 MHz box running Mandrake 9.2). The vast majority
> of the time is spent moving datasets to lazyload DB.
>
> It takes a long time to build the win32 packages as well, but once the
> package is built, the installation is quick, so there is no real problem
> for our end users. So my question is this: is there a mechanism that can
> be used to pre-build the lazyload DB for source packages to decrease the
> installation time for our end users?
-- 
Brian D. Ripley,                  ripley@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
