Does anyone know a way to do the following:

Save a large number of R objects to a file (as save() does) but then read
back only a small named subset of them. As far as I can see, load() reads
back everything.

The context is:

I have an application which will generate a large number of large matrices
(approx 15000 matrices, each of dimension 2000*30). I can generate these
matrices using an R package I wrote, but it requires a large amount of
memory and is slow, so I want to do this only once. However, I then want
to do some subsequent processing, comprising a very large number of runs
in which small (~10) random selections of matrices from the previously
computed set are used for linear modelling. So I need a way to load back
named objects previously saved in a call to save(). I can't see any way
of doing this. Any ideas?

Thanks

Richard Mott

--
----------------------------------------------------
Richard Mott          | Wellcome Trust Centre
tel 01865 287588      | for Human Genetics
fax 01865 287697      | Roosevelt Drive, Oxford OX3 7BN
This may not be quite the answer you're looking for, but I sometimes save
each such object in its own file (usually <object.name>.RData). Then, if
you know which objects you're looking for, you know their names, and can
load the individual files.

Hope this helps,

Matt Wiener

-----Original Message-----
From: r-help-bounces at stat.math.ethz.ch
[mailto:r-help-bounces at stat.math.ethz.ch] On Behalf Of Richard Mott
Sent: Tuesday, June 14, 2005 9:03 AM
To: r-help at stat.math.ethz.ch
Subject: [R] loading and saving R objects

> Does anyone know a way to do the following:
>
> Save a large number of R objects to a file (as save() does) but then
> read back only a small named subset of them. As far as I can see,
> load() reads back everything. [rest of original message snipped]

______________________________________________
R-help at stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide!
http://www.R-project.org/posting-guide.html
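[A minimal sketch of this one-file-per-object approach; the object and
file names here are made up for illustration:]

```r
## save each object to its own file, named after the object
big.matrix.42 <- matrix(rnorm(2000 * 30), nrow = 2000)
save(big.matrix.42, file = "big.matrix.42.RData")

## later, in a fresh session, restore just that one object by name;
## load() recreates it under its original name
rm(big.matrix.42)
load("big.matrix.42.RData")
dim(big.matrix.42)   # 2000 30
```

Since the file name matches the object name, knowing which objects you
want tells you exactly which files to load.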
Richard Mott wrote:
> Does anyone know a way to do the following:
>
> Save a large number of R objects to a file (as save() does) but then
> read back only a small named subset of them. As far as I can see,
> load() reads back everything.

Save them to individual files when you generate them?

  for(i in 1:15000){
    m <- generateBigMatrix(i)
    filename <- paste("BigMatrix-", i, ".RData", sep='')
    save(m, file=filename)
  }

Note that load() will always overwrite 'm', so to load a sample of them
in you'll need to do something like this:

  bigSamples <- list()
  for(i in sample(15000, N)){
    filename <- paste("BigMatrix-", i, ".RData", sep='')
    load(filename)
    bigSamples[[i]] <- m
  }

But there may be a more efficient way to build up a big list like that,
I can never remember - get it working, then worry about optimisation.

I hope your filesystem is happy with 15000 files in it. I would dedicate
a folder or directory to just these objects' files, since it otherwise
becomes near impossible to see anything other than the big matrix
files...

Baz
I would suggest saving each object to an individual file with some sort
of systematic file name. That way, you can implement a rudimentary
key-value database and load only the objects you want. You might be
interested in the serialize() and unserialize() functions for this
purpose.

If having ~15000 files is not desirable, then you need a database like
GDBM. If you can live with something simpler, you might take a look at
my 'filehash' package at http://sandybox.typepad.com/software/. It
hasn't been tested much but it may suit your needs.

-roger

Richard Mott wrote:
> Does anyone know a way to do the following:
>
> Save a large number of R objects to a file (as save() does) but then
> read back only a small named subset of them. As far as I can see,
> load() reads back everything. [snip]

--
Roger D. Peng
http://www.biostat.jhsph.edu/~rpeng/
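[A minimal sketch of the serialize()/unserialize() idea; the file naming
scheme here is just an assumption for illustration:]

```r
## write one object per file through a binary connection
m <- matrix(rnorm(2000 * 30), nrow = 2000)
con <- file("matrix-0001.rds", "wb")
serialize(m, con)
close(con)

## read back only the file(s) you need; unlike load(), you control
## the name the restored object is assigned to
con <- file("matrix-0001.rds", "rb")
m.restored <- unserialize(con)
close(con)
identical(m, m.restored)   # TRUE
```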
On Tue, 14 Jun 2005, Prof Brian Ripley wrote:

> If your file system does not like 15000 files you can always save in a
> DBMS.

Or, switch to a better/more appropriate file system:

http://en.wikipedia.org/wiki/Comparison_of_file_systems

ReiserFS would allow you to store up to about 1.2 million files in a
directory.

> -----Original Message-----
> From: Prof Brian Ripley [mailto:ripley at stats.ox.ac.uk]
> Sent: Tuesday, June 14, 2005 10:41 AM
> To: Barry Rowlingson
> Cc: r-help at stat.math.ethz.ch; Richard Mott
> Subject: Re: [R] loading and saving R objects
>
> On Tue, 14 Jun 2005, Barry Rowlingson wrote:
>
> > Richard Mott wrote:
> >> Save a large number of R objects to a file (as save() does) but
> >> then read back only a small named subset of them. [snip]
> >
> > Save them to individual files when you generate them?
> > [code snipped]
> >
> > But there may be a more efficient way to build up a big list like
> > that, I can never remember - get it working, then worry about
> > optimisation.
>
> (Yes, use bigSamples <- vector("list", 15000) first.)
>
> .readRDS/.saveRDS might be a better way to do this, and avoids always
> restoring to "m".
>
> If your file system does not like 15000 files you can always save in a
> DBMS.
>
> I did once look into restoring just some of the objects in a save()ed
> file, but it is not really possible to do so efficiently due to
> sharing between objects.
>
> --
> Brian D. Ripley, ripley at stats.ox.ac.uk
> Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
> University of Oxford, Tel: +44 1865 272861 (self)
> 1 South Parks Road,      +44 1865 272866 (PA)
> Oxford OX1 3TG, UK       Fax: +44 1865 272595
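[A minimal sketch of the .readRDS/.saveRDS suggestion; those were the
dot-prefixed names at the time, and in later R versions the same
functions are exported as saveRDS()/readRDS(), used here:]

```r
m <- matrix(rnorm(2000 * 30), nrow = 2000)

## saveRDS() stores a single object without its name...
saveRDS(m, file = "BigMatrix-1.rds")

## ...so readRDS() returns the object and you choose what to call it;
## nothing in the workspace is silently overwritten
m1 <- readRDS("BigMatrix-1.rds")
identical(m, m1)   # TRUE
```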
On 6/14/05, Richard Mott <rmott at well.ox.ac.uk> wrote:

> Does anyone know a way to do the following:
>
> Save a large number of R objects to a file (as save() does) but then
> read back only a small named subset of them. As far as I can see,
> load() reads back everything. [snip]

Check out the g.data delayed data package on CRAN and the article in
R News 2/3.