I am new to R, so I have a maybe naive question:

If I have many large data.frames and I use only one or two per session, what's the best way to handle them? If all of them are stored in the current .RData, the system gets slow. On the other hand, I wouldn't like to make a separate package for the data.

Should I save them with save() and then remove them with rm()? Could I reload them afterwards?

Thanks for suggestions
Meinhard Ploner

ps my system: R 1.4.1 on mac/darwin.

-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject!) To: r-help-request at stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
Hello Meinhard,

> Should I save it with save() and then remove it with rm()?
> Could I reload it then?

Yes, that is possible with load(filename). Each object is restored under the same name it had when you called save(). In fact, you can save a whole list of variables with save() and load them all back again.

--
Joerg Maeder                     maeder at atmos.umnw.ethz.ch
Tel: +41 1 633 36 25             http://www.iac.ethz.ch/staff/maeder
PhD student at INSTITUTE FOR ATMOSPHERIC AND CLIMATE SCIENCE (IACETH)
ETH Zürich, Switzerland
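The save()/rm()/load() round trip described above can be sketched like this; `big1` is a stand-in name for any large data.frame:

```r
# Save a large data frame to its own file, drop it from the
# workspace to free memory, and reload it later.
big1 <- data.frame(x = 1:5, y = letters[1:5])

save(big1, file = "big1.rda")  # write the object to disk
rm(big1)                       # remove it from the workspace

load("big1.rda")               # restores the object under the name 'big1'
head(big1)
```

Note that load() restores objects under their original names, so there is no assignment on the left-hand side of the call.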
Save each dataset in its own rda file:

  save(mydataset1R, file="mydataset1R.rda")

Then check before deleting:

  mydataset1R.backup <- mydataset1R
  rm(mydataset1R)
  attach("mydataset1R.rda")

This will leave mydataset1R in pos=2 (check with search() and then ls(2)). Compare mydataset1R and mydataset1R.backup (for example, compare means by cols etc.). Then rm(mydataset1R.backup).

Do the same for the rest of your large datasets. Then q(), saving your workspace. Now .RData DOES NOT include your large datasets.

You might want to keep each mydataset*R.rda in a separate directory, and then start R from the appropriate directory. Your workspace will not have any of the mydataset*R objects, as they are no longer in .RData. Use attach("mydataset1R.rda") to bring the required dataset to pos=2.

A good advantage of keeping your large datasets in different rda files is that if you ever have a problem in R and .RData gets corrupted, your large files are safe. And a good advantage, but also a caution, of keeping the large R objects in pos != 1 is that save.image() WILL NOT save the large datasets (you must use save() specifying the environment; check help(save)).

Agus

Dr. Agustin Lobo
Instituto de Ciencias de la Tierra (CSIC)
Lluis Sole Sabaris s/n
08028 Barcelona SPAIN
tel 34 93409 5410
fax 34 93411 0012
alobo at ija.csic.es

On Fri, 22 Feb 2002, Meinhard Ploner wrote:

> I am new on R, so I have a maybe naive question:
> if I have many large data.frames and I use only one or two per session,
> what's the best way?
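The steps above can be run end to end as follows; `mydataset1R` is the example name used in the post, filled here with toy data:

```r
# Keep a large data.frame out of .RData but still usable via attach().
mydataset1R <- data.frame(a = 1:10, b = (1:10)^2)

save(mydataset1R, file = "mydataset1R.rda")  # its own rda file
mydataset1R.backup <- mydataset1R            # safety copy before deleting
rm(mydataset1R)

attach("mydataset1R.rda")   # file's objects now visible at pos = 2
search()                    # "file:mydataset1R.rda" appears in the path
ls(2)                       # lists mydataset1R

# Verify the attached copy matches the backup, then drop the backup.
identical(mydataset1R, mydataset1R.backup)   # TRUE
rm(mydataset1R.backup)
```

Because the attached copy lives at pos = 2 rather than in the workspace, quitting with the workspace saved leaves .RData small, exactly as described above.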
On Fri, Feb 22, 2002 at 11:23:29AM +0100, Meinhard Ploner wrote:

> if I have many large data.frames and I use only one or two per session,
> what's the best way?
> If all are stored in the actual .Rdata, the system gets slow.
> On the other hand, I wouldn't like to make a separate package for the
> data.

Take a look at the package "g.data" at your local CRAN:

  g.data: Delayed-Data Packages

  Create and maintain delayed-data packages (DDP's). Data stored in a
  DDP are available on demand, but do not take up memory until
  requested. You attach a DDP with g.data.attach(), then read from it
  and assign to it in a manner similar to S-Plus, except that you must
  run g.data.save() to actually commit to disk.

  Version: 1.2
  Date: 2001-11-30
  Author: David Brahm <brahm at alum.mit.edu>

--
Albyn Jones                      jones at reed.edu
http://www.reed.edu/~jones
Reed College, Portland OR 97202
(503)-771-1112 x7418
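A rough sketch of the DDP workflow the package description outlines. Only g.data.attach() and g.data.save() are named in the description above; the directory name and the exact calling conventions here are assumptions, so check help(g.data.attach) after installing:

```r
# Sketch only: assumes the contributed package "g.data" is installed.
# "mydata.DDP" is a hypothetical directory name for the delayed-data
# package; objects assigned at pos = 2 are committed to disk by
# g.data.save(), per the package description.
library(g.data)

g.data.attach("mydata.DDP")                   # attach the DDP at pos = 2
assign("big1", data.frame(x = 1:3), pos = 2)  # assign into the DDP
g.data.save()                                 # commit pos = 2 to disk
detach(2)
```

The point of the design is that attaching the DDP costs almost no memory: each object is read from disk only when first requested.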