Is R file IO slow in general or am I missing something? It takes me 5 minutes to do a load(MYFILE) where MYFILE is a 27 MB Rdata file. Is there any way to speed this up?

The one idea I have is having R call a C or Perl routine, reading the file in that language, converting the data into R objects, then sending them back into R. This is more work than I want to do, however, in loading Rdata files.

Any ideas would be appreciated.

Ramzi Aboud
University of Rochester
Just an idea: Two things that can slow down save()/load() are saving in ASCII format and saving in a compressed binary format. If either is the case for MYFILE, try resaving it in a non-compressed binary format. See ?save for details.

/HB

On 3/1/07, ramzi abboud <ramziabb at yahoo.com> wrote:
> Is R file IO slow in general or am I missing something? It takes me
> 5 minutes to do a load(MYFILE) where MYFILE is a 27 MB Rdata file.
> Is there any way to speed this up?
> [...]
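The suggestion to resave MYFILE in a non-compressed binary format could be tried along the lines of the sketch below. It is only an illustration under stated assumptions: the data are synthetic, the file names are temporary stand-ins for MYFILE, and the helper time_load() is hypothetical, not part of R. It saves the same object in ASCII, compressed binary, and uncompressed binary form, times load() on each, and then shows how an existing file might be resaved uncompressed.

```r
# Synthetic stand-in for the contents of MYFILE (assumption: numeric data).
x <- runif(1e5)

# Hypothetical file names in the session's temporary directory.
paths <- c(ascii = tempfile(fileext = ".RData"),
           gzip  = tempfile(fileext = ".RData"),
           plain = tempfile(fileext = ".RData"))

save(x, file = paths[["ascii"]], ascii = TRUE)                     # ASCII format
save(x, file = paths[["gzip"]],  ascii = FALSE, compress = TRUE)   # compressed binary
save(x, file = paths[["plain"]], ascii = FALSE, compress = FALSE)  # uncompressed binary

# Hypothetical helper: time load() into a fresh environment so the
# workspace is not overwritten, and check that the object round-trips.
time_load <- function(path) {
  env <- new.env()
  elapsed <- system.time(load(path, envir = env))[["elapsed"]]
  list(elapsed = elapsed, same = identical(env$x, x))
}
results <- lapply(paths, time_load)

# File sizes on disk, in bytes, for comparison.
sizes <- setNames(file.info(paths)$size, names(paths))

# Resaving an existing file (here the compressed one stands in for MYFILE)
# in uncompressed binary format.
env <- new.env()
load(paths[["gzip"]], envir = env)
f_resaved <- tempfile(fileext = ".RData")
save(list = ls(env, all.names = TRUE), envir = env,
     file = f_resaved, ascii = FALSE, compress = FALSE)

print(sapply(results, function(r) r$elapsed))
print(sizes)
```

Timings will vary by machine and by what the objects contain, so the numbers are only meaningful relative to each other on the same system.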
A 27MB .RData file is relatively big, in my experience. What do you think is slow? Maybe it's your computer that is slow?

-roger

ramzi abboud wrote:
> Is R file IO slow in general or am I missing something? It takes me
> 5 minutes to do a load(MYFILE) where MYFILE is a 27 MB Rdata file.
> Is there any way to speed this up?
> [...]

-- 
Roger D. Peng | http://www.biostat.jhsph.edu/~rpeng/
It is not slow on my system. The file was 34MB on disk and took about 37 seconds to write out (probably mostly disk I/O on my laptop) and 12 seconds to read in after I flushed the system cache.

> x <- runif(27e6/4)  # creates a 34MB file on disk
> object.size(x)
[1] 54000024
> system.time(save.image('test.xx'))
[1] 23.16  0.40 37.93    NA    NA
> gc()
          used (Mb) gc trigger (Mb) max used (Mb)
Ncells  258153  6.9     531268 14.2   350000  9.4
Vcells 6864025 52.4    8380235 64.0  6865364 52.4
> rm(x)
> gc()
          used (Mb) gc trigger (Mb) max used (Mb)
Ncells  258150  6.9     531268 14.2   350000  9.4
Vcells  114019  0.9    6704187 51.2  6865364 52.4
> system.time(load('test.xx'))  # without flushing system cache it takes 4 seconds
[1] 3.64 0.01 4.07   NA   NA
> gc()
          used (Mb) gc trigger (Mb) max used (Mb)
Ncells  258153  6.9     531268 14.2   350000  9.4
Vcells 6864025 52.4    7911488 60.4  6870189 52.5
> system.time(load('test.xx'))  # after flushing system cache
[1]  3.48  0.11 12.12    NA    NA
>

So it must be something else on your system.

Jim Holtman

"What is the problem you are trying to solve?"

----- Original Message ----
From: ramzi abboud <ramziabb@yahoo.com>
To: r-help@stat.math.ethz.ch
Sent: Thursday, March 1, 2007 12:22:22 PM
Subject: [R] R File IO Slow?

Is R file IO slow in general or am I missing something? It takes me 5 minutes to do a load(MYFILE) where MYFILE is a 27 MB Rdata file. Is there any way to speed this up?
[...]
______________________________________________
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
On Thu, 2007-03-01 at 09:22 -0800, ramzi abboud wrote:
> Is R file IO slow in general or am I missing something? It takes me
> 5 minutes to do a load(MYFILE) where MYFILE is a 27 MB Rdata file.
> Is there any way to speed this up?
> [...]

Here are some timings on my system, which runs Linux on a 3.2 GHz P4 with 2 GB of RAM and a 7200 rpm HD. I typically get around 28 MB/sec throughput on this drive, which is about 15% lower than normal, as it is an encrypted partition using 256-bit AES.

> Vec <- 1:15000000
> system.time(save(Vec, file = "Vec.RData"))
[1] 33.297  0.565 38.889  0.000  0.000

# File is ~29 MB
> file.info("Vec.RData")$size
[1] 30112009

> system.time(load("Vec.RData"))
[1] 5.607 0.167 6.575 0.000 0.000

Not terribly burdensome...

You might want to be sure that you are not low on RAM, resulting in a lot of swapping to disk, or that you do not simply have a slow drive.

HTH,

Marc Schwartz
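One rough way to check whether a slow load() is waiting on the disk or on swapping rather than using the CPU is to compare the CPU time and the elapsed time that system.time() reports, and to compare the file size on disk with the size of the loaded objects in memory. The sketch below is an illustration only: it uses a synthetic object and a temporary file as stand-ins for MYFILE, and the "waiting" figure is a heuristic (it also includes time taken by other processes), not a precise measurement.

```r
# Synthetic stand-in for MYFILE, written to a temporary file.
f <- tempfile(fileext = ".RData")
x <- runif(1e6)
save(x, file = f)
rm(x)

# Load into a fresh environment and keep the full timing breakdown.
env <- new.env()
tm <- system.time(load(f, envir = env))

cpu     <- sum(tm[c("user.self", "sys.self")])  # time spent computing
wall    <- tm[["elapsed"]]                      # wall-clock time
waiting <- wall - cpu  # heuristic: disk reads, swapping, other processes

# Size on disk versus size of the loaded objects in memory (bytes).
disk_bytes <- file.info(f)$size
mem_bytes  <- sapply(ls(env), function(nm) {
  as.numeric(object.size(get(nm, envir = env)))
})

print(c(cpu = cpu, elapsed = wall, waiting = waiting))
print(disk_bytes)
print(mem_bytes)
print(gc())  # current memory use of the R session
```

If the elapsed time is far larger than the CPU time, or the in-memory size approaches the machine's available RAM, the drive or swapping is a more likely culprit than load() itself.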