Hi all,

Using big vectors (more than 4 GB) is unfortunately not possible under Windows or other OSes if not enough RAM exists. Would it be possible to implement a new data type in R that behaves like a vector but, instead of holding the data in memory, keeps it in a file? When elements are accessed, the vector type would fetch them from the file automatically. There is a package out there (named ff), but the user has to declare the boundaries of the region being accessed, which is a disadvantage.

Greetings.
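A minimal sketch of the idea in the post above, using only base R connections. The helper names `write_disk_vector` and `read_disk_slice` are hypothetical and not part of ff or any other package. The data lives in a flat binary file and only the requested slice is read into memory.

```r
# Hypothetical helpers: keep doubles in a flat binary file and read back
# a slice without loading the whole vector into memory.
write_disk_vector <- function(path, x) {
  con <- file(path, open = "wb")
  on.exit(close(con))
  writeBin(as.double(x), con, size = 8L)
  invisible(path)
}

read_disk_slice <- function(path, start, n) {
  con <- file(path, open = "rb")
  on.exit(close(con))
  # The byte offset is computed as a double, so it is not capped by
  # the range of an R integer.
  seek(con, where = (start - 1) * 8, origin = "start")
  readBin(con, what = "double", n = n, size = 8L)
}

path <- tempfile(fileext = ".bin")
write_disk_vector(path, seq_len(1000) / 10)

# Read elements 501..503 only.
slice <- read_disk_slice(path, start = 501, n = 3)
print(slice)
```

This sketch still requires the caller to say which range to read, which is the same kind of boundary declaration the post objects to. Hiding that behind ordinary `[` indexing would need a class with its own subsetting method on top of these helpers.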
_ wrote:
> Hi all,
> Using big vectors (more than 4GB) is unfortunately not possible under
> Windows or other OS's if not enough RAM exists.

This is NOT true. The limit is not RAM alone, but RAM plus swap space. With 500 GB hard disks at about $100, the more serious limitation is a 32-bit OS. Speed is a different consideration, but I doubt that taking over what the OS is supposed to do will be the real answer.

Paul

______________________________________________
R-devel at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
On Feb 14, 2008, at 6:32 AM, _ wrote:
> There is a package out there (named ff) but the accessed boundary have
> to be declared by the user this is a disadvantage.

I don't think you have been reading the documentation carefully enough: ff doesn't impose any limits itself. Whatever limits you hit with it are due to the OS and/or R, so you cannot write the package you describe without hitting those same limits. They are as follows: the size of an integer in R, which limits the length of a single vector (2^31-1, about 2G entries on 32-bit machines), and the file size limit of your OS. The former is a really hard limit; the only way to overcome it (without modifying R) is to use multiple indices (which the ff package suggests). You can overcome the file size limit by simply using multiple files (or using a more reasonable OS).

Cheers,
Simon
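To make the two workarounds above concrete, here is a small sketch of how a position beyond the integer limit could be split into a chunk number (for example, one file per chunk) and an index within that chunk. The function `locate` and the chunk length are assumptions chosen for illustration; this is not how ff is implemented.

```r
# Largest value an R integer can hold: 2^31 - 1.
max_int <- .Machine$integer.max
print(max_int)

# Hypothetical helper: map a global position (held as a double, so it can
# exceed max_int) onto a 1-based chunk number and a 1-based index within
# that chunk.
locate <- function(pos, chunk_len) {
  chunk  <- (pos - 1) %/% chunk_len + 1
  offset <- (pos - 1) %% chunk_len + 1
  c(chunk = chunk, offset = offset)
}

# Position 5e9 does not fit in an R integer, but both parts of the
# result do when each chunk holds 2^30 elements.
loc <- locate(5e9, chunk_len = 2^30)
print(loc)
```

With one file per chunk, the same mapping also keeps each file below whatever size limit the OS imposes, provided the chunk length is chosen accordingly.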
You may want to look at the SQLiteDF package. It lets you put your data into an SQLite database and treat it like a normal vector or data frame inside R.

Hope this helps,

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.snow at imail.org
(801) 408-8111

> -----Original Message-----
> From: r-devel-bounces at r-project.org
> [mailto:r-devel-bounces at r-project.org] On Behalf Of _
> Sent: Thursday, February 14, 2008 4:32 AM
> To: r-devel at r-project.org
> Subject: [Rd] Vector binding on harddisk