Dear R users,

I've searched the R site for help on this topic, but it is hard to find a precise answer to my questions.

What are the best options for overcoming RAM limitations when using R on "large" data sets (such as 2 or 3 million records)?

- Is the freely available version of R (as opposed to the one provided by REvolution Computing) compatible with a 64-bit Windows machine? And if I add enough RAM on Win64, would this virtually solve my memory limitation problems?
- Is a Unix-like platform a better option than Win64? Again, would this solve my memory limitation problems?
- Is there any better option?

Thanks in advance for your help,
Lars.
Hello Lars,

On 2009.11.28 18:53:09, Lars Bishop wrote:
> Dear R users,
>
> I've searched the R site for help on this topic but it is hard to find a
> precise answer for my questions.
>
> Which are the best options to overcome the RAM memory limitation problems
> when using R on "large" data sets (such as 2 or 3 million records)?

I think you'll have to provide a more precise definition of "large": are we talking 1 GB of records or 100 GB? Also, it would help to know what you are trying to do with the data. The documentation for the biglm and bigmemory packages may provide some help.

> - Is the free available version of R (as opposed to the one provided
> by REvolution Computing) compatible with a windows 64-bit machine?
> And if I increase the RAM memory enough on win-64, would this
> virtually solve my memory limitation problems?

I'm not familiar enough with the commercial version of R, but I do believe it provides better support for parallelization, which may be of some help. I don't think, however, that this version will "solve" your problem.

> - Is a Unix-like platform a better option than win-64? Again, would
> this solve my memory limitation problems?

Possibly, but Win64 should provide plenty of memory (I believe Windows 7 Ultimate can use up to 192 GB of memory). You just have to find the system that can take that much... With Unix/Linux you can probably cut back some overhead, and the memory management is most likely better, but unless you need to go over 192 GB of memory, you don't necessarily have to move to a different platform.

~Jason

--
Jason W. Morgan
Graduate Student
Department of Political Science
The Ohio State University
154 North Oval Mall
Columbus, Ohio 43210
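To make the question of how "large" the data really is more concrete, here is a rough back-of-the-envelope sketch in base R. It assumes every column is stored as a double (8 bytes per value) and ignores per-object overhead; the column counts are hypothetical, since the thread only gives the number of records. The helper name estimate_gb is made up for this illustration.

```r
# Rough in-memory size of an all-numeric table:
# rows * columns * 8 bytes per double, converted to GB (1024^3 bytes).
# This ignores object overhead and any temporary copies R makes.
estimate_gb <- function(n_rows, n_cols, bytes_per_value = 8) {
  n_rows * n_cols * bytes_per_value / 1024^3
}

# Hypothetical column counts for a 3-million-record data set.
for (n_cols in c(10, 50, 200)) {
  cat(sprintf("%d columns: about %.2f GB\n", n_cols, estimate_gb(3e6, n_cols)))
}

# Cross-check the arithmetic on a small real object.
x <- matrix(0, nrow = 1e4, ncol = 10)
print(object.size(x), units = "Mb")
```

Under these assumptions, 3 million records come to roughly 0.2 GB at 10 columns and about 4.5 GB at 200 columns, so the answer depends heavily on how wide the records are. Many operations also make temporary copies, so the working requirement is usually a multiple of the raw size.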
On 28/11/2009 6:53 PM, Lars Bishop wrote:
> Dear R users,
>
> I've searched the R site for help on this topic but it is hard to find a
> precise answer for my questions.
>
> Which are the best options to overcome the RAM memory limitation problems
> when using R on "large" data sets (such as 2 or 3 million records)?

There are several packages for handling datasets without keeping them in RAM: bigmemory, ff, etc. You may find that you need to write functions to handle your data a block at a time, or you may find they have already been written, e.g. biglm. You can also keep your data in a database and just retrieve it a block at a time for processing.

> - Is the free available version of R (as opposed to the one
> provided by REvolution Computing) compatible with a windows 64-bit machine?
> And if I increase the RAM memory enough on win-64, would this virtually
> solve my memory limitation problems?

It is compatible with Win64, but it is a 32 bit application. It benefits from running on 64 bit Windows (because Windows can get out of the way and give it most of 4 GB to work in), but not as much as a true 64 bit application. So it probably doesn't solve your problem.

> - Is a Unix-like platform a better option than win-64? Again, would
> this solve my memory limitation problems?

There are builds available for 64 bit Linux and MacOS (and maybe others); they'd likely help more than running 32 bit R in Win64. I don't know how they compare to running Revolution's 64 bit R in Win64.

Duncan Murdoch
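As an illustration of handling data "a block at a time", here is a minimal sketch using only base R. It does not use the bigmemory, ff, or biglm APIs. The demonstration file, the column names, and the helper chunked_mean are all made up for this example. It computes a column mean while holding only one block of rows in memory at once.

```r
# Build a small demonstration file (hypothetical data standing in
# for the real records, which would be far larger).
path <- tempfile(fileext = ".csv")
set.seed(1)
demo <- data.frame(id = 1:10000, value = rnorm(10000))
write.csv(demo, path, row.names = FALSE, quote = FALSE)

# Mean of one column, reading at most chunk_size rows at a time.
chunked_mean <- function(path, column, chunk_size = 1000) {
  con <- file(path, open = "r")
  on.exit(close(con))

  # The first line holds the column names.
  header <- strsplit(readLines(con, n = 1), ",")[[1]]

  total <- 0
  n <- 0
  repeat {
    # readLines continues from where the open connection left off.
    lines <- readLines(con, n = chunk_size)
    if (length(lines) == 0) break

    # Parse just this block into a data frame.
    block <- read.csv(text = lines, header = FALSE, col.names = header)
    total <- total + sum(block[[column]])
    n <- n + nrow(block)
  }
  total / n
}

result <- chunked_mean(path, "value")
print(result)
```

The same pattern works for any statistic that can be accumulated across blocks (sums, counts, cross-products). Statistics that need all the data at once, such as an exact median, need a different approach.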