Greetings,

I am acquiring a new computer in order to conduct data analysis. I currently have a 32-bit Vista OS with 3 GB of RAM and I consistently run into memory allocation problems. I will likely be required to run Windows 7 on the new system, but have flexibility as far as hardware goes. Can people recommend the best hardware to minimize memory allocation problems? I am leaning towards dual core on a 64-bit system with 8 GB of RAM. Given the Windows constraint, is there anything I am missing here?

I know that Windows limits the RAM that a single application can access. Does this fact override many hardware considerations? Any way around this?

Thanks,

JD

--
View this message in context: http://n4.nabble.com/Best-Hardware-OS-For-Large-Data-Sets-tp1572129p1572129.html
Sent from the R help mailing list archive at Nabble.com.
On Feb 27, 2010, at 12:47 PM, J. Daniel wrote:

> Greetings,
>
> I am acquiring a new computer in order to conduct data analysis. I
> currently have a 32-bit Vista OS with 3 GB of RAM and I consistently run
> into memory allocation problems. I will likely be required to run
> Windows 7 on the new system, but have flexibility as far as hardware
> goes. Can people recommend the best hardware to minimize memory
> allocation problems? I am leaning towards dual core on a 64-bit system
> with 8 GB of RAM. Given the Windows constraint, is there anything I am
> missing here?

Perhaps the fact that the stable CRAN version of R for (any) Windows is 32-bit? A 64-bit OS would expand your memory space somewhat, but not as much as you might naively expect.

(There was a recent announcement that an experimental version of a 64-bit R was available (even with an installer), and there are vendors who will supply a 64-bit Windows version for an unannounced price. The fact that there was, as of January, no support for binary packages seems to be a bit of a constraint on who would be able to "step up" to full 64-bit R capabilities on Win64. I'm guessing from your failure to mention potential software constraints that you are not among that more capable group, as I am also not.)

https://stat.ethz.ch/pipermail/r-devel/2010-January/056301.html
https://stat.ethz.ch/pipermail/r-devel/2010-January/056411.html

> I know that Windows limits the RAM that a single application can access.
> Does this fact override many hardware considerations? Any way around
> this?
>
> Thanks,
>
> JD

--
David Winsemius, MD
Heritage Laboratories
West Hartford, CT
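For what it's worth, you can see the per-process ceiling from inside R itself. A minimal sketch, assuming a Windows build of R from this era (memory.limit() and memory.size() are Windows-only functions; the exact numbers you see will depend on your machine):

```r
## Windows-only memory diagnostics in R:
memory.limit()           # current cap on total allocation, in MB
memory.size(max = TRUE)  # most memory obtained from the OS so far, in MB

## On a 32-bit build the cap can only be raised to roughly 3-3.5 GB
## in practice, no matter how much RAM the machine has:
memory.limit(size = 4095)

## Per-object footprint, useful for planning:
x <- numeric(1e6)
object.size(x)           # about 8 MB: 1e6 doubles at 8 bytes each
```

So on 32-bit Windows the hardware is not the binding constraint once you pass ~4 GB; the address space of the 32-bit process is.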
On 27/02/10 17:47, J. Daniel wrote:

> Greetings,
>
> I am acquiring a new computer in order to conduct data analysis. I
> currently have a 32-bit Vista OS with 3 GB of RAM and I consistently run
> into memory allocation problems. I will likely be required to run
> Windows 7 on the new system, but have flexibility as far as hardware
> goes. Can people recommend the best hardware to minimize memory
> allocation problems? I am leaning towards dual core on a 64-bit system
> with 8 GB of RAM. Given the Windows constraint, is there anything I am
> missing here?
>
> I know that Windows limits the RAM that a single application can access.
> Does this fact over-ride many hardware considerations? Any way around
> this?

You are right about the RAM limit: the way around it is to move to a 64-bit operating system. There is an experimental build of core R for 64-bit Windows [1] and there is at least one commercial version available [2]. (You can run the 32-bit version of R on 64-bit Windows, but it will only use up to 3.5 GB of memory [3].)

How much memory you should have really depends on your data sets and what you do with them. I have 16 GB on my 4-core workstation and frequently use it up, but I do marketing analysis on tens of millions of telco customers. I overflow to AWS, which has instances with 7.5 GB, 15 GB, 17 GB, 34 GB, and 68 GB of memory [4], which you may consider as guides for your own system(s).

I would reconsider the operating system constraint. A Unix-like 64-bit operating system (I'm a Fedora guy, but anything should work well) may be a better long-term solution and is likely to give you easier access to cloud computing (e.g. AWS or your own cluster) when your processing requirements grow. Also, 64-bit seems to be better supported in that environment.

In all instances you are still going to be constrained by R limiting a vector to 2^31-1 elements and, worse, representing a matrix as a vector, which means the product of the matrix dimensions is also limited to 2^31-1.
What you gain is the ability to have many more vectors of up to 2^31-1 elements available at once.

Hope this helps a little,

Allan

[1] http://cran.r-project.org/bin/windows64/contrib/
[2] http://www.revolution-computing.com/
[3] See FAQ 2.9 at http://cran.r-project.org/bin/windows/base/rw-FAQ.html
[4] http://aws.amazon.com/ec2/instance-types/

> Thanks,
>
> JD
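The 2^31-1 limit Allan describes can be checked directly at the R console; a small sketch (the figures are plain arithmetic, not system-dependent):

```r
## R's vector length limit is the largest 32-bit signed integer:
.Machine$integer.max      # 2147483647, i.e. 2^31 - 1

## A matrix is stored as a single vector, so nrow * ncol is bound
## by the same limit:
50000 * 50000 > 2^31 - 1  # TRUE: a 50000 x 50000 matrix cannot exist

## Even a single maximal vector of doubles needs roughly 16 GB:
(2^31 - 1) * 8 / 2^30     # bytes -> GiB, approximately 16
```

This is why more RAM on a 64-bit system buys you more (and larger) objects in memory simultaneously, but not any single object beyond 2^31-1 elements.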
JD,

I would recommend 64-bit, and Windows 7 on a quad-core system has been surprisingly stable for me. Allan's points are also spot on; I would suggest reading the references he provided.

Increased memory will increase your ability to handle n vectors of up to 2^31-1 elements each, a limit that was a design decision to keep results comparable between 32-bit and 64-bit systems. This means larger-scale data will need to be handled in chunks, but a 64-bit system lets you handle more of those chunks in memory simultaneously.

As far as memory allocation problems go, this is just the next jump in the learning curve. Technically, you can handle large-scale data on a 32-bit system with 2 GB of RAM. It would just mean your chunks may be smaller, the computations would take a little longer (more read calls to the disk, more careful planning to avoid paging in Windows environments), and your parameter estimates would be based on perhaps more aggregates. That's not to say chunk size could not be made the same on 64-bit systems with loads of RAM so that the estimates would remain comparable.

Welcome to the world of portions, JD.

Sincerely,

KeithC.

-----Original Message-----
From: J. Daniel [mailto:jdlecy at gmail.com]
Sent: Saturday, February 27, 2010 10:47 AM
To: r-help at r-project.org
Subject: [R] Best Hardware & OS For Large Data Sets
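To make the chunking idea above concrete, here is a minimal sketch in base R. The file name "bigdata.csv", the first-column running sum, and the chunk size of 100,000 rows are all hypothetical placeholders; the point is that only one chunk plus the running summaries ever live in memory:

```r
## Read a large CSV in fixed-size chunks, keeping only running
## summaries in memory (file name and chunk size are illustrative).
con <- file("bigdata.csv", open = "r")
hdr <- strsplit(readLines(con, n = 1), ",", fixed = TRUE)[[1]]

total <- 0  # e.g. a running sum of the first (numeric) column
repeat {
  chunk <- tryCatch(
    read.csv(con, header = FALSE, nrows = 100000, col.names = hdr),
    error = function(e) NULL)            # read.csv errors at end of file
  if (is.null(chunk) || nrow(chunk) == 0) break
  total <- total + sum(chunk[[1]])       # accumulate, then let chunk go
}
close(con)
```

Because the connection stays open across iterations, each read.csv call resumes where the previous one stopped, so every pass sees the next block of rows; the same loop works unchanged on a 2 GB 32-bit box or a 16 GB 64-bit one, only the affordable chunk size differs.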