Hello, I am new to R. Can anyone give me an idea of how R handles a large dataset (e.g. a couple of gigabytes)? Thanks a lot!
Best,
Mingjun
I believe this is determined mainly by how much memory your computer has, not by R itself.
______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
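To put a rough number on "how much memory", here is a minimal sketch that estimates the RAM needed to hold a purely numeric table. It assumes every value is stored as a double (8 bytes); the helper name `estimate_gb` and the row and column counts are made up for illustration.

```r
# Rough estimate of the RAM needed to hold a purely numeric table in R.
# Assumption: every value is stored as a double, which takes 8 bytes.
# estimate_gb is a hypothetical helper, not part of base R.
estimate_gb <- function(n_rows, n_cols, bytes_per_value = 8) {
  n_rows * n_cols * bytes_per_value / 1024^3
}

# Hypothetical dataset: 10 million rows by 25 numeric columns
estimate_gb(1e7, 25)

# Sanity check against a small real object
x <- matrix(0, nrow = 1000, ncol = 25)
print(object.size(x), units = "Kb")
```

The estimate covers only the data itself. Any working copies made during an analysis come on top of it.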
It depends on the RAM in your machine, and on your definition of 'handle'. You may be able to load a very large dataset into R but still be unable to use some functions that require additional memory. A vague answer to a vague question... :)
Julian
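A small sketch of the "additional memory" point: an object that fits in RAM can still cause trouble once a function makes a full-size copy of it. The vector length is arbitrary and the names `x`, `y` and `mem` are just for illustration.

```r
# One million doubles: roughly 8 MB of data
x <- rnorm(1e6)
print(object.size(x), units = "Mb")

# sort() returns a new vector of the same length, so for a moment
# both the original and the sorted copy are held in memory
y <- sort(x)
print(object.size(y), units = "Mb")

# gc() runs a garbage collection and reports the memory R is using;
# the first two columns are the amount used and its size in Mb
mem <- gc()
mem[, 1:2]
```

Scaled up to a couple of gigabytes, the same pattern means a step that copies the data once can need about twice the dataset's size in free memory.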
In answer to your subject line: yes. Gigantic (multi-terabyte) datasets are another matter, but large is likely to be OK. How depends on what you are trying to do. If you are running a 64-bit OS with a terabyte or more of RAM, you shouldn't notice unless you are doing something messy (all bets are off if you are attempting to list all permutations of the data!). If you are running a 32-bit OS with a maximum size of 2 GB set by the OS, then the answer is: with much care, considerable cunning, and possibly modification of your intermediate goals.
Q1 is always: why are you dealing with such a big dataset? Is all the data equally informative? Just because you can collect data doesn't mean you have to, and if you do insist on collecting it (presumably automatically), that doesn't mean it will be useful.
-- 
Dr Richard Rowe
Zoology & Tropical Ecology
School of Marine & Tropical Biology
James Cook University
Townsville 4811 AUSTRALIA
ph +61 7 47 81 4851
fax +61 7 47 25 1570
JCU has CRICOS Provider Code 00117J
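As one possible reading of "care and cunning", the sketch below first checks whether the running R build is 32-bit or 64-bit, then reads only part of a file instead of all of it. The file is a small temporary CSV written just for the example, and its column names are hypothetical.

```r
# Pointer size in bytes: 8 on a 64-bit build of R, 4 on a 32-bit build
bits <- .Machine$sizeof.pointer * 8
bits

# Write a small hypothetical CSV to a temporary file for illustration
path <- tempfile(fileext = ".csv")
write.csv(data.frame(id = 1:1000, value = rnorm(1000), note = "x"),
          path, row.names = FALSE)

# Read only the first 100 rows and skip the 'note' column entirely;
# a colClasses entry of "NULL" tells read.csv to drop that column
part <- read.csv(path, nrows = 100,
                 colClasses = c("integer", "numeric", "NULL"))
dim(part)

# Remove the temporary file
unlink(path)
```

Reading fewer rows and dropping uninformative columns cuts the memory needed before any analysis starts, which ties in with the question of whether all the data is equally informative.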