Hi, does anyone have experience with loading a dataset larger than 2GB into R? My organization is a SAS-oriented shop and I'm in the process of switching it to R. One of the complaints about R has always been its inability to handle large datasets (>GB) efficiently. I would like some comments from someone with experience of working on >2GB datasets in R.

Thanks.
Apollo
Absolutely no problem on 64-bit OSes with enough memory. Many 32-bit OSes have problems with files larger than 2GB.

Please do read the posting guide and tell us basic facts like which OS you are running on, so we don't have to speculate to answer your question. Also, what do you want to do with the dataset? This matters crucially.

On Wed, 24 Nov 2004, apollo wong wrote:

> Hi, do any one have experience with loading dataset
> that is larger than 2GB into R. My organization is a
> SAS oriented shop and I'm in the process of switching
> it to R. One of the complain about R has always been
> it's inability to handle large dataset (>GB)
> efficiently. I would like some comments from someone
> with experience of working on >2GB dataset in R.

--
Brian D. Ripley, ripley at stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595
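A quick way to check the two conditions mentioned in this reply (a 64-bit build and enough memory) is sketched below. It is only a rough illustration: the row and column counts are hypothetical, and the size estimate assumes an all-numeric table stored as doubles.

```r
# Pointer size of the running R build: 8 bytes on 64-bit, 4 on 32-bit.
pointer_bytes <- .Machine$sizeof.pointer
is_64bit <- pointer_bytes == 8

# Rough in-memory size of an all-numeric table: 8 bytes per double.
# The dimensions below are hypothetical, chosen only for illustration.
n_rows <- 5e7
n_cols <- 10
approx_gb <- n_rows * n_cols * 8 / 1024^3

# Sanity check of the 8-bytes-per-value rule on a small vector
# (object.size also counts a small fixed header).
small <- numeric(1e6)
bytes_per_value <- as.numeric(object.size(small)) / length(small)

cat("64-bit build:", is_64bit, "\n")
cat("Estimated size in GB:", round(approx_gb, 2), "\n")
```

Many operations make temporary copies, so the working memory needed is usually a multiple of this estimate.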
Hi, I've been using large datasets (>GB); I store them in MySQL databases and use RMySQL to access them. My feeling is that most of the time you don't need to keep the whole dataset in your workspace; you need to access parts of it, or aggregate it in some way, before running an analysis. So use the best of each world: databases to store the data and perform partial selections and aggregations, and R for the statistical analysis. You'll be amazed at the speed of the two together (R & MySQL).

Regards,
EJ
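The workflow described in this reply can be sketched roughly as follows. To keep the example runnable without a MySQL server, it uses an in-memory SQLite database through DBI as a stand-in; that substitution, the `sales` table, and its columns are all assumptions made for illustration. With RMySQL, only the `dbConnect()` call should need to change, as noted in the comment.

```r
library(DBI)

# Stand-in for a MySQL server: an in-memory SQLite database (assumption,
# so the sketch runs anywhere). With RMySQL the connection would instead be
#   con <- dbConnect(RMySQL::MySQL(), dbname = "mydb", host = "localhost",
#                    user = "me", password = "secret")  # hypothetical values
con <- dbConnect(RSQLite::SQLite(), ":memory:")

# Hypothetical table standing in for the large dataset.
set.seed(1)
sales <- data.frame(
  region = rep(c("east", "west"), each = 500),
  amount = c(rnorm(500, mean = 100, sd = 10), rnorm(500, mean = 120, sd = 10))
)
dbWriteTable(con, "sales", sales)

# Let the database aggregate; only one row per region reaches R.
by_region <- dbGetQuery(con, "
  SELECT region, COUNT(*) AS n, AVG(amount) AS mean_amount
  FROM sales
  GROUP BY region
  ORDER BY region
")

# Pull only the subset needed for the analysis, not the whole table.
west <- dbGetQuery(con, "SELECT amount FROM sales WHERE region = 'west'")
fit <- lm(amount ~ 1, data = west)

dbDisconnect(con)
```

The point of the design is that the `GROUP BY` and `WHERE` clauses run inside the database, so R only ever holds the aggregated rows or the selected subset.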