On 18/02/2011 5:44 AM, Michael Holt wrote:
> Hello Everyone,
>
> I'm pretty new to R and I'm trying to get some idea of the capabilities
> of the language. I work with some pretty large data sets and the
> consensus seems to be that R does not work well with big data. I've
> started talking to the guys at Revolution, but I need to get some
> outside opinions of what R can actually handle. At about what size does
> R start to run into problems?
>
Vectors are limited to about 2 billion entries (2^31 - 1). Matrices are
vectors, so that limit applies to the total number of entries. Data
frames are lists of vectors, so the limit applies separately to the
number of rows and the number of columns.
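
You can see the limit for yourself; the exact error messages will vary
with your R version and platform:

  .Machine$integer.max          # 2147483647, i.e. 2^31 - 1
  m <- matrix(0, 2^16, 2^16)    # 2^32 entries in one vector: over the
                                # limit, so this should fail
  x <- numeric(2^31 - 1)        # a legal length, but about 16 GB of
                                # doubles, so allocation will usually fail
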
Simple R code keeps everything in memory, so you're likely to run into
hardware limits well before those counts if you start working with
really big vectors. There are a number of packages that alleviate this
by paging data in and out (for example, ff and bigmemory on CRAN; see
the sketch below), but it takes a bit of work on your part to use them.
As far as I know, Revolution offers nothing in this area that isn't on
CRAN, but they can certainly give you advice.
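
For instance, here is a minimal sketch with bigmemory; the file names
and dimensions are just placeholders:

  library(bigmemory)
  # a file-backed matrix: the data live on disk (about 800 MB here)
  # and are paged in as you access them
  x <- filebacked.big.matrix(nrow = 1e7, ncol = 10, type = "double",
                             backingfile = "big.bin",
                             descriptorfile = "big.desc")
  x[1, 1] <- 3.14    # reads and writes look like ordinary indexing
  x[1, 1]
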
Duncan Murdoch