{Hijacking the thread from R-help to R-devel -- as I am
consciously shifting the focus away from the original question
...
}
>>>>> David Winsemius <dwinsemius at comcast.net>
>>>>> on Tue, 10 Aug 2010 08:42:12 -0400 writes:
> On Aug 9, 2010, at 2:45 PM, Theo Tannen wrote:
>> Are integers strictly a signed 32 bit number on R even if
>> I am running a 64 bit version of R on a x86_64 bit
>> machine?
>>
>> I ask because I have integers stored in a hdf5 file where
>> some of the data is 64 bit integers. When I read that
>> into R using the hdf5 library it seems any integer
>> greater than 2**31 returns NA.
> That's the limit. It's hard coded and not affected by the
> memory pointer size.
>>
>> Any solutions?
> I have heard of packages that handle "big numbers". A bit
> of searching produces suggestions to look at gmp on CRAN
> and Rmpfr on R-Forge.
Note that Rmpfr has been on CRAN, too, for a while now.
If you only need large integers (and rationals), 'gmp' is enough
though.
*However* note that gmp or Rmpfr (or any other arbitrary
precision) implementation will be considerably slower in use
than native 64-bit integer support would be.
Introducing 64-bit integers natively into "base R" is an
"interesting" project, notably if we also allowed using them for
indices, and changed the internal structures to use them instead
of 32-bit.
This would free us from the increasingly relevant
maximum-atomic-object-length = 2^31 - 1 problem.
The latter is something we have planned to address, possibly for
R 3.0.
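
To make the 32-bit limit concrete, here is a minimal C sketch. The helper
name fits_in_r_integer is made up for illustration and is not part of R's
C API. It assumes only what is stated above (R integers are 32-bit signed)
plus the fact that R reserves the bit pattern INT_MIN for NA_integer_, so
the usable range is -(2^31 - 1) .. 2^31 - 1.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper (not part of R's C API): can a 64-bit value be
 * stored in an R integer without becoming NA?
 * R integers are 32-bit signed, and INT32_MIN is reserved for
 * NA_integer_, so INT32_MIN itself is excluded from the valid range. */
static bool fits_in_r_integer(int64_t x)
{
    return x > INT32_MIN && x <= INT32_MAX;
}
```

A reader of 64-bit data (such as the hdf5 case in the original question)
could use a check of this kind to decide whether a value can be returned
as an R integer or has to be converted to something else.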
However, for that, using 64-bit integers is just one
possibility, another being to use "double precision integers".
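
As a rough illustration of what "double precision integers" buys: an
IEEE 754 double has a 53-bit significand, so every integer of magnitude up
to 2^53 is represented exactly, whereas 2^53 + 1 is not. The sketch below
is in C; the macro and function names are invented for this example and
do not come from R.

```c
#include <stdbool.h>
#include <stdint.h>

/* 2^53: every integer n with |n| <= 2^53 is exactly representable as
 * an IEEE 754 double (53-bit significand). */
#define MAX_EXACT_DOUBLE_INT 9007199254740992LL

/* Hypothetical helper: is x inside the range in which *all* integers
 * are exact as doubles?  Some larger values (e.g. 2^54) happen to be
 * exact as well, but beyond 2^53 there are gaps between neighbours. */
static bool in_exact_double_range(int64_t x)
{
    return x >= -MAX_EXACT_DOUBLE_INT && x <= MAX_EXACT_DOUBLE_INT;
}
```

So doubles used as integers would cover far more than 32 bits, but less
than the full range of a 64-bit "long long".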
Personally, I'd prefer the "long long" (64-bit) integers quite
a bit, but there are other considerations, e.g.,
- one big challenge will be to get there in such a way that not
  all R packages using compiled code have to be patched
  extensively;
- another aspect is how the BLAS / LAPACK team will address the
  problem.
Martin Maechler, ETH Zurich