LLVM supports integers up to about 8 million bits. This is a wonderful feature that I would like to expose in the language I'm designing, so that if you were, say, implementing SHA512, you could write the code in terms of variables of type [int 512], and have it all work with near optimal efficiency, the size known at compile time, and (unlike the case where you were using an arbitrary precision integer class) no heap allocation.

So my question is, is it okay to go ahead and do this, or are there any caveats in terms of efficiency or correctness? In particular, I remember reading something about there being problems with returning integers larger than two machine words, but I can't find it again; is there currently any such problem, or is it the case that if there was, it's fixed now?
On Sun, Feb 28, 2010 at 11:54 AM, Russell Wallace <russell.wallace at gmail.com> wrote:
> So my question is, is it okay to go ahead and do this, or are there
> any caveats in terms of efficiency or correctness? In particular, I
> remember reading something about there being problems with returning
> integers larger than two machine words, but I can't find it again; is
> there currently any such problem, or is it the case that if there was,
> it's fixed now?

In terms of correctness, it should work except for the fact that the LLVM code generators don't implement more complicated operations on such integers, like multiplication, division, and variable-width shifts. The issues with returning large integers are fixed, at least on x86.

In terms of efficiency, the generated code is likely to be less than ideal; juggling 512-bit numbers takes a lot of registers, and everything will be unrolled. This might be okay for a 512-bit number, but it would be a complete mess for a 2048-bit number.

Overall, for arbitrary uses, you're probably better off using a more conventional bignum library.

-Eli
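
To make the register-pressure point concrete, here is a minimal sketch in C of what a single 512-bit add amounts to on a 64-bit target: eight limb additions chained by a carry, with the size fixed at compile time and no heap allocation. This is only an illustration and not what LLVM actually emits; the names u512, U512_LIMBS and u512_add are made up for this example.

```c
#include <stdint.h>

/* Hypothetical fixed-width type: 8 x 64-bit limbs = 512 bits.
   limb[0] is the least significant limb. */
#define U512_LIMBS 8

typedef struct {
    uint64_t limb[U512_LIMBS];
} u512;

/* Add two 512-bit values limb by limb, propagating the carry.
   If carry_out is non-NULL it receives the carry out of the top limb. */
static u512 u512_add(u512 a, u512 b, unsigned *carry_out)
{
    u512 r;
    unsigned carry = 0;

    for (int i = 0; i < U512_LIMBS; i++) {
        uint64_t s = a.limb[i] + b.limb[i];   /* may wrap around */
        unsigned c1 = s < a.limb[i];          /* wrapped in a + b */
        uint64_t t = s + carry;
        unsigned c2 = t < s;                  /* wrapped when adding carry */
        r.limb[i] = t;
        carry = c1 | c2;                      /* at most one can be set */
    }
    if (carry_out)
        *carry_out = carry;
    return r;
}
```

With the loop fully unrolled, both operands and the result are live at once, which is 24 words for one addition; a 2048-bit type would quadruple that. Multiplication and division need considerably more work than this, which is where a conventional bignum library earns its keep.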
On Sun, Feb 28, 2010 at 8:58 PM, Eli Friedman <eli.friedman at gmail.com> wrote:
> In terms of correctness, it should work except for the fact that the
> LLVM code generators don't implement more complicated operations on
> such integers, like multiplication, division, and variable-width
> shifts. The issues with returning large integers are fixed, at least
> on x86.

But not on other platforms? What's the largest integer such that something like 'return ((a * b) / c) >> d' works correctly on all major platforms?