Clinton Mead via llvm-dev
2015-Nov-10 04:46 UTC
[llvm-dev] Generating Big Num addition code which uses ADC (add with carry) instructions
I'm trying to work out LLVM code which generates something similar to the following when adding large multiword numbers stored as separate words: ADD x1 x1 ADC x2 y2 ADC x3 y3 etc, where such a three argument add like ADC on x86 (which includes a carry in the addition) is available as a machine op. The background to this is that I'm trying to implement fast multiword addition in Haskell, which can compile via LLVM, so I thought if I knew what LLVM code generates the ideal assembly I can work backwards and use the appropriate Haskell prim-ops. Of course one can use GMP, but I'd suspect with two-four word additions the loss of inlining opportunities and the cost of the external call would be greater than the addition itself. So it would be great to be able to work out how to do it in LLVM and then either implement it with current GHC or add an extra GHC prim-op which allows GHC to access the appropriate LLVM code. Sorry I don't know much about LLVM besides my brief readings of the docs, and I've posted here because I couldn't find an llvm-users list. There's a clang-users list, but I didn't think that would be appropriate because GHC compiles via LLVM directly, not via clang. Any help or guidance appreciated. Clinton -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20151110/b22e8e2b/attachment.html>
David Jones via llvm-dev
2015-Nov-10 16:44 UTC
[llvm-dev] Generating Big Num addition code which uses ADC (add with carry) instructions
This might "just work" on many targets by using the appropriate integral type. e.g. on x86_64, something like %a = add i128 %b, %c will lower to add rax,rbx adc rcx,rdx Some caveats, based on my experience with LLVM 3.3, so things may have changed by then: - Not all targets support operations on integers wider than the "native width". e.g. on x86_64, trying to multiply i256 x i256 will net you an assert. I believe i128 x i128 is handled, perhaps by a function call, but nothing larger. - At least on x86_64, everything is fully unrolled. So adding i2048 + i2048 will generate an ADD and 31 ADC instructions, with no looping. This makes sense for "small" multiprecision integers, but can get silly as things get wider. In my project, I am using i128 in a few places (add, subtract, shift) to get efficient code, and relying on my own library functions for anything wider. On Mon, Nov 9, 2015 at 11:46 PM, Clinton Mead via llvm-dev < llvm-dev at lists.llvm.org> wrote:> I'm trying to work out LLVM code which generates something similar to the > following when adding large multiword numbers stored as separate words: > > ADD x1 x1 > ADC x2 y2 > ADC x3 y3 > > etc, where such a three argument add like ADC on x86 (which includes a > carry in the addition) is available as a machine op. > > The background to this is that I'm trying to implement fast multiword > addition in Haskell, which can compile via LLVM, so I thought if I knew > what LLVM code generates the ideal assembly I can work backwards and use > the appropriate Haskell prim-ops. > > Of course one can use GMP, but I'd suspect with two-four word additions > the loss of inlining opportunities and the cost of the external call would > be greater than the addition itself. So it would be great to be able to > work out how to do it in LLVM and then either implement it with current GHC > or add an extra GHC prim-op which allows GHC to access the appropriate LLVM > code. > > Sorry I don't know much about LLVM besides my brief readings of the docs, > and I've posted here because I couldn't find an llvm-users list. There's a > clang-users list, but I didn't think that would be appropriate because GHC > compiles via LLVM directly, not via clang. > > Any help or guidance appreciated. > > Clinton > > > _______________________________________________ > LLVM Developers mailing list > llvm-dev at lists.llvm.org > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev > >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20151110/54bcc17a/attachment.html>
Clinton Mead via llvm-dev
2015-Nov-13 12:52 UTC
[llvm-dev] Generating Big Num addition code which uses ADC (add with carry) instructions
I'm looking for code that produces the results that the i128 code produces but by instead using multiple i64s. Like I said, this is Haskell primops generating the llvm code, so it would really good if I could just point LLVM in the right direction instead of having to write a new primop for GHC. If I can see working LLVM code that does an efficient addition with carry on multiple machine size words I can then perhaps work out the Haskell primops that would produce that code. On Wed, Nov 11, 2015 at 3:44 AM, David Jones <djones at xtreme-eda.com> wrote:> This might "just work" on many targets by using the appropriate integral > type. > > e.g. on x86_64, something like > > %a = add i128 %b, %c > > will lower to > > add rax,rbx > adc rcx,rdx > > Some caveats, based on my experience with LLVM 3.3, so things may have > changed by then: > > - Not all targets support operations on integers wider than the "native > width". e.g. on x86_64, trying to multiply i256 x i256 will net you an > assert. I believe i128 x i128 is handled, perhaps by a function call, but > nothing larger. > - At least on x86_64, everything is fully unrolled. So adding i2048 + > i2048 will generate an ADD and 31 ADC instructions, with no looping. This > makes sense for "small" multiprecision integers, but can get silly as > things get wider. > > In my project, I am using i128 in a few places (add, subtract, shift) to > get efficient code, and relying on my own library functions for anything > wider. > > > On Mon, Nov 9, 2015 at 11:46 PM, Clinton Mead via llvm-dev < > llvm-dev at lists.llvm.org> wrote: > >> I'm trying to work out LLVM code which generates something similar to the >> following when adding large multiword numbers stored as separate words: >> >> ADD x1 x1 >> ADC x2 y2 >> ADC x3 y3 >> >> etc, where such a three argument add like ADC on x86 (which includes a >> carry in the addition) is available as a machine op. >> >> The background to this is that I'm trying to implement fast multiword >> addition in Haskell, which can compile via LLVM, so I thought if I knew >> what LLVM code generates the ideal assembly I can work backwards and use >> the appropriate Haskell prim-ops. >> >> Of course one can use GMP, but I'd suspect with two-four word additions >> the loss of inlining opportunities and the cost of the external call would >> be greater than the addition itself. So it would be great to be able to >> work out how to do it in LLVM and then either implement it with current GHC >> or add an extra GHC prim-op which allows GHC to access the appropriate LLVM >> code. >> >> Sorry I don't know much about LLVM besides my brief readings of the docs, >> and I've posted here because I couldn't find an llvm-users list. There's a >> clang-users list, but I didn't think that would be appropriate because GHC >> compiles via LLVM directly, not via clang. >> >> Any help or guidance appreciated. >> >> Clinton >> >> >> _______________________________________________ >> LLVM Developers mailing list >> llvm-dev at lists.llvm.org >> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev >> >> >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20151113/4695db19/attachment.html>