Vivek Pandey via llvm-dev
2020-May-21 09:30 UTC
[llvm-dev] on division of __int128 bit integer
Hi Team, I observer that division of __int128 bit is very heavy operation. It internally call a routine '__udivti3', which internally call ' __udivmodti4'. Due to it the overall performance is much much slower (almost 15 time slower than if I do it via a combination of 64-bit or microsoft '_udiv128'). Also what to know if I can directly call below routine directly from my code : *unsigned long long* *__udivmodti4* *(unsigned long long *a*, unsigned long long *b*, unsigned long long **c*)* This is against statement "Conceptually,in operation, like ‘/’ or ‘%’, no function is called, so no header is provided for *__udivmodti4* etc. – and we should not call them explicitly in our code." but wanted to know more on it. Thank you! Regards, Vivek JP Pandey, Software Professional, *+91-7775054441* -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20200521/2effa903/attachment.html>
Paweł Bylica via llvm-dev
2020-May-21 13:38 UTC
[llvm-dev] on division of __int128 bit integer
Hi, First of all, the code in compiler-rt/builtins is rather confusing for me. No idea what `tu_int` is but I'm guessing this must be `unsigned __int128`, not `unsigned long long`. https://github.com/llvm/llvm-project/blob/master/compiler-rt/lib/builtins/udivmodti4.c#L22 Secondly the "slow" case proceeds single bit at a time, so this loop may need ~64 iterations in worst case: https://github.com/llvm/llvm-project/blob/master/compiler-rt/lib/builtins/udivmodti4.c#L167-L188>From measurements, it looks GCC has it implemented more efficientlyalthough I have no idea where do they have the code for it. If you are interested, I have division implemented for unsigned 128-bit integer here: https://github.com/chfast/intx/blob/master/include/intx/int128.hpp#L632-L687>From quick check it takes 16 ns, GCC runtime lib: 25 ns, GMP: 20 ns.It follows "Improved division by invariant integers" https://gmplib.org/~tege/division-paper.pdf If make sense I can contribute this implementation as replacement for `__udivmodti4`. Although I'm not sure what requirements for __udivmodti4 are. -- Paweł On Thu, May 21, 2020 at 11:30 AM Vivek Pandey via llvm-dev < llvm-dev at lists.llvm.org> wrote:> Hi Team, > > I observer that division of __int128 bit is very heavy operation. > It internally call a routine '__udivti3', which internally call ' > __udivmodti4'. > > Due to it the overall performance is much much slower (almost 15 time > slower than if I do it via a combination of 64-bit or microsoft '_udiv128 > '). > > Also what to know if I can directly call below routine directly from my > code : > *unsigned long long* *__udivmodti4* *(unsigned long long *a*, unsigned > long long *b*, unsigned long long **c*)* > > This is against statement "Conceptually,in operation, like ‘/’ or ‘%’, > no function is called, so no header is provided for *__udivmodti4* etc. – > and we should not call them explicitly in our code." but wanted to know > more on it. > > Thank you! > Regards, > Vivek JP Pandey, > Software Professional, > *+91-7775054441* > _______________________________________________ > LLVM Developers mailing list > llvm-dev at lists.llvm.org > https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20200521/e6fc2eab/attachment.html>
Vivek Pandey via llvm-dev
2020-Jun-03 16:57 UTC
[llvm-dev] on division of __int128 bit integer
Dear Paweł, Thanks a lot! The solution helped! Thank you! Regards, Vivek JP Pandey, Software Professional, *+91-7775054441* On Thu, May 21, 2020 at 7:08 PM Paweł Bylica <chfast at gmail.com> wrote:> Hi, > > First of all, the code in compiler-rt/builtins is rather confusing for me. > No idea what `tu_int` is but I'm guessing this must be `unsigned __int128`, > not `unsigned long long`. > > https://github.com/llvm/llvm-project/blob/master/compiler-rt/lib/builtins/udivmodti4.c#L22 > > Secondly the "slow" case proceeds single bit at a time, so this loop may > need ~64 iterations in worst case: > > https://github.com/llvm/llvm-project/blob/master/compiler-rt/lib/builtins/udivmodti4.c#L167-L188 > > From measurements, it looks GCC has it implemented more efficiently > although I have no idea where do they have the code for it. > > If you are interested, I have division implemented for unsigned 128-bit > integer here: > https://github.com/chfast/intx/blob/master/include/intx/int128.hpp#L632-L687 > From quick check it takes 16 ns, GCC runtime lib: 25 ns, GMP: 20 ns. > > It follows "Improved division by invariant integers" > https://gmplib.org/~tege/division-paper.pdf > > If make sense I can contribute this implementation as replacement for > `__udivmodti4`. Although I'm not sure what requirements for __udivmodti4 > are. > > -- > Paweł > > > On Thu, May 21, 2020 at 11:30 AM Vivek Pandey via llvm-dev < > llvm-dev at lists.llvm.org> wrote: > >> Hi Team, >> >> I observer that division of __int128 bit is very heavy operation. >> It internally call a routine '__udivti3', which internally call ' >> __udivmodti4'. >> >> Due to it the overall performance is much much slower (almost 15 time >> slower than if I do it via a combination of 64-bit or microsoft '_udiv128 >> '). >> >> Also what to know if I can directly call below routine directly from my >> code : >> *unsigned long long* *__udivmodti4* *(unsigned long long *a*, unsigned >> long long *b*, unsigned long long **c*)* >> >> This is against statement "Conceptually,in operation, like ‘/’ or ‘%’, >> no function is called, so no header is provided for *__udivmodti4* etc. >> – and we should not call them explicitly in our code." but wanted to >> know more on it. >> >> Thank you! >> Regards, >> Vivek JP Pandey, >> Software Professional, >> *+91-7775054441* >> _______________________________________________ >> LLVM Developers mailing list >> llvm-dev at lists.llvm.org >> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev >> >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20200603/8d9702e2/attachment.html>