

Paweł Bylica via llvm-dev

2018-Nov-20 10:45 UTC

[llvm-dev] A pattern for portable __builtin_add_overflow()

Hi LLVM, clang,

I'm trying to write a portable version of __builtin_add_overflow() in a way
that the compiler would recognize the pattern and use the add_overflow
intrinsic / the best possible machine instruction.

Here are the docs for these builtins:
https://clang.llvm.org/docs/LanguageExtensions.html#checked-arithmetic-builtins

With unsigned types this is easy:

int uaddo_native(unsigned a, unsigned b, unsigned* s)
{
    return __builtin_add_overflow(a, b, s);
}

int uaddo_portable(unsigned a, unsigned b, unsigned* s)
{
    *s = a + b;
    return *s < a;
}

We get exactly the same assembly:
uaddo_native: # @uaddo_native
    xor eax, eax
    add edi, esi
    setb al
    mov dword ptr [rdx], edi
    ret
uaddo_portable: # @uaddo_portable
    xor eax, eax
    add edi, esi
    setb al
    mov dword ptr [rdx], edi
    ret
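
The `*s < a` comparison is enough because unsigned addition wraps modulo 2^N, so a wrapped sum is always smaller than either operand. A minimal sketch that checks this against a 64-bit reference on a few boundary values follows. The names `uaddo_portable32`, `uaddo_reference` and `uaddo_selfcheck` are hypothetical and not part of the original post, and `uint32_t` is used instead of `unsigned` only so that the reference type is guaranteed to be wider.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* The portable pattern from the post, spelled with a fixed-width type. */
int uaddo_portable32(uint32_t a, uint32_t b, uint32_t* s)
{
    *s = a + b;     /* stored result is the sum modulo 2^32 */
    return *s < a;  /* a wrapped sum is smaller than either operand */
}

/* Hypothetical reference: add in 64 bits and look at the carry bit. */
int uaddo_reference(uint32_t a, uint32_t b, uint32_t* s)
{
    uint64_t wide = (uint64_t)a + (uint64_t)b;
    *s = (uint32_t)wide;
    return (int)(wide >> 32);  /* 0 or 1, since the sum is below 2^33 */
}

/* Compare both versions on every pair of boundary values. */
void uaddo_selfcheck(void)
{
    static const uint32_t v[] = {0u, 1u, 2u, 0x7fffffffu, 0x80000000u,
                                 0xfffffffeu, 0xffffffffu};
    const size_t n = sizeof v / sizeof v[0];
    for (size_t i = 0; i < n; ++i) {
        for (size_t j = 0; j < n; ++j) {
            uint32_t s1, s2;
            int o1 = uaddo_portable32(v[i], v[j], &s1);
            int o2 = uaddo_reference(v[i], v[j], &s2);
            assert(o1 == o2 && s1 == s2);
        }
    }
}
```

Calling `uaddo_selfcheck()` from a test driver exercises the pattern; it says nothing about which instructions a compiler emits.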

But with signed types it is not so easy. I tried two versions, but the
resulting assembly is quite far from optimal.


int saddo_native(int a, int b, int* s)
{
    return __builtin_add_overflow(a, b, s);
}

int saddo_portable(int a, int b, int* s)
{
    *s = (unsigned)a + (unsigned)b;
    return (a > 0) ? *s <= b : *s > b;
}

int saddo_portable2(int a, int b, int* s)
{
    *s = (unsigned)a + (unsigned)b;
    int cond = a > 0;
    int check = *s > b;
    return (cond & !check) | (!cond & check);
}

Assembly:

saddo_native: # @saddo_native
    xor eax, eax
    add edi, esi
    seto al
    mov dword ptr [rdx], edi
    ret
saddo_portable: # @saddo_portable
    lea eax, [rsi + rdi]
    mov dword ptr [rdx], eax
    cmp eax, esi
    setle al
    setg cl
    test edi, edi
    jg .LBB3_2
    mov eax, ecx
.LBB3_2:
    movzx eax, al
    ret
saddo_portable2: # @saddo_portable2
    lea eax, [rsi + rdi]
    mov dword ptr [rdx], eax
    test edi, edi
    setg cl
    cmp eax, esi
    setg al
    xor al, cl
    movzx eax, al
    ret


Do you know of a trick that makes the compiler use the seto instruction?

I also noticed that the transformation for uaddo_portable happens in
CodeGen, not in IR.

Bests,
Paweł

Sanjoy Das via llvm-dev

2018-Nov-22 17:54 UTC

[llvm-dev] A pattern for portable __builtin_add_overflow()

Going by InstCombiner::foldICmpWithConstant,
https://gcc.godbolt.org/z/To_qmm should work.

-- Sanjoy
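
The linked snippet is not reproduced in this archive. Judging only from the follow-up about 32-bit limits and wider types, it presumably widens the operands, adds them, and range-checks the result. The sketch below shows that general idea; it is an assumption and not a copy of the linked code. `saddo_widening` and `saddo_widening_examples` are hypothetical names, and whether a given compiler folds this into a single `seto` is not verified here.

```c
#include <assert.h>
#include <stdint.h>

/* Widening sketch: do the addition in 64 bits, then range-check. */
int saddo_widening(int32_t a, int32_t b, int32_t* s)
{
    int64_t wide = (int64_t)a + (int64_t)b;  /* cannot overflow in 64 bits */

    /* Store the wrapped 32-bit result. Converting an out-of-range value to
       int32_t is implementation-defined; mainstream compilers wrap. */
    *s = (int32_t)(uint32_t)wide;

    /* Overflow iff the exact sum does not fit in 32 bits. */
    return wide < INT32_MIN || wide > INT32_MAX;
}

/* A few boundary cases. */
void saddo_widening_examples(void)
{
    int32_t s;
    assert(saddo_widening(INT32_MAX, 1, &s) == 1);
    assert(saddo_widening(INT32_MIN, -1, &s) == 1);
    assert(saddo_widening(-5, 7, &s) == 0 && s == 2);
    assert(saddo_widening(INT32_MIN, INT32_MAX, &s) == 0 && s == -1);
}
```

The approach needs a type twice as wide as the operands, which is why it does not carry over directly to 64-bit operands.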

Paweł Bylica via llvm-dev

2018-Nov-22 20:07 UTC

[llvm-dev] A pattern for portable __builtin_add_overflow()

Thanks Sanjoy,

It looks like it works only for integers up to 32 bits wide (
http://llvm.org/doxygen/InstCombineCompares_8cpp_source.html#l01255).
I will also need it for 64-bit integers.

Even if the pattern match handled the 64-bit case, I would need to use
128-bit types, which is still not portable.

// P.
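
For the 64-bit case, one well-known formulation that avoids wider types compares sign bits: signed addition overflows exactly when both operands have the same sign and the wrapped sum has the opposite one. A minimal sketch follows. `saddo64_portable` and `saddo64_examples` are hypothetical names not taken from the thread, and no claim is made that LLVM recognizes this form or emits `seto` for it.

```c
#include <assert.h>
#include <stdint.h>

/* Sign-bit sketch: no type wider than 64 bits is needed. */
int saddo64_portable(int64_t a, int64_t b, int64_t* s)
{
    uint64_t ua = (uint64_t)a;
    uint64_t ub = (uint64_t)b;
    uint64_t us = ua + ub;  /* wraps modulo 2^64, well defined */

    /* Converting an out-of-range value to int64_t is implementation-defined;
       mainstream compilers wrap. */
    *s = (int64_t)us;

    /* The top bit of (ua ^ us) & (ub ^ us) is set iff the sign of the sum
       differs from the sign of both operands. */
    return (int)(((ua ^ us) & (ub ^ us)) >> 63);
}

/* A few boundary cases. */
void saddo64_examples(void)
{
    int64_t s;
    assert(saddo64_portable(INT64_MAX, 1, &s) == 1);
    assert(saddo64_portable(INT64_MIN, -1, &s) == 1);
    assert(saddo64_portable(-1, 1, &s) == 0 && s == 0);
    assert(saddo64_portable(INT64_MIN, INT64_MAX, &s) == 0 && s == -1);
}
```

The same expression works for any width by changing the types and the shift count to the bit width minus one.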

