thr3ads.net - llvm dev - [LLVMdev] [RFC] Integer Saturation Intrinsics [Jan 2015]

David Majnemer

2015-Jan-15 10:51 UTC

[LLVMdev] [RFC] Integer Saturation Intrinsics

On Thu, Jan 15, 2015 at 2:33 AM, David Chisnall <David.Chisnall at cl.cam.ac.uk> wrote:
> A couple of questions:
>
> 1) Should this really be an intrinsic and not a flag on add?  The add
> instruction already allows overflow to be either undefined or defined to
> wrap.  Making it defined to saturate seems a natural extension.
>
I don't think this should be a flag on add.  Flags are designed such that
the middle-end may be ignorant of them and nothing bad might happen, it is
always safe to ignore or drop flags when doing so is convenient (for a
concrete example, take a look at reassociate).

In this case, the saturating nature of the operation does not seem like
something that can be safely ignored.
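To make the distinction concrete, here is a minimal sketch in Python that models 8-bit signed arithmetic.  The helper names (add_wrap, add_nsw, add_sat) are hypothetical, "None" stands in for the undefined result of an overflowing "add nsw", and add_sat models a hypothetical saturating flag on add rather than the proposed intrinsics.  It only illustrates the argument: dropping nsw never changes a defined result, while dropping saturation does.

```python
BITS = 8
MIN = -(1 << (BITS - 1))       # -128
MAX = (1 << (BITS - 1)) - 1    #  127
VALUES = range(MIN, MAX + 1)


def wrap(v):
    """Reduce v to a signed BITS-bit two's-complement value."""
    v &= (1 << BITS) - 1
    return v - (1 << BITS) if v > MAX else v


def add_wrap(a, b):
    """Plain 'add': overflow is defined to wrap."""
    return wrap(a + b)


def add_nsw(a, b):
    """Model of 'add nsw': None stands in for the undefined result on overflow."""
    s = a + b
    return s if MIN <= s <= MAX else None


def add_sat(a, b):
    """Hypothetical saturating add: clamp instead of wrapping."""
    return max(MIN, min(MAX, a + b))


def dropping_nsw_is_safe():
    """Wherever add nsw is defined, the plain add gives the same value."""
    return all(add_nsw(a, b) in (None, add_wrap(a, b))
               for a in VALUES for b in VALUES)


def dropping_sat_is_safe():
    """True only if ignoring saturation never changes the result."""
    return all(add_sat(a, b) == add_wrap(a, b)
               for a in VALUES for b in VALUES)
```

With these definitions, add_sat(100, 100) clamps to 127 while add_wrap(100, 100) wraps to -56, so a pass that ignored a saturating flag would change the program's result.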

>
> 2) How do you imagine this being used and what are the guarantees for
> sequences of operations with respect to optimisation?  If I do a+b-c (or +c
> where c is negative), and a+b would saturate, but a+(b-c) would not, then
> is it allowed for an optimiser to generate the second rather than the
> first?  If it's an intrinsic that's opaque to optimisers, then that's not a
> problem for correctness, but then you'll miss some potentially beneficial
> optimisations.
>
> David
>
> > On 14 Jan 2015, at 22:08, Ahmed Bougacha <ahmed.bougacha at gmail.com> wrote:
> >
> > Hi all,
> >
> > The patches linked below introduce a new family of intrinsics, for
> > integer saturation: @llvm.usat, and @llvm.ssat (unsigned/signed).
> > Quoting the added documentation:
> >
> >      %r = call i32 @llvm.ssat.i32(i32 %x, i32 %n)
> >
> > is equivalent to the expression min(max(x, -2^(n-1)), 2^(n-1)-1), itself
> > implementable as the following IR:
> >
> >      %min_sint_n = i32 ... ; the min. signed integer of bitwidth n, -2^(n-1)
> >      %max_sint_n = i32 ... ; the max. signed integer of bitwidth n, 2^(n-1)-1
> >      %0 = icmp slt i32 %x, %min_sint_n
> >      %1 = select i1 %0, i32 %min_sint_n, i32 %x
> >      %2 = icmp sgt i32 %1, %max_sint_n
> >      %r = select i1 %2, i32 %max_sint_n, i32 %1
> >
> >
> > As a starting point, here are two patches:
> > - http://reviews.llvm.org/D6976  Add Integer Saturation Intrinsics.
> > - http://reviews.llvm.org/D6977  [CodeGen] Add legalization for
> > Integer Saturation Intrinsics.
> >
> > From there, we can generate several new instructions, more efficient
> > than their expanded counterpart.  Locally, I have worked on:
> > - ARM: the SSAT/USAT instructions (scalar)
> > - AArch64: the SQ/UQ ADD/SUB AArch64 instructions (vector/scalar
> > saturating arithmetic)
> > - X86: PACK SS/US (vector, saturate+truncate)
> > - X86: PADD/SUB S/US (vector, saturating arithmetic)
> >
> > Anyway, let's first agree on the intrinsics, so that further
> > development is done on trunk.
> >
> > Thanks!
> > -Ahmed
> > _______________________________________________
> > LLVM Developers mailing list
> > LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
> > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
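For reference, the definition quoted above for @llvm.ssat, min(max(x, -2^(n-1)), 2^(n-1)-1), can be sketched in Python as follows.  This is only a model of the quoted expression and of the icmp/select expansion; the function names are hypothetical, Python integers are unbounded (so the i32 width of %x is not modelled), and n >= 1 is assumed.

```python
def ssat(x, n):
    """Clamp x to the range of an n-bit signed integer (assumes n >= 1)."""
    lo = -(1 << (n - 1))        # -2^(n-1)
    hi = (1 << (n - 1)) - 1     #  2^(n-1)-1
    return min(max(x, lo), hi)


def ssat_select(x, n):
    """Same clamp, written to mirror the quoted icmp/select sequence."""
    min_sint_n = -(1 << (n - 1))
    max_sint_n = (1 << (n - 1)) - 1
    # %0 = icmp slt x, min ; %1 = select %0, min, x
    t = min_sint_n if x < min_sint_n else x
    # %2 = icmp sgt %1, max ; %r = select %2, max, %1
    return max_sint_n if t > max_sint_n else t
```

For example, ssat(300, 8) gives 127, ssat(-300, 8) gives -128, and values already in range pass through unchanged.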

David Chisnall

2015-Jan-15 11:02 UTC

[LLVMdev] [RFC] Integer Saturation Intrinsics

On 15 Jan 2015, at 10:51, David Majnemer <david.majnemer at gmail.com> wrote:
>
> I don't think this should be a flag on add.  Flags are designed such that
> the middle-end may be ignorant of them and nothing bad might happen, it is
> always safe to ignore or drop flags when doing so is convenient (for a
> concrete example, take a look at reassociate).

This is true of metadata, not of flags.  Consider the atomic memory order on
loads and stores, for example.  It is definitely not safe for an optimiser to
ignore these.

David

David Majnemer

2015-Jan-15 11:10 UTC

[LLVMdev] [RFC] Integer Saturation Intrinsics

On Thu, Jan 15, 2015 at 3:02 AM, David Chisnall <David.Chisnall at cl.cam.ac.uk> wrote:
> On 15 Jan 2015, at 10:51, David Majnemer <david.majnemer at gmail.com> wrote:
> >
> > I don't think this should be a flag on add.  Flags are designed such
> > that the middle-end may be ignorant of them and nothing bad might happen,
> > it is always safe to ignore or drop flags when doing so is convenient (for
> > a concrete example, take a look at reassociate).
>
> This is true of metadata, not of flags.  Consider the atomic memory order
> on loads and stores, for example.  It is definitely not safe for an
> optimiser to ignore these.
>
The arithmetic operations are *very* consistent that flags imply a
relaxation of constraint: floating point has fast math flags, division has
exact, etc.
Memory instructions seem to have gone in the other direction.

>
> David
>

Herbie Robinson

2015-Jan-15 17:40 UTC

[LLVMdev] [RFC] Integer Saturation Intrinsics

On 1/15/15 5:51 AM, David Majnemer wrote:
>
> On Thu, Jan 15, 2015 at 2:33 AM, David Chisnall
> <David.Chisnall at cl.cam.ac.uk <mailto:David.Chisnall at cl.cam.ac.uk>> wrote:
>
>     A couple of questions:
>
>     1) Should this really be an intrinsic and not a flag on add?  The
>     add instruction already allows overflow to be either undefined or
>     defined to wrap.  Making it defined to saturate seems a natural
>     extension.
>
>
> I don't think this should be a flag on add.  Flags are designed such 
> that the middle-end may be ignorant of them and nothing bad might 
> happen, it is always safe to ignore or drop flags when doing so is 
> convenient (for a concrete example, take a look at reassociate).
>
> In this case, the saturating nature of the operation does not seem 
> like something that can be safely ignored.

The undefined vs. wrap distinction is a semantic difference that could affect
optimization, too.  The result of the operation is undefined in one case
and well defined in the wrap case.  The undefined case should allow some
code motion optimizations that the wrap case doesn't.

Algebraic transformations are affected by both wrap and saturation,
although saturation is more restrictive in that it affects add and
subtract, while wrap would only block refactoring when multiply and
divide are involved.
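A small Python sketch of the add/subtract part of this point, under the assumption that saturation is modelled as a clamp after every operation and wrap as two's-complement reduction (the helper names are hypothetical).  It searches exhaustively for inputs where (a+b)-c and a+(b-c) disagree, which is the reassociation asked about earlier in the thread.

```python
from itertools import product


def make_ops(bits):
    """Return (value range, wrap, sat) for signed integers of the given width."""
    lo = -(1 << (bits - 1))
    hi = (1 << (bits - 1)) - 1
    mask = (1 << bits) - 1

    def wrap(v):
        # Two's-complement reduction to `bits` bits.
        v &= mask
        return v - (1 << bits) if v > hi else v

    def sat(v):
        # Clamp to the representable range.
        return max(lo, min(hi, v))

    return range(lo, hi + 1), wrap, sat


def reassociation_counterexample(bits, mode):
    """Return some (a, b, c) where (a+b)-c != a+(b-c) under `mode`, else None.

    `mode` is "wrap" or "sat"; the chosen fix-up is applied after each operation.
    """
    values, wrap, sat = make_ops(bits)
    fix = wrap if mode == "wrap" else sat
    for a, b, c in product(values, repeat=3):
        if fix(fix(a + b) - c) != fix(a + fix(b - c)):
            return (a, b, c)
    return None
```

At 4 bits the wrap search finds no counterexample, while the saturating search does.  At 8 bits, a = b = c = 100 gives 27 for the saturating (a+b)-c but 100 for a+(b-c), whereas both orders give 100 under wrap.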
