thr3ads.net - llvm dev - [LLVMdev] RFC: Integer saturation intrinsics [Jun 2011]

Evan Cheng

2011-Jun-17 23:22 UTC

[LLVMdev] RFC: Integer saturation intrinsics

On Jun 17, 2011, at 3:42 PM, Eli Friedman wrote:
> On Fri, Jun 17, 2011 at 3:08 PM, Evan Cheng <evan.cheng at apple.com> wrote:
>> Hi all,
>> 
>> I'm proposing integer saturation intrinsics.
>> 
>> def int_ssat : Intrinsic<[llvm_anyint_ty], [LLVMMatchType<0>, llvm_i32_ty]>;
>> def int_usat : Intrinsic<[llvm_anyint_ty], [LLVMMatchType<0>, llvm_i32_ty]>;
>> 
>> The first operand is the integer value being saturated, and the second is the saturation bit position.
>> 
>> For scalar integer types, the semantics are:
>> 
>> int_ssat: x < -(1 << (bit-1)) ? -(1 << (bit-1)) : (x > (1 << (bit-1))-1 ? (1 << (bit-1))-1 : x)
>> int_usat: x < 0 ? 0 : (x > (1 << bit)-1 ? (1 << bit)-1 : x)
>> 
>> e.g. If bit is 8, the range is -128 to 127 for int_ssat; 0 to 255 for int_usat. This is useful for i16 / i32 / i64 to i8 saturation code like the following:
>> (x < -256) ? -256 : (x > 255 ? 255 : x)
>> (x < 0) ? 0 : (x > 255 ? 255 : x)
>> 
>> If the source type is an integer vector type, then each element of the vector is saturated.
>> 
>> For ARM, these intrinsics will map to usat / ssat instructions. The semantics match exactly. X86 doesn't have saturation instructions. However, SSE does have packed add, packed sub, and pack with saturation. So it's possible to instruction select patterns such as (int_{s|u}sat ({add|sub} x, y), c).
> 
> The stated pattern simply doesn't work.  A portable saturating
> add/subtract intrinsic might be nice given that most vector
> instruction sets have such an instruction, but this seems completely
> orthogonal.
Can you explain why you think the pattern (which?) would not work?
> 
>> The plan is to form calls to these intrinsics in InstCombine. Legalizer can expand these intrinsics if they are not legal. The expansion should be fairly straightforward and produce code that is at least as good as what LLVM is currently generating for these code sequences.
>> 
>> Comments?
> 
> Is there some reason why pattern-matching this in an ARM-specific
> DAGCombine doesn't work?
It's not possible to look beyond a single BB at isel time.

Evan
> 
> -Eli
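
Editor's illustration: the scalar semantics quoted above can be sketched in C roughly as follows. The function names and the 64-bit carrier type are assumptions made for this example and are not part of the proposal or of any LLVM API; the bounds on `bit` exist only to keep the shifts well defined in C.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers that mirror the quoted formulas; not an LLVM API. */

/* Signed saturation of x to `bit` bits. Assumes 1 <= bit <= 63. */
static int64_t ssat(int64_t x, unsigned bit) {
    assert(bit >= 1 && bit <= 63);
    int64_t hi = ((int64_t)1 << (bit - 1)) - 1;  /* (1 << (bit-1)) - 1 */
    int64_t lo = -hi - 1;                        /* -(1 << (bit-1))    */
    return x < lo ? lo : (x > hi ? hi : x);
}

/* Unsigned saturation of a signed x to `bit` bits. Assumes bit <= 62. */
static int64_t usat(int64_t x, unsigned bit) {
    assert(bit <= 62);
    int64_t hi = ((int64_t)1 << bit) - 1;        /* (1 << bit) - 1 */
    return x < 0 ? 0 : (x > hi ? hi : x);
}
```

With bit set to 8, this sketch clamps to -128..127 for `ssat` and 0..255 for `usat`, matching the ranges stated in the proposal.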

Eli Friedman

2011-Jun-17 23:50 UTC

On Fri, Jun 17, 2011 at 4:22 PM, Evan Cheng <evan.cheng at apple.com> wrote:
>
> On Jun 17, 2011, at 3:42 PM, Eli Friedman wrote:
>
>> On Fri, Jun 17, 2011 at 3:08 PM, Evan Cheng <evan.cheng at apple.com> wrote:
>>> Hi all,
>>>
>>> I'm proposing integer saturation intrinsics.
>>>
>>> def int_ssat : Intrinsic<[llvm_anyint_ty], [LLVMMatchType<0>, llvm_i32_ty]>;
>>> def int_usat : Intrinsic<[llvm_anyint_ty], [LLVMMatchType<0>, llvm_i32_ty]>;
>>>
>>> The first operand is the integer value being saturated, and the second is the saturation bit position.
>>>
>>> For scalar integer types, the semantics are:
>>>
>>> int_ssat: x < -(1 << (bit-1)) ? -(1 << (bit-1)) : (x > (1 << (bit-1))-1 ? (1 << (bit-1))-1 : x)
>>> int_usat: x < 0 ? 0 : (x > (1 << bit)-1 ? (1 << bit)-1 : x)
>>>
>>> e.g. If bit is 8, the range is -128 to 127 for int_ssat; 0 to 255 for int_usat. This is useful for i16 / i32 / i64 to i8 saturation code like the following:
>>> (x < -256) ? -256 : (x > 255 ? 255 : x)
>>> (x < 0) ? 0 : (x > 255 ? 255 : x)
>>>
>>> If the source type is an integer vector type, then each element of the vector is saturated.
>>>
>>> For ARM, these intrinsics will map to usat / ssat instructions. The semantics match exactly. X86 doesn't have saturation instructions. However, SSE does have packed add, packed sub, and pack with saturation. So it's possible to instruction select patterns such as (int_{s|u}sat ({add|sub} x, y), c).
>>
>> The stated pattern simply doesn't work.  A portable saturating
>> add/subtract intrinsic might be nice given that most vector
>> instruction sets have such an instruction, but this seems completely
>> orthogonal.
>
> Can you explain why you think the pattern (which?) would not work?
Suppose you want a paddsw.  To express the equivalent using ssat, you
would have to write (trunc (ssat (add (sext x), (sext y)), c)).  And I
wouldn't trust that to work.
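
Editor's illustration: one 16-bit lane of a paddsw-style saturating add, written in the (trunc (ssat (add (sext x), (sext y)), c)) shape described above. This is a sketch under assumptions: the helper name is hypothetical, and c is taken to be 16. The widening matters because an add performed directly in 16 bits would wrap before any saturation could see the overflow.

```c
#include <stdint.h>

/* Hypothetical helper: one lane of a signed saturating 16-bit add. */
static int16_t sat_add_i16(int16_t x, int16_t y) {
    /* sext both operands and add in 32 bits, where the sum cannot overflow. */
    int32_t wide = (int32_t)x + (int32_t)y;

    /* ssat to 16 bits: clamp into [INT16_MIN, INT16_MAX]. */
    if (wide < INT16_MIN) wide = INT16_MIN;
    if (wide > INT16_MAX) wide = INT16_MAX;

    /* trunc back to 16 bits; lossless after the clamp. */
    return (int16_t)wide;
}
```

An instruction selector would have to recognize this whole sext/add/clamp/trunc chain to recover a single packed saturating add, which appears to be the fragility the reply is pointing at.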
>>
>>> The plan is to form calls to these intrinsics in InstCombine. Legalizer can expand these intrinsics if they are not legal. The expansion should be fairly straightforward and produce code that is at least as good as what LLVM is currently generating for these code sequences.
>>>
>>> Comments?
>>
>> Is there some reason why pattern-matching this in an ARM-specific
>> DAGCombine doesn't work?
>
> It's not possible to look beyond a single BB at isel time.
Anything that we can match to ssat should be of the form max(min(x,
SATMAX), SATMIN) (where max and min are icmp+select pairs).  If the
min and max aren't in the same block, and we don't have an IR
transformation to put them in the same block, we should fix that
rather than introducing an intrinsic for this special case, I
think...
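
Editor's illustration: the max(min(x, SATMAX), SATMIN) form, with each min and max written as a compare plus a select. The 8-bit bounds and the helper names are assumptions chosen for this sketch; any pair of saturation bounds would follow the same shape.

```c
#include <stdint.h>

/* Assumed 8-bit signed bounds, for illustration only. */
enum { SATMAX = 127, SATMIN = -128 };

/* Each helper corresponds to one icmp + select pair. */
static int32_t smin32(int32_t a, int32_t b) { return a < b ? a : b; }
static int32_t smax32(int32_t a, int32_t b) { return a > b ? a : b; }

/* Hypothetical helper: clamp a 32-bit value into the signed 8-bit range. */
static int32_t clamp_i8(int32_t x) {
    return smax32(smin32(x, SATMAX), SATMIN);
}
```

When both selects sit in the same basic block, as they do in this straight-line form, a per-block pattern matcher can see the whole clamp at once.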

-Eli

Eli Friedman

2011-Jun-18 00:49 UTC

On Fri, Jun 17, 2011 at 4:50 PM, Eli Friedman <eli.friedman at gmail.com> wrote:
>>>> The plan is to form calls to these intrinsics in InstCombine. Legalizer can expand these intrinsics if they are not legal. The expansion should be fairly straightforward and produce code that is at least as good as what LLVM is currently generating for these code sequences.
>>>>
>>>> Comments?
>>>
>>> Is there some reason why pattern-matching this in an ARM-specific
>>> DAGCombine doesn't work?
>>
>> It's not possible to look beyond a single BB at isel time.
>
> Anything that we can match to ssat should be of the form max(min(x,
> SATMAX), SATMIN) (where max and min are icmp+select pairs).  If the
> min and max aren't in the same block, and we don't have an IR
> transformation to put them in the same block, we should fix that
> rather than introducing an intrinsic for this special case, I
> think...
Okay, thinking about it a bit more, I don't think this is practical.

I'm still skeptical that adding platform-independent intrinsics for
arbitrary ARM instructions is a good idea simply because we don't have
the infrastructure to handle them otherwise.  It wouldn't be
especially hard to allow target-specific transforms on IR...

-Eli
