thr3ads.net - llvm dev - [LLVMdev] max/min intrinsics [Dec 2012]

Redmond, Paul

2012-Dec-05 16:26 UTC

[LLVMdev] max/min intrinsics

I have been working on a patch to add support for max/min reductions in
LoopVectorize. One of the comments that came up in review is that the
implementation could be simplified (and less fragile) if max and min intrinsics
were recognized rather than looking for compare-select sequences.

The suggestion was to change compare-selects into max and min intrinsic calls
during instcombine.

The intrinsics to add are:
declare iN llvm.{smin,smax}.iN(iN %a, iN %b)
declare iN llvm.{umin,umax}.iN(iN %a, iN %b)
declare fN llvm.{fmin,fmax}.fN(fN %a, fN %b)

What does the community think?

Paul
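
To make the two shapes concrete, here is a minimal C++ sketch of a max reduction written both ways. It is not code from the patch: the names `smax`, `max_reduce_select`, and `max_reduce_named` are hypothetical, and `smax` merely stands in for the proposed llvm.smax.iN intrinsic.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the proposed llvm.smax.iN intrinsic.
int smax(int a, int b) { return a > b ? a : b; }

// Compare-select form: a pass has to recognize the compare and the
// select together before it can treat the loop as a max reduction.
int max_reduce_select(const std::vector<int>& v, int init) {
    int m = init;
    for (std::size_t i = 0; i < v.size(); ++i) {
        const bool gt = v[i] > m;  // roughly an icmp sgt
        m = gt ? v[i] : m;         // roughly a select
    }
    return m;
}

// Named form: the reduction operation is a single recognizable call.
int max_reduce_named(const std::vector<int>& v, int init) {
    int m = init;
    for (std::size_t i = 0; i < v.size(); ++i) {
        m = smax(m, v[i]);
    }
    return m;
}
```

Both functions compute the same value; the difference is only in how much pattern matching is needed to see that the loop is a max reduction.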

Sebastian Pop

2012-Dec-05 17:06 UTC

[LLVMdev] max/min intrinsics

Redmond, Paul wrote:
> I have been working on a patch to add support for max/min reductions in
> LoopVectorize. One of the comments that came up in review is that the
> implementation could be simplified (and less fragile) if max and min intrinsics
> were recognized rather than looking for compare-select sequences.
>
> The suggestion was to change compare-selects into max and min intrinsic
> calls during instcombine.

+1

Sebastian
-- 
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The
Linux Foundation

Duncan Sands

2012-Dec-05 17:29 UTC

[LLVMdev] max/min intrinsics

Hi Paul,

On 05/12/12 17:26, Redmond, Paul wrote:
> I have been working on a patch to add support for max/min reductions in
> LoopVectorize. One of the comments that came up in review is that the
> implementation could be simplified (and less fragile) if max and min intrinsics
> were recognized rather than looking for compare-select sequences.
>
> The suggestion was to change compare-selects into max and min intrinsic
> calls during instcombine.
>
> The intrinsics to add are:
> declare iN llvm.{smin,smax}.iN(iN %a, iN %b)
> declare iN llvm.{umin,umax}.iN(iN %a, iN %b)
> declare fN llvm.{fmin,fmax}.fN(fN %a, fN %b)
>
> What does the community think?

it seems reasonable to me.

Ciao, Duncan.

Shuxin Yang

2012-Dec-05 18:34 UTC

[LLVMdev] max/min intrinsics

Min/max certainly makes loop nest optimization (including innermost-loop
vectorization) a lot easier.
However, I would like them to be "lowered" in the lower-level scalar
optimizer (sopt).

I feel that "raw" instructions are a bit easier to optimize in sopt than
integrated instructions, and that the "raw" instructions could expose
more opportunities.

E.g., in the following snippet, if we break the max() into "raw"
instructions, the cost of the comparison is reduced thanks to CSE, and it
also reveals that, more often than not, z holds the value min_v + 2.
However, max() obscures this information.
-----------------------------------------------------------------------------
    if (min_v > max_v) {  // the branch is highly biased.
        stuff...
    }

    t = max(min_v, max_v);
    z = t + 2;
-----------------------------------------------------------------------------

Similar arguments apply to FMA formation, saturating add/sub recognition,
etc.

IMHO, if some passes need to recognize these patterns, they had better
proactively call some functions to recognize them, and, right after those
passes, lower them back to the "raw" form if the downstream passes don't
like these integrated instructions.
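
The point about the snippet can be sketched in C++ as follows. This is only an illustration under assumptions: `z_raw` and `z_integrated` are hypothetical names, and the counter `stuff_count` is a placeholder for the elided "stuff..." in the branch.

```cpp
#include <algorithm>

// "Raw" form: the max is spelled as compare + select, so the comparison
// is visibly the same one the branch uses and can be computed once.
int z_raw(int min_v, int max_v, int& stuff_count) {
    const bool gt = min_v > max_v;     // one comparison, shared
    if (gt) {
        ++stuff_count;                 // placeholder for "stuff..."
    }
    const int t = gt ? min_v : max_v;  // when gt holds, t is min_v
    return t + 2;                      // so z is min_v + 2 on that path
}

// "Integrated" form: the max is an opaque operation, so the link between
// the branch condition and the value of t is not directly visible.
int z_integrated(int min_v, int max_v, int& stuff_count) {
    if (min_v > max_v) {
        ++stuff_count;                 // placeholder for "stuff..."
    }
    const int t = std::max(min_v, max_v);
    return t + 2;
}
```

Both versions return the same result; the raw form simply exposes the shared condition to later scalar passes.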

On 12/5/12 8:26 AM, Redmond, Paul wrote:
> I have been working on a patch to add support for max/min reductions in
> LoopVectorize. One of the comments that came up in review is that the
> implementation could be simplified (and less fragile) if max and min intrinsics
> were recognized rather than looking for compare-select sequences.
>
> The suggestion was to change compare-selects into max and min intrinsic
> calls during instcombine.
>
> The intrinsics to add are:
> declare iN llvm.{smin,smax}.iN(iN %a, iN %b)
> declare iN llvm.{umin,umax}.iN(iN %a, iN %b)
> declare fN llvm.{fmin,fmax}.fN(fN %a, fN %b)
>
> What does the community think?
>
> Paul
>
>
>
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev

Chris Lattner

2012-Dec-05 19:48 UTC

[LLVMdev] max/min intrinsics

On Dec 5, 2012, at 8:26 AM, "Redmond, Paul" <paul.redmond at intel.com> wrote:
> I have been working on a patch to add support for max/min reductions in
> LoopVectorize. One of the comments that came up in review is that the
> implementation could be simplified (and less fragile) if max and min intrinsics
> were recognized rather than looking for compare-select sequences.
>
> The suggestion was to change compare-selects into max and min intrinsic
> calls during instcombine.
> 
> The intrinsics to add are:
> declare iN llvm.{smin,smax}.iN(iN %a, iN %b)
> declare iN llvm.{umin,umax}.iN(iN %a, iN %b)
> declare fN llvm.{fmin,fmax}.fN(fN %a, fN %b)
> 
> What does the community think?

It seems inevitable.  For the floating point version, please make it very clear
what the behavior of max(-0,+0) and related cases are.  This also means stuff
that matches compare/select idioms (e.g. llvm/Support/PatternMatch.h) will need
to be updated.

-Chris

Tim Northover

2012-Dec-05 20:43 UTC

[LLVMdev] max/min intrinsics

> It seems inevitable.  For the floating point version, please make it very
> clear what the behavior of max(-0,+0) and related cases are.

Along these lines, AArch64 has an instruction "FMAXNM". It returns the
maximum if neither value is NaN, but returns the number if just one
value is NaN. This is in addition to an "FMAX" which propagates NaNs.

I suspect you'll just want to consider this as an "oh yes, make sure
that the result is NaN if either input is" advisory notice, but I
haven't actually thought through the details of implementation yet.

Tim.
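
The two NaN policies described above can be modeled in a few lines of C++. This is a sketch of the behavior as described in this message only, not of the AArch64 instructions themselves (signaling NaNs and signed zeros are ignored here); `max_number` and `max_propagate` are hypothetical names.

```cpp
#include <cmath>
#include <limits>

// Models the "FMAXNM"-style policy as described: if just one input is
// NaN, return the other (numeric) input.
double max_number(double a, double b) {
    if (std::isnan(a)) return b;  // also yields NaN when both are NaN
    if (std::isnan(b)) return a;
    return a > b ? a : b;
}

// Models the "FMAX"-style policy as described: any NaN input makes the
// result NaN.
double max_propagate(double a, double b) {
    if (std::isnan(a) || std::isnan(b)) {
        return std::numeric_limits<double>::quiet_NaN();
    }
    return a > b ? a : b;
}
```

The two functions agree whenever neither input is NaN, which is why the choice between them only matters for the NaN cases an intrinsic definition has to spell out.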

Schoedel, Kevin P

2012-Dec-17 18:50 UTC

[LLVMdev] max/min intrinsics

On Wednesday, December 05, 2012 at 2:48 PM, Chris Lattner wrote:
> > What does the community think?
>
> It seems inevitable.  For the floating point version, please make it very clear
> what the behavior of max(-0,+0) and related cases are.

The following is our current proposal for llvm.fmax/fmin.*:

[1] If exactly one argument is a NaN, the intrinsic returns the other argument.
[2] If both arguments are NaN, the intrinsic returns a NaN.
[3] An SNaN may behave as a QNaN.
[4] If the arguments compare equal, the intrinsic returns a value that compares
equal to both arguments.
[5] Otherwise, the intrinsic returns the greater/lesser of the two arguments.

Rationale and notes:

Points [1] and [2] match the C/POSIX library functions' specs.

Point [3] matches the OpenCL library functions, and may permit some
implementations to test for NaNs less expensively.

Point [4] accounts for fmax(-0,+0) in IEEE 754 arithmetic, and any similar cases
that might exist in other systems (LLVM needs a VAX backend). IEEE specifies
that comparisons ignore the sign of zero, so requiring fmax to order ±0 would be
expensive on many systems, and is not necessary to support common library
functions.

The intrinsics can replace calls to the C and OpenCL library functions.

The intrinsics can be implemented as calls to the C or OpenCL library functions.
They can also be implemented by IEEE 754 maxNum()/minNum() operations (but not
vice versa).

The intrinsics are not equivalent to an fcmp/select sequence.
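
A small C++ model may help show why. It is a sketch under assumptions, not the proposed implementation: `select_max` models the plain compare/select idiom, `proposed_fmax` is one possible reading of rules [1] to [5] above, and both names are hypothetical.

```cpp
#include <cmath>

// The compare/select idiom: fcmp ogt followed by select.
double select_max(double a, double b) { return a > b ? a : b; }

// One possible model of the proposed rules for llvm.fmax.
double proposed_fmax(double a, double b) {
    if (std::isnan(a)) return b;  // [1]; gives a NaN if b is NaN too [2]
    if (std::isnan(b)) return a;  // [1]
    if (a == b) return a;         // [4]: either argument is acceptable
    return a > b ? a : b;         // [5]
}
```

With exactly one NaN input, `select_max` returns whatever its second operand is, so `select_max(1.0, NaN)` is NaN while `select_max(NaN, 1.0)` is 1.0; the model of the proposal returns 1.0 in both cases. For ±0 the idiom likewise just returns its second operand, which rule [4] permits but does not require.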

-- 
Kevin Schoedel, Software Developer, Intel of Canada
<kevin.p.schoedel at intel.com>      +1 (519) 772-2580
Disclaimer: the above just might possibly contain a
statement that is not an official opinion of Intel.
