thr3ads.net - llvm dev - [llvm-dev] how to simplify FP ops with an undef operand? [Mar 2018]

Kaylor, Andrew via llvm-dev

2018-Mar-01 18:07 UTC

[llvm-dev] how to simplify FP ops with an undef operand?

So you don’t think sNaNs can just be treated as if they were qNaNs? I understand
why we would want to ignore the signaling part of things, but the rules for
operating on NaNs are pretty clear and reasonable to implement. The signaling
aspect can, I think, be safely ignored when we are in the mode of assuming the
default FP environment.

As for the distinction between IEEE and LLVM IR, I would think we would want to
define LLVM IR in such a way that it is possible to create an IEEE-compliant
compiler. I know we’re not there yet, but we’re working toward it.

From: Chris Lattner [mailto:clattner at nondot.org]
Sent: Wednesday, February 28, 2018 8:42 PM
To: Friedman, Eli <efriedma at codeaurora.org>
Cc: Kaylor, Andrew <andrew.kaylor at intel.com>; Sanjay Patel <spatel
at rotateright.com>; Matt Arsenault <arsenm2 at gmail.com>; llvm-dev
<llvm-dev at lists.llvm.org>; John Regehr <regehr at cs.utah.edu>
Subject: Re: [llvm-dev] how to simplify FP ops with an undef operand?




On Feb 28, 2018, at 6:33 PM, Friedman, Eli <efriedma at codeaurora.org> wrote:
> On 2/28/2018 5:46 PM, Chris Lattner wrote:
>> On Feb 28, 2018, at 3:29 PM, Kaylor, Andrew via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>>> For the first part of Sanjay’s question, I think the answer is, “Yes, we can
>>> fold all of these to NaN in the general case.”
>>
>> Agreed.  Those IR instructions are undefined on SNAN, and that undef could take
>> on an SNAN value.  Folding these instructions to undef seems reasonable, and it
>> is arguable that you could even fold it to an ‘unreachable’.
>
> fdiv snan, snan is undefined?  As opposed to producing a qnan, as specified by
> IEEE-754?

You’re talking about IEEE, I’m talking about LLVM IR.  LLVM IR is undefined on
SNaNs.  It looks like LangRef isn’t clear about this; the only mention of SNaNs
is in this statement:

            “fdiv is not (currently) defined on SNaN’s.”

However, fdiv/fmul/etc are pervasively treated as not having side effects.  The
intention, and the only sensible definition for them, is that they are undefined
on SNaNs.

-Chris


Chris Lattner via llvm-dev

2018-Mar-02 05:32 UTC

[llvm-dev] how to simplify FP ops with an undef operand?

On Mar 1, 2018, at 10:07 AM, Kaylor, Andrew <andrew.kaylor at intel.com> wrote:
> So you don’t think sNaNs can just be treated as if they were qNaNs? I
> understand why we would want to ignore the signaling part of things, but the
> rules for operating on NaNs are pretty clear and reasonable to implement. The
> signaling aspect can, I think, be safely ignored when we are in the mode of
> assuming the default FP environment.
>
> As for the distinction between IEEE and LLVM IR, I would think we would
> want to define LLVM IR in such a way that it is possible to create an
> IEEE-compliant compiler. I know we’re not there yet, but we’re working toward
> it.

There appears to be confusion about the role of LLVM IR and its relation to
undef and undefined behavior, though at least it isn’t the first time :-)

Let me try to clarify.  Many LLVM IR instructions are only defined on some
inputs.  For inputs outside their domain, they have undefined behavior or
produce undefined results.  This isn’t perfectly codified (people are working
on it), but there are some things we *know* based on how the operations are
modeled and what the compiler does with them.

Hopefully uncontroversial points:

 - Floating point operations are represented in LLVM IR in two ways: the
fdiv/fmul/fadd etc instructions, and the llvm.experimental.constrained.*
intrinsic forms.

 - The instruction forms are modeled as having no side effects.  fdiv/frem trap
on divide by zero, but are otherwise defined on the same set of inputs as
fadd/fmul/etc.

- Because they have no side effects, these instructions can be reordered freely
(though for fdiv/frem, see footnote [1] below).  For example, it is legal to
transform this:

   foo(x,y)
   tmp = a+b

into:

   tmp = a+b
   foo(x,y)

This can occur for many reasons: for example, because the compiler decides it is
profitable (e.g. hoisting a loop invariant computation out of a loop), as a side
effect of instruction scheduling, selection dag not having chain nodes on the
ISD nodes, etc.

- Because the instruction forms have no side effects and can be reordered, they
are not ok to use in the face of non-standard rounding mode or trapping flags. 
This is the point of the experimental intrinsic forms, and the reason they
exist.

- The intrinsic forms are defined to allow explicit rounding mode control and
other features, but are also defined as having side effects.  This allows them
to be used in the face of rounding mode changes, but also forces the compiler
to be a lot more careful about speculating them.  These limits on speculation
are why we don’t just apply the semantics of the intrinsic forms to the
instructions.

- C99/C++ say nothing about SNaN’s, and there is some push to remove SNaN’s from
the IEEE 754 standard.  See, e.g., this page, which was one of the first hits I
found online (I’m sure there are others):
<http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1011.htm>.  I’m not
familiar with the state of the art in Java or other languages.

- The fact that C99 and C++ are undefined on SNaN’s by default, and default to
ignoring rounding modes, means that it is fine for clang to produce
fadd/fmul/fdiv instructions in the normal mode.  It only needs to generate the
intrinsic forms when the FENV_ACCESS pragma is set.
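
The rounding-mode point above can be illustrated outside LLVM. The following is a minimal sketch that uses the context of Python's `decimal` module as a stand-in for the floating-point environment. The names `foo`, `original_order` and `reordered` are hypothetical, and decimal arithmetic is only an analogy for fadd, not how LLVM models it. It shows that moving an addition across a call that changes the rounding mode changes the result, which is why the plain instructions are only usable when the default environment is assumed.

```python
from decimal import Decimal, localcontext, ROUND_CEILING, ROUND_FLOOR

A = Decimal("1.00")
B = Decimal("0.005")  # exact sum is 1.005, which needs 4 digits


def foo(ctx):
    """Hypothetical callee that changes the rounding mode (an environment side effect)."""
    ctx.rounding = ROUND_CEILING


def original_order(a, b):
    # foo(...) runs first, then tmp = a + b is rounded in the mode foo installed.
    with localcontext() as ctx:
        ctx.prec = 3
        ctx.rounding = ROUND_FLOOR
        foo(ctx)
        tmp = a + b
    return tmp


def reordered(a, b):
    # The addition was hoisted above foo(...), so it is rounded in the old mode.
    with localcontext() as ctx:
        ctx.prec = 3
        ctx.rounding = ROUND_FLOOR
        tmp = a + b
        foo(ctx)
    return tmp


if __name__ == "__main__":
    print(original_order(A, B), reordered(A, B))
```

The two functions differ only in the order of the call and the addition, yet they round the same sum in different directions. A compiler that reorders freely therefore has to assume nobody changes the mode.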

Potentially controversial points:

 - Because LLVM reorders and speculates the instruction forms, and because IEEE
defines the corresponding IEEE operations as trapping on SNaNs, it is clear that
SNaNs are outside of the domain of these LLVM operations.  Either speculation is
ok or trapping on SNaN is ok, pick one…  (and we already did :)

- Because the LLVM instructions are not defined on SNaNs, SNaNs are outside of
their domain, and thus the LLVM instructions are undefined on these inputs.  As
such, it would be perfectly reasonable to “constant fold” an “fadd SNaN, 42”
instruction into unreachable and delete all the code after it, or turn it
into a call to formatHardDrive().  [2]

- Because an ‘undef’ operand can be an arbitrary bit pattern representable by
the type, and because the f32/f64 etc *types* can represent SNaNs, it is within
the right of the compiler to constant fold “fadd undef, 42” into unreachable. 
QED.
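
The premise that an f64 undef may take on an SNaN bit pattern can be checked with a few lines of bit arithmetic. This is a minimal sketch assuming the binary64 layout with the most significant fraction bit acting as the quiet bit (the convention recommended by IEEE 754-2008 and used on x86 and ARM; some older targets invert it). The helper names are hypothetical.

```python
import struct

EXP_MASK = 0x7FF0_0000_0000_0000   # 11 exponent bits
FRAC_MASK = 0x000F_FFFF_FFFF_FFFF  # 52 fraction bits
QUIET_BIT = 0x0008_0000_0000_0000  # top fraction bit: set means quiet (assumed convention)


def classify_f64(bits: int) -> str:
    """Classify a 64-bit pattern as 'snan', 'qnan' or 'other'."""
    if (bits & EXP_MASK) == EXP_MASK and (bits & FRAC_MASK) != 0:
        return "qnan" if bits & QUIET_BIT else "snan"
    return "other"  # finite values, zeros and infinities


def bits_of(x: float) -> int:
    """Reinterpret a Python float (binary64) as an unsigned 64-bit integer."""
    return struct.unpack("<Q", struct.pack("<d", x))[0]


# Number of distinct sNaN encodings: two signs times every nonzero
# fraction whose quiet bit is clear.
SNAN_PATTERNS = 2 * (2**51 - 1)

if __name__ == "__main__":
    print(classify_f64(0x7FF0_0000_0000_0001), SNAN_PATTERNS)
```

Since a value that can be "any bit pattern of the type" includes all of these encodings, any rule about SNaN operands automatically applies to undef operands as well.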

Summary and Recommendation:

I don’t see any way around this, and I thought this was always the documented
behavior in LangRef.  It seems that it was never documented, and that has led to
confusion on this thread.  I’d love to be surprised and find out that I’ve
misinterpreted things (I’m no fan of UB!!!) but I don’t see a way around this. 
This is just logical behavior that flows from how the compiler works and how it
has always worked.

All that said, in my opinion, while it is within the “right” of the
compiler to constant fold these things to unreachable, I see no motivation to
actually do so.   LLVM has gone out of its way to define some simple forms of UB
like trivial TBAA violations, and I see no downside to being nicer here. The
code generator currently turns a floating point undef into a reference to some
random FP register, which (at worst) causes an SNaN trap, but could just be a
silent failure.

As such, my recommendation is to simply document these as having UB when
presented with SNaN inputs, but make the constant folder/instcombine/... fold
“fadd undef, X” into “undef” instead of “unreachable”.

In theory we could go further and define a new class of UB concepts in LLVM IR
along the lines of “produces an undetermined value or traps, but doesn’t cause
arbitrary UB”, but that is a huge ball of wax with far-reaching implications.

-Chris

[1] IIRC, we are more conservative about speculating divide/rem instructions
because of divide by zero.  If that is true, it is possible we could handle
these better than described above.

[2] Of course, executing an ‘unreachable’ instruction *can* format your hard
drive, if the unreachable is at the bottom of the current function, and if the
fall through function formats your hard drive...

Robin Kruppe via llvm-dev

2018-Mar-02 13:59 UTC

[llvm-dev] how to simplify FP ops with an undef operand?

On 2 March 2018 at 06:32, Chris Lattner via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> Potentially controversial points:
>
>  - Because LLVM reorders and speculates the instruction forms, and because
> IEEE defines the corresponding IEEE operations as trapping on SNaNs, it is
> clear that SNaNs are outside of the domain of these LLVM operations.
> Either speculation is ok or trapping on SNaN is ok, pick one…  (and we already
> did :)

Whether operations on sNaNs trap in the "default execution environment", or
otherwise interrupt normal control flow or have side effects, seems to be
the key point of disagreement here. I don't believe they do, at least as
far as my amateur reading of IEEE 754-2008 can tell:

1) (most) operations on sNaN signal an _invalid operation_ exception
(§7.2), and so do many other operations on other values (also §7.2), such
as: 0 * inf, inf / inf, fma(0, inf, x), sqrt on negative inputs, converting
a float to an integer when the source is NaN/is infinity/does not fit in
the destination type, etc.
2) IEEE specifies a default way of handling exceptions (§7.1), which for
_invalid operation_ is returning a quiet NaN (§7.2).
3) Language standards should offer a way to override the default exception
handling (§8.1).
4) _Immediate_ alternate exception handling (§8.3) can be implemented via
traps (§8.3, NOTE 2).

As I said I'm not an expert on this standard, but it seems very clear-cut
to me that IEEE specifies operations like divide(x, sNaN) should return a
quiet NaN, nothing else, unless the program uses language-provided
facilities to install some other behavior. In this respect sNaN operations
are not any different from other invalid, inexact, overflowing, etc.
operations (as Steve already said).

If this is the case, there is no reason to treat e.g. "fdiv %x, snan" as
having side effects or some sort of UB: fdiv and friends already assume a
"default" fenv where nobody looks at flags, changes rounding modes,
installs alternative exception handling, etc. so the invalid operation
exception from sNaN operands is just as irrelevant as all the other
exceptions are. LLVM can simply assume the default exception handling (as
it already does in many cases) and fold calculations on signaling NaNs to
quiet NaNs if it so wishes.
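
The reading of §7.1 and §7.2 above can be written down as a small software model. This is a minimal sketch, not LLVM's or any hardware's implementation: it assumes binary64 with the most significant fraction bit as the quiet bit, models only the _invalid operation_ flag, and propagates the first NaN operand's payload (the choice of payload is an assumption; implementations differ). The function name `fadd_default` is hypothetical.

```python
import math
import struct

EXP_MASK = 0x7FF0_0000_0000_0000
FRAC_MASK = 0x000F_FFFF_FFFF_FFFF
QUIET_BIT = 0x0008_0000_0000_0000  # assumed quiet-bit convention


def to_bits(x: float) -> int:
    return struct.unpack("<Q", struct.pack("<d", x))[0]


def to_float(bits: int) -> float:
    return struct.unpack("<d", struct.pack("<Q", bits))[0]


def is_nan(bits: int) -> bool:
    return (bits & EXP_MASK) == EXP_MASK and (bits & FRAC_MASK) != 0


def is_snan(bits: int) -> bool:
    return is_nan(bits) and not (bits & QUIET_BIT)


def fadd_default(a: int, b: int):
    """Add two binary64 bit patterns under default exception handling.

    Returns (result_bits, flags). Nothing traps: an exception only
    records a flag and the operation still delivers a result.
    """
    flags = set()
    if is_nan(a) or is_nan(b):
        if is_snan(a) or is_snan(b):
            flags.add("invalid")           # signaled, but control flow continues
        payload = a if is_nan(a) else b    # assumption: first NaN operand wins
        return payload | QUIET_BIT, flags  # the delivered result is always quiet
    result = to_float(a) + to_float(b)
    if math.isnan(result):                 # e.g. inf + (-inf)
        flags.add("invalid")
    return to_bits(result), flags


if __name__ == "__main__":
    r, f = fadd_default(0x7FF0_0000_0000_0001, to_bits(1.0))
    print(hex(r), f)
```

In this model an sNaN operand behaves exactly like `inf + (-inf)`: both raise the same flag and both yield a quiet NaN, so a compiler that already ignores the flags has nothing extra to preserve for sNaNs.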

I have not surveyed the numerous hardware implementations (and everything
else that goes into the "default execution environment", e.g., what the OS
does), so it might be that some of those default to trapping on sNaNs. I've
never heard of such a thing, and just verified that it does not happen on
my x86_64 machine, but there's a lot of weirdness out there. If you know of
any targets that trap on sNaN by default, please tell us. Otherwise, going
only by IEEE (as you yourself did), I don't see how traps could be a
possibility without the program opting into fenv access (in which case the
frontend has to emit constrained intrinsics anyway).

Cheers,
Robin

Stephen Canon via llvm-dev

2018-Mar-02 16:31 UTC

[llvm-dev] how to simplify FP ops with an undef operand?

Thanks for expanding, Chris. Responses inline.

> On Mar 2, 2018, at 12:32 AM, Chris Lattner via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>
> <snip>
>
>  - Because LLVM reorders and speculates the instruction forms, and because
> IEEE defines the corresponding IEEE operations as trapping on SNaNs, it is clear
> that SNaNs are outside of the domain of these LLVM operations.  Either
> speculation is ok or trapping on SNaN is ok, pick one…  (and we already did :)

I see the source of confusion now.

IEEE does not define any operations as trapping on sNaN. It defines operations
as raising the invalid flag on sNaN, which is *not a trap* under default
exception handling. It is exactly the same as raising the underflow, overflow,
inexact, or division-by-zero flag.

Any LLVM instruction necessarily assumes default exception handling; otherwise,
we would be using the constrained intrinsics instead. So there’s no reason for
sNaN inputs to ever be undef with the LLVM instructions. They are just NaNs.

– Steve

Steve (Numerics) Canon via llvm-dev

2018-Mar-03 21:55 UTC

[llvm-dev] how to simplify FP ops with an undef operand?

On Mar 3, 2018, at 15:54, Chris Lattner <clattner at nondot.org> wrote:
>> On Mar 2, 2018, at 8:31 AM, Stephen Canon <scanon at apple.com> wrote:
>>
>> Thanks for expanding, Chris. Responses inline.
>>
>>> On Mar 2, 2018, at 12:32 AM, Chris Lattner via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>>>
>>> <snip>
>>>
>>>  - Because LLVM reorders and speculates the instruction forms, and
>>> because IEEE defines the corresponding IEEE operations as trapping on SNaNs, it
>>> is clear that SNaNs are outside of the domain of these LLVM operations.  Either
>>> speculation is ok or trapping on SNaN is ok, pick one…  (and we already did :)
>>
>> I see the source of confusion now.
>>
>> IEEE does not define any operations as trapping on sNaN. It defines
>> operations as raising the invalid flag on sNaN, which is *not a trap* under
>> default exception handling. It is exactly the same as raising the underflow,
>> overflow, inexact, or division-by-zero flag.
>>
>> Any llvm instruction necessarily assumes default exception handling;
>> otherwise, we would be using the constrained intrinsics instead. So
>> there’s no reason for sNaN inputs to ever be undef with the llvm instructions.
>> They are just NaNs.
>
> Ah yes, I completely misunderstood that!  Thank you for clarifying.  In
> that case, it seems perfectly reasonable for “fadd undef, 1” to fold to undef,
> right?

Yes, indeed.

Chris Lattner via llvm-dev

2018-Mar-04 16:24 UTC

[llvm-dev] how to simplify FP ops with an undef operand?

> On Mar 3, 2018, at 1:55 PM, Steve (Numerics) Canon <scanon at apple.com> wrote:
>
> On Mar 3, 2018, at 15:54, Chris Lattner <clattner at nondot.org> wrote:
>
>>> On Mar 2, 2018, at 8:31 AM, Stephen Canon <scanon at apple.com> wrote:
>>>
>>> Thanks for expanding, Chris. Responses inline.
>>>
>>>> On Mar 2, 2018, at 12:32 AM, Chris Lattner via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>>>>
>>>> <snip>
>>>>
>>>>  - Because LLVM reorders and speculates the instruction forms,
>>>> and because IEEE defines the corresponding IEEE operations as trapping on SNaNs,
>>>> it is clear that SNaNs are outside of the domain of these LLVM operations.
>>>> Either speculation is ok or trapping on SNaN is ok, pick one…  (and we already
>>>> did :)
>>>
>>> I see the source of confusion now.
>>>
>>> IEEE does not define any operations as trapping on sNaN. It defines
>>> operations as raising the invalid flag on sNaN, which is *not a trap* under
>>> default exception handling. It is exactly the same as raising the underflow,
>>> overflow, inexact, or division-by-zero flag.
>>>
>>> Any llvm instruction necessarily assumes default exception handling;
>>> otherwise, we would be using the constrained intrinsics instead. So
>>> there’s no reason for sNaN inputs to ever be undef with the llvm instructions.
>>> They are just NaNs.
>>
>> Ah yes, I completely misunderstood that!  Thank you for clarifying.  In
>> that case, it seems perfectly reasonable for “fadd undef, 1” to fold to undef,
>> right?
>
> Yes, indeed.

Great! Can someone please update LangRef so we codify this for the next time I
forget? :-)

-Chris

Ralf Jung via llvm-dev

2018-Mar-06 09:55 UTC

[llvm-dev] how to simplify FP ops with an undef operand?

Hi,
> *Hopefully uncontroversial points:*
>
>  - Floating point operations are represented in LLVM IR in two ways: the
> fdiv/fmul/fadd etc /instructions/, and the llvm.experimental.constrained.*
> /intrinsic/ forms.
>
>  - The instruction forms are modeled as having no side effects.  fdiv/frem trap
> on divide by zero, but are otherwise defined on the same set of inputs as
> fadd/fmul/etc.
>
> - Because they have no side effects, these instructions can be reordered freely
> (though for fdiv/frem, see footnote [1] below).  For example, it is legal to
> transform this:
>
>    foo(x,y)
>    tmp = a+b
>
> into:
>
>    tmp = a+b
>    foo(x,y)
>
> This can occur for many reasons: for example, because the compiler decides it is
> profitable (e.g. hoisting a loop invariant computation out of a loop), as a side
> effect of instruction scheduling, selection dag not having chain nodes on the
> ISD nodes, etc.

[snip]

> - Because the LLVM instructions are not defined on SNaNs, SNaNs are outside of
> their domain, and thus the LLVM instructions are undefined on these inputs.  As
> such, it would be perfectly reasonable to “constant fold” an “fadd SNaN, 42”
> instruction into unreachable and delete all the code after it, or turn it into a
> call to formatHardDrive().  [2]

Isn't "possibly raises UB" in contradiction with "does not have side-effects"?
In your reordering example quoted above, if `foo` never returns but `a+b` raises
UB, then doing the reordering could introduce UB into the program.  Returning
undef or poison should be fine, but raising UB or calling formatHardDrive()
seems to be incompatible with desired optimizations.  Did I miss something?
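
The objection can be sketched as a toy simulation. This is only an illustration under stated assumptions: UB is modeled as a Python exception, a non-returning `foo` as a different exception, and `SNAN`, `POISON`, `fadd`, `original`, `hoisted` and `observe` are hypothetical names rather than anything from LLVM.

```python
class UndefinedBehavior(Exception):
    """Stands in for 'anything may happen'."""


class Diverges(Exception):
    """Stands in for a call that never returns (exit, infinite loop, ...)."""


SNAN = "snan"      # symbolic operand outside the instruction's domain
POISON = object()  # symbolic 'some unspecified value' result


def foo():
    raise Diverges()


def fadd(a, b, on_bad_input):
    """Toy fadd whose behavior on SNaN is either 'ub' or 'poison'."""
    if a == SNAN or b == SNAN:
        if on_bad_input == "ub":
            raise UndefinedBehavior("fadd on sNaN")
        return POISON
    return a + b


def original(a, b, on_bad_input):
    foo()                            # never returns, so fadd is never executed
    return fadd(a, b, on_bad_input)


def hoisted(a, b, on_bad_input):
    tmp = fadd(a, b, on_bad_input)   # moved above the call
    foo()
    return tmp


def observe(program, *args):
    try:
        program(*args)
    except Diverges:
        return "foo diverged"
    except UndefinedBehavior:
        return "UB"
    return "returned"


if __name__ == "__main__":
    print(observe(original, SNAN, 42.0, "ub"), observe(hoisted, SNAN, 42.0, "ub"))
```

With the "ub" semantics the hoisted program exhibits UB that the original never reaches, while with the "poison" semantics both orders are indistinguishable. That is the sense in which free reordering and UB on some inputs cannot both hold.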

Kind regards,
Ralf
