Kaylor, Andrew via llvm-dev
2018-Mar-01 18:07 UTC
[llvm-dev] how to simplify FP ops with an undef operand?
So you don’t think sNaNs can just be treated as if they were qNaNs? I understand why we would want to ignore the signaling part of things, but the rules for operating on NaNs are pretty clear and reasonable to implement. The signaling aspect can, I think, be safely ignored when we are in the mode of assuming the default FP environment.

As for the distinction between IEEE and LLVM IR, I would think we would want to define LLVM IR in such a way that it is possible to create an IEEE-compliant compiler. I know we’re not there yet, but we’re working toward it.

From: Chris Lattner [mailto:clattner at nondot.org]
Sent: Wednesday, February 28, 2018 8:42 PM
To: Friedman, Eli <efriedma at codeaurora.org>
Cc: Kaylor, Andrew <andrew.kaylor at intel.com>; Sanjay Patel <spatel at rotateright.com>; Matt Arsenault <arsenm2 at gmail.com>; llvm-dev <llvm-dev at lists.llvm.org>; John Regehr <regehr at cs.utah.edu>
Subject: Re: [llvm-dev] how to simplify FP ops with an undef operand?

On Feb 28, 2018, at 6:33 PM, Friedman, Eli <efriedma at codeaurora.org> wrote:
> On 2/28/2018 5:46 PM, Chris Lattner wrote:
>> On Feb 28, 2018, at 3:29 PM, Kaylor, Andrew via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>>> For the first part of Sanjay’s question, I think the answer is, “Yes, we can fold all of these to NaN in the general case.”
>>
>> Agreed. Those IR instructions are undefined on SNAN, and that undef could take on an SNAN value. Folding these instructions to undef seems reasonable, and it is arguable that you could even fold it to an ‘unreachable’.
>
> fdiv snan, snan is undefined? As opposed to producing a qnan, as specified by IEEE-754?

You’re talking about IEEE, I’m talking about LLVM IR. LLVM IR is undefined on SNaNs.

It looks like LangRef isn’t clear about this; the only mention of SNaNs is in this statement:

"fdiv is not (currently) defined on SNaN’s.”

However, fdiv/fmul/etc are pervasively treated as not having side effects. The intention, and the only sensible definition for them, is that they are undefined on SNaNs.

-Chris
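To make the sNaN/qNaN distinction in this exchange concrete, here is a minimal sketch that classifies IEEE 754 binary32 bit patterns. It assumes the quiet-bit convention recommended by IEEE 754-2008 (most significant significand bit set means quiet); some older targets used the opposite encoding. The helper names `is_nan`, `is_signaling`, and `quiet` are hypothetical and are not LLVM APIs.

```python
# Assumption: IEEE 754-2008 binary32 layout with "quiet bit set = quiet NaN".
EXP_MASK = 0x7F800000   # all-ones exponent marks infinity or NaN
FRAC_MASK = 0x007FFFFF  # 23-bit trailing significand field
QUIET_BIT = 0x00400000  # most significant bit of the significand field


def is_nan(bits: int) -> bool:
    """True if the 32-bit pattern encodes any NaN (quiet or signaling)."""
    return (bits & EXP_MASK) == EXP_MASK and (bits & FRAC_MASK) != 0


def is_signaling(bits: int) -> bool:
    """True if the pattern is a NaN whose quiet bit is clear."""
    return is_nan(bits) and (bits & QUIET_BIT) == 0


def quiet(bits: int) -> int:
    """Set the quiet bit on a NaN; every other pattern passes through."""
    return bits | QUIET_BIT if is_nan(bits) else bits


SNAN = 0x7FA00000  # example signaling NaN pattern
QNAN = 0x7FC00000  # example quiet NaN pattern
```

Under this reading, "treating sNaNs as if they were qNaNs" amounts to applying `quiet` to an operand before reasoning about the result: the payload survives and only the signaling property is dropped.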
Chris Lattner via llvm-dev
2018-Mar-02 05:32 UTC
[llvm-dev] how to simplify FP ops with an undef operand?
On Mar 1, 2018, at 10:07 AM, Kaylor, Andrew <andrew.kaylor at intel.com> wrote:
> So you don’t think sNaNs can just be treated as if they were qNaNs? I understand why we would want to ignore the signaling part of things, but the rules for operating on NaNs are pretty clear and reasonable to implement. The signaling aspect can, I think, be safely ignored when we are in the mode of assuming the default FP environment.
>
> As for the distinction between IEEE and LLVM IR, I would think we would want to define LLVM IR in such a way that it is possible to create an IEEE-compliant compiler. I know we’re not there yet, but we’re working toward it.

There appears to be confusion about the role of LLVM IR and its relation to undef and undefined behavior; at least it isn’t the first time :-) Let me try to clarify.

Many LLVM IR instructions are only defined on some inputs. For inputs outside their domain, they have undefined behavior or produce undefined results. This isn’t perfectly codified, though people are working on it, but there are some things we *know* based on how the operations are modeled and what the compiler does with them.

Hopefully uncontroversial points:

- Floating point operations are represented in LLVM IR in two ways: the fdiv/fmul/fadd etc instructions, and the llvm.experimental.constrained.* intrinsic forms.

- The instruction forms are modeled as having no side effects. fdiv/frem trap on divide by zero, but are otherwise defined on the same set of inputs as fadd/fmul/etc.

- Because they have no side effects, these instructions can be reordered freely (though for fdiv/frem, see footnote [1] below). For example, it is legal to transform this:

    foo(x,y)
    tmp = a+b

  into:

    tmp = a+b
    foo(x,y)

  This can occur for many reasons: for example, because the compiler decides it is profitable (e.g. hoisting a loop invariant computation out of a loop), as a side effect of instruction scheduling, SelectionDAG not having chain nodes on the ISD nodes, etc.

- Because the instruction forms have no side effects and can be reordered, they are not ok to use in the face of non-standard rounding modes or trapping flags. This is the point of the experimental intrinsic forms, and the reason they exist.

- The intrinsic forms are defined to allow explicit rounding mode control and other features, but are also defined as having side effects. This allows them to be used in the face of rounding mode changes, but also means speculation has to be a lot more careful. These limitations on speculation are why we don’t just apply the semantics of the intrinsic forms to the instructions.

- C99/C++ say nothing about SNaN’s, and there is some push to remove SNaN’s from the IEEE 754 standard. See, e.g., this page, which was one of the first hits I found online (I’m sure there are others): http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1011.htm. I’m not familiar with the state of the art in Java or other languages.

- The fact that C99 and C++ are undefined on SNaN’s by default, and default to ignoring rounding modes, means that it is fine for clang to produce fadd/fmul/fdiv instructions in the normal mode. It only needs to generate the intrinsic forms when the FENV_ACCESS pragma is set.

Potentially controversial points:

- Because LLVM reorders and speculates the instruction forms, and because IEEE defines the corresponding IEEE operations as trapping on SNaNs, it is clear that SNaNs are outside of the domain of these LLVM operations. Either speculation is ok or trapping on SNaN is ok, pick one… (and we already did :)

- Because the LLVM instructions are not defined on SNaNs, SNaNs are outside of their domain, and thus the LLVM instructions are undefined on these inputs. As such, it would be perfectly reasonable to “constant fold” an "fadd SNaN, 42” instruction into unreachable and delete all the code after it, or turn it into a call to formatHardDrive(). [2]

- Because an ‘undef’ operand can be an arbitrary bit pattern representable by the type, and because the f32/f64 etc *types* can represent SNaNs, it is within the right of the compiler to constant fold “fadd undef, 42” into unreachable. QED.

Summary and Recommendation:

I don’t see any way around this, and I thought this was always the documented behavior in LangRef. It seems that it was never documented, and that has led to confusion on this thread. I’d love to be surprised and find out that I’ve misinterpreted things (I’m no fan of UB!!!) but I don’t see a way around this. This is just logical behavior that flows from how the compiler works and how it has always worked.

All that said, in my opinion, while it is within the “right" of the compiler to constant fold these things to unreachable, I see no motivation to actually do so. LLVM has gone out of its way to define some simple forms of UB like trivial TBAA violations, and I see no downside to being nicer here. The code generator currently turns a floating point undef into a reference to some random FP register, which (at worst) causes an SNaN trap, but could just be a silent failure.

As such, my recommendation is to simply document these as having UB when presented with SNaN inputs, but make the constant folder/instcombine/... fold “fadd undef, X” into “undef” instead of “unreachable”.

In theory we could go further and define a new class of UB concepts in LLVM IR along the lines of “produces an undetermined value or traps, but doesn’t cause arbitrary UB”, but that is a huge ball of wax with far-reaching implications.

-Chris

[1] IIRC, we are more conservative about speculating divide/rem instructions because of divide by zero. If that is true, it is possible we could handle these better than described above.

[2] Of course, executing an ‘unreachable’ instruction *can* format your hard drive, if the unreachable is at the bottom of the current function, and if the fall through function formats your hard drive...
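The folding choices discussed in this message can be lined up side by side in a toy constant folder. This is only an illustrative sketch and not LLVM's implementation: `UNDEF`, `fold_fadd`, and the `policy` names are hypothetical, and an operand is modeled as either a Python float or the `UNDEF` sentinel.

```python
import math

# Hypothetical stand-in for an IR 'undef' operand (not an LLVM API).
UNDEF = object()


def fold_fadd(lhs, rhs, policy="undef"):
    """Toy fold of 'fadd lhs, rhs' under one of the policies from the thread.

    policy="undef":       fold to undef (the recommendation in this message)
    policy="nan":         fold to NaN (undef may be chosen to be a NaN)
    policy="unreachable": fold to unreachable (the strict-UB reading)
    """
    if lhs is UNDEF or rhs is UNDEF:
        if policy == "undef":
            return UNDEF
        if policy == "nan":
            return math.nan
        if policy == "unreachable":
            return "unreachable"
        raise ValueError(f"unknown policy: {policy}")
    # Both operands are known constants: ordinary constant folding.
    return lhs + rhs
```

For example, `fold_fadd(UNDEF, 42.0)` yields `UNDEF`, while `fold_fadd(UNDEF, 42.0, policy="unreachable")` models the more aggressive fold that the message argues is permitted but not worth doing.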
Robin Kruppe via llvm-dev
2018-Mar-02 13:59 UTC
[llvm-dev] how to simplify FP ops with an undef operand?
On 2 March 2018 at 06:32, Chris Lattner via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> Potentially controversial points:
>
> - Because LLVM reorders and speculates the instruction forms, and because IEEE defines the corresponding IEEE operations as trapping on SNaNs, it is clear that SNaNs are outside of the domain of these LLVM operations. Either speculation is ok or trapping on SNaN is ok, pick one… (and we already did :)

Whether operations on sNaNs trap in the "default execution environment", or otherwise interrupt normal control flow or have side effects, seems to be the key point of disagreement here. I don't believe they do, at least as far as my amateur reading of IEEE 754-2008 can tell:

1) (most) operations on sNaN signal an _invalid operation_ exception (§7.2), and so do many other operations on other values (also §7.2), such as: 0 * inf, inf / inf, fma(0, inf, x), sqrt on negative inputs, converting a float to an integer when the source is NaN/is infinity/does not fit in the destination type, etc.
2) IEEE specifies a default way of handling exceptions (§7.1), which for _invalid operation_ is returning a quiet NaN (§7.2).
3) Language standards should offer a way to override the default exception handling (§8.1).
4) _Immediate_ alternate exception handling (§8.3) can be implemented via traps (§8.3, NOTE 2).

As I said I'm not an expert on this standard, but it seems very clear-cut to me that IEEE specifies operations like divide(x, sNaN) should return a quiet NaN, nothing else, unless the program uses language-provided facilities to install some other behavior. In this respect sNaN operations are not any different from other invalid, inexact, overflowing, etc. operations (as Steve already said).

If this is the case, there is no reason to treat e.g. "fdiv %x, snan" as having side effects or some sort of UB: fdiv and friends already assume a "default" fenv where nobody looks at flags, changes rounding modes, installs alternative exception handling, etc., so the invalid operation exception from sNaN operands is just as irrelevant as all the other exceptions are. LLVM can simply assume the default exception handling (as it already does in many cases) and fold calculations on signaling NaNs to quiet NaNs if it so wishes.

I have not surveyed the numerous hardware implementations (and everything else that goes into the "default execution environment", e.g., what the OS does), so it might be that some of those default to trapping on sNaNs. I've never heard of such a thing, and just verified that it does not happen on my x86_64 machine, but there's a lot of weirdness out there. If you know of any targets that trap on sNaN by default, please tell us. Otherwise, going only by IEEE (as you yourself did), I don't see how traps could be a possibility without the program opting into fenv access (in which case the frontend has to emit constrained intrinsics anyway).

Cheers,
Robin
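The default exception handling described in points 1) and 2) can be sketched as a small software model of division. This is an assumption-laden illustration, not a transcription of the standard: `SNAN` is a hypothetical tag for a signaling NaN operand (Python floats cannot reliably carry that distinction), `flags` is a plain set standing in for the sticky status flags, and only the invalid and divideByZero cases are modeled.

```python
import math

# Hypothetical tag for a signaling NaN operand.
SNAN = "sNaN"


def divide(x, y, flags):
    """Model division under default exception handling.

    Exceptions only add a sticky flag to `flags`; the function always
    returns normally, so control flow is never interrupted.
    """
    if x is SNAN or y is SNAN:
        flags.add("invalid")       # signaled, but handled by default
        return math.nan            # default result is a quiet NaN
    if math.isnan(x) or math.isnan(y):
        return math.nan            # quiet NaNs propagate without signaling
    if (x == 0 and y == 0) or (math.isinf(x) and math.isinf(y)):
        flags.add("invalid")       # 0/0 and inf/inf are invalid as well
        return math.nan
    if y == 0:
        if not math.isinf(x):
            flags.add("divideByZero")  # finite nonzero dividend only
        sign = math.copysign(1.0, x) * math.copysign(1.0, y)
        return math.copysign(math.inf, sign)
    return x / y
```

In this model `divide(1.0, SNAN, flags)` behaves just like `divide(0.0, 0.0, flags)`: both return a quiet NaN and record `"invalid"`. A program that never inspects `flags` cannot observe any difference between the two, which is the situation the plain fdiv instruction is argued to assume.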
Stephen Canon via llvm-dev
2018-Mar-02 16:31 UTC
[llvm-dev] how to simplify FP ops with an undef operand?
Thanks for expanding, Chris. Responses inline.

> On Mar 2, 2018, at 12:32 AM, Chris Lattner via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>
> <snip>
>
> - Because LLVM reorders and speculates the instruction forms, and because IEEE defines the corresponding IEEE operations as trapping on SNaNs, it is clear that SNaNs are outside of the domain of these LLVM operations. Either speculation is ok or trapping on SNaN is ok, pick one… (and we already did :)

I see the source of confusion now.

IEEE does not define any operations as trapping on sNaN. It defines operations as raising the invalid flag on sNaN, which is *not a trap* under default exception handling. It is exactly the same as raising the underflow, overflow, inexact, or division-by-zero flag.

Any llvm instruction necessarily assumes default exception handling; otherwise, we would be using the constrained intrinsics instead. So there’s no reason for sNaN inputs to ever be undef with the llvm instructions. They are just NaNs.

– Steve
Steve (Numerics) Canon via llvm-dev
2018-Mar-03 21:55 UTC
[llvm-dev] how to simplify FP ops with an undef operand?
On Mar 3, 2018, at 15:54, Chris Lattner <clattner at nondot.org> wrote:

>> On Mar 2, 2018, at 8:31 AM, Stephen Canon <scanon at apple.com> wrote:
>>
>> Thanks for expanding, Chris. Responses inline.
>>
>>> On Mar 2, 2018, at 12:32 AM, Chris Lattner via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>>>
>>> <snip>
>>>
>>> - Because LLVM reorders and speculates the instruction forms, and because IEEE defines the corresponding IEEE operations as trapping on SNaNs, it is clear that SNaNs are outside of the domain of these LLVM operations. Either speculation is ok or trapping on SNaN is ok, pick one… (and we already did :)
>>
>> I see the source of confusion now.
>>
>> IEEE does not define any operations as trapping on sNaN. It defines operations as raising the invalid flag on sNaN, which is *not a trap* under default exception handling. It is exactly the same as raising the underflow, overflow, inexact, or division-by-zero flag.
>>
>> Any llvm instruction necessarily assumes default exception handling; otherwise, we would be using the constrained intrinsics instead. So there’s no reason for sNaN inputs to ever be undef with the llvm instructions. They are just NaNs.
>
> Ah yes, I completely misunderstood that! Thank you for clarifying. In that case, it seems perfectly reasonable for “fadd undef, 1” to fold to undef, right?

Yes, indeed.
Chris Lattner via llvm-dev
2018-Mar-04 16:24 UTC
[llvm-dev] how to simplify FP ops with an undef operand?
> On Mar 3, 2018, at 1:55 PM, Steve (Numerics) Canon <scanon at apple.com> wrote:
>
> On Mar 3, 2018, at 15:54, Chris Lattner <clattner at nondot.org> wrote:
>
>>> On Mar 2, 2018, at 8:31 AM, Stephen Canon <scanon at apple.com> wrote:
>>>
>>> Thanks for expanding, Chris. Responses inline.
>>>
>>>> On Mar 2, 2018, at 12:32 AM, Chris Lattner via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>>>>
>>>> <snip>
>>>>
>>>> - Because LLVM reorders and speculates the instruction forms, and because IEEE defines the corresponding IEEE operations as trapping on SNaNs, it is clear that SNaNs are outside of the domain of these LLVM operations. Either speculation is ok or trapping on SNaN is ok, pick one… (and we already did :)
>>>
>>> I see the source of confusion now.
>>>
>>> IEEE does not define any operations as trapping on sNaN. It defines operations as raising the invalid flag on sNaN, which is *not a trap* under default exception handling. It is exactly the same as raising the underflow, overflow, inexact, or division-by-zero flag.
>>>
>>> Any llvm instruction necessarily assumes default exception handling; otherwise, we would be using the constrained intrinsics instead. So there’s no reason for sNaN inputs to ever be undef with the llvm instructions. They are just NaNs.
>>
>> Ah yes, I completely misunderstood that! Thank you for clarifying. In that case, it seems perfectly reasonable for “fadd undef, 1” to fold to undef, right?
>
> Yes, indeed.

Great! Can someone please update LangRef so we codify this for the next time I forget? :-)

-Chris
Ralf Jung via llvm-dev
2018-Mar-06 09:55 UTC
[llvm-dev] how to simplify FP ops with an undef operand?
Hi,

> *Hopefully uncontroversial points:*
>
> - Floating point operations are represented in LLVM IR in two ways: the fdiv/fmul/fadd etc /instructions/, and the llvm.experimental.constrained.* /intrinsic/ forms.
>
> - The instruction forms are modeled as having no side effects. fdiv/frem trap on divide by zero, but are otherwise defined on the same set of inputs as fadd/fmul/etc.
>
> - Because they have no side effects, these instructions can be reordered freely (though for fdiv/frem, see footnote [1] below). For example, it is legal to transform this:
>
>     foo(x,y)
>     tmp = a+b
>
>   into:
>
>     tmp = a+b
>     foo(x,y)
>
>   This can occur for many reasons: for example, because the compiler decides it is profitable (e.g. hoisting a loop invariant computation out of a loop), as a side effect of instruction scheduling, selection dag not having chain nodes on the ISD nodes, etc.

[snip]

> - Because the LLVM instructions are not defined on SNaNs, SNaNs are outside of their domain, and thus the LLVM instructions are undefined on these inputs. As such, it would be perfectly reasonable to “constant fold” an "fadd SNaN, 42” instruction into unreachable and delete all the code after it, or turn it into a call to formatHardDrive(). [2]

Isn't "possibly raises UB" in contradiction with "does not have side-effects"? In your reordering example quoted above, if `foo` never returns but `a+b` raises UB, then doing the reordering could introduce UB into the program. Returning undef or poison should be fine, but raising UB or calling formatHardDrive() seems to be incompatible with desired optimizations.

Did I miss something?

Kind regards,
Ralf
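The objection above can be checked with a toy model of the reordering example. Everything here is a hypothetical sketch: a program is a list of Python callables, a call that never returns is modeled by raising `Diverges`, and undefined behavior is modeled by raising `UndefinedBehavior` so that it can be observed.

```python
class Diverges(Exception):
    """Models a call that never returns to its caller."""


class UndefinedBehavior(Exception):
    """Models reaching an operation that has undefined behavior."""


POISON = object()  # hypothetical stand-in for an undef/poison result


def foo():
    raise Diverges()


def add_with_ub():
    # 'tmp = a+b' under the reading where bad inputs are immediate UB.
    raise UndefinedBehavior()


def add_with_poison():
    # 'tmp = a+b' under the reading where bad inputs only yield a bad value.
    return POISON


def run(program):
    """Execute the operations in order and report how execution ended."""
    try:
        for op in program:
            op()
    except Diverges:
        return "diverged"
    except UndefinedBehavior:
        return "UB"
    return "finished"
```

With these definitions, `run([foo, add_with_ub])` reports `"diverged"` but the hoisted order `run([add_with_ub, foo])` reports `"UB"`, so the reordering changed the program's behavior. With `add_with_poison` both orders report `"diverged"`, which matches the claim that returning undef or poison is compatible with speculation while raising UB is not.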