Scott Manley via llvm-dev
2019-Jul-10 22:14 UTC
[llvm-dev] RFC: change -fp-contract=off to actually disable FMAs
> I think you have a different definition of fused then. Fused is a
> description of how the operation is computed/rounded, not an instruction
> count.

"Only fuse FP ops when the result won't be affected" is what the existing comment says. So it can't be both a fused op and not a fused op if it's only meant to imply a difference in rounding. I'm just re-using the existing wording, and I agree it could be cleaned up if that's the intent of the -fp-contract option -- which is why I was asking for context.

> For FMA, I think your example IR is correctly handled. The fast
> instruction flag should override the global FP option you’re providing. For
> the issue you are describing, this is more of a question of whether clang
> should be emitting the fast flag or not.

I disagree. How does clang know what would ultimately form an FMA? It would have to blanket remove 'fast' from all fadds.

On Wed, Jul 10, 2019 at 4:16 PM Matt Arsenault <arsenm2 at gmail.com> wrote:

> On Jul 10, 2019, at 16:56, Scott Manley <rscottmanley at gmail.com> wrote:
>
>> At any rate, I was only offering an additional reason. Personally I think
>> it's strange for an option to say "this will never fuse ops" and then under
>> the covers will fuse ops, regardless of how FMAD is defined. However, my
>> primary concern is for FMAs. They have both numeric and performance
>> implications and I do not think it's unreasonable that off means off.
>
> I think you have a different definition of fused then. Fused is a
> description of how the operation is computed/rounded, not an instruction
> count. The F in FMAD is not fused (I know this naming scheme is not great.
> Every other FP node besides FMA has an F prefix)
>
> For FMA, I think your example IR is correctly handled. The fast
> instruction flag should override the global FP option you’re providing. For
> the issue you are describing, this is more of a question of whether clang
> should be emitting the fast flag or not.
>
> -Matt
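The disagreement above comes down to which setting wins when the global contraction mode and a per-instruction 'fast' flag conflict. As a rough sketch only: the function names and the string-valued modes below are hypothetical, and this is a toy model of the two positions in this exchange, not LLVM's actual combine logic.

```python
# Toy model of the two policies argued in this exchange.
# All names are hypothetical; this is not LLVM code.


def may_fuse_flag_overrides(global_mode: str, add_is_fast: bool) -> bool:
    """Position quoted from Matt: a 'fast' flag on the instruction
    overrides the global FP contraction option."""
    return add_is_fast or global_mode == "fast"


def may_fuse_off_means_off(global_mode: str, add_is_fast: bool) -> bool:
    """Position argued by Scott: 'off' disables FMA formation even for
    instructions that carry the 'fast' flag."""
    if global_mode == "off":
        return False
    return add_is_fast or global_mode == "fast"


if __name__ == "__main__":
    # Print the decision of each policy for every combination.
    for mode in ("off", "fast"):
        for fast in (False, True):
            print(mode, fast,
                  may_fuse_flag_overrides(mode, fast),
                  may_fuse_off_means_off(mode, fast))
```

The two functions differ only for a fast-flagged instruction under the "off" mode, which is the case the RFC is about.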
JF Bastien via llvm-dev
2019-Jul-10 22:48 UTC
[llvm-dev] [cfe-dev] RFC: change -fp-contract=off to actually disable FMAs
By the definitions I’m hearing, it sounds like this option would also disable fusion of offset calculations into loads and stores. Ditto immediate materialization directly into instructions instead of as a separate immediate move.

That doesn’t sound like what “contraction” was designed for. It sounds like you want a different knob, and it sounds like a knob LLVM rarely offers, if ever.

Can you provide concrete cases where the compiler is wrong? Are those cases intractably unsolvable as performance defects?
Stephen Canon via llvm-dev
2019-Jul-12 15:54 UTC
[llvm-dev] [cfe-dev] RFC: change -fp-contract=off to actually disable FMAs
Echoing what everyone else has said, keying on the word “fused” is a red herring here.

fp-contract refers to behavior governed by the STDC FP_CONTRACT pragma. “Contraction” has a formal definition in the C standard:

> A floating expression may be contracted, that is, evaluated as though it
> were a single operation, thereby omitting rounding errors implied by the
> source code and the expression evaluation method. The FP_CONTRACT pragma in
> <math.h> provides a way to disallow contracted expressions.

Note that this definition is *purely* in terms of the rounding of arithmetic operations performed by the abstract machine; there is no notion of instructions generated. Formation of fused multiply-add instructions is one specific form of fusion licensed by this pragma, which happens to be the main one of interest from the standpoint of compiler performance optimization for FMA-based architectures.

There’s some imprecision in the documentation caused by a mismatch between what’s interesting for compiler writers (where rounding changes due to FMA formation are allowed) and the abstract specification. That should be cleaned up. However, fp-contract is not a knob to control whether or not abstract-machine operations generate a single arithmetic instruction; it definitely does not, and should not, enable or disable MAD formation.

– Steve
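To make the rounding point concrete, here is a small sketch in Python. It models a contracted multiply-add by evaluating a*b + c exactly with `fractions.Fraction` and rounding once, and compares that with the ordinary two-rounding evaluation. The helper names and the chosen inputs are illustrative assumptions, not anything taken from LLVM or the C standard, and the sketch assumes IEEE-754 binary64 floats with round-to-nearest-even.

```python
from fractions import Fraction


def mul_add_separate(a: float, b: float, c: float) -> float:
    """Two roundings: the product is rounded, then the sum is rounded.
    A single unfused MAD instruction would, per the thread, round the same way."""
    return a * b + c


def mul_add_contracted(a: float, b: float, c: float) -> float:
    """One rounding: compute a*b + c exactly, then round to a float once.
    This models a fused multiply-add for finite inputs (assumption: it is
    a model of the rounding only, not of any particular instruction)."""
    exact = Fraction(a) * Fraction(b) + Fraction(c)
    return float(exact)  # conversion rounds the exact value to nearest


if __name__ == "__main__":
    # Inputs chosen so that the exact product sits halfway between two doubles.
    a = 1.0 + 2.0 ** -27
    b = 1.0 - 2.0 ** -27
    c = -1.0
    print("separate:  ", mul_add_separate(a, b, c))
    print("contracted:", mul_add_contracted(a, b, c))
```

With these inputs the exact product is 1 - 2**-54, which lies halfway between two adjacent doubles and rounds to 1.0, so the two-rounding result is 0.0 while the single-rounding result is -2**-54. The difference is purely one of rounding, independent of how many instructions are emitted.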
Scott Manley via llvm-dev
2019-Jul-12 16:32 UTC
[llvm-dev] [cfe-dev] RFC: change -fp-contract=off to actually disable FMAs
> However, fp-contract is not a knob to control whether or not
> abstract-machine operations generate a single arithmetic instruction

I think that makes sense, but the end result is the same. Wouldn't you agree that -fp-contract=off still contracts floating point expressions with the initial example I posted? That is the core of what I'm trying to resolve here.

I still have some confusion about what FMAD is supposed to be. Is FMAD actually MAD? Or is it something else? I am fine with leaving it alone if FMAD is not actually contracting floating point operations.