Hal Finkel via llvm-dev
2017-May-26 21:02 UTC
[llvm-dev] Poison/Undef at CodeGen level Was: [poison] is select-of-select to logic+select allowed?
On 05/26/2017 03:02 PM, Matthias Braun wrote:> >> Regarding SDAG, and given that poison is already there, we would need >> to adopt a similar solution to the IR. Maybe right now we can get >> away with it because nsw is not exploited significantly (as you say). >> Just because there’s no explicit poison in SDAG, just having nsw is >> sufficient to cause miscompilations when combined with other >> transformations. >> See, for example, this bug report for InstCombine regarding >> select+nsw:_https://bugs.llvm.org/show_bug.cgi?id=31633_ > > Poison/Undef at the CodeGen level is a very interesting discussion! I > don't think there is any documentation about posion/undef at the > CodeGen level and I haven't discussed this much with others, so I'd > like to hear further feedback: > > - I think we should not introduce a notion of poison (which I would > call "delayed UB") at the SelectionDAG/CodeGen level[1] > - Instructions either produce UB immediately, so while there are > nsw/nuw flags on SelectionDAG arithmetic operatiosn I think we can > only assume that they produce a target specific value on overflow, but > not arbitrary behavior. Instructions that can produce UB should marked > "hasSideEffect" and code motion around it be limited.You mean like every integer division? Having arbitrary side effects seems too strong.> - Typical optimization scenarios like establishing loop trip count > bounds for which poison/UB is helpful should not matter for CodeGen. > - I don't have any evidence that optimizations in CodeGen require a > model of poison to work. If someone can given me a counter example > then I'd be hard pressed to disable the optimization in MI and push it > towards the IR level.I'm not sure it is always possible to push these optimizations to the IR level, at least not without adding more Value* ties to instructions (e.g. what we do with MMOs). I have some experience, for example, taking optimizations taking advantage of hardware-counter-based loops and moving into the IR level (so that we can take advantage of ScalarEvolution), and it works to some extent, but is also fairly hacky (the legality-checking part of lib/Target/PowerPC/PPCCTRLoops.cpp, for example, needs to understand what legalization will later do - it's the best we can do right now, but it's a mess). If we had better loop analysis at the MI level, this would be much better. We already have some interesting MI-level passes that do interesting things with loops (lib/CodeGen/MachinePipeliner.cpp, for example). I think that we should have the same semantics here on both the IR level and the MI level (whatever they are). Having different semantics in this regard is going to be confusing (and lead to subtle bugs because optimizations valid in one part of the pipeline will be invalid in other parts). These issues often come up around speculative execution, for example, and I definitely think we need to be able to deal with speculative execution at the MI level (because speculation opportunities often come from legalization/isel and pseudo-instruction expansion and because the modeling is better at the MI level). -Hal> > - Matthias-- Hal Finkel Lead, Compiler Technology and Programming Languages Leadership Computing Facility Argonne National Laboratory -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20170526/367c8c22/attachment.html>
Matthias Braun via llvm-dev
2017-May-26 21:27 UTC
[llvm-dev] Poison/Undef at CodeGen level Was: [poison] is select-of-select to logic+select allowed?
> On May 26, 2017, at 2:02 PM, Hal Finkel <hfinkel at anl.gov> wrote: > > > On 05/26/2017 03:02 PM, Matthias Braun wrote: >> >>> Regarding SDAG, and given that poison is already there, we would need to adopt a similar solution to the IR. Maybe right now we can get away with it because nsw is not exploited significantly (as you say). Just because there’s no explicit poison in SDAG, just having nsw is sufficient to cause miscompilations when combined with other transformations. >>> See, for example, this bug report for InstCombine regarding select+nsw: https://bugs.llvm.org/show_bug.cgi?id=31633 <https://bugs.llvm.org/show_bug.cgi?id=31633> >> Poison/Undef at the CodeGen level is a very interesting discussion! I don't think there is any documentation about posion/undef at the CodeGen level and I haven't discussed this much with others, so I'd like to hear further feedback: >> >> - I think we should not introduce a notion of poison (which I would call "delayed UB") at the SelectionDAG/CodeGen level[1] >> - Instructions either produce UB immediately, so while there are nsw/nuw flags on SelectionDAG arithmetic operatiosn I think we can only assume that they produce a target specific value on overflow, but not arbitrary behavior. Instructions that can produce UB should marked "hasSideEffect" and code motion around it be limited. > > You mean like every integer division? Having arbitrary side effects seems too strong.If we can model it more precise sure. But in principle if you divide by zero many CPUs will trap immediately.> >> - Typical optimization scenarios like establishing loop trip count bounds for which poison/UB is helpful should not matter for CodeGen. >> - I don't have any evidence that optimizations in CodeGen require a model of poison to work. If someone can given me a counter example then I'd be hard pressed to disable the optimization in MI and push it towards the IR level. > > I'm not sure it is always possible to push these optimizations to the IR level, at least not without adding more Value* ties to instructions (e.g. what we do with MMOs). I have some experience, for example, taking optimizations taking advantage of hardware-counter-based loops and moving into the IR level (so that we can take advantage of ScalarEvolution), and it works to some extent, but is also fairly hacky (the legality-checking part of lib/Target/PowerPC/PPCCTRLoops.cpp, for example, needs to understand what legalization will later do - it's the best we can do right now, but it's a mess). If we had better loop analysis at the MI level, this would be much better. We already have some interesting MI-level passes that do interesting things with loops (lib/CodeGen/MachinePipeliner.cpp, for example). > > I think that we should have the same semantics here on both the IR level and the MI level (whatever they are). Having different semantics in this regard is going to be confusing (and lead to subtle bugs because optimizations valid in one part of the pipeline will be invalid in other parts). These issues often come up around speculative execution, for example, and I definitely think we need to be able to deal with speculative execution at the MI level (because speculation opportunities often come from legalization/isel and pseudo-instruction expansion and because the modeling is better at the MI level).Does that mean a vreg (or even a physreg) can conceptually hold a poison value? Given that failed to capture the semantics at the IR level I have bad feelings about getting this right at the MI level with all the additional machine specific semantics. Reasoning about optimisations definitely gets easier without poison and I'd really like to see some good example first to motivate the maintenance burden. A notion of unknown/unspecified value should be enough to enable hoisting, we shouldn't need poison for that. - Matthias -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20170526/7e1e561b/attachment.html>
Sanjoy Das via llvm-dev
2017-May-26 22:15 UTC
[llvm-dev] Poison/Undef at CodeGen level Was: [poison] is select-of-select to logic+select allowed?
Hi, On Fri, May 26, 2017 at 2:27 PM, Matthias Braun <mbraun at apple.com> wrote:> Does that mean a vreg (or even a physreg) can conceptually hold a poison > value? Given that failed to capture the semantics at the IR level I have bad > feelings about getting this right at the MI level with all the additional > machine specific semantics. Reasoning about optimisations definitely gets > easier without poison and I'd really like to see some good example first to > motivate the maintenance burden. > > A notion of unknown/unspecified value should be enough to enable hoisting, > we shouldn't need poison for that.The definition of poison we used in "Taming Undefined Behavior in LLVM" has a natural expression of this -- we can say that in MI all poison is "frozen on arrival". However, this disallows sinking: %RAX = gen_frozen_poison L: use(%RAX) jmp L will not be the same as: L: %RAX = gen_frozen_poison use(%RAX) jmp L Thanks, -- Sanjoy
Hal Finkel via llvm-dev
2017-May-27 03:15 UTC
[llvm-dev] Poison/Undef at CodeGen level Was: [poison] is select-of-select to logic+select allowed?
On 05/26/2017 04:27 PM, Matthias Braun wrote:> >> On May 26, 2017, at 2:02 PM, Hal Finkel <hfinkel at anl.gov >> <mailto:hfinkel at anl.gov>> wrote: >> >> >> On 05/26/2017 03:02 PM, Matthias Braun wrote: >>> >>>> Regarding SDAG, and given that poison is already there, we would >>>> need to adopt a similar solution to the IR. Maybe right now we can >>>> get away with it because nsw is not exploited significantly (as you >>>> say). Just because there’s no explicit poison in SDAG, just having >>>> nsw is sufficient to cause miscompilations when combined with other >>>> transformations. >>>> See, for example, this bug report for InstCombine regarding >>>> select+nsw:_https://bugs.llvm.org/show_bug.cgi?id=31633_ >>> >>> Poison/Undef at the CodeGen level is a very interesting discussion! >>> I don't think there is any documentation about posion/undef at the >>> CodeGen level and I haven't discussed this much with others, so I'd >>> like to hear further feedback: >>> >>> - I think we should not introduce a notion of poison (which I would >>> call "delayed UB") at the SelectionDAG/CodeGen level[1] >>> - Instructions either produce UB immediately, so while there are >>> nsw/nuw flags on SelectionDAG arithmetic operatiosn I think we can >>> only assume that they produce a target specific value on overflow, >>> but not arbitrary behavior. Instructions that can produce UB should >>> marked "hasSideEffect" and code motion around it be limited. >> >> You mean like every integer division? Having arbitrary side effects >> seems too strong. > If we can model it more precise sure. But in principle if you divide > by zero many CPUs will trap immediately.We don't want to limit our ability to move memory references around integer division (for scheduling, etc.).> >> >>> - Typical optimization scenarios like establishing loop trip count >>> bounds for which poison/UB is helpful should not matter for CodeGen. >>> - I don't have any evidence that optimizations in CodeGen require a >>> model of poison to work. If someone can given me a counter example >>> then I'd be hard pressed to disable the optimization in MI and push >>> it towards the IR level. >> >> I'm not sure it is always possible to push these optimizations to the >> IR level, at least not without adding more Value* ties to >> instructions (e.g. what we do with MMOs). I have some experience, for >> example, taking optimizations taking advantage of >> hardware-counter-based loops and moving into the IR level (so that we >> can take advantage of ScalarEvolution), and it works to some extent, >> but is also fairly hacky (the legality-checking part of >> lib/Target/PowerPC/PPCCTRLoops.cpp, for example, needs to understand >> what legalization will later do - it's the best we can do right now, >> but it's a mess). If we had better loop analysis at the MI level, >> this would be much better. We already have some interesting MI-level >> passes that do interesting things with loops >> (lib/CodeGen/MachinePipeliner.cpp, for example). >> >> I think that we should have the same semantics here on both the IR >> level and the MI level (whatever they are). Having different >> semantics in this regard is going to be confusing (and lead to subtle >> bugs because optimizations valid in one part of the pipeline will be >> invalid in other parts). These issues often come up around >> speculative execution, for example, and I definitely think we need to >> be able to deal with speculative execution at the MI level (because >> speculation opportunities often come from legalization/isel and >> pseudo-instruction expansion and because the modeling is better at >> the MI level). > > Does that mean a vreg (or even a physreg) can conceptually hold a > poison value? Given that failed to capture the semantics at the IR > level I have bad feelings about getting this right at the MI level > with all the additional machine specific semantics. Reasoning about > optimisations definitely gets easier without poison and I'd really > like to see some good example first to motivate the maintenance burden. > > A notion of unknown/unspecified value should be enough to enable > hoisting, we shouldn't need poison for that.I think that we should get this right once, and when we do, we should use these self-consistent semantics everywhere. I don't see why machine-specific semantics make this harder or easier in general (there are certainly instructions that won't have UB even when their corresponding IR-level counterparts do, which might make things easier, and instructions can have more dependencies to track, which might make things harder). -Hal> > - Matthias-- Hal Finkel Lead, Compiler Technology and Programming Languages Leadership Computing Facility Argonne National Laboratory -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20170526/a8c5a1d3/attachment.html>