Nuno Lopes via llvm-dev
2017-May-23 16:16 UTC
[llvm-dev] [poison] is select-of-select to logic+select allowed?
Regarding the patches, there are two concerns AFAICT: 1. It’s a new instruction and as usual when introducing a new instruction it will require work for some time until most optimizations know about it, and to get rid of any potential perf regression. No big deal; we just need to do the work (and we have already done some of it). 2. The patch was written by a student, which may not have time to support it properly going forward. That means that someone needs to step in and take ownership. Of course we will be here to help, but none of us at the moment can commit to own this. The blog post that John mentioned below describes our machinery to find bugs in optimizations and in new semantics, and we will put it to use to test LLVM periodically. On the plus side, freeze can helps us get rid of a bunch of miscompilations, which no one enjoys. The usage of freeze can be incremental; we can try out with a few things to see how it goes. We have more radical ideas down the road, like replace undef with poison, but one thing at a time :) Regarding SDAG, and given that poison is already there, we would need to adopt a similar solution to the IR. Maybe right now we can get away with it because nsw is not exploited significantly (as you say). Just because there’s no explicit poison in SDAG, just having nsw is sufficient to cause miscompilations when combined with other transformations. See, for example, this bug report for InstCombine regarding select+nsw: https://bugs.llvm.org/show_bug.cgi?id=31633 Nuno From: Sanjay Patel [mailto:spatel at rotateright.com] Sent: 23 de maio de 2017 14:25 To: Nuno Lopes <nuno.lopes at ist.utl.pt> Cc: John Regehr <regehr at cs.utah.edu>; llvm-dev <llvm-dev at lists.llvm.org>; Matthias Braun <mbraun at apple.com>; Sanjoy Das <sanjoy at playingwithpointers.com>; David Majnemer <david.majnemer at gmail.com>; Hal Finkel <hfinkel at anl.gov>; Friedman, Eli <efriedma at codeaurora.org> Subject: Re: [poison] is select-of-select to logic+select allowed? The examples and table are great! That at least makes it clear what we need to think about and fix (and just how complex the problem is). Are there known limitations/objections to the freeze instruction patches that John listed? Or are we just being cautious and/or waiting for more people to review those? I can answer some of the DAG questions: (a) Yes, we have nsw/nuw/exact down there. They've been around since: https://reviews.llvm.org/rL210467 . (b) Yes, we use those bits to enable transforms, but it's not nearly as common as in IR. It's also likely that we'll drop those bits during transforms. (c) Yes, we do all kinds of select transforms without regard for poison. (d) AFAIK, there is no accounting for poison in the DAG for any transforms, but it's possible that I just didn't find it in the source. On Tue, May 23, 2017 at 4:16 AM, Nuno Lopes <nuno.lopes at ist.utl.pt <mailto:nuno.lopes at ist.utl.pt> > wrote: Hi, Let me try to give a bit more context on why select is so tricky. First thing to consider is which transformations we would like to support: 1) Control-flow -> select (SimplifyCFG) if (c) a = x else a = y => %a = select %c, %x, %y 2) select -> control-flow; reverse of 1) Not sure if this is done at IR level, or only later at SDAG. 3) select -> arithmetic %a = select %c, true, %y => %a = or %c, %y 4) select removal %c = icmp eq %x, C %r = select %c, C, %x => %r = %x 5) select hoisting past binops %a = udiv %x, %y %b = udiv %x, %z %r = select %c, %a, %b => %t = select %c, %y, %z %r = udiv %x, %t 6) Bonus: easy to move select instructions around. Should be possible to hoist selects out of loops easily. Ideally we want semantics for select that allow all transformations 1-6. It's hard because: 1) %a can only be poison conditionally on %c, i.e., %a is poison if %c=true and %x is poison, or %c=false and %y is poison 2) since we introduce a branch on %c, select on poison has to be UB like branch on poison 3) with arithmetic all operands are always evaluated, so conditional poison like in 1) doesn’t work 4) the example provided replaces C with %x in case %x=C. C is never poison, but %x might be. 5) We cannot introduce a division by poison if %y and %z are not poison 6) Making select trigger UB for some cases makes movement harder because then we need to prove that it won’t trigger UB before e.g. hoisting it out of loops. Summary table of what each transformation allows for %z = select %c, %x, %y. Each column is a different alternative of semantics for select: UB if %c poison + conditional poison UB if %c poison + poison if either %x/%y poison Conditional poison + non-det choice if %c poison Conditional poison + poison if %c poison** Poison if any of %c/%x/%y are poison SimplifyCFG ✓ ✓ ✓ Select->control-flow ✓ ✓ Select->arithmetic ✓ ✓ Select removal ✓ ✓ ✓ ✓ Select hoist ✓ ✓ ✓ Easy movement ✓ ✓ ✓ Modulo bugs in the table, you can see there’s no single column with all rows with a ✓. That means there’s no way (that I’m aware of) to make all transformations that we are interested in doing to be correct. A solution is to introduce something like the freeze instruction that can land a ✓ on any cell you want. So unless someone has a clever idea and proposes a new column that has ticks in all rows, we are left with picking a trade-off: either we disable some optimizations, or we introduce something like freeze to continue doing them. BTW, this table assumes that branch on poison is UB, otherwise optimizations like GVN are wrong (for more details see our paper: http://www.cs.utah.edu/~regehr/papers/undef-pldi17.pdf). The column marked with ** is the one that Alive currently implements. Regarding SDAG: it has undef, and I believe there was some discussion regarding introducing poison there as well. I don’t recall if it was introduce already, but I believe there’s already nsw/nuw there. If that’s the case, the (il)legality of transformations should be exactly the same as in LLVM IR. Otherwise some transformations may be easier. Sorry for the longish email; just wanted to give some background about the problem so that we can reach some consensus at some point. Nuno -----Original Message----- From: John Regehr Sent: 22 de maio de 2017 21:45 Subject: Re: [poison] is select-of-select to logic+select allowed? Nuno and I have been looking through the results of experiments like the one reported here: <https://blog.regehr.org/archives/1510> https://blog.regehr.org/archives/1510 Indeed there are some select-related transformations in LLVM that are illegal in terms of Alive's semantics. As far as we know, they cannot be made legal (by adjusting the semantics of instructions) without breaking a bunch of other optimizations. The optimizations currently in LLVM are not only inconsistent but they can lead to end-to-end miscompilations. Sorting this out is going to require (1) figuring out what select means WRT poison/undef and (2) backing out some optimizations. This is going to be slightly painful but it is the only way forward. John On 5/22/17 2:32 PM, Sanjay Patel wrote:> Some InstCombine transforms for select-of-select were added here:> <https://reviews.llvm.org/rL228409> https://reviews.llvm.org/rL228409>> But Alive says this is more poisonous:>> Name: selsel> %s1 = select i1 %cond1, i8 C1, i8 C2> %s2 = select i1 %cond2, i8 %s1, i8 C2> =>> %andcond = and i1 %cond1, %cond2> %s2 = select i1 %andcond, i8 C1, i8 C2>> <http://rise4fun.com/Alive/JT6> http://rise4fun.com/Alive/JT6>> Are those transforms legal?-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20170523/00f3185c/attachment.html>
John Regehr via llvm-dev
2017-May-23 16:27 UTC
[llvm-dev] [poison] is select-of-select to logic+select allowed?
Would it make sense to first commit to a semantics for select, document this, and attempt to comment out all now-invalid optimizations? We don't want to remove them entirely since, as Nuno points out, we can bring some of them back later using appropriate freezes. What I don't know is who would be affected by the performance regressions in the meantime, but I suspect that they wouldn't be too big of a deal for most of us. John On 05/23/2017 10:16 AM, Nuno Lopes wrote:> Regarding the patches, there are two concerns AFAICT: > > 1. It’s a new instruction and as usual when introducing a new > instruction it will require work for some time until most > optimizations know about it, and to get rid of any potential perf > regression. No big deal; we just need to do the work (and we have > already done some of it). > 2. The patch was written by a student, which may not have time to > support it properly going forward. That means that someone needs to > step in and take ownership. Of course we will be here to help, but > none of us at the moment can commit to own this. The blog post that > John mentioned below describes our machinery to find bugs in > optimizations and in new semantics, and we will put it to use to > test LLVM periodically. > > On the plus side, freeze can helps us get rid of a bunch of > miscompilations, which no one enjoys. > > The usage of freeze can be incremental; we can try out with a few things > to see how it goes. We have more radical ideas down the road, like > replace undef with poison, but one thing at a time :) > > Regarding SDAG, and given that poison is already there, we would need to > adopt a similar solution to the IR. Maybe right now we can get away > with it because nsw is not exploited significantly (as you say). Just > because there’s no explicit poison in SDAG, just having nsw is > sufficient to cause miscompilations when combined with other > transformations. > > See, for example, this bug report for InstCombine regarding select+nsw: > https://bugs.llvm.org/show_bug.cgi?id=31633 > > Nuno > > *From:*Sanjay Patel [mailto:spatel at rotateright.com] > *Sent:* 23 de maio de 2017 14:25 > *To:* Nuno Lopes <nuno.lopes at ist.utl.pt> > *Cc:* John Regehr <regehr at cs.utah.edu>; llvm-dev > <llvm-dev at lists.llvm.org>; Matthias Braun <mbraun at apple.com>; Sanjoy Das > <sanjoy at playingwithpointers.com>; David Majnemer > <david.majnemer at gmail.com>; Hal Finkel <hfinkel at anl.gov>; Friedman, Eli > <efriedma at codeaurora.org> > *Subject:* Re: [poison] is select-of-select to logic+select allowed? > > The examples and table are great! That at least makes it clear what we > need to think about and fix (and just how complex the problem is). > > Are there known limitations/objections to the freeze instruction patches > that John listed? Or are we just being cautious and/or waiting for more > people to review those? > > I can answer some of the DAG questions: > (a) Yes, we have nsw/nuw/exact down there. They've been around since: > https://reviews.llvm.org/rL210467 . > > (b) Yes, we use those bits to enable transforms, but it's not nearly as > common as in IR. It's also likely that we'll drop those bits during > transforms. > > (c) Yes, we do all kinds of select transforms without regard for poison. > > (d) AFAIK, there is no accounting for poison in the DAG for any > transforms, but it's possible that I just didn't find it in the source. > > On Tue, May 23, 2017 at 4:16 AM, Nuno Lopes <nuno.lopes at ist.utl.pt > <mailto:nuno.lopes at ist.utl.pt>> wrote: > > Hi, > > Let me try to give a bit more context on why select is so tricky. > > First thing to consider is which transformations we would like to > support: > > _1) Control-flow -> select (SimplifyCFG)_ > > if (c) > > a = x > > else > > a = y > > => > > %a = select %c, %x, %y > > _2) select -> control-flow_; reverse of 1) > > Not sure if this is done at IR level, or only later at SDAG. > > _3) select -> arithmetic_ > > %a = select %c, true, %y > > => > > %a = or %c, %y > > _4) select removal_ > > %c = icmp eq %x, C > > %r = select %c, C, %x > > => > > %r = %x > > _5) select hoisting past binops_ > > %a = udiv %x, %y > > %b = udiv %x, %z > > %r = select %c, %a, %b > > => > > %t = select %c, %y, %z > > %r = udiv %x, %t > > _6) Bonus: easy to move_ select instructions around. Should be > possible to hoist selects out of loops easily. > > Ideally we want semantics for select that allow all transformations > 1-6. It's hard because: > > 1) %a can only be poison conditionally on %c, i.e., %a is poison if > %c=true and %x is poison, or %c=false and %y is poison > > 2) since we introduce a branch on %c, select on poison has to be UB > like branch on poison > > 3) with arithmetic all operands are always evaluated, so conditional > poison like in 1) doesn’t work > > 4) the example provided replaces C with %x in case %x=C. C is never > poison, but %x might be. > > 5) We cannot introduce a division by poison if %y and %z are not poison > > 6) Making select trigger UB for some cases makes movement harder > because then we need to prove that it won’t trigger UB before e.g. > hoisting it out of loops. > > Summary table of what each transformation allows for %z = select %c, > %x, %y. Each column is a different alternative of semantics for select: > > > > UB if %c poison > > + conditional poison > > > > UB if %c poison + poison if either > %x/%y poison > > > > Conditional poison > > + non-det choice if %c poison > > > > Conditional poison + poison if %c poison** > > > > Poison if any of > %c/%x/%y are poison > > SimplifyCFG > > > > ✓ > > > > > > ✓ > > > > ✓ > > > > Select->control-flow > > > > ✓ > > > > ✓ > > > > > > > > Select->arithmetic > > > > > > ✓ > > > > > > > > ✓ > > Select removal > > > > ✓ > > > > ✓ > > > > > > ✓ > > > > ✓ > > Select hoist > > > > ✓ > > > > ✓ > > > > ✓ > > > > > > Easy movement > > > > > > > > ✓ > > > > ✓ > > > > ✓ > > Modulo bugs in the table, you can see there’s no single column with > all rows with a ✓. That means there’s no way (that I’m aware of) to > make all transformations that we are interested in doing to be > correct. A solution is to introduce something like the freeze > instruction that can land a ✓ on any cell you want. > > So unless someone has a clever idea and proposes a new column that > has ticks in all rows, we are left with picking a trade-off: either > we disable some optimizations, or we introduce something like freeze > to continue doing them. > > BTW, this table assumes that branch on poison is UB, otherwise > optimizations like GVN are wrong (for more details see our paper: > http://www.cs.utah.edu/~regehr/papers/undef-pldi17.pdf). The column > marked with ** is the one that Alive currently implements. > > Regarding SDAG: it has undef, and I believe there was some > discussion regarding introducing poison there as well. I don’t > recall if it was introduce already, but I believe there’s already > nsw/nuw there. If that’s the case, the (il)legality of > transformations should be exactly the same as in LLVM IR. Otherwise > some transformations may be easier. > > Sorry for the longish email; just wanted to give some background > about the problem so that we can reach some consensus at some point. > > Nuno > > -----Original Message----- > From: John Regehr > Sent: 22 de maio de 2017 21:45 > Subject: Re: [poison] is select-of-select to logic+select allowed? > > Nuno and I have been looking through the results of experiments like > the one reported here: > > https://blog.regehr.org/archives/1510 > > Indeed there are some select-related transformations in LLVM that > are illegal in terms of Alive's semantics. As far as we know, they > cannot be made legal (by adjusting the semantics of instructions) > without breaking a bunch of other optimizations. > > The optimizations currently in LLVM are not only inconsistent but > they can lead to end-to-end miscompilations. Sorting this out is > going to require (1) figuring out what select means WRT poison/undef > and (2) backing out some optimizations. This is going to be slightly > painful but it is the only way forward. > > John > > On 5/22/17 2:32 PM, Sanjay Patel wrote: > > > Some InstCombine transforms for select-of-select were added here: > > > https://reviews.llvm.org/rL228409 > > > > > > But Alive says this is more poisonous: > > > > > > Name: selsel > > > %s1 = select i1 %cond1, i8 C1, i8 C2 > > > %s2 = select i1 %cond2, i8 %s1, i8 C2 > > > => > > > %andcond = and i1 %cond1, %cond2 > > > %s2 = select i1 %andcond, i8 C1, i8 C2 > > > > > > http://rise4fun.com/Alive/JT6 > > > > > > Are those transforms legal? >
Davide Italiano via llvm-dev
2017-May-23 16:37 UTC
[llvm-dev] [poison] is select-of-select to logic+select allowed?
On Tue, May 23, 2017 at 9:27 AM, John Regehr via llvm-dev <llvm-dev at lists.llvm.org> wrote:> Would it make sense to first commit to a semantics for select, document > this, and attempt to comment out all now-invalid optimizations? We don't > want to remove them entirely since, as Nuno points out, we can bring some of > them back later using appropriate freezes. > > What I don't know is who would be affected by the performance regressions in > the meantime, but I suspect that they wouldn't be too big of a deal for most > of us. > > John >As somebody who has been bitten by `undef` funny behaviour several times (most recently in NewGVN, see, e.g. https://bugs.llvm.org/show_bug.cgi?id=33129), I'm in favour of this. My anecdotal evidence/past experience is that it causes too much pain for relatively little benefit. In fact, we're (likely) going to disable aggressive `undef` optimizations, at least for PHIs, in NewGVN, at least until `undef` is fixed. For `select` in particular, it doesn't seem to be a big problem, but I think some benchmarking is needed to make sure things don't regress badly. -- Davide
Sanjoy Das via llvm-dev
2017-May-23 17:37 UTC
[llvm-dev] [poison] is select-of-select to logic+select allowed?
Hi, On Tue, May 23, 2017 at 9:27 AM, John Regehr <regehr at cs.utah.edu> wrote:> Would it make sense to first commit to a semantics for select, document > this, and attempt to comment out all now-invalid optimizations? We don't > want to remove them entirely since, as Nuno points out, we can bring some of > them back later using appropriate freezes.This sounds like a good plan. I'd only suggest that instead of commenting out the optimizations altogether we put it under an off by default flag, if practical. This would make it easier to run A/B tests. -- Sanjoy
Matthias Braun via llvm-dev
2017-May-26 20:02 UTC
[llvm-dev] Poison/Undef at CodeGen level Was: [poison] is select-of-select to logic+select allowed?
> Regarding SDAG, and given that poison is already there, we would need to adopt a similar solution to the IR. Maybe right now we can get away with it because nsw is not exploited significantly (as you say). Just because there’s no explicit poison in SDAG, just having nsw is sufficient to cause miscompilations when combined with other transformations. > See, for example, this bug report for InstCombine regarding select+nsw: https://bugs.llvm.org/show_bug.cgi?id=31633 <https://bugs.llvm.org/show_bug.cgi?id=31633>Poison/Undef at the CodeGen level is a very interesting discussion! I don't think there is any documentation about posion/undef at the CodeGen level and I haven't discussed this much with others, so I'd like to hear further feedback: - I think we should not introduce a notion of poison (which I would call "delayed UB") at the SelectionDAG/CodeGen level[1] - Instructions either produce UB immediately, so while there are nsw/nuw flags on SelectionDAG arithmetic operatiosn I think we can only assume that they produce a target specific value on overflow, but not arbitrary behavior. Instructions that can produce UB should marked "hasSideEffect" and code motion around it be limited. - Typical optimization scenarios like establishing loop trip count bounds for which poison/UB is helpful should not matter for CodeGen. - I don't have any evidence that optimizations in CodeGen require a model of poison to work. If someone can given me a counter example then I'd be hard pressed to disable the optimization in MI and push it towards the IR level. - Matthias -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20170526/13c90dff/attachment.html>
Hal Finkel via llvm-dev
2017-May-26 21:02 UTC
[llvm-dev] Poison/Undef at CodeGen level Was: [poison] is select-of-select to logic+select allowed?
On 05/26/2017 03:02 PM, Matthias Braun wrote:> >> Regarding SDAG, and given that poison is already there, we would need >> to adopt a similar solution to the IR. Maybe right now we can get >> away with it because nsw is not exploited significantly (as you say). >> Just because there’s no explicit poison in SDAG, just having nsw is >> sufficient to cause miscompilations when combined with other >> transformations. >> See, for example, this bug report for InstCombine regarding >> select+nsw:_https://bugs.llvm.org/show_bug.cgi?id=31633_ > > Poison/Undef at the CodeGen level is a very interesting discussion! I > don't think there is any documentation about posion/undef at the > CodeGen level and I haven't discussed this much with others, so I'd > like to hear further feedback: > > - I think we should not introduce a notion of poison (which I would > call "delayed UB") at the SelectionDAG/CodeGen level[1] > - Instructions either produce UB immediately, so while there are > nsw/nuw flags on SelectionDAG arithmetic operatiosn I think we can > only assume that they produce a target specific value on overflow, but > not arbitrary behavior. Instructions that can produce UB should marked > "hasSideEffect" and code motion around it be limited.You mean like every integer division? Having arbitrary side effects seems too strong.> - Typical optimization scenarios like establishing loop trip count > bounds for which poison/UB is helpful should not matter for CodeGen. > - I don't have any evidence that optimizations in CodeGen require a > model of poison to work. If someone can given me a counter example > then I'd be hard pressed to disable the optimization in MI and push it > towards the IR level.I'm not sure it is always possible to push these optimizations to the IR level, at least not without adding more Value* ties to instructions (e.g. what we do with MMOs). I have some experience, for example, taking optimizations taking advantage of hardware-counter-based loops and moving into the IR level (so that we can take advantage of ScalarEvolution), and it works to some extent, but is also fairly hacky (the legality-checking part of lib/Target/PowerPC/PPCCTRLoops.cpp, for example, needs to understand what legalization will later do - it's the best we can do right now, but it's a mess). If we had better loop analysis at the MI level, this would be much better. We already have some interesting MI-level passes that do interesting things with loops (lib/CodeGen/MachinePipeliner.cpp, for example). I think that we should have the same semantics here on both the IR level and the MI level (whatever they are). Having different semantics in this regard is going to be confusing (and lead to subtle bugs because optimizations valid in one part of the pipeline will be invalid in other parts). These issues often come up around speculative execution, for example, and I definitely think we need to be able to deal with speculative execution at the MI level (because speculation opportunities often come from legalization/isel and pseudo-instruction expansion and because the modeling is better at the MI level). -Hal> > - Matthias-- Hal Finkel Lead, Compiler Technology and Programming Languages Leadership Computing Facility Argonne National Laboratory -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20170526/367c8c22/attachment.html>
Nuno Lopes via llvm-dev
2017-May-29 19:31 UTC
[llvm-dev] Poison/Undef at CodeGen level Was: [poison] is select-of-select to logic+select allowed?
>> Regarding SDAG, and given that poison is already there, we would need to >> adopt a similar solution to the IR. Maybe right now we can get away with >> it because nsw is not exploited significantly (as you say). Just because >> there’s no explicit poison in SDAG, just having nsw is sufficient to >> cause miscompilations when combined with other transformations.See, for example, this bug report for InstCombine regarding select+nsw: https://bugs.llvm.org/show_bug.cgi?id=31633> > Poison/Undef at the CodeGen level is a very interesting discussion! I > don't think there is any documentation about posion/undef at the CodeGen > level and I haven't discussed this much with others, so I'd like to hear > further feedback: > > - I think we should not introduce a notion of poison (which I would call > "delayed UB") at the SelectionDAG/CodeGen level[1] > - Instructions either produce UB immediately, so while there are nsw/nuw > flags on SelectionDAG arithmetic operatiosn I think we can only assume > that they produce a target specific value on overflow, but not arbitrary > behavior. Instructions that can produce UB should marked "hasSideEffect" > and code motion around it be limited. > - Typical optimization scenarios like establishing loop trip count bounds > for which poison/UB is helpful should not matter for CodeGen. > - I don't have any evidence that optimizations in CodeGen require a model > of poison to work. If someone can given me a counter example then I'd be > hard pressed to disable the optimization in MI and push it towards the IR > level.I'm not convinced that you need poison at SDAG/MI level either. However, you already have it :) For NSW/NUW to work, undef is not enough: you need poison. So right now there's already poison in SDAG; it's just that it is probably not exploited much more beyond folding "a+1 > a" into true or something like that. But poison is there for sure. Then there's undef. It seems very useful in CodeGen; I just don't know if it could be replaced with poison altogether or not. I haven't studied CodeGen as much as the IR, so I can't make any statement on what's the best way to go. To do so, we need to have a list of transformations/optimizations that are currently done at SDAG+MI levels to figure out what's the best semantics. I'm happy to help with defining the semantics. Nuno