Sanjoy Das via llvm-dev
2017-Jun-08 17:33 UTC
[llvm-dev] RFC: Killing undef and spreading poison
Hi Peter, On Thu, Jun 8, 2017 at 9:41 AM, Peter Lawrence <peterl95124 at sbcglobal.net> wrote:> >> On Jun 7, 2017, at 2:23 PM, Nuno Lopes <nunoplopes at sapo.pt> wrote: >> >> Since most add/sub operations compiled from C have the nsw attribute, we cannot simply restrict movement of these instructions. > > Nuno, > I’m not saying the operations can’t be moved, > I’m saying that once you move one the ‘nsw’ attribute no longer applies, > unless you can mathematically prove that it still does, > otherwise an “add nsw” has to be converted to plain “add”. > > It is only by illegally retaining the ‘nsw’ after hoisting the multiply out of the ‘if’ statement > that subsequent transformations yield end-to-end-miscompilation in Sanjoy’s example.That would be correct (and we do this for some constructs: for instance when we have !dereferenceable attached to a load instruction we will strip the !dereferenceable when hoisting it out of control flow). However, with poison we want to have our cake and eat it too (perhaps eating is not the best analogy with poison :) ) -- we want to (whenever correct) exploit the fact that a certain operation does not overflow when possible even when hoisting it above control flow. For instance, if we have: if (cond) { t = a +nsw b; print(t); } Now if once we hoist t out of the control block: t = a +nsw b; if (cond) { print(t); } in the transformed program, t itself may sign overflow. In LLVM IR (or at least in the semantics we'd like), this has no correctness implications -- t becomes "poison" (which is basically deferred undefined behavior), and the program is undefined only if the generated poison value is used in a "side effecting" manner. Assuming that print is a "side effect", this means at print, we can assume t isn't poison (and thus a + b did not sign overflow). This is a weaker model than C/C++; and the difficult bits are getting the poison propagation rules correct, and to have a sound definition of a "side effect" (i.e. the points at which poison == deferred UB actually becomes UB).> I think the LLVM community in general has misunderstood and misused ‘nsw’, don’t you agree now ?FYI, I think it is poor form to insinuate such things when you clearly haven't made an effort to dig back and understand the all of prior discussions we've had in this area (hint: we've discussed and explicitly decided to not implement the semantics you're suggesting). Of course, fresh ideas are always welcome but I suggest you start by first reading http://www.cs.utah.edu/~regehr/papers/undef-pldi17.pdf and some of the mailing list discussions we've had in the past on this topic. Thanks! -- Sanjoy
Peter Lawrence via llvm-dev
2017-Jun-08 17:52 UTC
[llvm-dev] RFC: Killing undef and spreading poison
> On Jun 8, 2017, at 10:33 AM, Sanjoy Das <sanjoy at playingwithpointers.com> wrote: > > Hi Peter, > > On Thu, Jun 8, 2017 at 9:41 AM, Peter Lawrence > <peterl95124 at sbcglobal.net> wrote: >> >>> On Jun 7, 2017, at 2:23 PM, Nuno Lopes <nunoplopes at sapo.pt> wrote: >>> >>> Since most add/sub operations compiled from C have the nsw attribute, we cannot simply restrict movement of these instructions. >> >> Nuno, >> I’m not saying the operations can’t be moved, >> I’m saying that once you move one the ‘nsw’ attribute no longer applies, >> unless you can mathematically prove that it still does, >> otherwise an “add nsw” has to be converted to plain “add”. >> >> It is only by illegally retaining the ‘nsw’ after hoisting the multiply out of the ‘if’ statement >> that subsequent transformations yield end-to-end-miscompilation in Sanjoy’s example. > > That would be correct (and we do this for some constructs: for > instance when we have !dereferenceable attached to a load instruction > we will strip the !dereferenceable when hoisting it out of control > flow).In other words you are agreeing with me. And once we’ve agreed on that, why do you insist on illegally hoisting the ‘nsw’ Out of the if-statement, since adding “poison” is clearly not necessary (once The ‘nsw’ is stripped off the multiply the ‘sext’ can no longer be commuted). In other words “poison” isn’t cake, it is a bandaid over an illegal transformation, It has no benefit. Don’t make the illegal transformation and ‘poison’ isn’t necessary. Peter Lawrence.> However, with poison we want to have our cake and eat it too > (perhaps eating is not the best analogy with poison :) ) -- we want to > (whenever correct) exploit the fact that a certain operation does not > overflow when possible even when hoisting it above control flow. For > instance, if we have: > > if (cond) { > t = a +nsw b; > print(t); > } > > Now if once we hoist t out of the control block: > > t = a +nsw b; > if (cond) { > print(t); > } > > in the transformed program, t itself may sign overflow. In LLVM IR > (or at least in the semantics we'd like), this has no correctness > implications -- t becomes "poison" (which is basically deferred > undefined behavior), and the program is undefined only if the > generated poison value is used in a "side effecting" manner. Assuming > that print is a "side effect", this means at print, we can assume t > isn't poison (and thus a + b did not sign overflow). This is a weaker > model than C/C++; and the difficult bits are getting the poison > propagation rules correct, and to have a sound definition of a "side > effect" (i.e. the points at which poison == deferred UB actually > becomes UB). > >> I think the LLVM community in general has misunderstood and misused ‘nsw’, don’t you agree now ? > > FYI, I think it is poor form to insinuate such things when you clearly > haven't made an effort to dig back and understand the all of prior > discussions we've had in this area (hint: we've discussed and > explicitly decided to not implement the semantics you're suggesting). > Of course, fresh ideas are always welcome but I suggest you start by > first reading http://www.cs.utah.edu/~regehr/papers/undef-pldi17.pdf > and some of the mailing list discussions we've had in the past on this > topic. > > Thanks! > -- Sanjoy
Peter Lawrence via llvm-dev
2017-Jun-08 18:12 UTC
[llvm-dev] RFC: Killing undef and spreading poison
> On Jun 8, 2017, at 10:33 AM, Sanjoy Das <sanjoy at playingwithpointers.com> wrote: > > Hi Peter, > > On Thu, Jun 8, 2017 at 9:41 AM, Peter Lawrence > <peterl95124 at sbcglobal.net> wrote: >> >>> On Jun 7, 2017, at 2:23 PM, Nuno Lopes <nunoplopes at sapo.pt> wrote: >>> >>> Since most add/sub operations compiled from C have the nsw attribute, we cannot simply restrict movement of these instructions. >> >> Nuno, >> I’m not saying the operations can’t be moved, >> I’m saying that once you move one the ‘nsw’ attribute no longer applies, >> unless you can mathematically prove that it still does, >> otherwise an “add nsw” has to be converted to plain “add”. >> >> It is only by illegally retaining the ‘nsw’ after hoisting the multiply out of the ‘if’ statement >> that subsequent transformations yield end-to-end-miscompilation in Sanjoy’s example. > > >> I think the LLVM community in general has misunderstood and misused ‘nsw’, don’t you agree now ? > > FYI, I think it is poor form to insinuate such things when you clearly > haven't made an effort to dig back and understand the all of prior > discussions we've had in this area (hint: we've discussed and > explicitly decided to not implement the semantics you're suggesting). > Of course, fresh ideas are always welcome but I suggest you start by > first reading http://www.cs.utah.edu/~regehr/papers/undef-pldi17.pdf > and some of the mailing list discussions we've had in the past on this > topic.Sanjoy, I have read the paper, and I dug the archives all the way back to Nov to find Your example with the nsw-multiply-and-commuting-sext problem. The reason I keep asking for additional examples is that this one where You have illegally transformed the ‘nsw’ isn’t convincing. So please, Provide some additional examples, or some pointers going farther back Than last Nov in the archives, especially ones that show why you decided To allow the illegal transformation of ‘nsw’. Peter Lawrence.> > Thanks! > -- Sanjoy
Peter Lawrence via llvm-dev
2017-Jun-08 19:29 UTC
[llvm-dev] RFC: Killing undef and spreading poison
Sanjoy, in your blog post https://www.playingwithpointers.com/problem-with-undef.html <https://www.playingwithpointers.com/problem-with-undef.html> you describe a problem with LLVM “undef”, yet in your paper http://www.cs.utah.edu/~regehr/papers/undef-pldi17.pdf <http://www.cs.utah.edu/~regehr/papers/undef-pldi17.pdf> you do not suggest fixing this problem, even though in chpt. 9 you identify other compilers that do attempt to avoid it. It seems to me that since transforming a program with multiple uses of an “undef” variable into one with multiple distinct “undef”s results in behavior that is inconsistent (and that violates users expectations) we should not allow it. This seems to me to be one of those situations where the phrase “extraordinary claims require extraordinary evidence” applies, and I’m not seeing any, are you ? IIUC, a “freeze” operation can always be inserted over an “undef” (now “poison”) that forces a single consistent value, but why the extra overhead of “freeze”, as there doesn’t seem to be a convincing argument for ever allowing the same variable to have inconsistent values, or am I missing something ? Peter Lawrence. -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20170608/14b5cb27/attachment.html>
Peter Lawrence via llvm-dev
2017-Jun-08 22:27 UTC
[llvm-dev] RFC: Killing undef and spreading poison
> On Jun 8, 2017, at 10:33 AM, Sanjoy Das <sanjoy at playingwithpointers.com> wrote: > > >> (hint: we've discussed and >> explicitly decided to not implement the semantics you're suggesting).Sanjoy, I’ve search the entire archives of 2017 and 2016 and have not found anything remotely resembling a discussion about hoisting ‘nsw’ / ‘nuw’, can you be more specific ? Peter Lawrence. -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20170608/746f2595/attachment.html>
Peter Lawrence via llvm-dev
2017-Jun-08 23:04 UTC
[llvm-dev] RFC: Killing undef and spreading poison
Sanjoy, From your paper http://www.cs.utah.edu/~regehr/papers/undef-pldi17.pdf <http://www.cs.utah.edu/~regehr/papers/undef-pldi17.pdf> we have this proposal 1. remove “undef” and use “poison” instead 2. Introduce a new instruction: “freeze” … 3. All operations over “poison” unconditionally return “poison” except “phi", “select", and "freeze" 4. Branching on “poison” is immediate UB. 3::: then (X & 0) does not translate to (0) when X is “poison” ?, that seems strange, can you explain / elaborate ? 4::: hmm, I know of no machine where this is what actually happens[*], can you explain / elaborate ? Peter Lawrence. [*. I am assuming you are using consistent terminology, where in the beginning of the paper you make a distinction between immediate and deferred undefined, with immediate being a trap like divide-by-zero.] -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20170608/4d387512/attachment-0001.html>
John Regehr via llvm-dev
2017-Jun-08 23:33 UTC
[llvm-dev] RFC: Killing undef and spreading poison
> 3::: then (X & 0) does not translate to (0) when X is “poison” > ?, that seems strange, can you explain / elaborate ?This is how poison already works, please see the LangRef. John
John Regehr via llvm-dev
2017-Jun-09 03:30 UTC
[llvm-dev] RFC: Killing undef and spreading poison
> 4. Branching on “poison” is immediate UB.> 4::: hmm, I know of no machine where this is what actually happens[*], > can you explain / elaborate ?Well, branching on poison has to mean something. We propose making it immediate UB. Another choice would be nondeterministic branching. Either way, some optimizations become legal and others become illegal. It's just a design tradeoff. The semantics of actual machines aren't as important here as you might be tempted to think. Peter, I think you are kind of wearing us out here. Could you please tone it down a bit, read everything carefully, play with some examples (Alive is particularly helpful, and it implements both the old and proposed new semantics), and then get back to us with questions that don't ask us to re-explain everything that we've already spent months or years discussing? I'm not exactly sure which discussions Sanjoy was referring to (there have been many) but if you look for "the nsw story" in the archives (ca. 2011/2012) you'll turn up a lot of material. Thanks for your consideration, John