On 06/28/2017 02:18 PM, Chandler Carruth wrote:> On Wed, Jun 28, 2017 at 12:09 PM Peter Lawrence > <peterl95124 at sbcglobal.net <mailto:peterl95124 at sbcglobal.net>> wrote: > > Chandler, > Please give some citations, I’ve search the > llvm-dev archives and didn't find any. > > > They are all in the discussions from Nuno, Sanjoy, John and others > around poison semantics. I won't be able to find the citations any > easier than you, I simply wanted to point out that rather than > conclude this is novel, you should probably ask folks who have > participated in these discussions for reference points that they may have.Peter, I'll also point out that, in general, information on the subject of our UB handling is also spread throughout numerous bug reports, patch reviews, and in addition, has been the subject of many, many hours of in-person discussions. The fact that there's no one place to get a good in-depth overview is something that we, as a community, should strive to improve. This is a very important topic. However, I can assure you that what you've found by searching llvm-dev is only part of the relevant history. -Hal> > Peter Lawrence. > > >> On Jun 28, 2017, at 12:01 PM, Chandler Carruth >> <chandlerc at gmail.com <mailto:chandlerc at gmail.com>> wrote: >> >> On Wed, Jun 28, 2017 at 9:39 AM Peter Lawrence >> <peterl95124 at sbcglobal.net <mailto:peterl95124 at sbcglobal.net>> wrote: >> >> >> Preface: This paper shows that "poison" was never actually >> necessary >> in the first place. “Poison"s existence is based on incorrect >> assumptions >> that are being explored for the first time. >> >> >> Just so you are aware, there have been numerous people in the >> community who have been exploring the issues you bring up for >> many years. So I don't think it is correct to say that they are >> being explored for the first time. >> >> >> >> >> I have been re-reading Dan Gohman's original post "the nsw >> story" [1] >> and have come to the conclusion that Dan got it wrong in some >> respects. >> >> He came up with "no signed wrap" ("nsw") which in his words means >> "The purpose of the nsw flag is to indicate instructions >> which are >> known to have no overflow". The key operative word here is >> "known". >> >> This means literally an operation with an additional >> "llvm.assume". >> For example (a +nsw b) when a and b are i32 is really >> >> ((llvm.assume((INT_MIN <= ((i64)a+(i64)b) <= INT_MAX) , (a + b)) >> >> It is this "assume" that justifies the transformation that >> Dan does >> to optimize sign-extension of i32 induction variables out of >> loops >> on LP64 targets. So far so good. >> >> Note that there is no "undef" in the IR either before or >> after the >> transform, this doesn't just fall out because of a clever >> definition >> of IR "undef". >> >> Note that even the concept of "undefined" never enters into the >> justification, only that "nsw" ==> "assume" ==> the loop >> iteration >> bounds don't wrap ==> i64 arithmetic will generate the exact same >> iterations as i32 arithmetic ==> the induction variable can be >> promoted ==> there is no longer any sign-extend inside the loop. >> >> Note that clang can generate "+nsw" for signed “+" regardless >> of whether the precise C standard wording is "undefined behavior" >> or more simply "unspecified value". >> >> >> >> Where Dan goes wrong is in thinking that "+nsw" is an operation >> rather than an operation with an assume, and therefore he feels >> the need to define what the result of this operation is when it >> overflows, which then seems to require a new "poison" instruction >> because "undef" isn't good enough. >> >> But there is no need to ask what the result of overflow is >> because "+nsw" is like a "+" inside of an if-statement whose >> condition precludes overflow, and if it can't overflow then >> asking about it is a non sequitor. >> >> And speculatively hoisting the "+nsw" doesn't cause problems >> because hoisting a "+nsw" is like taking a "+" outside of the >> if-statement that guarantees no overflow, it is then simply >> a plain old un-attributed "+" operation which has no undefined >> behavior. >> >> Dan's follow on email "nsw is still inconsistent" [2] shows by >> example why it is illegal to hoist the "nsw" attribute along >> with the "+" operation. >> >> It therefore makes no sense to discuss the result of "+nsw" as >> ever being either "undef" or "poison", and so the need for >> "poison" >> is gone. >> >> >> >> Here's what Dan thought at the time about this "poison" creation >> >> "I wrote up a description of this concept, and it's >> been in >> LangRef ever since. It sticks out though, because it >> got pretty >> big and complex, especially in view of its relative >> obscurity. >> Realistically speaking, it's probably not fully >> watertight yet." >> >> I agree with Dan here "it's probably not fully watertight >> yet", and >> apparently other folks agree because yet another instruction, >> "freeze", is being proposed to fix "poison"s problems. My guess >> is that "freeze is probably not fully watertight yet" either, but >> since "poison" isn't needed it is time to delete it from the >> LangRef, >> and we can therefore stop considering "freeze". >> >> >> Peter Lawrence. >> >> >> References >> >> [1. llvm-dev, Dan Gohman, Tue Nov 29 15:21:58 PST 2011 ] >> [2. llvm-dev, Dan Gohman, Mon Dec 12 12:58:31 PST 2011 ] >> >-- Hal Finkel Lead, Compiler Technology and Programming Languages Leadership Computing Facility Argonne National Laboratory -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20170628/3e901af0/attachment.html>
Hal, Chandler, Can we get back on topic, IE if you’ve actually read the paper I’d like to hear some feedback on its technical content Peter Lawrence.> On Jun 28, 2017, at 12:34 PM, Hal Finkel <hfinkel at anl.gov> wrote: > > > On 06/28/2017 02:18 PM, Chandler Carruth wrote: >> On Wed, Jun 28, 2017 at 12:09 PM Peter Lawrence <peterl95124 at sbcglobal.net <mailto:peterl95124 at sbcglobal.net>> wrote: >> Chandler, >> Please give some citations, I’ve search the llvm-dev archives and didn't find any. >> >> They are all in the discussions from Nuno, Sanjoy, John and others around poison semantics. I won't be able to find the citations any easier than you, I simply wanted to point out that rather than conclude this is novel, you should probably ask folks who have participated in these discussions for reference points that they may have. > > Peter, I'll also point out that, in general, information on the subject of our UB handling is also spread throughout numerous bug reports, patch reviews, and in addition, has been the subject of many, many hours of in-person discussions. The fact that there's no one place to get a good in-depth overview is something that we, as a community, should strive to improve. This is a very important topic. However, I can assure you that what you've found by searching llvm-dev is only part of the relevant history. > > -Hal > >> >> >> Peter Lawrence. >> >> >>> On Jun 28, 2017, at 12:01 PM, Chandler Carruth <chandlerc at gmail.com <mailto:chandlerc at gmail.com>> wrote: >>> >>> On Wed, Jun 28, 2017 at 9:39 AM Peter Lawrence <peterl95124 at sbcglobal.net <mailto:peterl95124 at sbcglobal.net>> wrote: >>> >>> Preface: This paper shows that "poison" was never actually necessary >>> in the first place. “Poison"s existence is based on incorrect assumptions >>> that are being explored for the first time. >>> >>> Just so you are aware, there have been numerous people in the community who have been exploring the issues you bring up for many years. So I don't think it is correct to say that they are being explored for the first time. >>> >>> >>> >>> >>> I have been re-reading Dan Gohman's original post "the nsw story" [1] >>> and have come to the conclusion that Dan got it wrong in some respects. >>> >>> He came up with "no signed wrap" ("nsw") which in his words means >>> "The purpose of the nsw flag is to indicate instructions which are >>> known to have no overflow". The key operative word here is "known". >>> >>> This means literally an operation with an additional "llvm.assume". >>> For example (a +nsw b) when a and b are i32 is really >>> >>> ((llvm.assume((INT_MIN <= ((i64)a+(i64)b) <= INT_MAX) , (a + b)) >>> >>> It is this "assume" that justifies the transformation that Dan does >>> to optimize sign-extension of i32 induction variables out of loops >>> on LP64 targets. So far so good. >>> >>> Note that there is no "undef" in the IR either before or after the >>> transform, this doesn't just fall out because of a clever definition >>> of IR "undef". >>> >>> Note that even the concept of "undefined" never enters into the >>> justification, only that "nsw" ==> "assume" ==> the loop iteration >>> bounds don't wrap ==> i64 arithmetic will generate the exact same >>> iterations as i32 arithmetic ==> the induction variable can be >>> promoted ==> there is no longer any sign-extend inside the loop. >>> >>> Note that clang can generate "+nsw" for signed “+" regardless >>> of whether the precise C standard wording is "undefined behavior" >>> or more simply "unspecified value". >>> >>> >>> >>> Where Dan goes wrong is in thinking that "+nsw" is an operation >>> rather than an operation with an assume, and therefore he feels >>> the need to define what the result of this operation is when it >>> overflows, which then seems to require a new "poison" instruction >>> because "undef" isn't good enough. >>> >>> But there is no need to ask what the result of overflow is >>> because "+nsw" is like a "+" inside of an if-statement whose >>> condition precludes overflow, and if it can't overflow then >>> asking about it is a non sequitor. >>> >>> And speculatively hoisting the "+nsw" doesn't cause problems >>> because hoisting a "+nsw" is like taking a "+" outside of the >>> if-statement that guarantees no overflow, it is then simply >>> a plain old un-attributed "+" operation which has no undefined >>> behavior. >>> >>> Dan's follow on email "nsw is still inconsistent" [2] shows by >>> example why it is illegal to hoist the "nsw" attribute along >>> with the "+" operation. >>> >>> It therefore makes no sense to discuss the result of "+nsw" as >>> ever being either "undef" or "poison", and so the need for "poison" >>> is gone. >>> >>> >>> >>> Here's what Dan thought at the time about this "poison" creation >>> >>> "I wrote up a description of this concept, and it's been in >>> LangRef ever since. It sticks out though, because it got pretty >>> big and complex, especially in view of its relative obscurity. >>> Realistically speaking, it's probably not fully watertight yet." >>> >>> I agree with Dan here "it's probably not fully watertight yet", and >>> apparently other folks agree because yet another instruction, >>> "freeze", is being proposed to fix "poison"s problems. My guess >>> is that "freeze is probably not fully watertight yet" either, but >>> since "poison" isn't needed it is time to delete it from the LangRef, >>> and we can therefore stop considering "freeze". >>> >>> >>> Peter Lawrence. >>> >>> >>> References >>> >>> [1. llvm-dev, Dan Gohman, Tue Nov 29 15:21:58 PST 2011 ] >>> [2. llvm-dev, Dan Gohman, Mon Dec 12 12:58:31 PST 2011 ] >>> >> > > -- > Hal Finkel > Lead, Compiler Technology and Programming Languages > Leadership Computing Facility > Argonne National Laboratory-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20170628/80535a98/attachment-0001.html>
On Wed, Jun 28, 2017 at 12:42 PM Peter Lawrence <peterl95124 at sbcglobal.net> wrote:> Hal, > Chandler, > Can we get back on topic, IE if you’ve actually read the > paper I’d like to > hear some feedback on its technical content >Sure, I was mostly mentioning prior discussion because I don't think this is going to add a lot to the discussion around `nsw` and the `llvm.assume` intrinsic back when Hal and I started discussing modeling assumptions as first class entities in the IR.>>> I have been re-reading Dan Gohman's original post "the nsw story" [1] >>> and have come to the conclusion that Dan got it wrong in some respects. >>> >>> He came up with "no signed wrap" ("nsw") which in his words means >>> "The purpose of the nsw flag is to indicate instructions which are >>> known to have no overflow". The key operative word here is "known". >>> >>> This means literally an operation with an additional "llvm.assume". >>> For example (a +nsw b) when a and b are i32 is really >>> >>> ((llvm.assume((INT_MIN <= ((i64)a+(i64)b) <= INT_MAX) , (a + b)) >>> >>> It is this "assume" that justifies the transformation that Dan does >>> to optimize sign-extension of i32 induction variables out of loops >>> on LP64 targets. So far so good. >>> >>I just want to point out that this isn't complete. *Before* an `add nsw` instruction is hoisted, it could be thought of as a normal `add` plus `llvm.assume`. But the `llvm.assume` cannot be hoisted above predicates. So what this would end up doing is allowing the hoisting of the `add` by separating the invariant and keeping the invariant below the predicate. This is a fairly minor point, but I didn't want it to be unstated.> >>> Where Dan goes wrong is in thinking that "+nsw" is an operation >>> rather than an operation with an assume, and therefore he feels >>> the need to define what the result of this operation is when it >>> overflows, which then seems to require a new "poison" instruction >>> because "undef" isn't good enough. >>> >>A meta comment. I think it would be helpful for you to phrase your discussion in a less personalized way. Rather than talking about "So-and-so went wrong when they thought ...", I would instead merely discuss the situation as it is today. This helps keep the discussion from sliding into inappropriate areas, which has already been a concern, so it seems worth pointing out as a suggestion for future emails. To the actual point, it seems backwards to say that we made a mistake in specifying `nsw` as an operation rather than an `assume`. At the time we specified `nsw` there was no such construct as `llvm.assume`. Perhaps introducing `llvm.assume` would have been better than introducing `nsw`, perhaps not. But it doesn't actually matter. This kind of historical critique doesn't seem like a useful way to frame the discussion. The point is, we have `llvm.assume` now, and we can and should consider whether it is a better tool for the job. Hopefully I have understood the point you are trying to make? My response is that I don't think this is clearly a superior tool for the job. We did actually discuss things like `nsw` when designing `llvm.assume`. One serious issue with `llvm.assume` is that it *cannot be speculated*. Even though we gain the ability to speculate the `add` by decoupling the invariant, we would retain a large amount of IR that could not be speculated and would cause us to retain control flow and be a significant barrier to optimization. In fact, `llvm.assume` is *incredibly* expensive already. We talked at length when introducing it about how it should be relatively sparingly used and used with care to provide the most impactful hints to the optimizer rather than trying to encode everything that might possibly be assumed to be true. On the flip side, `add nsw` is extremely common in the IR produced by many frontends. So it would seem like a poor fit to use a very expensive and heavyweight tool to represent something that is very common. I'm happy to view `add nsw` as an `add` with an implicit `llvm.assume` inside of control flow where the result is "used" (for a complex definition of "used" that should be discussed on the `undef` thread and not here) if that is a useful notional or theoretical model. But the *representation optimization* of `add nsw` still seems quite useful. It therefore makes no sense to discuss the result of "+nsw" as>>> ever being either "undef" or "poison", and so the need for "poison" >>> is gone. >>> >> Note that there have been examples which show that the `undef` semanticsalone as used to model reading from uninitialized memory are effectively equivalent to `poison`. So I don't believe that removing `nsw` would necessarily be sufficient here. I think it would also require other changes. And in your emails about `undef` you seem to agree that we would have to change `undef` itself. I think it would be more productive to focus on that thread. If we decide to get rid of poison entirely (including changing `undef`), *then* we can have a discussion about how to model `nsw`. -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20170629/a2c69f28/attachment-0001.html>