Andrew Trick via llvm-dev
2017-Aug-09 03:29 UTC
[llvm-dev] Improving SCEV's behavior around IR level no-wrap
> On Aug 8, 2017, at 5:34 PM, Sanjoy Das <sanjoy at playingwithpointers.com> wrote: > > Hi Pankaj, > > IIRC there was pushback on this proposal so I did not proceed further. > Are you blocked on this? > > [+CC Andy, who I remember had some objections.] > > — SanjoyOff the top of my head, my concern is that expression comparison is no longer constant time, which I think is fundamental to SCEV. I may be able to dig through my notes next week, after vacation... -Andy> On Tue, Aug 8, 2017 at 3:06 PM, Chawla, Pankaj <pankaj.chawla at intel.com> wrote: >> Hi Sanjoy, >> >> Any update on this? >> Are there plans to implement this proposal? >> >> Thanks, >> Pankaj >> >> >> -----Original Message----- >> Date: Fri, 23 Sep 2016 02:09:19 -0700 >> From: Sanjoy Das via llvm-dev <llvm-dev at lists.llvm.org> >> To: llvm-dev <llvm-dev at lists.llvm.org>, Andrew Trick >> <atrick at apple.com>, Dan Gohman <dan433584 at gmail.com>, Hal Finkel >> <hfinkel at anl.gov>, Chandler Carruth <chandlerc at gmail.com>, David >> Majnemer <david.majnemer at gmail.com>, Sebastian Pop <sebpop at gmail.com> >> Subject: [llvm-dev] Improving SCEV's behavior around IR level no-wrap >> flags >> Message-ID: >> <CAMiUf7fs5xDnfaChLEcft+auNoVW_LksqLA48KJnH3rNgcMftQ at mail.gmail.com> >> Content-Type: text/plain; charset=UTF-8 >> >> Hi all, >> >> This is about a project I've been prototyping on-and-off for a while that has finally reached a point where I can claim it to be "potentially viable". I'd like to gather some input from the community before moving too far ahead. >> >> >> # The problem >> >> There is a representation issue within SCEV that prevents it from fully using information from nsw/nuw flags present in the IR. This isn't just a theoretical issue, e.g. today LLVM won't unroll this >> loop: >> >> void f(int x, long* arr) { >> for (int i = x + 1; i < x + 3; i++) >> arr[i] = 40; >> } >> >> since SCEV is unable to exploit the no-overflow on x+1 and x+3 to prove that the loop only runs twice. >> >> The fundamental problem here is that SCEV expressions are unique'd but the nsw/nuw flags on SCEV expressions are not part of the key they're unique'd by. Instead, nsw/nuw flags on SCEV expressions are expressed by mutating the SCEV expressions in place. >> >> This means "add %x, 1" and "add nsw %x, 1" both map to the _same_ SCEV expression (that is, literally the same SCEV* object), and we can't mutate the common SCEV expression to flag it as nsw since that will incorrectly denote "add %x, 1" as nsw too. >> >> In general, this means SCEV has to be very conservative about marking SCEV expressions as no-wrap. In some cases (e.g. the loop above), this ends up being excessively conservative. >> >> One path forward is to have SCEV try to prove that if a certain operation produces poison, the program definitely has undefined behavior. This can let us mutate the corresponding SCEV objects to pull the "nsw"-ness into SCEV. For instance, if we have >> >> %x = load ... >> %t = add i32 nsw %x, 1 >> %addr = gep(%array, %t) >> store i32 0, %addr >> %t2 = add i32 %x, 1 >> >> then transferring NSW to getSCEV(%t) is okay, since even though %t2 (which will be mapped to the same SCEV expression as %t) does not have "nsw" on the instruction, we know adding 1 to %x cannot overflow since the program would have UB otherwise. >> >> Bjarke Roune has implemented some of this. However, this is difficult to do for cases like the x+1 .. x+3 loop above without running a control flow analysis over the entire function. And this approach does not work in the presence of function calls or general control flow, like >> >> %x = load ... >> %t = add i32 nsw %x, 1 >> call void @f() >> %addr = gep(%array, %t) >> store i32 0, %addr >> >> or >> >> %x = load ... >> %t = add i32 nsw %x, 1 >> if (<condition>) >> return; >> %addr = gep(%array, %t) >> store i32 0, %addr >> >> since unless the side-effecting use of %t (the store) "strongly"[1] post dominates the def of %x, there is no guaranteed undefined behavior on a poisonous %t. Things are even more complex if %x is not a load, but an expression SCEV an look through, like an add or a shift by a constant. >> >> *I think the current representation of nsw/nuw in SCEV expressions is not congruent with LLVM's specification of poison values, and that is blocking us from exploiting poison values as intended by LLVM's >> design.* >> >> >> >> # The proposed solution >> >> Since poison values are, well, _values_, I propose we model them as data within SCEV. We treat nsw/nuw flags as "operands" since they contribute to the result of an SCEV expression just like normal inputs to the expression. >> >> This means we'd treat "add %x, %y" as a different SCEV expression than "add nsw %x, %y", since the latter sometimes produces poison while the former doesn't. The latter would be known to not overflow, and SCEV would use that fact in the usual ways. >> >> With this change SCEV expressions will be pointer equal less often, and while relying on pointer equality for value equality will be correct, it will be somewhat pessimistic; and we'll have to implement and use some form of structural equality. >> >> In other words, some places that do >> >> SCEV *X = ... >> SCEV *Y = ... >> if (X == Y) >> ... >> >> will have to be changed to do >> >> SCEV *X = ... >> SCEV *Y = ... >> if (X->equals(Y)) >> ... >> >> This has potential for compile-time regressions. Hopefully they'll all be addressable. >> >> There are cases in which SCEV (via trip count analysis, say) can _prove_ that a certain expression does not overflow. In those cases we will mutate the SCEV expression to indicate no-wrap; since the no-wrap flag is just a "cache" of a proof based on the structure of the SCEV expression, and _does_ apply to all SCEV expressions with the same shapes. >> >> Concretely, we'll endow relevant SCEV expression types with two sets distinct of flags: >> >> - AxiomaticFlags: These flags follow from nsw/nuw annotations in the >> IR. These will be part of the key the SCEV expression is unique'd >> on. >> - ComputedFlags: These flags are derived from the structure of the >> SCEV expression, and they're *not* a part of the key the SCEV >> expression is unique'd on. >> >> For the purposes of consumption, there will be no difference between AxiomaticFlags and ComputedFlags. Consumers will get a union of the two when they ask for the set of flags that apply to a specific SCEV expression. >> >> ComputedFlags will, in general, depend on AxiomaticFlags. For instance if AxiomaticFlags is "nsw" for, say, {1,+,1}, we can add "nuw" to its ComputedFlags. There is no need to further distinguish "{1,+,1}-axiomatic<nsw>" on the computed<nuw> dimension since "{1,+,1}-axiomatic<nsw>" will always be computed<nuw>. >> >> >> >> >> What do you think? Does the overall picture here make sense? >> >> Alternate solutions are also more than welcome (especially if they're easier to implement!). >> >> Thanks, >> -- Sanjoy >> >> [1]: That is, it the store is guaranteed to execute once the load has >> been issued. >> >> >> ------------------------------ >> >>
Chawla, Pankaj via llvm-dev
2017-Aug-09 19:27 UTC
[llvm-dev] Improving SCEV's behavior around IR level no-wrap
I am not blocked by it. I was just wondering whether preserving no-wrap flags in SCEV can help produce better code using SCEVExpander. What I mean is that if I construct SCEV out of incoming optimized IR and then use SCEVExpander to generate code, can I recover the optimized IR back using some combination of peephole optimizations (like instcombine) on the generated code? Or are we going to lose performance due to missing no-wrap flags? Unfortunately, I do not have a test case to demonstrate that we may lose performance in some cases. Thanks, Pankaj -----Original Message----- From: Andrew Trick [mailto:atrick at apple.com] Sent: Tuesday, August 08, 2017 8:30 PM To: Sanjoy Das <sanjoy at playingwithpointers.com> Cc: Chawla, Pankaj <pankaj.chawla at intel.com>; llvm-dev at lists.llvm.org Subject: Re: Improving SCEV's behavior around IR level no-wrap> On Aug 8, 2017, at 5:34 PM, Sanjoy Das <sanjoy at playingwithpointers.com> wrote: > > Hi Pankaj, > > IIRC there was pushback on this proposal so I did not proceed further. > Are you blocked on this? > > [+CC Andy, who I remember had some objections.] > > — SanjoyOff the top of my head, my concern is that expression comparison is no longer constant time, which I think is fundamental to SCEV. I may be able to dig through my notes next week, after vacation... -Andy> On Tue, Aug 8, 2017 at 3:06 PM, Chawla, Pankaj <pankaj.chawla at intel.com> wrote: >> Hi Sanjoy, >> >> Any update on this? >> Are there plans to implement this proposal? >> >> Thanks, >> Pankaj >> >> >> -----Original Message----- >> Date: Fri, 23 Sep 2016 02:09:19 -0700 >> From: Sanjoy Das via llvm-dev <llvm-dev at lists.llvm.org> >> To: llvm-dev <llvm-dev at lists.llvm.org>, Andrew Trick >> <atrick at apple.com>, Dan Gohman <dan433584 at gmail.com>, Hal Finkel >> <hfinkel at anl.gov>, Chandler Carruth <chandlerc at gmail.com>, David >> Majnemer <david.majnemer at gmail.com>, Sebastian Pop >> <sebpop at gmail.com> >> Subject: [llvm-dev] Improving SCEV's behavior around IR level no-wrap >> flags >> Message-ID: >> >> <CAMiUf7fs5xDnfaChLEcft+auNoVW_LksqLA48KJnH3rNgcMftQ at mail.gmail.com> >> Content-Type: text/plain; charset=UTF-8 >> >> Hi all, >> >> This is about a project I've been prototyping on-and-off for a while that has finally reached a point where I can claim it to be "potentially viable". I'd like to gather some input from the community before moving too far ahead. >> >> >> # The problem >> >> There is a representation issue within SCEV that prevents it from >> fully using information from nsw/nuw flags present in the IR. This >> isn't just a theoretical issue, e.g. today LLVM won't unroll this >> loop: >> >> void f(int x, long* arr) { >> for (int i = x + 1; i < x + 3; i++) >> arr[i] = 40; >> } >> >> since SCEV is unable to exploit the no-overflow on x+1 and x+3 to prove that the loop only runs twice. >> >> The fundamental problem here is that SCEV expressions are unique'd but the nsw/nuw flags on SCEV expressions are not part of the key they're unique'd by. Instead, nsw/nuw flags on SCEV expressions are expressed by mutating the SCEV expressions in place. >> >> This means "add %x, 1" and "add nsw %x, 1" both map to the _same_ SCEV expression (that is, literally the same SCEV* object), and we can't mutate the common SCEV expression to flag it as nsw since that will incorrectly denote "add %x, 1" as nsw too. >> >> In general, this means SCEV has to be very conservative about marking SCEV expressions as no-wrap. In some cases (e.g. the loop above), this ends up being excessively conservative. >> >> One path forward is to have SCEV try to prove that if a certain >> operation produces poison, the program definitely has undefined >> behavior. This can let us mutate the corresponding SCEV objects to >> pull the "nsw"-ness into SCEV. For instance, if we have >> >> %x = load ... >> %t = add i32 nsw %x, 1 >> %addr = gep(%array, %t) >> store i32 0, %addr >> %t2 = add i32 %x, 1 >> >> then transferring NSW to getSCEV(%t) is okay, since even though %t2 (which will be mapped to the same SCEV expression as %t) does not have "nsw" on the instruction, we know adding 1 to %x cannot overflow since the program would have UB otherwise. >> >> Bjarke Roune has implemented some of this. However, this is difficult >> to do for cases like the x+1 .. x+3 loop above without running a >> control flow analysis over the entire function. And this approach >> does not work in the presence of function calls or general control >> flow, like >> >> %x = load ... >> %t = add i32 nsw %x, 1 >> call void @f() >> %addr = gep(%array, %t) >> store i32 0, %addr >> >> or >> >> %x = load ... >> %t = add i32 nsw %x, 1 >> if (<condition>) >> return; >> %addr = gep(%array, %t) >> store i32 0, %addr >> >> since unless the side-effecting use of %t (the store) "strongly"[1] post dominates the def of %x, there is no guaranteed undefined behavior on a poisonous %t. Things are even more complex if %x is not a load, but an expression SCEV an look through, like an add or a shift by a constant. >> >> *I think the current representation of nsw/nuw in SCEV expressions is >> not congruent with LLVM's specification of poison values, and that is >> blocking us from exploiting poison values as intended by LLVM's >> design.* >> >> >> >> # The proposed solution >> >> Since poison values are, well, _values_, I propose we model them as data within SCEV. We treat nsw/nuw flags as "operands" since they contribute to the result of an SCEV expression just like normal inputs to the expression. >> >> This means we'd treat "add %x, %y" as a different SCEV expression than "add nsw %x, %y", since the latter sometimes produces poison while the former doesn't. The latter would be known to not overflow, and SCEV would use that fact in the usual ways. >> >> With this change SCEV expressions will be pointer equal less often, and while relying on pointer equality for value equality will be correct, it will be somewhat pessimistic; and we'll have to implement and use some form of structural equality. >> >> In other words, some places that do >> >> SCEV *X = ... >> SCEV *Y = ... >> if (X == Y) >> ... >> >> will have to be changed to do >> >> SCEV *X = ... >> SCEV *Y = ... >> if (X->equals(Y)) >> ... >> >> This has potential for compile-time regressions. Hopefully they'll all be addressable. >> >> There are cases in which SCEV (via trip count analysis, say) can _prove_ that a certain expression does not overflow. In those cases we will mutate the SCEV expression to indicate no-wrap; since the no-wrap flag is just a "cache" of a proof based on the structure of the SCEV expression, and _does_ apply to all SCEV expressions with the same shapes. >> >> Concretely, we'll endow relevant SCEV expression types with two sets distinct of flags: >> >> - AxiomaticFlags: These flags follow from nsw/nuw annotations in the >> IR. These will be part of the key the SCEV expression is unique'd >> on. >> - ComputedFlags: These flags are derived from the structure of the >> SCEV expression, and they're *not* a part of the key the SCEV >> expression is unique'd on. >> >> For the purposes of consumption, there will be no difference between AxiomaticFlags and ComputedFlags. Consumers will get a union of the two when they ask for the set of flags that apply to a specific SCEV expression. >> >> ComputedFlags will, in general, depend on AxiomaticFlags. For instance if AxiomaticFlags is "nsw" for, say, {1,+,1}, we can add "nuw" to its ComputedFlags. There is no need to further distinguish "{1,+,1}-axiomatic<nsw>" on the computed<nuw> dimension since "{1,+,1}-axiomatic<nsw>" will always be computed<nuw>. >> >> >> >> >> What do you think? Does the overall picture here make sense? >> >> Alternate solutions are also more than welcome (especially if they're easier to implement!). >> >> Thanks, >> -- Sanjoy >> >> [1]: That is, it the store is guaranteed to execute once the load has >> been issued. >> >> >> ------------------------------ >> >>
Sanjoy Das via llvm-dev
2017-Aug-14 05:32 UTC
[llvm-dev] Improving SCEV's behavior around IR level no-wrap
Hi, On Wed, Aug 9, 2017 at 12:27 PM, Chawla, Pankaj <pankaj.chawla at intel.com> wrote:> I am not blocked by it. I was just wondering whether preserving no-wrap flags in SCEV can help produce better code using SCEVExpander. What I mean is that if I construct SCEV out of incoming optimized IR and then use SCEVExpander to generate code, can I recover the optimized IR back using some combination of peephole optimizations (like instcombine) on the generated code? Or are we going to lose performance due to missing no-wrap flags?I'm fairly certain that there are cases where round-tripping through SCEV -> SCEVExpander will lose information for good in clang-generated IR, since most of the NSW flags in clang-generated IR come language axioms (signed integer overflow is UB) that can't be re-proved by any analysis once they've been erased. -- Sanjoy> > Unfortunately, I do not have a test case to demonstrate that we may lose performance in some cases. > > Thanks, > Pankaj > > > -----Original Message----- > From: Andrew Trick [mailto:atrick at apple.com] > Sent: Tuesday, August 08, 2017 8:30 PM > To: Sanjoy Das <sanjoy at playingwithpointers.com> > Cc: Chawla, Pankaj <pankaj.chawla at intel.com>; llvm-dev at lists.llvm.org > Subject: Re: Improving SCEV's behavior around IR level no-wrap > > >> On Aug 8, 2017, at 5:34 PM, Sanjoy Das <sanjoy at playingwithpointers.com> wrote: >> >> Hi Pankaj, >> >> IIRC there was pushback on this proposal so I did not proceed further. >> Are you blocked on this? >> >> [+CC Andy, who I remember had some objections.] >> >> — Sanjoy > > Off the top of my head, my concern is that expression comparison is no longer constant time, which I think is fundamental to SCEV. > > I may be able to dig through my notes next week, after vacation... > > -Andy > >> On Tue, Aug 8, 2017 at 3:06 PM, Chawla, Pankaj <pankaj.chawla at intel.com> wrote: >>> Hi Sanjoy, >>> >>> Any update on this? >>> Are there plans to implement this proposal? >>> >>> Thanks, >>> Pankaj >>> >>> >>> -----Original Message----- >>> Date: Fri, 23 Sep 2016 02:09:19 -0700 >>> From: Sanjoy Das via llvm-dev <llvm-dev at lists.llvm.org> >>> To: llvm-dev <llvm-dev at lists.llvm.org>, Andrew Trick >>> <atrick at apple.com>, Dan Gohman <dan433584 at gmail.com>, Hal Finkel >>> <hfinkel at anl.gov>, Chandler Carruth <chandlerc at gmail.com>, David >>> Majnemer <david.majnemer at gmail.com>, Sebastian Pop >>> <sebpop at gmail.com> >>> Subject: [llvm-dev] Improving SCEV's behavior around IR level no-wrap >>> flags >>> Message-ID: >>> >>> <CAMiUf7fs5xDnfaChLEcft+auNoVW_LksqLA48KJnH3rNgcMftQ at mail.gmail.com> >>> Content-Type: text/plain; charset=UTF-8 >>> >>> Hi all, >>> >>> This is about a project I've been prototyping on-and-off for a while that has finally reached a point where I can claim it to be "potentially viable". I'd like to gather some input from the community before moving too far ahead. >>> >>> >>> # The problem >>> >>> There is a representation issue within SCEV that prevents it from >>> fully using information from nsw/nuw flags present in the IR. This >>> isn't just a theoretical issue, e.g. today LLVM won't unroll this >>> loop: >>> >>> void f(int x, long* arr) { >>> for (int i = x + 1; i < x + 3; i++) >>> arr[i] = 40; >>> } >>> >>> since SCEV is unable to exploit the no-overflow on x+1 and x+3 to prove that the loop only runs twice. >>> >>> The fundamental problem here is that SCEV expressions are unique'd but the nsw/nuw flags on SCEV expressions are not part of the key they're unique'd by. Instead, nsw/nuw flags on SCEV expressions are expressed by mutating the SCEV expressions in place. >>> >>> This means "add %x, 1" and "add nsw %x, 1" both map to the _same_ SCEV expression (that is, literally the same SCEV* object), and we can't mutate the common SCEV expression to flag it as nsw since that will incorrectly denote "add %x, 1" as nsw too. >>> >>> In general, this means SCEV has to be very conservative about marking SCEV expressions as no-wrap. In some cases (e.g. the loop above), this ends up being excessively conservative. >>> >>> One path forward is to have SCEV try to prove that if a certain >>> operation produces poison, the program definitely has undefined >>> behavior. This can let us mutate the corresponding SCEV objects to >>> pull the "nsw"-ness into SCEV. For instance, if we have >>> >>> %x = load ... >>> %t = add i32 nsw %x, 1 >>> %addr = gep(%array, %t) >>> store i32 0, %addr >>> %t2 = add i32 %x, 1 >>> >>> then transferring NSW to getSCEV(%t) is okay, since even though %t2 (which will be mapped to the same SCEV expression as %t) does not have "nsw" on the instruction, we know adding 1 to %x cannot overflow since the program would have UB otherwise. >>> >>> Bjarke Roune has implemented some of this. However, this is difficult >>> to do for cases like the x+1 .. x+3 loop above without running a >>> control flow analysis over the entire function. And this approach >>> does not work in the presence of function calls or general control >>> flow, like >>> >>> %x = load ... >>> %t = add i32 nsw %x, 1 >>> call void @f() >>> %addr = gep(%array, %t) >>> store i32 0, %addr >>> >>> or >>> >>> %x = load ... >>> %t = add i32 nsw %x, 1 >>> if (<condition>) >>> return; >>> %addr = gep(%array, %t) >>> store i32 0, %addr >>> >>> since unless the side-effecting use of %t (the store) "strongly"[1] post dominates the def of %x, there is no guaranteed undefined behavior on a poisonous %t. Things are even more complex if %x is not a load, but an expression SCEV an look through, like an add or a shift by a constant. >>> >>> *I think the current representation of nsw/nuw in SCEV expressions is >>> not congruent with LLVM's specification of poison values, and that is >>> blocking us from exploiting poison values as intended by LLVM's >>> design.* >>> >>> >>> >>> # The proposed solution >>> >>> Since poison values are, well, _values_, I propose we model them as data within SCEV. We treat nsw/nuw flags as "operands" since they contribute to the result of an SCEV expression just like normal inputs to the expression. >>> >>> This means we'd treat "add %x, %y" as a different SCEV expression than "add nsw %x, %y", since the latter sometimes produces poison while the former doesn't. The latter would be known to not overflow, and SCEV would use that fact in the usual ways. >>> >>> With this change SCEV expressions will be pointer equal less often, and while relying on pointer equality for value equality will be correct, it will be somewhat pessimistic; and we'll have to implement and use some form of structural equality. >>> >>> In other words, some places that do >>> >>> SCEV *X = ... >>> SCEV *Y = ... >>> if (X == Y) >>> ... >>> >>> will have to be changed to do >>> >>> SCEV *X = ... >>> SCEV *Y = ... >>> if (X->equals(Y)) >>> ... >>> >>> This has potential for compile-time regressions. Hopefully they'll all be addressable. >>> >>> There are cases in which SCEV (via trip count analysis, say) can _prove_ that a certain expression does not overflow. In those cases we will mutate the SCEV expression to indicate no-wrap; since the no-wrap flag is just a "cache" of a proof based on the structure of the SCEV expression, and _does_ apply to all SCEV expressions with the same shapes. >>> >>> Concretely, we'll endow relevant SCEV expression types with two sets distinct of flags: >>> >>> - AxiomaticFlags: These flags follow from nsw/nuw annotations in the >>> IR. These will be part of the key the SCEV expression is unique'd >>> on. >>> - ComputedFlags: These flags are derived from the structure of the >>> SCEV expression, and they're *not* a part of the key the SCEV >>> expression is unique'd on. >>> >>> For the purposes of consumption, there will be no difference between AxiomaticFlags and ComputedFlags. Consumers will get a union of the two when they ask for the set of flags that apply to a specific SCEV expression. >>> >>> ComputedFlags will, in general, depend on AxiomaticFlags. For instance if AxiomaticFlags is "nsw" for, say, {1,+,1}, we can add "nuw" to its ComputedFlags. There is no need to further distinguish "{1,+,1}-axiomatic<nsw>" on the computed<nuw> dimension since "{1,+,1}-axiomatic<nsw>" will always be computed<nuw>. >>> >>> >>> >>> >>> What do you think? Does the overall picture here make sense? >>> >>> Alternate solutions are also more than welcome (especially if they're easier to implement!). >>> >>> Thanks, >>> -- Sanjoy >>> >>> [1]: That is, it the store is guaranteed to execute once the load has >>> been issued. >>> >>> >>> ------------------------------ >>> >>> >