Hello everyone,

I'm starting a new thread, as a follow-up to several older ones, on invariants in LLVM. Fundamentally, invariants are conditions that the optimizer is allowed to assume hold during the execution of the program. Many other compilers support a concept like this; MSVC has __assume, for example, and GCC has a builtin called __builtin_assume_aligned which also falls neatly into this class.

In my initial patches, I've named the associated IR-level intrinsic @llvm.invariant(i1) (maybe it should be @llvm.assume, etc. -- please feel free to express an opinion).
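(For readers less familiar with these builtins, a rough sketch of the source-level forms in question. The __builtin_assume spelling is the one D149 proposes for Clang, so this won't compile with a stock toolchain, and the function name is purely illustrative.)

void example(void *p, int n) {
  /* Proposed Clang spelling from D149; MSVC spells this __assume(n > 0),
     and __assume(0) marks the enclosing point as unreachable. */
  __builtin_assume(n > 0);

  /* GCC's __builtin_assume_aligned returns the pointer; the alignment fact
     attaches to the returned value, not to 'p' itself. */
  double *d = (double *)__builtin_assume_aligned(p, 16);
  (void)d;
}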
I'll attempt to highlight the primary issue involved with a simple example:

int foo(int a, int b) {
  if (b < 0) {
    __builtin_assume(a > 5);
    return a > 3;
  } else {
    __builtin_assume((a & 3) == 0);
    return a & 1;
  }
}

(with the current patches below, this simplifies as you'd expect)

When simplifying the expression "a > 3" or "a & 1", how do you find the applicable invariants? The invariants are not direct users of the variables they affect, 'a' in this case, but only use those variables transitively (the invariant is a user of the (a > 5) icmp, which is a user of 'a'). Also, not all transitive invariant users of the variable apply at any particular point in the execution. Specifically, when we call computeKnownBits on 'a' in the above example, I can't use the invariants without knowing whether I want the answer at "return a > 3;" or at "return a & 1;".

Generally speaking, I need two things: the instruction specifying the control-flow context, and a DominatorTree in order to efficiently determine whether a particular invariant applies at the specified context instruction.

As a result, my current implementation contains:

1. Changes to several analysis passes (ValueTracking and InstSimplify, InstCombine, LazyValueInfo, and ScalarEvolution) in order to find and use invariants. Invariants are generally found by doing a depth-limited, filtered, recursive search of the users of the Value being queried for a relevant invariant.

2. Changes to the APIs of those analysis passes to take optional context instructions and DominatorTree objects, and changes to almost all users of those analysis passes to provide that information when available (this obviously touches a fair amount of code).

3. Documentation changes, lowering, aliasing-analysis updates, etc.

As with MSVC's __assume(0), which means 'unreachable', I've done the same with @llvm.invariant(false).

Over the past several weeks, I've posted a number of patches comprising a proposed implementation of an invariant intrinsic for LLVM. I've listed them here in reverse order (they apply bottom up).

Patches:

http://reviews.llvm.org/D149 - A patch for Clang that adds __builtin_assume (and __assume in MS-compatibility mode) and GCC's __builtin_assume_aligned.
http://reviews.llvm.org/D4568 - Adds awareness of invariants to ScalarEvolution (specifically when looking for loop guards)
http://reviews.llvm.org/D4567 - A small addition to InstCombine to look for return values with all known bits
http://reviews.llvm.org/D4543 - Adds a bunch more patterns for recognizing known bits from invariants in value tracking
http://reviews.llvm.org/D4511 - Adds invariant awareness to LazyValueInfo (so invariants can be used by jump threading and related optimizations)
http://reviews.llvm.org/D4491 - Adds basic canonicalization of the invariant intrinsics themselves (splitting invariant(a && b), etc.)
http://reviews.llvm.org/D181 - Adds a scalar-evolution-powered pass, run late in the pipeline, to do more sophisticated updating of pointer alignments based on invariants.
http://reviews.llvm.org/D4490 - An update to computeKnownBits and other functions in ValueTracking, and almost all callers of these functions, to use invariants to compute known bits of quantities that have applicable invariant conditions (this also handles updating basic pointer alignments).
http://reviews.llvm.org/D180 - Adds "ephemeral" value recognition to the inliner and basic code metrics so that side-effect-free instructions that only contribute to invariants don't affect inlining, loop unrolling, etc.
http://reviews.llvm.org/D178 - Adds the llvm.invariant intrinsic and basic lowering properties.

Some points for discussion [by Philip, edited slightly by me]:

1. The current patches remove llvm.invariant(undef) as dead. Would it be better to transform this into 'unreachable', as is done for llvm.invariant(false)?

2. Would adding a canonicalization of if(c) { unreachable } to llvm.invariant(c) be worthwhile?

Finally, I'd like to mention that alternate schemes are certainly possible:

1. We could 'hide' all invariants inside dominating blocks guarded by never-true conditional branches (this has the benefit of being able to express invariants using expressions that normally might have side effects, and could simplify the process of finding applicable invariants) [a suggestion by Richard and Nick].

2. We might want to have passes precompute the Value->(set of Invariants) map, and update it as necessary instead of doing transitive-user searches [a suggestion by Chandler].

3. I'm sure there is other design space :-)

If you're interested in this functionality, please provide feedback, use cases, etc. (and help review patches).

Thanks in advance,
Hal

-- 
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory
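(For concreteness: the simplification of foo() described above as "as you'd expect" presumably amounts to something like the following. This is just the logical consequence of the two assumptions, not output copied from the patches.)

int foo_simplified(int a, int b) {
  if (b < 0)
    return 1;   /* a > 5 implies a > 3 */
  else
    return 0;   /* (a & 3) == 0 implies (a & 1) == 0 */
}

(InstCombine would presumably fold that further, to something like "return b < 0;".)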
On Thu, Jul 17, 2014 at 4:39 PM, Hal Finkel <hfinkel at anl.gov> wrote:

> Some points for discussion [by Philip, edited slightly by me]:
>
> 1. The current patches remove llvm.invariant(undef) as dead. Would it be better to transform this into 'unreachable', as is done for llvm.invariant(false)?

I strongly think "yes". With undef, we are granted permission to choose the most useful value for optimization, which is 'false' here.

> 2. Would adding a canonicalization of if(c) { unreachable } to llvm.invariant(c) be worthwhile?

There was a long and painful attempt to implement invariants based on the branch-to-unreachable pattern. It didn't work. I don't expect these patterns to show up often organically, and when they do, they go away too soon in the optimizer to be useful. The whole point of 'llvm.invariant', as opposed to the if construct, is to distinguish between the case the optimizer should try to remove (a branch to unreachable) and the case the optimizer should try to preserve because of its specific utility. We shouldn't lose this important distinction.

> Finally, I'd like to mention that alternate schemes are certainly possible:
>
> 1. We could 'hide' all invariants inside dominating blocks guarded by never-true conditional branches (this has the benefit of being able to express invariants using expressions that normally might have side effects, and could simplify the process of finding applicable invariants) [a suggestion by Richard and Nick].

I'm really concerned about the increased complexity and cost of invariants in this pattern. I would like to go with the simpler and less intrusive design until we have some compelling evidence that we need to support the more general form.

> 2. We might want to have passes precompute the Value->(set of Invariants) map, and update it as necessary instead of doing transitive-user searches [a suggestion by Chandler].

I think this is mostly just a small API tweak to make the code more maintainable. I'm specifically *not* imagining this as a formal analysis pass or anything of the sort.
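(To make the two encodings being compared above concrete at the source level, here is a sketch with hypothetical function names; __builtin_assume is the spelling proposed in D149.)

void guard_with_branch(int c) {
  if (!c)
    __builtin_unreachable();  /* branch-to-unreachable: a conditional branch whose
                                 taken side is unreachable, which the optimizer
                                 tries to delete */
  /* ... code that may rely on c being nonzero ... */
}

void guard_with_assume(int c) {
  __builtin_assume(c);        /* the same fact (c is nonzero), expressed as something
                                 the optimizer is expected to preserve and consult */
  /* ... */
}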
On 07/17/2014 01:39 PM, Hal Finkel wrote:
> Hello everyone,
>
> I'm starting a new thread, as a follow-up to several older ones, on invariants in LLVM. Fundamentally, invariants are conditions that the optimizer is allowed to assume will be valid during the execution of the program. Many other compilers support a concept like this, MSVC's __assume, for example. GCC has a builtin called __builtin_assume_aligned which also neatly falls into this class.

First of all, thank you for working on this! I'm thrilled to see progress being made here.

> In my initial patches, I've named the associated IR-level intrinsic @llvm.invariant(i1) (maybe it should be @llvm.assume, etc. -- please feel free to express an opinion).

My bikeshed preference would be "assume". "invariant" makes me think "loop invariant" - i.e. something which needs to be proven. "assume" is a traditional name (in the verification world) for something which can be assumed true without verification and can result in invalid results if actually untrue.

> I'll attempt to highlight the primary issue involved with a simple example:
>
> int foo(int a, int b) {
>   if (b < 0) {
>     __builtin_assume(a > 5);
>     return a > 3;
>   } else {
>     __builtin_assume((a & 3) == 0);
>     return a & 1;
>   }
> }
>
> (with the current patches below, this simplifies as you'd expect)
>
> When simplifying the expression "a > 3" or "a & 1", how do you find the applicable invariants? The invariants are not direct users of the variables they affect, 'a' in this case, but only use those variables transitively (the invariant is a user of the (a > 5) icmp, which is a user of 'a'). Also, not all transitive invariant users of the variable apply at any particular point in the execution. Specifically, when we call computeKnownBits on 'a' in the above example, I can't use the invariants without knowing whether I want the answer at "return a > 3;" or at "return a & 1;".

This path dependence property is important. We need to make sure this is clearly spelled out in the documentation or it will lead to ongoing confusion.

> Generally speaking, I need two things: the instruction specifying the control-flow context, and a DominatorTree in order to efficiently determine whether a particular invariant applies at the specified context instruction.
>
> As a result, my current implementation contains:
>
> 1. Changes to several analysis passes (ValueTracking and InstSimplify, InstCombine, LazyValueInfo, and ScalarEvolution) in order to find and use invariants. Invariants are generally found by doing a depth-limited, filtered, recursive search of the users of the Value being queried for a relevant invariant.
>
> 2. Changes to the APIs of those analysis passes to take optional context instructions and DominatorTree objects, and changes to almost all users of those analysis passes to provide that information when available (this obviously touches a fair amount of code).
>
> 3. Documentation changes, lowering, aliasing-analysis updates, etc.
>
> As with MSVC's __assume(0), which means 'unreachable', I've done the same with @llvm.invariant(false).
>
> Over the past several weeks, I've posted a number of patches comprising a proposed implementation of an invariant intrinsic for LLVM. I've listed them here in reverse order (they apply bottom up).

I'll be happy to help review once we settle on desired semantics. :)

> Patches:
>
> http://reviews.llvm.org/D149 - A patch for Clang that adds __builtin_assume (and __assume in MS-compatibility mode) and GCC's __builtin_assume_aligned.
> http://reviews.llvm.org/D4568 - Adds awareness of invariants to ScalarEvolution (specifically when looking for loop guards)
> http://reviews.llvm.org/D4567 - A small addition to InstCombine to look for return values with all known bits
> http://reviews.llvm.org/D4543 - Adds a bunch more patterns for recognizing known bits from invariants in value tracking
> http://reviews.llvm.org/D4511 - Adds invariant awareness to LazyValueInfo (so invariants can be used by jump threading and related optimizations)
> http://reviews.llvm.org/D4491 - Adds basic canonicalization of the invariant intrinsics themselves (splitting invariant(a && b), etc.)
> http://reviews.llvm.org/D181 - Adds a scalar-evolution-powered pass, run late in the pipeline, to do more sophisticated updating of pointer alignments based on invariants.
> http://reviews.llvm.org/D4490 - An update to computeKnownBits and other functions in ValueTracking, and almost all callers of these functions, to use invariants to compute known bits of quantities that have applicable invariant conditions (this also handles updating basic pointer alignments).
> http://reviews.llvm.org/D180 - Adds "ephemeral" value recognition to the inliner and basic code metrics so that side-effect-free instructions that only contribute to invariants don't affect inlining, loop unrolling, etc.
> http://reviews.llvm.org/D178 - Adds the llvm.invariant intrinsic and basic lowering properties.
>
> Some points for discussion [by Philip, edited slightly by me]:
>
> 1. The current patches remove llvm.invariant(undef) as dead. Would it be better to transform this into 'unreachable', as is done for llvm.invariant(false)?
>
> 2. Would adding a canonicalization of if(c) { unreachable } to llvm.invariant(c) be worthwhile?

A couple to add:

3. An "llvm.invariant" has zero code generation cost. Given that, a lot of pattern matching and profitability heuristics will need to be adjusted to ignore them. I worry about the maintenance cost of this in the optimizer. I can see a couple of possibilities here:
- Canonicalize the placement of "llvm.invariant"s at the end of each basic block. This at least reduces the patterns to be matched.
- Remove the invariant instructions entirely from the in-flight IR. Use the "llvm.invariant" instruction only for serialization, but have the conditions stored directly on the basic blocks when in memory. This would reduce the pattern matching problem, but at the cost of extra complexity.

4. How will the presence of "llvm.invariants" affect things like code placement? I could see having lots of extra users (which are dropped late in the optimizer) causing somewhat surprising results. (i.e. we perform CSE, an instruction gets placed differently, performance differs)
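(To make the cost concern raised in point 3 concrete, a small hypothetical example; the function name and the __builtin_assume spelling are illustrative, not taken from the patches.)

static inline int wrap_index(int i, int n) {
  /* The sub/and/compares below exist only to feed the assumption: they cost
     nothing at run time (the intrinsic generates no code), but without the
     "ephemeral value" handling in D180 they would still count toward this
     function's inlining and unrolling cost. */
  __builtin_assume(n > 0 && (n & (n - 1)) == 0);  /* n is a power of two */
  return i & (n - 1);                             /* i % n for non-negative i */
}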
> Finally, I'd like to mention that alternate schemes are certainly possible:
>
> 1. We could 'hide' all invariants inside dominating blocks guarded by never-true conditional branches (this has the benefit of being able to express invariants using expressions that normally might have side effects, and could simplify the process of finding applicable invariants) [a suggestion by Richard and Nick].

I'm not really clear on what this proposal would entail. Could you spell out an example? (If you think this is worth discussing further.)

> 2. We might want to have passes precompute the Value->(set of Invariants) map, and update it as necessary instead of doing transitive-user searches [a suggestion by Chandler].

This seems like something to consider as a future enhancement.

> 3. I'm sure there is other design space :-)
>
> If you're interested in this functionality, please provide feedback, use cases, etc. (and help review patches).

Interested!

> Thanks in advance,
> Hal

Philip
On 07/17/2014 01:51 PM, Chandler Carruth wrote:
>
> > 2. Would adding a canonicalization of if(c) { unreachable } to llvm.invariant(c) be worthwhile?
>
> There was a long and painful attempt to implement invariants based on the branch-to-unreachable pattern. It didn't work. I don't expect these patterns to show up often organically, and when they do, they go away too soon in the optimizer to be useful. The whole point of 'llvm.invariant', as opposed to the if construct, is to distinguish between the case the optimizer should try to remove (a branch to unreachable) and the case the optimizer should try to preserve because of its specific utility. We shouldn't lose this important distinction.

On first thought, I disagree. I may not be understanding your point though.

My understanding of the previous work was that it tried to use "branch-to-unreachable" as the canonical form. This is inherently problematic in an IR based on basic blocks since it splits basic blocks into many smaller chunks. It might work out better if we used extended basic blocks, but we don't.

I don't see any harm in canonicalizing to "llvm.invariant" from "if(c) unreachable". In either case, we remove the branch and can merge the basic blocks. In the former, we preserve more information for later passes. The only real downside is potentially keeping more Values alive and thus forcing the compiler to do more work.

Can you spell out your objections more?

> > 2. We might want to have passes precompute the Value->(set of Invariants) map, and update it as necessary instead of doing transitive-user searches [a suggestion by Chandler].
>
> I think this is mostly just a small API tweak to make the code more maintainable. I'm specifically *not* imagining this as a formal analysis pass or anything of the sort.

I would support this. I'm not seeing the downside to an analysis pass. Probably not worth implementing up front, but might be worthwhile down the road.

Philip
On Thu, Jul 17, 2014 at 5:31 PM, Philip Reames <listmail at philipreames.com> wrote:

> 3. An "llvm.invariant" has zero code generation cost. Given that, a lot of pattern matching and profitability heuristics will need to be adjusted to ignore them.

FWIW, this has been the fundamental point of contention in the entire design. I've discussed this several times with Andy, Hal, and others. I'll try to summarize a few points here, although getting all of the points is beyond me. Also, this is *my* takeaway; I'm probably not doing justice to other positions here, but hopefully others will chime in.

IMO, this cannot be avoided. This is *the* tradeoff of encoding assumptions: you risk the complexity of the assumption you are expressing imposing more optimization cost than you gain in simplifications from the extra information. I think we should just make this very clear in the documentation, and then not worry about it too much. The idea is to only use this to express really important, impossible-to-deduce invariants of the program or of a high-level abstraction in the IR. Unless it has really significant effects, it isn't worth it; don't do it. =/ It may not be the most theoretically satisfying result, but I think it is practical.

The other ideas have been to try to use some metadata encoding scheme to partition the abstraction from the actual IR. However, all of these IMO degenerate to specifying a "shadow IR" -- an entirely different IR that we still have to canonicalize, optimize, and pattern match, but which is somehow different and separated from the actual IR. I think the complexity of this is tremendously higher and would be a real problem long-term. I also don't think that the gain is worth the tradeoff.

> I worry about the maintenance cost of this in the optimizer.

As long as we don't go "crazy" trying to recover the performance, it should be OK. Hal has already put together the patches needed for his current approach. We'll likely have to tweak this approach a little, but we don't need to go through and change every single hasOneUse() check to do something special with these invariants (or assumptions).

> I can see a couple of possibilities here:
> - Canonicalize the placement of "llvm.invariant"s at the end of each basic block. This at least reduces the patterns to be matched.

I think that the use of use-def SSA graph information makes the placement of these unimportant. On the flip side, sinking the invariant below a call which might unwind would reduce the scope over which the invariant applies.

> - Remove the invariant instructions entirely from the in-flight IR. Use the "llvm.invariant" instruction only for serialization, but have the conditions stored directly on the basic blocks when in memory. This would reduce the pattern matching problem, but at the cost of extra complexity.

This is very similar to the metadata approach -- it boils down to needing to have and hold a reasonably complete "IR" which isn't the actual IR.

> 4. How will the presence of "llvm.invariants" affect things like code placement? I could see having lots of extra users (which are dropped late in the optimizer) causing somewhat surprising results. (i.e. we perform CSE, an instruction gets placed differently, performance differs)

Somewhat surprising, perhaps, but I suspect it won't be terribly different. These won't make it past codegen prep (and may be removed even before LSR or such) and so shouldn't really change the scheduling within a basic block.
----- Original Message -----
> From: "Philip Reames" <listmail at philipreames.com>
> To: "Hal Finkel" <hfinkel at anl.gov>, "LLVM Dev" <llvmdev at cs.uiuc.edu>
> Cc: "Chandler Carruth" <chandlerc at gmail.com>, "Andrew Trick" <atrick at apple.com>, "Richard Smith" <richard at metafoo.co.uk>, "Nick Lewycky" <nlewycky at google.com>
> Sent: Thursday, July 17, 2014 4:31:10 PM
> Subject: Re: [RFC] Invariants in LLVM
>
> On 07/17/2014 01:39 PM, Hal Finkel wrote:
> > Hello everyone,
> >
> > I'm starting a new thread, as a follow-up to several older ones, on invariants in LLVM. Fundamentally, invariants are conditions that the optimizer is allowed to assume will be valid during the execution of the program. Many other compilers support a concept like this, MSVC's __assume, for example. GCC has a builtin called __builtin_assume_aligned which also neatly falls into this class.
>
> First of all, thank you for working on this! I'm thrilled to see progress being made here.
>
> > In my initial patches, I've named the associated IR-level intrinsic @llvm.invariant(i1) (maybe it should be @llvm.assume, etc. -- please feel free to express an opinion).
>
> My bikeshed preference would be "assume". "invariant" makes me think "loop invariant" - i.e. something which needs to be proven. "assume" is a traditional name (in the verification world) for something which can be assumed true without verification and can result in invalid results if actually untrue.

That's two votes for 'assume' (you and Chandler from his code review); anyone else?

 -Hal

> > I'll attempt to highlight the primary issue involved with a simple example:
> >
> > int foo(int a, int b) {
> >   if (b < 0) {
> >     __builtin_assume(a > 5);
> >     return a > 3;
> >   } else {
> >     __builtin_assume((a & 3) == 0);
> >     return a & 1;
> >   }
> > }
> >
> > (with the current patches below, this simplifies as you'd expect)
> >
> > When simplifying the expression "a > 3" or "a & 1", how do you find the applicable invariants? The invariants are not direct users of the variables they affect, 'a' in this case, but only use those variables transitively (the invariant is a user of the (a > 5) icmp, which is a user of 'a'). Also, not all transitive invariant users of the variable apply at any particular point in the execution. Specifically, when we call computeKnownBits on 'a' in the above example, I can't use the invariants without knowing whether I want the answer at "return a > 3;" or at "return a & 1;".
>
> This path dependence property is important. We need to make sure this is clearly spelled out in the documentation or it will lead to ongoing confusion.
>
> > Generally speaking, I need two things: the instruction specifying the control-flow context, and a DominatorTree in order to efficiently determine whether a particular invariant applies at the specified context instruction.
> >
> > As a result, my current implementation contains:
> >
> > 1. Changes to several analysis passes (ValueTracking and InstSimplify, InstCombine, LazyValueInfo, and ScalarEvolution) in order to find and use invariants. Invariants are generally found by doing a depth-limited, filtered, recursive search of the users of the Value being queried for a relevant invariant.
> >
> > 2. Changes to the APIs of those analysis passes to take optional context instructions and DominatorTree objects, and changes to almost all users of those analysis passes to provide that information when available (this obviously touches a fair amount of code).
> >
> > 3. Documentation changes, lowering, aliasing-analysis updates, etc.
> >
> > As with MSVC's __assume(0), which means 'unreachable', I've done the same with @llvm.invariant(false).
> >
> > Over the past several weeks, I've posted a number of patches comprising a proposed implementation of an invariant intrinsic for LLVM. I've listed them here in reverse order (they apply bottom up).
>
> I'll be happy to help review once we settle on desired semantics. :)
>
> > Patches:
> >
> > http://reviews.llvm.org/D149 - A patch for Clang that adds __builtin_assume (and __assume in MS-compatibility mode) and GCC's __builtin_assume_aligned.
> > http://reviews.llvm.org/D4568 - Adds awareness of invariants to ScalarEvolution (specifically when looking for loop guards)
> > http://reviews.llvm.org/D4567 - A small addition to InstCombine to look for return values with all known bits
> > http://reviews.llvm.org/D4543 - Adds a bunch more patterns for recognizing known bits from invariants in value tracking
> > http://reviews.llvm.org/D4511 - Adds invariant awareness to LazyValueInfo (so invariants can be used by jump threading and related optimizations)
> > http://reviews.llvm.org/D4491 - Adds basic canonicalization of the invariant intrinsics themselves (splitting invariant(a && b), etc.)
> > http://reviews.llvm.org/D181 - Adds a scalar-evolution-powered pass, run late in the pipeline, to do more sophisticated updating of pointer alignments based on invariants.
> > http://reviews.llvm.org/D4490 - An update to computeKnownBits and other functions in ValueTracking, and almost all callers of these functions, to use invariants to compute known bits of quantities that have applicable invariant conditions (this also handles updating basic pointer alignments).
> > http://reviews.llvm.org/D180 - Adds "ephemeral" value recognition to the inliner and basic code metrics so that side-effect-free instructions that only contribute to invariants don't affect inlining, loop unrolling, etc.
> > http://reviews.llvm.org/D178 - Adds the llvm.invariant intrinsic and basic lowering properties.
> >
> > Some points for discussion [by Philip, edited slightly by me]:
> >
> > 1. The current patches remove llvm.invariant(undef) as dead. Would it be better to transform this into 'unreachable', as is done for llvm.invariant(false)?
> >
> > 2. Would adding a canonicalization of if(c) { unreachable } to llvm.invariant(c) be worthwhile?
>
> A couple to add:
>
> 3. An "llvm.invariant" has zero code generation cost. Given that, a lot of pattern matching and profitability heuristics will need to be adjusted to ignore them. I worry about the maintenance cost of this in the optimizer. I can see a couple of possibilities here:
> - Canonicalize the placement of "llvm.invariant"s at the end of each basic block. This at least reduces the patterns to be matched.
> - Remove the invariant instructions entirely from the in-flight IR. Use the "llvm.invariant" instruction only for serialization, but have the conditions stored directly on the basic blocks when in memory. This would reduce the pattern matching problem, but at the cost of extra complexity.
>
> 4. How will the presence of "llvm.invariants" affect things like code placement? I could see having lots of extra users (which are dropped late in the optimizer) causing somewhat surprising results. (i.e. we perform CSE, an instruction gets placed differently, performance differs)
>
> > Finally, I'd like to mention that alternate schemes are certainly possible:
> >
> > 1. We could 'hide' all invariants inside dominating blocks guarded by never-true conditional branches (this has the benefit of being able to express invariants using expressions that normally might have side effects, and could simplify the process of finding applicable invariants) [a suggestion by Richard and Nick].
>
> I'm not really clear on what this proposal would entail. Could you spell out an example? (If you think this is worth discussing further.)
>
> > 2. We might want to have passes precompute the Value->(set of Invariants) map, and update it as necessary instead of doing transitive-user searches [a suggestion by Chandler].
>
> This seems like something to consider as a future enhancement.
>
> > 3. I'm sure there is other design space :-)
> >
> > If you're interested in this functionality, please provide feedback, use cases, etc. (and help review patches).
>
> Interested!
>
> > Thanks in advance,
> > Hal
>
> Philip

-- 
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory