Philip Reames
2012-Oct-02 21:43 UTC
[LLVMdev] How best to represent assume statements in LLVM IR?
On 10/02/2012 12:52 AM, Duncan Sands wrote:> Hi Philip, > >> Does anyone have any suggestions on how to best represent an assumption >> statement(*) in IR? > > good question! There have been various attempts, for example Nick tried > teaching the optimizers to not prune the branch to unreachable in > > br %cond, label %assumption_holds, %assumption_doesnt_hold > assumption_doesnt_hold: > unreachable > > This then leads to %cond being replaced with true everywhere downstream, > which > is good. Unfortunately it also causes a bunch of other optimizations to > not > occur, and the overall result was not a win. > > Rafael added "range" metadata to the IR, however it is only for loads. > If it > could also be attached to the definition of %cond, saying that the value > is 1, > and various places are taught to understand that, then that might be > another > possibility.It really seems to me that the core problem is that the assumption branch is removed before the information it contained could be leveraged to remove downstream code. If the "unreachable" is replaced with a call to an external function marked noreturn (i.e. a trap like function), the optimizer does correctly optimize downstream code. This seems to be a classic ordering problem. Note: I haven't studied how the downstream code is optimized. It seems odd to me that the information is lost when the branch is lost. I may dig into what's causing the analysis information to be dropped - there might simply be a bug somewhere along the way. Nick's approach seems to tackle the ordering problem by preventing the optimization of the unreachable branch. As you pointed out, this is undesirable since we don't want the code to calculate the assumption condition to be retained. I'm not quiet clear on how Rafael's solution addresses the problem at hand, so I won't address that. I've gone ahead and implemented my hack solution - use an external function to force downstream optimization, replace it with unreachable and then re-optimize. At least for the trivial examples I'm testing with, this seems to get the desired behavior. The downside of course is that it requires the all of the other passes to be run twice. If anyone is interested, I can throw the (nearly trivial) code up on github for anyone else who wants it. > However I think people would rather only have such> metadata be > attached to "inputs" to a function: function parameters, loads from > memory etc. > In your case, is the assumption usually about a function parameter?For the initial examples I'm considering - specifically object invariants and function preconditions - all assumptions are about function parameters. However, I can think of some examples where having the ability to note an assumption about an arbitrary intermediate value might be useful. If I can, I'd like to not exclude that case. One such example (for a language with array bounds checks): int n = external_func_compute_loop_bound(); assume n < array.size(); for(int i = 0; i < n; i++) { process(arrray[i]); } Philip *removed extra text for readability*
Hal Finkel
2012-Oct-02 21:55 UTC
[LLVMdev] How best to represent assume statements in LLVM IR?
On Tue, 02 Oct 2012 14:43:51 -0700 Philip Reames <listmail at philipreames.com> wrote:> On 10/02/2012 12:52 AM, Duncan Sands wrote: > > Hi Philip, > > > >> Does anyone have any suggestions on how to best represent an > >> assumption statement(*) in IR? > > > > good question! There have been various attempts, for example Nick > > tried teaching the optimizers to not prune the branch to > > unreachable in > > > > br %cond, label %assumption_holds, %assumption_doesnt_hold > > assumption_doesnt_hold: > > unreachable > > > > This then leads to %cond being replaced with true everywhere > > downstream, which > > is good. Unfortunately it also causes a bunch of other > > optimizations to not > > occur, and the overall result was not a win. > > > > Rafael added "range" metadata to the IR, however it is only for > > loads. If it > > could also be attached to the definition of %cond, saying that the > > value is 1, > > and various places are taught to understand that, then that might be > > another > > possibility. > It really seems to me that the core problem is that the assumption > branch is removed before the information it contained could be > leveraged to remove downstream code. If the "unreachable" is > replaced with a call to an external function marked noreturn (i.e. a > trap like function), the optimizer does correctly optimize downstream > code. This seems to be a classic ordering problem. > > Note: I haven't studied how the downstream code is optimized. It > seems odd to me that the information is lost when the branch is > lost. I may dig into what's causing the analysis information to be > dropped - there might simply be a bug somewhere along the way. > > Nick's approach seems to tackle the ordering problem by preventing > the optimization of the unreachable branch. As you pointed out, this > is undesirable since we don't want the code to calculate the > assumption condition to be retained. > > I'm not quiet clear on how Rafael's solution addresses the problem at > hand, so I won't address that. > > I've gone ahead and implemented my hack solution - use an external > function to force downstream optimization, replace it with > unreachable and then re-optimize. At least for the trivial examples > I'm testing with, this seems to get the desired behavior. The > downside of course is that it requires the all of the other passes to > be run twice.How about having an analysis pass that turns these things into SCEVs that can be associated as predicates with each downstream instruction? These can stick around so long as this analysis is not invalidated (this would imply some infrastructure work, but the work would be incremental in nature).> > If anyone is interested, I can throw the (nearly trivial) code up on > github for anyone else who wants it.Can't hurt :) -Hal> > > However I think people would rather only have such > > metadata be > > attached to "inputs" to a function: function parameters, loads from > > memory etc. > > In your case, is the assumption usually about a function parameter? > For the initial examples I'm considering - specifically object > invariants and function preconditions - all assumptions are about > function parameters. However, I can think of some examples where > having the ability to note an assumption about an arbitrary > intermediate value might be useful. If I can, I'd like to not > exclude that case. > > One such example (for a language with array bounds checks): > int n = external_func_compute_loop_bound(); > assume n < array.size(); > for(int i = 0; i < n; i++) { > process(arrray[i]); > } > > Philip > > *removed extra text for readability* > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev-- Hal Finkel Postdoctoral Appointee Leadership Computing Facility Argonne National Laboratory
Philip Reames
2012-Oct-02 22:01 UTC
[LLVMdev] How best to represent assume statements in LLVM IR?
On 10/02/2012 02:55 PM, Hal Finkel wrote:> On Tue, 02 Oct 2012 14:43:51 -0700 > Philip Reames <listmail at philipreames.com> wrote: > >> On 10/02/2012 12:52 AM, Duncan Sands wrote: >>> Hi Philip, >>> >>>> Does anyone have any suggestions on how to best represent an >>>> assumption statement(*) in IR? >>> >>> good question! There have been various attempts, for example Nick >>> tried teaching the optimizers to not prune the branch to >>> unreachable in >>> >>> br %cond, label %assumption_holds, %assumption_doesnt_hold >>> assumption_doesnt_hold: >>> unreachable >>> >>> This then leads to %cond being replaced with true everywhere >>> downstream, which >>> is good. Unfortunately it also causes a bunch of other >>> optimizations to not >>> occur, and the overall result was not a win. >>> >>> Rafael added "range" metadata to the IR, however it is only for >>> loads. If it >>> could also be attached to the definition of %cond, saying that the >>> value is 1, >>> and various places are taught to understand that, then that might be >>> another >>> possibility. >> It really seems to me that the core problem is that the assumption >> branch is removed before the information it contained could be >> leveraged to remove downstream code. If the "unreachable" is >> replaced with a call to an external function marked noreturn (i.e. a >> trap like function), the optimizer does correctly optimize downstream >> code. This seems to be a classic ordering problem. >> >> Note: I haven't studied how the downstream code is optimized. It >> seems odd to me that the information is lost when the branch is >> lost. I may dig into what's causing the analysis information to be >> dropped - there might simply be a bug somewhere along the way. >> >> Nick's approach seems to tackle the ordering problem by preventing >> the optimization of the unreachable branch. As you pointed out, this >> is undesirable since we don't want the code to calculate the >> assumption condition to be retained. >> >> I'm not quiet clear on how Rafael's solution addresses the problem at >> hand, so I won't address that. >> >> I've gone ahead and implemented my hack solution - use an external >> function to force downstream optimization, replace it with >> unreachable and then re-optimize. At least for the trivial examples >> I'm testing with, this seems to get the desired behavior. The >> downside of course is that it requires the all of the other passes to >> be run twice. > > How about having an analysis pass that turns these things into SCEVs > that can be associated as predicates with each downstream instruction? > These can stick around so long as this analysis is not invalidated > (this would imply some infrastructure work, but the work would be > incremental in nature).I like this idea and may pursue it. This is a side project and I've got a few things above this on the queue for that project, so it may be a bit. I'll let you know once I start looking at that aspect.> >> >> If anyone is interested, I can throw the (nearly trivial) code up on >> github for anyone else who wants it. > > Can't hurt :)Give me a day or two to get the code in a shape where I'm not ashamed to publicly attach my name to it. :) Philip> >> >> > However I think people would rather only have such >>> metadata be >>> attached to "inputs" to a function: function parameters, loads from >>> memory etc. >>> In your case, is the assumption usually about a function parameter? >> For the initial examples I'm considering - specifically object >> invariants and function preconditions - all assumptions are about >> function parameters. However, I can think of some examples where >> having the ability to note an assumption about an arbitrary >> intermediate value might be useful. If I can, I'd like to not >> exclude that case. >> >> One such example (for a language with array bounds checks): >> int n = external_func_compute_loop_bound(); >> assume n < array.size(); >> for(int i = 0; i < n; i++) { >> process(arrray[i]); >> } >> >> Philip >> >> *removed extra text for readability* >> _______________________________________________ >> LLVM Developers mailing list >> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu >> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev > > >
Possibly Parallel Threads
- [LLVMdev] How best to represent assume statements in LLVM IR?
- [LLVMdev] How best to represent assume statements in LLVM IR?
- [LLVMdev] How best to represent assume statements in LLVM IR?
- show 0 at y axis in xyplot lattice
- Σχετ: show 0 at y axis in xyplot lattice