Philip Reames via llvm-dev
2018-May-16 21:49 UTC
[llvm-dev] RFC: separating guards and implicit control flow
All,

TLDR: guards currently require reasoning about implicit control flow. LLVM struggles with implicit control flow within a basic block. We should redesign guards to admit this rather than trying to boil the ocean.

As you may be aware, LLVM currently has experimental support for a construct called a "guard". A guard is like a branch to a potential deoptimization point, except that the guard is allowed to "spuriously" deoptimize even if the condition doesn't require it. This critical flexibility allows an earlier guard to be "widened" (i.e. made to fail more often) if doing so allows the elimination of a later guard or is otherwise useful to prune possible code paths. At this point, the basic notion of a guard is fairly well proven, with both critical transformations (LoopPredication, GuardWidening) available upstream and multiple downstream speculative optimizations built on top.

As we've explored the design, we've stumbled on an implementation challenge. Today, guards are modeled as calls, not invokes. As it turns out, LLVM is not terribly good at dealing with potential implicit exits within a basic block. At this point, thanks to a fuzzer and a lot of work, we've worked through most of the correctness bugs with this representation, but the optimization problem remains. Many of our optimization passes (e.g. SCEV, LICM, GVN to a degree, etc.) just give up when they encounter a potentially throwing call within a basic block. Unfortunately, this limitation turns out to be a fairly fundamental one, as efficient algorithms for such cases require being able to answer ordering questions about instructions, which our current linked-list implementation of a basic block makes prohibitively slow. (I'm skipping most of the details here; if you're curious, ask and I'll expand.)

My proposal is to seriously evaluate whether we should stop trying to fix LLVM's problems with implicit control flow and instead model guards as explicit control flow, since that's what all of our algorithms are tuned for. That is, I'm going to continue to investigate efficient handling for implicit control flow in parallel, but I'd like to change the intrinsics just enough to allow experimentation with an alternate, explicit-control-flow-based form at the same time.

Doing so requires a small change to the LangRef - we currently state guards can't be invoked. The most basic implementation patch is quite contained, but once we allow the form, we'll need to update many of the algorithms we've taught about implicit guards to consider a guard as a possible terminator. This doesn't look hard or ugly from the couple I've prototyped so far.

The advantage of this is that enumerating terminators within a CFG is a well-understood and fast operation over LLVM IR. This proposal does have the downside of losing some optimizations that are block local or specific to single (explicit) exit loops, but I've come to believe that's a worthwhile tradeoff. Also, teaching the optimizer to better optimize multiple-exit loops is generally useful to a much larger audience, and useful for JITted languages in particular, since not all of our control flow is represented by guards anyway.

Philip

p.s. It's worth noting that assumes have basically the same problem. If this approach works, we may want to consider changing the representation of assumes as well.
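[Editor's note: for readers less familiar with the intrinsic, here is a minimal sketch of what guards look like today and of the widening the RFC refers to. The @llvm.experimental.guard intrinsic and the "deopt" operand bundle are as documented in the LangRef; the function names, the particular bounds checks, and the exact widened form are illustrative assumptions, not code from the thread.]

  declare void @llvm.experimental.guard(i1, ...)

  ; Before widening: two guards, each a plain call carrying implicit
  ; control flow (either call may deoptimize and never return here).
  define void @before(i32 %x, i32 %len) {
  entry:
    %in.bounds.0 = icmp ult i32 %x, %len
    call void (i1, ...) @llvm.experimental.guard(i1 %in.bounds.0) [ "deopt"() ]
    %x.inc = add i32 %x, 1
    %in.bounds.1 = icmp ult i32 %x.inc, %len
    call void (i1, ...) @llvm.experimental.guard(i1 %in.bounds.1) [ "deopt"() ]
    ret void
  }

  ; After widening: because a guard may fail "spuriously", the first
  ; guard can be strengthened to check both conditions and the second
  ; guard eliminated.
  define void @after(i32 %x, i32 %len) {
  entry:
    %in.bounds.0 = icmp ult i32 %x, %len
    %x.inc = add i32 %x, 1
    %in.bounds.1 = icmp ult i32 %x.inc, %len
    %wide.cond = and i1 %in.bounds.0, %in.bounds.1
    call void (i1, ...) @llvm.experimental.guard(i1 %wide.cond) [ "deopt"() ]
    ret void
  }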
Hal Finkel via llvm-dev
2018-May-17 00:53 UTC
[llvm-dev] RFC: separating guards and implicit control flow
On 05/16/2018 04:49 PM, Philip Reames wrote:
> TLDR: guards currently require reasoning about implicit control flow.
> LLVM struggles with implicit control flow within a basic block. We
> should redesign guards to admit this rather than trying to boil the
> ocean.
> [...]

As you likely already know, I'm not a fan of having calls that might unwind, or otherwise have implicit edges, in the middle of basic blocks. I continue to believe that they should be terminators.

As such, while I realize that there are definite efficiency concerns about doing this in general (and, perhaps, a bunch of optimizations that would need to start looking across multiple blocks to maintain feature parity), I'm fully supportive of this change (I'm assuming that you've not found any serious compile-time implications in this regard).

 -Hal

-- 
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory
Reid Kleckner via llvm-dev
2018-May-17 17:59 UTC
[llvm-dev] RFC: separating guards and implicit control flow
On Wed, May 16, 2018 at 5:53 PM Hal Finkel via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> As you likely already know, I'm not a fan of having calls that might
> unwind, or otherwise have implicit edges, in the middle of basic blocks.
> I continue to believe that they should be terminators. As such, while I
> realize that there are definite efficiency concerns about doing this in
> general (and, perhaps, a bunch of optimizations that would need to start
> looking across multiple blocks to maintain feature parity), I'm fully
> supportive of this change (I'm assuming that you've not found any
> serious compile-time implications in this regard).

Without being on the hook to do any of the work or hard thinking, I can say I agree. :)

We might be able to paper over a lot of these concerns about missed optimizations with some new ExtendedBasicBlock helpers that iterate instructions across the single-successor, single-predecessor BB edges created by potentially throwing calls and trivially simplifiable unconditional branches.

We can reuse the "unwind to caller" spelling that we already use in WinEH, so potentially throwing calls with no landingpad would look like:

  invoke void @foo()
          to label %normal unwind to caller

We could either null out the unwind label operand of today's InvokeInst or find a way to avoid allocating it in the first place. That would require a fair amount of rework, though. I'd probably just null it out for now.
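[Editor's note: to make that spelling concrete, here is a sketch of what a guard might look like in the explicit-control-flow form under discussion, reusing the "unwind to caller" spelling Reid mentions. This is not valid IR today - the LangRef currently forbids invoking @llvm.experimental.guard, and "unwind to caller" is not an accepted spelling for invoke - so the block names and exact syntax are illustrative assumptions, not a settled design.]

  declare void @llvm.experimental.guard(i1, ...)

  define void @guarded.example(i32 %x, i32 %len) {
  entry:
    %in.bounds = icmp ult i32 %x, %len
    ; The guard now terminates the block; the deoptimization path is
    ; carried by the intrinsic and its "deopt" bundle rather than by a
    ; landingpad, so there is no explicit unwind destination.
    invoke void (i1, ...) @llvm.experimental.guard(i1 %in.bounds) [ "deopt"() ]
            to label %guarded unwind to caller

  guarded:
    ; Code here may assume %x u< %len.
    ret void
  }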