Carlo Alberto Ferraris
2011-Jul-08 05:15 UTC
[LLVMdev] Missed optimization with indirectbr terminator
Nella citazione giovedì 7 luglio 2011 19:41:16, John McCall ha scritto:> On Jul 7, 2011, at 4:33 AM, Carlo Alberto Ferraris wrote: >> Il 07/07/2011 11:14, Cameron Zwarich ha scritto: >>> I haven't read the code in detail, but it looks like JumpThreading at >>> least attempts to thread across indirect branches. You can either try >>> to fix it or file a bug with your test case. >> In the source it says "If the predecessor is an indirect goto, we >> can't split the edge. >> <http://llvm.org/docs/doxygen/html/JumpThreading_8cpp_source.html#l00914>" >> Why would that be the case? > > Splitting an edge creates a block which executes when leaving a > specific block to go to a specific successor. The only way to split an > indirect goto is to insert code before the jump which checks for a > specific destination address. This is very likely to be a pessimization.Do you mean like turning a indirectbr %addr, [%a, %b, ...] into a switch %addr, undef, [blockaddr(%a), %a], [blo ckaddr(%b), %b], ... ? (I know that the switch doesn't accept a pointer as first argument, it's just an example)> To answer your original question, the current implementation design > for indirect goto is intentionally based around having a single block > that terminates in an indirectbr. Otherwise the CFG of a > context-threaded interpreter loop becomes inherently O(N^2): there > are N instruction labels, each of which has approximately N > predecessors. We rely on a code-generation optimization, tail > duplication, to actually restore the context-threading behavior. > > In short, don't look at the IR for whether context-threading is > happening; look at the generated code.I'll try to inspect the assembler. Just a quick thought in the mean time, in the snippet I posted, if the backedge pointed directly to %19, other optimizations would likely notice that the loop could be removed entirely and replaced with a single addition. Do you think the code generator is able t o do this? -- Carlo Alberto Ferraris <cafxx at strayorange.com <mailto:cafxx at strayorange.com>> website/blog <http://cafxx.strayorange.com> - +39 333 7643 235 -------------- next part -------------- A non-text attachment was scrubbed... Name: cafxx.vcf Type: text/x-vcard Size: 233 bytes Desc: not available URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20110708/88dfa186/attachment.vcf>
John McCall
2011-Jul-08 07:05 UTC
[LLVMdev] Missed optimization with indirectbr terminator
On Jul 7, 2011, at 10:15 PM, Carlo Alberto Ferraris wrote:> Nella citazione giovedì 7 luglio 2011 19:41:16, John McCall ha scritto: >> On Jul 7, 2011, at 4:33 AM, Carlo Alberto Ferraris wrote: >>> Il 07/07/2011 11:14, Cameron Zwarich ha scritto: >>>> I haven't read the code in detail, but it looks like JumpThreading at least attempts to thread across indirect branches. You can either try to fix it or file a bug with your test case. >>> In the source it says "If the predecessor is an indirect goto, we can't split the edge. <http://llvm.org/docs/doxygen/html/JumpThreading_8cpp_source.html#l00914>" Why would that be the case? >> Splitting an edge creates a block which executes when leaving a >> specific block to go to a specific successor. The only way to split an >> indirect goto is to insert code before the jump which checks for a >> specific destination address. This is very likely to be a pessimization. > Do you mean like turning a indirectbr %addr, [%a, %b, ...] > into a switch %addr, undef, [blockaddr(%a), %a], [blo > ckaddr(%b), %b], ... > ? (I know that the switch doesn't accept a pointer as first argument, it's just an example)That difference is relevant, though, because there's no way to implement such a switch except as a series of pointer comparisons (or at least offset comparisons). But I was approaching it from the perspective of trying to split a single edge, rather than changing the entire terminator; there's no natural way to do that with an indirect branch, so you have to do explicit pointer comparisons before the goto. Doing even one or two of those could easily kill a lot of the benefit of the indirect branch. Basically, generic optimizations should not be in the business of trying to split indirect branch edges, which is why the generic splitting routines fail on them.> I'll try to inspect the assembler. Just a quick thought in the mean time, in the snippet I posted, if the backedge pointed directly to %19, other optimizations would likely notice that the loop could be removed entirely and replaced with a single addition. Do you think the code generator is able to do this?I don't think anyone's taught the loop optimizations to work over indirect branches, but it's not impossible. This specific case would be better fixed by teaching some existing pass, maybe JumpThreading, that an indirectbr on a phi with constant blockaddr operands can be usefully optimized. John.
Cameron Zwarich
2011-Jul-08 07:21 UTC
[LLVMdev] Missed optimization with indirectbr terminator
On Jul 7, 2011, at 10:15 PM, Carlo Alberto Ferraris wrote:> I'll try to inspect the assembler. Just a quick thought in the mean time, in the snippet I posted, if the backedge pointed directly to %19, other optimizations would likely notice that the loop could be removed entirely and replaced with a single addition. Do you think the code generator is able t > o do this?Why would you write a loop with an indirect branch where the loop can be deleted? Cameron
Frits van Bommel
2011-Jul-08 08:09 UTC
[LLVMdev] Missed optimization with indirectbr terminator
On 8 July 2011 09:05, John McCall <rjmccall at apple.com> wrote:> On Jul 7, 2011, at 10:15 PM, Carlo Alberto Ferraris wrote: >> I'll try to inspect the assembler. Just a quick thought in the mean time, in the snippet I posted, if the backedge pointed directly to %19, other optimizations would likely notice that the loop could be removed entirely and replaced with a single addition. Do you think the code generator is able to do this? > > I don't think anyone's taught the loop optimizations to work > over indirect branches, but it's not impossible. This specific > case would be better fixed by teaching some existing pass, > maybe JumpThreading, that an indirectbr on a phi with constant > blockaddr operands can be usefully optimized.Actually, jumpthreading already knows about indirectbr. The problem is that JumpThreading refuses to thread across loop headers (that is, the target blocks of backedges). It should have the same problem with a phi of booleans and a conditional branch instead of a phi of pointers and an indirectbr. The comment above JumpThreading::FindLoopHeaders() explains the current situation: /// FindLoopHeaders - We do not want jump threading to turn proper loop /// structures into irreducible loops. Doing this breaks up the loop nesting /// hierarchy and pessimizes later transformations. To prevent this from /// happening, we first have to find the loop headers. Here we approximate this /// by finding targets of backedges in the CFG. /// /// Note that there definitely are cases when we want to allow threading of /// edges across a loop header. For example, threading a jump from outside the /// loop (the preheader) to an exit block of the loop is definitely profitable. /// It is also almost always profitable to thread backedges from within the loop /// to exit blocks, and is often profitable to thread backedges to other blocks /// within the loop (forming a nested loop). This simple analysis is not rich /// enough to track all of these properties and keep it up-to-date as the CFG /// mutates, so we don't allow any of these transformations. Note that the case discussed in this thread (a backedge to a block within the loop) is mentioned as "often profitable", but that JumpThreading simply doesn't know enough about loops to recognize it. There are several ways to solve this, as I see it: 1) teach jumpthreading about loops, 2) teach some other pass that knows about loops how to thread backedges, or 3) introduce a whole new pass just for this purpose.
Carlo Alberto Ferraris
2011-Jul-08 14:38 UTC
[LLVMdev] Missed optimization with indirectbr terminator
Il 08/07/2011 09:21, Cameron Zwarich ha scritto:> On Jul 7, 2011, at 10:15 PM, Carlo Alberto Ferraris wrote: > >> I'll try to inspect the assembler. Just a quick thought in the mean time, in the snippet I posted, if the backedge pointed directly to %19, other optimizations would likely notice that the loop could be removed entirely and replaced with a single addition. Do you think the code generator is able t >> o do this? > Why would you write a loop with an indirect branch where the loop can be deleted?I'm working on an IPO that uses indirectbrs. The loop was part of the code being compiled. -- Carlo Alberto Ferraris <cafxx at strayorange.com <mailto:cafxx at strayorange.com>> website/blog <http://cafxx.strayorange.com> - +39 333 7643 235 -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20110708/74b37c7c/attachment.html> -------------- next part -------------- A non-text attachment was scrubbed... Name: cafxx.vcf Type: text/x-vcard Size: 233 bytes Desc: not available URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20110708/74b37c7c/attachment.vcf>
Maybe Matching Threads
- [LLVMdev] Missed optimization with indirectbr terminator
- [LLVMdev] Missed optimization with indirectbr terminator
- [LLVMdev] Missed optimization with indirectbr terminator
- [LLVMdev] Missed optimization with indirectbr terminator
- [LLVMdev] Missed optimization with indirectbr terminator