Carlo Alberto Ferraris
2011-Jul-07 11:33 UTC
[LLVMdev] Missed optimization with indirectbr terminator
Il 07/07/2011 11:14, Cameron Zwarich ha scritto:> I haven't read the code in detail, but it looks like JumpThreading at > least attempts to thread across indirect branches. You can either try > to fix it or file a bug with your test case.In the source it says "If the predecessor is an indirect goto, we can't split the edge. <http://llvm.org/docs/doxygen/html/JumpThreading_8cpp_source.html#l00914>" Why would that be the case? -- Carlo Alberto Ferraris <cafxx at strayorange.com <mailto:cafxx at strayorange.com>> website/blog <http://cafxx.strayorange.com> - +39 333 7643 235 -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20110707/ccae877a/attachment.html> -------------- next part -------------- A non-text attachment was scrubbed... Name: cafxx.vcf Type: text/x-vcard Size: 233 bytes Desc: not available URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20110707/ccae877a/attachment.vcf>
John McCall
2011-Jul-07 17:41 UTC
[LLVMdev] Missed optimization with indirectbr terminator
On Jul 7, 2011, at 4:33 AM, Carlo Alberto Ferraris wrote:> Il 07/07/2011 11:14, Cameron Zwarich ha scritto: >> >> I haven't read the code in detail, but it looks like JumpThreading at least attempts to thread across indirect branches. You can either try to fix it or file a bug with your test case. > In the source it says "If the predecessor is an indirect goto, we can't split the edge." Why would that be the case?Splitting an edge creates a block which executes when leaving a specific block to go to a specific successor. The only way to split an indirect goto is to insert code before the jump which checks for a specific destination address. This is very likely to be a pessimization. To answer your original question, the current implementation design for indirect goto is intentionally based around having a single block that terminates in an indirectbr. Otherwise the CFG of a context-threaded interpreter loop becomes inherently O(N^2): there are N instruction labels, each of which has approximately N predecessors. We rely on a code-generation optimization, tail duplication, to actually restore the context-threading behavior. In short, don't look at the IR for whether context-threading is happening; look at the generated code. John. -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20110707/9958ba8d/attachment.html>
Carlo Alberto Ferraris
2011-Jul-08 05:15 UTC
[LLVMdev] Missed optimization with indirectbr terminator
Nella citazione giovedì 7 luglio 2011 19:41:16, John McCall ha scritto:> On Jul 7, 2011, at 4:33 AM, Carlo Alberto Ferraris wrote: >> Il 07/07/2011 11:14, Cameron Zwarich ha scritto: >>> I haven't read the code in detail, but it looks like JumpThreading at >>> least attempts to thread across indirect branches. You can either try >>> to fix it or file a bug with your test case. >> In the source it says "If the predecessor is an indirect goto, we >> can't split the edge. >> <http://llvm.org/docs/doxygen/html/JumpThreading_8cpp_source.html#l00914>" >> Why would that be the case? > > Splitting an edge creates a block which executes when leaving a > specific block to go to a specific successor. The only way to split an > indirect goto is to insert code before the jump which checks for a > specific destination address. This is very likely to be a pessimization.Do you mean like turning a indirectbr %addr, [%a, %b, ...] into a switch %addr, undef, [blockaddr(%a), %a], [blo ckaddr(%b), %b], ... ? (I know that the switch doesn't accept a pointer as first argument, it's just an example)> To answer your original question, the current implementation design > for indirect goto is intentionally based around having a single block > that terminates in an indirectbr. Otherwise the CFG of a > context-threaded interpreter loop becomes inherently O(N^2): there > are N instruction labels, each of which has approximately N > predecessors. We rely on a code-generation optimization, tail > duplication, to actually restore the context-threading behavior. > > In short, don't look at the IR for whether context-threading is > happening; look at the generated code.I'll try to inspect the assembler. Just a quick thought in the mean time, in the snippet I posted, if the backedge pointed directly to %19, other optimizations would likely notice that the loop could be removed entirely and replaced with a single addition. Do you think the code generator is able t o do this? -- Carlo Alberto Ferraris <cafxx at strayorange.com <mailto:cafxx at strayorange.com>> website/blog <http://cafxx.strayorange.com> - +39 333 7643 235 -------------- next part -------------- A non-text attachment was scrubbed... Name: cafxx.vcf Type: text/x-vcard Size: 233 bytes Desc: not available URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20110708/88dfa186/attachment.vcf>
Maybe Matching Threads
- [LLVMdev] Missed optimization with indirectbr terminator
- [LLVMdev] Missed optimization with indirectbr terminator
- [LLVMdev] Missed optimization with indirectbr terminator
- [LLVMdev] Missed optimization with indirectbr terminator
- [LLVMdev] Multiple successors, single dynamic successor