Andrea Canciani via llvm-dev
2015-Sep-30 09:42 UTC
[llvm-dev] Optimizing jumps to identical code blocks
Rust pattern matching code can sometimes generate very redundant LLVM IR, in which several branches contain exactly the same code. LLVM successfully unifies that, but the dispatching mechanism does not simplify accordingly. I created a gist here: https://gist.github.com/ranma42/d2e6d50999e801ffd4ed (based on two examples available in Rust issues: https://github.com/rust-lang/rust/pull/24270#issuecomment-136681741 https://github.com/rust-lang/rust/issues/13623#issuecomment-136700526 ) In "enum4.s" cmpl $1, %eax je LBB0_5 cmpl $2, %eax je LBB0_5 cmpl $3, %eax LBB0_5: could be removed. (Further optimization would be possible by observing that the two 32-bit comparison could be unified into a single 64-bit comparison, but I believe this is a different issue) In "enum6.s" all of the elements of the jump table point to the same label (which is also right after the jump to the table dispatch code), so the jump table is actually useless. Is there any combination of passes that should make it possible for LLVM to optimize this? Would outline + merge functions be a step in the right direction? I googled for a similar thread in the mailing list and I found http://lists.llvm.org/pipermail/llvm-dev/2010-May/031629.html Unfortunately the thread does not provide a way to remove identical blocks in LLVM IR. Instead it suggests that this is only possible by relying on target-dependent optimizations. Does this mean that the best approach would be to make the target-dependent passes also optimize away jumps (when possible)? I am somewhat afraid that this would prevent further optimizations that might be exposed by the removal of identical blocks (like the 32+32 -> 64 coalescing I mentioned before). Andrea -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20150930/384f1750/attachment.html>
Philip Reames via llvm-dev
2015-Oct-01 22:06 UTC
[llvm-dev] Optimizing jumps to identical code blocks
This looks like at least one missing case in SimplifyCFG most likely. I suspect that our tail commoning code is being restricted to two incoming blocks to a phi, but that's just a guess. Can you file a bug with the IR from your gist attached? Philip On 09/30/2015 02:42 AM, Andrea Canciani via llvm-dev wrote:> Rust pattern matching code can sometimes generate very redundant LLVM > IR, in which several branches contain exactly the same code. > LLVM successfully unifies that, but the dispatching mechanism does not > simplify accordingly. > > I created a gist here: > https://gist.github.com/ranma42/d2e6d50999e801ffd4ed > (based on two examples available in Rust issues: > https://github.com/rust-lang/rust/pull/24270#issuecomment-136681741 > https://github.com/rust-lang/rust/issues/13623#issuecomment-136700526 ) > > In "enum4.s" > > cmpl $1, %eax > je LBB0_5 > cmpl $2, %eax > je LBB0_5 > cmpl $3, %eax > LBB0_5: > > could be removed. > (Further optimization would be possible by observing that the two > 32-bit comparison could be unified into a single 64-bit comparison, > but I believe this is a different issue) > > In "enum6.s" all of the elements of the jump table point to the same > label (which is also right after the jump to the table dispatch code), > so the jump table is actually useless. > > Is there any combination of passes that should make it possible for > LLVM to optimize this? > Would outline + merge functions be a step in the right direction? > I googled for a similar thread in the mailing list and I found > http://lists.llvm.org/pipermail/llvm-dev/2010-May/031629.html > Unfortunately the thread does not provide a way to remove identical > blocks in LLVM IR. Instead it suggests that this is only possible by > relying on target-dependent optimizations. Does this mean that the > best approach would be to make the target-dependent passes also > optimize away jumps (when possible)? > I am somewhat afraid that this would prevent further optimizations > that might be exposed by the removal of identical blocks (like the > 32+32 -> 64 coalescing I mentioned before). > > Andrea > > > _______________________________________________ > LLVM Developers mailing list > llvm-dev at lists.llvm.org > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20151001/aec0d08d/attachment-0001.html>
Andrea Canciani via llvm-dev
2015-Dec-31 13:51 UTC
[llvm-dev] Optimizing jumps to identical code blocks
On Thu, Oct 1, 2015 at 11:06 PM, Philip Reames <listmail at philipreames.com> wrote:> This looks like at least one missing case in SimplifyCFG most likely. I > suspect that our tail commoning code is being restricted to two incoming > blocks to a phi, but that's just a guess. Can you file a bug with the IR > from your gist attached? >Sorry, I forgot to reply to the list. I eventually found a bugreport which should be relevant: https://llvm.org/bugs/show_bug.cgi?id=24220 This other bug might be somewhat related https://llvm.org/bugs/show_bug.cgi?id=2056 AFAICT there is also some work on this here http://reviews.llvm.org/D14122 (but I did not understand if the patch has been abandoned or if it is stuck for some other reason). Andrea> > Philip > > > On 09/30/2015 02:42 AM, Andrea Canciani via llvm-dev wrote: > > Rust pattern matching code can sometimes generate very redundant LLVM IR, > in which several branches contain exactly the same code. > LLVM successfully unifies that, but the dispatching mechanism does not > simplify accordingly. > > I created a gist here: > https://gist.github.com/ranma42/d2e6d50999e801ffd4ed > (based on two examples available in Rust issues: > <https://github.com/rust-lang/rust/pull/24270#issuecomment-136681741> > https://github.com/rust-lang/rust/pull/24270#issuecomment-136681741 > https://github.com/rust-lang/rust/issues/13623#issuecomment-136700526 ) > > In "enum4.s" > > cmpl $1, %eax > je LBB0_5 > cmpl $2, %eax > je LBB0_5 > cmpl $3, %eax > LBB0_5: > > could be removed. > (Further optimization would be possible by observing that the two 32-bit > comparison could be unified into a single 64-bit comparison, but I believe > this is a different issue) > > In "enum6.s" all of the elements of the jump table point to the same label > (which is also right after the jump to the table dispatch code), so the > jump table is actually useless. > > Is there any combination of passes that should make it possible for LLVM > to optimize this? > Would outline + merge functions be a step in the right direction? > I googled for a similar thread in the mailing list and I found > http://lists.llvm.org/pipermail/llvm-dev/2010-May/031629.html > Unfortunately the thread does not provide a way to remove identical blocks > in LLVM IR. Instead it suggests that this is only possible by relying on > target-dependent optimizations. Does this mean that the best approach would > be to make the target-dependent passes also optimize away jumps (when > possible)? > I am somewhat afraid that this would prevent further optimizations that > might be exposed by the removal of identical blocks (like the 32+32 -> 64 > coalescing I mentioned before). > > Andrea > > > _______________________________________________ > LLVM Developers mailing listllvm-dev at lists.llvm.orghttp://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev > > >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20151231/9a14c4a4/attachment.html>
Reasonably Related Threads
- [RFC] New pass: LoopExitValues
- [hexagon][PowerPC] code regression (sub-optimal code) on LLVM 9 when generating hardware loops, and the "llvm.uadd" intrinsic.
- [LLVMdev] Need a clue to improve the optimization of some C code
- Nowaday Scalar Evolution's Problem.
- [RFC] New pass: LoopExitValues