thr3ads.net - llvm dev - [llvm-dev] Optimizing jumps to identical code blocks [Sep 2015]

If this information is useful, please help other people find it:
Share via:

Andrea Canciani via llvm-dev

2015-Sep-30 09:42 UTC

[llvm-dev] Optimizing jumps to identical code blocks

Rust pattern matching code can sometimes generate very redundant LLVM IR,
in which several branches contain exactly the same code.
LLVM successfully unifies that, but the dispatching mechanism does not
simplify accordingly.

I created a gist here:
https://gist.github.com/ranma42/d2e6d50999e801ffd4ed
(based on two examples available in Rust issues:
https://github.com/rust-lang/rust/pull/24270#issuecomment-136681741
https://github.com/rust-lang/rust/issues/13623#issuecomment-136700526 )

In "enum4.s"

    cmpl    $1, %eax
    je    LBB0_5
    cmpl    $2, %eax
    je    LBB0_5
    cmpl    $3, %eax
LBB0_5:

could be removed.
(Further optimization would be possible by observing that the two 32-bit
comparison could be unified into a single 64-bit comparison, but I believe
this is a different issue)

In "enum6.s" all of the elements of the jump table point to the same
label
(which is also right after the jump to the table dispatch code), so the
jump table is actually useless.

Is there any combination of passes that should make it possible for LLVM to
optimize this?
Would outline + merge functions be a step in the right direction?
I googled for a similar thread in the mailing list and I found
http://lists.llvm.org/pipermail/llvm-dev/2010-May/031629.html
Unfortunately the thread does not provide a way to remove identical blocks
in LLVM IR. Instead it suggests that this is only possible by relying on
target-dependent optimizations. Does this mean that the best approach would
be to make the target-dependent passes also optimize away jumps (when
possible)?
I am somewhat afraid that this would prevent further optimizations that
might be exposed by the removal of identical blocks (like the 32+32 -> 64
coalescing I mentioned before).

Andrea
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20150930/384f1750/attachment.html>

Philip Reames via llvm-dev

2015-Oct-01 22:06 UTC

head link

[llvm-dev] Optimizing jumps to identical code blocks

This looks like at least one missing case in SimplifyCFG most likely.  I 
suspect that our tail commoning code is being restricted to two incoming 
blocks to a phi, but that's just a guess.  Can you file a bug with the 
IR from your gist attached?

Philip

On 09/30/2015 02:42 AM, Andrea Canciani via llvm-dev
wrote:> Rust pattern matching code can sometimes generate very redundant LLVM 
> IR, in which several branches contain exactly the same code.
> LLVM successfully unifies that, but the dispatching mechanism does not 
> simplify accordingly.
>
> I created a gist here:
> https://gist.github.com/ranma42/d2e6d50999e801ffd4ed
> (based on two examples available in Rust issues: 
> https://github.com/rust-lang/rust/pull/24270#issuecomment-136681741
> https://github.com/rust-lang/rust/issues/13623#issuecomment-136700526 )
>
> In "enum4.s"
>
>     cmpl    $1, %eax
>     je    LBB0_5
>     cmpl    $2, %eax
>     je    LBB0_5
>     cmpl    $3, %eax
> LBB0_5:
>
> could be removed.
> (Further optimization would be possible by observing that the two 
> 32-bit comparison could be unified into a single 64-bit comparison, 
> but I believe this is a different issue)
>
> In "enum6.s" all of the elements of the jump table point to the
same
> label (which is also right after the jump to the table dispatch code), 
> so the jump table is actually useless.
>
> Is there any combination of passes that should make it possible for 
> LLVM to optimize this?
> Would outline + merge functions be a step in the right direction?
> I googled for a similar thread in the mailing list and I found 
> http://lists.llvm.org/pipermail/llvm-dev/2010-May/031629.html
> Unfortunately the thread does not provide a way to remove identical 
> blocks in LLVM IR. Instead it suggests that this is only possible by 
> relying on target-dependent optimizations. Does this mean that the 
> best approach would be to make the target-dependent passes also 
> optimize away jumps (when possible)?
> I am somewhat afraid that this would prevent further optimizations 
> that might be exposed by the removal of identical blocks (like the 
> 32+32 -> 64 coalescing I mentioned before).
>
> Andrea
>
>
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20151001/aec0d08d/attachment-0001.html>

Andrea Canciani via llvm-dev

2015-Dec-31 13:51 UTC

head link

[llvm-dev] Optimizing jumps to identical code blocks

On Thu, Oct 1, 2015 at 11:06 PM, Philip Reames <listmail at
philipreames.com>
wrote:
> This looks like at least one missing case in SimplifyCFG most likely.  I
> suspect that our tail commoning code is being restricted to two incoming
> blocks to a phi, but that's just a guess.  Can you file a bug with the
IR
> from your gist attached?
>
Sorry, I forgot to reply to the list. I eventually found a bugreport which
should be relevant: https://llvm.org/bugs/show_bug.cgi?id=24220
This other bug might be somewhat related
https://llvm.org/bugs/show_bug.cgi?id=2056
AFAICT there is also some work on this here http://reviews.llvm.org/D14122
(but I did not understand if the patch has been abandoned or if it is stuck
for some other reason).

Andrea

>
> Philip
>
>
> On 09/30/2015 02:42 AM, Andrea Canciani via llvm-dev wrote:
>
> Rust pattern matching code can sometimes generate very redundant LLVM IR,
> in which several branches contain exactly the same code.
> LLVM successfully unifies that, but the dispatching mechanism does not
> simplify accordingly.
>
> I created a gist here:
> https://gist.github.com/ranma42/d2e6d50999e801ffd4ed
> (based on two examples available in Rust issues:
> <https://github.com/rust-lang/rust/pull/24270#issuecomment-136681741>
> https://github.com/rust-lang/rust/pull/24270#issuecomment-136681741
> https://github.com/rust-lang/rust/issues/13623#issuecomment-136700526 )
>
> In "enum4.s"
>
>     cmpl    $1, %eax
>     je    LBB0_5
>     cmpl    $2, %eax
>     je    LBB0_5
>     cmpl    $3, %eax
> LBB0_5:
>
> could be removed.
> (Further optimization would be possible by observing that the two 32-bit
> comparison could be unified into a single 64-bit comparison, but I believe
> this is a different issue)
>
> In "enum6.s" all of the elements of the jump table point to the
same label
> (which is also right after the jump to the table dispatch code), so the
> jump table is actually useless.
>
> Is there any combination of passes that should make it possible for LLVM
> to optimize this?
> Would outline + merge functions be a step in the right direction?
> I googled for a similar thread in the mailing list and I found
> http://lists.llvm.org/pipermail/llvm-dev/2010-May/031629.html
> Unfortunately the thread does not provide a way to remove identical blocks
> in LLVM IR. Instead it suggests that this is only possible by relying on
> target-dependent optimizations. Does this mean that the best approach would
> be to make the target-dependent passes also optimize away jumps (when
> possible)?
> I am somewhat afraid that this would prevent further optimizations that
> might be exposed by the removal of identical blocks (like the 32+32 ->
64
> coalescing I mentioned before).
>
> Andrea
>
>
> _______________________________________________
> LLVM Developers mailing listllvm-dev at
lists.llvm.orghttp://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>
>
>-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20151231/9a14c4a4/attachment.html>

Reasonably Related Threads

Search for more reasonably related threads

llvm dev - Sep 2015 - Optimizing jumps to identical code blocks

[llvm-dev] Optimizing jumps to identical code blocks

[llvm-dev] Optimizing jumps to identical code blocks

[llvm-dev] Optimizing jumps to identical code blocks

Reasonably Related Threads