thr3ads.net - llvm dev - [llvm-dev] InstCombine: Narrow switch instructions using known bits [Sep 2018]

If this information is useful, please help other people find it:
Share via:

Bakhvalov, Denis via llvm-dev

2018-Sep-21 14:01 UTC

[llvm-dev] InstCombine: Narrow switch instructions using known bits

Hello,

I found one case where narrowing switch instructions transformation in
InstCombine produces worse code.
Let's suppose that I have such code:

$ cat a.c
void foo();
void bar();
void zoo();

void my_func(unsigned int a) {
    unsigned char b = a & 0xF;
    switch (b) {
        case 0:  foo(); break;
        case 1:  bar(); break;
        case 2:  foo(); break;
        case 3:  foo(); break;
        case 4:  foo(); break;
        case 5:  bar(); break;
        case 6:  foo(); break;
        case 7:  foo(); break;
        case 8:  bar(); break;
        case 9:  foo(); break;
        case 10: foo(); break;

        default: zoo();
    }
}

Using recent clang:

$ clang -O3 -S -c a.c -o a.s
I have the following assembly in the beginning of my_func:
# bad case
        movl    %edi, %eax
        andb    $15, %al
        cmpb    $10, %al
        ja      .LBB0_9                           # jump to the default case

        andl    $15, %edi
        jmpq    *.LJTI0_0(,%rdi,8)      # go to jump table

I found that if I disable switch shrinking like shown below:

$ git diff
diff --git a/lib/Transforms/InstCombine/InstructionCombining.cpp
b/lib/Transforms/InstCombine/InstructionCombining.cpp
index 5d5a9b2..3682b88 100644
--- a/lib/Transforms/InstCombine/InstructionCombining.cpp
+++ b/lib/Transforms/InstCombine/InstructionCombining.cpp
@@ -2429,6 +2429,8 @@ Instruction *InstCombiner::visitSwitchInst(SwitchInst
&SI) {
     return &SI;
   }

+  return nullptr;
+
   KnownBits Known = computeKnownBits(Cond, 0, &SI);
   unsigned LeadingKnownZeros = Known.countMinLeadingZeros();
   unsigned LeadingKnownOnes = Known.countMinLeadingOnes();

I get better assembly (there is no additional MOV and AND instructions):
# good case
        andl    $15, %edi
        cmpl    $10, %edi
        ja      .LBB0_9

        jmpq    *.LJTI0_0(,%rdi,8)

This transformation was introduced in the commit:
commit 4eb03123dfda2de88a84852834845678833c8c36
Author: Akira Hatanaka <ahatanaka at apple.com<mailto:ahatanaka at
apple.com>>
Date:   Thu Oct 16 06:00:46 2014 +0000
    Reapply r219832 - InstCombine: Narrow switch instructions using known bits.
>From IR point of view, after the 'opt -codegenprepare' the
difference is that in good case we have simple 'and' operation (all
calculations are made in i32). In the bad case there is 'trunc' to i4
and then 'zext' to i8.
During instruction selection we expand switch into a jump table. In the bad case
we use 2 copies of the value that we are switching on. First is in i8 that we
use to determine whether we should jump to default case. The second is in i64,
which we use for calculating address in the jump table. In the good case they
were combined.

But there is still one thing that I don't understand. What is the bad case
that this transformation (narrowing switch instructions) was supposed to fix,
i.e. does this transformation still make sense?

Best regards,
Denis Bakhvalov.

-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20180921/03aef177/attachment.html>

llvm dev - Sep 2018 - InstCombine: Narrow switch instructions using known bits

[llvm-dev] InstCombine: Narrow switch instructions using known bits