Maurice Marks
2013-Oct-30 21:05 UTC
[LLVMdev] Optimization bug - spurious shift in partial word test
In the situation where a partial word is tested, let's say for > 0, by shifting
left to move its sign bit into the msb and then testing, llvm inserts a
spurious right shift instruction.
For example this IR:
...
%0 = load i64* %a.addr, align 8
%shl = shl i64 %0, 28
%cmp = icmp sgt i64 %shl, 0
...
results in
...
shlq $28, %rdi
sarq $28, %rdi ; <<< spurious shift
testq %rdi, %rdi
gcc doesn't have this problem. It just emits the shift and test.
The reason appears to be that the instruction combining pass decides that
the shift and test are equivalent to a test on the partial word, in this
case an i36.
From the -debug log:
....
INSTCOMBINE ITERATION #0 on testit
IC: ADDING: 10 instrs to worklist
IC: Visiting: %shl = shl i64 %a, 28
IC: Visiting: %cmp = icmp sgt i64 %shl, 0
IC: ADD: %0 = trunc i64 %a to i36
IC: Old = %cmp = icmp sgt i64 %shl, 0
New = <badref> = icmp sgt i36 %0, 0
IC: ADD: %cmp = icmp sgt i36 %0, 0
IC: ERASE %1 = icmp sgt i64 %shl, 0
IC: ADD: %shl = shl i64 %a, 28
IC: DCE: %shl = shl i64 %a, 28
--- etc
So apparently the extra shift is inserted to restore the i36, although it
is never referenced again.
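As a rough sketch of why the two forms agree and where a shl/sar pair could come from: the code below compares the original shift-and-test with a signed test on the low 36 bits. The helper names (sext36, test_shifted, test_narrow, check_equivalent) are made up for illustration. Treating the sarq as the sign extension of an i36 held in a 64-bit register is an assumption based on the log above, not something confirmed in this thread.

```c
#include <assert.h>
#include <stdint.h>

/* Original form: move bits 35..0 up to bits 63..28, then do a signed test. */
int test_shifted(uint64_t a) {
    return (int64_t)(a << 28) > 0;
}

/* Hypothetical helper: sign-extend the low 36 bits of a to 64 bits.
 * This computes the same value as a shl-by-28 followed by a sar-by-28. */
int64_t sext36(uint64_t a) {
    uint64_t low = a & ((UINT64_C(1) << 36) - 1);  /* keep bits 35..0 */
    uint64_t sign = UINT64_C(1) << 35;             /* sign bit of an i36 */
    return (int64_t)((low ^ sign) - sign);
}

/* Narrowed form: signed test on the i36 value, widened back to 64 bits. */
int test_narrow(uint64_t a) {
    return sext36(a) > 0;
}

/* Both forms should agree for every input. */
void check_equivalent(uint64_t a) {
    assert(test_shifted(a) == test_narrow(a));
}
```

Both tests are true exactly when bit 35 of a is clear and at least one of bits 34..0 is set, so the rewrite itself is sound. The cost only shows up if the narrowed compare has to be widened again, because the shift in test_shifted already puts the i36 sign bit where testq can see it.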
Here is a little C test program. Try it at -O3 in both gcc and clang and
you will see the problem:
#include <stdint.h>
uint8_t testit(uint64_t a) {
return ((int64_t) (a << 28) > 0) ;
}
Rafael Espíndola
2013-Nov-01 21:40 UTC
[LLVMdev] Optimization bug - spurious shift in partial word test
Could you report a bug in llvm.org/bug?