David Jones via llvm-dev
2019-Feb-21 18:43 UTC
[llvm-dev] select instruction and jump threading
Given this IR:

    %4 = getelementptr inbounds %LMtop.IRType, %LMtop.IRType* %3, i32 0, i32 8
    %5 = getelementptr inbounds { i8, i8 }, { i8, i8 }* %4, i32 0, i32 0
    %6 = getelementptr inbounds { i8, i8 }, { i8, i8 }* %4, i32 0, i32 1
    %7 = load i8, i8* %5, align 8
    %8 = load i8, i8* %6, align 1
    %9 = getelementptr inbounds %LMtop.IRType, %LMtop.IRType* %3, i32 0, i32 9
    %10 = getelementptr inbounds { i8, i8 }, { i8, i8 }* %9, i32 0, i32 0
    %11 = getelementptr inbounds { i8, i8 }, { i8, i8 }* %9, i32 0, i32 1
    %12 = load i8, i8* %10, align 2
    %13 = load i8, i8* %11, align 1
    %14 = getelementptr inbounds %LMtop.IRType, %LMtop.IRType* %3, i32 0, i32 10
    %15 = getelementptr inbounds { i8, i8 }, { i8, i8 }* %14, i32 0, i32 0
    %16 = getelementptr inbounds { i8, i8 }, { i8, i8 }* %14, i32 0, i32 1
    %17 = load i8, i8* %15, align 4
    %18 = load i8, i8* %16, align 1
    %19 = or i8 %13, %18
    %20 = add i8 %12, %17
    %21 = or i8 %8, %19
    %22 = add i8 %7, %20
    %23 = icmp ne i8 %21, 0
    %24 = select i1 %23, i8 -1, i8 %22
    %25 = select i1 %23, i8 -1, i8 0
    %26 = and i8 %24, 15
    %27 = and i8 %25, 15
    %28 = getelementptr inbounds %LMtop.IRType, %LMtop.IRType* %3, i32 0, i32 11
    %29 = getelementptr inbounds { i8, i8 }, { i8, i8 }* %28, i32 0, i32 0
    %30 = getelementptr inbounds { i8, i8 }, { i8, i8 }* %28, i32 0, i32 1
    store i8 %26, i8* %29, align 2
    store i8 %27, i8* %30, align 1

LLVM 7.0.0 produces the following x86_64 code:

    86c8:  8a 88 02 01 00 00      mov    0x102(%rax),%cl
    86ce:  8a 90 03 01 00 00      mov    0x103(%rax),%dl
    86d4:  0a 90 01 01 00 00      or     0x101(%rax),%dl
    86da:  02 88 00 01 00 00      add    0x100(%rax),%cl
    86e0:  02 88 04 01 00 00      add    0x104(%rax),%cl
    86e6:  80 e1 0f               and    $0xf,%cl
    86e9:  0a 90 05 01 00 00      or     0x105(%rax),%dl
    86ef:  40 b6 0f               mov    $0xf,%sil
    86f2:  b2 0f                  mov    $0xf,%dl
    86f4:  75 02                  jne    86f8 <LMtop.W0+0x38>
    86f6:  89 ca                  mov    %ecx,%edx
    86f8:  75 02                  jne    86fc <LMtop.W0+0x3c>
    86fa:  31 f6                  xor    %esi,%esi
    86fc:  88 90 06 01 00 00      mov    %dl,0x106(%rax)
    8702:  40 88 b0 07 01 00 00   mov    %sil,0x107(%rax)

Note how the branch at 0x86f4 targets another branch on the same condition. The flags do not change between the two, so the outcome of the branch at 0x86f8 is already decided by the branch at 0x86f4: it is always taken when reached through the jump, and never taken when reached by fall-through. The branch at 0x86f4 ought to target 0x86fc directly, at which point the branch at 0x86f8 can never be taken and ought to be removed.

1. Is this a known issue? I couldn't find anything obvious in Bugzilla.

2. Would it be better for me to use branches and phi whenever I knowingly want to do more than one select with the exact same condition?
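
For concreteness, the branch-and-phi form asked about in question 2 might look like the sketch below. It is only an illustration under stated assumptions: the function name @pack_branch_phi and the parameters %flags and %sum are hypothetical stand-ins for the values called %21 and %22 in the IR above, and the two stored bytes are returned as a { i8, i8 } aggregate instead of being stored through %29 and %30. Whether this shape actually yields a single branch in the generated code is not established here; mid-level passes such as SimplifyCFG may fold a small diamond like this back into selects, so the result would need to be checked on the real module.

```llvm
; Hypothetical helper, not part of the original module.
; %flags plays the role of %21 (the OR of the three flag bytes),
; %sum plays the role of %22 (the sum of the three value bytes).
define { i8, i8 } @pack_branch_phi(i8 %flags, i8 %sum) {
entry:
  ; One test of the condition, shared by both results.
  %nonzero = icmp ne i8 %flags, 0
  br i1 %nonzero, label %join, label %clear

clear:
  ; Reached only when %flags is zero.
  br label %join

join:
  ; Each phi replaces one of the two selects on %23.
  %lo = phi i8 [ -1, %entry ], [ %sum, %clear ]
  %hi = phi i8 [ -1, %entry ], [ 0, %clear ]
  ; Same masking as %26 and %27 in the original IR.
  %lo.masked = and i8 %lo, 15
  %hi.masked = and i8 %hi, 15
  %r0 = insertvalue { i8, i8 } undef, i8 %lo.masked, 0
  %r1 = insertvalue { i8, i8 } %r0, i8 %hi.masked, 1
  ret { i8, i8 } %r1
}
```

Comparing the output of llc on this function with the select-based version would show whether the duplicated jne disappears for a given LLVM release.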