Phil Tomson via llvm-dev
2016-Nov-02 22:24 UTC
[llvm-dev] rotl: undocumented LLVM instruction?
We've recently moved our project from LLVM 3.6 to LLVM 3.9. I noticed one of our code generation tests is breaking in 3.9.

The test is:

    ; RUN: llc < %s -march=xstg | FileCheck %s

    define i64 @bclr64(i64 %a, i64 %b) nounwind readnone {
    entry:
    ; CHECK: bclr r1, r0, r1, 64
      %sub = sub i64 %b, 1
      %shl = shl i64 1, %sub
      %xor = xor i64 %shl, -1
      %and = and i64 %a, %xor
      ret i64 %and
    }

I ran llc with -debug to get a better idea of what's going on and found:

    Initial selection DAG: BB#0 'bclr64:entry'
    SelectionDAG has 14 nodes:
      t0: ch = EntryToken
      t2: i64,ch = CopyFromReg t0, Register:i64 %vreg0
      t4: i64,ch = CopyFromReg t0, Register:i64 %vreg1
      t6: i64 = sub t4, Constant:i64<1>
      t7: i64 = shl Constant:i64<1>, t6
      t9: i64 = xor t7, Constant:i64<-1>
      t10: i64 = and t2, t9
      t12: ch,glue = CopyToReg t0, Register:i64 %R1, t10
      t13: ch = XSTGISD::Ret t12, Register:i64 %R1, t12:1

    Combining: t13: ch = XSTGISD::Ret t12, Register:i64 %R1, t12:1
    Combining: t12: ch,glue = CopyToReg t0, Register:i64 %R1, t10
    Combining: t11: i64 = Register %R1
    Combining: t10: i64 = and t2, t9
    Combining: t9: i64 = xor t7, Constant:i64<-1>
     ... into: t15: i64 = rotl Constant:i64<-2>, t6
    Combining: t10: i64 = and t2, t15
    Combining: t15: i64 = rotl Constant:i64<-2>, t6
    Combining: t14: i64 = Constant<-2>
    Combining: t6: i64 = sub t4, Constant:i64<1>
     ... into: t17: i64 = add t4, Constant:i64<-1>
    Combining: t15: i64 = rotl Constant:i64<-2>, t17

These rotl instructions weren't showing up when I ran llc 3.6 and that's completely changing the generated code at the end which means the test fails (and it's less optimal than it was in 3.6).

I've been looking in the LLVM language docs (3.9 version) and I don't see any documentation on 'rotl'. What does it do? Why isn't it in the docs?

Phil
Hans Wennborg via llvm-dev
2016-Nov-02 22:34 UTC
[llvm-dev] rotl: undocumented LLVM instruction?
On Wed, Nov 2, 2016 at 3:24 PM, Phil Tomson via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> I've been looking in the LLVM language docs (3.9 version) and I don't see
> any documentation on 'rotl'. What does it do? Why isn't it in the docs?

rotl is not an IR instruction, it's a node in the instruction-selection DAG (ISD::ROTL). It performs bitwise rotation to the left.

The change to your code is probably due to this new transformation: http://llvm.org/viewvc/llvm-project?view=revision&revision=232572
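For anyone wondering why the combiner is allowed to turn the xor into a rotate, here is a small Python sketch of the arithmetic. It only models ISD::ROTL as a plain 64-bit rotation; the helper names rotl64 and not_shl_one are made up for this illustration and are not LLVM APIs. The constant -2 has every bit set except bit 0, so rotating it left by n moves that single zero to bit n, which is the same value as xor (shl 1, n), -1 for n in 0..63.

```python
MASK64 = (1 << 64) - 1  # keep every result within 64 bits, like an i64


def rotl64(value, amount):
    """Rotate a 64-bit value left by amount (taken modulo 64)."""
    value &= MASK64
    amount %= 64
    return ((value << amount) | (value >> (64 - amount))) & MASK64


def not_shl_one(amount):
    """The pattern from the test: xor (shl 1, amount), -1, for amount in 0..63."""
    return ((1 << amount) ^ MASK64) & MASK64


# -2 as an unsigned 64-bit pattern: all ones except bit 0.
MINUS_TWO = -2 & MASK64

# The two forms agree for every in-range shift amount.
for n in range(64):
    assert rotl64(MINUS_TWO, n) == not_shl_one(n)
```

So the rewrite preserves the value; whether it is a win depends on whether the target has a cheap rotate and whether its bit-clear pattern still matches afterwards.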
Ryan Taylor via llvm-dev
2016-Nov-02 23:10 UTC
[llvm-dev] rotl: undocumented LLVM instruction?
I believe some of the ISD nodes were introduced to allow for DAG optimizations, under the assumption that some of the major architectures directly support these types of instructions.

-Ryan
Phil Tomson via llvm-dev
2016-Nov-03 21:07 UTC
[llvm-dev] rotl: undocumented LLVM instruction?
Is there any way to get it to delay this optimization where it goes from this:

    Initial selection DAG: BB#0 'bclr64:entry'
    SelectionDAG has 14 nodes:
      t0: ch = EntryToken
      t2: i64,ch = CopyFromReg t0, Register:i64 %vreg0
      t4: i64,ch = CopyFromReg t0, Register:i64 %vreg1
      t6: i64 = sub t4, Constant:i64<1>
      t7: i64 = shl Constant:i64<1>, t6
      t9: i64 = xor t7, Constant:i64<-1>
      t10: i64 = and t2, t9
      t12: ch,glue = CopyToReg t0, Register:i64 %R1, t10
      t13: ch = XSTGISD::Ret t12, Register:i64 %R1, t12:1

    Combining: t13: ch = XSTGISD::Ret t12, Register:i64 %R1, t12:1
    Combining: t12: ch,glue = CopyToReg t0, Register:i64 %R1, t10
    Combining: t11: i64 = Register %R1
    Combining: t10: i64 = and t2, t9
    Combining: t9: i64 = xor t7, Constant:i64<-1>
     ... into: t15: i64 = rotl Constant:i64<-2>, t6

...to this:

    Optimized lowered selection DAG: BB#0 'bclr64:entry'
    SelectionDAG has 13 nodes:
      t0: ch = EntryToken
      t2: i64,ch = CopyFromReg t0, Register:i64 %vreg0
      t4: i64,ch = CopyFromReg t0, Register:i64 %vreg1
      t17: i64 = add t4, Constant:i64<-1>
      t15: i64 = rotl Constant:i64<-2>, t17
      t10: i64 = and t2, t15
      t12: ch,glue = CopyToReg t0, Register:i64 %R1, t10
      t13: ch = XSTGISD::Ret t12, Register:i64 %R1, t12:1

Folding the shl and xor into a rotl before the and ends up giving us worse results than 3.6. For example, in 3.6 the generated code is simply:

    bclr64:                                 # @bclr64
    # BB#0:                                 # %entry
        addI r1, r1, -1, 64
        bclr r1, r0, r1, 64
        jabs r511

Whereas with 3.9 the generated code is:

    bclr64:                                 # @bclr64
    # BB#0:                                 # %entry
        addI r1, r1, -1, 64
        movimm r2, -2, 64
        rol r1, r2, r1, 64
        bitop1 r1, r0, r1, AND, 64
        jabs r511

It seems to be negatively impacting some of our larger benchmarks as well: they used to contain several bclr (bit clear) instructions but now contain far fewer.

Phil
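To make the comparison concrete, the two listings can be modeled in a few lines of Python. This is only a sketch under assumptions: the XSTG instruction semantics are not given anywhere in this thread, so the behavior of addI, bclr, movimm, rol and bitop1 below is inferred from the IR test and the mnemonics, and the function names seq_36 and seq_39 are invented for the illustration. Under those assumptions both sequences return the same value, so the regression is purely in instruction count (two instructions before the return versus four).

```python
MASK64 = (1 << 64) - 1  # registers are modeled as unsigned 64-bit values


def rotl64(value, amount):
    """Rotate a 64-bit value left by amount (taken modulo 64)."""
    value &= MASK64
    amount %= 64
    return ((value << amount) | (value >> (64 - amount))) & MASK64


def seq_36(r0, r1):
    """Assumed model of the LLVM 3.6 output: addI followed by bclr."""
    r1 = (r1 - 1) & MASK64                   # addI r1, r1, -1, 64
    # ASSUMPTION: bclr clears bit (r1 mod 64) of r0.
    return r0 & ~(1 << (r1 % 64)) & MASK64   # bclr r1, r0, r1, 64


def seq_39(r0, r1):
    """Assumed model of the LLVM 3.9 output: addI, movimm, rol, bitop1 AND."""
    r1 = (r1 - 1) & MASK64                   # addI r1, r1, -1, 64
    r2 = -2 & MASK64                         # movimm r2, -2, 64
    r1 = rotl64(r2, r1)                      # rol r1, r2, r1, 64
    return r0 & r1                           # bitop1 r1, r0, r1, AND, 64


# b in 1..64 keeps the IR shift amount (b - 1) inside 0..63.
for a in (0, 0xFF, 0x8000000000000001, MASK64):
    for b in range(1, 65):
        assert seq_36(a, b) == seq_39(a, b)
```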