Hi, I recently worked on doing constant canonicalisation in DAGCombine level to make as more folding as possible. And I found the behavior of "visitShiftByConstant" is same with what I have done, just only it only be enabled under a shift context. But I think maybe we could expand it to as more case as possible. The original issuse of it: For code as below: unsigned array[4]; unsigned foo(unsigned long x) { return array[(x>>2)&3ul]; sequence before this canonicalisation (ARM): foo: .fnstart @ BB#0: @ %entry lsrs r0, r0, #2 movs r1, #3 ands r1, r0 lsls r0, r1, #2 ldr r1, .LCPI0_0 ldr r0, [r1, r0] bx lr .p2align 2 sequence after this canonicalisation: foo: .fnstart @ BB#0: @ %entry movs r1, #12 ands r1, r0 ldr r0, .LCPI0_0 ldr r0, [r0, r1] bx lr .p2align 2 This canonicalisation makes shift folding possible. But I wonder if only shift context could benifit from this.When I worked on CoreMark optimization recently, I found a case similar to below: define i32 @test(i32 %v) { entry: %a = and i32 %v, 65534 %b = lshr i32 %a, 1 %c = and i32 %v, 65535 %d = lshr i32 %c, 1 %e = add i32 %b, %d ret i32 %e } It would be profitable as well if we could enable the canonicalisation on it. sequence before this canonicalisation (ARM): test: .fnstart @ BB#0: @ %entry movw r1, #65534 and r1, r0, r1 ubfx r0, r0, #1, #15 add r0, r0, r1, lsr #1 bx lr sequence after this canonicalisation: test: .fnstart @ BB#0: @ %entry ubfx r0, r0, #1, #15 add r0, r0, r0 bx lr But when I tried to expand the condition to this case, there are lots of various regression fails. The expanding may prevent some other possible folding or something. Maybe I missed something or the expanding is not reasonable at all.but I didn't figure out yet. Any suggestions? Any inspiration is highly appreciated. And there maybe some other cases could be included also. If you have any findings, please let me know as well. Thank you very much! :) Best Regards, Jojo -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20161208/7f4701ae/attachment.html>
Renato Golin via llvm-dev
2016-Dec-15 12:10 UTC
[llvm-dev] visitShiftByConstant of DAGCombiner
On 8 December 2016 at 02:34, Jojo Ma <jojo.ma at linaro.org> wrote:> It would be profitable as well if we could enable the canonicalisation on > it. > sequence before this canonicalisation (ARM): > test: > .fnstart > @ BB#0: @ %entry > movw r1, #65534 > and r1, r0, r1 > ubfx r0, r0, #1, #15 > add r0, r0, r1, lsr #1 > bx lr > > sequence after this canonicalisation: > test: > .fnstart > @ BB#0: @ %entry > ubfx r0, r0, #1, #15 > add r0, r0, r0 > bx lr > > But when I tried to expand the condition to this case, there are lots of > various regression fails.Hi Jojo, I think this canonicalisation is correct, at least in this case, so it's quite possible that the regressions are just bad tests, or you actually made the code better in other places that weren't expecting it. I'm adding some folks that have worked on the DAG combiner recently to have a look, but it would be good to know what regressions were there to see if they're really regressions (ie. worse code) or just different/better code. You may also have missed a check on your code (which would wrongly combine constants). Can you create a review on Phab of your proposed change? Also, if you could list the regressions and see if they have a common pattern (ie. "tests expecting an additional mask failed", or "the wrong mask was applied"), it would become clearer if they are real regressions or not. cheers, --renato
Thanks very much, Renato! I looked into the regressions, there're many cases have similar context as the case I mentioned but N just has only one use. In this case, the canonicalisation won't make it more profitable, will only prevent the possible folding or make the sequence is not expected. The regressions be eliminated, after I limit the expanding to not "N->hasOneUse()". And now only one regression left: FAIL: LLVM :: CodeGen/Thumb2/machine-licm.ll (18865 of 29222) ******************** TEST 'LLVM :: CodeGen/Thumb2/machine-licm.ll' FAILED ******************** Script: -- /home/likewise-open/SPREADTRUM/ jojo.ma/jojoma/source/llvm/llvm-linaro/llvm/build/./bin/llc < /home/likewise-open/SPREADTRUM/ jojo.ma/jojoma/source/llvm/llvm-linaro/llvm/llvm/test/CodeGen/Thumb2/machine-licm.ll -mtriple=thumbv7-apple-darwin -mcpu=cortex-a8 -relocation-model=dynamic-no-pic -disable-fp-elim | /home/likewise-open/SPREADTRUM/ jojo.ma/jojoma/source/llvm/llvm-linaro/llvm/build/./bin/FileCheck /home/likewise-open/SPREADTRUM/ jojo.ma/jojoma/source/llvm/llvm-linaro/llvm/llvm/test/CodeGen/Thumb2/machine-licm.ll /home/likewise-open/SPREADTRUM/ jojo.ma/jojoma/source/llvm/llvm-linaro/llvm/build/./bin/llc < /home/likewise-open/SPREADTRUM/ jojo.ma/jojoma/source/llvm/llvm-linaro/llvm/llvm/test/CodeGen/Thumb2/machine-licm.ll -mtriple=thumbv7-apple-darwin -mcpu=cortex-a8 -relocation-model=pic -disable-fp-elim | /home/likewise-open/SPREADTRUM/ jojo.ma/jojoma/source/llvm/llvm-linaro/llvm/build/./bin/FileCheck /home/likewise-open/SPREADTRUM/ jojo.ma/jojoma/source/llvm/llvm-linaro/llvm/llvm/test/CodeGen/Thumb2/machine-licm.ll --check-prefix=PIC -- Exit Code: 1 Command Output (stderr): -- /home/likewise-open/SPREADTRUM/ jojo.ma/jojoma/source/llvm/llvm-linaro/llvm/llvm/test/CodeGen/Thumb2/machine-licm.ll:88:10: error: expected string not found in input ; CHECK: movw {{(r[0-9])|(lr)}}, #32768 ^ <stdin>:56:2: note: scanning from here movw r12, #32768 ^ The outputs before and after the canonicalisation are: - before canonicalisation _t3: @ BB#0: @ %bb.nph push {r7, lr} mov r7, sp movw lr, #32768 movs r2, #0 movw r9, #16386 movw r12, #65534 movt lr, #65535 LBB2_1: @ %bb @ =>This Inner Loop Header: Depth=1 eor.w r3, r0, r1 adds r2, #1 ands r3, r3, #1 it ne eorne.w r1, r1, r9 and.w r3, r1, r12 orr.w r1, lr, r3, lsr #1 it eq lsreq r1, r3, #1 ubfx r0, r0, #1, #7 uxtb r3, r2 cmp r3, #8 bne LBB2_1 @ BB#2: @ %bb8 uxth r0, r1 pop {r7, pc} - after canonicalisation _t3: @ BB#0: @ %bb.nph movw r12, #32768 movs r2, #0 movw r9, #16386 movt r12, #65535 LBB2_1: @ %bb @ =>This Inner Loop Header: Depth=1 eor.w r3, r0, r1 adds r2, #1 ands r3, r3, #1 it ne eorne.w r1, r1, r9 uxtb r3, r2 ubfx r1, r1, #1, #15 it ne orrne.w r1, r1, r12 ubfx r0, r0, #1, #7 cmp r3, #8 bne LBB2_1 @ BB#2: @ %bb8 uxth r0, r1 bx lr The context of this is same with the case of CoreMa, so i can't rule it out in my change. I created a review on phabricator of my proposal: https://reviews.llvm.org/D27916 Hope all the above could make the problem clear. Welcome any review and advice.Thanks! Best Regards, Jojo On 15 December 2016 at 20:10, Renato Golin <renato.golin at linaro.org> wrote:> On 8 December 2016 at 02:34, Jojo Ma <jojo.ma at linaro.org> wrote: > > It would be profitable as well if we could enable the canonicalisation on > > it. > > sequence before this canonicalisation (ARM): > > test: > > .fnstart > > @ BB#0: @ %entry > > movw r1, #65534 > > and r1, r0, r1 > > ubfx r0, r0, #1, #15 > > add r0, r0, r1, lsr #1 > > bx lr > > > > sequence after this canonicalisation: > > test: > > .fnstart > > @ BB#0: @ %entry > > ubfx r0, r0, #1, #15 > > add r0, r0, r0 > > bx lr > > > > But when I tried to expand the condition to this case, there are lots of > > various regression fails. > > Hi Jojo, > > I think this canonicalisation is correct, at least in this case, so > it's quite possible that the regressions are just bad tests, or you > actually made the code better in other places that weren't expecting > it. > > I'm adding some folks that have worked on the DAG combiner recently to > have a look, but it would be good to know what regressions were there > to see if they're really regressions (ie. worse code) or just > different/better code. > > You may also have missed a check on your code (which would wrongly > combine constants). Can you create a review on Phab of your proposed > change? > > Also, if you could list the regressions and see if they have a common > pattern (ie. "tests expecting an additional mask failed", or "the > wrong mask was applied"), it would become clearer if they are real > regressions or not. > > cheers, > --renato >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20161219/3d896f51/attachment.html>