Philip Reames via llvm-dev
2021-Jul-20 17:17 UTC
[llvm-dev] Question about Unrolling Loop with Multiple Exits
./opt -loop-unroll -enable-new-pm=0 < llvm-dev.ll -S -unroll-runtime -debug Args: ./opt -loop-unroll -enable-new-pm=0 -S -unroll-runtime -debug Loop Unroll: F[test] Loop %while.body.us Loop Size = 10 runtime unrolling with count: 8 Exiting block %if.end8.us: TripCount=0, TripMultiple=1, BreakoutTrip=1 Exiting block %while.cond.backedge: TripCount=0, TripMultiple=1, BreakoutTrip=1 Trying runtime unrolling on Loop: Loop at depth 1 containing: %while.body.us<header>,%if.end8.us<exiting>,%while.cond.backedge<latch><exiting> Using prolog remainder. Bailout for multi-exit handling when latch exit has >1 predecessor. Multiple exit/exiting blocks in loop and multi-exit unrolling not enabled! Won't unroll; remainder loop could not be generated when assuming runtime trip count ; ModuleID = '<stdin>' source_filename = "<stdin>" declare i32 @foo(i8*) define void @test(i8* %s, i64 %a) { entry: %s.addr.a = getelementptr i8, i8* %s, i64 %a br label %while.body.us while.body.us: ; preds = %while.cond.backedge, %entry %s.addr = phi i8* [ %incdec.ptr, %while.cond.backedge ], [ %s, %entry ] %incdec.ptr = getelementptr inbounds i8, i8* %s.addr, i64 1 %incdec.val = load i8, i8* %s.addr, align 1 %cmp1 = icmp eq i8 %incdec.val, 10 br i1 %cmp1, label %if.end8.us, label %while.cond.backedge if.end8.us: ; preds = %while.body.us %call9 = tail call i32 @foo(i8* nonnull %incdec.ptr) %cmp2 = icmp ult i32 %call9, 0 br i1 %cmp2, label %while.cond.backedge, label %return.loopexit while.cond.backedge: ; preds = %if.end8.us, %while.body.us %cmp3 = icmp eq i8* %incdec.ptr, %s.addr.a br i1 %cmp3, label %return.loopexit, label %while.body.us return.loopexit: ; preds = %while.cond.backedge, %if.end8.us ret void } The debug output appears to give a pretty good clue as to where you could start if desired. Philip On 7/20/21 9:45 AM, Jingu Kang wrote:> > Hi Philip, > > I have reduced the test case roughly as below. > > declare i32 @foo(i8 *) > > define void @test(i8* %s, i64 %a) { > > entry: > > %s.addr.a = getelementptr i8, i8* %s, i64 %a > > br label %while.body.us > > while.body.us: > > %s.addr = phi i8* [ %incdec.ptr, %while.cond.backedge ], [ %s, %entry ] > > %incdec.ptr = getelementptr inbounds i8, i8* %s.addr, i64 1 > > %incdec.val = load i8, i8* %s.addr, align 1 > > %cmp1 = icmp eq i8 %incdec.val, 10 > > br i1 %cmp1, label %if.end8.us, label %while.cond.backedge > > if.end8.us: > > %call9 = tail call i32 @foo(i8* nonnull %incdec.ptr) > > %cmp2 = icmp ult i32 %call9, 0 > > br i1 %cmp2, label %while.cond.backedge, label %return.loopexit > > while.cond.backedge: > > %cmp3 = icmp eq i8* %incdec.ptr, %s.addr.a > > br i1 %cmp3, label %return.loopexit, label %while.body.us > > return.loopexit: > > ret void > > } > > Roughly, I unrolled the loop manually 2 times as below and ignored the > remaining loop simply. > > define void @test(i8* %s, i64 %a) { > > entry: > > %s.addr.a = getelementptr i8, i8* %s, i64 %a > > br label %while.body.us > > while.body.us: > > %s.addr = phi i8* [ %incdec.ptr, %while.cond.backedge ], [ %s, %entry ] > > %incdec.ptr = getelementptr inbounds i8, i8* %s.addr, i64 1 > > %incdec.val = load i8, i8* %s.addr, align 1 > > %cmp1 = icmp eq i8 %incdec.val, 10 > > br i1 %cmp1, label %if.end8.us, label %while.body.us.1 > > while.body.us.1: > > %incdec.ptr.1 = getelementptr inbounds i8, i8* %incdec.ptr, i64 1 > > %incdec.val.1 = load i8, i8* %incdec.ptr, align 1 > > %cmp1.1 = icmp eq i8 %incdec.val.1, 10 > > br i1 %cmp1.1, label %if.end8.us, label %while.cond.backedge > > if.end8.us: > > %incdec.ptr.phi = phi i8* [ %incdec.ptr, %while.body.us ], [ > %incdec.ptr.1, %while.body.us.1 ] > > %call9 = tail call i32 @foo(i8* nonnull %incdec.ptr.phi) > > %cmp2 = icmp ult i32 %call9, 0 > > br i1 %cmp2, label %while.cond.backedge, label %return.loopexit > > while.cond.backedge: > > %cmp3 = icmp eq i8* %incdec.ptr.1, %s.addr.a > > br i1 %cmp3, label %return.loopexit, label %while.body.us > > return.loopexit: > > ret void > > } > > If possible, can we make loop unroll pass handle this kind of cases > please? > > If I missed something, please let me know. > > Thanks > > JinGu Kang > > *From:* llvm-dev <llvm-dev-bounces at lists.llvm.org> *On Behalf Of > *Philip Reames via llvm-dev > *Sent:* 16 July 2021 20:12 > *To:* Jingu Kang <Jingu.Kang at arm.com> > *Cc:* llvm-dev at lists.llvm.org > *Subject:* Re: [llvm-dev] Question about Unrolling Loop with Multiple > Exits > > Jingu, > > I'm not fluent in AArch64 assembly, sorry.� If you want more > meaningful commentary, you'll need to reduce your c++ example further, > and give a better explanation of what you expect the unroller to do > for you.� > > Philip > > On 7/16/21 12:09 PM, Jingu Kang wrote: > > Sorry for poor example� > > � > > The AArch64 assembly output of the example from gcc is as below. > The loop is unrolled 7 times. I have written some comments to > explain how the assembly code is mapped to C source code. As you > can see on `.L3` label, the �if (*s++ != '\n')� block is > unrolled 7 times. > > � > > ������� stp���� x29, x30, [sp, -48]! > > ������� mov���� x29, sp > > ������� stp���� x19, x20, [sp, 16] > > ������� mov���� x19, x0� --> x19 is char *s > > ������� mov���� x20, x1� --> x20 is char *end > > ������� str���� x21, [sp, 32] > > ������� orr���� x21, x3, x2� --> x21 is > check1 | check2 > > .L70: > > ������� sub���� x0, x20, x19� --> x0 = end > - s; > > ������� add���� x1, x0, 1 > > ������� ands��� x2, x1, 7� --> unroll count is 7 > > ������� beq���� .L3���� --> .L3 is > inside while loop. if (*s++ != '\n') > > ������� cmp���� x19, x20 �--> while(s <= end) > > ������� bhi���� .L2���� --> .L2 is > label ret1. > > � > > ������� ldrb��� w3, [x19], 1� --> start of > remainder > > ������� cmp���� w3, 10 > > ������� beq���� .L71 > > ������� cmp���� x2, 1 > > ������� beq���� .L3 > > ������� cmp���� x2, 2 > > ������� beq���� .L49 > > ������� cmp���� x2, 3 > > ������� beq���� .L50 > > ������� cmp���� x2, 4 > > ������� beq���� .L51 > > ������� cmp���� x2, 5 > > ������� beq���� .L52 > > ������� cmp���� x2, 6 > > ������� beq���� .L53 > > ������� ldrb��� w4, [x19], 1 > > ������� cmp���� w4, 10 > > ������� bne���� .L53 > > .L71: > > ������� cbz���� x21, .L4� --> if(check1 || > check2) > > ������� bl����� foo() > > ������� mov���� x19, x0 > > ������� cbz���� x0, .L2�� --> if (!s) > goto ret1; > > � > > .L4:��������������������� > --> if(boo(s)) goto ret0; > > ������� mov���� x0, x19 > > ����� ��bl����� boo(char*) > > ������� cbz���� w0, .L70 > > ������� mov���� w0, 0 > > ������� ldp���� x19, x20, [sp, 16] > > ������� ldr���� x21, [sp, 32] > > ������� ldp���� x29, x30, [sp], 48 > > ������� ret > > .L53:��������� --> if (*s++ != '\n') for remainder > > ������� ldrb��� w5, [x19], 1 > > ������� cmp���� w5, 10 > > ������� beq���� .L71 > > .L52:��������� --> if (*s++ != '\n') for remainder > > ������� ldrb��� w6, [x19], 1 > > ������� cmp���� w6, 10 > > ������� beq���� .L71 > > .L51:��������� --> if (*s++ != '\n') for remainder > > ������� ldrb��� w7, [x19], 1 > > ������� cmp���� w7, 10 > > ������� beq���� .L71 > > .L50:��������� --> if (*s++ != '\n') for remainder > > ������� ldrb��� w8, [x19], 1 > > ������� cmp���� w8, 10 > > ������� beq���� .L71 > > .L49:��������� --> if (*s++ != '\n') for remainder > > ������� ldrb��� w9, [x19], 1 > > ������� cmp���� w9, 10 > > ������� beq���� .L71 > > .L3:������������������������� > --> if (*s++ != '\n'), 7 times unrolled > > ������� cmp���� x19, x20 > > ������� bhi���� .L2 > > ������� ldrb��� w10, [x19] > > ������� add�� ��x19, x19, 1 > > ������� mov���� x11, x19 > > ������� cmp���� w10, 10 > > ������� beq���� .L71 > > ������� ldrb��� w12, [x19], 1 > > ������� cmp���� w12, 10 > > ������� beq���� .L71 > > ������� ldrb��� w13, [x11, 1] > > ������� add���� x19, x11, 2 > > ������� cmp���� w13, 10 > > ������� beq��� �.L71 > > ������� ldrb��� w14, [x11, 2] > > ������� add���� x19, x11, 3 > > ������� cmp���� w14, 10 > > ������� beq���� .L71 > > ������� ldrb��� w15, [x11, 3] > > ������� add���� x19, x11, 4 > > ������� cmp���� w15, 10 > > ������� beq���� .L71 > > ������� ldrb��� w16, [x11, 4] > > ������� add���� x19, x11, 5 > > ������� cmp���� w16, 10 > > ������� beq���� .L71 > > ������� ldrb��� w17, [x11, 5] > > ������� add���� x19, x11, 6 > > ������� cmp���� w17, 10 > > ������� beq���� .L71 > > ������� ldrb��� w18, [x11, 6] > > ������� add���� x19, x11, 7 > > ������� cmp���� w18, 10 > > ������� beq���� .L71 > > ������� b������ .L3 > > .L2:��������������������� > --> label ret1 > > ������� mov���� w0, 1 > > ������� ldp���� x19, x20, [sp, 16] > > ������� ldr���� x21, [sp, 32] > > ������� ldp���� x29, x30, [sp], 48 > > ������� ret > > � > > I am sorry for showing you assembly output directly� It looks > like the rtl level�s unroll pass of gcc unrolls above loop and I > need to check it next week. Once I have clear idea about above > unrolling from gcc, let me reduce the example more and let you know. > > � > > Thanks > > JinGu Kang > > � > > *From:* llvm-dev <llvm-dev-bounces at lists.llvm.org> > <mailto:llvm-dev-bounces at lists.llvm.org> *On Behalf Of *Philip > Reames via llvm-dev > *Sent:* 16 July 2021 18:27 > *To:* Jingu Kang <Jingu.Kang at arm.com> <mailto:Jingu.Kang at arm.com> > *Cc:* llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org> > *Subject:* Re: [llvm-dev] Question about Unrolling Loop with > Multiple Exits > > � > > � > > On 7/16/21 9:29 AM, Jingu Kang wrote: > > Hi Philip, > > Thanks for your kind reply. > > > A) Are you measuring on tip of tree?� There were changes > for multiple exit unrolling which landed very recently. > > Yep, I am investigating benchmarks with llvm tip and I can see > the llvm fails to unroll some loops with multiple exits. > > > B) One of your exits does not dominate your latch.� Those > are generally hard > > > C) This example does not seem to require gotos.� I > strongly suggest reducing your test cases if you want more > informed commentary.� > > � > > I am looking at perlbench recently and it has `goto` > statements inside loop. The example is a reduced case. > > Right, but the gotos aren't relevant for your reduced test.� You > can reduce further. > > > When I look at the gcc�s output of the example, it looks > like gcc unrolls only the below `if` statement block� > > � > > ��� if (*s++ != '\n') > > ����� continue; > > Your phrasing here does not parse for me.� Can you restate this > with different wording and maybe a fully worked example? > > > � > > Thanks > > JinGu Kang > > � > > *From:* llvm-dev <llvm-dev-bounces at lists.llvm.org> > <mailto:llvm-dev-bounces at lists.llvm.org> *On Behalf Of *Philip > Reames via llvm-dev > *Sent:* 16 July 2021 15:52 > *To:* Jingu Kang <Jingu.Kang at arm.com> > <mailto:Jingu.Kang at arm.com>; llvm-dev at lists.llvm.org > <mailto:llvm-dev at lists.llvm.org> > *Subject:* Re: [llvm-dev] Question about Unrolling Loop with > Multiple Exits > > � > > A) Are you measuring on tip of tree?� There were changes for > multiple exit unrolling which landed very recently. > > B) One of your exits does not dominate your latch.� Those > are generally hard.� > > C) This example does not seem to require gotos.� I strongly > suggest reducing your test cases if you want more informed > commentary.� > > Philip > > On 7/16/21 7:42 AM, Jingu Kang via llvm-dev wrote: > > Hi All, > > � > > While I am investigating benchmarks, I have found loops > which llvm fails to unroll because the loops have multiple > exits. > > For example, > > � > > char *foo(void); > > int boo(char *s); > > � > > int test(char *s, char *end, char *check1, char *check2) { > > � while (s <= end) { > > ��� if (*s++ != '\n') > > ����� continue; > > ��� if (check1 || check2) { > > ����� s = foo(); > > ����� if (!s) > > ������� goto ret1; > > ��� }�� > > ����if (boo(s)) > > ����� goto ret0; > > � } > > � goto ret1; > > � > > ret0: > > � return 0; > > ret1: > > � return 1; > > } > > � > > Above code causes below messages from LoopUnroll pass. > > � > > Bailout for multi-exit handling when latch exit has >1 > predecessor. > > Multiple exit/exiting blocks in loop and multi-exit > unrolling not enabled! > > � > > I can see the option `unroll-runtime-multi-exit` and > comments about it. I wonder there are already reviews for > the work on phabriactor or some people are working on it. > > If someone knows information about it, please share it. > > � > > Thanks > > JinGu Kang > > � > > � > > � > > � > > � > > � > > > > > > _______________________________________________ > > LLVM Developers mailing list > > llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org> > > https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev <https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev> >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20210720/dc3a3a47/attachment.html>
Jingu Kang via llvm-dev
2021-Jul-21 08:51 UTC
[llvm-dev] Question about Unrolling Loop with Multiple Exits
Thanks for kind suggestion Philip. Let me have a look at the points where the debug message mentions. Regards JinGu Kang From: llvm-dev <llvm-dev-bounces at lists.llvm.org> On Behalf Of Philip Reames via llvm-dev Sent: 20 July 2021 18:17 To: Jingu Kang <Jingu.Kang at arm.com> Cc: llvm-dev at lists.llvm.org Subject: Re: [llvm-dev] Question about Unrolling Loop with Multiple Exits ./opt -loop-unroll� -enable-new-pm=0 < llvm-dev.ll -S -unroll-runtime -debug Args: ./opt -loop-unroll -enable-new-pm=0 -S -unroll-runtime -debug Loop Unroll: F[test] Loop %while.body.us � Loop Size = 10 � runtime unrolling with count: 8 � Exiting block %if.end8.us: TripCount=0, TripMultiple=1, BreakoutTrip=1 � Exiting block %while.cond.backedge: TripCount=0, TripMultiple=1, BreakoutTrip=1 Trying runtime unrolling on Loop: Loop at depth 1 containing: %while.body.us<header>,%if.end8.us<exiting>,%while.cond.backedge<latch><exiting> Using prolog remainder. Bailout for multi-exit handling when latch exit has >1 predecessor. Multiple exit/exiting blocks in loop and multi-exit unrolling not enabled! Won't unroll; remainder loop could not be generated when assuming runtime trip count ; ModuleID = '<stdin>' source_filename = "<stdin>" declare i32 @foo(i8*) define void @test(i8* %s, i64 %a) { entry: � %s.addr.a = getelementptr i8, i8* %s, i64 %a � br label %while.body.us while.body.us:����������������������������������� ; preds = %while.cond.backedge, %entry � %s.addr = phi i8* [ %incdec.ptr, %while.cond.backedge ], [ %s, %entry ] � %incdec.ptr = getelementptr inbounds i8, i8* %s.addr, i64 1 � %incdec.val = load i8, i8* %s.addr, align 1 � %cmp1 = icmp eq i8 %incdec.val, 10 � br i1 %cmp1, label %if.end8.us, label %while.cond.backedge if.end8.us:�������������������������������������� ; preds = %while.body.us � %call9 = tail call i32 @foo(i8* nonnull %incdec.ptr) � %cmp2 = icmp ult i32 %call9, 0 � br i1 %cmp2, label %while.cond.backedge, label %return.loopexit while.cond.backedge:����������������������������� ; preds = %if.end8.us, %while.body.us � %cmp3 = icmp eq i8* %incdec.ptr, %s.addr.a � br i1 %cmp3, label %return.loopexit, label %while.body.us return.loopexit:��������������������������������� ; preds = %while.cond.backedge, %if.end8.us � ret void } The debug output appears to give a pretty good clue as to where you could start if desired. Philip On 7/20/21 9:45 AM, Jingu Kang wrote: Hi Philip, � I have reduced the test case roughly as below. � declare i32 @foo(i8 *) � define void @test(i8* %s, i64 %a) { entry: � %s.addr.a = getelementptr i8, i8* %s, i64 %a � br label %while.body.us � while.body.us:��� ��%s.addr = phi i8* [ %incdec.ptr, %while.cond.backedge ], [ %s, %entry ] � %incdec.ptr = getelementptr inbounds i8, i8* %s.addr, i64 1 � %incdec.val = load i8, i8* %s.addr, align 1 � %cmp1 = icmp eq i8 %incdec.val, 10 � br i1 %cmp1, label %if.end8.us, label %while.cond.backedge � if.end8.us:��� ��%call9 = tail call i32 @foo(i8* nonnull %incdec.ptr) � %cmp2 = icmp ult i32 %call9, 0 � br i1 %cmp2, label %while.cond.backedge, label %return.loopexit � while.cond.backedge: � %cmp3 = icmp eq i8* %incdec.ptr, %s.addr.a � br i1 %cmp3, label %return.loopexit, label %while.body.us � return.loopexit:��� ��ret void } � Roughly, I unrolled the loop manually 2 times as below and ignored the remaining loop simply. � define void @test(i8* %s, i64 %a) { entry: � %s.addr.a = getelementptr i8, i8* %s, i64 %a � br label %while.body.us � while.body.us:��� ��%s.addr = phi i8* [ %incdec.ptr, %while.cond.backedge ], [ %s, %entry ] � %incdec.ptr = getelementptr inbounds i8, i8* %s.addr, i64 1 � %incdec.val = load i8, i8* %s.addr, align 1 � %cmp1 = icmp eq i8 %incdec.val, 10 � br i1 %cmp1, label %if.end8.us, label %while.body.us.1 � while.body.us.1:��� ��%incdec.ptr.1 = getelementptr inbounds i8, i8* %incdec.ptr, i64 1 � %incdec.val.1 = load i8, i8* %incdec.ptr, align 1 � %cmp1.1 = icmp eq i8 %incdec.val.1, 10 � br i1 %cmp1.1, label %if.end8.us, label %while.cond.backedge � if.end8.us:��� ��%incdec.ptr.phi = phi i8* [ %incdec.ptr, %while.body.us ], [ %incdec.ptr.1, %while.body.us.1 ] � %call9 = tail call i32 @foo(i8* nonnull %incdec.ptr.phi) � %cmp2 = icmp ult i32 %call9, 0 � br i1 %cmp2, label %while.cond.backedge, label %return.loopexit � while.cond.backedge: � %cmp3 = icmp eq i8* %incdec.ptr.1, %s.addr.a � br i1 %cmp3, label %return.loopexit, label %while.body.us � return.loopexit:��� ��ret void } � If possible, can we make loop unroll pass handle this kind of cases please? If I missed something, please let me know. � Thanks JinGu Kang � From: llvm-dev <llvm-dev-bounces at lists.llvm.org><mailto:llvm-dev-bounces at lists.llvm.org> On Behalf Of Philip Reames via llvm-dev Sent: 16 July 2021 20:12 To: Jingu Kang <Jingu.Kang at arm.com><mailto:Jingu.Kang at arm.com> Cc: llvm-dev at lists.llvm.org<mailto:llvm-dev at lists.llvm.org> Subject: Re: [llvm-dev] Question about Unrolling Loop with Multiple Exits � Jingu, I'm not fluent in AArch64 assembly, sorry.� If you want more meaningful commentary, you'll need to reduce your c++ example further, and give a better explanation of what you expect the unroller to do for you.� Philip On 7/16/21 12:09 PM, Jingu Kang wrote: Sorry for poor example� � The AArch64 assembly output of the example from gcc is as below. The loop is unrolled 7 times. I have written some comments to explain how the assembly code is mapped to C source code. As you can see on `.L3` label, the �if (*s++ != '\n')� block is unrolled 7 times. � ������� stp���� x29, x30, [sp, -48]! ������� mov���� x29, sp ������� stp���� x19, x20, [sp, 16] ������� mov���� x19, x0� --> x19 is char *s ������� mov���� x20, x1� --> x20 is char *end ������� str���� x21, [sp, 32] ������� orr���� x21, x3, x2� --> x21 is check1 | check2 .L70: ������� sub���� x0, x20, x19� --> x0 = end - s; ������� add���� x1, x0, 1 ������� ands��� x2, x1, 7� --> unroll count is 7 ������� beq���� .L3���� --> .L3 is inside while loop. if (*s++ != '\n') ������� cmp���� x19, x20 �--> while(s <= end) ������� bhi���� .L2���� --> .L2 is label ret1. � ������� ldrb��� w3, [x19], 1� --> start of remainder ������� cmp���� w3, 10 ������� beq���� .L71 ������� cmp���� x2, 1 ������� beq���� .L3 ������� cmp���� x2, 2 ������� beq���� .L49 ������� cmp���� x2, 3 ������� beq���� .L50 ������� cmp���� x2, 4 ������� beq���� .L51 ������� cmp���� x2, 5 ������� beq���� .L52 ������� cmp���� x2, 6 ������� beq���� .L53 ������� ldrb��� w4, [x19], 1 ������� cmp���� w4, 10 ������� bne���� .L53 .L71: ������� cbz���� x21, .L4� --> if(check1 || check2) ������� bl����� foo() ������� mov���� x19, x0 ������� cbz���� x0, .L2�� --> if (!s) goto ret1; � .L4:��������������������� --> if(boo(s)) goto ret0; ������� mov���� x0, x19 ����� ��bl����� boo(char*) ������� cbz���� w0, .L70 ������� mov���� w0, 0 ������� ldp���� x19, x20, [sp, 16] ������� ldr���� x21, [sp, 32] ������� ldp���� x29, x30, [sp], 48 ������� ret .L53:��������� --> if (*s++ != '\n') for remainder ������� ldrb��� w5, [x19], 1 ������� cmp���� w5, 10 ������� beq���� .L71 .L52:��������� --> if (*s++ != '\n') for remainder ������� ldrb��� w6, [x19], 1 ������� cmp���� w6, 10 ������� beq���� .L71 .L51:��������� --> if (*s++ != '\n') for remainder ������� ldrb��� w7, [x19], 1 ������� cmp���� w7, 10 ������� beq���� .L71 .L50:��������� --> if (*s++ != '\n') for remainder ������� ldrb��� w8, [x19], 1 ������� cmp���� w8, 10 ������� beq���� .L71 .L49:��������� --> if (*s++ != '\n') for remainder ������� ldrb��� w9, [x19], 1 ������� cmp���� w9, 10 ������� beq���� .L71 .L3:������������������������� --> if (*s++ != '\n'), 7 times unrolled ������� cmp���� x19, x20 ������� bhi���� .L2 ������� ldrb��� w10, [x19] ������� add�� ��x19, x19, 1 ������� mov���� x11, x19 ������� cmp���� w10, 10 ������� beq���� .L71 ������� ldrb��� w12, [x19], 1 ������� cmp���� w12, 10 ������� beq���� .L71 ������� ldrb��� w13, [x11, 1] ������� add���� x19, x11, 2 ������� cmp���� w13, 10 ������� beq��� �.L71 ������� ldrb��� w14, [x11, 2] ������� add���� x19, x11, 3 ������� cmp���� w14, 10 ������� beq���� .L71 ������� ldrb��� w15, [x11, 3] ������� add���� x19, x11, 4 ������� cmp���� w15, 10 ������� beq���� .L71 ������� ldrb��� w16, [x11, 4] ������� add���� x19, x11, 5 ������� cmp���� w16, 10 ������� beq���� .L71 ������� ldrb��� w17, [x11, 5] ������� add���� x19, x11, 6 ������� cmp���� w17, 10 ������� beq���� .L71 ������� ldrb��� w18, [x11, 6] ������� add���� x19, x11, 7 ������� cmp���� w18, 10 ������� beq���� .L71 ������� b������ .L3 .L2:��������������������� --> label ret1 ������� mov���� w0, 1 ������� ldp���� x19, x20, [sp, 16] ������� ldr���� x21, [sp, 32] ������� ldp���� x29, x30, [sp], 48 ������� ret � I am sorry for showing you assembly output directly� It looks like the rtl level�s unroll pass of gcc unrolls above loop and I need to check it next week. Once I have clear idea about above unrolling from gcc, let me reduce the example more and let you know. � Thanks JinGu Kang � From: llvm-dev <llvm-dev-bounces at lists.llvm.org><mailto:llvm-dev-bounces at lists.llvm.org> On Behalf Of Philip Reames via llvm-dev Sent: 16 July 2021 18:27 To: Jingu Kang <Jingu.Kang at arm.com><mailto:Jingu.Kang at arm.com> Cc: llvm-dev at lists.llvm.org<mailto:llvm-dev at lists.llvm.org> Subject: Re: [llvm-dev] Question about Unrolling Loop with Multiple Exits � � On 7/16/21 9:29 AM, Jingu Kang wrote: Hi Philip, Thanks for your kind reply.> A) Are you measuring on tip of tree?� There were changes for multiple exit unrolling which landed very recently.Yep, I am investigating benchmarks with llvm tip and I can see the llvm fails to unroll some loops with multiple exits.> B) One of your exits does not dominate your latch.� Those are generally hard > C) This example does not seem to require gotos.� I strongly suggest reducing your test cases if you want more informed commentary.�� I am looking at perlbench recently and it has `goto` statements inside loop. The example is a reduced case. Right, but the gotos aren't relevant for your reduced test.� You can reduce further. When I look at the gcc�s output of the example, it looks like gcc unrolls only the below `if` statement block� � ��� if (*s++ != '\n') ����� continue; Your phrasing here does not parse for me.� Can you restate this with different wording and maybe a fully worked example? � Thanks JinGu Kang � From: llvm-dev <llvm-dev-bounces at lists.llvm.org><mailto:llvm-dev-bounces at lists.llvm.org> On Behalf Of Philip Reames via llvm-dev Sent: 16 July 2021 15:52 To: Jingu Kang <Jingu.Kang at arm.com><mailto:Jingu.Kang at arm.com>; llvm-dev at lists.llvm.org<mailto:llvm-dev at lists.llvm.org> Subject: Re: [llvm-dev] Question about Unrolling Loop with Multiple Exits � A) Are you measuring on tip of tree?� There were changes for multiple exit unrolling which landed very recently. B) One of your exits does not dominate your latch.� Those are generally hard.� C) This example does not seem to require gotos.� I strongly suggest reducing your test cases if you want more informed commentary.� Philip On 7/16/21 7:42 AM, Jingu Kang via llvm-dev wrote: Hi All, � While I am investigating benchmarks, I have found loops which llvm fails to unroll because the loops have multiple exits. For example, � char *foo(void); int boo(char *s); � int test(char *s, char *end, char *check1, char *check2) { � while (s <= end) { ��� if (*s++ != '\n') ����� continue; ��� if (check1 || check2) { ����� s = foo(); ����� if (!s) ������� goto ret1; ��� }�� ����if (boo(s)) ����� goto ret0; � } � goto ret1; � ret0: � return 0; ret1: � return 1; } � Above code causes below messages from LoopUnroll pass. � Bailout for multi-exit handling when latch exit has >1 predecessor. Multiple exit/exiting blocks in loop and multi-exit unrolling not enabled! � I can see the option `unroll-runtime-multi-exit` and comments about it. I wonder there are already reviews for the work on phabriactor or some people are working on it. If someone knows information about it, please share it. � Thanks JinGu Kang � � � � � � _______________________________________________ LLVM Developers mailing list llvm-dev at lists.llvm.org<mailto:llvm-dev at lists.llvm.org> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20210721/adfd3a71/attachment-0001.html>