Hello,
For a very simple loop where all IV users are post-inc users, I observed
redundant add instructions on AArch64.
From the LSR debug output, I can see that the initial formula for the
icmp is the one that was transformed to post-inc form in
OptimizeLoopTermCond() and later expanded in post-inc mode. Based on the
observation that the icmp is already a post-inc user, I hacked LSR to
prevent the icmp from being transformed to post-inc form in
OptimizeLoopTermCond() before the initial formulae are determined. With
this hack I was able to remove the redundant add instruction, but I am
not sure whether it makes sense to prevent a loop terminating condition
from being changed to post-inc form when it is already a post-inc user.
# Input IR:

define void @foo(i32 %n, i32* %P) {
entry:
  %cmp7 = icmp sgt i32 %n, 1
  br i1 %cmp7, label %for.body.preheader, label %for.end

for.body.preheader:                               ; preds = %entry
  %n_sext = sext i32 %n to i64
  br label %for.body

for.body:
  %K.in = phi i64 [ %n_sext, %for.body.preheader ], [ %K, %for.body ]
  %K = add i64 %K.in, 1

  %StoredAddr = getelementptr i32, i32* %P, i64 %K
  %StoredValue = trunc i64 %K to i32
  store volatile i32 %StoredValue, i32* %StoredAddr
  %cmp = icmp sgt i64 %K, 1
  br i1 %cmp, label %for.body, label %for.end

for.end:
  ret void
}
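
To make the "all IV users are post-inc users" point concrete, here is a rough C model of the IR above. It is only a sketch: the name foo_model and the max_iters cap are hypothetical additions (the IR has no such cap, and the cap is there only so the model terminates when run). Every use of the induction variable (store address, stored value, compare) reads the incremented value K, never K.in.

```c
#include <stdint.h>

/* Hypothetical C model of @foo. max_iters is an added safety cap that
 * does not exist in the IR; it only keeps the model from running on. */
static int foo_model(int32_t n, volatile int32_t *P, int max_iters) {
    int iters = 0;
    if (n <= 1)                        /* %cmp7 = icmp sgt i32 %n, 1 */
        return iters;
    int64_t K_in = (int64_t)n;         /* %n_sext = sext i32 %n to i64 */
    for (;;) {
        int64_t K = K_in + 1;          /* %K = add i64 %K.in, 1 */
        P[K] = (int32_t)K;             /* address and value both use K */
        iters++;
        if (!(K > 1) || iters >= max_iters)   /* %cmp also uses K */
            break;
        K_in = K;                      /* phi back edge: %K.in <- %K */
    }
    return iters;
}
```

Calling foo_model(2, buf, 3) on a zeroed buffer writes 3, 4, 5 to buf[3], buf[4], buf[5], which is the sequence of post-incremented values.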
# Output on AArch64, where you can see redundant add instructions for
the stored value, the store address, and the cmp:

foo:
        .cfi_startproc
// BB#0:
        cmp     w0, #2
        b.lt    .LBB0_3
// BB#1:
        sxtw    x9, w0
        add     w8, w0, #1
.LBB0_2:
        add     x10, x1, x9, lsl #2
        add     x9, x9, #1
        str     w8, [x10, #4]
        add     w8, w8, #1
        cmp     x9, #1
        b.gt    .LBB0_2
.LBB0_3:
        ret
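
As a rough illustration of where the extra adds come from, the C sketch below mirrors the loop body of the assembly above next to a form that keeps a single post-incremented IV. Both function names and the fixed iteration count iters are hypothetical (the cmp/branch is replaced by a counted loop so the sketch terminates), and the second function is hand-written, not compiler output. Whether the backend would actually select such a form is not shown in this thread.

```c
#include <stdint.h>

/* Mirrors the emitted loop body: three adds per iteration. */
static int emitted_form(int32_t n, int32_t *P, int iters) {
    int adds = 0;
    int64_t x9 = n;                       /* sxtw x9, w0 */
    int32_t w8 = n + 1;                   /* add  w8, w0, #1 */
    for (int i = 0; i < iters; i++) {
        int32_t *x10 = P + x9; adds++;    /* add x10, x1, x9, lsl #2 */
        x9 += 1; adds++;                  /* add x9, x9, #1 */
        x10[1] = w8;                      /* str w8, [x10, #4] */
        w8 += 1; adds++;                  /* add w8, w8, #1 */
    }
    return adds;
}

/* Hypothetical alternative: one post-incremented IV feeds the store
 * address and the stored value, so only one add is needed per iteration. */
static int single_iv_form(int32_t n, int32_t *P, int iters) {
    int adds = 0;
    int64_t k = n;
    for (int i = 0; i < iters; i++) {
        k += 1; adds++;                   /* the only add in the loop */
        P[k] = (int32_t)k;                /* address and value use k */
    }
    return adds;
}
```

Both functions perform the same stores; the first counts three adds per iteration and the second counts one.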

> On May 27, 2016, at 2:50 PM, via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>
> Luckily, I was able to remove the redundant add instruction with this
> hack, but I really doubt if it makes sense to prevent a loop terminating
> condition from being changed to postinc form when it's already a
> post-inc user.

I agree, but don’t have a better suggestion. You could file a bug. Anyone
have time to try out some fixes?

Andy

> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev

Thanks, Andy, for your response. We already have a related bug open at
https://llvm.org/bugs/show_bug.cgi?id=26913 . I would be happy to prepare
a fix for it. However, as I don’t have much experience with LSR, I first
need to understand some fundamentals. It seems to me that LSR tries to
handle a loop terminating condition in post-inc form while handling
other IV users as pre-inc. If this is true, what is the reasoning behind
the use of post-inc? Is there any assumption about using post-inc or
pre-inc form in the cost model?

Thanks,
Jun

-----Original Message-----
From: atrick at apple.com [mailto:atrick at apple.com]
Sent: Friday, May 27, 2016 6:15 PM
To: junbuml at codeaurora.org
Cc: llvm-dev at lists.llvm.org
Subject: Re: [llvm-dev] Handling post-inc users in LSR

> I agree, but don’t have a better suggestion. You could file a bug.
> Anyone have time to try out some fixes?
>
> Andy