Neil Henning via llvm-dev
2020-Jul-22 09:52 UTC
[llvm-dev] Unlikely branches can have expensive contents hoisted
Hey all - me again,
So I'm looking at llvm.expect specifically for branch hints. In the
following example LLVM will hoist the pow/cos calls into the entry block
even though I've used the llvm.expect intrinsic to make it clear that one
of the calls is unlikely to occur.

target datalayout = "e-m:w-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-pc-windows-msvc-coff"

define dllexport double @foo(i32 %val) {
entry:
  %0 = icmp slt i32 %val, 42
  %1 = call i1 @llvm.expect.i1(i1 %0, i1 false)
  %2 = sitofp i32 %val to double
  br i1 %1, label %true, label %false

true:
  %3 = call double @llvm.cos.f64(double %2)
  br label %merge

false:
  %4 = call double @llvm.pow.f64(double %2, double 4.242000e+01)
  br label %merge

merge:
  %var.1.0 = phi double [ %4, %false ], [ %3, %true ]
  ret double %var.1.0
}

declare i1 @llvm.expect.i1(i1, i1)
declare double @llvm.cos.f64(double)
declare double @llvm.pow.f64(double, double)

This seems counterintuitive to me - I've told LLVM that one of the calls
will probably not happen, and I expected it to preserve the call in the
unlikely branch so we don't pay the cost for something unlikely to be used.
I also injected a pass locally that adds, for branches whose condition is
llvm.expect, the branch weight metadata - but LLVM will still always fold
the branch away, ensuring that the expensive call is always executed.
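
For illustration, a minimal sketch of what the branch might look like once such a pass has replaced llvm.expect with branch weight metadata. The weights 1 and 2000 and the value names %cmp, %conv, %c, %p and %r are assumptions chosen for readability, not output from the actual pass. The first weight applies to the true successor, so the cos path is the one marked unlikely.

```llvm
; Sketch only: the example function with the expect call replaced by
; !prof branch weights. Weight values are assumed for illustration.
define double @foo(i32 %val) {
entry:
  %cmp = icmp slt i32 %val, 42
  %conv = sitofp i32 %val to double
  ; Weights are listed in successor order: %true first, then %false.
  br i1 %cmp, label %true, label %false, !prof !0

true:                                   ; unlikely path (weight 1)
  %c = call double @llvm.cos.f64(double %conv)
  br label %merge

false:                                  ; likely path (weight 2000)
  %p = call double @llvm.pow.f64(double %conv, double 4.242000e+01)
  br label %merge

merge:
  %r = phi double [ %p, %false ], [ %c, %true ]
  ret double %r
}

declare double @llvm.cos.f64(double)
declare double @llvm.pow.f64(double, double)

!0 = !{!"branch_weights", i32 1, i32 2000}
```

Even with metadata of this shape on the branch, the fold described above still happens.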
The part of SimplifyCFG that does this is FoldTwoEntryPHINode from what I
can tell.
So is there anything I can do here without modifying LLVM? Have I missed
something?
Otherwise I guess I'd have to change FoldTwoEntryPHINode to not do this in
the presence of branch weights / expect?
Thanks for any help,
Cheers,
-Neil.
--
Neil Henning
Senior Software Engineer Compiler
unity.com
Hiroshi Yamauchi via llvm-dev
2020-Jul-22 16:17 UTC
[llvm-dev] Unlikely branches can have expensive contents hoisted
On Wed, Jul 22, 2020 at 2:52 AM Neil Henning via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> So is there anything I can do here without modifying LLVM? Have I missed
> something?
>
> Otherwise I guess I'd have to change FoldTwoEntryPHINode to not do this in
> the presence of branch weights / expect?

Passing -two-entry-phi-node-folding-threshold=1 seems to prevent this folding, but that may not be what you need.

As it doesn't look like FoldTwoEntryPHINode checks for branch hints, it may make sense to change FoldTwoEntryPHINode.
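
A sketch of how the flag could be tried on the example, written as a lit-style test file. The RUN line is an assumption about the invocation: it uses the legacy pass manager spelling -simplifycfg (newer releases would spell it -passes=simplifycfg), and it has not been checked against any particular LLVM build. The function body is the example from the first message with the target lines and dllexport dropped.

```llvm
; Sketch only: run SimplifyCFG with the folding threshold lowered to 1.
; RUN: opt < %s -simplifycfg -two-entry-phi-node-folding-threshold=1 -S

define double @foo(i32 %val) {
entry:
  %0 = icmp slt i32 %val, 42
  %1 = call i1 @llvm.expect.i1(i1 %0, i1 false)
  %2 = sitofp i32 %val to double
  br i1 %1, label %true, label %false

true:
  %3 = call double @llvm.cos.f64(double %2)
  br label %merge

false:
  %4 = call double @llvm.pow.f64(double %2, double 4.242000e+01)
  br label %merge

merge:
  %var.1.0 = phi double [ %4, %false ], [ %3, %true ]
  ret double %var.1.0
}

declare i1 @llvm.expect.i1(i1, i1)
declare double @llvm.cos.f64(double)
declare double @llvm.pow.f64(double, double)
```

Note that the threshold is a global knob: lowering it affects every two-entry PHI that SimplifyCFG considers, not only the ones behind an unlikely branch.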
Roman Lebedev via llvm-dev
2020-Jul-22 16:29 UTC
[llvm-dev] Unlikely branches can have expensive contents hoisted
I would think that it is, again, a cost model issue.

https://godbolt.org/z/cTKnfK

(latency)
Cost Model: Found an estimated cost of 3 for instruction: %3 = call double @llvm.cos.f64(double %2)
Cost Model: Found an estimated cost of 3 for instruction: %4 = call double @llvm.pow.f64(double %2, double 4.242000e+01)

Is that actually correct? I'd expect it to be somewhat larger.

Roman.

On Wed, Jul 22, 2020 at 7:17 PM Hiroshi Yamauchi via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> Passing -two-entry-phi-node-folding-threshold=1 seems to prevent this
> folding, but that may not be what you need.
>
> As it doesn't look like FoldTwoEntryPHINode checks for branch hints, it
> may make sense to change FoldTwoEntryPHINode.