Neil Henning via llvm-dev
2020-Jul-22 09:52 UTC
[llvm-dev] Unlikely branches can have expensive contents hoisted
Hey all - me again,

So I'm looking at llvm.expect specifically for branch hints. In the following example LLVM will hoist the pow/cos calls into the entry block even though I've used the llvm.expect intrinsic to make it clear that one of the calls is unlikely to occur.

target datalayout = "e-m:w-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-pc-windows-msvc-coff"

define dllexport double @foo(i32 %val) {
entry:
  %0 = icmp slt i32 %val, 42
  %1 = call i1 @llvm.expect.i1(i1 %0, i1 false)
  %2 = sitofp i32 %val to double
  br i1 %1, label %true, label %false

true:
  %3 = call double @llvm.cos.f64(double %2)
  br label %merge

false:
  %4 = call double @llvm.pow.f64(double %2, double 4.242000e+01)
  br label %merge

merge:
  %var.1.0 = phi double [ %4, %false ], [ %3, %true ]
  ret double %var.1.0
}

declare i1 @llvm.expect.i1(i1, i1)
declare double @llvm.cos.f64(double)
declare double @llvm.pow.f64(double, double)

This seems counterintuitive to me - I've told LLVM that one of the calls will probably not happen, and I expected it to preserve the call in the unlikely branch so we don't pay the cost for something unlikely to be used.

I also injected a pass locally that adds, for branches whose condition is llvm.expect, the branch weight metadata - but LLVM will still always fold the branch away, ensuring that the expensive call is always made.

The part of SimplifyCFG that does this is FoldTwoEntryPHINode from what I can tell.

So is there anything I can do here without modifying LLVM? Have I missed something?

Otherwise I guess I'd have to change FoldTwoEntryPHINode to not do this in the presence of branch weights / expect?

Thanks for any help,

Cheers,
-Neil.

--
Neil Henning
Senior Software Engineer Compiler
unity.com
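For readers unfamiliar with the branch weight metadata mentioned above, here is a rough sketch of what the example could look like once the llvm.expect call has been replaced by !prof metadata on the branch. The weights 1 and 2000 and the value names are illustrative assumptions, not taken from the thread.

```llvm
; Sketch only: @foo with the llvm.expect call replaced by !prof
; branch-weight metadata. Weights and value names are assumptions.
define double @foo(i32 %val) {
entry:
  %cmp = icmp slt i32 %val, 42
  %conv = sitofp i32 %val to double
  ; The first weight applies to the %true successor, the second to %false.
  br i1 %cmp, label %true, label %false, !prof !0

true:                                   ; hinted as unlikely
  %c = call double @llvm.cos.f64(double %conv)
  br label %merge

false:                                  ; hinted as likely
  %p = call double @llvm.pow.f64(double %conv, double 4.242000e+01)
  br label %merge

merge:
  %r = phi double [ %p, %false ], [ %c, %true ]
  ret double %r
}

declare double @llvm.cos.f64(double)
declare double @llvm.pow.f64(double, double)

!0 = !{!"branch_weights", i32 1, i32 2000}
```

According to the report above, attaching this metadata alone did not stop SimplifyCFG from folding the branch.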
Hiroshi Yamauchi via llvm-dev
2020-Jul-22 16:17 UTC
[llvm-dev] Unlikely branches can have expensive contents hoisted
On Wed, Jul 22, 2020 at 2:52 AM Neil Henning via llvm-dev <llvm-dev at lists.llvm.org> wrote:

> [...]
>
> The part of SimplifyCFG that does this is FoldTwoEntryPHINode from what I
> can tell.
>
> So is there anything I can do here without modifying LLVM? Have I missed
> something?
>
> Otherwise I guess I'd have to change FoldTwoEntryPHINode to not do this in
> the presence of branch weights / expect?

Passing -two-entry-phi-node-folding-threshold=1 seems to prevent this folding, but that may not be what you need.

As it doesn't look like FoldTwoEntryPHINode checks for branch hints, it may make sense to change FoldTwoEntryPHINode.
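One way to try the suggested flag is to run SimplifyCFG on the original example with the threshold lowered. The sketch below is written in the style of an LLVM regression test. The RUN line uses the new pass manager spelling (-passes=simplifycfg); opt builds from around the time of this thread would use -simplifycfg instead. The CHECK lines encode the outcome reported in this message and have not been verified here.

```llvm
; RUN: opt -passes=simplifycfg -two-entry-phi-node-folding-threshold=1 -S %s | FileCheck %s
;
; Assumed expectation, based on the report in this thread: with the
; threshold lowered to 1 the conditional branch survives and each call
; stays in its own block.
; CHECK-LABEL: @foo(
; CHECK: br i1
; CHECK: call double @llvm.cos.f64
; CHECK: call double @llvm.pow.f64

target datalayout = "e-m:w-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-pc-windows-msvc-coff"

define dllexport double @foo(i32 %val) {
entry:
  %0 = icmp slt i32 %val, 42
  %1 = call i1 @llvm.expect.i1(i1 %0, i1 false)
  %2 = sitofp i32 %val to double
  br i1 %1, label %true, label %false

true:
  %3 = call double @llvm.cos.f64(double %2)
  br label %merge

false:
  %4 = call double @llvm.pow.f64(double %2, double 4.242000e+01)
  br label %merge

merge:
  %var.1.0 = phi double [ %4, %false ], [ %3, %true ]
  ret double %var.1.0
}

declare i1 @llvm.expect.i1(i1, i1)
declare double @llvm.cos.f64(double)
declare double @llvm.pow.f64(double, double)
```

Note that the threshold is a global setting for the pass, so lowering it affects every two-entry PHI fold in the module, not just the hinted branch. That may be why the flag "may not be what you need".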
Roman Lebedev via llvm-dev
2020-Jul-22 16:29 UTC
[llvm-dev] Unlikely branches can have expensive contents hoisted
I would think that it is, again, a cost model issue.

https://godbolt.org/z/cTKnfK (latency)

Cost Model: Found an estimated cost of 3 for instruction:   %3 = call double @llvm.cos.f64(double %2)
Cost Model: Found an estimated cost of 3 for instruction:   %4 = call double @llvm.pow.f64(double %2, double 4.242000e+01)

Is that actually correct? I'd expect it to be somewhat larger.

Roman.
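The estimates quoted above come from LLVM's cost model printer. The exact invocation behind the godbolt link is not shown in the thread, so the RUN lines below are assumptions about how such a latency query is usually spelled. The function @costs is a hypothetical reduction of the example, and no expected costs are asserted.

```llvm
; New pass manager spelling (assumed):
; RUN: opt -passes="print<cost-model>" -cost-kind=latency -disable-output %s 2>&1
;
; Legacy spelling in use around the time of the thread (assumed):
; RUN: opt -cost-model -analyze -cost-kind=latency %s

target triple = "x86_64-pc-windows-msvc-coff"

; Hypothetical reduction: just the two intrinsic calls whose cost is in question.
define double @costs(double %x) {
  %c = call double @llvm.cos.f64(double %x)
  %p = call double @llvm.pow.f64(double %x, double 4.242000e+01)
  %s = fadd double %c, %p
  ret double %s
}

declare double @llvm.cos.f64(double)
declare double @llvm.pow.f64(double, double)
```

The printer emits one "Cost Model: Found an estimated cost of N for instruction" line per instruction, so the reported cost of each call can be compared against the folding threshold discussed earlier in the thread.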