Liu, Yaxun (Sam) via llvm-dev
2018-Jan-11 16:26 UTC
[llvm-dev] question about unrolling loops with convergent instructions
I have a loop containing convergent instructions, with a trip count of 1024. I use a pragma to set the unroll count to 32. However, the loop was unrolled by 512, which results in a very long compilation time.

In tryToUnrollLoop, there is

  // If the loop contains a convergent operation, the prelude we'd add
  // to do the first few instructions before we hit the unrolled loop
  // is unsafe -- it adds a control-flow dependency to the convergent
  // operation. Therefore restrict remainder loop (try unrolling without).
  //
  // TODO: This is quite conservative. In practice, convergent_op()
  // is likely to be called unconditionally in the loop. In this
  // case, the program would be ill-formed (on most architectures)
  // unless n were the same on all threads in a thread group.
  // Assuming n is the same on all threads, any kind of unrolling is
  // safe. But currently llvm's notion of convergence isn't powerful
  // enough to express this.
  if (Convergent)
    UP.AllowRemainder = false;

Later in computeUnrollCount, there is

  // 2nd priority is unroll count set by pragma.
  unsigned PragmaCount = UnrollCountPragmaValue(L);
  if (PragmaCount > 0) {
    UP.Count = PragmaCount;
    UP.Runtime = true;
    UP.AllowExpensiveTripCount = true;
    UP.Force = true;
    if (UP.AllowRemainder &&
        getUnrolledLoopSize(LoopSize, UP) < PragmaUnrollThreshold)
      return true;
  }

Because UP.AllowRemainder is false, the early return is not taken and the unroll count specified by the pragma is ignored. Later on, computeUnrollCount calculates an unroll count of 512.

Is this a bug? Essentially, this disables the unroll count specified by a pragma for any loop containing convergent operations, even when the unroll count divides the trip count and no remainder loop would be needed.

Thanks.

Sam
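
As an illustration of the question, here is a minimal standalone sketch of the decision described above. It is not LLVM code and not a proposed patch: the function names, the `FitsThreshold` flag (standing in for the getUnrolledLoopSize comparison), and the idea of checking divisibility at this point are all assumptions made only to show the case where the pragma count divides a known trip count.

```cpp
#include <cassert>

// Hypothetical helper: true when unrolling by Count leaves no leftover
// iterations. A TripCount of 0 stands for "unknown at compile time".
constexpr bool dividesTripCount(unsigned TripCount, unsigned Count) {
  return TripCount != 0 && Count != 0 && TripCount % Count == 0;
}

// Models the check quoted in the post: the pragma count is accepted only
// when a remainder loop is allowed and the unrolled size fits the threshold.
constexpr bool acceptPragmaCountAsQuoted(unsigned PragmaCount,
                                         bool AllowRemainder,
                                         bool FitsThreshold) {
  return PragmaCount > 0 && AllowRemainder && FitsThreshold;
}

// Assumed alternative: also accept the pragma count when it divides the
// trip count exactly, because no remainder loop is needed in that case.
constexpr bool acceptPragmaCountIfExact(unsigned TripCount,
                                        unsigned PragmaCount,
                                        bool AllowRemainder,
                                        bool FitsThreshold) {
  return PragmaCount > 0 && FitsThreshold &&
         (AllowRemainder || dividesTripCount(TripCount, PragmaCount));
}

// The case from the post: trip count 1024, pragma count 32, convergent loop
// (so AllowRemainder is false).
static_assert(!acceptPragmaCountAsQuoted(32, false, true),
              "quoted check rejects the pragma count");
static_assert(acceptPragmaCountIfExact(1024, 32, false, true),
              "1024 is a multiple of 32, so no remainder loop is needed");
```

With a trip count that is not a multiple of 32, or an unknown trip count, the assumed alternative still rejects the pragma count for a convergent loop, so the restriction on remainder loops is preserved.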