Tobias Grosser
2011-Dec-14 12:10 UTC
[LLVMdev] SCEV cannot derive number of loop iterations
Hi, I am looking at two very simple kernels. They implement the following loops: constant_bound(): for (int i = 0; i < 100; i+=4); parameteric_bound(): for (int i = 0; i < n; i+=4); For the first loop SCEV is able to derive the number of loop iterations, for the second loop it returns 'Unpredictable backedge-taken count'. Is this expected because it is a difficult problem or is this just a missed-analysis? Any ideas what is needed to detect this case? Also, I am surprised that SCEV misses the nsw/nuw flags in this case: %tmp = mul nuw nsw i64 %indvar, 4 --> {0,+,4}<%for.body> Exits: <<Unknown>> Cheers Tobi P.S.: LLVM-IR test case attached. -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: could_not_compute.ll URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20111214/fe6cb6a1/attachment.ksh>
On Dec 14, 2011, at 4:10 AM, Tobias Grosser wrote:> Hi, > > I am looking at two very simple kernels. They implement the following loops: > > constant_bound(): > for (int i = 0; i < 100; i+=4); > > parameteric_bound(): > for (int i = 0; i < n; i+=4); > > For the first loop SCEV is able to derive the number of loop iterations, > for the second loop it returns 'Unpredictable backedge-taken count'. > > Is this expected because it is a difficult problem or is this just a missed-analysis? Any ideas what is needed to detect this case? > > Also, I am surprised that SCEV misses the nsw/nuw flags in this case: > > %tmp = mul nuw nsw i64 %indvar, 4 > --> {0,+,4}<%for.body> Exits: <<Unknown>> > > Cheers > TobiHi Tobias, Sorry I missed this message initially. Trip count computation really does not do well with non-unit strides, especially when they aren't represented be an existing phi. If there was a simple fix, it would have been fixed long ago. It would help to add a check for %n > 0 before the loop, but the bigger problem is that there are a couple different places where we effectively need know that %n + 4 <= INT64_MAX. The question is whether we can infer this from the NSW flag. There are multiple problems with this, and there's no one place we can currently handle it. Just consider this issue: the original phi is a recurrence {0,+,1}<L> which cannot wrap. But we have no phi that represents the scaled form of that recurrence {0,+,4}<L>. So how do we know that recurrence can never wrap? We have a multiply instruction that doesn't wrap, but that only tells us that one value at a particular point in the code doesn't wrap. It doesn't tell us about this ethereal thing we call a recurrence. We would have to determine that whenever we rely on our computed expression for the scaled recurrence, we are dominated by that multiply--not something we can currently represent. This limitation is fundamental to the design of SCEV. The problem is not really SCEV though, it's the way LLVM uses it. It's obvious looking at your test case that the trip count is computed by a mul<nsw> of {0,+,1}<L><nsw>, so cannot wrap. I could certainly write an analysis that works directly on your IR to determine that fact. In general, I think we can do better by redesigning analysis currently encapsulated within SCEV to be expressed in terms of IR values, using SCEV as a supplemental analysis rather than a proxy for the IR. This would decrease the modularity and generality of the code, but would allow LLVM to properly deal with messy realities of NSW semantics and control dependence. We could start by rewriting trip count computation in this manner. File a bug if you'd like. I can't say I'll be able to implement it in the near future. This problem could also be engineered away by relying on profile data. If we know this is *the* important loop, we can disregard code bloat and create two versions gated by bounds checks. Then we can ignore NSW/NUW flags because we have an explicit test in the CFG. -Andy