thr3ads.net - llvm dev - [llvm-dev] Is SCEV unable to determine obvious max trip count? [Mar 2019]

If this information is useful, please help other people find it:
Share via:

Nemanja Ivanovic via llvm-dev

2019-Mar-22 18:50 UTC

[llvm-dev] Is SCEV unable to determine obvious max trip count?

Given a simple test where the maximum trip count is quite obvious, we end
up producing a vectorized loop (or excessively complex control flow)
seemingly because we are unable to determine the maximum trip count of the
loop.
Here's an example:
#ifndef _START
#define _START 3
#endif
unsigned long test(unsigned long *a, unsigned long b) {
  for (unsigned long i = _START; i < (b & 7); i++)
    a[i] = b;
  return b;
}

If we compile that with -D_START=0, the loop is fully unrolled with a
comparison and exit if the result of the and is equal to any of the 7
possible values at each unrolled iteration.
However, if we do not have that define (i.e. loop doesn't start from zero),
we will vectorize the loop. There's even a "min.iters" check that
is very
obviously redundant:

%and = and i64 %b, 7
...
%0 = add nsw i64 %and, -3
%min.iters.check = icmp ult i64 %0, 24

The constant here will be different on different targets, but redundant
nonetheless.
Does anyone think this is important to fix this or have any ideas on how to
improve the handling here?
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20190322/0f4592de/attachment.html>

Eli Friedman via llvm-dev

2019-Mar-25 18:44 UTC

head link

[llvm-dev] [EXT] Is SCEV unable to determine obvious max trip count?

If you look at the SCEV output (opt -analyze -scalar-evolution) on the optimized
loop without vectorization/unrolling, it looks something like this:

Determining loop execution counts for: @test
Loop %for.body: backedge-taken count is (-4 + (zext i3 (trunc i64 %b to i3) to
i64))<nsw>
Loop %for.body: max backedge-taken count is -2
Loop %for.body: Predicated backedge-taken count is (-4 + (zext i3 (trunc i64 %b
to i3) to i64))<nsw>

If you exclude the guard outside the loop, the max backedge-taken count is
completely unknown: the trip count can be -1UL, -2UL, -3UL, or -4UL (i.e.
numbers close to ULONG_MAX).  But there’s important information outside the
loop: there’s a guard “icmp ugt i64 %and, 3”, which excludes all those cases. 
SCEV currently has some very limited support for reducing the max backedge-taken
count based on conditions outside the loop; see ScalarEvolution::howFarToZero,
in particular, the bit that calls isLoopEntryGuardedByCond.  But that support
isn’t very complete; it could definitely be improved.

As far as I know, nobody is working on this at the moment.
-Eli

From: llvm-dev <llvm-dev-bounces at lists.llvm.org> On Behalf Of Nemanja
Ivanovic via llvm-dev
Sent: Friday, March 22, 2019 11:51 AM
To: llvm-dev <llvm-dev at lists.llvm.org>
Subject: [EXT] [llvm-dev] Is SCEV unable to determine obvious max trip count?

Given a simple test where the maximum trip count is quite obvious, we end up
producing a vectorized loop (or excessively complex control flow) seemingly
because we are unable to determine the maximum trip count of the loop.
Here's an example:
#ifndef _START
#define _START 3
#endif
unsigned long test(unsigned long *a, unsigned long b) {
  for (unsigned long i = _START; i < (b & 7); i++)
    a[i] = b;
  return b;
}

If we compile that with -D_START=0, the loop is fully unrolled with a comparison
and exit if the result of the and is equal to any of the 7 possible values at
each unrolled iteration.
However, if we do not have that define (i.e. loop doesn't start from zero),
we will vectorize the loop. There's even a "min.iters" check that
is very obviously redundant:

%and = and i64 %b, 7
...
%0 = add nsw i64 %and, -3
%min.iters.check = icmp ult i64 %0, 24

The constant here will be different on different targets, but redundant
nonetheless.
Does anyone think this is important to fix this or have any ideas on how to
improve the handling here?
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20190325/170a2a13/attachment.html>

llvm dev - Mar 2019 - Is SCEV unable to determine obvious max trip count?

[llvm-dev] Is SCEV unable to determine obvious max trip count?

[llvm-dev] [EXT] Is SCEV unable to determine obvious max trip count?