thr3ads.net - llvm dev - [LLVMdev] SIV tests in LoopDependence Analysis, Sanjoy's patch [Apr 2012]

If this information is useful, please help other people find it:
Share via:

Preston Briggs

2012-Mar-29 04:34 UTC

[LLVMdev] SIV tests in LoopDependence Analysis, Sanjoy's patch

Hi Sanjoy & Hal,

Looking at LoopDependenceAnalysis::analyzeZIV ...
I don't understand much about SCEV, but the code seems inadequate.
Consider these three examples:

int a = ...; // we can't be sure of the relationship between a and b
int b = ...; // perhaps they are parameters, or the result of I/O

for (int i = 0; i < n; i++) {
  v[a][i] = ...;
  v[a + 1][i] = ...;
}

for (int i = 0; i < n; i++) {
  v[a][i] = ...;
  v[a][i] = ...;
}

for (int i = 0; i < n; i++) {
  v[a][i] = ...;
  v[b][i] = ...;
}


In the first loop, we can be confident that the two stores are to different
locations and there is no dependence.

In the second loop, we can be sure that the two references are to the same
location and that there's a consistent loop-independent output dependence
between the two stores.

In the third loop, we can't be sure what's going on and must therefore
assume an inconsistent loop-independent output dependence.

The ZIVanalysis code doesn't do the right thing.  I believe, but am not
certain, that it will get the 1st two loops right, but will return
Independent for the 3rd case. Better code might look something like:

SCEV *delta = a - b
if (delta is zero) {
  consistant &= true
  return Dependent
}
else if (delta is a known constant)
  return Independent
}
else { // delta is some symbolic term
  consistent = false
  return Dependent
}


By the way, when I assert all these things about dependence analysis, I'm
not always correct. If you disagree, please let me know and we'll discuss
it.

Thanks,
Preston
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20120328/a4bc2d0d/attachment.html>

Sanjoy Das

2012-Apr-01 20:54 UTC

head link

[LLVMdev] SIV tests in LoopDependence Analysis, Sanjoy's patch

Hi Preston,

Thanks for the feedback!
> In LoopDependenceAnalysis::AnalyzePair, what's going to happen if we
> have something like this
>
> for (i = 0; i < n; i++)
>   for (j = 0; j < n; j++)
>     A[i][j]++;
>
> versus
>
> for (i = 0; i < n; i++)
>   for (j = 0; j < n; j++)
>     A[j][i]++;
I think this can be fixed by ordering the subscripts on the loop
nesting.  This should be doable, since the add recurrences remember
the loop whose induction variable they're representing.

Secondly, I think we can specialize on dealing only with linear add
recurrences and constants (symbolic or actual constants) at this
point.  Since LLVM represents indices like

for (i = 0; i < n; i++)
  for (j = 0; j < n; j++)
    a[i + j]++

as quite nicely as add recurrences of add recurrences we should be
able to get quite far along with this approach.
> And what's going to happen here?
>
> for (i = 0; i < n; i++)
>     A[i+1][i+2] = A[i][i];
I left out handling coupled subscripts at this point because ignoring
them only results in more conservative safer inferences about
dependence.  More sophisticated analysis like this can be added
incrementally, I think.
> And here?
>
> for (i = 0; i < n; i++)
>   for (j = 0; j < n; j++)
>     A[i] = A[i] + B[i][j];
I'm not too clear about this.  Assuming A and B point to two different
locations, there seems to be only one loop independent dependence --
from A[i] to A[i].

You are right about the ZIV issue, it needs to be fixed.

I think the following would be a good start at modelling dependencies
between two Values:

class Dependence {
public:
  struct Level {
    enum { LT  = 1, EQ  = 2, GT  = 4, ALL = 7 } direction;
    bool scalar;
    const SCEV *distance; // NULL implies no distance available
    const Loop *loop;
  };

  enum Kind { Flow, Anti, Output, Input };

  Kind getKind() const {
    return kind;
  }

  const Value *getSource() const {
    return source;
  }

  const Value *getDestination() const {
    return destination;
  }

  typedef SmallVector<const Level *, 4>::const_iterator const_iterator;
  const_iterator begin() const { return levels.begin(); }
  const_iterator end() const { return levels.end(); }

private:
  Value *source, *destination;
  Kind kind;
  SmallVector<const Level *, 4> levels;

  friend class LoopDependenceAnalysis;
};

More methods, like `isConsistent` and `isConfused` can be added easily
to this interface, since the raw data required to compute them is
available.

Given two Values, then, we can first break them up into their
respective indices.  We can then order them based on their loop
nesting.  To actually compute the dependence, we can traverse down the
subscript lists of the two Values and keep checking individual
subscript pairs.

I'm not sure on how to deal with cases such as these:

for (i = 0; i < n; i++) {
  for (j = 0; j < n; j++) {
    t = a[i][j];
    a[i + 1][j + 1] = t;
  }
  for (j = 0; j < n; j++) {
    t = a[i][j];
    a[i + 1][j + 1] = t;
  }
}

My gut feeling is that we should stop checking individual subscripts
when the loop nesting have diverged.  But don't know how to concretely
represent such dependencies.

I'm assuming in cases such as these:

for (i = 0; i < n; i++)
  for (j = 0; j < n; j++)
    a[i][0] = a[j][i]

we can have a dependence edge [S U | 0].

Let me know what you think!
-- 
Sanjoy Das.
http://playingwithpointers.com

Preston Briggs

2012-Apr-03 05:25 UTC

head link

[LLVMdev] SIV tests in LoopDependence Analysis, Sanjoy's patch

Hi Sanjoy,

I wondered:>> In LoopDependenceAnalysis::AnalyzePair, what's going to happen if
we
>> have something like this
>>
>> for (i = 0; i < n; i++)
>>   for (j = 0; j < n; j++)
>>     A[i][j]++;
>>
>> versus
>>
>> for (i = 0; i < n; i++)
>>   for (j = 0; j < n; j++)
>>     A[j][i]++;
> I think this can be fixed by ordering the subscripts on the loop
> nesting.  This should be doable, since the add recurrences remember
> the loop whose induction variable they're representing.
Yes, we need to record the results of the individual subscript tests in
a vector indexed by the loop-nesting depth.
The order of the subscripts is irrelevant.
> Secondly, I think we can specialize on dealing only with linear add
> recurrences and constants (symbolic or actual constants) at this
> point.  Since LLVM represents indices like
>
> for (i = 0; i < n; i++)
>  for (j = 0; j < n; j++)
>    a[i + j]++
>
> as quite nicely as add recurrences of add recurrences we should be
> able to get quite far along with this approach.
Yes, this will be fine. Indeed, I don't think we'll ever be able to do
more.
>> And what's going to happen here?
>>
>> for (i = 0; i < n; i++)
>>     A[i+1][i+2] = A[i][i];
>
> I left out handling coupled subscripts at this point because ignoring
> them only results in more conservative safer inferences about
> dependence.  More sophisticated analysis like this can be added
> incrementally, I think.
Yes, though my point was that you are currently recording
two different results, when there's actually only one loop level.
>> And here?
>>
>> for (i = 0; i < n; i++)
>>   for (j = 0; j < n; j++)
>>     A[i] = A[i] + B[i][j];
>
> I'm not too clear about this.  Assuming A and B point to two different
> locations, there seems to be only one loop independent dependence --
> from A[i] to A[i].
Again, I'm trying to point out that the number and order of subscripts has
nothing to do with the loop levels. What we want to find here is that
there are 3 dependences:
1) a loop-carried anti dependence from the load of A to the store,
with a direction vector of [= S|<]
2) a loop-carried flow dependence from the store to the load of A,
with direction vector [= S|0]
3) a loop-carried output dependence from the store to the store, with
direction vector [= S|0]
(see Section 8.2.3 in Allen & Kennedy)
> I think the following would be a good start at modelling dependencies
> between two Values:
>
> class Dependence {
> public:
>  struct Level {
>    enum { LT  = 1, EQ  = 2, GT  = 4, ALL = 7 } direction;
>    bool scalar;
>    const SCEV *distance; // NULL implies no distance available
>    const Loop *loop;
>  };
What is the loop value used for?
Seems redundant, in that we can get to the statements given the
source and destination pointers, and from there to the loop nest.
The relative position in the loop nest is given by the index of
the levels vector.

>  enum Kind { Flow, Anti, Output, Input };
>
>  Kind getKind() const {
>    return kind;
>  }
>
>  const Value *getSource() const {
>    return source;
>  }
>
>  const Value *getDestination() const {
>    return destination;
>  }
>
>  typedef SmallVector<const Level *, 4>::const_iterator
const_iterator;
>  const_iterator begin() const { return levels.begin(); }
>  const_iterator end() const { return levels.end(); }
>
> private:
>  Value *source, *destination;
>  Kind kind;
>  SmallVector<const Level *, 4> levels;
I'd malloc an ordinary vector of the appropriate length,
since we know the length at allocation time.
>  friend class LoopDependenceAnalysis;
> };
>
> More methods, like `isConsistent` and `isConfused` can be added easily
> to this interface, since the raw data required to compute them is
> available.
You'll also need flags for loop-independent, consistent, and confused.
I thought earlier that the latter two could be computed, but I was wrong.
Can't compute them when the source or destination isn't in the loop.

When a dependence is completely confused, we needn't represent the
direction vector (what you call levels), so we can perhaps save a bit of
storage in that common case.
> Given two Values, then, we can first break them up into their
> respective indices.  We can then order them based on their loop
> nesting.  To actually compute the dependence, we can traverse down the
> subscript lists of the two Values and keep checking individual
> subscript pairs.
Sort of.  I would break them up into two lists if subscripts.
Don't sort these lists!  Instead, find the common nesting depth for
the 2 statements and allocate distance vectors of the appropriate length.
Then, as we get results from testing individual subscript pairs,
we record them in the direction vector at the appropriate level.

Initialize each level in the direction vector with ALL.
Set the Scalar flag for each level.

Imagine we're starting simple, with no attempt to take advantage of
coupled subscripts.
Examine each pair of subscripts in sequence.
If a pair is ZIV, we test hoping to prove independence.
If we can't, we move on to the next subscript.

If a pair is MIV, we ignore it and move on to the next subscript.

If a pair is SIV, we check to see if it's something we can handle
(strongSIV, weakzeroSIV, ...). If not, skip to next subscript.

If we can handle it, compute direction (if StrongSIV, compute distance).
Find the appropriate level based on the index variable, turn off the
Scalar flag,
and intersect the new direction with whatever is already stored there.
If the result is NONE, then there's no dependence and we're done.
If we have a distance, set it (if a distance already exists, compare to
see if they are the same. If not, then no dependence exists, and we're
done).
Probably don't bother to record symbolic distances.
Move on to the next subscript.

That's the general idea.
I'll spend some time over the next week or so
trying to get it written up in more detail on my
website: https://sites.google.com/site/parallelizationforllvm/dependence-test

Does this make sense.
If not, please get back to me.
> I'm not sure on how to deal with cases such as these:
>
> for (i = 0; i < n; i++) {
>  for (j = 0; j < n; j++) {
>    t = a[i][j];
>    a[i + 1][j + 1] = t;
>  }
>  for (j = 0; j < n; j++) {
>    t = a[i][j];
>    a[i + 1][j + 1] = t;
>  }
> }
>
> My gut feeling is that we should stop checking individual subscripts
> when the loop nesting have diverged.  But don't know how to concretely
> represent such dependencies.
We can compute dependence for imperfectly nested loops like this;
indeed, we must if we want to try and parallelize the outer loop.
But there's also the possibility of fusing the two inner loops.
If we want to do that (and we often do), then we need to look
for dependences between them.

Generally, dependences between 2 statements only have to do with
the common loops. So dependences between these two inner loops
should pay attention to subscripts that mention "i".
So we'd say that values flow from store in the first loop to
load in the second loop, with a distance vector [1|<]

However, in this special case, where we have two loops next to each
other with identical loop bounds, we'd like to fuse the loops, if it's
legal.
Often very useful as a way to save loads and cache misses.
Now someother part of the compiler would notice the potential
for fusing (the adjacent loops with identical headers), but it would rely
on the dependence analyzer to make sure fusion was legal, i.e., that
no dependences would be violated. To do so, we need to pretend
that the fusion has already taken place, and check subscripts for
both the i and j loops.

Does this make sense?

I'd arrange for it by providing an extra entry point with an extra parameter
specifying how many loop levels to check.

> I'm assuming in cases such as these:
>
> for (i = 0; i < n; i++)
>  for (j = 0; j < n; j++)
>    a[i][0] = a[j][i]
>
> we can have a dependence edge [S U | 0].
Scalar directions occur when a loop level isn't mentioned in any subscript.
For example,

for (i = 0; i < n; i++)
  for (j = 0; j < n; j++)
    a[i + 1] = a[i] + 1;

Here we see two dependences
1) a loop-carried flow dependence from the store to A and the load,
with dv of [1 S|0]
2) a loop-carried output dependence from the store to the store, with
dv of [0 S|0]
I don't think there's an anti dependence in this case.

Your example is more complicated. Let's look at it in detail

for (i = 0; i < n; i++)
 for (j = 0; j < n; j++)
   a[i][0] = a[j][i]

Comparing the store against itself, we find a loop-carried output
dependence with dv of [0 S|0].
Do you see how it happens? We have two stores with subscripts [i][0] and [i][0].
The j loop isn't mentioned, so its level is S.
The i loop is mentioned in the first subscript, so that level is 0 (or =).
And since we're talking about the same instruction, it can't be a
loop-independent dependence.

Comparing the load against the store, we see [j][i] and [i][0].
The subscripts are coupled, with an MIV and an SIV.
Looking at the SIV, we see that there can only be a dependence when i = 0.
Propagating that fact into the other subscript pair,
we find that the dependence can only exist when j = i = 0.
So there can't be a loop-carried dependence, so no flow dependence.
On the other hand, we can have an inconsistent loop-independent anti
dependence from the load to the store, with dv [0 0|<]

Preston

Maybe Matching Threads

Search for more reasonably related threads

llvm dev - Apr 2012 - [LLVMdev] SIV tests in LoopDependence Analysis, Sanjoy's patch

[LLVMdev] SIV tests in LoopDependence Analysis, Sanjoy's patch

[LLVMdev] SIV tests in LoopDependence Analysis, Sanjoy's patch

[LLVMdev] SIV tests in LoopDependence Analysis, Sanjoy's patch

Maybe Matching Threads