thr3ads.net - llvm dev - [llvm-dev] Delinearization validity checks in DependenceAnalysis [May 2019]

If this information is useful, please help other people find it:
Share via:

Bardia Mahjour via llvm-dev

2019-May-13 13:48 UTC

[llvm-dev] Delinearization validity checks in DependenceAnalysis

Hi all,

I have been looking at the `DependenceAnalysis` pass in
`llvm/include/llvm/Analysis/DependenceAnalysis.h`.
In order for this analysis to produce accurate dependence vectors for
multi-dimensional arrays in nested loops, it needs to "delinearize"
array
element accesses to recover the subscripts in each dimension of the array.
I believe that the current implementation of delinearization is based on
this paper:
http://web.cse.ohio-state.edu/~pouchet.2/doc/ics-article.15a.pdf.

This paper describes how to delinearize the subscripts, and as a last step
it requires certain conditions to be met in order to validate that the
delinearized indexes are correct. The `tryDelinearize` function in
`DependenceAnalysis.cpp` appears to be checking for cases where these
conditions can be validated *at compile time*:

```
  // Statically check that the array bounds are in-range. The first
subscript we
  // don't have a size for and it cannot overflow into another subscript,
so is
  // always safe. The others need to be 0 <= subscript[i] < bound, for
both
src
  // and dst.
  // FIXME: It may be better to record these sizes and add them as
constraints
  // to the dependency checks.
  for (int i = 1; i < size; ++i) {
    if (!isKnownNonNegative(SrcSubscripts[i], SrcPtr))
      return false;

    if (!isKnownLessThan(SrcSubscripts[i], Sizes[i - 1]))
      return false;

    if (!isKnownNonNegative(DstSubscripts[i], DstPtr))
      return false;

     if (!isKnownLessThan(DstSubscripts[i], Sizes[i - 1]))
      return false;
    }
```

The problem is that in a lot of cases these conditions cannot be proven
statically, even though the delinearized indexes are in fact correct. For
example consider this simple loop:

```
void foo(int n, int m, int a[][m]) {
  for (int i = 0; i < n-1; ++i)
    for (int j = 2; j < m; ++j) {
      a[i][j] = a[i+1][j-2];
    }
}

clang test.c -c -O3 -S -Xclang -disable-llvm-passes -emit-llvm
opt -mem2reg -instcombine -indvars -loop-simplify -loop-rotate -inline
-simplifycfg test.ll -S -o test.simp.ll
opt test.simp.ll -analyze -da
```

will produce:

```
da analyze - none!
da analyze - anti [* *|<]!
da analyze - output [* *]!
```

If the validity checks were not present, the result would be much more
accurate dependence vector with accurate dependence distances:

```
da analyze - none!
da analyze - consistent anti [1 -2]!
da analyze - none!
```

In my experience the static validity checks tend to fail in many common
cases (involving loops with symbolic bounds). Sometimes this happens
because SCEV is not able to simplify the expressions due to presence of
type casts and sign/zero extensions, but other times the conditions are
just not computable at compile-time.

So far I haven't been able to come up with an example where the validity
checks in the current implementation legitimately catch a case of invalid
delinearization. I've also disabled these checks and run some tests without
finding any failures (other than improvements in the dependence analysis
LIT tests).

I also had a quick look at Polly which benefits from delinearization. From
what I saw, a much more optimistic approach is taken where the validity
checks seem to be avoided.

My questions are:
1. Does anyone have more information about why these validity checks are
necessary in the current implementation, perhaps with some examples showing
an incorrect delinearization that's possible without those checks?
2. Are there any concerns with putting these validity checks under an
option so that we can more easily disable them and experiment? This could
also help us improve LIT tests, since some of them have been pessimised to
compensate for DA's inability to delinearize, and could fail to catch
regressions as a result of bad changes to the data dependence analysis.

Looking forward to your help on this.

Thank you,

Bardia Mahjour
Compiler Optimizations
Hydra Squad Lead
IBM Toronto Software Lab
bmahjour at ca.ibm.com (905) 413-2336

-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20190513/3607186d/attachment.html>

David Green via llvm-dev

2019-May-14 18:50 UTC

head link

[llvm-dev] Delinearization validity checks in DependenceAnalysis

Hello

Interestingly, the example you provide works for me so long as either it's a
32bit target, or the array bounds (n and m) are changed to unsigned.

For a bit of history, DA used to have a different delinearisation method based
on geps, but it was known to be invalid in same cases and was eventually
removed. There was no delinearisation for a while until these checks were fixed,
enough to be correct, but not as powerful as they could be. I believe Pollys
delinearisation is much more powerful, and can version loops with runtime
conditions.

IIRC, the checks are there for cases like this:
void foo(unsigned n, unsigned m, int *a) {
  for (int i = 0; i < n; ++i)
    for (int j = 0; j < m; ++j) {
      a[i*m + j] = a[i*m + j+2];
    }
}

The "j-2" can underflow into the previous i iterations space. So the
distance vector isn't [0 2] (and isn't even [= <]). Unfortunately
this is perfectly valid in C (I think for the int a[][m] case too).

See this comment for why they were needed and perhaps a better way to fix it:
https://github.com/llvm/llvm-project/commit/d143c65de3c884d09197da279d2f04f094efaf15#diff-57473b376037dd3637516348b9b02556L3274

Any improvements to the delinearisation code would be most welcome.
Dave




From: llvm-dev <llvm-dev-bounces at lists.llvm.org> on behalf of Bardia
Mahjour via llvm-dev <llvm-dev at lists.llvm.org>
Sent: 13 May 2019 14:48
To: llvm-dev at lists.llvm.org
Subject: [llvm-dev] Delinearization validity checks in DependenceAnalysis
 
Hi all,

I have been looking at the `DependenceAnalysis` pass in
`llvm/include/llvm/Analysis/DependenceAnalysis.h`.
In order for this analysis to produce accurate dependence vectors for
multi-dimensional arrays in nested loops, it needs to "delinearize"
array element accesses to recover the subscripts in each dimension of the array.
I believe that the current implementation of delinearization is based on this
paper: http://web.cse.ohio-state.edu/~pouchet.2/doc/ics-article.15a.pdf.

This paper describes how to delinearize the subscripts, and as a last step it
requires certain conditions to be met in order to validate that the delinearized
indexes are correct. The `tryDelinearize` function in `DependenceAnalysis.cpp`
appears to be checking for cases where these conditions can be validated *at
compile time*:

```
// Statically check that the array bounds are in-range. The first subscript we
// don't have a size for and it cannot overflow into another subscript, so
is
// always safe. The others need to be 0 <= subscript[i] < bound, for both
src
// and dst.
// FIXME: It may be better to record these sizes and add them as constraints
// to the dependency checks.
for (int i = 1; i < size; ++i) {
if (!isKnownNonNegative(SrcSubscripts[i], SrcPtr))
return false;

if (!isKnownLessThan(SrcSubscripts[i], Sizes[i - 1]))
return false;

if (!isKnownNonNegative(DstSubscripts[i], DstPtr))
return false;

if (!isKnownLessThan(DstSubscripts[i], Sizes[i - 1]))
return false;
}
```

The problem is that in a lot of cases these conditions cannot be proven
statically, even though the delinearized indexes are in fact correct. For
example consider this simple loop:

```
void foo(int n, int m, int a[][m]) {
for (int i = 0; i < n-1; ++i)
for (int j = 2; j < m; ++j) {
a[i][j] = a[i+1][j-2];
}
}

clang test.c -c -O3 -S -Xclang -disable-llvm-passes -emit-llvm
opt -mem2reg -instcombine -indvars -loop-simplify -loop-rotate -inline
-simplifycfg test.ll -S -o test.simp.ll
opt test.simp.ll -analyze -da
```

will produce:

```
da analyze - none!
da analyze - anti [* *|<]!
da analyze - output [* *]!
```

If the validity checks were not present, the result would be much more accurate
dependence vector with accurate dependence distances:

```
da analyze - none!
da analyze - consistent anti [1 -2]!
da analyze - none!
```

In my experience the static validity checks tend to fail in many common cases
(involving loops with symbolic bounds). Sometimes this happens because SCEV is
not able to simplify the expressions due to presence of type casts and sign/zero
extensions, but other times the conditions are just not computable at
compile-time.

So far I haven't been able to come up with an example where the validity
checks in the current implementation legitimately catch a case of invalid
delinearization. I've also disabled these checks and run some tests without
finding any failures (other than improvements in the dependence analysis LIT
tests).

I also had a quick look at Polly which benefits from delinearization. From what
I saw, a much more optimistic approach is taken where the validity checks seem
to be avoided.

My questions are:
1. Does anyone have more information about why these validity checks are
necessary in the current implementation, perhaps with some examples showing an
incorrect delinearization that's possible without those checks?
2. Are there any concerns with putting these validity checks under an option so
that we can more easily disable them and experiment? This could also help us
improve LIT tests, since some of them have been pessimised to compensate for
DA's inability to delinearize, and could fail to catch regressions as a
result of bad changes to the data dependence analysis.

Looking forward to your help on this.

Thank you,

Bardia Mahjour
Compiler Optimizations
Hydra Squad Lead
IBM Toronto Software Lab
bmahjour at ca.ibm.com (905) 413-2336

Bardia Mahjour via llvm-dev

2019-May-15 16:54 UTC

head link

[llvm-dev] Delinearization validity checks in DependenceAnalysis

Hi David,

Thank you very much for your response.

I also get correct results for my example (for a 64-bit target) if the
upper bounds are changed to unsigned. The reason is simply because clang
zero-extends `m` for address calculations but sign-extends it for the loop
upper bound. This prevents SCEV from canceling out the 'm' term from the
difference expression that looks like `(-3 + (sext i32 %m to i64) + (-1 *
(zext i32 %m to i64))<nsw>)`. While we cannot reduce expressions of this
form in general, it does pose a sever limitation for the vast majority of
cases where the loop upper bound is non-negative.

Regarding your example, I'm not sure I fully understand the concern and I
would appreciate it if you could clarify that for me. My understanding is
that if the intended shape of 'a' is in fact 'a[n][m]' then the
example, as
provided, has an out-of-bound access to start with. To avoid this out-bound
access one would need to change the upper bound of the j-loop to be
'm-2'.
Interestingly, if we do that, the current validity checks start to pass and
we get [0 2] as the dependence vector. Here's a slightly modified version
of your example that I tried:

```
typedef unsigned long long TT;
void foo(TT n, TT m, int *a) {
  for (TT i = 0; i < n; ++i)
    for (TT j = 0; j < m-2; ++j) {
      a[i*m + j] = a[i*m + j+2];
    }
}
```

If the concern is that the actual intended shape of the array may have been
something other than a[n][m], and that we are indexing it such that the
accesses are in-bound with respect to the memory of the whole array but not
with respect to individual dimensions, then I'm not sure we can reason
about *any* delinearization statically (except for the limited cases where
the bounds are compile-time known).

Am I misunderstanding the issue?

Bardia Mahjour
Compiler Optimizations
Hydra Squad Lead
IBM Toronto Software Lab
bmahjour at ca.ibm.com (905) 413-2336





From:	David Green <David.Green at arm.com>
To:	"llvm-dev at lists.llvm.org" <llvm-dev at lists.llvm.org>,
Bardia
            Mahjour <bmahjour at ca.ibm.com>
Cc:	nd <nd at arm.com>
Date:	2019/05/14 02:50 PM
Subject:	[EXTERNAL] Re: [llvm-dev] Delinearization validity checks in
            DependenceAnalysis



Hello

Interestingly, the example you provide works for me so long as either it's
a 32bit target, or the array bounds (n and m) are changed to unsigned.

For a bit of history, DA used to have a different delinearisation method
based on geps, but it was known to be invalid in same cases and was
eventually removed. There was no delinearisation for a while until these
checks were fixed, enough to be correct, but not as powerful as they could
be. I believe Pollys delinearisation is much more powerful, and can version
loops with runtime conditions.

IIRC, the checks are there for cases like this:
void foo(unsigned n, unsigned m, int *a) {
  for (int i = 0; i < n; ++i)
    for (int j = 0; j < m; ++j) {
      a[i*m + j] = a[i*m + j+2];
    }
}

The "j-2" can underflow into the previous i iterations space. So the
distance vector isn't [0 2] (and isn't even [= <]). Unfortunately
this is
perfectly valid in C (I think for the int a[][m] case too).

See this comment for why they were needed and perhaps a better way to fix
it:
https://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_llvm_llvm-2Dproject_commit_d143c65de3c884d09197da279d2f04f094efaf15-23diff-2D57473b376037dd3637516348b9b02556L3274&d=DwIFAw&c=jf_iaSHvJObTbx-siA1ZOg&r=aihobyOnVzXW7OPSK1-NiSYQkq7oP3ZSUVc4BemvrVo&m=46eKxI_sFjjeBzn7X-OLXSEUwHN-HVCD16TF9OuyIqc&s=adPvJDhPtFMlaTWihZmvWjXqFUFHDnzcV84oaDGlryM&e

Any improvements to the delinearisation code would be most welcome.
Dave




From: llvm-dev <llvm-dev-bounces at lists.llvm.org> on behalf of Bardia
Mahjour via llvm-dev <llvm-dev at lists.llvm.org>
Sent: 13 May 2019 14:48
To: llvm-dev at lists.llvm.org
Subject: [llvm-dev] Delinearization validity checks in DependenceAnalysis

Hi all,

I have been looking at the `DependenceAnalysis` pass in
`llvm/include/llvm/Analysis/DependenceAnalysis.h`.
In order for this analysis to produce accurate dependence vectors for
multi-dimensional arrays in nested loops, it needs to "delinearize"
array
element accesses to recover the subscripts in each dimension of the array.
I believe that the current implementation of delinearization is based on
this paper:
https://urldefense.proofpoint.com/v2/url?u=http-3A__web.cse.ohio-2Dstate.edu_-7Epouchet.2_doc_ics-2Darticle.15a.pdf&d=DwIFAw&c=jf_iaSHvJObTbx-siA1ZOg&r=aihobyOnVzXW7OPSK1-NiSYQkq7oP3ZSUVc4BemvrVo&m=46eKxI_sFjjeBzn7X-OLXSEUwHN-HVCD16TF9OuyIqc&s=nraVp9R56UlYi0We27kmGSjZGnM296r0HFpbRR62Fzs&e.

This paper describes how to delinearize the subscripts, and as a last step
it requires certain conditions to be met in order to validate that the
delinearized indexes are correct. The `tryDelinearize` function in
`DependenceAnalysis.cpp` appears to be checking for cases where these
conditions can be validated *at compile time*:

```
// Statically check that the array bounds are in-range. The first subscript
we
// don't have a size for and it cannot overflow into another subscript, so
is
// always safe. The others need to be 0 <= subscript[i] < bound, for both
src
// and dst.
// FIXME: It may be better to record these sizes and add them as
constraints
// to the dependency checks.
for (int i = 1; i < size; ++i) {
if (!isKnownNonNegative(SrcSubscripts[i], SrcPtr))
return false;

if (!isKnownLessThan(SrcSubscripts[i], Sizes[i - 1]))
return false;

if (!isKnownNonNegative(DstSubscripts[i], DstPtr))
return false;

if (!isKnownLessThan(DstSubscripts[i], Sizes[i - 1]))
return false;
}
```

The problem is that in a lot of cases these conditions cannot be proven
statically, even though the delinearized indexes are in fact correct. For
example consider this simple loop:

```
void foo(int n, int m, int a[][m]) {
for (int i = 0; i < n-1; ++i)
for (int j = 2; j < m; ++j) {
a[i][j] = a[i+1][j-2];
}
}

clang test.c -c -O3 -S -Xclang -disable-llvm-passes -emit-llvm
opt -mem2reg -instcombine -indvars -loop-simplify -loop-rotate -inline
-simplifycfg test.ll -S -o test.simp.ll
opt test.simp.ll -analyze -da
```

will produce:

```
da analyze - none!
da analyze - anti [* *|<]!
da analyze - output [* *]!
```

If the validity checks were not present, the result would be much more
accurate dependence vector with accurate dependence distances:

```
da analyze - none!
da analyze - consistent anti [1 -2]!
da analyze - none!
```

In my experience the static validity checks tend to fail in many common
cases (involving loops with symbolic bounds). Sometimes this happens
because SCEV is not able to simplify the expressions due to presence of
type casts and sign/zero extensions, but other times the conditions are
just not computable at compile-time.

So far I haven't been able to come up with an example where the validity
checks in the current implementation legitimately catch a case of invalid
delinearization. I've also disabled these checks and run some tests without
finding any failures (other than improvements in the dependence analysis
LIT tests).

I also had a quick look at Polly which benefits from delinearization. From
what I saw, a much more optimistic approach is taken where the validity
checks seem to be avoided.

My questions are:
1. Does anyone have more information about why these validity checks are
necessary in the current implementation, perhaps with some examples showing
an incorrect delinearization that's possible without those checks?
2. Are there any concerns with putting these validity checks under an
option so that we can more easily disable them and experiment? This could
also help us improve LIT tests, since some of them have been pessimised to
compensate for DA's inability to delinearize, and could fail to catch
regressions as a result of bad changes to the data dependence analysis.

Looking forward to your help on this.

Thank you,

Bardia Mahjour
Compiler Optimizations
Hydra Squad Lead
IBM Toronto Software Lab
bmahjour at ca.ibm.com (905) 413-2336





-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20190515/bba4fa17/attachment.html>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: graycol.gif
Type: image/gif
Size: 105 bytes
Desc: not available
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20190515/bba4fa17/attachment.gif>

Michael Kruse via llvm-dev

2019-May-22 16:25 UTC

head link

[llvm-dev] Delinearization validity checks in DependenceAnalysis

Am Mo., 13. Mai 2019 um 08:49 Uhr schrieb Bardia Mahjour via llvm-dev
<llvm-dev at lists.llvm.org>:> I have been looking at the `DependenceAnalysis` pass in
`llvm/include/llvm/Analysis/DependenceAnalysis.h`.
> In order for this analysis to produce accurate dependence vectors for
multi-dimensional arrays in nested loops, it needs to "delinearize"
array element accesses to recover the subscripts in each dimension of the array.
I believe that the current implementation of delinearization is based on this
paper: http://web.cse.ohio-state.edu/~pouchet.2/doc/ics-article.15a.pdf.
> 1. Does anyone have more information about why these validity checks are
necessary in the current implementation, perhaps with some examples showing an
incorrect delinearization that's possible without those checks?
Think of an array A[2][2] and two accesses:

A[1][0] = value;
use(A[0][2]);

The subscript of the second is out-of-range and will be evaluated to
the same address as A[1][0]. A dependence-analysis that assumes that
different subscripts means different addresses will not catch the
dependence between the two statements.

Polly implements the FIXME: It assumes that all accesses are
in-bounds, but will execute the original code if this is not the case.
A runtime check testing for the condition decides whether the
optimized (whose assumptions turned out to be wrong) or original code
is going to be executed.

> 2. Are there any concerns with putting these validity checks under an
option so that we can more easily disable them and experiment? This could also
help us improve LIT tests, since some of them have been pessimised to compensate
for DA's inability to delinearize, and could fail to catch regressions as a
result of bad changes to the data dependence analysis.
You can upload a patch for this and I don't see issues as long as by
default the safety checks are enabled. However, if you just want to
evaluate, why not just implement it in your local branch?


Michael

Apparently Analagous Threads

Search for more seemingly similar threads

llvm dev - May 2019 - Delinearization validity checks in DependenceAnalysis

[llvm-dev] Delinearization validity checks in DependenceAnalysis

[llvm-dev] Delinearization validity checks in DependenceAnalysis

[llvm-dev] Delinearization validity checks in DependenceAnalysis

[llvm-dev] Delinearization validity checks in DependenceAnalysis

Apparently Analagous Threads