thr3ads.net - llvm dev - [LLVMdev] Extending GetElementPointer, or Premature Linearization Considered Harmful [May 2012]

If this information is useful, please help other people find it:
Share via:

Preston Briggs

2012-May-04 01:12 UTC

[LLVMdev] Extending GetElementPointer, or Premature Linearization Considered Harmful

Is there any chance of replacing/extending the GEP instruction?

As noted in the GEP FAQ, GEPs don't support variable-length arrays; when
the front ends have to support VLAs, they linearize the subscript
expressions, throwing away information. The FAQ suggests that folks
interested in writing an analysis that understands array indices (I'm
thinking of dependence analysis) should be prepared to reverse-engineer the
linearization, but I don't believe it's possible to recover all the
possible subscripts, including some common and useful situations.

The FAQ suggests that one way to solve this problem is to use the SCEV
library which always represents the VLA and non-VLA index in the same
manner. I don't see it. Here's some code I wrote to explore how various
array references are compiled:

*bool preston::runOnFunction(Function &F) {*
*  errs() << "\nFunction: ";*
*  errs().write_escaped(F.getName()) << '\n';*
*  SE = &getAnalysis<ScalarEvolution>();*
*  for (inst_iterator ii = inst_begin(F), ie = inst_end(F); ii != ie; ++ii)
{*
*    Instruction *inst = &*ii;*
*    errs() << *inst << "\n";*
*    if (StoreInst *SI = dyn_cast<StoreInst>(inst)) {*
*      Value *operand = SI->getPointerOperand();*
*      if (const GEPOperator *GEP = dyn_cast<GEPOperator>(operand)) {*
*        for (GEPOperator::const_op_iterator idx = GEP->idx_begin(),*
*                                            end = GEP->idx_end();*
*             idx != end;** idx += 1) {*
*          const SCEV *scev = SE->getSCEV(*idx);*
*          errs() << *scev << "\n";*
*        }*
*      }*
*    }*
*  }*
*  return false;*
}


Basically, it zips though the instructions in a routine, dumping them out.
When it finds a store, it recovers the associated GEP and dumps its operand
SCEVs. To make it easy to understand, I typically write very simple test
cases, e.g.,

*void zap(long int n, long int A[]) {*
*  for (unsigned int i = 0; i < n; i++)*
*    A[1 + 2*i] = 0;*
*}*


In this case, we see

*%arrayidx = getelementptr inbounds i64* %A, i64 %add3*
*store i64 0, i64* %arrayidx, align 8*
*{1,+,2}<%for.body>*


which is easy enough.  If we have something more complex, like this

*void z1(long int n, long int A[][100][100]) {
**  for (long int i = 0; i < n; i++)
**    for (long int j = 0; j < n; j++)
**      for (long int k = 0; k < n; k++)
** **A[1 + 2*i][3 + 4*j][5 + 6*k] = 0;*
}


we'll see

*%arrayidx12 = getelementptr inbounds [100 x [100 x i64]]* %A, i64 %add109,
i64 %add88, i64 %add*
*store i64 0, i64* %arrayidx12, align 8*
*{1,+,2}<%for.cond1.preheader>*
*{3,+,4}<%for.cond4.preheader>*
*{5,+,6}<%for.body6>*


which looks great; 3 simple indices, no problem.
But consider this:

*void z2(long int n, long int A[][n][n][100][100]) {*
*  for (long int i = 0; i < n; i++)*
*    for (long int j = 0; j < n; j++)*
*      for (long int k = 0; k < n; k++)*
* **for (long int l = 0; l < n; l++)*
* **for (long int m = 0; m < n; m++)*
* **A[1 + 2*i][3 + 4*j][5 + 6*k][7 + 8*l][9 + 10*m] = 0;*
}


which produces

*%arrayidx24 = getelementptr inbounds [100 x [100 x i64]]* %A, i64
%arrayidx21.sum, i64 %add1411, i64 %add*
*store i64 0, i64* %arrayidx24, align 8*
*{{{(5 + ((3 + %n) * %n)),+,(2 * %n * %n)}<%for.cond1.preheader>,+,(4 *
%n)}<%for.cond4.preheader>,+,6}<%for.cond7.preheader>*
*{7,+,8}<%for.cond10.preheader>*
*{9,+,10}<%for.body12>*


This is more tedious. There are 2 easy indices hanging from the GEP, but
the rest are compressed into one SCEV. The upshot is: Whenever I look at a
memory reference, I need to attempt to delinearize the first SCEV (in the
order I've printed them) to look for other implied indices. (As a bonus,
delinearization would sometimes be able to untangle examples where humans
have linearized their own references, perhaps in antique C code, written
before VLAs.) Some, like the examples above can be handled; others seem
impossible, which is too bad.

Here are some examples that are hard to delinearize:

   - Coupled subscripts, where an index appears in multiple subscript
   positions, e.g., *A[i][i + j]*. Sad because the Delta test is designed
   to handle exactly these cases.
   - Non-linear subscripts, like *A[i][j*j]*.  While the *[j*j]* subscript
   is hard to analyze, we might be able to disprove the dependence using the
   simple *[i]* subscript.
   - Non-linear subscripts, like *A[i][B[j]]*. Again, it's tough to analyze
   the *[B[j]]* part, but we might be able to disprove the dependence using
   the simple *[i]* subscript.

It's also plausible to analyze pairs of non-linear subscripts, like *[1 +
2*B[i]]* and [*2*B[i]]*, easily proving there's no dependence despite our
lack of knowledge about *B[i]*.

So, ...
Perhaps we could consider a new variant of the GEP instruction that lets us
recover all the subscripts, without any loss of info, regardless of how
absurd they might appear. The current GEP allows many subscripts, but the
strides are encoded in the type. An alternative might support many
subscripts, each with an explicit stride, maybe constant, maybe a
parameter, maybe even the result of a computation. If we're clever, we
could even handle complex accesses resulting from structs of vectors of
structs of ...

I think such a change might have a fair amount of impact all through the
optimizer (thinking especially of strength reduction and
common-subexpression elimination), but we could also introduce a phase that
explicitly linearizes a complex GEP, yielding the current, simpler forms.
As long as such a linearization occurs after dependence analysis, no
information would be lost and the impact to the rest of the infrastructure
would be minimized.

Is such an idea completely out of the question?

Thanks,
Preston
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20120503/eddd021f/attachment.html>

Hongbin Zheng

2012-May-04 02:21 UTC

head link

[LLVMdev] Extending GetElementPointer, or Premature Linearization Considered Harmful

Hi Preston,

On Fri, May 4, 2012 at 9:12 AM, Preston Briggs <preston.briggs at
gmail.com> wrote:>
> which produces
>
> %arrayidx24 = getelementptr inbounds [100 x [100 x i64]]* %A, i64
> %arrayidx21.sum, i64 %add1411, i64 %add
> store i64 0, i64* %arrayidx24, align 8
> {{{(5 + ((3 + %n) * %n)),+,(2 * %n * %n)}<%for.cond1.preheader>,+,(4
* %n)}<%for.cond4.preheader>,+,6}<%for.cond7.preheader>This expression is not straight forward because llvm always fold the
loop invariant in the AddExpr into the AddRecExpr.
If I understand the AddRecExpr correctly, the above SCEV is equivalent to:
(5 + ((3 + %n) * %n)) + (2 * %n * %n) * {0,+,1}<%for.cond1.preheader>
+ (4 * %n) * {0,+,1}<%for.cond4.preheader> + 6 *
{0,+,1}<%for.cond7.preheader>
In the above example, you can treat {0,+,1}<%for.cond1.preheader>,
{0,+,1}<%for.cond4.preheader> and {0,+,1}<%for.cond7.preheader> as
the
(virtual) canonical induction variables of the corresponding loop,
which start from 0, and increased by 1 in every iteration.

In this case, you may need to introduce your own data structure to
represent linear expressions, which re-organize the sub-expressions in
SCEV in form of what you want. To construct the linear expressions,
you need to visit and analyze all sub-expressions in the SCEV, instead
of simply looking at the top-level SCEV. Maybe you could have a look
at the SCEVValidator[1], which check if a SCEV is affine, and and
SCEVAffinator[2], which build a kind of affine expression from SCEV,
all of them need to visit and analyze all sub-expressions in the SCEV.

best regards
ether

[1]http://llvm.org/viewvc/llvm-project/polly/trunk/lib/Support/SCEVValidator.cpp?view=markup
[2]http://llvm.org/viewvc/llvm-project/polly/trunk/lib/Analysis/ScopInfo.cpp?view=markup

Preston Briggs

2012-May-04 05:45 UTC

head link

[LLVMdev] Extending GetElementPointer, or Premature Linearization Considered Harmful

Hi Ether,
> Preston wrote:
> > %arrayidx24 = getelementptr inbounds [100 x [100 x i64]]* %A, i64
> > %arrayidx21.sum, i64 %add1411, i64 %add
> > store i64 0, i64* %arrayidx24, align 8
> > {{{(5 + ((3 + %n) * %n)),+,(2 * %n *
%n)}<%for.cond1.preheader>,+,(4 *
%n)}<%for.cond4.preheader>,+,6}<%for.cond7.preheader>
And Ether replied:> This expression is not straight forward because llvm always fold the
> loop invariant in the AddExpr into the AddRecExpr.
> If I understand the AddRecExpr correctly, the above SCEV is equivalent to:
> (5 + ((3 + %n) * %n)) + (2 * %n * %n) * {0,+,1}<%for.cond1.preheader>
> + (4 * %n) * {0,+,1}<%for.cond4.preheader> + 6 *
> {0,+,1}<%for.cond7.preheader>
> In the above example, you can treat {0,+,1}<%for.cond1.preheader>,
> {0,+,1}<%for.cond4.preheader> and {0,+,1}<%for.cond7.preheader>
as the
> (virtual) canonical induction variables of the corresponding loop,
> which start from 0, and increased by 1 in every iteration.
Well, these are two representations of the same thing.
I think they're both equally easy to interpret.
When we see an SECV like {a, +, b}<loop.i>, we can interpret it as [a
+ b*i] directly, or first rewrite it as a + b*{0, +, 1}, doesn't
matter.
> In this case, you may need to introduce your own data structure to
> represent linear expressions, which re-organize the sub-expressions in
> SCEV in form of what you want. To construct the linear expressions,
> you need to visit and analyze all sub-expressions in the SCEV, instead
> of simply looking at the top-level SCEV. Maybe you could have a look
> at the SCEVValidator[1], which check if a SCEV is affine, and and
> SCEVAffinator[2], which build a kind of affine expression from SCEV,
> all of them need to visit and analyze all sub-expressions in the SCEV.
I've written various bits of code that exhaustively explore SCEVs;
that's not the difficulty. The hard part is taking something like:

{{{((7 + (104 * %n)) * %n),+,((8 + (200 * %n)) *
%n)}<%for.cond1.preheader>,+,(305 * %n *
%n)}<%for.cond4.preheader>,+,(9 + (6 * %n * %n))}<%for.body6>

and figuring out the number of subscripts and what they might be.
I'm sure there are 3 indices, call them i, j, and k.
2 of the subscripts have bounds of n,
but figuring out that one has a bound of 100 seems hard.
And how many subscripts are there?

i appears in 2 subscripts, one with stride 8, but the other might be
any factor of 200.
j appears would seem to appear in 2 subscripts, but they're hard to prove.
k appears in 2 subscripts, one with stride 9 and one with stride 6.

Knowing somehow that there's a bound of 100 in one of the subscripts
would help a lot, but how to prove that?

Note that this is example is affine; it confuses us because some of
the subscripts are coupled.

Extending the GEP, my real goal, avoid this confusion.

Thanks,
Preston

Duncan Sands

2012-May-04 08:46 UTC

head link

[LLVMdev] Extending GetElementPointer, or Premature Linearization Considered Harmful

Hi Preston,
> As noted in the GEP FAQ, GEPs don't support variable-length arrays;
that's not quite right.  The problem is only with arrays of variable length
arrays, and more generally with arrays where the element type has variable
size (this occurs with Ada, which has all kinds of funky variable sized types,
for example).  Consider your examples:
>     *void zap(long int n, long int A[]) {*
>     *  for (unsigned int i = 0; i < n; i++)*
>     *    A[1 + 2*i] = 0;*
>     *}*
...>     *%arrayidx = getelementptr inbounds i64* %A, i64 %add3*
Here GEP has no problem with this variable sized array.
>     *void z1(long int n, long int A[][100][100]) {
>     **  for (long int i = 0; i < n; i++)
>     **    for (long int j = 0; j < n; j++)
>     **      for (long int k = 0; k < n; k++)
>     ****A[1 + 2*i][3 + 4*j][5 + 6*k] = 0;*
>     }
...>     *%arrayidx12 = getelementptr inbounds [100 x [100 x i64]]* %A, i64
%add109,
Here neither.
>     *void z2(long int n, long int A[][n][n][100][100]) { *
>     *  for (long int i = 0; i < n; i++) *
>     *    for (long int j = 0; j < n; j++) *
>     *      for (long int k = 0; k < n; k++) *
>     ***for (long int l = 0; l < n; l++) *
>     ***for (long int m = 0; m < n; m++) *
>     ***A[1 + 2*i][3 + 4*j][5 + 6*k][7 + 8*l][9 + 10*m] = 0; *
>     }
This is where you run into trouble, because this is an array with a variable
sized element type.

Currently GEP is designed so that the offset from the base pointer is an affine
function *with constant multipliers*, eg: 3*x + 10*y.  This is analytically a
completely different beast to an offset of eg: x * y.

Yes, this is a limitation, but x*y is fundamentally different to and harder than
3*x.

Ciao, Duncan.

Possibly Parallel Threads

Search for more reasonably related threads

llvm dev - May 2012 - [LLVMdev] Extending GetElementPointer, or Premature Linearization Considered Harmful

[LLVMdev] Extending GetElementPointer, or Premature Linearization Considered Harmful

[LLVMdev] Extending GetElementPointer, or Premature Linearization Considered Harmful

[LLVMdev] Extending GetElementPointer, or Premature Linearization Considered Harmful

[LLVMdev] Extending GetElementPointer, or Premature Linearization Considered Harmful

Possibly Parallel Threads