thr3ads.net - llvm dev - [LLVMdev] Multi-dimensional array accesses in LLVM-IR

If this information is useful, please help other people find it:
Share via:

Tobias Grosser

2012-Sep-12 08:38 UTC

[LLVMdev] Multi-dimensional array accesses in LLVM-IR | Thoughts

On 09/11/2012 03:38 PM, Hal Finkel wrote:> This is a general problem that also affects, for example, Preston's
> dependence-analysis work. How have you thought about dealing with this?
Hi Hal,

you are right. Preston's dependence-analysis and Polly are sharing the same
problem. As far as I know, non of us implemented a solution yet.
Preston, what is your status on multi-dim arrays?

As stated earlier the problem is that access functions to multi-dimensional
arrays
are linearized at LLVM-IR level. Unfortunately it is a lot harder to reason
about
linearized accesses. Hence, we want to recover the multi-dimensional access
function.

In LLVM-IR words: When we currently analyze a multi-dimensional
variable length memory access, we calculate a single SCEV expression
which describes the memory access. However, we would prefer to have for
each array dimension a separate SCEV* expression. Those individual expressions
will be a lot easier to analyze.

I think there are two major directions we can go:

1) Represent multi-dimensional access functions in LLVM-IR

Instead of linearizing multi-dimensional array accesses to single 
dimensional array accesses, we extend LLVM-IR such that it can express 
multi-dimensional accesses and we ensure that front-ends directly
generate LLVM-IR multi-dimensional accesses.

2) Recover multi-dimensionality from linearized accesses

We take a linearized single-dimensional SCEV* expression, apply some 
magic and recover both the number and size of the original array
dimensions as well as a set of SCEV* expressions describing the individual
access functions of the original dimensions.

Both approaches have advantages/disadvantages:

Approach 1)
   Pro:
	- Provides exact information about multi-dimensional arrays,
          if the multi-dimensionality is explicit in the source program

   Contra:
	- We need to explicitly express multi-dimensionality in LLVM-IR
	- Front-ends need to explicitly generate the multi-dimensional IR
	- Existing passes need to maintain and reason about the new IR
	- Only C99 supports variable length arrays. Most existing C/C++
	  programs have a manual implementation of variable length
	  arrays, which cannot be automatically translated by the front-end.

Approach 2)
   Pro:
	- Does not distinguish between manually simulated multi-dimensional
	  arrays and multi-dimensional arrays that have been explicitly
	  expressed in the source code.
	- Changes are limited to an LLVM-IR analysis

   Contra:
	- The general problem is difficult
	- We may need to add some run-time checks as statically proving
          the delinearization may not be entirely possible

I personally would first have a look at approach '2'. Access functions
of
multi-dimensional follow often a very regular structure, which appears not
only for accesses generated by compiler front-ends, but also for
manual implementations of multi-dimensional arrays. One common pattern
is that there is a single parameter that represents the size of a
dimension [1]. Meaning SCEV* expressions describing access to an array
"A[][m][p]" will contain the sizes '%p' and '%m * %p'
more or less explicit.

Here some examples I committed in r163619:

multidim_only_ivs_2d.ll:
; void foo(long n, long m, double A[n][m]) {
;   for (long i = 0; i < n; i++)
;     for (long j = 0; j < m; j++)
;       A[i][j] = 1.0;                                                          
; }         
; Access function: {{0,+,%m}<%for.i>,+,1}<nw><%for.j>

multidim_only_ivs_3d.ll:
; void foo(long n, long m, long o, double A[n][m][o]) {                         
;   for (long i = 0; i < n; i++)
;     for (long j = 0; j < m; j++)
;       for (long k = 0; k < o; k++)
;         A[i][j][k] = 1.0;                                                     
; }
; Access function: {{{0,+,(%m *
%o)}<%for.i>,+,%o}<%for.j>,+,1}<nw><%for.k>

multidim_ivs_and_integer_offsets_3d.ll:
; void foo(long n, long m, long o, double A[n][m][o]) {
;   for (long i = 0; i < n; i++)
;     for (long j = 0; j < m; j++)
;       for (long k = 0; k < o; k++)
;         A[i+3][j-4][k+7] = 1.0;                                               
; }                                                                             
;                                                                               
; Access function:
;   {{{(56 + (8 * (-4 + (3 * %m)) * %o) + %A),+,(8 * %m * %o)}<%for.i>,+,
;      (8 * %o)}<%for.j>,+,8}<%for.k>  

multidim_only_ivs_3d_cast.ll:
; void foo(int n, int m, int o, double A[n][m][o]) {                            
;   for (int i = 0; i < n; i++)
;     for (int j = 0; j < m; j++)
;       for (int k = 0; k < o; k++)
;         A[i][j][k] = 1.0;                                                     
; }                                                                             
;                                                                               
; Access function:                                                              
;   {{{%A,+,(8 * (zext i32 %m to i64) * (zext i32 %o to i64))}<%for.i>,+,
;    (8 * (zext i32 %o to i64))}<%for.j>,+,8}<%for.k>

multidim_ivs_and_parameteric_offsets_3d.ll:
; void foo(long n, long m, long o, double A[n][m][o], long p, long q, long r) { 
;                                                                               
;   for (long i = 0; i < n; i++)
;     for (long j = 0; j < m; j++)
;       for (long k = 0; k < o; k++)
;         A[i+p][j+q][k+r] = 1.0;                                               
; }                                                                             
;                                                                               
; Access function:                                                              
;    {{{((8 * ((((%m * %p) + %q) * %o) + %r)) + %A),+,(8 * %m *
%o)}<%for.i>,+,
;        (8 * %o)}<%for.j>,+,8}<%for.k> 

Those examples are obviously not exhaustive, but they already cover many common
cases. Analyzing them seems to be possible. As you may realize the size of the
array dimensions are all explicitly available in the 'step' of the
SCEVAddRec
repressions. Even in the last example which contains parameters both for the
sizes and the offsets, the parameters for the sizes are the only ones that
appear on the right hand sides ('8 * %m * %o' and '8 * %o'). 

I would guess that we could write a SCEVIterator, that analyzes the SCEVs and
guesses
the dimension sizes. After having written that, it should be possible to write
another SCEVIterator that given the dimension sizes can separate the different
dimensions of a SCEV.

I doubt we will always be able to prove that our guess is correct, but we could
add a run-time check to test the conditions that we can not prove statically.
This is
also the point, where I think the front-end (or the user with attributes) could
help.
Accesses could be annotated with meta-data providing the size of the array
dimensions,
such that our analysis can start from this meta-data. This could allow our
analysis
to take some short cuts and to avoid some run-time checks.

Does this sound like a reasonable approach? Any ideas or suggestions on that
topic?

Cheers
Tobi

[1] FORTRAN seems to express array size as "max(0, %dim_size)"

Armin Größlinger

2012-Sep-12 12:18 UTC

head link

[LLVMdev] Multi-dimensional array accesses in LLVM-IR | Thoughts

Hi,

I had a few thoughts about recovering multi-dimensional accesses in Polly and
maybe it's time to share them.

On 09/12/2012 10:38 AM, Tobias Grosser wrote:> [...] Even in the last example which contains parameters both for the
> sizes and the offsets, the parameters for the sizes are the only ones that
> appear on the right hand sides ('8 * %m * %o' and '8 *
%o').
I guess this is the key observation. When we have a multi-dimensional access,
the coefficients of the iterators have to form an increasing chain w.r.t.
divisibility. For the example
> multidim_ivs_and_integer_offsets_3d.ll:
> ; void foo(long n, long m, long o, double A[n][m][o]) {
> ;   for (long i = 0; i < n; i++)
> ;     for (long j = 0; j < m; j++)
> ;       for (long k = 0; k < o;  k++)
> ;         A[i+3][j-4][k+7] = 1.0;
> ; }
with access function
> ;   {{{(56 + (8 * (-4 + (3 * %m)) * %o) + %A),+,(8 * %m *
%o)}<%for.i>,+,
> ;      (8 * %o)}<%for.j>,+,8}<%for.k>
we have (writing the SCEV in ordinary notation and in a suggestive order)

   8*%o*%m * i  +  8*%o * j  +  8 * k  +  24*%m*%o - 32*%o + %A + 56

and we see that the coefficients of the iterators are 8, 8*%o, 8*%o*%m and
every coefficient in this chain divides it successor. By factoring these
coefficients out from all the term they divide, we find the subscripts for the
dimensions:

  8*%o*%m * (i+3)  +  8*%o * (j-4)  +  8 * (k+7)  +  %A

So A[i+4][j-4][k-7] is our candidate for a multi-dimensional access. The math
tool we'd need to implement this is (multivariate) polynomial division or
something similar (coefficients could be arbitrary polynomials, e.g., %o+5*%m+7
if somebody declares something like  double A[][o+5*m+7]  or even more complex
expressions).

Note that two iterators can have the same coefficients (e.g., for
A[i+j][.][.]) or one iterator can have two coefficients (e.g., A[k][k][.]). Or
a "coefficient" may actually be a constant (e.g., A[42][.][.]) or
missing in
one access (e.g., A[0][.][.]) so determining the coefficients is non-trivial
(but not too difficult I guess).
> I would guess that we could write a SCEVIterator, that analyzes the SCEVs
and guesses
> the dimension sizes. 
I'm not sure if this cane be done on the SCEV directly (instead of
polynomials)
for cases that are more complex than products of parameters but I haven't
thought about it thoroughly. OTOH, add recursions should be of the right
structure to check divisibility directly; I'm not sure without more
thinking...
> After having written that, it should be possible to write
> another SCEVIterator that given the dimension sizes can separate the
different
> dimensions of a SCEV.
Here we have to be careful because when there are several accesses to the same
array (base pointer) not only the coefficients (8, 8*%o, 8*%o*%m above) have to
match but also the range of the subscripts must be right (globally over all
arrays). For example with two accesses

  A[i+4][j-4][k-7]
  A[...][i-k][3*j]

we'd need to check that

   max{j-4, i-k}  -  min{j-4, i-k}   <=  %m - 1
   max{k-7, 3*j}  -  min{k-7, 3*j}   <=  %o - 1

so there is no (inconsistent) overrun between the dimensions (%m and %o are the
factors "between" the dimensions). Note that this also ensures that
the factors
between the dimensions are positive, i.e., no dimension is collapsed.

This check can, of course, be relaxed to checking

   0 <= j-4 <= %m - 1
   0 <= k-7 <= %o - 1

   0 <= i-k <= %m - 1
   0 <= 3*j <= %o - 1

(allowing for independent checks for each access) in the assumption that
subscripts start from 0 but I'm not sure how often this will hold on the IR
(after loop index and array base pointer shifts during optimization).

The problem here is that the factors between the dimensions (%m and %o here)
could be non-affine.
> I doubt we will always be able to prove that our guess is correct, but we
could
> add a run-time check to test the conditions that we can not prove
statically. This is
> also the point, where I think the front-end (or the user with attributes)
could help.
> Accesses could be annotated with meta-data providing the size of the array
dimensions,
> such that our analysis can start from this meta-data. This could allow our
analysis
> to take some short cuts and to avoid some run-time checks.
Makes sense. I think that it should not be too difficult to implement the above
approach for affine factors between the dimensions (which should cover the
common cases) and then we should evaluate how often we need run-time checks and
whether we need more information from the front-ends because other cases occur
in practice.

So far my thoughts for now.

Regards,
Armin

Hal Finkel

2012-Sep-12 14:27 UTC

head link

[LLVMdev] Multi-dimensional array accesses in LLVM-IR | Thoughts

On Wed, 12 Sep 2012 14:18:43 +0200
Armin Größlinger <armin.groesslinger at uni-passau.de> wrote:
> Hi,
> 
> I had a few thoughts about recovering multi-dimensional accesses in
> Polly and maybe it's time to share them.
> 
> On 09/12/2012 10:38 AM, Tobias Grosser wrote:
> > [...] Even in the last example which contains parameters both for
> > the sizes and the offsets, the parameters for the sizes are the
> > only ones that appear on the right hand sides ('8 * %m * %o'
and '8
> > * %o'). 
> 
> I guess this is the key observation. When we have a multi-dimensional
> access, the coefficients of the iterators have to form an increasing
> chain w.r.t. divisibility. For the example
> 
> > multidim_ivs_and_integer_offsets_3d.ll:
> > ; void foo(long n, long m, long o, double A[n][m][o]) {
> > ;   for (long i = 0; i < n; i++)
> > ;     for (long j = 0; j < m; j++)
> > ;       for (long k = 0; k < o;  k++)
> > ;         A[i+3][j-4][k+7] = 1.0;
> > ; }
> 
> with access function
> 
> > ;   {{{(56 + (8 * (-4 + (3 * %m)) * %o) + %A),+,(8 * %m *
> > %o)}<%for.i>,+, ;      (8 *
%o)}<%for.j>,+,8}<%for.k>
> 
> we have (writing the SCEV in ordinary notation and in a suggestive
> order)
> 
>    8*%o*%m * i  +  8*%o * j  +  8 * k  +  24*%m*%o - 32*%o + %A + 56
> 
> and we see that the coefficients of the iterators are 8, 8*%o,
> 8*%o*%m and every coefficient in this chain divides it successor. By
> factoring these coefficients out from all the term they divide, we
> find the subscripts for the dimensions:
> 
>   8*%o*%m * (i+3)  +  8*%o * (j-4)  +  8 * (k+7)  +  %A
> 
> So A[i+4][j-4][k-7] is our candidate for a multi-dimensional access.
> The math tool we'd need to implement this is (multivariate)
> polynomial division or something similar
Agreed. Furthermore, I already have code that does this (I've started
working on a transformation pass which re-factors and optimizes integer
polynomial expressions, so I can repurpose code from there). It sounds
like we just need to extract integer polynomials used by GEPs
containing induction variables, and, order-by-order, factor out the
non-induction variables. That will give us the 'size' of each
dimension. As you've noted below, we need to make sure that no index
exceeds its size, and that all accesses to an array within a given loop
are consistent. Does that sound reasonable?

 -Hal
> (coefficients could be
> arbitrary polynomials, e.g., %o+5*%m+7 if somebody declares something
> like  double A[][o+5*m+7]  or even more complex expressions).
> 
> Note that two iterators can have the same coefficients (e.g., for
> A[i+j][.][.]) or one iterator can have two coefficients (e.g.,
> A[k][k][.]). Or a "coefficient" may actually be a constant (e.g.,
> A[42][.][.]) or missing in one access (e.g., A[0][.][.]) so
> determining the coefficients is non-trivial (but not too difficult I
> guess).
> 
> > I would guess that we could write a SCEVIterator, that analyzes the
> > SCEVs and guesses the dimension sizes. 
> 
> I'm not sure if this cane be done on the SCEV directly (instead of
> polynomials) for cases that are more complex than products of
> parameters but I haven't thought about it thoroughly. OTOH, add
> recursions should be of the right structure to check divisibility
> directly; I'm not sure without more thinking...
> 
> > After having written that, it should be possible to write
> > another SCEVIterator that given the dimension sizes can separate
> > the different dimensions of a SCEV.
> 
> Here we have to be careful because when there are several accesses to
> the same array (base pointer) not only the coefficients (8, 8*%o,
> 8*%o*%m above) have to match but also the range of the subscripts
> must be right (globally over all arrays). For example with two
> accesses
> 
>   A[i+4][j-4][k-7]
>   A[...][i-k][3*j]
> 
> we'd need to check that
> 
>    max{j-4, i-k}  -  min{j-4, i-k}   <=  %m - 1
>    max{k-7, 3*j}  -  min{k-7, 3*j}   <=  %o - 1
> 
> so there is no (inconsistent) overrun between the dimensions (%m and
> %o are the factors "between" the dimensions). Note that this also
> ensures that the factors between the dimensions are positive, i.e.,
> no dimension is collapsed.
> 
> This check can, of course, be relaxed to checking
> 
>    0 <= j-4 <= %m - 1
>    0 <= k-7 <= %o - 1
> 
>    0 <= i-k <= %m - 1
>    0 <= 3*j <= %o - 1
> 
> (allowing for independent checks for each access) in the assumption
> that subscripts start from 0 but I'm not sure how often this will
> hold on the IR (after loop index and array base pointer shifts during
> optimization).
> 
> The problem here is that the factors between the dimensions (%m and
> %o here) could be non-affine.
> 
> > I doubt we will always be able to prove that our guess is correct,
> > but we could add a run-time check to test the conditions that we
> > can not prove statically. This is also the point, where I think the
> > front-end (or the user with attributes) could help. Accesses could
> > be annotated with meta-data providing the size of the array
> > dimensions, such that our analysis can start from this meta-data.
> > This could allow our analysis to take some short cuts and to avoid
> > some run-time checks.
> 
> Makes sense. I think that it should not be too difficult to implement
> the above approach for affine factors between the dimensions (which
> should cover the common cases) and then we should evaluate how often
> we need run-time checks and whether we need more information from the
> front-ends because other cases occur in practice.
> 
> So far my thoughts for now.
> 
> Regards,
> Armin


-- 
Hal Finkel
Postdoctoral Appointee
Leadership Computing Facility
Argonne National Laboratory

Renato Golin

2012-Sep-12 14:56 UTC

head link

[LLVMdev] Multi-dimensional array accesses in LLVM-IR | Thoughts

On 12 September 2012 09:38, Tobias Grosser <tobias at grosser.es>
wrote:> I personally would first have a look at approach '2'.
While I normally argue to leave the IR as it is (since it's a compiler
IR, not a magical one), I can see some trends going on that should not
be ignored.

This is one example, where the front-end bends its knees to generate
IR that LLVM understands, so that you can revert it to what should
have been in the first place (to analyse parallelism) and convert it
back again to "correct" IR. As you mentioned, there will be cases that
the analysis won't work, ex. when receiving an array from a function,
it could look like an opaque pointer in some architectures.

Last month, in the Cambridge LLVM Social, David Chisnall asked me
about a builder that would validate procedure call standards depending
on the target, so that the front-end could use that to build the
horrible messes we end up doing in that scenario. Another idea is to
create a pass that will convert from high-level function declaration
to low-level target-dependent declaration during the validation phase,
so the front-end produces "good-looking" calls (with types as-is) and
the pass makes them target-valid.

This technique could also apply to your case. If the front-end
supports multi-dimensional access and produces code conforming to
that, your pass can analyse and vectorize. If not, no worries, you
just ignore them. Which means not all front-ends must change at the
same time to accommodate your analysis.

Later on, a validation pass would transform all multi- to
single-dimensional array access and produce IR that the back-ends can
consume.

It might not be the easiest route, but would avoid some dead-ends down
the road...


-- 
cheers,
--renato

http://systemcall.org/

David Tweed

2012-Sep-12 16:56 UTC

head link

[LLVMdev] Multi-dimensional array accesses in LLVM-IR | Thoughts

| Here we have to be careful because when there are several accesses to the
same
| array (base pointer) not only the coefficients (8, 8*%o, 8*%o*%m above)
have to
| match but also the range of the subscripts must be right (globally over
all
| arrays). For example with two accesses
|
|  A[i+4][j-4][k-7]
|  A[...][i-k][3*j]
|
| we'd need to check that
|
|   max{j-4, i-k}  -  min{j-4, i-k}   <=  %m - 1
|   max{k-7, 3*j}  -  min{k-7, 3*j}   <=  %o - 1

An interjection from the peanut gallery...

One thing that would be nice would be a way to indicate in LLVM instructions
that _by construction_ a given variable will be a valid index for a given
dimension of a given array. This isn't so much use in C-style languages
where arrays don't have size data canonically associated with them, but for
languages with constructs like (in the simplest case)

for i=0:dimension(A,0): acc+=A[i]

or other more complicated "index generator" expressions like
generating
indices from stencils. At the moment (as I understand the LLVM IR) a
language can only specify that an entire linearised access expression is
within the linearised bounds. This would provide a way to hopefully boost
the "safe to apply" rate of LLVM optimisers like polly for languages
that
are amenable to such constructs. This presumably requires a way to link
array pointers, dimensions, bounds and indexes so it's not obvious how to do
it within the LLVM context of course... (There's also the issue of exposing
transitivity: the source language may know i is a by construction index into
A, and B is the same shape as A hence i is also guaranteed to be
within-bounds index into B.)

Just a vague thought,
Regards,
Dave

Hal Finkel

2012-Sep-12 22:27 UTC

head link

[LLVMdev] Multi-dimensional array accesses in LLVM-IR | Thoughts

On Wed, 12 Sep 2012 15:56:25 +0100
Renato Golin <rengolin at systemcall.org> wrote:
> On 12 September 2012 09:38, Tobias Grosser <tobias at grosser.es>
wrote:
> > I personally would first have a look at approach '2'.
> 
> While I normally argue to leave the IR as it is (since it's a compiler
> IR, not a magical one), I can see some trends going on that should not
> be ignored.
> 
> This is one example, where the front-end bends its knees to generate
> IR that LLVM understands, so that you can revert it to what should
> have been in the first place (to analyse parallelism) and convert it
> back again to "correct" IR. As you mentioned, there will be cases
that
> the analysis won't work, ex. when receiving an array from a function,
> it could look like an opaque pointer in some architectures.
I agree that this seems suboptimal: information known to the frontend
is lost, and then must be guessed by the backend. This might also be a
case where metadata is helpful.

I also think that it is important to keep in mind that this particular
case is one in which guessing is important. This is because there is a
lot of existing C/C++ code which does manual indexing of
multidimensional arrays, and we should try to support iteration-space
transformations involving those arrays as well.
> 
> Last month, in the Cambridge LLVM Social, David Chisnall asked me
> about a builder that would validate procedure call standards depending
> on the target, so that the front-end could use that to build the
> horrible messes we end up doing in that scenario. Another idea is to
> create a pass that will convert from high-level function declaration
> to low-level target-dependent declaration during the validation phase,
> so the front-end produces "good-looking" calls (with types as-is)
and
> the pass makes them target-valid.
> 
> This technique could also apply to your case. If the front-end
> supports multi-dimensional access and produces code conforming to
> that,
Do you mean that there is some kind of canonical form that all of the
frontends will use, and that LLVM provides some kind of utility for
generating this canonical form?

 -Hal
> your pass can analyse and vectorize. If not, no worries, you
> just ignore them. Which means not all front-ends must change at the
> same time to accommodate your analysis.
> 
> Later on, a validation pass would transform all multi- to
> single-dimensional array access and produce IR that the back-ends can
> consume.
> 
> It might not be the easiest route, but would avoid some dead-ends down
> the road...
> 
> 


-- 
Hal Finkel
Postdoctoral Appointee
Leadership Computing Facility
Argonne National Laboratory

Apparently Analagous Threads

Search for more apparently analagous threads

llvm dev - Sep 2012 - [LLVMdev] Multi-dimensional array accesses in LLVM-IR | Thoughts

[LLVMdev] Multi-dimensional array accesses in LLVM-IR | Thoughts

[LLVMdev] Multi-dimensional array accesses in LLVM-IR | Thoughts

[LLVMdev] Multi-dimensional array accesses in LLVM-IR | Thoughts

[LLVMdev] Multi-dimensional array accesses in LLVM-IR | Thoughts

[LLVMdev] Multi-dimensional array accesses in LLVM-IR | Thoughts

[LLVMdev] Multi-dimensional array accesses in LLVM-IR | Thoughts

Apparently Analagous Threads