thr3ads.net - llvm dev - [llvm-dev] [RFC] A new multidimensional array indexing intrinsic [Jul 2019]

If this information is useful, please help other people find it:
Share via:

Kaylor, Andrew via llvm-dev

2019-Jul-22 23:08 UTC

[llvm-dev] [RFC] A new multidimensional array indexing intrinsic

Is it even possible to implement the original proposal using an intrinsic?
I'm not sure how to interpret part of the description. It says the syntax
for the intrinsic would be this:

<result> = llvm.multidim.array.index.* <ty> <ty>*
<ptrval> {<stride>, <idx>}*

It isn't clear to me what that means. The later example is also in a
somewhat generalized form:

  %arrayidx = llvm.multidim.array.index.* i64 i64* %A, %str_1, %idx_1, %str_2,
%idx_2

Trying to expand this into something concrete it looks to me like the extra
value-less type argument ('i64' immediately following the intrinsic
name) won't work, and if I'm reading it correctly that's a necessary
element. The GEP instruction accepts a raw type as a pseudo-argument. It does
this to prepare for the coming transition to opaque pointers, and it can do it
because it's a first class instruction. However, an intrinsic can't do
this. And yet without it we won't know what to use for the GEP's source
element type "argument" after pointer's lose their pointee type.

-Andy

-----Original Message-----
From: llvm-dev <llvm-dev-bounces at lists.llvm.org> On Behalf Of Doerfert,
Johannes via llvm-dev
Sent: Monday, July 22, 2019 11:56 AM
To: Philip Reames <listmail at philipreames.com>
Cc: SAHIL GIRISH YERAWAR <cs15btech11044 at iith.ac.in>; llvm-dev
<llvm-dev at lists.llvm.org>; Michael Ferguson <mferguson at
cray.com>; Michael Kruse <llvm at meinersbur.de>; Tobias Grosser
<tobias.grosser at inf.ethz.ch>
Subject: Re: [llvm-dev] [RFC] A new multidimensional array indexing intrinsic

On 07/22, Philip Reames via llvm-dev wrote:> We could also simply extend the existing inrange mechanism to 
> non-constantexpr GEPs.  It would remove an inconsistency in the 
> semantics, be relatively straight forward, and solve the motivating
example.
Agreed. I'd very much prefer a GEP extension over an intrinsic solution.

--
  Johannes


> On 7/22/19 10:01 AM, Peter Collingbourne via llvm-dev wrote:
> > The restrictions of this new intrinsic seem very similar to those of 
> > the inrange attribute on GEPs. It seems that the main advantage of 
> > your proposal is that it would allow for non-constant strides (i.e. 
> > variable length arrays) in dimensions other than the first one. Do 
> > these appear frequently enough in the programs that you're 
> > interested in to be worth optimizing for?
> > 
> > Peter
> > 
> > On Sun, Jul 21, 2019 at 3:54 PM Siddharth Bhat via llvm-dev 
> > <llvm-dev at lists.llvm.org <mailto:llvm-dev at
lists.llvm.org>> wrote:
> > 
> >     Hello,
> > 
> >     We would like to begin discussions around a new set of intrinsics,
> >     to better express
> >     multi-dimensional array indexing within LLVM. The motivations and
> >     a possible design
> >     are sketched out below.
> > 
> >     Rendered RFC link here
> >     
> >
<https://github.com/bollu/llvm-multidim-array-indexing-proposal/blob
> > /master/RFC.md>
> > 
> > 
> >     Raw markdown:
> > 
> >     # Introducing a new multidimensional array indexing intrinsic
> > 
> >     ## The request for change: A new intrinsic,
`llvm.multidim.array.index.*`.
> > 
> >     We propose the addition of a new family of intrinsics,
`llvm.multidim.array.index.*`.
> >     This will allow us to represent array indexing into an array
`A[d1][d2][..][dk]`
> >     as `llvm.multidim.array.index.k A d1, d2, d3, ... dk` instead of
flattening
> >     the information into `gep A, d1 * n1 + d2 * n2 + ... + dk * nk`.
The former
> >     unflattened representation is advantageous for analysis and
optimisation. It also
> >     allows us to represent array indexing semantics of languages such
as Fortran
> >     and Chapel, which differs from that of C based array indexing.
> >     	
> >     ## Motivating the need for a multi-dimensional array indexing 
> > intrinsic
> > 
> >     There are primarily one of two views of array indexing involving
> >     multiple dimensions most languages take, which we discuss to
motivate
> >     the need for a multi-dimensional array index. This consideration
impacts
> >     the kinds of analysis we can perform on the program. In Polly, we
care about
> >     dependence analysis, so the examples here will focus on that
particular problem.
> > 
> >     Let us consider an array indexing operation of the form:
> >     ```cpp
> >     int ex1(int n, int m, B[n][m], int x1, int x2, int y1, int y2) {
> >     	__builtin_assume(x1 != x2);
> >     	__builtin_assume(y1 != y2);
> >     	B[x1][y1] = 1;
> >     	printf("%d", B[x2][y2]);
> >     	exit(0);
> >     }
> >     ```
> > 
> >     One would like to infer that the array indices _interpreted as
tuples_
> >     `(x1, y1)` and `(x2, y2)` do not have the same value, due to the
guarding asserts
> >     that `x1 != x2` and `y1 != y2`. As a result, the write `B[x1][y1]
= 1` can
> >     in no way interfere with the value of `B[x2][y2]`. Consquently,
> >     we can optimise the program into:
> > 
> > 
> >     ```cpp
> >     int ex1_opt(int n, int m, B[n][m], int x1, int x2, int y1, int y2)
{
> >          // B[x1][y1] = 1 is omitted because the result
> >          // cannot be used:
> >          //  It is not used in the print and then the program exits
> >     	printf("%d", B[x2][y2]);
> >     	exit(0);
> >     }
> >     ```
> > 
> >     However, alas, this is illegal, for the C language does not
provide
> >     semantics that allow the final inference above. It is conceivable
that
> >     `x1 != x2, y1 != y2`, but the indices do actually alias, since
> >     according to C semantics, the two indices alias if the _flattened
> >     representation of the indices alias_. Consider the parameter
> >     values:
> > 
> >     ```
> >     n = m = 3
> >     x1 = 1, y1 = 0; B[x1][y1] = nx1+y1 = 3*1+0=3
> >     x2 = 0, y2 = 3; B[x2][y2] = nx2+y2 = 3*0+3=3
> >     ```
> > 
> >     Hence, the array elements `B[x1][y1]` and `B[x2][y2]` _can alias_,
and
> >     so the transformation proposed in `ex1_opt` is unsound in general.
> > 
> > 
> >     In contrast, many langagues other than C require that index
> >     expressions for multidimensional arrays have each component within
the
> >     array dimension for that component. As a result, in the example
above,
> >     the index pair `(0,3)` would be out-of-bounds. In languages with
these
> >     semantics, one can infer that the indexing:
> > 
> >     `[x1][y1] != [x2][y2]` iff `x1 != x2 || y1 != y2`.
> > 
> >     While this particular example is not very interesting, it shows
the
> >     spirit of the lack of expressiveness in LLVM we are trying to
> >     improve.
> > 
> >     Julia, Fortran, and Chapel are examples of such languages which
target
> >     LLVM.
> > 
> >     Currently, the LLVM semantics of `getelementptr` talk purely of
> >     the C style flattened views of arrays. This inhibits the ability
of the optimiser
> >     to understand the multidimensional examples as given above, and we
> >     are forced to make conservative assumptions, inhibiting
optimisation.
> >     This information must ideally be expressible in LLVM, such that
LLVM's optimisers and
> >     alias analysis can use this information to model
multidimensional-array
> >     semantics.
> > 
> >     There is a more realistic (and involved) example in [Appendix
A](#Appendix-A)
> >     in the same spirit as the above simple example, but one a compiler
might
> >     realistically wish to perform.
> > 
> >     ## Evaluation of the impact of the intrinsic on accuracy of 
> > dependence analysis
> > 
> >     - This has been implemented in an exprimental branch of Polly, and
was used
> >     on the COSMO climate weather model. This greatly helped increase
the accuracy
> >     of Polly's analysis, since we eliminated the guessing game
from the array analysis.
> > 
> >     - This has also been implemented as part of a GSoC effort to unify
> >     Chapel and Polly.
> > 
> >     - Molly, the distributed version of Polly written by Michael Kruse
for his PhD
> >        also implemented a similar scheme. In his use-case, optimistic
run-time checks
> >        with delinearization was not possible, so this kind of
intrinsic was _necessary_
> >        for the application, not just _good to have_. More details are
available
> >        in his PhD thesis: [Lattice QCD Optimization and Polytopic
Representations of Distributed
Memory](https://tel.archives-ouvertes.fr/tel-01078440).
> >        In particular, Chapter 9 contains a detailed discussion.
> > 
> >     # Representations
> > 
> >     ## Intrinsic
> > 
> >     ### Syntax
> >     ```
> >     <result> = llvm.multidim.array.index.* <ty>
<ty>* <ptrval> {<stride>, <idx>}*
> >     ```
> > 
> >     ### Overview:
> > 
> >     The `llvm.multidim.array.index.*` intrinsic is used to get the
address of
> >     an element from an array. It performs address calcuation only and
> >     does not access memory. It is similar to `getelementptr`. However,
it imposes
> >     additional semantics which allows the optimiser to provide better
optimisations
> >     than `getlementptr`.
> > 
> > 
> > 
> >     ### Arguments:
> > 
> >     The first argument is always a type used as the basis for the
calculations. The
> >     second argument is always a pointer, and is the base address to
start the
> >     calculation from. The remaining arguments are a list of pairs.
Each pair
> >     contains a dimension stride, and an offset with respect to that
stride.
> > 
> > 
> >     ### Semantics:
> > 
> >     `llvm.multidim.array.index.*` represents a multi-dimensional array
index, In particular, this will
> >     mean that we will assume that all indices `<idx_i>` are
non-negative.
> > 
> >     Additionally, we assume that, for each `<str_i>`,
`<idx_i>` pair, that
> >     `0 <= idx_i < str_i`.
> > 
> >     Optimizations can assume that, given two
llvm.multidim.array.index.* instructions with matching types:
> > 
> >     ```
> >     llvm.multidim.array.index.* <ty> <ty>* <ptrvalA>
<strA_1>, <idxA_1>, ..., <strA_N>, <idxA_N>
> >     llvm.multidim.array.index.* <ty> <ty>* <ptrvalB>
<strB_1>, <idxB_1>, ..., <strB_N>, <idxb_N>
> >     ```
> > 
> >     If `ptrvalA == ptrvalB` and the strides are equal `(strA_1 ==
strB_1 && ... && strA_N == strB_N)` then:
> > 
> >     - If all the indices are equal (that is, `idxA_1 == idxB_1
&& ... && idxA_N == idxB_N`),
> >        then the resulting pointer values are equal.
> > 
> >     - If any index value is not equal (that is, there exists an `i`
such that `idxA_i != idxB_i`),
> >        then the resulting pointer values are not equal.
> > 
> > 
> >     ##### Address computation:
> >     Consider an invocation of `llvm.multidim.array.index.*`:
> > 
> >     ```
> >     <result> = call @llvm.multidim.array.index.* <ty>
<ty>* <ptrval> <str_0>, <idx_0>, <str_1>
<idx_1>, ..., <str_n> <idx_n>
> >     ```
> > 
> >     If the pairs are denoted by `(str_i, idx_i)`, where `str_i`
denotes the stride
> >     and `idx_i` denotes the index of the ith pair, then the final
address (in bytes)
> >     is computed as:
> > 
> >     ```
> >     ptrval + len(ty) * [(str_0 * idx_0) + (str_1 * idx_1) + ... (str_n
* idx_n)]
> >     ```
> > 
> >     ## Transitioning to `llvm.multidim.array.index.*`: Allow
`multidim_array_index` to refer to a GEP instruction:
> > 
> >     This is a sketch of how we might gradually introduce the
`llvm.multidim.array.index.*`
> >     intrinsic into LLVM without immediately losing the analyses
> >     that are performed on `getelememtptr` instructions. This section
> >     lists out some possible choices that we have, since the authors
> >     do not have a "best" solution.
> > 
> >     ##### Choice 1: Write a `llvm.multidim.array.index.*` to `GEP` 
> > pass, with the `GEP` annotated with metadata
> > 
> >     This pass will flatten all `llvm.multidim.array.index.*`
expressions into a `GEP` annotated with metadata. This metadata will indicate
that the index expression computed by the lowered GEP is guaranteed to be in a
canonical form which allows the analysis
> >     to infer stride and index sizes.
> > 
> >     A multidim index of the form:
> >     ```
> >     %arrayidx = llvm.multidim.array.index.* i64 i64* %A, %str_1,
%idx_1, %str_2, %idx_2
> >     ```
> > 
> >     is lowered to:
> > 
> >     ```
> >     %mul1 = mul nsw i64 %str_1, %idx_1
> >     %mul2 = mul1 nsw i64 %str_2, %idx_2
> >     %total = add nsw i64 %mul2, %mul1
> >     %arrayidx = getelementptr inbounds i64, i64* %A, i64 %total,
!multidim !1
> >     ```
> >     with guarantees that the first term in each multiplication is the
stride
> >     and the second term in each multiplication is the index. (What
happens
> >     if intermediate transformation passes decide to change the order?
This seems
> >     complicated).
> > 
> >     **TODO:** Lookup how to attach metadata such that the metadata can
communicate
> >     which of the values are strides and which are indeces
> > 
> > 
> > 
> >     # Caveats
> > 
> >     Currently, we assume that the array shape is immutable. However,
we will need to deal with
> >     being able to express `reshape` like primitives where the array
shape can be mutated. However,
> >     this appears to make expressing this information quite difficult:
We now need to attach the shape
> >     information to an array per "shape-live-range".
> > 
> > 
> >     ## Appendix-A
> > 
> >     ##### A realistic, more involved example of dependence analysis 
> > going wrong
> > 
> >     ```cpp
> >     // In an array A of size (n0 x n1),
> >     // fill a subarray of size (s0 x s1)
> >     // which starts at an offset (o0 x o1) in the larger array A
> >     void set_subarray(unsigned n0, unsigned n1,
> >     	unsigned o0, unsigned o1,
> >     	unsigned s0, unsigned s1,
> >     	float A[n0][n1]) {
> >     	for (unsigned i = 0; i < s0; i++)
> >     		for (unsigned j = 0; j < s1; j++)
> >     				S: A[i + o0][j + o1] = 1;
> >     }
> >     ```
> >     We first reduce this index expression to a sum of products:
> > 
> >     ```
> >     (i + o0) * n1 + (j + o1) = n1i + n1o0 + j + o1
> >     ix(i, j, n0, n1, o0, o1) = n1i + n1o0 + j + o1
> >     ```
> > 
> >     `ix` is the index expression which `LLVM` will see, since it is
fully
> >     flattened, in comparison with the multi-dimensional index
expression
> >     `index:[i + o0][j + o1] | sizes:[][n1]`.
> > 
> >     We will now show _why_ this multi-dimensional index is not always
correct,
> >     and why guessing for one language will not work for another:
> > 
> >     Consider a call `set_subarray(n0=8, n1=9, o0=4, o1=6, s0=3,
s1=6)`. At face
> >     value, this is incorrect if we view the array as 2D grid, since
the size
> >     of the array is `(n0, n1) = (8, 9)`, but we are writing an array
of size
> >     `(s0, s1) = (3, 6)` starting from `(o0, o1) = (4, 6)`. Clearly, we
will
> >     exceed the width of the array, since `(s1 + o1 = 6 + 6 = 12) >
(n1 = 9)`.
> >     However, now think of the array as a flattened 1D representation.
In this
> >     case, the total size of the array is `n1xn2 = 8x9 = 72`, while the
largest
> >     element we will access is at the largest value of `(i, j)`. That
is,
> >     `i = s0 - 1 = 2`, and `j = s1 - 1 = 5`.
> > 
> >     The largest index will be `ix(i=2, j=5, n0=8, n1=9, o0=4, o1=6) =
8*2+8*4+5+6=59`.
> >     Since `59 < 72`, we are clearly at _legal_ array indices, by C
semantics!
> > 
> >     The definition of the semantics of the language **changed the
illegal
> >     multidimensional access** (which is illegal since it exceeds the
`n1`
> >     dimension), into a **legal flattened 1D access** (which is legal
since the
> >     flattened array indices are inbounds).
> > 
> >     LLVM has no way of expressing these two different semantics.
Hence, we are
> >     forced to:
> >     1. Consider flattened 1D accesses, which makes analysis of index
expressions
> >     equivalent to analysis of polynomials over the integers, which is
hard.
> >     2. Guess multidimensional representations, and use them at the
expense of
> >     soundness bugs as shown above.
> >     3. Guess multidimensional representations, use them, and check
their validity
> >     at runtime, causing a runtime performance hit. This implementation
follows
> >     the description from the paper [Optimistic Delinearization of
Parametrically Sized Arrays](optim-delin-parametrically-sized-arrays).
> > 
> > 
> >     Currently, Polly opts for option (3), which is to emit runtime
checks. If
> >     the run-time checks fail, then Polly will not run its optimised
code. Instead,
> >     It keeps a copy of the unoptimised code around, which is run in
this case.
> >     Note that this effectively doubles the amount of
performance-sensitive code
> >     which is finally emitted after running Polly.
> > 
> >     Ideally, we would like a mechanism to directly express the
multidimensional
> >     semantics, which would eliminate this kind of guesswork from
Polly/LLVM,
> >     which would both make code faster, and easier to analyze.
> > 
> >     ## References
> >     - [The chapel language
specification](https://chapel-lang.org/docs/1.13/_downloads/chapelLanguageSpec.pdf)
> >     - [Fortran 2003
standard](http://www.j3-fortran.org/doc/year/04/04-007.pdf}{Fortran 
<http://www.j3-fortran.org/doc/year/04/04-007.pdf%7D%7BFortran>  2003
standard)
> >     - [C++ subscripting](http://eel.is/c++draft/expr.sub)
> >     - [Michael Kruse's PhD thesis: Lattice QCD Optimization and
Polytopic Representations of Distributed
Memory](https://tel.archives-ouvertes.fr/tel-01078440).
> >     - [Molly source code link for the
intrinsic](https://github.com/Meinersbur/llvm/blob/molly/include/llvm/IR/IntrinsicsMolly.td#L3)
> >     -[Optimistic Delinearization of Parametrically Sized 
> > Arrays](optim-delin-parametrically-sized-arrays)
> > 
> >     
> > [optim-delin-parametrically-sized-arrays]:http://delivery.acm.org/10
> >
.1145/2760000/2751248/p351-grosser.pdf?ip=93.3.109.183&id=2751248&ac
> >
c=CHORUS&key=4D4702B0C3E38B35%2E4D4702B0C3E38B35%2E4D4702B0C3E38B35%
> >
2E6D218144511F3437&__acm__=1557948443_2f56675c6d04796f27b84593535c9f
> > 70
> > 
> > 
> >     _______________________________________________
> >     LLVM Developers mailing list
> >     llvm-dev at lists.llvm.org <mailto:llvm-dev at
lists.llvm.org>
> >     https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
> > 
> > 
> > 
> > --
> > --
> > Peter
> > 
> > _______________________________________________
> > LLVM Developers mailing list
> > llvm-dev at lists.llvm.org
> > https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev

-- 

Johannes Doerfert
Researcher

Argonne National Laboratory
Lemont, IL 60439, USA

jdoerfert at anl.gov

Michael Kruse via llvm-dev

2019-Jul-22 23:49 UTC

head link

[llvm-dev] [RFC] A new multidimensional array indexing intrinsic

Intrinsics can return `llvm_any_ty` (Intrinsics.td). In that case the
return type is added as a suffix to the intrinsic's name, i.e. the
syntax in the RFC is not 100% the syntax for intrinsics. Same for the
parameters which each must have their types explicitly mentioned.

Michael

Am Mo., 22. Juli 2019 um 19:08 Uhr schrieb Kaylor, Andrew
<andrew.kaylor at intel.com>:>
> Is it even possible to implement the original proposal using an intrinsic?
I'm not sure how to interpret part of the description. It says the syntax
for the intrinsic would be this:
>
> <result> = llvm.multidim.array.index.* <ty> <ty>*
<ptrval> {<stride>, <idx>}*
>
> It isn't clear to me what that means. The later example is also in a
somewhat generalized form:
>
>   %arrayidx = llvm.multidim.array.index.* i64 i64* %A, %str_1, %idx_1,
%str_2, %idx_2
>
> Trying to expand this into something concrete it looks to me like the extra
value-less type argument ('i64' immediately following the intrinsic
name) won't work, and if I'm reading it correctly that's a necessary
element. The GEP instruction accepts a raw type as a pseudo-argument. It does
this to prepare for the coming transition to opaque pointers, and it can do it
because it's a first class instruction. However, an intrinsic can't do
this. And yet without it we won't know what to use for the GEP's source
element type "argument" after pointer's lose their pointee type.
>
> -Andy
>
> -----Original Message-----
> From: llvm-dev <llvm-dev-bounces at lists.llvm.org> On Behalf Of
Doerfert, Johannes via llvm-dev
> Sent: Monday, July 22, 2019 11:56 AM
> To: Philip Reames <listmail at philipreames.com>
> Cc: SAHIL GIRISH YERAWAR <cs15btech11044 at iith.ac.in>; llvm-dev
<llvm-dev at lists.llvm.org>; Michael Ferguson <mferguson at
cray.com>; Michael Kruse <llvm at meinersbur.de>; Tobias Grosser
<tobias.grosser at inf.ethz.ch>
> Subject: Re: [llvm-dev] [RFC] A new multidimensional array indexing
intrinsic
>
> On 07/22, Philip Reames via llvm-dev wrote:
> > We could also simply extend the existing inrange mechanism to
> > non-constantexpr GEPs.  It would remove an inconsistency in the
> > semantics, be relatively straight forward, and solve the motivating
example.
>
> Agreed. I'd very much prefer a GEP extension over an intrinsic
solution.
>
> --
>   Johannes
>
>
>
> > On 7/22/19 10:01 AM, Peter Collingbourne via llvm-dev wrote:
> > > The restrictions of this new intrinsic seem very similar to those
of
> > > the inrange attribute on GEPs. It seems that the main advantage
of
> > > your proposal is that it would allow for non-constant strides
(i.e.
> > > variable length arrays) in dimensions other than the first one.
Do
> > > these appear frequently enough in the programs that you're
> > > interested in to be worth optimizing for?
> > >
> > > Peter
> > >
> > > On Sun, Jul 21, 2019 at 3:54 PM Siddharth Bhat via llvm-dev
> > > <llvm-dev at lists.llvm.org <mailto:llvm-dev at
lists.llvm.org>> wrote:
> > >
> > >     Hello,
> > >
> > >     We would like to begin discussions around a new set of
intrinsics,
> > >     to better express
> > >     multi-dimensional array indexing within LLVM. The motivations
and
> > >     a possible design
> > >     are sketched out below.
> > >
> > >     Rendered RFC link here
> > >
> > >
<https://github.com/bollu/llvm-multidim-array-indexing-proposal/blob
> > > /master/RFC.md>
> > >
> > >
> > >     Raw markdown:
> > >
> > >     # Introducing a new multidimensional array indexing intrinsic
> > >
> > >     ## The request for change: A new intrinsic,
`llvm.multidim.array.index.*`.
> > >
> > >     We propose the addition of a new family of intrinsics,
`llvm.multidim.array.index.*`.
> > >     This will allow us to represent array indexing into an array
`A[d1][d2][..][dk]`
> > >     as `llvm.multidim.array.index.k A d1, d2, d3, ... dk` instead
of flattening
> > >     the information into `gep A, d1 * n1 + d2 * n2 + ... + dk *
nk`. The former
> > >     unflattened representation is advantageous for analysis and
optimisation. It also
> > >     allows us to represent array indexing semantics of languages
such as Fortran
> > >     and Chapel, which differs from that of C based array
indexing.
> > >
> > >     ## Motivating the need for a multi-dimensional array indexing
> > > intrinsic
> > >
> > >     There are primarily one of two views of array indexing
involving
> > >     multiple dimensions most languages take, which we discuss to
motivate
> > >     the need for a multi-dimensional array index. This
consideration impacts
> > >     the kinds of analysis we can perform on the program. In
Polly, we care about
> > >     dependence analysis, so the examples here will focus on that
particular problem.
> > >
> > >     Let us consider an array indexing operation of the form:
> > >     ```cpp
> > >     int ex1(int n, int m, B[n][m], int x1, int x2, int y1, int
y2) {
> > >             __builtin_assume(x1 != x2);
> > >             __builtin_assume(y1 != y2);
> > >             B[x1][y1] = 1;
> > >             printf("%d", B[x2][y2]);
> > >             exit(0);
> > >     }
> > >     ```
> > >
> > >     One would like to infer that the array indices _interpreted
as tuples_
> > >     `(x1, y1)` and `(x2, y2)` do not have the same value, due to
the guarding asserts
> > >     that `x1 != x2` and `y1 != y2`. As a result, the write
`B[x1][y1] = 1` can
> > >     in no way interfere with the value of `B[x2][y2]`.
Consquently,
> > >     we can optimise the program into:
> > >
> > >
> > >     ```cpp
> > >     int ex1_opt(int n, int m, B[n][m], int x1, int x2, int y1,
int y2) {
> > >          // B[x1][y1] = 1 is omitted because the result
> > >          // cannot be used:
> > >          //  It is not used in the print and then the program
exits
> > >             printf("%d", B[x2][y2]);
> > >             exit(0);
> > >     }
> > >     ```
> > >
> > >     However, alas, this is illegal, for the C language does not
provide
> > >     semantics that allow the final inference above. It is
conceivable that
> > >     `x1 != x2, y1 != y2`, but the indices do actually alias,
since
> > >     according to C semantics, the two indices alias if the
_flattened
> > >     representation of the indices alias_. Consider the parameter
> > >     values:
> > >
> > >     ```
> > >     n = m = 3
> > >     x1 = 1, y1 = 0; B[x1][y1] = nx1+y1 = 3*1+0=3
> > >     x2 = 0, y2 = 3; B[x2][y2] = nx2+y2 = 3*0+3=3
> > >     ```
> > >
> > >     Hence, the array elements `B[x1][y1]` and `B[x2][y2]` _can
alias_, and
> > >     so the transformation proposed in `ex1_opt` is unsound in
general.
> > >
> > >
> > >     In contrast, many langagues other than C require that index
> > >     expressions for multidimensional arrays have each component
within the
> > >     array dimension for that component. As a result, in the
example above,
> > >     the index pair `(0,3)` would be out-of-bounds. In languages
with these
> > >     semantics, one can infer that the indexing:
> > >
> > >     `[x1][y1] != [x2][y2]` iff `x1 != x2 || y1 != y2`.
> > >
> > >     While this particular example is not very interesting, it
shows the
> > >     spirit of the lack of expressiveness in LLVM we are trying to
> > >     improve.
> > >
> > >     Julia, Fortran, and Chapel are examples of such languages
which target
> > >     LLVM.
> > >
> > >     Currently, the LLVM semantics of `getelementptr` talk purely
of
> > >     the C style flattened views of arrays. This inhibits the
ability of the optimiser
> > >     to understand the multidimensional examples as given above,
and we
> > >     are forced to make conservative assumptions, inhibiting
optimisation.
> > >     This information must ideally be expressible in LLVM, such
that LLVM's optimisers and
> > >     alias analysis can use this information to model
multidimensional-array
> > >     semantics.
> > >
> > >     There is a more realistic (and involved) example in [Appendix
A](#Appendix-A)
> > >     in the same spirit as the above simple example, but one a
compiler might
> > >     realistically wish to perform.
> > >
> > >     ## Evaluation of the impact of the intrinsic on accuracy of
> > > dependence analysis
> > >
> > >     - This has been implemented in an exprimental branch of
Polly, and was used
> > >     on the COSMO climate weather model. This greatly helped
increase the accuracy
> > >     of Polly's analysis, since we eliminated the guessing
game from the array analysis.
> > >
> > >     - This has also been implemented as part of a GSoC effort to
unify
> > >     Chapel and Polly.
> > >
> > >     - Molly, the distributed version of Polly written by Michael
Kruse for his PhD
> > >        also implemented a similar scheme. In his use-case,
optimistic run-time checks
> > >        with delinearization was not possible, so this kind of
intrinsic was _necessary_
> > >        for the application, not just _good to have_. More details
are available
> > >        in his PhD thesis: [Lattice QCD Optimization and Polytopic
Representations of Distributed
Memory](https://tel.archives-ouvertes.fr/tel-01078440).
> > >        In particular, Chapter 9 contains a detailed discussion.
> > >
> > >     # Representations
> > >
> > >     ## Intrinsic
> > >
> > >     ### Syntax
> > >     ```
> > >     <result> = llvm.multidim.array.index.* <ty>
<ty>* <ptrval> {<stride>, <idx>}*
> > >     ```
> > >
> > >     ### Overview:
> > >
> > >     The `llvm.multidim.array.index.*` intrinsic is used to get
the address of
> > >     an element from an array. It performs address calcuation only
and
> > >     does not access memory. It is similar to `getelementptr`.
However, it imposes
> > >     additional semantics which allows the optimiser to provide
better optimisations
> > >     than `getlementptr`.
> > >
> > >
> > >
> > >     ### Arguments:
> > >
> > >     The first argument is always a type used as the basis for the
calculations. The
> > >     second argument is always a pointer, and is the base address
to start the
> > >     calculation from. The remaining arguments are a list of
pairs. Each pair
> > >     contains a dimension stride, and an offset with respect to
that stride.
> > >
> > >
> > >     ### Semantics:
> > >
> > >     `llvm.multidim.array.index.*` represents a multi-dimensional
array index, In particular, this will
> > >     mean that we will assume that all indices `<idx_i>` are
non-negative.
> > >
> > >     Additionally, we assume that, for each `<str_i>`,
`<idx_i>` pair, that
> > >     `0 <= idx_i < str_i`.
> > >
> > >     Optimizations can assume that, given two
llvm.multidim.array.index.* instructions with matching types:
> > >
> > >     ```
> > >     llvm.multidim.array.index.* <ty> <ty>*
<ptrvalA> <strA_1>, <idxA_1>, ..., <strA_N>,
<idxA_N>
> > >     llvm.multidim.array.index.* <ty> <ty>*
<ptrvalB> <strB_1>, <idxB_1>, ..., <strB_N>,
<idxb_N>
> > >     ```
> > >
> > >     If `ptrvalA == ptrvalB` and the strides are equal `(strA_1 ==
strB_1 && ... && strA_N == strB_N)` then:
> > >
> > >     - If all the indices are equal (that is, `idxA_1 == idxB_1
&& ... && idxA_N == idxB_N`),
> > >        then the resulting pointer values are equal.
> > >
> > >     - If any index value is not equal (that is, there exists an
`i` such that `idxA_i != idxB_i`),
> > >        then the resulting pointer values are not equal.
> > >
> > >
> > >     ##### Address computation:
> > >     Consider an invocation of `llvm.multidim.array.index.*`:
> > >
> > >     ```
> > >     <result> = call @llvm.multidim.array.index.* <ty>
<ty>* <ptrval> <str_0>, <idx_0>, <str_1>
<idx_1>, ..., <str_n> <idx_n>
> > >     ```
> > >
> > >     If the pairs are denoted by `(str_i, idx_i)`, where `str_i`
denotes the stride
> > >     and `idx_i` denotes the index of the ith pair, then the final
address (in bytes)
> > >     is computed as:
> > >
> > >     ```
> > >     ptrval + len(ty) * [(str_0 * idx_0) + (str_1 * idx_1) + ...
(str_n * idx_n)]
> > >     ```
> > >
> > >     ## Transitioning to `llvm.multidim.array.index.*`: Allow
`multidim_array_index` to refer to a GEP instruction:
> > >
> > >     This is a sketch of how we might gradually introduce the
`llvm.multidim.array.index.*`
> > >     intrinsic into LLVM without immediately losing the analyses
> > >     that are performed on `getelememtptr` instructions. This
section
> > >     lists out some possible choices that we have, since the
authors
> > >     do not have a "best" solution.
> > >
> > >     ##### Choice 1: Write a `llvm.multidim.array.index.*` to
`GEP`
> > > pass, with the `GEP` annotated with metadata
> > >
> > >     This pass will flatten all `llvm.multidim.array.index.*`
expressions into a `GEP` annotated with metadata. This metadata will indicate
that the index expression computed by the lowered GEP is guaranteed to be in a
canonical form which allows the analysis
> > >     to infer stride and index sizes.
> > >
> > >     A multidim index of the form:
> > >     ```
> > >     %arrayidx = llvm.multidim.array.index.* i64 i64* %A, %str_1,
%idx_1, %str_2, %idx_2
> > >     ```
> > >
> > >     is lowered to:
> > >
> > >     ```
> > >     %mul1 = mul nsw i64 %str_1, %idx_1
> > >     %mul2 = mul1 nsw i64 %str_2, %idx_2
> > >     %total = add nsw i64 %mul2, %mul1
> > >     %arrayidx = getelementptr inbounds i64, i64* %A, i64 %total,
!multidim !1
> > >     ```
> > >     with guarantees that the first term in each multiplication is
the stride
> > >     and the second term in each multiplication is the index.
(What happens
> > >     if intermediate transformation passes decide to change the
order? This seems
> > >     complicated).
> > >
> > >     **TODO:** Lookup how to attach metadata such that the
metadata can communicate
> > >     which of the values are strides and which are indeces
> > >
> > >
> > >
> > >     # Caveats
> > >
> > >     Currently, we assume that the array shape is immutable.
However, we will need to deal with
> > >     being able to express `reshape` like primitives where the
array shape can be mutated. However,
> > >     this appears to make expressing this information quite
difficult: We now need to attach the shape
> > >     information to an array per "shape-live-range".
> > >
> > >
> > >     ## Appendix-A
> > >
> > >     ##### A realistic, more involved example of dependence
analysis
> > > going wrong
> > >
> > >     ```cpp
> > >     // In an array A of size (n0 x n1),
> > >     // fill a subarray of size (s0 x s1)
> > >     // which starts at an offset (o0 x o1) in the larger array A
> > >     void set_subarray(unsigned n0, unsigned n1,
> > >             unsigned o0, unsigned o1,
> > >             unsigned s0, unsigned s1,
> > >             float A[n0][n1]) {
> > >             for (unsigned i = 0; i < s0; i++)
> > >                     for (unsigned j = 0; j < s1; j++)
> > >                                     S: A[i + o0][j + o1] = 1;
> > >     }
> > >     ```
> > >     We first reduce this index expression to a sum of products:
> > >
> > >     ```
> > >     (i + o0) * n1 + (j + o1) = n1i + n1o0 + j + o1
> > >     ix(i, j, n0, n1, o0, o1) = n1i + n1o0 + j + o1
> > >     ```
> > >
> > >     `ix` is the index expression which `LLVM` will see, since it
is fully
> > >     flattened, in comparison with the multi-dimensional index
expression
> > >     `index:[i + o0][j + o1] | sizes:[][n1]`.
> > >
> > >     We will now show _why_ this multi-dimensional index is not
always correct,
> > >     and why guessing for one language will not work for another:
> > >
> > >     Consider a call `set_subarray(n0=8, n1=9, o0=4, o1=6, s0=3,
s1=6)`. At face
> > >     value, this is incorrect if we view the array as 2D grid,
since the size
> > >     of the array is `(n0, n1) = (8, 9)`, but we are writing an
array of size
> > >     `(s0, s1) = (3, 6)` starting from `(o0, o1) = (4, 6)`.
Clearly, we will
> > >     exceed the width of the array, since `(s1 + o1 = 6 + 6 = 12)
> (n1 = 9)`.
> > >     However, now think of the array as a flattened 1D
representation. In this
> > >     case, the total size of the array is `n1xn2 = 8x9 = 72`,
while the largest
> > >     element we will access is at the largest value of `(i, j)`.
That is,
> > >     `i = s0 - 1 = 2`, and `j = s1 - 1 = 5`.
> > >
> > >     The largest index will be `ix(i=2, j=5, n0=8, n1=9, o0=4,
o1=6) = 8*2+8*4+5+6=59`.
> > >     Since `59 < 72`, we are clearly at _legal_ array indices,
by C semantics!
> > >
> > >     The definition of the semantics of the language **changed the
illegal
> > >     multidimensional access** (which is illegal since it exceeds
the `n1`
> > >     dimension), into a **legal flattened 1D access** (which is
legal since the
> > >     flattened array indices are inbounds).
> > >
> > >     LLVM has no way of expressing these two different semantics.
Hence, we are
> > >     forced to:
> > >     1. Consider flattened 1D accesses, which makes analysis of
index expressions
> > >     equivalent to analysis of polynomials over the integers,
which is hard.
> > >     2. Guess multidimensional representations, and use them at
the expense of
> > >     soundness bugs as shown above.
> > >     3. Guess multidimensional representations, use them, and
check their validity
> > >     at runtime, causing a runtime performance hit. This
implementation follows
> > >     the description from the paper [Optimistic Delinearization of
Parametrically Sized Arrays](optim-delin-parametrically-sized-arrays).
> > >
> > >
> > >     Currently, Polly opts for option (3), which is to emit
runtime checks. If
> > >     the run-time checks fail, then Polly will not run its
optimised code. Instead,
> > >     It keeps a copy of the unoptimised code around, which is run
in this case.
> > >     Note that this effectively doubles the amount of
performance-sensitive code
> > >     which is finally emitted after running Polly.
> > >
> > >     Ideally, we would like a mechanism to directly express the
multidimensional
> > >     semantics, which would eliminate this kind of guesswork from
Polly/LLVM,
> > >     which would both make code faster, and easier to analyze.
> > >
> > >     ## References
> > >     - [The chapel language
specification](https://chapel-lang.org/docs/1.13/_downloads/chapelLanguageSpec.pdf)
> > >     - [Fortran 2003
standard](http://www.j3-fortran.org/doc/year/04/04-007.pdf}{Fortran 
<http://www.j3-fortran.org/doc/year/04/04-007.pdf%7D%7BFortran>  2003
standard)
> > >     - [C++ subscripting](http://eel.is/c++draft/expr.sub)
> > >     - [Michael Kruse's PhD thesis: Lattice QCD Optimization
and Polytopic Representations of Distributed
Memory](https://tel.archives-ouvertes.fr/tel-01078440).
> > >     - [Molly source code link for the
intrinsic](https://github.com/Meinersbur/llvm/blob/molly/include/llvm/IR/IntrinsicsMolly.td#L3)
> > >     -[Optimistic Delinearization of Parametrically Sized
> > > Arrays](optim-delin-parametrically-sized-arrays)
> > >
> > >
> > >
[optim-delin-parametrically-sized-arrays]:http://delivery.acm.org/10
> > >
.1145/2760000/2751248/p351-grosser.pdf?ip=93.3.109.183&id=2751248&ac
> > >
c=CHORUS&key=4D4702B0C3E38B35%2E4D4702B0C3E38B35%2E4D4702B0C3E38B35%
> > >
2E6D218144511F3437&__acm__=1557948443_2f56675c6d04796f27b84593535c9f
> > > 70
> > >
> > >
> > >     _______________________________________________
> > >     LLVM Developers mailing list
> > >     llvm-dev at lists.llvm.org <mailto:llvm-dev at
lists.llvm.org>
> > >     https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
> > >
> > >
> > >
> > > --
> > > --
> > > Peter
> > >
> > > _______________________________________________
> > > LLVM Developers mailing list
> > > llvm-dev at lists.llvm.org
> > > https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>
> > _______________________________________________
> > LLVM Developers mailing list
> > llvm-dev at lists.llvm.org
> > https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>
>
> --
>
> Johannes Doerfert
> Researcher
>
> Argonne National Laboratory
> Lemont, IL 60439, USA
>
> jdoerfert at anl.gov

Kaylor, Andrew via llvm-dev

2019-Jul-23 00:14 UTC

head link

[llvm-dev] [RFC] A new multidimensional array indexing intrinsic

I don't think the return type has anything to do with what I was asking
about. Perhaps seeing a concrete, fully-legal IR representation of what's
being proposed would help.

Here's a best guess, switching the indexed base type to a non-integer for
illustration purposes:

%arrayidx = call i64 @llvm.multidim.array.index.i64.p0f64.i64.i64.i64.i64
double* %A, i64 %str_1, i64 %idx_1, i64 %str_2, i64 %idx_2

According to the RFC, that would get lowered to this:

  %mul1 = mul nsw i64 %str_1, %idx_1
  %mul2 = mul1 nsw i64 %str_2, %idx_2
  %total = add nsw i64 %mul2, %mul1
  %arrayidx = getelementptr inbounds double, double* %A, i64 %total, !multidim
!1

The problem I'm having is that the source element type in the GEP
instruction in the lowering can only be inferred from the pointer type, which is
going away at some point. When pointers become typeless, the call will look
something like this:

%arrayidx = call i64 @llvm.multidim.array.index.i64.p0.i64.i64.i64.i64 void* %A,
i64 %str_1, i64 %idx_1, i64 %str_2, i64 %idx_2

Do you see what I'm saying? Have I misunderstood something?

-Andy


-----Original Message-----
From: Michael Kruse <llvm at meinersbur.de> 
Sent: Monday, July 22, 2019 4:50 PM
To: Kaylor, Andrew <andrew.kaylor at intel.com>
Cc: Doerfert, Johannes <jdoerfert at anl.gov>; Philip Reames <listmail
at philipreames.com>; LLVM Developers Mailing List <llvm-dev at
lists.llvm.org>; SAHIL GIRISH YERAWAR <cs15btech11044 at iith.ac.in>;
Michael Ferguson <mferguson at cray.com>; Michael Kruse <llvm at
meinersbur.de>; Tobias Grosser <tobias.grosser at inf.ethz.ch>
Subject: Re: [llvm-dev] [RFC] A new multidimensional array indexing intrinsic

Intrinsics can return `llvm_any_ty` (Intrinsics.td). In that case the return
type is added as a suffix to the intrinsic's name, i.e. the syntax in the
RFC is not 100% the syntax for intrinsics. Same for the parameters which each
must have their types explicitly mentioned.

Michael

Am Mo., 22. Juli 2019 um 19:08 Uhr schrieb Kaylor, Andrew
<andrew.kaylor at intel.com>:>
> Is it even possible to implement the original proposal using an intrinsic?
I'm not sure how to interpret part of the description. It says the syntax
for the intrinsic would be this:
>
> <result> = llvm.multidim.array.index.* <ty> <ty>*
<ptrval> {<stride>,
> <idx>}*
>
> It isn't clear to me what that means. The later example is also in a
somewhat generalized form:
>
>   %arrayidx = llvm.multidim.array.index.* i64 i64* %A, %str_1, %idx_1, 
> %str_2, %idx_2
>
> Trying to expand this into something concrete it looks to me like the extra
value-less type argument ('i64' immediately following the intrinsic
name) won't work, and if I'm reading it correctly that's a necessary
element. The GEP instruction accepts a raw type as a pseudo-argument. It does
this to prepare for the coming transition to opaque pointers, and it can do it
because it's a first class instruction. However, an intrinsic can't do
this. And yet without it we won't know what to use for the GEP's source
element type "argument" after pointer's lose their pointee type.
>
> -Andy
>
> -----Original Message-----
> From: llvm-dev <llvm-dev-bounces at lists.llvm.org> On Behalf Of 
> Doerfert, Johannes via llvm-dev
> Sent: Monday, July 22, 2019 11:56 AM
> To: Philip Reames <listmail at philipreames.com>
> Cc: SAHIL GIRISH YERAWAR <cs15btech11044 at iith.ac.in>; llvm-dev 
> <llvm-dev at lists.llvm.org>; Michael Ferguson <mferguson at
cray.com>;
> Michael Kruse <llvm at meinersbur.de>; Tobias Grosser 
> <tobias.grosser at inf.ethz.ch>
> Subject: Re: [llvm-dev] [RFC] A new multidimensional array indexing 
> intrinsic
>
> On 07/22, Philip Reames via llvm-dev wrote:
> > We could also simply extend the existing inrange mechanism to 
> > non-constantexpr GEPs.  It would remove an inconsistency in the 
> > semantics, be relatively straight forward, and solve the motivating
example.
>
> Agreed. I'd very much prefer a GEP extension over an intrinsic
solution.
>
> --
>   Johannes
>
>
>
> > On 7/22/19 10:01 AM, Peter Collingbourne via llvm-dev wrote:
> > > The restrictions of this new intrinsic seem very similar to those
> > > of the inrange attribute on GEPs. It seems that the main
advantage
> > > of your proposal is that it would allow for non-constant strides
(i.e.
> > > variable length arrays) in dimensions other than the first one.
Do
> > > these appear frequently enough in the programs that you're 
> > > interested in to be worth optimizing for?
> > >
> > > Peter
> > >
> > > On Sun, Jul 21, 2019 at 3:54 PM Siddharth Bhat via llvm-dev 
> > > <llvm-dev at lists.llvm.org <mailto:llvm-dev at
lists.llvm.org>> wrote:
> > >
> > >     Hello,
> > >
> > >     We would like to begin discussions around a new set of
intrinsics,
> > >     to better express
> > >     multi-dimensional array indexing within LLVM. The motivations
and
> > >     a possible design
> > >     are sketched out below.
> > >
> > >     Rendered RFC link here
> > >
> > >
<https://github.com/bollu/llvm-multidim-array-indexing-proposal/bl
> > > ob
> > > /master/RFC.md>
> > >
> > >
> > >     Raw markdown:
> > >
> > >     # Introducing a new multidimensional array indexing intrinsic
> > >
> > >     ## The request for change: A new intrinsic,
`llvm.multidim.array.index.*`.
> > >
> > >     We propose the addition of a new family of intrinsics,
`llvm.multidim.array.index.*`.
> > >     This will allow us to represent array indexing into an array
`A[d1][d2][..][dk]`
> > >     as `llvm.multidim.array.index.k A d1, d2, d3, ... dk` instead
of flattening
> > >     the information into `gep A, d1 * n1 + d2 * n2 + ... + dk *
nk`. The former
> > >     unflattened representation is advantageous for analysis and
optimisation. It also
> > >     allows us to represent array indexing semantics of languages
such as Fortran
> > >     and Chapel, which differs from that of C based array
indexing.
> > >
> > >     ## Motivating the need for a multi-dimensional array indexing
> > > intrinsic
> > >
> > >     There are primarily one of two views of array indexing
involving
> > >     multiple dimensions most languages take, which we discuss to
motivate
> > >     the need for a multi-dimensional array index. This
consideration impacts
> > >     the kinds of analysis we can perform on the program. In
Polly, we care about
> > >     dependence analysis, so the examples here will focus on that
particular problem.
> > >
> > >     Let us consider an array indexing operation of the form:
> > >     ```cpp
> > >     int ex1(int n, int m, B[n][m], int x1, int x2, int y1, int
y2) {
> > >             __builtin_assume(x1 != x2);
> > >             __builtin_assume(y1 != y2);
> > >             B[x1][y1] = 1;
> > >             printf("%d", B[x2][y2]);
> > >             exit(0);
> > >     }
> > >     ```
> > >
> > >     One would like to infer that the array indices _interpreted
as tuples_
> > >     `(x1, y1)` and `(x2, y2)` do not have the same value, due to
the guarding asserts
> > >     that `x1 != x2` and `y1 != y2`. As a result, the write
`B[x1][y1] = 1` can
> > >     in no way interfere with the value of `B[x2][y2]`.
Consquently,
> > >     we can optimise the program into:
> > >
> > >
> > >     ```cpp
> > >     int ex1_opt(int n, int m, B[n][m], int x1, int x2, int y1,
int y2) {
> > >          // B[x1][y1] = 1 is omitted because the result
> > >          // cannot be used:
> > >          //  It is not used in the print and then the program
exits
> > >             printf("%d", B[x2][y2]);
> > >             exit(0);
> > >     }
> > >     ```
> > >
> > >     However, alas, this is illegal, for the C language does not
provide
> > >     semantics that allow the final inference above. It is
conceivable that
> > >     `x1 != x2, y1 != y2`, but the indices do actually alias,
since
> > >     according to C semantics, the two indices alias if the
_flattened
> > >     representation of the indices alias_. Consider the parameter
> > >     values:
> > >
> > >     ```
> > >     n = m = 3
> > >     x1 = 1, y1 = 0; B[x1][y1] = nx1+y1 = 3*1+0=3
> > >     x2 = 0, y2 = 3; B[x2][y2] = nx2+y2 = 3*0+3=3
> > >     ```
> > >
> > >     Hence, the array elements `B[x1][y1]` and `B[x2][y2]` _can
alias_, and
> > >     so the transformation proposed in `ex1_opt` is unsound in
general.
> > >
> > >
> > >     In contrast, many langagues other than C require that index
> > >     expressions for multidimensional arrays have each component
within the
> > >     array dimension for that component. As a result, in the
example above,
> > >     the index pair `(0,3)` would be out-of-bounds. In languages
with these
> > >     semantics, one can infer that the indexing:
> > >
> > >     `[x1][y1] != [x2][y2]` iff `x1 != x2 || y1 != y2`.
> > >
> > >     While this particular example is not very interesting, it
shows the
> > >     spirit of the lack of expressiveness in LLVM we are trying to
> > >     improve.
> > >
> > >     Julia, Fortran, and Chapel are examples of such languages
which target
> > >     LLVM.
> > >
> > >     Currently, the LLVM semantics of `getelementptr` talk purely
of
> > >     the C style flattened views of arrays. This inhibits the
ability of the optimiser
> > >     to understand the multidimensional examples as given above,
and we
> > >     are forced to make conservative assumptions, inhibiting
optimisation.
> > >     This information must ideally be expressible in LLVM, such
that LLVM's optimisers and
> > >     alias analysis can use this information to model
multidimensional-array
> > >     semantics.
> > >
> > >     There is a more realistic (and involved) example in [Appendix
A](#Appendix-A)
> > >     in the same spirit as the above simple example, but one a
compiler might
> > >     realistically wish to perform.
> > >
> > >     ## Evaluation of the impact of the intrinsic on accuracy of 
> > > dependence analysis
> > >
> > >     - This has been implemented in an exprimental branch of
Polly, and was used
> > >     on the COSMO climate weather model. This greatly helped
increase the accuracy
> > >     of Polly's analysis, since we eliminated the guessing
game from the array analysis.
> > >
> > >     - This has also been implemented as part of a GSoC effort to
unify
> > >     Chapel and Polly.
> > >
> > >     - Molly, the distributed version of Polly written by Michael
Kruse for his PhD
> > >        also implemented a similar scheme. In his use-case,
optimistic run-time checks
> > >        with delinearization was not possible, so this kind of
intrinsic was _necessary_
> > >        for the application, not just _good to have_. More details
are available
> > >        in his PhD thesis: [Lattice QCD Optimization and Polytopic
Representations of Distributed
Memory](https://tel.archives-ouvertes.fr/tel-01078440).
> > >        In particular, Chapter 9 contains a detailed discussion.
> > >
> > >     # Representations
> > >
> > >     ## Intrinsic
> > >
> > >     ### Syntax
> > >     ```
> > >     <result> = llvm.multidim.array.index.* <ty>
<ty>* <ptrval> {<stride>, <idx>}*
> > >     ```
> > >
> > >     ### Overview:
> > >
> > >     The `llvm.multidim.array.index.*` intrinsic is used to get
the address of
> > >     an element from an array. It performs address calcuation only
and
> > >     does not access memory. It is similar to `getelementptr`.
However, it imposes
> > >     additional semantics which allows the optimiser to provide
better optimisations
> > >     than `getlementptr`.
> > >
> > >
> > >
> > >     ### Arguments:
> > >
> > >     The first argument is always a type used as the basis for the
calculations. The
> > >     second argument is always a pointer, and is the base address
to start the
> > >     calculation from. The remaining arguments are a list of
pairs. Each pair
> > >     contains a dimension stride, and an offset with respect to
that stride.
> > >
> > >
> > >     ### Semantics:
> > >
> > >     `llvm.multidim.array.index.*` represents a multi-dimensional
array index, In particular, this will
> > >     mean that we will assume that all indices `<idx_i>` are
non-negative.
> > >
> > >     Additionally, we assume that, for each `<str_i>`,
`<idx_i>` pair, that
> > >     `0 <= idx_i < str_i`.
> > >
> > >     Optimizations can assume that, given two
llvm.multidim.array.index.* instructions with matching types:
> > >
> > >     ```
> > >     llvm.multidim.array.index.* <ty> <ty>*
<ptrvalA> <strA_1>, <idxA_1>, ..., <strA_N>,
<idxA_N>
> > >     llvm.multidim.array.index.* <ty> <ty>*
<ptrvalB> <strB_1>, <idxB_1>, ..., <strB_N>,
<idxb_N>
> > >     ```
> > >
> > >     If `ptrvalA == ptrvalB` and the strides are equal `(strA_1 ==
strB_1 && ... && strA_N == strB_N)` then:
> > >
> > >     - If all the indices are equal (that is, `idxA_1 == idxB_1
&& ... && idxA_N == idxB_N`),
> > >        then the resulting pointer values are equal.
> > >
> > >     - If any index value is not equal (that is, there exists an
`i` such that `idxA_i != idxB_i`),
> > >        then the resulting pointer values are not equal.
> > >
> > >
> > >     ##### Address computation:
> > >     Consider an invocation of `llvm.multidim.array.index.*`:
> > >
> > >     ```
> > >     <result> = call @llvm.multidim.array.index.* <ty>
<ty>* <ptrval> <str_0>, <idx_0>, <str_1>
<idx_1>, ..., <str_n> <idx_n>
> > >     ```
> > >
> > >     If the pairs are denoted by `(str_i, idx_i)`, where `str_i`
denotes the stride
> > >     and `idx_i` denotes the index of the ith pair, then the final
address (in bytes)
> > >     is computed as:
> > >
> > >     ```
> > >     ptrval + len(ty) * [(str_0 * idx_0) + (str_1 * idx_1) + ...
(str_n * idx_n)]
> > >     ```
> > >
> > >     ## Transitioning to `llvm.multidim.array.index.*`: Allow
`multidim_array_index` to refer to a GEP instruction:
> > >
> > >     This is a sketch of how we might gradually introduce the
`llvm.multidim.array.index.*`
> > >     intrinsic into LLVM without immediately losing the analyses
> > >     that are performed on `getelememtptr` instructions. This
section
> > >     lists out some possible choices that we have, since the
authors
> > >     do not have a "best" solution.
> > >
> > >     ##### Choice 1: Write a `llvm.multidim.array.index.*` to
`GEP`
> > > pass, with the `GEP` annotated with metadata
> > >
> > >     This pass will flatten all `llvm.multidim.array.index.*`
expressions into a `GEP` annotated with metadata. This metadata will indicate
that the index expression computed by the lowered GEP is guaranteed to be in a
canonical form which allows the analysis
> > >     to infer stride and index sizes.
> > >
> > >     A multidim index of the form:
> > >     ```
> > >     %arrayidx = llvm.multidim.array.index.* i64 i64* %A, %str_1,
%idx_1, %str_2, %idx_2
> > >     ```
> > >
> > >     is lowered to:
> > >
> > >     ```
> > >     %mul1 = mul nsw i64 %str_1, %idx_1
> > >     %mul2 = mul1 nsw i64 %str_2, %idx_2
> > >     %total = add nsw i64 %mul2, %mul1
> > >     %arrayidx = getelementptr inbounds i64, i64* %A, i64 %total,
!multidim !1
> > >     ```
> > >     with guarantees that the first term in each multiplication is
the stride
> > >     and the second term in each multiplication is the index.
(What happens
> > >     if intermediate transformation passes decide to change the
order? This seems
> > >     complicated).
> > >
> > >     **TODO:** Lookup how to attach metadata such that the
metadata can communicate
> > >     which of the values are strides and which are indeces
> > >
> > >
> > >
> > >     # Caveats
> > >
> > >     Currently, we assume that the array shape is immutable.
However, we will need to deal with
> > >     being able to express `reshape` like primitives where the
array shape can be mutated. However,
> > >     this appears to make expressing this information quite
difficult: We now need to attach the shape
> > >     information to an array per "shape-live-range".
> > >
> > >
> > >     ## Appendix-A
> > >
> > >     ##### A realistic, more involved example of dependence 
> > > analysis going wrong
> > >
> > >     ```cpp
> > >     // In an array A of size (n0 x n1),
> > >     // fill a subarray of size (s0 x s1)
> > >     // which starts at an offset (o0 x o1) in the larger array A
> > >     void set_subarray(unsigned n0, unsigned n1,
> > >             unsigned o0, unsigned o1,
> > >             unsigned s0, unsigned s1,
> > >             float A[n0][n1]) {
> > >             for (unsigned i = 0; i < s0; i++)
> > >                     for (unsigned j = 0; j < s1; j++)
> > >                                     S: A[i + o0][j + o1] = 1;
> > >     }
> > >     ```
> > >     We first reduce this index expression to a sum of products:
> > >
> > >     ```
> > >     (i + o0) * n1 + (j + o1) = n1i + n1o0 + j + o1
> > >     ix(i, j, n0, n1, o0, o1) = n1i + n1o0 + j + o1
> > >     ```
> > >
> > >     `ix` is the index expression which `LLVM` will see, since it
is fully
> > >     flattened, in comparison with the multi-dimensional index
expression
> > >     `index:[i + o0][j + o1] | sizes:[][n1]`.
> > >
> > >     We will now show _why_ this multi-dimensional index is not
always correct,
> > >     and why guessing for one language will not work for another:
> > >
> > >     Consider a call `set_subarray(n0=8, n1=9, o0=4, o1=6, s0=3,
s1=6)`. At face
> > >     value, this is incorrect if we view the array as 2D grid,
since the size
> > >     of the array is `(n0, n1) = (8, 9)`, but we are writing an
array of size
> > >     `(s0, s1) = (3, 6)` starting from `(o0, o1) = (4, 6)`.
Clearly, we will
> > >     exceed the width of the array, since `(s1 + o1 = 6 + 6 = 12)
> (n1 = 9)`.
> > >     However, now think of the array as a flattened 1D
representation. In this
> > >     case, the total size of the array is `n1xn2 = 8x9 = 72`,
while the largest
> > >     element we will access is at the largest value of `(i, j)`.
That is,
> > >     `i = s0 - 1 = 2`, and `j = s1 - 1 = 5`.
> > >
> > >     The largest index will be `ix(i=2, j=5, n0=8, n1=9, o0=4,
o1=6) = 8*2+8*4+5+6=59`.
> > >     Since `59 < 72`, we are clearly at _legal_ array indices,
by C semantics!
> > >
> > >     The definition of the semantics of the language **changed the
illegal
> > >     multidimensional access** (which is illegal since it exceeds
the `n1`
> > >     dimension), into a **legal flattened 1D access** (which is
legal since the
> > >     flattened array indices are inbounds).
> > >
> > >     LLVM has no way of expressing these two different semantics.
Hence, we are
> > >     forced to:
> > >     1. Consider flattened 1D accesses, which makes analysis of
index expressions
> > >     equivalent to analysis of polynomials over the integers,
which is hard.
> > >     2. Guess multidimensional representations, and use them at
the expense of
> > >     soundness bugs as shown above.
> > >     3. Guess multidimensional representations, use them, and
check their validity
> > >     at runtime, causing a runtime performance hit. This
implementation follows
> > >     the description from the paper [Optimistic Delinearization of
Parametrically Sized Arrays](optim-delin-parametrically-sized-arrays).
> > >
> > >
> > >     Currently, Polly opts for option (3), which is to emit
runtime checks. If
> > >     the run-time checks fail, then Polly will not run its
optimised code. Instead,
> > >     It keeps a copy of the unoptimised code around, which is run
in this case.
> > >     Note that this effectively doubles the amount of
performance-sensitive code
> > >     which is finally emitted after running Polly.
> > >
> > >     Ideally, we would like a mechanism to directly express the
multidimensional
> > >     semantics, which would eliminate this kind of guesswork from
Polly/LLVM,
> > >     which would both make code faster, and easier to analyze.
> > >
> > >     ## References
> > >     - [The chapel language
specification](https://chapel-lang.org/docs/1.13/_downloads/chapelLanguageSpec.pdf)
> > >     - [Fortran 2003
standard](http://www.j3-fortran.org/doc/year/04/04-007.pdf}{Fortran 
<http://www.j3-fortran.org/doc/year/04/04-007.pdf%7D%7BFortran>  2003
standard)
> > >     - [C++ subscripting](http://eel.is/c++draft/expr.sub)
> > >     - [Michael Kruse's PhD thesis: Lattice QCD Optimization
and Polytopic Representations of Distributed
Memory](https://tel.archives-ouvertes.fr/tel-01078440).
> > >     - [Molly source code link for the
intrinsic](https://github.com/Meinersbur/llvm/blob/molly/include/llvm/IR/IntrinsicsMolly.td#L3)
> > >     -[Optimistic Delinearization of Parametrically Sized
> > > Arrays](optim-delin-parametrically-sized-arrays)
> > >
> > >
> > >
[optim-delin-parametrically-sized-arrays]:http://delivery.acm.org/
> > > 10 
> > >
.1145/2760000/2751248/p351-grosser.pdf?ip=93.3.109.183&id=2751248&
> > > ac 
> > >
c=CHORUS&key=4D4702B0C3E38B35%2E4D4702B0C3E38B35%2E4D4702B0C3E38B3
> > > 5% 
> > >
2E6D218144511F3437&__acm__=1557948443_2f56675c6d04796f27b84593535c
> > > 9f
> > > 70
> > >
> > >
> > >     _______________________________________________
> > >     LLVM Developers mailing list
> > >     llvm-dev at lists.llvm.org <mailto:llvm-dev at
lists.llvm.org>
> > >     https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
> > >
> > >
> > >
> > > --
> > > --
> > > Peter
> > >
> > > _______________________________________________
> > > LLVM Developers mailing list
> > > llvm-dev at lists.llvm.org
> > > https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>
> > _______________________________________________
> > LLVM Developers mailing list
> > llvm-dev at lists.llvm.org
> > https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>
>
> --
>
> Johannes Doerfert
> Researcher
>
> Argonne National Laboratory
> Lemont, IL 60439, USA
>
> jdoerfert at anl.gov

Possibly Parallel Threads

Search for more seemingly similar threads

llvm dev - Jul 2019 - [RFC] A new multidimensional array indexing intrinsic

[llvm-dev] [RFC] A new multidimensional array indexing intrinsic

[llvm-dev] [RFC] A new multidimensional array indexing intrinsic

[llvm-dev] [RFC] A new multidimensional array indexing intrinsic

Possibly Parallel Threads