thr3ads.net - llvm dev - [llvm-dev] [RFC] Memory region declaration intrinsic [Dec 2021]

If this information is useful, please help other people find it:
Share via:

Roman Lebedev via llvm-dev

2021-Dec-07 19:24 UTC

[llvm-dev] [RFC] Memory region declaration intrinsic

Hi all.

Differential: https://reviews.llvm.org/D115274

This is a follow-up to the "[llvm-dev] [RFC] Adding range metadata to
array subscripts.",
https://lists.llvm.org/pipermail/llvm-dev/2021-March/149390.html

Problem statement:

As per C 6.5.6p9 / http://eel.is/c++draft/expr.add#4, given
```
struct S {
    int a[3];
    int b[3];
    int c[3];
};

void bar(int*);

void foo(S* s) {
  bar(&s.b[1]);
}
```
even though the pointer the bar receives has 4 ints to the left of it
and 4 to the right of it, the only ints it can access are
one to the left and one to the right. I.e. it can not go outside of the S::b.

But, there is currently no way to encode that knowledge into LLVM IR.
There's limited `inrange` thing  for constant expression GEP's,. since:
* https://reviews.llvm.org/D22793
* https://lists.llvm.org/pipermail/llvm-dev/2016-July/102472.html

... but it's limited to constant expressions. There were previous attempts
at
removing that restriction, namely that RFC and my patch:
https://reviews.llvm.org/D114988, however implementation experience/review
pointed out a few design problems:
1. Poor opaque pointers interop, it requires the GEP to be into a structure,
    so if it's a pure pointer computation, we suddenly can't preserve
the knowledge.
2. While just adding a bit[s] to GEP instruction allows the
transformation to just ignore it
    if they aren't explicitly taught about it, which is fine from a
legality standpoint,
    it complicates it's preservation through transformation.
3. While i'm not sure how useful it would be, it limits us to
statically-sized arrays.

Instead of following through with that, let me propose a new design:

<begin langref>
```
.. _int_memory_region_decl:

'``llvm.memory.region.decl``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i8* @llvm.memory.region.decl.p0i8(i8* nocapture readnone
returned <ptr>, i64 <begin_offset>, i64 <end_offset>) nofree
nosync
nounwind readnone speculatable willreturn

Overview:
"""""""""

The '``llvm.memory.region.decl``' intrinsic annotates memory region.

Arguments:
""""""""""

This is an overloaded intrinsic. The memory region can belong to any address
space. The first argument is a pointer into the memory region. The returned
pointer, which is the first argument, must belong to the same address space
as the argument. The second argument specifies the offset to the pointer (the
first argument) at which the memory region begins. The third argument specifies
the offset to the pointer (the first argument) at which the memory region ends.

Semantics:
""""""""""

The returned pointer, and, transitively, any pointer that is def-use based on
that pointer, points into the memory region ``[ptr+begin_offset,
ptr+end_offset)``,
or is a :ref:`poison value <poisonvalues>` otherwise.

This intrinsic is intended to be an optimization hint, there are no correctness
concerns with completely ignoring and/or dropping it. The main use-case is
to be able to annotate array bounds in C family of languages,
which may allow alloca splitting, and better alias analysis.
```
</end langref>

Example:
```
struct S {
  int a;
  int b[4];
};
int* get(S*s, int i) {
  return &s->b[i];
}
```
is currently lowered into
```
define dso_local nonnull i32* @_Z3getP1Si(%struct.S* readnone %s, i32
%i) local_unnamed_addr #0 {
 %idxprom = sext i32 %i to i64
%arrayidx = getelementptr inbounds %struct.S, %struct.S* %s, i64 0,
i32 1, i64 %idxprom
 ret i32* %arrayidx
}
```
would instead be lowered into
```
define dso_local nonnull i32* @_Z3getP1Si(%struct.S* readnone %s, i32
%i) local_unnamed_addr #0 {
 %arrayidx = getelementptr inbounds %struct.S, %struct.S* %s, i64 0,
i32 1, i64 0
 %arrayidx.bounded = call i32* @llvm.memory.region.decl.p0i32(i32*
%arrayidx, i64 0, i64 32)
 %idxprom = sext i32 %i to i64
 %arrayidx3 = getelementptr inbounds i32, i32* %arrayidx.bounded, i64 %idxprom
 ret i32* %arrayidx3
}
```
Concretely, this tells us that %i u<= 4, which should be useful for
Alias Analysis
in less contrived snippets.

The other motivational example, although still contrived:
```
struct S {
 int a;
 int b[4];
};
int stuff(int i, int array_val, int j, int scalar_val) {
 S s;
 s.a = scalar_val;
 s.b[i] = array_val;
 return s.a;
}
```
currently results in:
```
define dso_local i32 @_Z5stuffiiii(i32 %i, i32 %array_val, i32 %j, i32
%scalar_val) local_unnamed_addr #0 {
entry:
%s = alloca %struct.S, align 4
%0 = bitcast %struct.S* %s to i8*
call void @llvm.lifetime.start.p0i8(i64 20, i8* nonnull %0) #2
%a = getelementptr inbounds %struct.S, %struct.S* %s, i64 0, i32 0
store i32 %scalar_val, i32* %a, align 4, !tbaa !3
%idxprom = sext i32 %i to i64
%arrayidx = getelementptr inbounds %struct.S, %struct.S* %s, i64 0,
i32 1, i64 %idxprom
store i32 %array_val, i32* %arrayidx, align 4, !tbaa !8
%1 = load i32, i32* %a, align 4, !tbaa !3
call void @llvm.lifetime.end.p0i8(i64 20, i8* nonnull %0) #2
ret i32 %1
}
```
Notice the problem? `array_val` couldn't have been stored into `S::a`,
this particular example should optimize to just
```
define dso_local i32 @_Z5stuffiiii(i32 %i, i32 %array_val, i32 %j, i32
%scalar_val) local_unnamed_addr #0 {
 ret i32 %scalar_val
}
```

The even bigger picture here is that SROA simply gives up in presence
of variable GEP's,
but if we annotate the extents of such a variable GEP, then, given
right circumstances,
we may be able to conclude that the alloca could be split up, and
certain parts be promoted.
That is the main motivation for me behind this.

I think, this is sufficient information, but let me know if i should
address something else.

Roman.

Eli Friedman via llvm-dev

2021-Dec-07 21:57 UTC

head link

[llvm-dev] [RFC] Memory region declaration intrinsic

How does this interact with "inbounds" markings on GEPs?  Can the
result of a non-inbounds GEP point outside the specified region?

I assume the input and result pointer alias?  Given that, I don't think the
"nocapture" is right.  (There's a potential use-case for a similar
intrinsic where the result doesn't alias the argument, but I guess that
would be a different thing.)

The usual concern with this sort of intrinsic is the work involved in teaching a
bunch of pointer optimizations the meaning of the new intrinsic.  Encoding array
bounds is probably important enough to be worth the extra work, though.

-Eli
> -----Original Message-----
> From: Roman Lebedev <lebedev.ri at gmail.com>
> Sent: Tuesday, December 7, 2021 11:24 AM
> To: llvm-dev at lists.llvm.org
> Cc: Johannes Doerfert <jdoerfert at anl.gov>; Eli Friedman
> <efriedma at quicinc.com>; Clement Courbet <courbet at
google.com>; Nuno
> Lopes <nunoplopes at sapo.pt>; Nikita Popov <nikita.ppv at
gmail.com>; Peter
> Collingbourne <peter at pcc.me.uk>; Philip Reames <listmail at
philipreames.com>
> Subject: [RFC] Memory region declaration intrinsic
> 
> Hi all.
> 
> Differential: https://reviews.llvm.org/D115274
> 
> This is a follow-up to the "[llvm-dev] [RFC] Adding range metadata to
> array subscripts.",
> https://lists.llvm.org/pipermail/llvm-dev/2021-March/149390.html
> 
> Problem statement:
> 
> As per C 6.5.6p9 / http://eel.is/c++draft/expr.add#4, given
> ```
> struct S {
>     int a[3];
>     int b[3];
>     int c[3];
> };
> 
> void bar(int*);
> 
> void foo(S* s) {
>   bar(&s.b[1]);
> }
> ```
> even though the pointer the bar receives has 4 ints to the left of it
> and 4 to the right of it, the only ints it can access are
> one to the left and one to the right. I.e. it can not go outside of the
S::b.
> 
> But, there is currently no way to encode that knowledge into LLVM IR.
> There's limited `inrange` thing  for constant expression GEP's,.
since:
> * https://reviews.llvm.org/D22793
> * https://lists.llvm.org/pipermail/llvm-dev/2016-July/102472.html
> 
> ... but it's limited to constant expressions. There were previous
attempts at
> removing that restriction, namely that RFC and my patch:
> https://reviews.llvm.org/D114988, however implementation experience/review
> pointed out a few design problems:
> 1. Poor opaque pointers interop, it requires the GEP to be into a
structure,
>     so if it's a pure pointer computation, we suddenly can't
preserve
> the knowledge.
> 2. While just adding a bit[s] to GEP instruction allows the
> transformation to just ignore it
>     if they aren't explicitly taught about it, which is fine from a
> legality standpoint,
>     it complicates it's preservation through transformation.
> 3. While i'm not sure how useful it would be, it limits us to
> statically-sized arrays.
> 
> Instead of following through with that, let me propose a new design:
> 
> <begin langref>
> ```
> .. _int_memory_region_decl:
> 
> '``llvm.memory.region.decl``' Intrinsic
> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> 
> Syntax:
> """""""
> 
> ::
> 
>       declare i8* @llvm.memory.region.decl.p0i8(i8* nocapture readnone
> returned <ptr>, i64 <begin_offset>, i64 <end_offset>)
nofree nosync
> nounwind readnone speculatable willreturn
> 
> Overview:
> """""""""
> 
> The '``llvm.memory.region.decl``' intrinsic annotates memory
region.
> 
> Arguments:
> """"""""""
> 
> This is an overloaded intrinsic. The memory region can belong to any
address
> space. The first argument is a pointer into the memory region. The returned
> pointer, which is the first argument, must belong to the same address space
> as the argument. The second argument specifies the offset to the pointer
(the
> first argument) at which the memory region begins. The third argument
specifies
> the offset to the pointer (the first argument) at which the memory region
ends.
> 
> Semantics:
> """"""""""
> 
> The returned pointer, and, transitively, any pointer that is def-use based
on
> that pointer, points into the memory region ``[ptr+begin_offset,
> ptr+end_offset)``,
> or is a :ref:`poison value <poisonvalues>` otherwise.
> 
> This intrinsic is intended to be an optimization hint, there are no
correctness
> concerns with completely ignoring and/or dropping it. The main use-case is
> to be able to annotate array bounds in C family of languages,
> which may allow alloca splitting, and better alias analysis.
> ```
> </end langref>
> 
> Example:
> ```
> struct S {
>   int a;
>   int b[4];
> };
> int* get(S*s, int i) {
>   return &s->b[i];
> }
> ```
> is currently lowered into
> ```
> define dso_local nonnull i32* @_Z3getP1Si(%struct.S* readnone %s, i32
> %i) local_unnamed_addr #0 {
>  %idxprom = sext i32 %i to i64
> %arrayidx = getelementptr inbounds %struct.S, %struct.S* %s, i64 0,
> i32 1, i64 %idxprom
>  ret i32* %arrayidx
> }
> ```
> would instead be lowered into
> ```
> define dso_local nonnull i32* @_Z3getP1Si(%struct.S* readnone %s, i32
> %i) local_unnamed_addr #0 {
>  %arrayidx = getelementptr inbounds %struct.S, %struct.S* %s, i64 0,
> i32 1, i64 0
>  %arrayidx.bounded = call i32* @llvm.memory.region.decl.p0i32(i32*
> %arrayidx, i64 0, i64 32)
>  %idxprom = sext i32 %i to i64
>  %arrayidx3 = getelementptr inbounds i32, i32* %arrayidx.bounded, i64
> %idxprom
>  ret i32* %arrayidx3
> }
> ```
> Concretely, this tells us that %i u<= 4, which should be useful for
> Alias Analysis
> in less contrived snippets.
> 
> The other motivational example, although still contrived:
> ```
> struct S {
>  int a;
>  int b[4];
> };
> int stuff(int i, int array_val, int j, int scalar_val) {
>  S s;
>  s.a = scalar_val;
>  s.b[i] = array_val;
>  return s.a;
> }
> ```
> currently results in:
> ```
> define dso_local i32 @_Z5stuffiiii(i32 %i, i32 %array_val, i32 %j, i32
> %scalar_val) local_unnamed_addr #0 {
> entry:
> %s = alloca %struct.S, align 4
> %0 = bitcast %struct.S* %s to i8*
> call void @llvm.lifetime.start.p0i8(i64 20, i8* nonnull %0) #2
> %a = getelementptr inbounds %struct.S, %struct.S* %s, i64 0, i32 0
> store i32 %scalar_val, i32* %a, align 4, !tbaa !3
> %idxprom = sext i32 %i to i64
> %arrayidx = getelementptr inbounds %struct.S, %struct.S* %s, i64 0,
> i32 1, i64 %idxprom
> store i32 %array_val, i32* %arrayidx, align 4, !tbaa !8
> %1 = load i32, i32* %a, align 4, !tbaa !3
> call void @llvm.lifetime.end.p0i8(i64 20, i8* nonnull %0) #2
> ret i32 %1
> }
> ```
> Notice the problem? `array_val` couldn't have been stored into `S::a`,
> this particular example should optimize to just
> ```
> define dso_local i32 @_Z5stuffiiii(i32 %i, i32 %array_val, i32 %j, i32
> %scalar_val) local_unnamed_addr #0 {
>  ret i32 %scalar_val
> }
> ```
> 
> The even bigger picture here is that SROA simply gives up in presence
> of variable GEP's,
> but if we annotate the extents of such a variable GEP, then, given
> right circumstances,
> we may be able to conclude that the alloca could be split up, and
> certain parts be promoted.
> That is the main motivation for me behind this.
> 
> I think, this is sufficient information, but let me know if i should
> address something else.
> 
> Roman.

Johannes Doerfert via llvm-dev

2021-Dec-09 21:24 UTC

head link

[llvm-dev] [RFC] Memory region declaration intrinsic

I think adding new intrinsics, especially on the def-use chain, is the 
wrong way.

We need to skip them in too many places where we do not use 
"stripPointerCast" which
is always causing extra work and missed optimizations once we add more 
information (in
the form of these intrinsics).

A generic solution would be to add one intrinsic to which we can attach 
information:

```
declare <ty> @llvm.assume.chained(<ty> nocapture readnone returned 
<val>) /* attribute */
```

Now we add information the same way we do it for `llvm.assume`, via 
operand bundles:

```
%p = call @llvm.assume.chained(i32* %base) ["objectsize"(0, 12), 
"align"(16)]
```

The definition from below wrt the semantics now apply to the 
"objectsize" operand bundle,
or whatever name we come up with.

I imagine we can simplify quite a few existing attributes if we adopt 
this scheme (which
also means we could try it out already and see if there is a problem).

~ Johannes



On 12/7/21 13:24, Roman Lebedev via llvm-dev wrote:> Hi all.
>
> Differential: https://reviews.llvm.org/D115274
>
> This is a follow-up to the "[llvm-dev] [RFC] Adding range metadata to
> array subscripts.",
> https://lists.llvm.org/pipermail/llvm-dev/2021-March/149390.html
>
> Problem statement:
>
> As per C 6.5.6p9 / http://eel.is/c++draft/expr.add#4, given
> ```
> struct S {
>      int a[3];
>      int b[3];
>      int c[3];
> };
>
> void bar(int*);
>
> void foo(S* s) {
>    bar(&s.b[1]);
> }
> ```
> even though the pointer the bar receives has 4 ints to the left of it
> and 4 to the right of it, the only ints it can access are
> one to the left and one to the right. I.e. it can not go outside of the
S::b.
>
> But, there is currently no way to encode that knowledge into LLVM IR.
> There's limited `inrange` thing  for constant expression GEP's,.
since:
> * https://reviews.llvm.org/D22793
> * https://lists.llvm.org/pipermail/llvm-dev/2016-July/102472.html
>
> ... but it's limited to constant expressions. There were previous
attempts at
> removing that restriction, namely that RFC and my patch:
> https://reviews.llvm.org/D114988, however implementation experience/review
> pointed out a few design problems:
> 1. Poor opaque pointers interop, it requires the GEP to be into a
structure,
>      so if it's a pure pointer computation, we suddenly can't
preserve
> the knowledge.
> 2. While just adding a bit[s] to GEP instruction allows the
> transformation to just ignore it
>      if they aren't explicitly taught about it, which is fine from a
> legality standpoint,
>      it complicates it's preservation through transformation.
> 3. While i'm not sure how useful it would be, it limits us to
> statically-sized arrays.
>
> Instead of following through with that, let me propose a new design:
>
> <begin langref>
> ```
> .. _int_memory_region_decl:
>
> '``llvm.memory.region.decl``' Intrinsic
> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>
> Syntax:
> """""""
>
> ::
>
>        declare i8* @llvm.memory.region.decl.p0i8(i8* nocapture readnone
> returned <ptr>, i64 <begin_offset>, i64 <end_offset>)
nofree nosync
> nounwind readnone speculatable willreturn
>
> Overview:
> """""""""
>
> The '``llvm.memory.region.decl``' intrinsic annotates memory
region.
>
> Arguments:
> """"""""""
>
> This is an overloaded intrinsic. The memory region can belong to any
address
> space. The first argument is a pointer into the memory region. The returned
> pointer, which is the first argument, must belong to the same address space
> as the argument. The second argument specifies the offset to the pointer
(the
> first argument) at which the memory region begins. The third argument
specifies
> the offset to the pointer (the first argument) at which the memory region
ends.
>
> Semantics:
> """"""""""
>
> The returned pointer, and, transitively, any pointer that is def-use based
on
> that pointer, points into the memory region ``[ptr+begin_offset,
> ptr+end_offset)``,
> or is a :ref:`poison value <poisonvalues>` otherwise.
>
> This intrinsic is intended to be an optimization hint, there are no
correctness
> concerns with completely ignoring and/or dropping it. The main use-case is
> to be able to annotate array bounds in C family of languages,
> which may allow alloca splitting, and better alias analysis.
> ```
> </end langref>
>
> Example:
> ```
> struct S {
>    int a;
>    int b[4];
> };
> int* get(S*s, int i) {
>    return &s->b[i];
> }
> ```
> is currently lowered into
> ```
> define dso_local nonnull i32* @_Z3getP1Si(%struct.S* readnone %s, i32
> %i) local_unnamed_addr #0 {
>   %idxprom = sext i32 %i to i64
> %arrayidx = getelementptr inbounds %struct.S, %struct.S* %s, i64 0,
> i32 1, i64 %idxprom
>   ret i32* %arrayidx
> }
> ```
> would instead be lowered into
> ```
> define dso_local nonnull i32* @_Z3getP1Si(%struct.S* readnone %s, i32
> %i) local_unnamed_addr #0 {
>   %arrayidx = getelementptr inbounds %struct.S, %struct.S* %s, i64 0,
> i32 1, i64 0
>   %arrayidx.bounded = call i32* @llvm.memory.region.decl.p0i32(i32*
> %arrayidx, i64 0, i64 32)
>   %idxprom = sext i32 %i to i64
>   %arrayidx3 = getelementptr inbounds i32, i32* %arrayidx.bounded, i64
%idxprom
>   ret i32* %arrayidx3
> }
> ```
> Concretely, this tells us that %i u<= 4, which should be useful for
> Alias Analysis
> in less contrived snippets.
>
> The other motivational example, although still contrived:
> ```
> struct S {
>   int a;
>   int b[4];
> };
> int stuff(int i, int array_val, int j, int scalar_val) {
>   S s;
>   s.a = scalar_val;
>   s.b[i] = array_val;
>   return s.a;
> }
> ```
> currently results in:
> ```
> define dso_local i32 @_Z5stuffiiii(i32 %i, i32 %array_val, i32 %j, i32
> %scalar_val) local_unnamed_addr #0 {
> entry:
> %s = alloca %struct.S, align 4
> %0 = bitcast %struct.S* %s to i8*
> call void @llvm.lifetime.start.p0i8(i64 20, i8* nonnull %0) #2
> %a = getelementptr inbounds %struct.S, %struct.S* %s, i64 0, i32 0
> store i32 %scalar_val, i32* %a, align 4, !tbaa !3
> %idxprom = sext i32 %i to i64
> %arrayidx = getelementptr inbounds %struct.S, %struct.S* %s, i64 0,
> i32 1, i64 %idxprom
> store i32 %array_val, i32* %arrayidx, align 4, !tbaa !8
> %1 = load i32, i32* %a, align 4, !tbaa !3
> call void @llvm.lifetime.end.p0i8(i64 20, i8* nonnull %0) #2
> ret i32 %1
> }
> ```
> Notice the problem? `array_val` couldn't have been stored into `S::a`,
> this particular example should optimize to just
> ```
> define dso_local i32 @_Z5stuffiiii(i32 %i, i32 %array_val, i32 %j, i32
> %scalar_val) local_unnamed_addr #0 {
>   ret i32 %scalar_val
> }
> ```
>
> The even bigger picture here is that SROA simply gives up in presence
> of variable GEP's,
> but if we annotate the extents of such a variable GEP, then, given
> right circumstances,
> we may be able to conclude that the alloca could be split up, and
> certain parts be promoted.
> That is the main motivation for me behind this.
>
> I think, this is sufficient information, but let me know if i should
> address something else.
>
> Roman.
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev

Nuno Lopes via llvm-dev

2021-Dec-22 19:43 UTC

head link

[llvm-dev] [RFC] Memory region declaration intrinsic

First of all, apologies for the delay. I was busy moving between countries &
job.

The non-technical question is whether this matters at all. Do we expect any 
perf benefit from improving alias analysis for this case?

On the technical side, I do like this SSI-like approach, where you create a 
new pointer rather than add stuff to the BB/function context. SSI-based data 
is safer as it's on the def-use chain, which the compiler knows how to 
preserve and validate; you can't break it. When we have assumptions stated 
in the control-flow (like assume or lifetime intrinsics), optimizers can 
easily miscompile by moving or not moving those intrinsics where needed.
I guess we can formalize the intrinsic you propose as returning a 
sub-object, with it's own size and attributes, but aliased storage with the 
main object. It's a view, essentially.

For pointer comparisons, I don't see an issue, as ATM we see comparisons as 
if pointers were integers. So no worries.
But GVN is broken for pointers, and this change will make it even more 
broken. If you have 'if (p == region.decl(p, ..)) use(p)', we can't
blindly
replace p with the decl, as the dereferenceability may be different. Nothing 
new, but it adds pressure to fix GVN once and for all (Alina and I will 
hopefully find time to work on it in January).

Then we have the lifetime intrinsics. I assume we want to forbid pointers to 
sub-objects to be given to lifetime intrinsics. But we need to patch the 
optimizations that handle it to know that killing the main object also kills 
the sub-objects. Probably there's nothing to do as this is mainly used in 
codegen, but just thinking out loud.

I can't remember of other pointer stuff that could potentially interact 
badly with these new pointers.

For the design of the intrinsic itself, I think I prefer Johannes approach 
to allow the intrinsic to get multiple attributes. Rather than offsets, 
objectsize() seems to be sufficient for the use case below. We could also 
use it for alignment info as suggested by Johannes.

Thanks,
Nuno


-----Original Message-----
From: Roman Lebedev
Sent: Tuesday, December 7, 2021 7:24 PM
Subject: [RFC] Memory region declaration intrinsic

Hi all.

Differential: https://reviews.llvm.org/D115274

This is a follow-up to the "[llvm-dev] [RFC] Adding range metadata to
array subscripts.",
https://lists.llvm.org/pipermail/llvm-dev/2021-March/149390.html

Problem statement:

As per C 6.5.6p9 / http://eel.is/c++draft/expr.add#4, given
```
struct S {
    int a[3];
    int b[3];
    int c[3];
};

void bar(int*);

void foo(S* s) {
  bar(&s.b[1]);
}
```
even though the pointer the bar receives has 4 ints to the left of it
and 4 to the right of it, the only ints it can access are
one to the left and one to the right. I.e. it can not go outside of the 
S::b.

But, there is currently no way to encode that knowledge into LLVM IR.
There's limited `inrange` thing  for constant expression GEP's,. since:
* https://reviews.llvm.org/D22793
* https://lists.llvm.org/pipermail/llvm-dev/2016-July/102472.html

... but it's limited to constant expressions. There were previous attempts 
at
removing that restriction, namely that RFC and my patch:
https://reviews.llvm.org/D114988, however implementation experience/review
pointed out a few design problems:
1. Poor opaque pointers interop, it requires the GEP to be into a structure,
    so if it's a pure pointer computation, we suddenly can't preserve
the knowledge.
2. While just adding a bit[s] to GEP instruction allows the
transformation to just ignore it
    if they aren't explicitly taught about it, which is fine from a
legality standpoint,
    it complicates it's preservation through transformation.
3. While i'm not sure how useful it would be, it limits us to
statically-sized arrays.

Instead of following through with that, let me propose a new design:

<begin langref>
```
.. _int_memory_region_decl:

'``llvm.memory.region.decl``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i8* @llvm.memory.region.decl.p0i8(i8* nocapture readnone
returned <ptr>, i64 <begin_offset>, i64 <end_offset>) nofree
nosync
nounwind readnone speculatable willreturn

Overview:
"""""""""

The '``llvm.memory.region.decl``' intrinsic annotates memory region.

Arguments:
""""""""""

This is an overloaded intrinsic. The memory region can belong to any address
space. The first argument is a pointer into the memory region. The returned
pointer, which is the first argument, must belong to the same address space
as the argument. The second argument specifies the offset to the pointer 
(the
first argument) at which the memory region begins. The third argument 
specifies
the offset to the pointer (the first argument) at which the memory region 
ends.

Semantics:
""""""""""

The returned pointer, and, transitively, any pointer that is def-use based 
on
that pointer, points into the memory region ``[ptr+begin_offset,
ptr+end_offset)``,
or is a :ref:`poison value <poisonvalues>` otherwise.

This intrinsic is intended to be an optimization hint, there are no 
correctness
concerns with completely ignoring and/or dropping it. The main use-case is
to be able to annotate array bounds in C family of languages,
which may allow alloca splitting, and better alias analysis.
```
</end langref>

Example:
```
struct S {
  int a;
  int b[4];
};
int* get(S*s, int i) {
  return &s->b[i];
}
```
is currently lowered into
```
define dso_local nonnull i32* @_Z3getP1Si(%struct.S* readnone %s, i32
%i) local_unnamed_addr #0 {
%idxprom = sext i32 %i to i64
%arrayidx = getelementptr inbounds %struct.S, %struct.S* %s, i64 0,
i32 1, i64 %idxprom
ret i32* %arrayidx
}
```
would instead be lowered into
```
define dso_local nonnull i32* @_Z3getP1Si(%struct.S* readnone %s, i32
%i) local_unnamed_addr #0 {
%arrayidx = getelementptr inbounds %struct.S, %struct.S* %s, i64 0,
i32 1, i64 0
%arrayidx.bounded = call i32* @llvm.memory.region.decl.p0i32(i32*
%arrayidx, i64 0, i64 32)
%idxprom = sext i32 %i to i64
%arrayidx3 = getelementptr inbounds i32, i32* %arrayidx.bounded, i64 
%idxprom
ret i32* %arrayidx3
}
```
Concretely, this tells us that %i u<= 4, which should be useful for
Alias Analysis
in less contrived snippets.

The other motivational example, although still contrived:
```
struct S {
int a;
int b[4];
};
int stuff(int i, int array_val, int j, int scalar_val) {
S s;
s.a = scalar_val;
s.b[i] = array_val;
return s.a;
}
```
currently results in:
```
define dso_local i32 @_Z5stuffiiii(i32 %i, i32 %array_val, i32 %j, i32
%scalar_val) local_unnamed_addr #0 {
entry:
%s = alloca %struct.S, align 4
%0 = bitcast %struct.S* %s to i8*
call void @llvm.lifetime.start.p0i8(i64 20, i8* nonnull %0) #2
%a = getelementptr inbounds %struct.S, %struct.S* %s, i64 0, i32 0
store i32 %scalar_val, i32* %a, align 4, !tbaa !3
%idxprom = sext i32 %i to i64
%arrayidx = getelementptr inbounds %struct.S, %struct.S* %s, i64 0,
i32 1, i64 %idxprom
store i32 %array_val, i32* %arrayidx, align 4, !tbaa !8
%1 = load i32, i32* %a, align 4, !tbaa !3
call void @llvm.lifetime.end.p0i8(i64 20, i8* nonnull %0) #2
ret i32 %1
}
```
Notice the problem? `array_val` couldn't have been stored into `S::a`,
this particular example should optimize to just
```
define dso_local i32 @_Z5stuffiiii(i32 %i, i32 %array_val, i32 %j, i32
%scalar_val) local_unnamed_addr #0 {
ret i32 %scalar_val
}
```

The even bigger picture here is that SROA simply gives up in presence
of variable GEP's,
but if we annotate the extents of such a variable GEP, then, given
right circumstances,
we may be able to conclude that the alloca could be split up, and
certain parts be promoted.
That is the main motivation for me behind this.

I think, this is sufficient information, but let me know if i should
address something else.

Roman.

llvm dev - Dec 2021 - [RFC] Memory region declaration intrinsic

[llvm-dev] [RFC] Memory region declaration intrinsic

[llvm-dev] [RFC] Memory region declaration intrinsic

[llvm-dev] [RFC] Memory region declaration intrinsic

[llvm-dev] [RFC] Memory region declaration intrinsic