Eli Friedman via llvm-dev
2021-Dec-07 21:57 UTC
[llvm-dev] [RFC] Memory region declaration intrinsic
How does this interact with "inbounds" markings on GEPs? Can the result of a non-inbounds GEP point outside the specified region? I assume the input and result pointer alias? Given that, I don't think the "nocapture" is right. (There's a potential use-case for a similar intrinsic where the result doesn't alias the argument, but I guess that would be a different thing.) The usual concern with this sort of intrinsic is the work involved in teaching a bunch of pointer optimizations the meaning of the new intrinsic. Encoding array bounds is probably important enough to be worth the extra work, though. -Eli> -----Original Message----- > From: Roman Lebedev <lebedev.ri at gmail.com> > Sent: Tuesday, December 7, 2021 11:24 AM > To: llvm-dev at lists.llvm.org > Cc: Johannes Doerfert <jdoerfert at anl.gov>; Eli Friedman > <efriedma at quicinc.com>; Clement Courbet <courbet at google.com>; Nuno > Lopes <nunoplopes at sapo.pt>; Nikita Popov <nikita.ppv at gmail.com>; Peter > Collingbourne <peter at pcc.me.uk>; Philip Reames <listmail at philipreames.com> > Subject: [RFC] Memory region declaration intrinsic > > Hi all. > > Differential: https://reviews.llvm.org/D115274 > > This is a follow-up to the "[llvm-dev] [RFC] Adding range metadata to > array subscripts.", > https://lists.llvm.org/pipermail/llvm-dev/2021-March/149390.html > > Problem statement: > > As per C 6.5.6p9 / http://eel.is/c++draft/expr.add#4, given > ``` > struct S { > int a[3]; > int b[3]; > int c[3]; > }; > > void bar(int*); > > void foo(S* s) { > bar(&s.b[1]); > } > ``` > even though the pointer the bar receives has 4 ints to the left of it > and 4 to the right of it, the only ints it can access are > one to the left and one to the right. I.e. it can not go outside of the S::b. > > But, there is currently no way to encode that knowledge into LLVM IR. > There's limited `inrange` thing for constant expression GEP's,. since: > * https://reviews.llvm.org/D22793 > * https://lists.llvm.org/pipermail/llvm-dev/2016-July/102472.html > > ... but it's limited to constant expressions. There were previous attempts at > removing that restriction, namely that RFC and my patch: > https://reviews.llvm.org/D114988, however implementation experience/review > pointed out a few design problems: > 1. Poor opaque pointers interop, it requires the GEP to be into a structure, > so if it's a pure pointer computation, we suddenly can't preserve > the knowledge. > 2. While just adding a bit[s] to GEP instruction allows the > transformation to just ignore it > if they aren't explicitly taught about it, which is fine from a > legality standpoint, > it complicates it's preservation through transformation. > 3. While i'm not sure how useful it would be, it limits us to > statically-sized arrays. > > Instead of following through with that, let me propose a new design: > > <begin langref> > ``` > .. _int_memory_region_decl: > > '``llvm.memory.region.decl``' Intrinsic > ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ > > Syntax: > """"""" > > :: > > declare i8* @llvm.memory.region.decl.p0i8(i8* nocapture readnone > returned <ptr>, i64 <begin_offset>, i64 <end_offset>) nofree nosync > nounwind readnone speculatable willreturn > > Overview: > """"""""" > > The '``llvm.memory.region.decl``' intrinsic annotates memory region. > > Arguments: > """""""""" > > This is an overloaded intrinsic. The memory region can belong to any address > space. The first argument is a pointer into the memory region. The returned > pointer, which is the first argument, must belong to the same address space > as the argument. The second argument specifies the offset to the pointer (the > first argument) at which the memory region begins. The third argument specifies > the offset to the pointer (the first argument) at which the memory region ends. > > Semantics: > """""""""" > > The returned pointer, and, transitively, any pointer that is def-use based on > that pointer, points into the memory region ``[ptr+begin_offset, > ptr+end_offset)``, > or is a :ref:`poison value <poisonvalues>` otherwise. > > This intrinsic is intended to be an optimization hint, there are no correctness > concerns with completely ignoring and/or dropping it. The main use-case is > to be able to annotate array bounds in C family of languages, > which may allow alloca splitting, and better alias analysis. > ``` > </end langref> > > Example: > ``` > struct S { > int a; > int b[4]; > }; > int* get(S*s, int i) { > return &s->b[i]; > } > ``` > is currently lowered into > ``` > define dso_local nonnull i32* @_Z3getP1Si(%struct.S* readnone %s, i32 > %i) local_unnamed_addr #0 { > %idxprom = sext i32 %i to i64 > %arrayidx = getelementptr inbounds %struct.S, %struct.S* %s, i64 0, > i32 1, i64 %idxprom > ret i32* %arrayidx > } > ``` > would instead be lowered into > ``` > define dso_local nonnull i32* @_Z3getP1Si(%struct.S* readnone %s, i32 > %i) local_unnamed_addr #0 { > %arrayidx = getelementptr inbounds %struct.S, %struct.S* %s, i64 0, > i32 1, i64 0 > %arrayidx.bounded = call i32* @llvm.memory.region.decl.p0i32(i32* > %arrayidx, i64 0, i64 32) > %idxprom = sext i32 %i to i64 > %arrayidx3 = getelementptr inbounds i32, i32* %arrayidx.bounded, i64 > %idxprom > ret i32* %arrayidx3 > } > ``` > Concretely, this tells us that %i u<= 4, which should be useful for > Alias Analysis > in less contrived snippets. > > The other motivational example, although still contrived: > ``` > struct S { > int a; > int b[4]; > }; > int stuff(int i, int array_val, int j, int scalar_val) { > S s; > s.a = scalar_val; > s.b[i] = array_val; > return s.a; > } > ``` > currently results in: > ``` > define dso_local i32 @_Z5stuffiiii(i32 %i, i32 %array_val, i32 %j, i32 > %scalar_val) local_unnamed_addr #0 { > entry: > %s = alloca %struct.S, align 4 > %0 = bitcast %struct.S* %s to i8* > call void @llvm.lifetime.start.p0i8(i64 20, i8* nonnull %0) #2 > %a = getelementptr inbounds %struct.S, %struct.S* %s, i64 0, i32 0 > store i32 %scalar_val, i32* %a, align 4, !tbaa !3 > %idxprom = sext i32 %i to i64 > %arrayidx = getelementptr inbounds %struct.S, %struct.S* %s, i64 0, > i32 1, i64 %idxprom > store i32 %array_val, i32* %arrayidx, align 4, !tbaa !8 > %1 = load i32, i32* %a, align 4, !tbaa !3 > call void @llvm.lifetime.end.p0i8(i64 20, i8* nonnull %0) #2 > ret i32 %1 > } > ``` > Notice the problem? `array_val` couldn't have been stored into `S::a`, > this particular example should optimize to just > ``` > define dso_local i32 @_Z5stuffiiii(i32 %i, i32 %array_val, i32 %j, i32 > %scalar_val) local_unnamed_addr #0 { > ret i32 %scalar_val > } > ``` > > The even bigger picture here is that SROA simply gives up in presence > of variable GEP's, > but if we annotate the extents of such a variable GEP, then, given > right circumstances, > we may be able to conclude that the alloca could be split up, and > certain parts be promoted. > That is the main motivation for me behind this. > > I think, this is sufficient information, but let me know if i should > address something else. > > Roman.
Roman Lebedev via llvm-dev
2021-Dec-07 22:39 UTC
[llvm-dev] [RFC] Memory region declaration intrinsic
On Wed, Dec 8, 2021 at 12:57 AM Eli Friedman <efriedma at quicinc.com> wrote:> > How does this interact with "inbounds" markings on GEPs? Can the result of a non-inbounds GEP point outside the specified region?Uh, it's a trick question, isn't it? :) Right now I would basically say that "inbounds" markings on GEPs is orthogonal here, because this intrinsic simply "clamps" the underlying "allocated object". (and naturally, you can't unclamp it afterwards *in this def-use chain*) I.e. if you go out of the specified bounds with `inbounds` you get poison (and i think you can't go back from there?); but if you go out of the specified bounds without `inbounds` it's not poison, but dereferencing said pointer is still UB. Let me know if this isn't a proper answer, or if i have talked myself into a corner here :)> I assume the input and result pointer alias?The input pointer is directly returned, so they are the same, so they are mustalias.> Given that, I don't think the "nocapture" is right.Hmm. I guess: https://alive2.llvm.org/ce/z/frx2aI Okay.> (There's a potential use-case for a similar intrinsic where the result doesn't alias the argument, but I guess that would be a different thing.)Nothing immediately comes into mind.> The usual concern with this sort of intrinsic is the work involved in teaching a bunch of pointer optimizations the meaning of the new intrinsic.Yes, ignoring the implementation cost of the improvements that will be made possible, there's going to be //some// implementation cost incurred to de-pessimize the existing transformations, or, perhaps more importantly, avoiding losing this intrinsic during transformations. I acknowledge this worry.> Encoding array bounds is probably important enough to be worth the extra work, though.I think it'll solve a big knowledge propagation hole, so i certainly hope so, yes.> -EliRoman> > -----Original Message----- > > From: Roman Lebedev <lebedev.ri at gmail.com> > > Sent: Tuesday, December 7, 2021 11:24 AM > > To: llvm-dev at lists.llvm.org > > Cc: Johannes Doerfert <jdoerfert at anl.gov>; Eli Friedman > > <efriedma at quicinc.com>; Clement Courbet <courbet at google.com>; Nuno > > Lopes <nunoplopes at sapo.pt>; Nikita Popov <nikita.ppv at gmail.com>; Peter > > Collingbourne <peter at pcc.me.uk>; Philip Reames <listmail at philipreames.com> > > Subject: [RFC] Memory region declaration intrinsic > > > > Hi all. > > > > Differential: https://reviews.llvm.org/D115274 > > > > This is a follow-up to the "[llvm-dev] [RFC] Adding range metadata to > > array subscripts.", > > https://lists.llvm.org/pipermail/llvm-dev/2021-March/149390.html > > > > Problem statement: > > > > As per C 6.5.6p9 / http://eel.is/c++draft/expr.add#4, given > > ``` > > struct S { > > int a[3]; > > int b[3]; > > int c[3]; > > }; > > > > void bar(int*); > > > > void foo(S* s) { > > bar(&s.b[1]); > > } > > ``` > > even though the pointer the bar receives has 4 ints to the left of it > > and 4 to the right of it, the only ints it can access are > > one to the left and one to the right. I.e. it can not go outside of the S::b. > > > > But, there is currently no way to encode that knowledge into LLVM IR. > > There's limited `inrange` thing for constant expression GEP's,. since: > > * https://reviews.llvm.org/D22793 > > * https://lists.llvm.org/pipermail/llvm-dev/2016-July/102472.html > > > > ... but it's limited to constant expressions. There were previous attempts at > > removing that restriction, namely that RFC and my patch: > > https://reviews.llvm.org/D114988, however implementation experience/review > > pointed out a few design problems: > > 1. Poor opaque pointers interop, it requires the GEP to be into a structure, > > so if it's a pure pointer computation, we suddenly can't preserve > > the knowledge. > > 2. While just adding a bit[s] to GEP instruction allows the > > transformation to just ignore it > > if they aren't explicitly taught about it, which is fine from a > > legality standpoint, > > it complicates it's preservation through transformation. > > 3. While i'm not sure how useful it would be, it limits us to > > statically-sized arrays. > > > > Instead of following through with that, let me propose a new design: > > > > <begin langref> > > ``` > > .. _int_memory_region_decl: > > > > '``llvm.memory.region.decl``' Intrinsic > > ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ > > > > Syntax: > > """"""" > > > > :: > > > > declare i8* @llvm.memory.region.decl.p0i8(i8* nocapture readnone > > returned <ptr>, i64 <begin_offset>, i64 <end_offset>) nofree nosync > > nounwind readnone speculatable willreturn > > > > Overview: > > """"""""" > > > > The '``llvm.memory.region.decl``' intrinsic annotates memory region. > > > > Arguments: > > """""""""" > > > > This is an overloaded intrinsic. The memory region can belong to any address > > space. The first argument is a pointer into the memory region. The returned > > pointer, which is the first argument, must belong to the same address space > > as the argument. The second argument specifies the offset to the pointer (the > > first argument) at which the memory region begins. The third argument specifies > > the offset to the pointer (the first argument) at which the memory region ends. > > > > Semantics: > > """""""""" > > > > The returned pointer, and, transitively, any pointer that is def-use based on > > that pointer, points into the memory region ``[ptr+begin_offset, > > ptr+end_offset)``, > > or is a :ref:`poison value <poisonvalues>` otherwise. > > > > This intrinsic is intended to be an optimization hint, there are no correctness > > concerns with completely ignoring and/or dropping it. The main use-case is > > to be able to annotate array bounds in C family of languages, > > which may allow alloca splitting, and better alias analysis. > > ``` > > </end langref> > > > > Example: > > ``` > > struct S { > > int a; > > int b[4]; > > }; > > int* get(S*s, int i) { > > return &s->b[i]; > > } > > ``` > > is currently lowered into > > ``` > > define dso_local nonnull i32* @_Z3getP1Si(%struct.S* readnone %s, i32 > > %i) local_unnamed_addr #0 { > > %idxprom = sext i32 %i to i64 > > %arrayidx = getelementptr inbounds %struct.S, %struct.S* %s, i64 0, > > i32 1, i64 %idxprom > > ret i32* %arrayidx > > } > > ``` > > would instead be lowered into > > ``` > > define dso_local nonnull i32* @_Z3getP1Si(%struct.S* readnone %s, i32 > > %i) local_unnamed_addr #0 { > > %arrayidx = getelementptr inbounds %struct.S, %struct.S* %s, i64 0, > > i32 1, i64 0 > > %arrayidx.bounded = call i32* @llvm.memory.region.decl.p0i32(i32* > > %arrayidx, i64 0, i64 32) > > %idxprom = sext i32 %i to i64 > > %arrayidx3 = getelementptr inbounds i32, i32* %arrayidx.bounded, i64 > > %idxprom > > ret i32* %arrayidx3 > > } > > ``` > > Concretely, this tells us that %i u<= 4, which should be useful for > > Alias Analysis > > in less contrived snippets. > > > > The other motivational example, although still contrived: > > ``` > > struct S { > > int a; > > int b[4]; > > }; > > int stuff(int i, int array_val, int j, int scalar_val) { > > S s; > > s.a = scalar_val; > > s.b[i] = array_val; > > return s.a; > > } > > ``` > > currently results in: > > ``` > > define dso_local i32 @_Z5stuffiiii(i32 %i, i32 %array_val, i32 %j, i32 > > %scalar_val) local_unnamed_addr #0 { > > entry: > > %s = alloca %struct.S, align 4 > > %0 = bitcast %struct.S* %s to i8* > > call void @llvm.lifetime.start.p0i8(i64 20, i8* nonnull %0) #2 > > %a = getelementptr inbounds %struct.S, %struct.S* %s, i64 0, i32 0 > > store i32 %scalar_val, i32* %a, align 4, !tbaa !3 > > %idxprom = sext i32 %i to i64 > > %arrayidx = getelementptr inbounds %struct.S, %struct.S* %s, i64 0, > > i32 1, i64 %idxprom > > store i32 %array_val, i32* %arrayidx, align 4, !tbaa !8 > > %1 = load i32, i32* %a, align 4, !tbaa !3 > > call void @llvm.lifetime.end.p0i8(i64 20, i8* nonnull %0) #2 > > ret i32 %1 > > } > > ``` > > Notice the problem? `array_val` couldn't have been stored into `S::a`, > > this particular example should optimize to just > > ``` > > define dso_local i32 @_Z5stuffiiii(i32 %i, i32 %array_val, i32 %j, i32 > > %scalar_val) local_unnamed_addr #0 { > > ret i32 %scalar_val > > } > > ``` > > > > The even bigger picture here is that SROA simply gives up in presence > > of variable GEP's, > > but if we annotate the extents of such a variable GEP, then, given > > right circumstances, > > we may be able to conclude that the alloca could be split up, and > > certain parts be promoted. > > That is the main motivation for me behind this. > > > > I think, this is sufficient information, but let me know if i should > > address something else. > > > > Roman.