Hal Finkel
2014-Nov-22 04:35 UTC
[LLVMdev] Upcoming Changes/Additions to Scoped-NoAlias metadata
----- Original Message -----> From: "Raul Silvera" <rsilvera at google.com> > To: "Hal Finkel" <hfinkel at anl.gov> > Cc: "Chandler Carruth" <chandlerc at google.com>, "LLVM Developers Mailing List" <llvmdev at cs.uiuc.edu> > Sent: Tuesday, November 18, 2014 9:09:40 PM > Subject: Re: [LLVMdev] Upcoming Changes/Additions to Scoped-NoAlias metadata > > I preserve them only weakly... I don't want full barriers; in fact, I > plan to add InstCombine code to combine calls to @llvm.noalias (it > will also take a list of scopes, not just one, so this is possible). > The goal is to have as few barriers as possible. > > > Good. > > > > Going further, logically the intrinsic should return a pointer to > > > a > > > new object, disjoint from all other live objects. It is not > > > aliased > > > to A, and is well defined even if it contains &A because A is not > > > referenced in the scope. > > > > This is essentially what is done, but only for accesses in the > > scope > > (or some sub-scope). I don't think the semantics allow for what > > you're suggesting. The specific language from 6.7.3.1p4 says: > > > > [from C] > > During each execution of B, let L be any lvalue that has &L based > > on > > P. If L is used to > > access the value of the object X that it designates, ..., > > then the following requirements apply: ... Every other lvalue > > used to access the value of X shall also have its address based on > > P. > > [end from C] > > > > Where B is defined in 6.7.3.1p2 to be, essentially, the block in > > which the relevant declaration appears. And we can really only > > extrapolate from that to the other access in that block, and not to > > the containing block. > > > > > > Inside that block > > (the lifetime of P) , it is safe to assume that X is > > disjoint from an arbitrary live object > > A. It if was > > > > n't > > , either: > > - A is independently referenced inside the block, so there is UB > > and > > all bets are off. > > - A is not independently referenced inside the blo ck, > > so t here are no pairs of accesses to incorrectly reorder as all > > accesses to A in > > the block are done through P. You just need to delimit the block > > with dataflow barriers > > , summar iz > > ing the effect of the block at entry/exit. > > Okay, I think I agree with you assuming that we put in entry/exit > barriers to preserve the block boundaries. I'd specifically like to > avoid that, however. > > I'm not proposing full code motion barriers, only punctual dataflow > use/defs to signal entry/exit to the scope. > > > > Logically, entering the scope transfers the pointed data into a new > unique block of memory, and puts its address on the restrict > pointer. Exiting the scope transfers it back. Of course you do not > want to actually allocate a new object and move the data, but you > can use these semantics to define the scope entry/exit intrinsics. > Their contribution to dataflow is only limited to the content of the > address used to initialize the restricted pointer. These would be > lighter than the proposed intrinsic as they would not have > specialized control-flow restrictions.Thanks for explaining, I now understand what you're proposing.> > This approach makes the restrict attribute effective against all live > variables without having to examine the extent of the scope to > collect all references, which is in general impractical.I think you've misunderstood this. For restrict-qualified local variables, every memory access within the containing block (which is everything in the function for function argument restrict-qualified pointers) get tagged with the scope. This is trivial to determine.> It also > removes the need for scope metadata, as there would be no need to > name the scopes.Indeed.> > > Anyway, this is just a general alternate design, since you were > asking for one.Yes, and thank you for doing so.> I'm sure still would take some time/effort to map it > onto the LLVM framework.That does not seem too difficult, the question is really just whether or not it gives us what we need...>So in this scheme, we'd have the following: void foo(T * restrict a, T * restrict b) { *a = *b; } T * x = ..., *y = ..., *z = ..., *w = ...; foo(x, y); foo(z, w); become: T * x = ..., *y = ..., *z = ..., *w = ...; T * a1 = @llvm.noalias.start(x); // model: reads from x (with a general write control dep). T * b1 = @llvm.noalias.start(y); *a1 = *b1; @llvm.noalias.end(a1, x); // model: reads from a1, writes to x. @llvm.noalias.end(b1, y); T * a2 = @llvm.noalias.start(z); T * b2 = @llvm.noalias.start(w); *a2 = *b2; @llvm.noalias.end(a2, z); @llvm.noalias.end(b2, w); This does indeed seem generally equivalent to the original proposal in the sense that the original proposal has an implicit ending barrier at the last relevant derived access, and here we have explicit ending barriers. The advantage is the lack of metadata (and associated implementation complexity). The disadvantage is that we have additional barriers to manage, and these are write barriers on the underlying pointers. It is not clear to me this would make too much difference, so long as we aggressively hoisted the ending barriers to just after the last use based on their 'noalias' operands. So this is relatively appealing, and I think would not be a bad way to model C99 restrict (extending the scheme to handle mutually-ambiguous restrict-qualified pointers from aggregates seems straightforward). It does not, however, cover cases where the region of guaranteed disjointness (for lack of a better term) is not continuous. This will come up when implementing a scheme such as that in the current C++ alias-set proposal (N4150). To construct a quick example, imagine that our implementation of std::vector is annotated such that (assuming the standard allocator) each std::vector object's internal storage has a distinct alias set, and we have: std::vector<T> x, y; ... T * q = &x[0]; for (int i = 0; i < 1600; ++i) { x[i] = y[i]; *q += x[i]; } so here we know that the memory accesses inside the operator[] from x and y don't alias, but the alias-set attribute does not tell us about the relationship between those accesses and the *q. The point of dominance, however, needs to associated with the declaration of x and y (specifically, we want to preserve the dominance over the loop). A start/end barrier scheme localized around the inlined operator[] functions would not do that, and placing start/end barriers around the entire live region of x and y would not be correct. I can, however, represent this using the metadata scheme. Thanks again, Hal> > > > Regards, > > > >-- Hal Finkel Assistant Computational Scientist Leadership Computing Facility Argonne National Laboratory
Raul Silvera
2014-Nov-25 01:20 UTC
[LLVMdev] Upcoming Changes/Additions to Scoped-NoAlias metadata
I'm going to have to read N4150 before commenting on your second point, but in the meantime, a few questions. In the original proposal, if you have: T A,B; void foo(T *restrict a, T* restrict b) { A = *b; *a = A; } How is this going to be modeled so that B is aliased to *a and *b, but A isn't? What if the references to A are in a call made from foo, which will be inlined later on? Also, what if the reference is in a side path, like this: T A, B; void foo(T *restrict a, T *restrict b, bool c) { T tmp = *b; if (c) { A = tmp; } *a = *b; } Will you alias A to *a and *b? Seems to me you have to, as foo(&A, &B, false) has well-defined semantics. On the scheme I proposed there is no need, as the exit barrier would separate *a from references to A after the call. Not aliasing A *b will save the reload of *b after the conditional. Thanks Raúl E Silvera | SWE | rsilvera at google.com | *408-789-2846* On Fri, Nov 21, 2014 at 8:35 PM, Hal Finkel <hfinkel at anl.gov> wrote:> ----- Original Message ----- > > From: "Raul Silvera" <rsilvera at google.com> > > To: "Hal Finkel" <hfinkel at anl.gov> > > Cc: "Chandler Carruth" <chandlerc at google.com>, "LLVM Developers Mailing > List" <llvmdev at cs.uiuc.edu> > > Sent: Tuesday, November 18, 2014 9:09:40 PM > > Subject: Re: [LLVMdev] Upcoming Changes/Additions to Scoped-NoAlias > metadata > > > > I preserve them only weakly... I don't want full barriers; in fact, I > > plan to add InstCombine code to combine calls to @llvm.noalias (it > > will also take a list of scopes, not just one, so this is possible). > > The goal is to have as few barriers as possible. > > > > > > Good. > > > > > > Going further, logically the intrinsic should return a pointer to > > > > a > > > > new object, disjoint from all other live objects. It is not > > > > aliased > > > > to A, and is well defined even if it contains &A because A is not > > > > referenced in the scope. > > > > > > This is essentially what is done, but only for accesses in the > > > scope > > > (or some sub-scope). I don't think the semantics allow for what > > > you're suggesting. The specific language from 6.7.3.1p4 says: > > > > > > [from C] > > > During each execution of B, let L be any lvalue that has &L based > > > on > > > P. If L is used to > > > access the value of the object X that it designates, ..., > > > then the following requirements apply: ... Every other lvalue > > > used to access the value of X shall also have its address based on > > > P. > > > [end from C] > > > > > > Where B is defined in 6.7.3.1p2 to be, essentially, the block in > > > which the relevant declaration appears. And we can really only > > > extrapolate from that to the other access in that block, and not to > > > the containing block. > > > > > > > > > Inside that block > > > (the lifetime of P) , it is safe to assume that X is > > > disjoint from an arbitrary live object > > > A. It if was > > > > > > n't > > > , either: > > > - A is independently referenced inside the block, so there is UB > > > and > > > all bets are off. > > > - A is not independently referenced inside the blo ck, > > > so t here are no pairs of accesses to incorrectly reorder as all > > > accesses to A in > > > the block are done through P. You just need to delimit the block > > > with dataflow barriers > > > , summar iz > > > ing the effect of the block at entry/exit. > > > > Okay, I think I agree with you assuming that we put in entry/exit > > barriers to preserve the block boundaries. I'd specifically like to > > avoid that, however. > > > > I'm not proposing full code motion barriers, only punctual dataflow > > use/defs to signal entry/exit to the scope. > > > > > > > > Logically, entering the scope transfers the pointed data into a new > > unique block of memory, and puts its address on the restrict > > pointer. Exiting the scope transfers it back. Of course you do not > > want to actually allocate a new object and move the data, but you > > can use these semantics to define the scope entry/exit intrinsics. > > Their contribution to dataflow is only limited to the content of the > > address used to initialize the restricted pointer. These would be > > lighter than the proposed intrinsic as they would not have > > specialized control-flow restrictions. > > Thanks for explaining, I now understand what you're proposing. > > > > > This approach makes the restrict attribute effective against all live > > variables without having to examine the extent of the scope to > > collect all references, which is in general impractical. > > I think you've misunderstood this. For restrict-qualified local variables, > every memory access within the containing block (which is everything in the > function for function argument restrict-qualified pointers) get tagged with > the scope. This is trivial to determine. > > > It also > > removes the need for scope metadata, as there would be no need to > > name the scopes. > > Indeed. > > > > > > > Anyway, this is just a general alternate design, since you were > > asking for one. > > Yes, and thank you for doing so. > > > I'm sure still would take some time/effort to map it > > onto the LLVM framework. > > That does not seem too difficult, the question is really just whether or > not it gives us what we need... > > > > > So in this scheme, we'd have the following: > > void foo(T * restrict a, T * restrict b) { > *a = *b; > } > > T * x = ..., *y = ..., *z = ..., *w = ...; > foo(x, y); > foo(z, w); > > become: > > T * x = ..., *y = ..., *z = ..., *w = ...; > > T * a1 = @llvm.noalias.start(x); // model: reads from x (with a general > write control dep). > T * b1 = @llvm.noalias.start(y); > *a1 = *b1; > @llvm.noalias.end(a1, x); // model: reads from a1, writes to x. > @llvm.noalias.end(b1, y); > > T * a2 = @llvm.noalias.start(z); > T * b2 = @llvm.noalias.start(w); > *a2 = *b2; > @llvm.noalias.end(a2, z); > @llvm.noalias.end(b2, w); > > This does indeed seem generally equivalent to the original proposal in the > sense that the original proposal has an implicit ending barrier at the last > relevant derived access, and here we have explicit ending barriers. The > advantage is the lack of metadata (and associated implementation > complexity). The disadvantage is that we have additional barriers to > manage, and these are write barriers on the underlying pointers. It is not > clear to me this would make too much difference, so long as we aggressively > hoisted the ending barriers to just after the last use based on their > 'noalias' operands. > > So this is relatively appealing, and I think would not be a bad way to > model C99 restrict (extending the scheme to handle mutually-ambiguous > restrict-qualified pointers from aggregates seems straightforward). It does > not, however, cover cases where the region of guaranteed disjointness (for > lack of a better term) is not continuous. This will come up when > implementing a scheme such as that in the current C++ alias-set proposal > (N4150). To construct a quick example, imagine that our implementation of > std::vector is annotated such that (assuming the standard allocator) each > std::vector object's internal storage has a distinct alias set, and we have: > > std::vector<T> x, y; > ... > T * q = &x[0]; > for (int i = 0; i < 1600; ++i) { > x[i] = y[i]; > *q += x[i]; > } > > so here we know that the memory accesses inside the operator[] from x and > y don't alias, but the alias-set attribute does not tell us about the > relationship between those accesses and the *q. The point of dominance, > however, needs to associated with the declaration of x and y (specifically, > we want to preserve the dominance over the loop). A start/end barrier > scheme localized around the inlined operator[] functions would not do that, > and placing start/end barriers around the entire live region of x and y > would not be correct. I can, however, represent this using the metadata > scheme. > > Thanks again, > Hal > > > > > > > > > Regards, > > > > > > > > > > -- > Hal Finkel > Assistant Computational Scientist > Leadership Computing Facility > Argonne National Laboratory >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20141124/2bddb304/attachment.html>
Hal Finkel
2014-Nov-25 18:43 UTC
[LLVMdev] Upcoming Changes/Additions to Scoped-NoAlias metadata
----- Original Message -----> From: "Raul Silvera" <rsilvera at google.com> > To: "Hal Finkel" <hfinkel at anl.gov> > Cc: "Chandler Carruth" <chandlerc at google.com>, "LLVM Developers Mailing List" <llvmdev at cs.uiuc.edu> > Sent: Monday, November 24, 2014 7:20:04 PM > Subject: Re: [LLVMdev] Upcoming Changes/Additions to Scoped-NoAlias metadata > > > I'm going to have to read N4150 before commenting on your second > point, but in the meantime, a few questions. In the original > proposal, if you have: > > > T A,B; > void foo(T *restrict a, T* restrict b) { > A = *b; > *a = A; > } > > > How is this going to be modeled so that B is aliased to *a and *b, > but A isn't? What if the references to A are in a call made from > foo, which will be inlined later on?Within the current AA framework, this question has no well-defined meaning. AA only answers queries sensibly regarding values that are simultaneously live and referenced at some point in the function's control flow. Metadata aside, even with the current function argument attributes, querying alias('B', '*b') will respond with NoAlias (for both values, llvm::isIdentifiedObject will return true -- it will return true for both GlobalVariables and for noalias arguments). This is another example of why you cannot use our current AA framework for IPA, doing so simply does not make sense. The metadata/intrinsics scheme I've proposed does not change this.> > > Also, what if the reference is in a side path, like this: > > > T A, B; > void foo(T *restrict a, T *restrict b, bool c) { > T tmp = *b; > if (c) { > A = tmp; > } > *a = *b; > } > > > Will you alias A to *a and *b? Seems to me you have to, as foo(&A, > &B, false) has well-defined semantics.No, and it does not do so now. Querying A or B (global variables) vs. *a or *b will return NoAlias. There is no point in the CFG of this function where an object accessed through *a or *b can legally be A or B. Now to the proposed metadata/intrinsics, let's assume that A and B are local variables in the caller. It is true that *a and *b will be based on a @llvm.noalias result, but that noalias result will only be used if the access is tagged with the associated metadata scope (which will be true for *a, *b and the A within the inlined function, but not true otherwise, thus the semantics will be identical after inlining.> On the scheme I proposed > there is no need, as the exit barrier would separate *a from > references to A after the call. Not aliasing A *b will save the > reload of *b after the conditional.I agree your scheme will also capture this. Thanks again, Hal> > > Thanks > > > > > > > Raúl E Silvera | SWE | rsilvera at google.com | 408-789-2846 > > > On Fri, Nov 21, 2014 at 8:35 PM, Hal Finkel < hfinkel at anl.gov > > wrote: > > > ----- Original Message ----- > > From: "Raul Silvera" < rsilvera at google.com > > > To: "Hal Finkel" < hfinkel at anl.gov > > > Cc: "Chandler Carruth" < chandlerc at google.com >, "LLVM Developers > > Mailing List" < llvmdev at cs.uiuc.edu > > > Sent: Tuesday, November 18, 2014 9:09:40 PM > > Subject: Re: [LLVMdev] Upcoming Changes/Additions to Scoped-NoAlias > > metadata > > > > > > I preserve them only weakly... I don't want full barriers; in fact, > > I > > plan to add InstCombine code to combine calls to @llvm.noalias (it > > will also take a list of scopes, not just one, so this is > > possible). > > The goal is to have as few barriers as possible. > > > > > > Good. > > > > > > Going further, logically the intrinsic should return a pointer > > > > to > > > > a > > > > new object, disjoint from all other live objects. It is not > > > > aliased > > > > to A, and is well defined even if it contains &A because A is > > > > not > > > > referenced in the scope. > > > > > > This is essentially what is done, but only for accesses in the > > > scope > > > (or some sub-scope). I don't think the semantics allow for what > > > you're suggesting. The specific language from 6.7.3.1p4 says: > > > > > > [from C] > > > During each execution of B, let L be any lvalue that has &L based > > > on > > > P. If L is used to > > > access the value of the object X that it designates, ..., > > > then the following requirements apply: ... Every other lvalue > > > used to access the value of X shall also have its address based > > > on > > > P. > > > [end from C] > > > > > > Where B is defined in 6.7.3.1p2 to be, essentially, the block in > > > which the relevant declaration appears. And we can really only > > > extrapolate from that to the other access in that block, and not > > > to > > > the containing block. > > > > > > > > > Inside that block > > > (the lifetime of P) , it is safe to assume that X is > > > disjoint from an arbitrary live object > > > A. It if was > > > > > > n't > > > , either: > > > - A is independently referenced inside the block, so there is UB > > > and > > > all bets are off. > > > - A is not independently referenced inside the blo ck, > > > so t here are no pairs of accesses to incorrectly reorder as all > > > accesses to A in > > > the block are done through P. You just need to delimit the block > > > with dataflow barriers > > > , summar iz > > > ing the effect of the block at entry/exit. > > > > Okay, I think I agree with you assuming that we put in entry/exit > > barriers to preserve the block boundaries. I'd specifically like to > > avoid that, however. > > > > I'm not proposing full code motion barriers, only punctual dataflow > > use/defs to signal entry/exit to the scope. > > > > > > > > Logically, entering the scope transfers the pointed data into a new > > unique block of memory, and puts its address on the restrict > > pointer. Exiting the scope transfers it back. Of course you do not > > want to actually allocate a new object and move the data, but you > > can use these semantics to define the scope entry/exit intrinsics. > > Their contribution to dataflow is only limited to the content of > > the > > address used to initialize the restricted pointer. These would be > > lighter than the proposed intrinsic as they would not have > > specialized control-flow restrictions. > > Thanks for explaining, I now understand what you're proposing. > > > > > This approach makes the restrict attribute effective against all > > live > > variables without having to examine the extent of the scope to > > collect all references, which is in general impractical. > > I think you've misunderstood this. For restrict-qualified local > variables, every memory access within the containing block (which is > everything in the function for function argument restrict-qualified > pointers) get tagged with the scope. This is trivial to determine. > > > It also > > removes the need for scope metadata, as there would be no need to > > name the scopes. > > Indeed. > > > > > > > Anyway, this is just a general alternate design, since you were > > asking for one. > > Yes, and thank you for doing so. > > > I'm sure still would take some time/effort to map it > > onto the LLVM framework. > > That does not seem too difficult, the question is really just whether > or not it gives us what we need... > > > > > So in this scheme, we'd have the following: > > void foo(T * restrict a, T * restrict b) { > *a = *b; > } > > T * x = ..., *y = ..., *z = ..., *w = ...; > foo(x, y); > foo(z, w); > > become: > > T * x = ..., *y = ..., *z = ..., *w = ...; > > T * a1 = @llvm.noalias.start(x); // model: reads from x (with a > general write control dep). > T * b1 = @llvm.noalias.start(y); > *a1 = *b1; > @llvm.noalias.end(a1, x); // model: reads from a1, writes to x. > @llvm.noalias.end(b1, y); > > T * a2 = @llvm.noalias.start(z); > T * b2 = @llvm.noalias.start(w); > *a2 = *b2; > @llvm.noalias.end(a2, z); > @llvm.noalias.end(b2, w); > > This does indeed seem generally equivalent to the original proposal > in the sense that the original proposal has an implicit ending > barrier at the last relevant derived access, and here we have > explicit ending barriers. The advantage is the lack of metadata (and > associated implementation complexity). The disadvantage is that we > have additional barriers to manage, and these are write barriers on > the underlying pointers. It is not clear to me this would make too > much difference, so long as we aggressively hoisted the ending > barriers to just after the last use based on their 'noalias' > operands. > > So this is relatively appealing, and I think would not be a bad way > to model C99 restrict (extending the scheme to handle > mutually-ambiguous restrict-qualified pointers from aggregates seems > straightforward). It does not, however, cover cases where the region > of guaranteed disjointness (for lack of a better term) is not > continuous. This will come up when implementing a scheme such as > that in the current C++ alias-set proposal (N4150). To construct a > quick example, imagine that our implementation of std::vector is > annotated such that (assuming the standard allocator) each > std::vector object's internal storage has a distinct alias set, and > we have: > > std::vector<T> x, y; > ... > T * q = &x[0]; > for (int i = 0; i < 1600; ++i) { > x[i] = y[i]; > *q += x[i]; > } > > so here we know that the memory accesses inside the operator[] from x > and y don't alias, but the alias-set attribute does not tell us > about the relationship between those accesses and the *q. The point > of dominance, however, needs to associated with the declaration of x > and y (specifically, we want to preserve the dominance over the > loop). A start/end barrier scheme localized around the inlined > operator[] functions would not do that, and placing start/end > barriers around the entire live region of x and y would not be > correct. I can, however, represent this using the metadata scheme. > > Thanks again, > Hal > > > > > > > > > Regards, > > > > > > > > > > > > -- > Hal Finkel > Assistant Computational Scientist > Leadership Computing Facility > Argonne National Laboratory > >-- Hal Finkel Assistant Computational Scientist Leadership Computing Facility Argonne National Laboratory