thr3ads.net - llvm dev - [LLVMdev] Upcoming Changes/Additions to Scoped-NoAlias metadata [Nov 2014]

Hal Finkel

2014-Nov-18 22:50 UTC

[LLVMdev] Upcoming Changes/Additions to Scoped-NoAlias metadata

----- Original Message -----
> From: "Raul Silvera" <rsilvera at google.com>
> To: "Hal Finkel" <hfinkel at anl.gov>
> Cc: "Chandler Carruth" <chandlerc at google.com>,
>     "LLVM Developers Mailing List" <llvmdev at cs.uiuc.edu>
> Sent: Tuesday, November 18, 2014 11:23:25 AM
> Subject: Re: [LLVMdev] Upcoming Changes/Additions to Scoped-NoAlias metadata
> 
> > > You don't have x1 and x2 in your example, assuming you mean:
> > > 
> > > int i = 0;
> > > T A;
> > > T * y2 = ...
> > > {
> > > T * x1 = &A;
> > > a = x1[i];
> > > }
> > > {
> > > T * restrict x2 = y2;
> > > b = x2[i];
> > > }
> > > 
> > > It should, no? by virtue of x2 being restrict you know that *x2
> > > doesn't alias A, and *x1 is A.
> > 
> > No, it doesn't. The fact that x2 is restrict does not mean that it
> > does not alias with any other potential accesses from variables live
> > in its block. It only means it does not alias with other accesses
> > that occur in the block where x2 is live. There is no access to
> > A or x1 in that block, so we can say nothing about it.
> > 
> > 
> > 
> > It does. You can assume x2 is not aliased to A and still get
> > well-defined semantics, precisely because A is not referenced in the
> > scope of x2. That refinement would only get you into trouble if A is
> > referenced in the scope of x2, which would trigger UB.
> 
> I don't understand exactly what you're saying here. You can do that
> at the source level where you still have the original blocks. The
> problem is that, at the IR level, these blocks don't remain separate
> basic blocks, and the distinction then matters.
> 
> 
> Agreed. My point is that if you preserve the block boundaries you can use
> better aliasing for the restricted pointers. You are already preserving the
> block entry by introducing the intrinsic; the block exits could be similarly
> preserved.
I preserve them only weakly... I don't want full barriers; in fact, I plan
to add InstCombine code to combine calls to @llvm.noalias (it will also take a
list of scopes, not just one, so this is possible). The goal is to have as few
barriers as possible.
> 
> 
> 
> > Going further, logically the intrinsic should return a pointer to a
> > new object, disjoint from all other live objects. It is not aliased
> > to A, and is well defined even if it contains &A because A is not
> > referenced in the scope.
> 
> This is essentially what is done, but only for accesses in the scope
> (or some sub-scope). I don't think the semantics allow for what
> you're suggesting. The specific language from 6.7.3.1p4 says:
> 
> [from C]
> During each execution of B, let L be any lvalue that has &L based on
> P. If L is used to
> access the value of the object X that it designates, ...,
> then the following requirements apply: ... Every other lvalue
> used to access the value of X shall also have its address based on P.
> [end from C]
> 
> Where B is defined in 6.7.3.1p2 to be, essentially, the block in
> which the relevant declaration appears. And we can really only
> extrapolate from that to the other access in that block, and not to
> the containing block.
> 
> 
> Inside that block (the lifetime of P), it is safe to assume that X is
> disjoint from an arbitrary live object A. If it wasn't, either:
> - A is independently referenced inside the block, so there is UB and
>   all bets are off.
> - A is not independently referenced inside the block, so there are no
>   pairs of accesses to incorrectly reorder as all accesses to A in
>   the block are done through P.
> You just need to delimit the block with dataflow barriers, summarizing
> the effect of the block at entry/exit.
Okay, I think I agree with you assuming that we put in entry/exit barriers to
preserve the block boundaries. I'd specifically like to avoid that, however.
> 
> 
> 
> This is similar to the way dummy args are implemented on Fortran
> compilers, extended to arbitrary scopes.
Interesting.

Thanks again,
Hal
> 
> 
> 
> > This does require dataflow barriers on
> > entrance/exits to the scope, but those can be made no worse than the
> > original code.
> 
> These don't turn into general scheduling barriers anyway. They'll be
> tagged as writing to memory, yes, but like with @llvm.assume,
> they'll get special treatment in BasicAA and a few other places so
> they don't hurt code motion too badly.
> 
> > 
> > 
> > 
> > Aliasing x2 to A is not only unnecessary, but also pessimistic
> 
> It is pessimistic, but only in the sense that the restrict qualifier
> does not say anything about it.
> 
> > because in general you do not have access to the dynamic scope of
> > the restricted pointer.
> > 
> > 
> > 
> > 
> > T A, B;
> > T * x1 = .... // either &A or &B
> > T * y2 = .... // maybe &A
> > {
> > T * restrict x2 = y2;
> > *x1 = ...
> > *x2 = ...
> > }
> > 
> > > 
> > > In this case you'll be able to tell *x1 doesn't alias *x2,
> > > right?
> > 
> > In this case, yes, we can conclude that x1 and x2 don't alias
> > (because *x1 and *x2 cannot both legally refer to the same object).
> > 
> > > How about if you add restrict to x1?
> > 
> > The conclusion is the same, but if you add restrict to x1, you don't
> > need it on x2. x2 is definitely not based on x1, so if x1 is
> > restrict, then we know that x1 and x2 don't alias.
> > 
> > Agreed. So will your approach be able to catch both cases? It seemed
> > to me it wouldn't be able to catch the second one because it would
> > have a different scope, but probably I'm missing something.
> 
> Yes, it will catch it. Just as in the current metadata design, the
> scope of each access is really a list of scopes. The accesses in the
> inner blocks get tagged with both the inner and the outer scopes, so
> they pick up the restrict from the outer scope.
> 
> > 
> > 
> > Thanks for your patience,
> > 
> 
> Not a problem; I appreciate the feedback!
> 
> -Hal
-- 
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory
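
The two source examples discussed in the message above can be written as compilable C. This is only a sketch: the function names sibling_blocks and same_block, the parameter pa (standing in for &A), and the stored values are assumptions made for illustration and do not appear in the thread.

```c
#include <assert.h>

typedef int T;

/* First example: the access through x1 and the access through the
 * restrict-qualified x2 sit in sibling blocks. No access to *pa occurs
 * inside x2's block, so restrict says nothing about x1 versus x2 here. */
T sibling_blocks(T *pa, T *y2)
{
    int i = 0;
    T a, b;
    {
        T *x1 = pa;
        a = x1[i];              /* happens outside the block of x2 */
    }
    {
        T *restrict x2 = y2;
        b = x2[i];              /* the only access in x2's block */
    }
    return a + b;
}

/* Second example: both stores are inside x2's block, so *x1 and *x2
 * must not designate the same object. */
void same_block(T *x1, T *y2)
{
    assert(x1 != y2);           /* precondition implied by restrict */
    {
        T *restrict x2 = y2;
        *x1 = 1;
        *x2 = 2;
    }
}
```

Passing the same address for both arguments of sibling_blocks is well defined, since nothing is modified and the two accesses are in different blocks. Doing the same with same_block would be undefined behavior under 6.7.3.1p4, which is why the compiler may treat the two stores there as non-aliasing.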

Raul Silvera

2014-Nov-19 03:09 UTC

[LLVMdev] Upcoming Changes/Additions to Scoped-NoAlias metadata

>
>
> I preserve them only weakly... I don't want full barriers; in fact, I plan
> to add InstCombine code to combine calls to @llvm.noalias (it will also
> take a list of scopes, not just one, so this is possible). The goal is to
> have as few barriers as possible.

Good.

> > > Going further, logically the intrinsic should return a pointer to a
> > > new object, disjoint from all other live objects. It is not aliased
> > > to A, and is well defined even if it contains &A because A is not
> > > referenced in the scope.
> >
> > This is essentially what is done, but only for accesses in the scope
> > (or some sub-scope). I don't think the semantics allow for what
> > you're suggesting. The specific language from 6.7.3.1p4 says:
> >
> > [from C]
> > During each execution of B, let L be any lvalue that has &L based on
> > P. If L is used to
> > access the value of the object X that it designates, ...,
> > then the following requirements apply: ... Every other lvalue
> > used to access the value of X shall also have its address based on P.
> > [end from C]
> >
> > Where B is defined in 6.7.3.1p2 to be, essentially, the block in
> > which the relevant declaration appears. And we can really only
> > extrapolate from that to the other access in that block, and not to
> > the containing block.
> >
> >
> > Inside that block (the lifetime of P), it is safe to assume that X is
> > disjoint from an arbitrary live object A. If it wasn't, either:
> > - A is independently referenced inside the block, so there is UB and
> >   all bets are off.
> > - A is not independently referenced inside the block, so there are no
> >   pairs of accesses to incorrectly reorder as all accesses to A in
> >   the block are done through P.
> > You just need to delimit the block with dataflow barriers, summarizing
> > the effect of the block at entry/exit.
>
> Okay, I think I agree with you assuming that we put in entry/exit barriers
> to preserve the block boundaries. I'd specifically like to avoid that,
> however.
>
I'm not proposing full code motion barriers, only punctual dataflow
use/defs to signal entry/exit to the scope.

Logically, entering the scope transfers the pointed data into a new unique
block of memory, and puts its address on the restrict pointer. Exiting the
scope transfers it back. Of course you do not want to actually allocate a
new object and move the data, but you can use these semantics to define the
scope entry/exit intrinsics. Their contribution to dataflow is only limited
to the content of the address used to initialize the restricted pointer.
These would be lighter than the proposed intrinsic as they would not have
specialized control-flow restrictions.

This approach makes the restrict attribute effective against all live
variables without having to examine the extent of the scope to collect all
references, which is in general impractical. It also removes the need for
scope metadata, as there would be no need to name the scopes.

Anyway, this is just a general alternate design, since you were asking for
one. I'm sure it would still take some time/effort to map it onto the LLVM
framework.

Regards,

Hal Finkel

2014-Nov-22 04:35 UTC

[LLVMdev] Upcoming Changes/Additions to Scoped-NoAlias metadata

----- Original Message -----
> From: "Raul Silvera" <rsilvera at google.com>
> To: "Hal Finkel" <hfinkel at anl.gov>
> Cc: "Chandler Carruth" <chandlerc at google.com>,
>     "LLVM Developers Mailing List" <llvmdev at cs.uiuc.edu>
> Sent: Tuesday, November 18, 2014 9:09:40 PM
> Subject: Re: [LLVMdev] Upcoming Changes/Additions to Scoped-NoAlias metadata
> 
> I preserve them only weakly... I don't want full barriers; in fact, I
> plan to add InstCombine code to combine calls to @llvm.noalias (it
> will also take a list of scopes, not just one, so this is possible).
> The goal is to have as few barriers as possible.
> 
> 
> Good.
> 
> > > Going further, logically the intrinsic should return a pointer to a
> > > new object, disjoint from all other live objects. It is not aliased
> > > to A, and is well defined even if it contains &A because A is not
> > > referenced in the scope.
> > 
> > This is essentially what is done, but only for accesses in the
> > scope
> > (or some sub-scope). I don't think the semantics allow for what
> > you're suggesting. The specific language from 6.7.3.1p4 says:
> > 
> > [from C]
> > During each execution of B, let L be any lvalue that has &L based on
> > P. If L is used to
> > access the value of the object X that it designates, ...,
> > then the following requirements apply: ... Every other lvalue
> > used to access the value of X shall also have its address based on P.
> > [end from C]
> > 
> > Where B is defined in 6.7.3.1p2 to be, essentially, the block in
> > which the relevant declaration appears. And we can really only
> > extrapolate from that to the other access in that block, and not to
> > the containing block.
> > 
> > 
> > Inside that block (the lifetime of P), it is safe to assume that X is
> > disjoint from an arbitrary live object A. If it wasn't, either:
> > - A is independently referenced inside the block, so there is UB and
> >   all bets are off.
> > - A is not independently referenced inside the block, so there are no
> >   pairs of accesses to incorrectly reorder as all accesses to A in
> >   the block are done through P.
> > You just need to delimit the block with dataflow barriers, summarizing
> > the effect of the block at entry/exit.
> 
> Okay, I think I agree with you assuming that we put in entry/exit
> barriers to preserve the block boundaries. I'd specifically like to
> avoid that, however.
> 
> I'm not proposing full code motion barriers, only punctual dataflow
> use/defs to signal entry/exit to the scope.
> 
> 
> 
> Logically, entering the scope transfers the pointed data into a new
> unique block of memory, and puts its address on the restrict
> pointer. Exiting the scope transfers it back. Of course you do not
> want to actually allocate a new object and move the data, but you
> can use these semantics to define the scope entry/exit intrinsics.
> Their contribution to dataflow is only limited to the content of the
> address used to initialize the restricted pointer. These would be
> lighter than the proposed intrinsic as they would not have
> specialized control-flow restrictions.
Thanks for explaining, I now understand what you're proposing.
> 
> This approach makes the restrict attribute effective against all live
> variables without having to examine the extent of the scope to
> collect all references, which is in general impractical.
I think you've misunderstood this. For restrict-qualified local variables,
every memory access within the containing block (which is everything in the
function for restrict-qualified function-argument pointers) gets tagged with the
scope. This is trivial to determine.
> It also
> removes the need for scope metadata, as there would be no need to
> name the scopes.
Indeed.
> 
> 
> Anyway, this is just a general alternate design, since you were
> asking for one.
Yes, and thank you for doing so.
> I'm sure it would still take some time/effort to map it
> onto the LLVM framework.
That does not seem too difficult; the question is really just whether or not it
gives us what we need...
> 
So in this scheme, we'd have the following:

void foo(T * restrict a, T * restrict b) {
  *a = *b;
}

T * x = ..., *y = ..., *z = ..., *w = ...;
foo(x, y);
foo(z, w);

becomes:

T * x = ..., *y = ..., *z = ..., *w = ...;

T * a1 = @llvm.noalias.start(x); // model: reads from x (with a general write
                                 // control dep).
T * b1 = @llvm.noalias.start(y);
*a1 = *b1;
@llvm.noalias.end(a1, x); // model: reads from a1, writes to x.
@llvm.noalias.end(b1, y);

T * a2 = @llvm.noalias.start(z);
T * b2 = @llvm.noalias.start(w);
*a2 = *b2;
@llvm.noalias.end(a2, z);
@llvm.noalias.end(b2, w);

This does indeed seem generally equivalent to the original proposal in the sense
that the original proposal has an implicit ending barrier at the last relevant
derived access, and here we have explicit ending barriers. The advantage is the
lack of metadata (and associated implementation complexity). The disadvantage is
that we have additional barriers to manage, and these are write barriers on the
underlying pointers. It is not clear to me this would make too much difference,
so long as we aggressively hoisted the ending barriers to just after the last
use based on their 'noalias' operands.
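
The copy-in/copy-out reading of the start/end pair can be mimicked in plain C to check that it preserves the result of a well-defined call. This is a sketch under stated assumptions: noalias_start and noalias_end are hypothetical C stand-ins for the proposed @llvm.noalias.start and @llvm.noalias.end, they handle a single element of type T only, and a real implementation would not allocate or copy anything.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef int T;

/* Hypothetical model of scope entry: move the pointee into a fresh,
 * unique object and return its address (reads from src). */
static T *noalias_start(const T *src)
{
    T *fresh = malloc(sizeof *fresh);
    assert(fresh != NULL);
    memcpy(fresh, src, sizeof *fresh);
    return fresh;
}

/* Hypothetical model of scope exit: move the data back
 * (reads from tmp, writes to dst). */
static void noalias_end(T *tmp, T *dst)
{
    memcpy(dst, tmp, sizeof *dst);
    free(tmp);
}

/* foo(T * restrict a, T * restrict b) { *a = *b; } after inlining,
 * with the restrict scopes delimited by the start/end pair. */
void foo_modeled(T *x, T *y)
{
    T *a1 = noalias_start(x);
    T *b1 = noalias_start(y);
    *a1 = *b1;                  /* a1 and b1 are distinct objects by construction */
    noalias_end(a1, x);
    noalias_end(b1, y);
}
```

For distinct x and y, foo_modeled(x, y) leaves *x equal to *y, exactly as the original foo would. If x and y were the same pointer, the copies would diverge from the original, which corresponds to the case that restrict already makes undefined.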

So this is relatively appealing, and I think would not be a bad way to model C99
restrict (extending the scheme to handle mutually-ambiguous restrict-qualified
pointers from aggregates seems straightforward). It does not, however, cover
cases where the region of guaranteed disjointness (for lack of a better term) is
not continuous. This will come up when implementing a scheme such as that in the
current C++ alias-set proposal (N4150). To construct a quick example, imagine
that our implementation of std::vector is annotated such that (assuming the
standard allocator) each std::vector object's internal storage has a
distinct alias set, and we have:

  std::vector<T> x, y;
  ...
  T * q = &x[0];
  for (int i = 0; i < 1600; ++i) {
    x[i] = y[i];
    *q += x[i];
  }

so here we know that the memory accesses inside the operator[] from x and y
don't alias, but the alias-set attribute does not tell us about the
relationship between those accesses and the *q. The point of dominance, however,
needs to be associated with the declaration of x and y (specifically, we want to
preserve the dominance over the loop). A start/end barrier scheme localized
around the inlined operator[] functions would not do that, and placing start/end
barriers around the entire live region of x and y would not be correct. I can,
however, represent this using the metadata scheme.
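
A C analogue of the loop above shows why the relationship to *q matters. This is a sketch only: plain arrays stand in for the std::vector storage, and loop_assuming_q_disjoint is a hand-written illustration of what an optimizer could produce if it wrongly treated *q as disjoint from x[i]; neither function name comes from the thread or from N4150.

```c
#include <assert.h>
#include <stddef.h>

typedef int T;

/* The loop as written: q points into x, so *q and x[0] are the same object. */
T loop_with_alias(T *x, const T *y, size_t n)
{
    assert(n > 0);
    T *q = &x[0];
    for (size_t i = 0; i < n; ++i) {
        x[i] = y[i];
        *q += x[i];
    }
    return x[0];
}

/* The same loop if *q were (incorrectly) assumed disjoint from x[i]:
 * the load of *q is hoisted and the store is sunk out of the loop. */
T loop_assuming_q_disjoint(T *x, const T *y, size_t n)
{
    assert(n > 0);
    T acc = x[0];               /* hoisted load of *q */
    for (size_t i = 0; i < n; ++i) {
        x[i] = y[i];
        acc += y[i];            /* x[i] forwarded from the store above */
    }
    x[0] = acc;                 /* sunk store to *q */
    return x[0];
}
```

With x zero-initialized and y = {1, 2, 3, 4}, the first function returns 11 and the second returns 10. The disjointness between the x and y accesses is safe to exploit, but extending it to *q over the whole live region changes the result.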

Thanks again,
Hal
> 
> 
> 
> Regards,
> 
> 
> 
> 
-- 
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory
