thr3ads.net - llvm dev - [LLVMdev] Loads moving across barriers [Dec 2013]

If this information is useful, please help other people find it:
Share via:

Chandler Carruth

2013-Dec-27 02:25 UTC

[LLVMdev] Loads moving across barriers

On Thu, Dec 26, 2013 at 8:54 PM, Andrew Trick <atrick at apple.com> wrote:
> Others can weigh in here. This is just my understanding. Attribute
> propagation has to be optional because we can’t assume inter-procedural
> optimization runs for correct codegen. What if the memfence resides in a
> different module?
>
> In the case of noduplicate, the only reason to propagate AFAICT would be
> to suppress inlining. It seems reasonable enough to expect attribute
> propagation to happen before inlining. So I don't think noduplicate is
an
> issue in practice.
>
I think you've misunderstood the specification of noduplicate... This
isn't
how it works.

Let's assume we have functions A, B, C, and D. Function A is marked as
'noduplicate' and thus all calls to it are marked 'noduplicate'.
Functions
B and C call function A in exactly one place. Function D calls function B
twice, and C in exactly one. place. Functions B and C are internal.
Functions B, C, and D are defined, while function A is only declared.

Only function A and calls to function A are marked as 'noduplicate'. I
don't see any reason why this attribute would be propagated?

Function B cannot be inlined into function D because doing so would
duplicate one of the calls to A. The inliner checks this *while doing the
inlining*, it does not rely on any function attribute on B for correctness
here.

Function C *can* be inlined into function D because there is only one call
site and it is an internal function. Thus, the call to A is not duplicated,
it is merely sunk into D.

So there is no propagation of attributes to achieve correctness even with
noduplicate. The inliner directly checks[1] the callee's call instructions
to ensure that inlining is valid.

I agree with Andy that we should *not* add a requirement to propagate such
attributes.

> I think "memfence" could be an issue if we use the attribute to
summarize
> LLVM atomic load/store and fence instructions (in addition to OpenCL
> barriers).
>
I have no idea what semantics you would attach to it in this case. I've not
seen any clear explanation of such semantics yet in this thread.

The only clear semantics I've seen expressed so far seem much more
appropriate for attaching to a noduplicate call to an intrinsic... But I
think I'll need to read this thread again to re-absorb much of the
information after the holidays. =]

[1]: Note, the current implementation of noduplicate is buggy, but
hopefully in a latent way -- it doesn't handle invokes.
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20131226/bc3c82de/attachment.html>

Chandler Carruth

2013-Dec-27 02:37 UTC

head link

[LLVMdev] Loads moving across barriers

On Thu, Dec 26, 2013 at 9:25 PM, Chandler Carruth <chandlerc at
google.com>wrote:
> [1]: Note, the current implementation of noduplicate is buggy, but
> hopefully in a latent way -- it doesn't handle invokes.

Patch posted that should fix this based on my reading of the code... no
idea if it works of course... anyways, back to the real topic.
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20131226/6d9f7143/attachment.html>

Andrew Trick

2013-Dec-27 03:38 UTC

head link

[LLVMdev] Loads moving across barriers

On Dec 26, 2013, at 6:25 PM, Chandler Carruth <chandlerc at google.com>
wrote:
> On Thu, Dec 26, 2013 at 8:54 PM, Andrew Trick <atrick at apple.com>
wrote:
> Others can weigh in here. This is just my understanding. Attribute
propagation has to be optional because we can’t assume inter-procedural
optimization runs for correct codegen. What if the memfence resides in a
different module?
> 
> In the case of noduplicate, the only reason to propagate AFAICT would be to
suppress inlining. It seems reasonable enough to expect attribute propagation to
happen before inlining. So I don't think noduplicate is an issue in
practice.
> 
> I think you've misunderstood the specification of noduplicate... This
isn't how it works.
> 
> Let's assume we have functions A, B, C, and D. Function A is marked as
'noduplicate' and thus all calls to it are marked 'noduplicate'.
Functions B and C call function A in exactly one place. Function D calls
function B twice, and C in exactly one. place. Functions B and C are internal.
Functions B, C, and D are defined, while function A is only declared.
> 
> Only function A and calls to function A are marked as
'noduplicate'. I don't see any reason why this attribute would be
propagated?
> 
> Function B cannot be inlined into function D because doing so would
duplicate one of the calls to A. The inliner checks this *while doing the
inlining*, it does not rely on any function attribute on B for correctness here.
> 
> Function C *can* be inlined into function D because there is only one call
site and it is an internal function. Thus, the call to A is not duplicated, it
is merely sunk into D.
> 
> 
> So there is no propagation of attributes to achieve correctness even with
noduplicate. The inliner directly checks[1] the callee's call instructions
to ensure that inlining is valid.
That makes perfect sense.
> I agree with Andy that we should *not* add a requirement to propagate such
attributes.
> 
>  
> I think "memfence" could be an issue if we use the attribute to
summarize LLVM atomic load/store and fence instructions (in addition to OpenCL
barriers).
> 
> I have no idea what semantics you would attach to it in this case. I've
not seen any clear explanation of such semantics yet in this thread.
> 
> The only clear semantics I've seen expressed so far seem much more
appropriate for attaching to a noduplicate call to an intrinsic... But I think
I'll need to read this thread again to re-absorb much of the information
after the holidays. =]
I think it would be too misleading and bizarre to define a memfence/nomemfence
attribute that does not relate to memory fence instructions. In the future, I do
see a potential use for such an attribute, but that discussion belongs in a
different thread and would be premature (we would have to talk about alias info
on calls).

Let’s say we’re just talking about llvm.opencl.memfence(addressspace) then. The
same argument about attribute propagation applies, though I will defer those who
are trying to make opencl correct.

-Andy
> 
> 
> [1]: Note, the current implementation of noduplicate is buggy, but
hopefully in a latent way -- it doesn't handle invokes.
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20131226/bd6da6f1/attachment.html>

Matt Arsenault

2014-Jan-04 02:49 UTC

head link

[LLVMdev] Loads moving across barriers

On Dec 26, 2013, at 10:38 PM, Andrew Trick <atrick at apple.com>
wrote:>> 
>>  
>> I think "memfence" could be an issue if we use the attribute
to summarize LLVM atomic load/store and fence instructions (in addition to
OpenCL barriers).
>> 
>> I have no idea what semantics you would attach to it in this case.
I've not seen any clear explanation of such semantics yet in this thread.
>> 
>> The only clear semantics I've seen expressed so far seem much more
appropriate for attaching to a noduplicate call to an intrinsic... But I think
I'll need to read this thread again to re-absorb much of the information
after the holidays. =]
> 
> I think it would be too misleading and bizarre to define a
memfence/nomemfence attribute that does not relate to memory fence instructions.
In the future, I do see a potential use for such an attribute, but that
discussion belongs in a different thread and would be premature (we would have
to talk about alias info on calls).
I’m not sure what you mean by this. I was never thinking the fence instruction
was detached from this, and it would just imply the parent function has a
memfence. I don’t see the use of another memfence attribute that would not do
this.

> 
> Let’s say we’re just talking about llvm.opencl.memfence(addressspace) then.
The same argument about attribute propagation applies, though I will defer those
who are trying to make opencl correct.
> 
> -Andy
> 
So the plan is have a nomemfence(N) attribute that indicates there is no
memfence for addrspace(N) on a function, and can be specified 0 or more times.
readonly or readnone imply nomemfence for all address spaces.

Because the normal case is nomemfence for most address spaces, something else is
needed to avoid needing marking every function with nomemfence(0..max address
space) when only a few usually matter. I suggest module metadata named
!”llvm.used_addrspaces” that will list the address spaces relevant for the
module. The absence of a fence will infer nomemfence for all of these address
spaces minus any found potential fences. Without the metadata, only
nomemfence(0) will be inferred.

For the purpose of inferring nomemfence, fence instructions will be treated as a
memfence for all address spaces, as it currently does not specify one (despite
the atomic instructions accepting non-zero address spaces,)

Additionally, an llvm.memfence intrinsic, which will take a constant i32
parameter for the address space to be fenced. This behaves as call to a function
with nomemfence set for every address space except the one specified by its
constant parameter. This part might not actually be necessary (the metadata plus
target intrinsics marked with whatever nomemfences are required is probably
enough, but might be useful anyway).

Does this sound good?

-Matt
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20140103/1fd782f3/attachment.html>

llvm dev - Dec 2013 - [LLVMdev] Loads moving across barriers

[LLVMdev] Loads moving across barriers

[LLVMdev] Loads moving across barriers

[LLVMdev] Loads moving across barriers

[LLVMdev] Loads moving across barriers