Sahasrabuddhe, Sameer
2014-Nov-17 05:03 UTC
[LLVMdev] memory scopes in atomic instructions
Copying #5 here for reference:

> 5. Possibly add the following constraint on memory scopes: "The scope
>    represented by a larger value is nested inside (is a proper subset
>    of) the scope represented by a smaller value." This would also imply
>    that the value used for single-thread scope must be the largest
>    value used by the target.
>
>    This constraint on "nesting" is easily satisfied by HSAIL (and also
>    OpenCL), where synchronization scopes increase from a single
>    work-item to the entire system. But it is conceivable that other
>    targets do not have this constraint. For example, a platform may
>    define synchronization scopes in terms of overlapping sets instead
>    of proper subsets.

On 11/15/2014 12:20 AM, Owen Anderson wrote:

> I support this proposal, and have discussed more or less the same idea
> with various people in the past.

Thanks! Good to know that.

> Regarding point #5, I believe address spaces may already provide the
> functionality needed to express overlapping constraints.

I am not sure how address spaces can solve the need for such a constraint.
Memory scopes are orthogonal to address spaces. An agent that atomically
accesses a shared location in one address space may specify a different
memory scope on different instructions.

> I’m also not aware of any systems that would really need that
> functionality anyways.

Agreed that there are no known systems with overlapping memory scopes, but
it might be premature to just dismiss the possibility. For example, there
could be a system with a fixed number of agents, say four. Each agent can
choose to synchronize with one, two, or all three of its peers on different
instructions; the resulting memory scopes cannot be ordered as proper
subsets. (A sketch of such a configuration follows at the end of this
message.)

Sameer.
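For concreteness, a small standalone sketch of that check (plain C; the
four-agent machine, the bitmask encoding, and all names are invented for
illustration). It treats each scope as the set of agents it covers and
asks whether the scopes can be totally ordered by nesting: the HSAIL/
OpenCL-style chain can, the pairwise scopes of the four-agent example
cannot.

  #include <stdbool.h>
  #include <stdio.h>

  /* Model each memory scope as a bitmask of the agents it covers. */
  typedef unsigned scope_t;

  static bool is_subset(scope_t a, scope_t b)
  {
      return (a & b) == a;
  }

  /* The "nesting" constraint of point #5 holds iff the scopes can be
   * totally ordered by subset inclusion, i.e. any two are comparable. */
  static bool scopes_are_nested(const scope_t *s, int n)
  {
      for (int i = 0; i < n; ++i)
          for (int j = i + 1; j < n; ++j)
              if (!is_subset(s[i], s[j]) && !is_subset(s[j], s[i]))
                  return false;
      return true;
  }

  int main(void)
  {
      /* HSAIL/OpenCL-style chain:
       * work-item < work-group < device < system. */
      scope_t chain[] = { 0x1, 0xF, 0xFF, 0xFFFF };

      /* Four agents (bits 0..3), each of which may synchronize with any
       * single peer: these scopes overlap without containing each other. */
      scope_t pairwise[] = { 0x3 /* {0,1} */, 0x5 /* {0,2} */, 0xC /* {2,3} */ };

      printf("chain nested:    %d\n", scopes_are_nested(chain, 4));    /* 1 */
      printf("pairwise nested: %d\n", scopes_are_nested(pairwise, 3)); /* 0 */
      return 0;
  }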
> On Nov 16, 2014, at 9:03 PM, Sahasrabuddhe, Sameer <Sameer.Sahasrabuddhe at amd.com> wrote:
>
>> Regarding point #5, I believe address spaces may already provide the
>> functionality needed to express overlapping constraints.
>
> I am not sure how address spaces can solve the need for such a constraint.
> Memory scopes are orthogonal to address spaces. An agent that atomically
> accesses a shared location in one address space may specify a different
> memory scope on different instructions.

It is already the case that address spaces can (potentially) alias. As
such, the combination of address spaces and memory scopes can represent
any combination where the sharing properties of memory are statically
known, simply by having (potentially aliasing) address spaces to represent
memory pools that are shared only with specific combinations of agents.
One can imagine a GPU that worked like this, and GPU programming models do
generally differentiate the various sharing pools statically.

The case that this doesn’t handle is when the sharing properties are not
known statically. However, I question the utility of designing for this,
since there are no known systems that require it. We should design the
representation to cover all reasonably anticipated systems, not ones that
don’t exist and have no prospect of existing.

—Owen
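For illustration, the encoding described above might amount to a static
table like the following (plain C; every address-space number, agent mask,
and name here is invented): each statically known sharing pool gets its
own, possibly aliasing, address space, and the set of agents that can
observe it becomes a fixed property of that address space rather than an
operand on the atomic instruction.

  #include <stdio.h>

  /* Hypothetical target description: one (possibly aliasing) address
   * space per statically known sharing pool. All numbers are invented. */
  typedef unsigned agent_mask_t;

  struct sharing_pool {
      unsigned     addrspace;  /* address-space number used in the IR */
      agent_mask_t sharers;    /* agents that can observe this pool   */
      const char  *comment;
  };

  static const struct sharing_pool pools[] = {
      { 5, 0x1,        "visible to a single work-item only" },
      { 4, 0xF,        "shared within one work-group"       },
      { 1, 0xFF,       "shared across the whole device"     },
      { 6, 0xFFFFFFFF, "shared with every SVM device"       },
  };

  int main(void)
  {
      /* "Who can observe this atomic?" is answered from the pointer's
       * address space alone; no scope operand on the instruction. */
      for (unsigned i = 0; i < sizeof pools / sizeof pools[0]; ++i)
          printf("addrspace(%u): sharers=0x%08x  %s\n",
                 pools[i].addrspace, pools[i].sharers, pools[i].comment);
      return 0;
  }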
Sahasrabuddhe, Sameer
2014-Nov-17 06:13 UTC
[LLVMdev] memory scopes in atomic instructions
On 11/17/2014 10:51 AM, Owen Anderson wrote:

> It is already the case that address spaces can (potentially) alias. As
> such, the combination of address spaces and memory scopes can represent
> any combination where the sharing properties of memory are statically
> known, simply by having (potentially aliasing) address spaces to
> represent memory pools that are shared only with specific combinations
> of agents. One can imagine a GPU that worked like this, and GPU
> programming models do generally differentiate the various sharing pools
> statically.

I am trying to understand this with a concrete example. OpenCL 2.0 allows
atomic instructions in the global address space, which is encoded as "1"
in the SPIR target. The possible memory scopes are work_item, work_group,
device and all_svm_devices. We could resolve the global address space into
four statically known "synchronization pools", say "global_work_item",
"global_work_group", etc. They would all alias with the real global
address space, and could be encoded as new address spaces. Is that
correct? Then we wouldn't even need the memory scope argument on the
atomic instruction, right?

Note that "global_work_item" isn't even a real address space, i.e., it is
not a well-defined sequence of addresses located somewhere in the global
address space. It is actually the set of all global locations that can
potentially be accessed by atomic instructions using the "work_item"
memory scope in a given program. It is not required to be contiguous, and
can alias with the entire global address space in the worst case.

So this is what it looks like to me: the proposal is to encode memory
scopes as a new field that is orthogonal to address spaces. Address spaces
are defined on locations, while memory scopes are defined on operations.
Every combination of an address space and a memory scope represents a set
of instructions synchronizing with a set of agents through a set of
locations in that address space. The first two sets are statically known
(not considering the effect of control flow on the instructions), but the
set of locations is dynamic, and could span the whole address space in the
absence of aliasing information. (A short OpenCL 2.0 kernel illustrating
this follows at the end of this message.)

> The case that this doesn’t handle is when the sharing properties are not
> known statically. However, I question the utility of designing for this,
> since there are no known systems that require it. We should design the
> representation to cover all reasonably anticipated systems, not ones
> that don’t exist and have no prospect of existing.

Sure. But we could just leave this undefined for now, without losing the
ability to express what we need. The idea is to not specify any semantics
on non-zero memory scopes (such as assuming that they have a nesting
order).

Sameer.
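To make the per-operation nature of scopes concrete, a minimal OpenCL 2.0
kernel (the kernel and argument names are made up): the same object in the
global address space, which SPIR encodes as addrspace(1), is updated with
three different memory scopes on three different instructions. A purely
address-space-based encoding would have to model these as three aliasing
"pools" covering the same location.

  // OpenCL 2.0 kernel, illustrative only: the memory scope is chosen
  // per operation, while the address space (__global) is a property of
  // the location being accessed.
  __kernel void scope_per_operation(volatile __global atomic_int *counter)
  {
      // Synchronizes only within the current work-group.
      atomic_fetch_add_explicit(counter, 1,
                                memory_order_relaxed,
                                memory_scope_work_group);

      // Synchronizes across the whole device.
      atomic_fetch_add_explicit(counter, 1,
                                memory_order_relaxed,
                                memory_scope_device);

      // Synchronizes with all devices sharing SVM with the host
      // (assumes the device supports SVM atomics).
      atomic_fetch_add_explicit(counter, 1,
                                memory_order_relaxed,
                                memory_scope_all_svm_devices);
  }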