Sahasrabuddhe, Sameer
2014-Nov-17 05:03 UTC
[LLVMdev] memory scopes in atomic instructions
Copying #5 here for reference:

> 5. Possibly add the following constraint on memory scopes: "The scope
>    represented by a larger value is nested inside (is a proper subset
>    of) the scope represented by a smaller value." This would also imply
>    that the value used for single-thread scope must be the largest
>    value used by the target.
>
>    This constraint on "nesting" is easily satisfied by HSAIL (and also
>    OpenCL), where synchronization scopes increase from a single
>    work-item to the entire system. But it is conceivable that other
>    targets do not have this constraint. For example, a platform may
>    define synchronization scopes in terms of overlapping sets instead
>    of proper subsets.

On 11/15/2014 12:20 AM, Owen Anderson wrote:

> I support this proposal, and have discussed more or less the same idea
> with various people in the past.

Thanks! Good to know that.

> Regarding point #5, I believe address spaces may already provide the
> functionality needed to express overlapping constraints.

I am not sure how address spaces can solve the need for such a constraint.
Memory scopes are orthogonal to address spaces. An agent that atomically
accesses a shared location in one address space may specify a different
memory scope on different instructions.

> I’m also not aware of any systems that would really need that
> functionality anyways.

Agreed that there are no known systems with overlapping memory scopes, but
it might be premature to just dismiss the possibility. For example, there
could be a system with a fixed number of agents, say four. Each agent can
choose to synchronize with one, two, or all three of its peers on different
instructions; the resulting memory scopes cannot be ordered as proper
subsets. (A sketch of such a configuration follows at the end of this
message.)

Sameer.
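For concreteness, a small standalone sketch of that check (plain C; the
four-agent machine, the bitmask encoding, and all names are invented for
illustration). It treats each scope as the set of agents it covers and
asks whether the scopes can be totally ordered by nesting: the HSAIL/
OpenCL-style chain can, the pairwise scopes of the four-agent example
cannot.

  #include <stdbool.h>
  #include <stdio.h>

  /* Model each memory scope as a bitmask of the agents it covers. */
  typedef unsigned scope_t;

  static bool is_subset(scope_t a, scope_t b)
  {
      return (a & b) == a;
  }

  /* The "nesting" constraint of point #5 holds iff the scopes can be
   * totally ordered by subset inclusion, i.e. any two are comparable. */
  static bool scopes_are_nested(const scope_t *s, int n)
  {
      for (int i = 0; i < n; ++i)
          for (int j = i + 1; j < n; ++j)
              if (!is_subset(s[i], s[j]) && !is_subset(s[j], s[i]))
                  return false;
      return true;
  }

  int main(void)
  {
      /* HSAIL/OpenCL-style chain:
       * work-item < work-group < device < system. */
      scope_t chain[] = { 0x1, 0xF, 0xFF, 0xFFFF };

      /* Four agents (bits 0..3), each of which may synchronize with any
       * single peer: these scopes overlap without containing each other. */
      scope_t pairwise[] = { 0x3 /* {0,1} */, 0x5 /* {0,2} */, 0xC /* {2,3} */ };

      printf("chain nested:    %d\n", scopes_are_nested(chain, 4));    /* 1 */
      printf("pairwise nested: %d\n", scopes_are_nested(pairwise, 3)); /* 0 */
      return 0;
  }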
> On Nov 16, 2014, at 9:03 PM, Sahasrabuddhe, Sameer <Sameer.Sahasrabuddhe at amd.com> wrote:
>
>> Regarding point #5, I believe address spaces may already provide the
>> functionality needed to express overlapping constraints.
>
> I am not sure how address spaces can solve the need for such a constraint.
> Memory scopes are orthogonal to address spaces. An agent that atomically
> accesses a shared location in one address space may specify a different
> memory scope on different instructions.

It is already the case that address spaces can (potentially) alias. As
such, the combination of address spaces and memory scopes can represent
any combination where the sharing properties of memory are statically
known, simply by having (potentially aliasing) address spaces to represent
memory pools that are shared only with specific combinations of agents.
One can imagine a GPU that worked like this, and GPU programming models do
generally differentiate the various sharing pools statically.

The case that this doesn’t handle is when the sharing properties are not
known statically. However, I question the utility of designing for this,
since there are no known systems that require it. We should design the
representation to cover all reasonably anticipated systems, not ones that
don’t exist and have no prospect of existing.

—Owen
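For illustration, the encoding described above might amount to a static
table like the following (plain C; every address-space number, agent mask,
and name here is invented): each statically known sharing pool gets its
own, possibly aliasing, address space, and the set of agents that can
observe it becomes a fixed property of that address space rather than an
operand on the atomic instruction.

  #include <stdio.h>

  /* Hypothetical target description: one (possibly aliasing) address
   * space per statically known sharing pool. All numbers are invented. */
  typedef unsigned agent_mask_t;

  struct sharing_pool {
      unsigned     addrspace;  /* address-space number used in the IR */
      agent_mask_t sharers;    /* agents that can observe this pool   */
      const char  *comment;
  };

  static const struct sharing_pool pools[] = {
      { 5, 0x1,        "visible to a single work-item only" },
      { 4, 0xF,        "shared within one work-group"       },
      { 1, 0xFF,       "shared across the whole device"     },
      { 6, 0xFFFFFFFF, "shared with every SVM device"       },
  };

  int main(void)
  {
      /* "Who can observe this atomic?" is answered from the pointer's
       * address space alone; no scope operand on the instruction. */
      for (unsigned i = 0; i < sizeof pools / sizeof pools[0]; ++i)
          printf("addrspace(%u): sharers=0x%08x  %s\n",
                 pools[i].addrspace, pools[i].sharers, pools[i].comment);
      return 0;
  }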
Sahasrabuddhe, Sameer
2014-Nov-17 06:13 UTC
[LLVMdev] memory scopes in atomic instructions
On 11/17/2014 10:51 AM, Owen Anderson wrote:

> It is already the case that address spaces can (potentially) alias. As
> such, the combination of address spaces and memory scopes can represent
> any combination where the sharing properties of memory are statically
> known, simply by having (potentially aliasing) address spaces to
> represent memory pools that are shared only with specific combinations
> of agents. One can imagine a GPU that worked like this, and GPU
> programming models do generally differentiate the various sharing pools
> statically.

I am trying to understand this with a concrete example. OpenCL 2.0 allows
atomic instructions in the global address space, which is encoded as "1"
in the SPIR target. The possible memory scopes are work_item, work_group,
device and all_svm_devices. We could resolve the global address space into
four statically known "synchronization pools", say "global_work_item",
"global_work_group", etc. They would all alias with the real global
address space, and could be encoded as new address spaces. Is that
correct? Then we wouldn't even need the memory scope argument on the
atomic instruction, right?

Note that "global_work_item" isn't even a real address space, i.e., it is
not a well-defined sequence of addresses located somewhere in the global
address space. It is actually the set of all global locations that can
potentially be accessed by atomic instructions using the "work_item"
memory scope in a given program. It is not required to be contiguous, and
can alias with the entire global address space in the worst case.

So this is what it looks like to me: the proposal is to encode memory
scopes as a new field that is orthogonal to address spaces. Address spaces
are defined on locations, while memory scopes are defined on operations.
Every combination of an address space and a memory scope represents a set
of instructions synchronizing with a set of agents through a set of
locations in that address space. The first two sets are statically known
(not considering the effect of control flow on the instructions), but the
set of locations is dynamic, and could span the whole address space in the
absence of aliasing information. (A short OpenCL 2.0 kernel illustrating
this follows at the end of this message.)

> The case that this doesn’t handle is when the sharing properties are not
> known statically. However, I question the utility of designing for this,
> since there are no known systems that require it. We should design the
> representation to cover all reasonably anticipated systems, not ones
> that don’t exist and have no prospect of existing.

Sure. But we could just leave this undefined for now, without losing the
ability to express what we need. The idea is to not specify any semantics
on non-zero memory scopes (such as assuming that they have a nesting
order).

Sameer.
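To make the per-operation nature of scopes concrete, a minimal OpenCL 2.0
kernel (the kernel and argument names are made up): the same object in the
global address space, which SPIR encodes as addrspace(1), is updated with
three different memory scopes on three different instructions. A purely
address-space-based encoding would have to model these as three aliasing
"pools" covering the same location.

  // OpenCL 2.0 kernel, illustrative only: the memory scope is chosen
  // per operation, while the address space (__global) is a property of
  // the location being accessed.
  __kernel void scope_per_operation(volatile __global atomic_int *counter)
  {
      // Synchronizes only within the current work-group.
      atomic_fetch_add_explicit(counter, 1,
                                memory_order_relaxed,
                                memory_scope_work_group);

      // Synchronizes across the whole device.
      atomic_fetch_add_explicit(counter, 1,
                                memory_order_relaxed,
                                memory_scope_device);

      // Synchronizes with all devices sharing SVM with the host
      // (assumes the device supports SVM atomics).
      atomic_fetch_add_explicit(counter, 1,
                                memory_order_relaxed,
                                memory_scope_all_svm_devices);
  }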