thr3ads.net - llvm dev - [LLVMdev] memory scopes in atomic instructions [Nov 2014]

Sahasrabuddhe, Sameer

2014-Nov-14 18:17 UTC

[LLVMdev] memory scopes in atomic instructions

<html>
  <head>

    <meta http-equiv="content-type" content="text/html; charset=utf-8">
  </head>
  <body bgcolor="#FFFFFF" text="#000000">
    Hi all,<br>
    <br>
    OpenCL 2.0 introduced the notion of memory scope in atomic
    operations to global memory. These scopes are a hint to the
    underlying platform to optimize how synchronization is achieved.
    HSAIL also has a notion of memory scopes that is compatible with
    OpenCL 2.0. Currently, the LLVM IR uses a binary value
    (SingleThread/CrossThread) to represent synchronization scope on
    atomic instructions. This makes it difficult to translate OpenCL 2.0
    atomic operations to LLVM IR, and also to implement HSAIL memory
    scopes in the proposed HSAIL backend for LLVM.<br>
    <br>
    We would like to enhance the representation of memory scopes in LLVM
    IR to allow more values than just the current two. The intention of
    this email is to invite comments before we start prototyping. Here's
    what we have in mind:<br>
    <ol>
      <li>Update the synchronization scope field in atomic instructions
        from a single bit to a wider field, say 32-bit unsigned integer.
      </li>
      <li>Retain the current default of zero as "system scope",
        replacing the current "cross thread" scope.<br>
      </li>
      <li>All other values are target-defined.</li>
      <li>The use of "single thread scope" is not clear. If it is
        required in target-independent transforms, then it could be
        encoded as just "1", or as "all ones" in the wider field. The
        latter option is a bit weird, because most targets will have
        very few scopes. But it is useful in case the next point is
        included in LLVM IR.</li>
      <li>Possibly add the following constraint on memory scopes: "The
        scope represented by a larger value is nested inside (is a
        proper subset of) the scope represented by a smaller value."
        This would also imply that the value used for single-thread
        scope must be the largest value used by the target.<br>
        This constraint on "nesting" is easily satisfied by HSAIL (and
        also OpenCL), where synchronization scopes increase from a
        single work-item to the entire system. But it is conceivable
        that other targets do not have this constraint. For example, a
        platform may define synchronization scopes in terms of
        overlapping sets instead of proper subsets. <br>
      </li>
      <li>The impact of this change is limited to increasing the number
        of bits used to store synchronization scope. Future
        optimizations on atomics may need to interpret scopes in
        target-defined ways. When the synchronization scopes of two
        atomic instructions do not match, these optimizations must query
        the target for validity. <br>
      </li>
    </ol>
    <b>Relation with SPIR: </b>SPIR defines an enumeration for memory
    scopes, but it does not support LLVM atomic instructions. So memory
    scopes in SPIR are independent of the representation finally chosen
    in LLVM IR. A compiler that translates SPIR to native LLVM IR will
    have to translate memory scopes wherever appropriate. <br>
    <br>
    Sameer.<br>
  </body>
</html>

Tom Stellard

2014-Nov-14 18:38 UTC

[LLVMdev] memory scopes in atomic instructions

On Fri, Nov 14, 2014 at 11:47:45PM +0530, Sahasrabuddhe, Sameer wrote:

Can you send a plain-text version of this email?  It's easier to read
and reply to.

-Tom
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev

Owen Anderson

2014-Nov-14 18:50 UTC

[LLVMdev] memory scopes in atomic instructions

Hi Sameer,

I support this proposal, and have discussed more or less the same idea with
various people in the past.  Regarding point #5, I believe address spaces may
already provide the functionality needed to express overlapping constraints. 
I’m also not aware of any systems that would really need that functionality
anyways.

—Owen

Sahasrabuddhe, Sameer

2014-Nov-14 19:09 UTC

[LLVMdev] memory scopes in atomic instructions

On 11/15/2014 12:08 AM, Tom Stellard wrote:
> Can you send a plain-text version of this email. It's easier to read
> and reply to.

Sorry about that! Here's the plain text (I hope!):

Hi all,

OpenCL 2.0 introduced the notion of memory scope in atomic operations to
global memory. These scopes are a hint to the underlying platform to
optimize how synchronization is achieved. HSAIL also has a notion of
memory scopes that is compatible with OpenCL 2.0. Currently, the LLVM IR
uses a binary value (SingleThread/CrossThread) to represent
synchronization scope on atomic instructions. This makes it difficult to
translate OpenCL 2.0 atomic operations to LLVM IR, and also to implement
HSAIL memory scopes in the proposed HSAIL backend for LLVM.

We would like to enhance the representation of memory scopes in LLVM IR
to allow more values than just the current two. The intention of this
email is to invite comments before we start prototyping. Here's what we
have in mind:

1. Update the synchronization scope field in atomic instructions from a
   single bit to a wider field, say 32-bit unsigned integer.
2. Retain the current default of zero as "system scope", replacing the
   current "cross thread" scope.
3. All other values are target-defined.
4. The use of "single thread scope" is not clear. If it is required in
   target-independent transforms, then it could be encoded as just "1",
   or as "all ones" in the wider field. The latter option is a bit
   weird, because most targets will have very few scopes. But it is
   useful in case the next point is included in LLVM IR.
5. Possibly add the following constraint on memory scopes: "The scope
   represented by a larger value is nested inside (is a proper subset
   of) the scope represented by a smaller value." This would also imply
   that the value used for single-thread scope must be the largest
   value used by the target.
   This constraint on "nesting" is easily satisfied by HSAIL (and also
   OpenCL), where synchronization scopes increase from a single
   work-item to the entire system. But it is conceivable that other
   targets do not have this constraint. For example, a platform may
   define synchronization scopes in terms of overlapping sets instead
   of proper subsets.
6. The impact of this change is limited to increasing the number of
   bits used to store synchronization scope. Future optimizations on
   atomics may need to interpret scopes in target-defined ways. When
   the synchronization scopes of two atomic instructions do not match,
   these optimizations must query the target for validity.
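
For readers who prefer code, the encoding in points 1 to 5 could be sketched roughly as below. This is only an illustration under stated assumptions, not part of the proposal or of any LLVM API: the names `SyncScope`, `kSystemScope`, `kSingleThreadScope`, `isNestedWithin` and `widerScope` are hypothetical, single-thread scope is assumed to use the "all ones" option from point 4, and the optional nesting constraint from point 5 is assumed to hold.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical 32-bit scope identifier (point 1).
using SyncScope = std::uint32_t;

// Zero remains the default and denotes "system scope" (point 2).
constexpr SyncScope kSystemScope = 0;

// Assumption: single-thread scope takes the "all ones" encoding (point 4),
// which makes it the largest value any target can use.
constexpr SyncScope kSingleThreadScope =
    std::numeric_limits<SyncScope>::max();

// Assumes the optional nesting constraint (point 5): a larger value is a
// proper subset of every smaller value.
constexpr bool isNestedWithin(SyncScope inner, SyncScope outer) {
  return inner > outer;
}

// Under the same assumption, the wider of two scopes is the smaller value.
constexpr SyncScope widerScope(SyncScope a, SyncScope b) {
  return a < b ? a : b;
}
```

Without the nesting constraint, neither helper is meaningful across targets, which is why point 6 falls back to querying the target when two scopes differ.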

*Relation with SPIR:* SPIR defines an enumeration for memory scopes, but
it does not support LLVM atomic instructions. So memory scopes in SPIR
are independent of the representation finally chosen in LLVM IR. A
compiler that translates SPIR to native LLVM IR will have to translate
memory scopes wherever appropriate.

Sameer.

Sahasrabuddhe, Sameer

2014-Nov-17 05:03 UTC

[LLVMdev] memory scopes in atomic instructions

Copying #5 here for reference:

> 5. Possibly add the following constraint on memory scopes: "The scope
>    represented by a larger value is nested inside (is a proper subset
>    of) the scope represented by a smaller value." This would also imply
>    that the value used for single-thread scope must be the largest
>    value used by the target.
>    This constraint on "nesting" is easily satisfied by HSAIL (and also
>    OpenCL), where synchronization scopes increase from a single
>    work-item to the entire system. But it is conceivable that other
>    targets do not have this constraint. For example, a platform may
>    define synchronization scopes in terms of overlapping sets instead
>    of proper subsets.

On 11/15/2014 12:20 AM, Owen Anderson wrote:
> I support this proposal, and have discussed more or less the same idea
> with various people in the past.

Thanks! Good to know that.

> Regarding point #5, I believe address spaces may already provide the
> functionality needed to express overlapping constraints.

I am not sure how address spaces can solve the need for such a
constraint. Memory scopes are orthogonal to address spaces. An agent
that atomically accesses a shared location in one address space may
specify a different memory scope on different instructions.

> I’m also not aware of any systems that would really need that
> functionality anyways.

Agreed that there are no known systems with overlapping memory scopes,
but it might be premature to just dismiss the possibility. For example,
there could be a system with a fixed number of agents, say four. Each
agent can choose to synchronize with one, two or all three peers on
different instructions. The resulting memory scopes cannot be ordered as
proper subsets.
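
The four-agent example can be made concrete with a small sketch. It assumes, purely for illustration, that a scope is modelled as a bitmask of participating agents; the names `AgentSet`, `isProperSubset` and `isOrdered` are hypothetical and do not come from the proposal or from LLVM.

```cpp
#include <cstdint>

// Hypothetical model: bit i is set when agent i takes part in the scope.
using AgentSet = std::uint32_t;

// True when every agent in a is also in b, and b has at least one more.
constexpr bool isProperSubset(AgentSet a, AgentSet b) {
  return (a & b) == a && a != b;
}

// Two scopes fit a nesting order only if they are equal or one is a
// proper subset of the other.
constexpr bool isOrdered(AgentSet a, AgentSet b) {
  return a == b || isProperSubset(a, b) || isProperSubset(b, a);
}

// Scopes that agent 0 might pick on different instructions.
constexpr AgentSet kAgents0And1 = 0x3;  // agents 0 and 1
constexpr AgentSet kAgents0And2 = 0x5;  // agents 0 and 2
constexpr AgentSet kAllAgents = 0xF;    // agents 0, 1, 2 and 3

// Both pairwise scopes nest inside the all-agents scope...
static_assert(isProperSubset(kAgents0And1, kAllAgents), "nested in all");
static_assert(isProperSubset(kAgents0And2, kAllAgents), "nested in all");
// ...but they overlap in agent 0 without either containing the other.
static_assert(!isOrdered(kAgents0And1, kAgents0And2), "overlapping scopes");
```

Because the two pairwise scopes are not ordered by inclusion, no assignment of integers to them can satisfy the "larger value is a proper subset" rule in both directions.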

Sameer.
