thr3ads.net - llvm dev - [LLVMdev] Proposal for atomic and synchronization instructions [Jul 2007]

If this information is useful, please help other people find it:
Share via:

Chandler Carruth

2007-Jul-09 06:39 UTC

[LLVMdev] Proposal for atomic and synchronization instructions

Hello,

After a fair amount of research and work, I have put together a
concrete proposal for LLVM representations of atomic operations and
synchronization constructs. These aim to provide the minimal
functionality in the IR for representing the hardware constructs that
threading libraries and parallel programming rely on.

http://chandlerc.net/llvm_atomics.html

While I am no expert on the various architectures, I've done my best
at providing base-line implementations for each instruction. I am sure
these will need tweaking and fixing, but should provide a very good
starting point for the targets.

Comments are more than welcome, especially suggestions for
improvement. I hope this provides a sound background and a good
starting place for bringing these features to LLVM. Many thanks also
go to Reid who has helped tremendously with putting this together, and
several of the people at Aerospace who helped in the research phase.

-Chandler Carruth

John Criswell

2007-Jul-09 15:05 UTC

head link

[LLVMdev] Proposal for atomic and synchronization instructions

Chandler Carruth wrote:> Hello,
>
> After a fair amount of research and work, I have put together a
> concrete proposal for LLVM representations of atomic operations and
> synchronization constructs. These aim to provide the minimal
> functionality in the IR for representing the hardware constructs that
> threading libraries and parallel programming rely on.
>
> http://chandlerc.net/llvm_atomics.html
>
> While I am no expert on the various architectures, I've done my best
> at providing base-line implementations for each instruction. I am sure
> these will need tweaking and fixing, but should provide a very good
> starting point for the targets.
>
> Comments are more than welcome, especially suggestions for
> improvement. I hope this provides a sound background and a good
> starting place for bringing these features to LLVM. Many thanks also
> go to Reid who has helped tremendously with putting this together, and
> several of the people at Aerospace who helped in the research phase.
>   This looks good; this is basically what we came up with for the SVA-OS work.

Some comments:

1) You may want to consider adding atomic load-<bitwise operation>-store
instructions in addition to load-<add/subtract> instructions.  The Linux
kernel uses these for atomic bit setting/clearing, and on many systems
they can be implemented more efficiently using special assembly
instructions.  That said, I have never run experiments to verify that
such instructions are critical to kernel performance.

2) You need a strategy for handling architectures that can't handle
atomic operations on certain LLVM data types.  For example, 64 bit
machines can operate atomically on 64 bit operands, but some 32 bit
machines cannot.  I think you can fix this with spin locking, but you
need to be careful on how you do it.  I think in order to do it
correctly, you need a separate spinlock variable for each individual
instruction that requires a spinlock.

3) You say that your operations work on all first class LLVM data
types.  The LLVM vector type is considered a first class type.  Should
LLVM support atomic vector operations?

4) For consistency, you may need to specify that the LLVM volatile
keyword can be added to the atomic operations to prevent optimizations
from modifying or removing them.   I'm not aware of any optimization
that works on atomic ops in practice, but I don't see why they couldn't
be written.

5) With the addition of membar instructions, it may be time to re-think
what the LLVM volatile keyword means.  Currently, volatile prevents
removing, modifying, or reordering of a load or store.  However, membar
also prevents reordering.  Perhaps it would be useful to redefine
volatile to mean that a load/store cannot be removed or modified but can
be reordered, and membar alone prevents reordering.

-- John T.> -Chandler Carruth
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>

Torvald Riegel

2007-Jul-09 16:11 UTC

head link

[LLVMdev] Proposal for atomic and synchronization instructions

Hi,

I'd like to see support for something like this. I have some comments, and I
think there is existing work that you can reuse.

TAS and CAS are _not_ theoretically equivalent. TAS is weaker because it can 
solve consensus in a nonblocking way only for 2 threads (it has consensus 
number 2), whereas CAS can solve consensus for any number of threads 
(infinite consensus number).

"While the processor may spin and attempt the atomic operation more than
once
before it is successful, research indicates this is extremely uncommon."
I don't understand this sentence, what do you mean?

You probably don't need to require CAS (compare-and-set) to return the 
previous value (I think some architectures don't), but just return a boolean
value (success/failure).

What are the reasons because of which you picked the Load/Store model for 
barriers and not some other kind (e.g., acquire/release/...)?

Did you have a look at the atomic_ops project?
http://www.hpl.hp.com/research/linux/atomic_ops/
It already has implementations for several architectures and several 
compilers. It uses a different consistency model (different set of 
constraints for operations) and groups necessary memory barriers with 
instructions (helpful on some architectures). It supports a few more 
operations. The author (Hans Boehm) seems to also be active in the area of 
C/C++ memory models (or some support for this).

Torvald

On Monday 09 July 2007 08:39, Chandler Carruth wrote:> Hello,
>
> After a fair amount of research and work, I have put together a
> concrete proposal for LLVM representations of atomic operations and
> synchronization constructs. These aim to provide the minimal
> functionality in the IR for representing the hardware constructs that
> threading libraries and parallel programming rely on.
>
> http://chandlerc.net/llvm_atomics.html
>
> While I am no expert on the various architectures, I've done my best
> at providing base-line implementations for each instruction. I am sure
> these will need tweaking and fixing, but should provide a very good
> starting point for the targets.
>
> Comments are more than welcome, especially suggestions for
> improvement. I hope this provides a sound background and a good
> starting place for bringing these features to LLVM. Many thanks also
> go to Reid who has helped tremendously with putting this together, and
> several of the people at Aerospace who helped in the research phase.
>
> -Chandler Carruth
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev

Scott Michel

2007-Jul-09 17:33 UTC

head link

[LLVMdev] Proposal for atomic and synchronization instructions

Torvald Riegel wrote:> Hi,
> 
> I'd like to see support for something like this. I have some comments,
and I
> think there is existing work that you can reuse.
"reuse within the compiler."
> "While the processor may spin and attempt the atomic operation more
than once
> before it is successful, research indicates this is extremely
uncommon."
> I don't understand this sentence, what do you mean?
I'm not sure I can pinpoint the paper from which the statement is based,
but I seem to recall something similar in the original LL-SC papers
(Maurice Herlihy, DEC Western Research Labs?) It's a foundation for
lock-free algorithms.
> You probably don't need to require CAS (compare-and-set) to return the 
> previous value (I think some architectures don't), but just return a
boolean
> value (success/failure).
compare and swap?
> What are the reasons because of which you picked the Load/Store model for 
> barriers and not some other kind (e.g., acquire/release/...)?
Chandler looked at what the various current LLVM architectures and
summarized what he found. What he found are the memory barriers that the
various processors support.
> Did you have a look at the atomic_ops project?
> http://www.hpl.hp.com/research/linux/atomic_ops/
> It already has implementations for several architectures and several 
> compilers. It uses a different consistency model (different set of 
> constraints for operations) and groups necessary memory barriers with 
> instructions (helpful on some architectures). It supports a few more 
> operations. The author (Hans Boehm) seems to also be active in the area of 
> C/C++ memory models (or some support for this).
LLVM doesn't emit external library calls -- there is no "-lllvm"
to
which programs have to link, so adding an atomic operation library is
likely to be a non-starter. LLVM is interested in emitting instructions
to make atomic operations (and higher level concurrency primitives)
possible, which is why Chandler's work is usefully important.


-scooter

Andrew Lenharth

2007-Jul-09 20:02 UTC

head link

[LLVMdev] Proposal for atomic and synchronization instructions

Poor alpha, no code examples or entries in your tables.

Andrew

On 7/9/07, Chandler Carruth <chandlerc at gmail.com>
wrote:> Hello,
>
> After a fair amount of research and work, I have put together a
> concrete proposal for LLVM representations of atomic operations and
> synchronization constructs. These aim to provide the minimal
> functionality in the IR for representing the hardware constructs that
> threading libraries and parallel programming rely on.
>
> http://chandlerc.net/llvm_atomics.html
>
> While I am no expert on the various architectures, I've done my best
> at providing base-line implementations for each instruction. I am sure
> these will need tweaking and fixing, but should provide a very good
> starting point for the targets.
>
> Comments are more than welcome, especially suggestions for
> improvement. I hope this provides a sound background and a good
> starting place for bringing these features to LLVM. Many thanks also
> go to Reid who has helped tremendously with putting this together, and
> several of the people at Aerospace who helped in the research phase.
>
> -Chandler Carruth
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>

Andrew Lenharth

2007-Jul-09 20:04 UTC

head link

[LLVMdev] Proposal for atomic and synchronization instructions

On 7/9/07, Andrew Lenharth <andrewl at lenharth.org>
wrote:> Poor alpha, no code examples or entries in your tables.
But that said, it uses a load-locked, store-conditional and has
various memory barriers which are sufficient to implement all your
proposal.

Andrew
> On 7/9/07, Chandler Carruth <chandlerc at gmail.com> wrote:
> > Hello,
> >
> > After a fair amount of research and work, I have put together a
> > concrete proposal for LLVM representations of atomic operations and
> > synchronization constructs. These aim to provide the minimal
> > functionality in the IR for representing the hardware constructs that
> > threading libraries and parallel programming rely on.
> >
> > http://chandlerc.net/llvm_atomics.html
> >
> > While I am no expert on the various architectures, I've done my
best
> > at providing base-line implementations for each instruction. I am sure
> > these will need tweaking and fixing, but should provide a very good
> > starting point for the targets.
> >
> > Comments are more than welcome, especially suggestions for
> > improvement. I hope this provides a sound background and a good
> > starting place for bringing these features to LLVM. Many thanks also
> > go to Reid who has helped tremendously with putting this together, and
> > several of the people at Aerospace who helped in the research phase.
> >
> > -Chandler Carruth
> > _______________________________________________
> > LLVM Developers mailing list
> > LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
> > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
> >
>

Chandler Carruth

2007-Jul-09 21:26 UTC

head link

[LLVMdev] Proposal for atomic and synchronization instructions

On 7/9/07, John Criswell <criswell at cs.uiuc.edu>
wrote:> 1) You may want to consider adding atomic load-<bitwise
operation>-store
> instructions in addition to load-<add/subtract> instructions.  The
Linux
> kernel uses these for atomic bit setting/clearing, and on many systems
> they can be implemented more efficiently using special assembly
> instructions.  That said, I have never run experiments to verify that
> such instructions are critical to kernel performance.
I was trying to keep the set of operations as small as possible.
Supporting all of the bitwise operations on the x86 architecture would
be very difficult due to their number. (BTS, BTR, BTC...) Several of
these also have no easy emulation available for other architectures.
(Imagine doing an atomic test and set of a single bit, without
affecting the rest of the bits, on SPARC..) Others do not fit well
into the SSA representation of LLVM. This is particularly true of the
non-exchanging operations on x86. Because x86 can work directly with
memory, avoiding loads and stores, many locked operations are
practical there without saving out the old value as part of the atomic
instruction. Representing this in an SSA fashion would be very
difficult I think, especially when trying to maintain atomicity across
the entire LLVM instruction which isn't necessarily implementable as a
single instruction on x86.

All the same, if there is a demand for these bitwise operations, I am
more than willing to try and work up ways to include them. Do other
people have ideas for implementing them cleanly, and/or ideas about
the demand for them over using "cas" to achieve the functionality?
> 2) You need a strategy for handling architectures that can't handle
> atomic operations on certain LLVM data types.  For example, 64 bit
> machines can operate atomically on 64 bit operands, but some 32 bit
> machines cannot.  I think you can fix this with spin locking, but you
> need to be careful on how you do it.  I think in order to do it
> correctly, you need a separate spinlock variable for each individual
> instruction that requires a spinlock.
Indeed, which is very dangerous with potential for deadlock, etc.
After discussing this on IRC, I have adjusted the proposal to reflect
the idea that the target implementation can reject any instructions
with operands which they cannot effectively lower to an atomic
operation. This makes the available types for the instruction
architecture dependent, but when relying on the atomic architecture
implementation, this seems part of the package.
> 3) You say that your operations work on all first class LLVM data
> types.  The LLVM vector type is considered a first class type.  Should
> LLVM support atomic vector operations?
Thanks for pointing this out! Continuing with the changes above, I
have changed the proposal to state that these instructions only accept
integer types. This should provide the needed functionality of these
operations, while keeping a clear lowering path to the target
architecture.
> 4) For consistency, you may need to specify that the LLVM volatile
> keyword can be added to the atomic operations to prevent optimizations
> from modifying or removing them.   I'm not aware of any optimization
> that works on atomic ops in practice, but I don't see why they
couldn't
> be written.
Why would they need this keyword? If the optimization pass works on
atomic ops, would it need to understand the semantics involved, and
only modify/remove them when those semantics allowed it...
specifically, constant propagation should be able to replace the
values in these operations with constants if it can do so, no? Perhaps
I don't understand what situation you're trying to avoid/prepare for
with this.

> 5) With the addition of membar instructions, it may be time to re-think
> what the LLVM volatile keyword means.  Currently, volatile prevents
> removing, modifying, or reordering of a load or store.  However, membar
> also prevents reordering.  Perhaps it would be useful to redefine
> volatile to mean that a load/store cannot be removed or modified but can
> be reordered, and membar alone prevents reordering.
I personally think this would be very interesting to do, as the
barriers provide a very fine grained level of control over load/store
ordering. However, they have implications if they are used for this
purpose -- should the order enforcement of loads and stores without
interprocess implications cause the bus events that these memory
barriers do? That could result in a huge slowdown.

More likely, these barriers should provide another piece of
information about load/store reordering, not superseding volatile. If
people see value in using a barrier-type set of constraints on
volatile reordering, this should be provided separately I think.

-Chandler Carruth
>
> -- John T.
> > -Chandler Carruth
> > _______________________________________________
> > LLVM Developers mailing list
> > LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
> > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
> >
>
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>

Chris Lattner

2007-Jul-10 18:57 UTC

head link

[LLVMdev] Proposal for atomic and synchronization instructions

On Mon, 9 Jul 2007, John Criswell wrote:>> Comments are more than welcome, especially suggestions for
>> improvement. I hope this provides a sound background and a good
>> starting place for bringing these features to LLVM. Many thanks also
>> go to Reid who has helped tremendously with putting this together, and
>> several of the people at Aerospace who helped in the research phase.
>>
> This looks good; this is basically what we came up with for the SVA-OS
work.
>
> Some comments:
>
> 1) You may want to consider adding atomic load-<bitwise
operation>-store
> instructions in addition to load-<add/subtract> instructions.  The
Linux
Yep, this is also important for supporting the GCC synch builtins:
http://gcc.gnu.org/onlinedocs/gcc/Atomic-Builtins.html#Atomic-Builtins

despite the description, these are not itanium specific.

We will want to support these with GCC 4.2 to support OpenMP.
> 2) You need a strategy for handling architectures that can't handle
> atomic operations on certain LLVM data types.  For example, 64 bit
Yep.
> 3) You say that your operations work on all first class LLVM data
> types.  The LLVM vector type is considered a first class type.  Should
> LLVM support atomic vector operations?
Yep :), potentially through spinlocking.
> 5) With the addition of membar instructions, it may be time to re-think
> what the LLVM volatile keyword means.  Currently, volatile prevents
The volatile marker and synchronization are two different things.  Please 
don't mess with volatile :)

-Chris

-- 
http://nondot.org/sabre/
http://llvm.org/

Possibly Parallel Threads

Search for more apparently analagous threads

llvm dev - Jul 2007 - [LLVMdev] Proposal for atomic and synchronization instructions

[LLVMdev] Proposal for atomic and synchronization instructions

[LLVMdev] Proposal for atomic and synchronization instructions

[LLVMdev] Proposal for atomic and synchronization instructions

[LLVMdev] Proposal for atomic and synchronization instructions

[LLVMdev] Proposal for atomic and synchronization instructions

[LLVMdev] Proposal for atomic and synchronization instructions

[LLVMdev] Proposal for atomic and synchronization instructions

[LLVMdev] Proposal for atomic and synchronization instructions

Possibly Parallel Threads