Chandler Carruth
2007-Jul-09 06:39 UTC
[LLVMdev] Proposal for atomic and synchronization instructions
Hello, After a fair amount of research and work, I have put together a concrete proposal for LLVM representations of atomic operations and synchronization constructs. These aim to provide the minimal functionality in the IR for representing the hardware constructs that threading libraries and parallel programming rely on. http://chandlerc.net/llvm_atomics.html While I am no expert on the various architectures, I've done my best at providing base-line implementations for each instruction. I am sure these will need tweaking and fixing, but should provide a very good starting point for the targets. Comments are more than welcome, especially suggestions for improvement. I hope this provides a sound background and a good starting place for bringing these features to LLVM. Many thanks also go to Reid who has helped tremendously with putting this together, and several of the people at Aerospace who helped in the research phase. -Chandler Carruth
John Criswell
2007-Jul-09 15:05 UTC
[LLVMdev] Proposal for atomic and synchronization instructions
Chandler Carruth wrote:> Hello, > > After a fair amount of research and work, I have put together a > concrete proposal for LLVM representations of atomic operations and > synchronization constructs. These aim to provide the minimal > functionality in the IR for representing the hardware constructs that > threading libraries and parallel programming rely on. > > http://chandlerc.net/llvm_atomics.html > > While I am no expert on the various architectures, I've done my best > at providing base-line implementations for each instruction. I am sure > these will need tweaking and fixing, but should provide a very good > starting point for the targets. > > Comments are more than welcome, especially suggestions for > improvement. I hope this provides a sound background and a good > starting place for bringing these features to LLVM. Many thanks also > go to Reid who has helped tremendously with putting this together, and > several of the people at Aerospace who helped in the research phase. >This looks good; this is basically what we came up with for the SVA-OS work. Some comments: 1) You may want to consider adding atomic load-<bitwise operation>-store instructions in addition to load-<add/subtract> instructions. The Linux kernel uses these for atomic bit setting/clearing, and on many systems they can be implemented more efficiently using special assembly instructions. That said, I have never run experiments to verify that such instructions are critical to kernel performance. 2) You need a strategy for handling architectures that can't handle atomic operations on certain LLVM data types. For example, 64 bit machines can operate atomically on 64 bit operands, but some 32 bit machines cannot. I think you can fix this with spin locking, but you need to be careful on how you do it. I think in order to do it correctly, you need a separate spinlock variable for each individual instruction that requires a spinlock. 3) You say that your operations work on all first class LLVM data types. The LLVM vector type is considered a first class type. Should LLVM support atomic vector operations? 4) For consistency, you may need to specify that the LLVM volatile keyword can be added to the atomic operations to prevent optimizations from modifying or removing them. I'm not aware of any optimization that works on atomic ops in practice, but I don't see why they couldn't be written. 5) With the addition of membar instructions, it may be time to re-think what the LLVM volatile keyword means. Currently, volatile prevents removing, modifying, or reordering of a load or store. However, membar also prevents reordering. Perhaps it would be useful to redefine volatile to mean that a load/store cannot be removed or modified but can be reordered, and membar alone prevents reordering. -- John T.> -Chandler Carruth > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev >
Torvald Riegel
2007-Jul-09 16:11 UTC
[LLVMdev] Proposal for atomic and synchronization instructions
Hi, I'd like to see support for something like this. I have some comments, and I think there is existing work that you can reuse. TAS and CAS are _not_ theoretically equivalent. TAS is weaker because it can solve consensus in a nonblocking way only for 2 threads (it has consensus number 2), whereas CAS can solve consensus for any number of threads (infinite consensus number). "While the processor may spin and attempt the atomic operation more than once before it is successful, research indicates this is extremely uncommon." I don't understand this sentence, what do you mean? You probably don't need to require CAS (compare-and-set) to return the previous value (I think some architectures don't), but just return a boolean value (success/failure). What are the reasons because of which you picked the Load/Store model for barriers and not some other kind (e.g., acquire/release/...)? Did you have a look at the atomic_ops project? http://www.hpl.hp.com/research/linux/atomic_ops/ It already has implementations for several architectures and several compilers. It uses a different consistency model (different set of constraints for operations) and groups necessary memory barriers with instructions (helpful on some architectures). It supports a few more operations. The author (Hans Boehm) seems to also be active in the area of C/C++ memory models (or some support for this). Torvald On Monday 09 July 2007 08:39, Chandler Carruth wrote:> Hello, > > After a fair amount of research and work, I have put together a > concrete proposal for LLVM representations of atomic operations and > synchronization constructs. These aim to provide the minimal > functionality in the IR for representing the hardware constructs that > threading libraries and parallel programming rely on. > > http://chandlerc.net/llvm_atomics.html > > While I am no expert on the various architectures, I've done my best > at providing base-line implementations for each instruction. I am sure > these will need tweaking and fixing, but should provide a very good > starting point for the targets. > > Comments are more than welcome, especially suggestions for > improvement. I hope this provides a sound background and a good > starting place for bringing these features to LLVM. Many thanks also > go to Reid who has helped tremendously with putting this together, and > several of the people at Aerospace who helped in the research phase. > > -Chandler Carruth > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Scott Michel
2007-Jul-09 17:33 UTC
[LLVMdev] Proposal for atomic and synchronization instructions
Torvald Riegel wrote:> Hi, > > I'd like to see support for something like this. I have some comments, and I > think there is existing work that you can reuse."reuse within the compiler."> "While the processor may spin and attempt the atomic operation more than once > before it is successful, research indicates this is extremely uncommon." > I don't understand this sentence, what do you mean?I'm not sure I can pinpoint the paper from which the statement is based, but I seem to recall something similar in the original LL-SC papers (Maurice Herlihy, DEC Western Research Labs?) It's a foundation for lock-free algorithms.> You probably don't need to require CAS (compare-and-set) to return the > previous value (I think some architectures don't), but just return a boolean > value (success/failure).compare and swap?> What are the reasons because of which you picked the Load/Store model for > barriers and not some other kind (e.g., acquire/release/...)?Chandler looked at what the various current LLVM architectures and summarized what he found. What he found are the memory barriers that the various processors support.> Did you have a look at the atomic_ops project? > http://www.hpl.hp.com/research/linux/atomic_ops/ > It already has implementations for several architectures and several > compilers. It uses a different consistency model (different set of > constraints for operations) and groups necessary memory barriers with > instructions (helpful on some architectures). It supports a few more > operations. The author (Hans Boehm) seems to also be active in the area of > C/C++ memory models (or some support for this).LLVM doesn't emit external library calls -- there is no "-lllvm" to which programs have to link, so adding an atomic operation library is likely to be a non-starter. LLVM is interested in emitting instructions to make atomic operations (and higher level concurrency primitives) possible, which is why Chandler's work is usefully important. -scooter
Andrew Lenharth
2007-Jul-09 20:02 UTC
[LLVMdev] Proposal for atomic and synchronization instructions
Poor alpha, no code examples or entries in your tables. Andrew On 7/9/07, Chandler Carruth <chandlerc at gmail.com> wrote:> Hello, > > After a fair amount of research and work, I have put together a > concrete proposal for LLVM representations of atomic operations and > synchronization constructs. These aim to provide the minimal > functionality in the IR for representing the hardware constructs that > threading libraries and parallel programming rely on. > > http://chandlerc.net/llvm_atomics.html > > While I am no expert on the various architectures, I've done my best > at providing base-line implementations for each instruction. I am sure > these will need tweaking and fixing, but should provide a very good > starting point for the targets. > > Comments are more than welcome, especially suggestions for > improvement. I hope this provides a sound background and a good > starting place for bringing these features to LLVM. Many thanks also > go to Reid who has helped tremendously with putting this together, and > several of the people at Aerospace who helped in the research phase. > > -Chandler Carruth > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev >
Andrew Lenharth
2007-Jul-09 20:04 UTC
[LLVMdev] Proposal for atomic and synchronization instructions
On 7/9/07, Andrew Lenharth <andrewl at lenharth.org> wrote:> Poor alpha, no code examples or entries in your tables.But that said, it uses a load-locked, store-conditional and has various memory barriers which are sufficient to implement all your proposal. Andrew> On 7/9/07, Chandler Carruth <chandlerc at gmail.com> wrote: > > Hello, > > > > After a fair amount of research and work, I have put together a > > concrete proposal for LLVM representations of atomic operations and > > synchronization constructs. These aim to provide the minimal > > functionality in the IR for representing the hardware constructs that > > threading libraries and parallel programming rely on. > > > > http://chandlerc.net/llvm_atomics.html > > > > While I am no expert on the various architectures, I've done my best > > at providing base-line implementations for each instruction. I am sure > > these will need tweaking and fixing, but should provide a very good > > starting point for the targets. > > > > Comments are more than welcome, especially suggestions for > > improvement. I hope this provides a sound background and a good > > starting place for bringing these features to LLVM. Many thanks also > > go to Reid who has helped tremendously with putting this together, and > > several of the people at Aerospace who helped in the research phase. > > > > -Chandler Carruth > > _______________________________________________ > > LLVM Developers mailing list > > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev > > >
Chandler Carruth
2007-Jul-09 21:26 UTC
[LLVMdev] Proposal for atomic and synchronization instructions
On 7/9/07, John Criswell <criswell at cs.uiuc.edu> wrote:> 1) You may want to consider adding atomic load-<bitwise operation>-store > instructions in addition to load-<add/subtract> instructions. The Linux > kernel uses these for atomic bit setting/clearing, and on many systems > they can be implemented more efficiently using special assembly > instructions. That said, I have never run experiments to verify that > such instructions are critical to kernel performance.I was trying to keep the set of operations as small as possible. Supporting all of the bitwise operations on the x86 architecture would be very difficult due to their number. (BTS, BTR, BTC...) Several of these also have no easy emulation available for other architectures. (Imagine doing an atomic test and set of a single bit, without affecting the rest of the bits, on SPARC..) Others do not fit well into the SSA representation of LLVM. This is particularly true of the non-exchanging operations on x86. Because x86 can work directly with memory, avoiding loads and stores, many locked operations are practical there without saving out the old value as part of the atomic instruction. Representing this in an SSA fashion would be very difficult I think, especially when trying to maintain atomicity across the entire LLVM instruction which isn't necessarily implementable as a single instruction on x86. All the same, if there is a demand for these bitwise operations, I am more than willing to try and work up ways to include them. Do other people have ideas for implementing them cleanly, and/or ideas about the demand for them over using "cas" to achieve the functionality?> 2) You need a strategy for handling architectures that can't handle > atomic operations on certain LLVM data types. For example, 64 bit > machines can operate atomically on 64 bit operands, but some 32 bit > machines cannot. I think you can fix this with spin locking, but you > need to be careful on how you do it. I think in order to do it > correctly, you need a separate spinlock variable for each individual > instruction that requires a spinlock.Indeed, which is very dangerous with potential for deadlock, etc. After discussing this on IRC, I have adjusted the proposal to reflect the idea that the target implementation can reject any instructions with operands which they cannot effectively lower to an atomic operation. This makes the available types for the instruction architecture dependent, but when relying on the atomic architecture implementation, this seems part of the package.> 3) You say that your operations work on all first class LLVM data > types. The LLVM vector type is considered a first class type. Should > LLVM support atomic vector operations?Thanks for pointing this out! Continuing with the changes above, I have changed the proposal to state that these instructions only accept integer types. This should provide the needed functionality of these operations, while keeping a clear lowering path to the target architecture.> 4) For consistency, you may need to specify that the LLVM volatile > keyword can be added to the atomic operations to prevent optimizations > from modifying or removing them. I'm not aware of any optimization > that works on atomic ops in practice, but I don't see why they couldn't > be written.Why would they need this keyword? If the optimization pass works on atomic ops, would it need to understand the semantics involved, and only modify/remove them when those semantics allowed it... specifically, constant propagation should be able to replace the values in these operations with constants if it can do so, no? Perhaps I don't understand what situation you're trying to avoid/prepare for with this.> 5) With the addition of membar instructions, it may be time to re-think > what the LLVM volatile keyword means. Currently, volatile prevents > removing, modifying, or reordering of a load or store. However, membar > also prevents reordering. Perhaps it would be useful to redefine > volatile to mean that a load/store cannot be removed or modified but can > be reordered, and membar alone prevents reordering.I personally think this would be very interesting to do, as the barriers provide a very fine grained level of control over load/store ordering. However, they have implications if they are used for this purpose -- should the order enforcement of loads and stores without interprocess implications cause the bus events that these memory barriers do? That could result in a huge slowdown. More likely, these barriers should provide another piece of information about load/store reordering, not superseding volatile. If people see value in using a barrier-type set of constraints on volatile reordering, this should be provided separately I think. -Chandler Carruth> > -- John T. > > -Chandler Carruth > > _______________________________________________ > > LLVM Developers mailing list > > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev > > > > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev >
Chris Lattner
2007-Jul-10 18:57 UTC
[LLVMdev] Proposal for atomic and synchronization instructions
On Mon, 9 Jul 2007, John Criswell wrote:>> Comments are more than welcome, especially suggestions for >> improvement. I hope this provides a sound background and a good >> starting place for bringing these features to LLVM. Many thanks also >> go to Reid who has helped tremendously with putting this together, and >> several of the people at Aerospace who helped in the research phase. >> > This looks good; this is basically what we came up with for the SVA-OS work. > > Some comments: > > 1) You may want to consider adding atomic load-<bitwise operation>-store > instructions in addition to load-<add/subtract> instructions. The LinuxYep, this is also important for supporting the GCC synch builtins: http://gcc.gnu.org/onlinedocs/gcc/Atomic-Builtins.html#Atomic-Builtins despite the description, these are not itanium specific. We will want to support these with GCC 4.2 to support OpenMP.> 2) You need a strategy for handling architectures that can't handle > atomic operations on certain LLVM data types. For example, 64 bitYep.> 3) You say that your operations work on all first class LLVM data > types. The LLVM vector type is considered a first class type. Should > LLVM support atomic vector operations?Yep :), potentially through spinlocking.> 5) With the addition of membar instructions, it may be time to re-think > what the LLVM volatile keyword means. Currently, volatile preventsThe volatile marker and synchronization are two different things. Please don't mess with volatile :) -Chris -- http://nondot.org/sabre/ http://llvm.org/
Reasonably Related Threads
- [LLVMdev] Proposal for atomic and synchronization instructions
- [LLVMdev] Proposal for atomic and synchronization instructions
- [LLVMdev] Atomic Operation and Synchronization Proposal v2
- [LLVMdev] Atomic Operation and Synchronization Proposal v2
- [LLVMdev] Proposal for atomic and synchronization instructions