search for: cmpxchg

Displaying 17 results from an estimated 549 matches for "cmpxchg".

2014 Mar 07
3
[LLVMdev] [RFC] Add second "failure" AtomicOrdering to cmpxchg instruction
Hi all, The C++11 (& C11) compare_exchange functions with explicit memory order allow you to specify two sets of semantics, one for when the exchange actually happens and one for when it fails. Unfortunately, at the moment the LLVM IR "cmpxchg" instruction only has one ordering, which means we get sub-optimal codegen. This probably affects all architectures which use load-linked/store-conditional instructions for atomic operations and don't have versions with built-in acquire/release semantics (and so need barriers). For exampl...
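At the C++ source level, the two sets of semantics look like this; a minimal sketch (function and variable names are illustrative, not from the thread):

#include <atomic>

// C++11 compare_exchange with separate success/failure orderings -- the
// source-level construct that the proposed second "failure" ordering on
// the IR cmpxchg instruction would carry through to codegen.
bool try_publish(std::atomic<int> &a, int &expected, int desired) {
    return a.compare_exchange_strong(expected, desired,
                                     std::memory_order_acq_rel,   // if the exchange happens
                                     std::memory_order_acquire);  // if it fails
}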
2020 Aug 17
4
cmpxchg on floats
On Fri, Aug 14, 2020 at 10:42:02AM -0700, JF Bastien via llvm-dev wrote: > We (C, C++, and LLVM) are generally moving towards supporting FP as a > first-class thing with all atomic operations †, including cmpxchg. It’s > indeed *usually* specified as a bitwise comparison, not a floating-point > one, although IIRC AMD has an FP cmpxchg. Similarly, some of the > operations are allowed to have separate FP state (say, atomic add won’t > necessarily affect the scalar FP execution’s exception state, m...
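To make the bitwise-comparison point concrete, a small C++ sketch (assuming a conforming std::atomic<float>, where compare_exchange compares object representations):

#include <atomic>
#include <cassert>

int main() {
    std::atomic<float> a{+0.0f};
    float expected = -0.0f;  // equal to +0.0f as a floating-point comparison...
    bool ok = a.compare_exchange_strong(expected, 1.0f);
    // ...but the CAS compares bit patterns, and the sign bits differ,
    // so the exchange fails and 'expected' is reloaded with +0.0f.
    assert(!ok);
}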
2020 Aug 22
2
cmpxchg on floats
...ia llvm-dev > <llvm-dev at lists.llvm.org> wrote: > > On Fri, Aug 14, 2020 at 10:42:02AM -0700, JF Bastien via llvm-dev wrote: > > > We (C, C++, and LLVM) are generally moving towards supporting FP as a > > > first-class thing with all atomic operations †, including cmpxchg. It’s > > > indeed *usually* specified as a bitwise comparison, not a floating-point > > > one, although IIRC AMD has an FP cmpxchg. Similarly, some of the > > > operations are allowed to have separate FP state (say, atomic add won’t > > > necessarily affect the...
2020 Aug 22
2
cmpxchg on floats
...at lists.llvm.org> wrote: > > > > On Fri, Aug 14, 2020 at 10:42:02AM -0700, JF Bastien via llvm-dev wrote: > > > > > We (C, C++, and LLVM) are generally moving towards supporting FP as a > > > > > first-class thing with all atomic operations †, including cmpxchg. It’s > > > > > indeed *usually* specified as a bitwise comparison, not a floating-point > > > > > one, although IIRC AMD has an FP cmpxchg. Similarly, some of the > > > > > operations are allowed to have separate FP state (say, atomic add won’t...
2020 Aug 14
3
cmpxchg on floats
We've relaxed `atomicrmw xchg` to support floating point types but not cmpxchg -- the cmpxchg comparison behavior is not a floating point comparison, so that would be potentially misleading. I'd say adding the assertion is a good idea. Cheers, Nicolai On Thu, Aug 13, 2020 at 10:59 PM Chris Lattner via llvm-dev <llvm-dev at lists.llvm.org> wrote: > > Does the...
2020 Aug 13
2
cmpxchg on floats
Hi LLVM-dev, when working on MLIR-to-LLVM-IR conversion, I noticed that it is possible to programmatically construct a cmpxchg instruction operating on floats (or actually any type since there is no assertion on pointer element type here https://github.com/llvm/llvm-project/blob/9c2e708f0dc547d386ea528450a33ef4bd2a750b/llvm/lib/IR/Instructions.cpp#L1501), but LangRef specifies that only integers and pointers are accepted (...
2014 Sep 06
5
[LLVMdev] cmpxchg instruction with pointer operands
cmpxchg only supports exchange on int operands, but pointer values can be very useful here, e.g. for a stack<T> backed by a linked list, the top can be atomic<Node<T>*>. In clang++, cmpxchg operations on atomic<T*> are bitcast to i64 and the operation is done on that, which is ugly. Any reason or concern why...
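The use case the poster describes, sketched in C++ (a Treiber-style push; Stack and Node are illustrative names):

#include <atomic>

template <typename T>
struct Stack {
    struct Node { T value; Node *next; };
    std::atomic<Node *> top{nullptr};

    void push(T v) {
        Node *n = new Node{v, top.load(std::memory_order_relaxed)};
        // CAS directly on the atomic pointer -- no i64 bitcasts at the
        // source level; retry until n->next still matches the current top.
        while (!top.compare_exchange_weak(n->next, n,
                                          std::memory_order_release,
                                          std::memory_order_relaxed))
            ;
    }
};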
2014 Jun 12
6
[LLVMdev] RFC: add "cmpxchg weak" to LLVM IR
Hi all, I've decided the next step in atomics is the weak compare-and-exchange operation. As with the failure order, I'm going to outline the direction I'd like to take: 1. All cmpxchg instructions now return { iN, i1 } where the first value is what we got before (the loaded result), the second == 1 if an exchange took place. 2. "weak" is an optional modifier to the cmpxchg instructions. If anyone wants a bikeshed to paint, this would be a good one. I wasn't sure my...
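In C++ terms, the { iN, i1 } result and the weak variant correspond to compare_exchange_weak: the bool result is the i1, and on failure the 'expected' argument is overwritten with the loaded iN value. A sketch of the usual retry-loop usage (illustrative names):

#include <atomic>

// compare_exchange_weak may fail spuriously (a failed store-conditional),
// so it lives in a loop that re-examines the freshly loaded value.
int increment_if_even(std::atomic<int> &a) {
    int expected = a.load();
    do {
        if (expected % 2 != 0)
            return expected;  // give up: an odd value was observed
    } while (!a.compare_exchange_weak(expected, expected + 1));
    return expected + 1;
}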
2017 May 30
3
[atomics][AArch64] Possible bug in cmpxchg lowering
Currently the AtomicExpandPass will lower the following IR:

define i1 @foo(i32* %obj, i32 %old, i32 %new) {
entry:
  %v0 = cmpxchg weak volatile i32* %obj, i32 %old, i32 %new release acquire
  %v1 = extractvalue { i32, i1 } %v0, 1
  ret i1 %v1
}

to the equivalent of the following on AArch64:

  ldxr w8, [x0]
  cmp w8, w1
  b.ne .LBB0_3
// BB#1:                                // %cmpxchg.trystore...
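The C++ source that produces IR of this shape would be roughly the following sketch (names are made up; note this success/failure ordering pairing is only well-formed under C++17's relaxed rules):

#include <atomic>

bool foo(std::atomic<int> *obj, int old_, int new_) {
    // weak CAS, success order = release, failure order = acquire --
    // the "release acquire" pair from the IR above
    return obj->compare_exchange_weak(old_, new_,
                                      std::memory_order_release,
                                      std::memory_order_acquire);
}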
2018 Aug 17
4
[Release-testers] [7.0.0 Release] rc1 has been tagged
...s >> >> Adjust MaxAtomicInlineWidth for i386/i486 targets. >> >> This is to fix the bug reported in https://bugs.llvm.org/show_bug.cgi?id=34347#c6. >> Currently, the MaxAtomicInlineWidth of all x86-32 targets is set to 64. However, >> i386 doesn't support any cmpxchg-related instructions, and i486 only supports cmpxchg. >> So in this patch MaxAtomicInlineWidth is reset as follows: >> For i386, the MaxAtomicInlineWidth should be 0 because no cmpxchg is supported. >> For i486, the MaxAtomicInlineWidth should be 32 because it supports cmpxchg...
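One hedged way to observe the described effect from user code (what it prints depends on the -march setting and on how the atomic support library is configured):

#include <atomic>
#include <cstdint>
#include <cstdio>

int main() {
    std::atomic<std::uint64_t> v{0};
    // With MaxAtomicInlineWidth = 32 (i486), a 64-bit atomic cannot be
    // inlined (no cmpxchg8b on i486), so this is expected to report 0
    // and 64-bit atomic operations are routed through library calls.
    std::printf("64-bit atomics lock-free: %d\n", (int)v.is_lock_free());
}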
2009 Aug 06
2
[PATCH] hvm emul: fix cmpxchg emulation to use an atomic operation
# HG changeset patch
# User Patrick Colp <Patrick.Colp@citrix.com>
# Date 1249555177 -3600
# Node ID 684c8fc69d658d058246eb9edfc8eba187ae6f2c
# Parent 68e8b8379244e293c55875e7dc3692fc81d3d212
hvm emul: fix cmpxchg emulation to use an atomic operation.

Currently HVM cmpxchg emulation is done by doing a normal emulated write, which is not atomic. This patch changes it to use a cmpxchg operation instead, which is atomic.

Signed-off-by: Patrick Colp <Patrick.Colp@citrix.com>

diff -r 68e8b8379244 -r 684c...
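The difference the patch addresses, sketched in C++ with hypothetical helper names (the real change is inside Xen's HVM emulator, not shown here):

#include <atomic>
#include <cstdint>

// Racy emulation: the compare and the write are separate steps, so
// another vCPU can modify memory in between them.
bool emulate_cmpxchg_racy(std::atomic<uint64_t> &mem,
                          uint64_t expected, uint64_t desired) {
    if (mem.load() != expected)
        return false;
    mem.store(desired);  // not atomic with the check above
    return true;
}

// Fixed behavior: compare and swap happen as one indivisible operation.
bool emulate_cmpxchg_atomic(std::atomic<uint64_t> &mem,
                            uint64_t expected, uint64_t desired) {
    return mem.compare_exchange_strong(expected, desired);
}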
2014 May 10
2
[LLVMdev] Replacing Platform Specific IR Codes with Generic Implementation and Introducing Macro Facilities
On 10 May 2014, at 18:14, Tim Northover <t.p.northover at gmail.com> wrote: >> The easiest solution would be to extend the cmpxchg instruction with a >> weak variant. It is then trivial to map load, modify, weak-cmpxchg to >> load-linked, modify, store-conditional (that is what weak cmpxchg was >> intended for in the C[++]11 memory model). > > That would certainly be the easiest. But you'd get les...
2014 May 28
7
[RFC] Implement Batched (group) ticket lock
...This increases the probability of any eligible lock-holder being in the running state (to an average of (batch_size/2)-1). It also provides the needed bounded starvation, since any lock requester cannot acquire the lock more than batch_size times repeatedly during contention. On the negative side, we would need an extra cmpxchg. The patch uses a batch size of 4. (As we know, increasing the batch size means we are closer to unfair locks, and a batch size of 1 = ticketlock.) Result: Test system: 32cpu 2node machine w/ 64GB each (16 pcpu machine +ht). Guests: 8GB 16vcpu guests (average of 8 iterations) % Improvements with k...
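For reference, the batch-size-of-1 baseline mentioned above, i.e. a plain ticket lock, sketched in C++ (illustrative only; this is not the kernel patch):

#include <atomic>
#include <cstdint>

struct TicketLock {
    std::atomic<uint32_t> next{0};   // next ticket to hand out
    std::atomic<uint32_t> owner{0};  // ticket currently being served

    void lock() {
        uint32_t my = next.fetch_add(1, std::memory_order_relaxed);
        // A batched variant would admit any ticket in
        // [owner, owner + batch_size), paying the extra cmpxchg noted
        // above to claim a slot within the batch.
        while (owner.load(std::memory_order_acquire) != my)
            ;  // spin
    }
    void unlock() {
        owner.fetch_add(1, std::memory_order_release);
    }
};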
2014 May 29
4
[LLVMdev] Proposal: "load linked" and "store conditional" atomic instructions
Hi, I've been looking at improving atomicrmw & cmpxchg code more, particularly on architectures using the load-linked/store-conditional model. The summary is that the current expansion for cmpxchg seems to happen too late for LLVM to make meaningful use of the opportunities it provides. I'd like to move it earlier and express it in terms of a first-cl...
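The structural point in portable C++: on LL/SC machines a strong cmpxchg is itself a loop around a single LL/SC attempt, which is exactly a loop around a weak CAS (sketch; names are illustrative):

#include <atomic>

// Build a strong CAS from weak (single LL/SC attempt) CASes. On failure,
// compare_exchange_weak reloads 'expected'; if it still equals the value
// we wanted to see, the failure was spurious and we retry.
bool strong_cas(std::atomic<int> &a, int &expected, int desired) {
    const int want = expected;
    do {
        expected = want;
        if (a.compare_exchange_weak(expected, desired))
            return true;
    } while (expected == want);
    return false;  // genuine mismatch; 'expected' holds the observed value
}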
2015 Feb 10
4
[PATCH] x86 spinlock: Fix memory corruption on completing completions
...he tail is in the high bytes (head really >> needs to be high to work, if it's in the low byte(s) the xadd would >> overflow from head into tail which would be wrong). > > Unfortunately xadd could result in head overflow as tail is high. > > The other option was repeated cmpxchg, which is bad, I believe. > Any suggestions? Stupid question... what if we simply move SLOWPATH from .tail to .head? In this case arch_spin_unlock() could do xadd(tickets.head) and check the result. In this case __ticket_check_and_clear_slowpath() really needs to cmpxchg the whole .head_tail. Plu...
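To make the overflow concern concrete, a tiny C++ sketch (assuming the layout discussed: 8-bit tickets, head in the low byte of a 16-bit word):

#include <cstdint>

int main() {
    uint16_t head_tail = 0x02ff;  // tail = 0x02 (high byte), head = 0xff (low byte)
    head_tail += 1;               // the xadd: head wraps 0xff -> 0x00...
    // head_tail is now 0x0300: the carry silently bumped tail to 0x03.
    return head_tail == 0x0300 ? 0 : 1;
}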
2015 Apr 29
4
[PATCH v16 13/14] pvqspinlock: Improve slowpath performance by avoiding cmpxchg
On Fri, Apr 24, 2015 at 02:56:42PM -0400, Waiman Long wrote: > In the pv_scan_next() function, the slow cmpxchg atomic operation is > performed even if the other CPU is not even close to being halted. This > extra cmpxchg can harm slowpath performance. > > This patch introduces the new mayhalt flag to indicate if the other > spinning CPU is close to being halted or not. The current threshold...
2015 Apr 13
1
[PATCH v15 09/15] pvqspinlock: Implement simple paravirt support for the qspinlock
...must be set before setting _Q_SLOW_VAL
> >>+ *
> >>+ * [S] lp = lock                    [RmW] l = l->locked = 0
> >>+ *     MB                                 MB
> >>+ * [S] l->locked = _Q_SLOW_VAL      [L]   lp
> >>+ *
> >>+ * Matches the cmpxchg() in pv_queue_spin_unlock().
> >>+ */
> >>+ if (!slow_set &&
> >>+     !cmpxchg(&l->locked, _Q_LOCKED_VAL, _Q_SLOW_VAL)) {
> >>+ /*
> >>+ * The lock is free and _Q_SLOW_VAL has never been
> >>+ * set. Need to clear the...