thr3ads.net - search: "redzoen"

[PATCH 3/4] x86,asm: Re-work smp_store_mb()

2016 Jan 12

3

[PATCH 3/4] x86,asm: Re-work smp_store_mb()

On Mon, Nov 02, 2015 at 04:06:46PM -0800, Linus Torvalds wrote: > On Mon, Nov 2, 2015 at 12:15 PM, Davidlohr Bueso <dave at stgolabs.net> wrote: > > > > So I ran some experiments on an IvyBridge (2.8GHz) and the cost of XCHG is > > constantly cheaper (by at least half the latency) than MFENCE. While there > > was a decent amount of variation, this difference

[PATCH 3/4] x86,asm: Re-work smp_store_mb()

2016 Jan 12

3

[PATCH 3/4] x86,asm: Re-work smp_store_mb()

On Mon, Nov 02, 2015 at 04:06:46PM -0800, Linus Torvalds wrote: > On Mon, Nov 2, 2015 at 12:15 PM, Davidlohr Bueso <dave at stgolabs.net> wrote: > > > > So I ran some experiments on an IvyBridge (2.8GHz) and the cost of XCHG is > > constantly cheaper (by at least half the latency) than MFENCE. While there > > was a decent amount of variation, this difference

[PATCH 3/4] x86,asm: Re-work smp_store_mb()

2016 Jan 12

0

[PATCH 3/4] x86,asm: Re-work smp_store_mb()

...t it? .. it's fine for some basic testing. It doesn't show any subtle interactions (ie some operations may have different dynamic behavior when the write buffers are busy etc), but as a baseline for "how fast can things go" the stupid raw loop is fine. And while the xchg into the redzoen wouldn't be acceptable as a real implementation, for timing testing it's likely fine (ie you aren't hitting the problem it can cause). > So mfence is more expensive than locked instructions/xchg, but sfence/lfence > are slightly faster, and xchg and locked instructions are very c...

[PATCH 3/4] x86,asm: Re-work smp_store_mb()

2016 Jan 12

2

[PATCH 3/4] x86,asm: Re-work smp_store_mb()

...ine for some basic testing. It doesn't show any subtle > interactions (ie some operations may have different dynamic behavior > when the write buffers are busy etc), but as a baseline for "how fast > can things go" the stupid raw loop is fine. And while the xchg into > the redzoen wouldn't be acceptable as a real implementation, for > timing testing it's likely fine (ie you aren't hitting the problem it > can cause). I recall reading somewhere that lock addl $0, 32(%rsp) or so (maybe even 64) was better because it avoided stomping on very-likely-to-be-hot...

[PATCH 3/4] x86,asm: Re-work smp_store_mb()

2016 Jan 12

2

[PATCH 3/4] x86,asm: Re-work smp_store_mb()

...ine for some basic testing. It doesn't show any subtle > interactions (ie some operations may have different dynamic behavior > when the write buffers are busy etc), but as a baseline for "how fast > can things go" the stupid raw loop is fine. And while the xchg into > the redzoen wouldn't be acceptable as a real implementation, for > timing testing it's likely fine (ie you aren't hitting the problem it > can cause). I recall reading somewhere that lock addl $0, 32(%rsp) or so (maybe even 64) was better because it avoided stomping on very-likely-to-be-hot...

[PATCH 3/4] x86,asm: Re-work smp_store_mb()

2016 Jan 12

1

[PATCH 3/4] x86,asm: Re-work smp_store_mb()

...ine for some basic testing. It doesn't show any subtle > interactions (ie some operations may have different dynamic behavior > when the write buffers are busy etc), but as a baseline for "how fast > can things go" the stupid raw loop is fine. And while the xchg into > the redzoen wouldn't be acceptable as a real implementation, for > timing testing it's likely fine (ie you aren't hitting the problem it > can cause). > > > So mfence is more expensive than locked instructions/xchg, but sfence/lfence > > are slightly faster, and xchg and locke...

[PATCH 3/4] x86,asm: Re-work smp_store_mb()

2016 Jan 12

1

[PATCH 3/4] x86,asm: Re-work smp_store_mb()

...ine for some basic testing. It doesn't show any subtle > interactions (ie some operations may have different dynamic behavior > when the write buffers are busy etc), but as a baseline for "how fast > can things go" the stupid raw loop is fine. And while the xchg into > the redzoen wouldn't be acceptable as a real implementation, for > timing testing it's likely fine (ie you aren't hitting the problem it > can cause). > > > So mfence is more expensive than locked instructions/xchg, but sfence/lfence > > are slightly faster, and xchg and locke...

search for: redzoen