similar to: LFENCE instruction (was: [rfc][patch 3/3] x86: optimise barriers)

2020 Jul 09
4
[PATCH v3 5/6] powerpc/pseries: implement paravirt qspinlocks for SPLPAR
Nicholas Piggin <npiggin at gmail.com> writes:
> Signed-off-by: Nicholas Piggin <npiggin at gmail.com>
> ---
>  arch/powerpc/include/asm/paravirt.h           | 28 ++++++++
>  arch/powerpc/include/asm/qspinlock.h          | 66 +++++++++++++++++++
>  arch/powerpc/include/asm/qspinlock_paravirt.h |  7 ++
>  arch/powerpc/platforms/pseries/Kconfig        |  5 ++
>
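For orientation, the series builds on the generic Linux qspinlock API; a minimal sketch of the fast path it hooks into looks roughly like this (illustrative only -- the SPLPAR paravirt plumbing in the actual patch is more involved):

	static __always_inline void queued_spin_lock(struct qspinlock *lock)
	{
		u32 val = 0;

		/* Uncontended case: take the lock with one acquire cmpxchg. */
		if (likely(atomic_try_cmpxchg_acquire(&lock->val, &val, _Q_LOCKED_VAL)))
			return;

		/*
		 * Contended case: queue up in the slowpath. The paravirt
		 * variant additionally yields the vCPU to the hypervisor
		 * (H_CONFER on pseries) instead of spinning on a preempted
		 * lock holder.
		 */
		queued_spin_lock_slowpath(lock, val);
	}
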
2020 Jul 09
1
[PATCH v3 4/6] powerpc/64s: implement queued spinlocks and rwlocks
Nicholas Piggin <npiggin at gmail.com> writes:
> These have shown significantly improved performance and fairness when
> spinlock contention is moderate to high on very large systems.
>
> [ Numbers hopefully forthcoming after more testing, but initial
>   results look good ]

Would be good to have something here, even if it's preliminary.

> Thanks to the fast path,
2020 Jul 06
13
[PATCH v3 0/6] powerpc: queued spinlocks and rwlocks
v3 is updated to use __pv_queued_spin_unlock, noticed by Waiman (thank you).

Thanks,
Nick

Nicholas Piggin (6):
  powerpc/powernv: must include hvcall.h to get PAPR defines
  powerpc/pseries: move some PAPR paravirt functions to their own file
  powerpc: move spinlock implementation to simple_spinlock
  powerpc/64s: implement queued spinlocks and rwlocks
  powerpc/pseries: implement paravirt
2010 Mar 15
1
[patch] btrfs: fix gfp flags masking
Signed-off-by: Nick Piggin <npiggin@suse.de>
--
Index: linux-2.6/fs/btrfs/compression.c
===================================================================
--- linux-2.6.orig/fs/btrfs/compression.c
+++ linux-2.6/fs/btrfs/compression.c
@@ -478,7 +478,7 @@ static noinline int add_ra_bio_pages(str
 			goto next;
 		}
-		page = alloc_page(mapping_gfp_mask(mapping) | GFP_NOFS);
+		page =
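The bug being masked here: OR-ing GFP_NOFS into a mask can only set bits, so any __GFP_FS already present in mapping_gfp_mask() survives. A sketch of the intent -- the replacement line is truncated above, so the exact fixed form shown is an assumption:

	/* buggy: __GFP_FS from the mapping mask survives the OR */
	page = alloc_page(mapping_gfp_mask(mapping) | GFP_NOFS);

	/* intended fix (assumed): clear __GFP_FS so reclaim can't re-enter the fs */
	page = alloc_page(mapping_gfp_mask(mapping) & ~__GFP_FS);
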
2020 Jul 02
12
[PATCH 0/8] powerpc: queued spinlocks and rwlocks
This series adds an option to use queued spinlocks for powerpc, and makes it the default for the Book3S-64 subarch. This effort starts with the generic code so it's very simple but still very performant. There are optimisations that can be made to slowpaths, but I think it's better to attack those incrementally if/when we find things, and try to add the improvements to generic code as
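Since it is an option rather than a wholesale switch, the arch spinlock header selects the implementation at compile time; schematically, assuming the config symbol the series introduces:

	/* arch/powerpc/include/asm/spinlock.h, sketched */
	#ifdef CONFIG_PPC_QUEUED_SPINLOCKS
	#include <asm/qspinlock.h>
	#else
	#include <asm/simple_spinlock.h>
	#endif
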
2020 Jul 05
1
[PATCH v2 5/6] powerpc/pseries: implement paravirt qspinlocks for SPLPAR
On 7/3/20 3:35 AM, Nicholas Piggin wrote:
> Signed-off-by: Nicholas Piggin <npiggin at gmail.com>
> ---
>  arch/powerpc/include/asm/paravirt.h           | 28 ++++++++++
>  arch/powerpc/include/asm/qspinlock.h          | 55 +++++++++++++++++++
>  arch/powerpc/include/asm/qspinlock_paravirt.h |  5 ++
>  arch/powerpc/platforms/pseries/Kconfig        |  5 ++
>
2020 Jul 03
7
[PATCH v2 0/6] powerpc: queued spinlocks and rwlocks
v2 is updated to account for feedback from Will, Peter, and Waiman (thank you), and trims off a couple of RFC and unrelated patches.

Thanks,
Nick

Nicholas Piggin (6):
  powerpc/powernv: must include hvcall.h to get PAPR defines
  powerpc/pseries: move some PAPR paravirt functions to their own file
  powerpc: move spinlock implementation to simple_spinlock
  powerpc/64s: implement queued
2020 Jul 24
8
[PATCH v4 0/6] powerpc: queued spinlocks and rwlocks
Updated with everybody's feedback (thanks all), and more performance results. What I've found is I might have been measuring the worst load point for the paravirt case, and by looking at a range of loads it's clear that queued spinlocks are overall better even on PV, doubly so when you look at the generally much improved worst case latencies. I have defaulted it to N even though
2020 Jul 02
0
[PATCH 2/8] powerpc/pseries: use smp_rmb() in H_CONFER spin yield
There is no need for rmb(), this allows faster lwsync here.

Signed-off-by: Nicholas Piggin <npiggin at gmail.com>
---
 arch/powerpc/lib/locks.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/lib/locks.c b/arch/powerpc/lib/locks.c
index 6440d5943c00..47a530de733e 100644
--- a/arch/powerpc/lib/locks.c
+++ b/arch/powerpc/lib/locks.c
@@ -30,7 +30,7 @@ void
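Why the relaxation is safe: on powerpc rmb() emits the full "sync" instruction while smp_rmb() emits the cheaper "lwsync", and lwsync is enough to order loads from cacheable memory. A simplified sketch of the yield path (modeled on arch/powerpc/lib/locks.c; details approximate):

	void splpar_spin_yield_sketch(arch_spinlock_t *lock)
	{
		unsigned int lock_value = lock->slock;
		unsigned int holder_cpu = lock_value & 0xffff;
		u32 yield_count;

		if (lock_value == 0)
			return;		/* not held */

		yield_count = be32_to_cpu(lppaca_of(holder_cpu).yield_count);
		if (!(yield_count & 1))
			return;		/* holder vCPU is running; nothing to confer */

		smp_rmb();		/* was rmb(): only cacheable loads need ordering */

		if (lock->slock != lock_value)
			return;		/* lock changed hands; our sample is stale */

		plpar_hcall_norets(H_CONFER,
				   get_hard_smp_processor_id(holder_cpu), yield_count);
	}
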
2008 Jun 10
1
[PATCH] xen: Use wmb instead of rmb in xen_evtchn_do_upcall().
This patch is a port of 534:77db69c38249 from linux-2.6.18-xen.hg. Use wmb instead of rmb to enforce ordering between evtchn_upcall_pending and evtchn_pending_sel stores in xen_evtchn_do_upcall().

Cc: Samuel Thibault <samuel.thibault at eu.citrix.com>
Signed-off-by: Isaku Yamahata <yamahata at valinux.co.jp>
---
 drivers/xen/events.c | 2 +-
 1 files changed, 1 insertions(+), 1
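The underlying point: a *store* barrier is what orders two stores; a read barrier gives no guarantee here. Schematically (field accesses are illustrative, not the exact driver code):

	/* inside xen_evtchn_do_upcall(), sketched: */
	vcpu_info->evtchn_upcall_pending = 0;

	wmb();	/* was rmb(): the master-flag clear must be visible
		   before the selector word is sampled and cleared */

	pending_words = xchg(&vcpu_info->evtchn_pending_sel, 0);
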
2016 Jan 28
0
[PATCH v5 1/5] x86: add cc clobber for addl
addl clobbers flags (such as CF) but barrier.h didn't tell this to gcc. Historically, gcc doesn't need one on x86, and always considers flags clobbered. We are probably missing the cc clobber in a *lot* of places for this reason. But even if not necessary, it's probably a good thing to add for documentation, and in case gcc semantics ever change. Reported-by: Borislav Petkov <bp at
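What the clobber looks like in place -- a sketch of an addl-based barrier; the exact stack-pointer operand varies by arch width and patch version:

	#define smp_mb()						\
		asm volatile("lock; addl $0,0(%%rsp)"			\
			     ::: "memory", "cc")	/* "cc": addl writes EFLAGS */
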
2020 Jul 02
3
[PATCH 5/8] powerpc/64s: implement queued spinlocks and rwlocks
On Thu, Jul 02, 2020 at 08:25:43PM +1000, Nicholas Piggin wrote:
> Excerpts from Will Deacon's message of July 2, 2020 6:02 pm:
> > On Thu, Jul 02, 2020 at 05:48:36PM +1000, Nicholas Piggin wrote:
> >> diff --git a/arch/powerpc/include/asm/qspinlock.h b/arch/powerpc/include/asm/qspinlock.h
> >> new file mode 100644
> >> index 000000000000..f84da77b6bb7
> >>
2017 Oct 27
1
[PATCH v6] x86: use lock+addl for smp_mb()
mfence appears to be way slower than a locked instruction - let's use lock+add unconditionally, as we always did on old 32-bit.

Results:
perf stat -r 10 -- ./virtio_ring_0_9 --sleep --host-affinity 0 --guest-affinity 0

Before:
  0.922565990 seconds time elapsed ( +- 1.15% )
After:
  0.578667024 seconds time elapsed
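The change itself is small; a before/after sketch (x86-64 shown, macro names hypothetical so both forms can sit side by side):

	/* before: dedicated fence instruction */
	#define smp_mb_mfence()	asm volatile("mfence" ::: "memory")

	/* after: any lock-prefixed read-modify-write is a full barrier on
	 * x86, and adding zero to a stack word is a data no-op */
	#define smp_mb_locked()	asm volatile("lock; addl $0,0(%%rsp)" ::: "memory")
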
2016 Jan 27
0
[PATCH v4 5/5] x86: drop mfence in favor of lock+addl
mfence appears to be way slower than a locked instruction - let's use lock+add unconditionally, as we always did on old 32-bit. Just poking at SP would be the most natural, but if we then read the value from SP, we get a false dependency which will slow us down. This was noted in this article: http://shipilev.net/blog/2014/on-the-fence-with-dependencies/ And is easy to reproduce by sticking
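The offset trick that follows from this observation (the upstream commit ultimately used the -4 form):

	/* natural form: touches the word at the stack pointer, so a later
	 * reload of data spilled at 0(%rsp) gains a false dependency on it */
	asm volatile("lock; addl $0,0(%%rsp)" ::: "memory");

	/* fix: operate on a word below SP instead; adding zero leaves
	 * whatever is there unchanged, so nothing live is disturbed */
	asm volatile("lock; addl $0,-4(%%rsp)" ::: "memory");
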
2020 Jul 06
0
[PATCH v3 2/6] powerpc/pseries: move some PAPR paravirt functions to their own file
Signed-off-by: Nicholas Piggin <npiggin at gmail.com>
---
 arch/powerpc/include/asm/paravirt.h | 61 +++++++++++++++++++++++++++++
 arch/powerpc/include/asm/spinlock.h | 24 +----------
 arch/powerpc/lib/locks.c            | 12 +++---
 3 files changed, 68 insertions(+), 29 deletions(-)
 create mode 100644 arch/powerpc/include/asm/paravirt.h

diff --git a/arch/powerpc/include/asm/paravirt.h
2016 Jan 12
3
[PATCH 3/4] x86,asm: Re-work smp_store_mb()
On Mon, Nov 02, 2015 at 04:06:46PM -0800, Linus Torvalds wrote:
> On Mon, Nov 2, 2015 at 12:15 PM, Davidlohr Bueso <dave at stgolabs.net> wrote:
> >
> > So I ran some experiments on an IvyBridge (2.8GHz) and the cost of XCHG is
> > constantly cheaper (by at least half the latency) than MFENCE. While there
> > was a decent amount of variation, this difference
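Measurements like these are why the store-plus-barrier helper can be folded into a single atomic exchange: XCHG with a memory operand is implicitly locked on x86 and already acts as a full barrier, so no separate MFENCE is needed. A sketch of that form:

	#define smp_store_mb(var, value) \
		do { (void)xchg(&(var), (value)); } while (0)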