search for: x86_feature_xmm2

Displaying 18 results from an estimated 18 matches for "x86_feature_xmm2".

2016 Jan 27
6
[PATCH v4 0/5] x86: faster smp_mb()+documentation tweaks
mb() typically uses mfence on modern x86, but a micro-benchmark shows that it's 2 to 3 times slower than lock; addl that we use on older CPUs. So we really should use the locked variant everywhere, except that intel manual says that clflush is only ordered by mfence, so we can't. Note: some callers of clflush seems to assume sfence will order it, so there could be existing bugs around
2016 Jan 27
6
[PATCH v4 0/5] x86: faster smp_mb()+documentation tweaks
mb() typically uses mfence on modern x86, but a micro-benchmark shows that it's 2 to 3 times slower than lock; addl that we use on older CPUs. So we really should use the locked variant everywhere, except that intel manual says that clflush is only ordered by mfence, so we can't. Note: some callers of clflush seems to assume sfence will order it, so there could be existing bugs around
2016 Jan 28
0
[PATCH v5 1/5] x86: add cc clobber for addl
...x a584e1c..a65bdb1 100644 --- a/arch/x86/include/asm/barrier.h +++ b/arch/x86/include/asm/barrier.h @@ -15,9 +15,12 @@ * Some non-Intel clones support out of order store. wmb() ceases to be a * nop for these. */ -#define mb() alternative("lock; addl $0,0(%%esp)", "mfence", X86_FEATURE_XMM2) -#define rmb() alternative("lock; addl $0,0(%%esp)", "lfence", X86_FEATURE_XMM2) -#define wmb() alternative("lock; addl $0,0(%%esp)", "sfence", X86_FEATURE_XMM) +#define mb() asm volatile(ALTERNATIVE("lock; addl $0,0(%%esp)", "mfence", \...
2016 Jan 13
6
[PATCH v3 0/4] x86: faster mb()+documentation tweaks
mb() typically uses mfence on modern x86, but a micro-benchmark shows that it's 2 to 3 times slower than lock; addl that we use on older CPUs. So let's use the locked variant everywhere. While I was at it, I found some inconsistencies in comments in arch/x86/include/asm/barrier.h The documentation fixes are included first - I verified that they do not change the generated code at all.
2016 Jan 13
6
[PATCH v3 0/4] x86: faster mb()+documentation tweaks
mb() typically uses mfence on modern x86, but a micro-benchmark shows that it's 2 to 3 times slower than lock; addl that we use on older CPUs. So let's use the locked variant everywhere. While I was at it, I found some inconsistencies in comments in arch/x86/include/asm/barrier.h The documentation fixes are included first - I verified that they do not change the generated code at all.
2016 Jan 28
10
[PATCH v5 0/5] x86: faster smp_mb()+documentation tweaks
mb() typically uses mfence on modern x86, but a micro-benchmark shows that it's 2 to 3 times slower than lock; addl that we use on older CPUs. So we really should use the locked variant everywhere, except that intel manual says that clflush is only ordered by mfence, so we can't. Note: some callers of clflush seems to assume sfence will order it, so there could be existing bugs around
2016 Jan 28
10
[PATCH v5 0/5] x86: faster smp_mb()+documentation tweaks
mb() typically uses mfence on modern x86, but a micro-benchmark shows that it's 2 to 3 times slower than lock; addl that we use on older CPUs. So we really should use the locked variant everywhere, except that intel manual says that clflush is only ordered by mfence, so we can't. Note: some callers of clflush seems to assume sfence will order it, so there could be existing bugs around
2017 Oct 27
1
[PATCH v6] x86: use lock+addl for smp_mb()
...rrier.h +++ b/arch/x86/include/asm/barrier.h @@ -11,11 +11,11 @@ */ #ifdef CONFIG_X86_32 -#define mb() asm volatile(ALTERNATIVE("lock; addl $0,0(%%esp)", "mfence", \ +#define mb() asm volatile(ALTERNATIVE("lock; addl $0,-4(%%esp)", "mfence", \ X86_FEATURE_XMM2) ::: "memory", "cc") -#define rmb() asm volatile(ALTERNATIVE("lock; addl $0,0(%%esp)", "lfence", \ +#define rmb() asm volatile(ALTERNATIVE("lock; addl $0,-4(%%esp)", "lfence", \ X86_FEATURE_XMM2) ::: "memory", "c...
2017 Oct 27
1
[PATCH v6] x86: use lock+addl for smp_mb()
...rrier.h +++ b/arch/x86/include/asm/barrier.h @@ -11,11 +11,11 @@ */ #ifdef CONFIG_X86_32 -#define mb() asm volatile(ALTERNATIVE("lock; addl $0,0(%%esp)", "mfence", \ +#define mb() asm volatile(ALTERNATIVE("lock; addl $0,-4(%%esp)", "mfence", \ X86_FEATURE_XMM2) ::: "memory", "cc") -#define rmb() asm volatile(ALTERNATIVE("lock; addl $0,0(%%esp)", "lfence", \ +#define rmb() asm volatile(ALTERNATIVE("lock; addl $0,-4(%%esp)", "lfence", \ X86_FEATURE_XMM2) ::: "memory", "c...
2016 Jan 27
0
[PATCH v4 5/5] x86: drop mfence in favor of lock+addl
...rrier.h +++ b/arch/x86/include/asm/barrier.h @@ -11,11 +11,11 @@ */ #ifdef CONFIG_X86_32 -#define mb() asm volatile(ALTERNATIVE("lock; addl $0,0(%%esp)", "mfence", \ +#define mb() asm volatile(ALTERNATIVE("lock; addl $0,-4(%%esp)", "mfence", \ X86_FEATURE_XMM2) ::: "memory", "cc") -#define rmb() asm volatile(ALTERNATIVE("lock; addl $0,0(%%esp)", "lfence", \ +#define rmb() asm volatile(ALTERNATIVE("lock; addl $0,-4(%%esp)", "lfence", \ X86_FEATURE_XMM2) ::: "memory", "c...
2016 Jan 12
7
[PATCH v2 0/3] x86: faster mb()+other barrier.h tweaks
mb() typically uses mfence on modern x86, but a micro-benchmark shows that it's 2 to 3 times slower than lock; addl $0,(%%e/rsp) that we use on older CPUs. So let's use the locked variant everywhere - helps keep the code simple as well. While I was at it, I found some inconsistencies in comments in arch/x86/include/asm/barrier.h I hope I'm not splitting this up too much - the reason
2016 Jan 12
7
[PATCH v2 0/3] x86: faster mb()+other barrier.h tweaks
mb() typically uses mfence on modern x86, but a micro-benchmark shows that it's 2 to 3 times slower than lock; addl $0,(%%e/rsp) that we use on older CPUs. So let's use the locked variant everywhere - helps keep the code simple as well. While I was at it, I found some inconsistencies in comments in arch/x86/include/asm/barrier.h I hope I'm not splitting this up too much - the reason
2007 Apr 18
1
[RFC] [PATCH] Split host arch headers for UML's benefit
.... wmb() ceases to be a + * nop for these. + */ + + +/* + * Actually only lfence would be needed for mb() because all stores done + * by the kernel should be already ordered. But keep a full barrier for now. + */ + +#define mb() alternative("lock; addl $0,0(%%esp)", "mfence", X86_FEATURE_XMM2) +#define rmb() alternative("lock; addl $0,0(%%esp)", "lfence", X86_FEATURE_XMM2) + +/** + * read_barrier_depends - Flush all pending reads that subsequents reads + * depend on. + * + * No data-dependent reads from memory-like regions are ever reordered + * over this barrier. A...
2007 Apr 18
1
[RFC] [PATCH] Split host arch headers for UML's benefit
.... wmb() ceases to be a + * nop for these. + */ + + +/* + * Actually only lfence would be needed for mb() because all stores done + * by the kernel should be already ordered. But keep a full barrier for now. + */ + +#define mb() alternative("lock; addl $0,0(%%esp)", "mfence", X86_FEATURE_XMM2) +#define rmb() alternative("lock; addl $0,0(%%esp)", "lfence", X86_FEATURE_XMM2) + +/** + * read_barrier_depends - Flush all pending reads that subsequents reads + * depend on. + * + * No data-dependent reads from memory-like regions are ever reordered + * over this barrier. A...
2016 Jan 12
0
[PATCH v2 2/3] x86: drop a comment left over from X86_OOSTORE
.../ #ifdef CONFIG_X86_32 -/* - * Some non-Intel clones support out of order store. wmb() ceases to be a - * nop for these. - */ - #define mb() asm volatile("lock; addl $0,0(%%esp)" ::: "memory") #define rmb() alternative("lock; addl $0,0(%%esp)", "lfence", X86_FEATURE_XMM2) #define wmb() alternative("lock; addl $0,0(%%esp)", "sfence", X86_FEATURE_XMM) -- MST
2016 Jan 28
0
[PATCH v5 2/5] x86: drop a comment left over from X86_OOSTORE
...r.h +++ b/arch/x86/include/asm/barrier.h @@ -11,10 +11,6 @@ */ #ifdef CONFIG_X86_32 -/* - * Some non-Intel clones support out of order store. wmb() ceases to be a - * nop for these. - */ #define mb() asm volatile(ALTERNATIVE("lock; addl $0,0(%%esp)", "mfence", \ X86_FEATURE_XMM2) ::: "memory", "cc") #define rmb() asm volatile(ALTERNATIVE("lock; addl $0,0(%%esp)", "lfence", \ -- MST
2016 Jan 12
3
[PATCH 3/4] x86,asm: Re-work smp_store_mb()
...a584e1c..f0d36e2 100644 --- a/arch/x86/include/asm/barrier.h +++ b/arch/x86/include/asm/barrier.h @@ -15,15 +15,15 @@ * Some non-Intel clones support out of order store. wmb() ceases to be a * nop for these. */ -#define mb() alternative("lock; addl $0,0(%%esp)", "mfence", X86_FEATURE_XMM2) +#define mb() asm volatile("lock; addl $0,0(%%esp)":::"memory") #define rmb() alternative("lock; addl $0,0(%%esp)", "lfence", X86_FEATURE_XMM2) #define wmb() alternative("lock; addl $0,0(%%esp)", "sfence", X86_FEATURE_XMM) #else +#defi...
2016 Jan 12
3
[PATCH 3/4] x86,asm: Re-work smp_store_mb()
...a584e1c..f0d36e2 100644 --- a/arch/x86/include/asm/barrier.h +++ b/arch/x86/include/asm/barrier.h @@ -15,15 +15,15 @@ * Some non-Intel clones support out of order store. wmb() ceases to be a * nop for these. */ -#define mb() alternative("lock; addl $0,0(%%esp)", "mfence", X86_FEATURE_XMM2) +#define mb() asm volatile("lock; addl $0,0(%%esp)":::"memory") #define rmb() alternative("lock; addl $0,0(%%esp)", "lfence", X86_FEATURE_XMM2) #define wmb() alternative("lock; addl $0,0(%%esp)", "sfence", X86_FEATURE_XMM) #else +#defi...