Jason Wang
2019-Jan-07 03:58 UTC
[PATCH RFC 3/4] barriers: convert a control to a data dependency
On 2019/1/3 4:57, Michael S. Tsirkin wrote:
> It's not uncommon to have two access two unrelated memory locations in a
> specific order. At the moment one has to use a memory barrier for this.
>
> However, if the first access was a read and the second used an address
> depending on the first one we would have a data dependency and no
> barrier would be necessary.
>
> This adds a new interface: dependent_ptr_mb which does exactly this: it
> returns a pointer with a data dependency on the supplied value.
>
> Signed-off-by: Michael S. Tsirkin <mst at redhat.com>
> ---
>  Documentation/memory-barriers.txt | 20 ++++++++++++++++++++
>  arch/alpha/include/asm/barrier.h  |  1 +
>  include/asm-generic/barrier.h     | 18 ++++++++++++++++++
>  include/linux/compiler.h          |  4 ++++
>  4 files changed, 43 insertions(+)
>
> diff --git a/Documentation/memory-barriers.txt b/Documentation/memory-barriers.txt
> index c1d913944ad8..9dbaa2e1dbf6 100644
> --- a/Documentation/memory-barriers.txt
> +++ b/Documentation/memory-barriers.txt
> @@ -691,6 +691,18 @@ case what's actually required is:
>                  p = READ_ONCE(b);
>          }
>
> +Alternatively, a control dependency can be converted to a data dependency,
> +e.g.:
> +
> +        q = READ_ONCE(a);
> +        if (q) {
> +                b = dependent_ptr_mb(b, q);
> +                p = READ_ONCE(b);
> +        }
> +
> +Note how the result of dependent_ptr_mb must be used with the following
> +accesses in order to have an effect.
> +
>  However, stores are not speculated. This means that ordering -is- provided
>  for load-store control dependencies, as in the following example:
>
> @@ -836,6 +848,12 @@ out-guess your code. More generally, although READ_ONCE() does force
>  the compiler to actually emit code for a given load, it does not force
>  the compiler to use the results.
>
> +Converting to a data dependency helps with this too:
> +
> +        q = READ_ONCE(a);
> +        b = dependent_ptr_mb(b, q);
> +        WRITE_ONCE(b, 1);
> +
>  In addition, control dependencies apply only to the then-clause and
>  else-clause of the if-statement in question. In particular, it does
>  not necessarily apply to code following the if-statement:
> @@ -875,6 +893,8 @@ to the CPU containing it. See the section on "Multicopy atomicity"
>  for more information.
>
>
> +
> +
>  In summary:
>
>    (*) Control dependencies can order prior loads against later stores.
> diff --git a/arch/alpha/include/asm/barrier.h b/arch/alpha/include/asm/barrier.h
> index 92ec486a4f9e..b4934e8c551b 100644
> --- a/arch/alpha/include/asm/barrier.h
> +++ b/arch/alpha/include/asm/barrier.h
> @@ -59,6 +59,7 @@
>   * as Alpha, "y" could be set to 3 and "x" to 0. Use rmb()
>   * in cases like this where there are no data dependencies.
>   */
> +#define ARCH_NEEDS_READ_BARRIER_DEPENDS 1
>  #define read_barrier_depends() __asm__ __volatile__("mb": : :"memory")
>
>  #ifdef CONFIG_SMP
> diff --git a/include/asm-generic/barrier.h b/include/asm-generic/barrier.h
> index 2cafdbb9ae4c..fa2e2ef72b68 100644
> --- a/include/asm-generic/barrier.h
> +++ b/include/asm-generic/barrier.h
> @@ -70,6 +70,24 @@
>  #define __smp_read_barrier_depends() read_barrier_depends()
>  #endif
>
> +#if defined(COMPILER_HAS_OPTIMIZER_HIDE_VAR) && \
> +        !defined(ARCH_NEEDS_READ_BARRIER_DEPENDS)
> +
> +#define dependent_ptr_mb(ptr, val) ({ \
> +        long dependent_ptr_mb_val = (long)(val); \
> +        long dependent_ptr_mb_ptr = (long)(ptr) - dependent_ptr_mb_val; \
> + \
> +        BUILD_BUG_ON(sizeof(val) > sizeof(long)); \
> +        OPTIMIZER_HIDE_VAR(dependent_ptr_mb_val); \
> +        (typeof(ptr))(dependent_ptr_mb_ptr + dependent_ptr_mb_val); \
> +})
> +
> +#else
> +
> +#define dependent_ptr_mb(ptr, val) ({ mb(); (ptr); })

So for the example of patch 4, we'd better fall back to rmb() or need a
dependent_ptr_rmb()?

Thanks

> +
> +#endif
> +
>  #ifdef CONFIG_SMP
>
>  #ifndef smp_mb
> diff --git a/include/linux/compiler.h b/include/linux/compiler.h
> index 6601d39e8c48..f599c30f1b28 100644
> --- a/include/linux/compiler.h
> +++ b/include/linux/compiler.h
> @@ -152,9 +152,13 @@ void ftrace_likely_update(struct ftrace_likely_data *f, int val,
>  #endif
>
>  #ifndef OPTIMIZER_HIDE_VAR
> +
>  /* Make the optimizer believe the variable can be manipulated arbitrarily. */
>  #define OPTIMIZER_HIDE_VAR(var) \
>          __asm__ ("" : "=rm" (var) : "0" (var))
> +
> +#define COMPILER_HAS_OPTIMIZER_HIDE_VAR 1
> +
>  #endif
>
>  /* Not-quite-unique ID. */
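The dependent_ptr_mb() macro quoted in this message can be tried outside the kernel. What follows is a minimal user-space sketch, assuming GCC or Clang (statement expressions, `__typeof__`, inline asm). The names `HIDE_VAR`, `dependent_ptr`, `read_after_flag` and `check_same_address` are hypothetical and not part of the patch, and `uintptr_t` stands in for the patch's `long` so that the subtract/add pair cannot overflow a signed type. The sketch only shows that the arithmetic hands back the original pointer while the empty asm keeps the compiler from cancelling the value out; whether a given CPU honors the resulting address dependency is a hardware property it does not test.

```c
#include <assert.h>
#include <stdint.h>

/* Same idea as the kernel's OPTIMIZER_HIDE_VAR: an empty asm statement
 * that makes the compiler assume var may have been changed arbitrarily. */
#define HIDE_VAR(var) __asm__ ("" : "=r" (var) : "0" (var))

/* Hypothetical user-space stand-in for the patch's dependent_ptr_mb().
 * Returns ptr, computed as (ptr - val) + val with val hidden from the
 * optimizer, so the result depends on val at the instruction level. */
#define dependent_ptr(ptr, val) ({                                   \
	uintptr_t dp_val = (uintptr_t)(val);                         \
	uintptr_t dp_ptr = (uintptr_t)(ptr) - dp_val;                \
	_Static_assert(sizeof(val) <= sizeof(uintptr_t),             \
		       "val must fit in a pointer-sized integer");   \
	HIDE_VAR(dp_val);                                            \
	(__typeof__(ptr))(dp_ptr + dp_val);                          \
})

/* Hypothetical consumer: load a flag, then load data through a pointer
 * that carries a dependency on the flag value. The volatile casts play
 * the role of READ_ONCE(). Returns -1 when the flag is clear. */
static int read_after_flag(const int *flag, const int *data)
{
	int q = *(const volatile int *)flag;

	if (!q)
		return -1;
	data = dependent_ptr(data, q);
	return *(const volatile int *)data;
}

/* Sanity check: the macro must return exactly the address it was given. */
static void check_same_address(void)
{
	int flag = 7, data = 42;

	assert(dependent_ptr(&data, flag) == &data);
}
```

As in the documentation hunk of the patch, the returned pointer has to be the one used for the following access; calling the macro and then dereferencing the old pointer would leave no dependency for the CPU to respect.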
Michael S. Tsirkin
2019-Jan-07 04:23 UTC
[PATCH RFC 3/4] barriers: convert a control to a data dependency
On Mon, Jan 07, 2019 at 11:58:23AM +0800, Jason Wang wrote:
>
> On 2019/1/3 4:57, Michael S. Tsirkin wrote:
> > It's not uncommon to have two access two unrelated memory locations in a
> > specific order. At the moment one has to use a memory barrier for this.
> >
> > However, if the first access was a read and the second used an address
> > depending on the first one we would have a data dependency and no
> > barrier would be necessary.
> >
> > This adds a new interface: dependent_ptr_mb which does exactly this: it
> > returns a pointer with a data dependency on the supplied value.
> >
> > Signed-off-by: Michael S. Tsirkin <mst at redhat.com>
> > ---
> >  Documentation/memory-barriers.txt | 20 ++++++++++++++++++++
> >  arch/alpha/include/asm/barrier.h  |  1 +
> >  include/asm-generic/barrier.h     | 18 ++++++++++++++++++
> >  include/linux/compiler.h          |  4 ++++
> >  4 files changed, 43 insertions(+)
> >
> > diff --git a/Documentation/memory-barriers.txt b/Documentation/memory-barriers.txt
> > index c1d913944ad8..9dbaa2e1dbf6 100644
> > --- a/Documentation/memory-barriers.txt
> > +++ b/Documentation/memory-barriers.txt
> > @@ -691,6 +691,18 @@ case what's actually required is:
> >                  p = READ_ONCE(b);
> >          }
> > +Alternatively, a control dependency can be converted to a data dependency,
> > +e.g.:
> > +
> > +        q = READ_ONCE(a);
> > +        if (q) {
> > +                b = dependent_ptr_mb(b, q);
> > +                p = READ_ONCE(b);
> > +        }
> > +
> > +Note how the result of dependent_ptr_mb must be used with the following
> > +accesses in order to have an effect.
> > +
> >  However, stores are not speculated. This means that ordering -is- provided
> >  for load-store control dependencies, as in the following example:
> > @@ -836,6 +848,12 @@ out-guess your code. More generally, although READ_ONCE() does force
> >  the compiler to actually emit code for a given load, it does not force
> >  the compiler to use the results.
> > +Converting to a data dependency helps with this too:
> > +
> > +        q = READ_ONCE(a);
> > +        b = dependent_ptr_mb(b, q);
> > +        WRITE_ONCE(b, 1);
> > +
> >  In addition, control dependencies apply only to the then-clause and
> >  else-clause of the if-statement in question. In particular, it does
> >  not necessarily apply to code following the if-statement:
> > @@ -875,6 +893,8 @@ to the CPU containing it. See the section on "Multicopy atomicity"
> >  for more information.
> > +
> > +
> >  In summary:
> >    (*) Control dependencies can order prior loads against later stores.
> > diff --git a/arch/alpha/include/asm/barrier.h b/arch/alpha/include/asm/barrier.h
> > index 92ec486a4f9e..b4934e8c551b 100644
> > --- a/arch/alpha/include/asm/barrier.h
> > +++ b/arch/alpha/include/asm/barrier.h
> > @@ -59,6 +59,7 @@
> >   * as Alpha, "y" could be set to 3 and "x" to 0. Use rmb()
> >   * in cases like this where there are no data dependencies.
> >   */
> > +#define ARCH_NEEDS_READ_BARRIER_DEPENDS 1
> >  #define read_barrier_depends() __asm__ __volatile__("mb": : :"memory")
> >  #ifdef CONFIG_SMP
> > diff --git a/include/asm-generic/barrier.h b/include/asm-generic/barrier.h
> > index 2cafdbb9ae4c..fa2e2ef72b68 100644
> > --- a/include/asm-generic/barrier.h
> > +++ b/include/asm-generic/barrier.h
> > @@ -70,6 +70,24 @@
> >  #define __smp_read_barrier_depends() read_barrier_depends()
> >  #endif
> > +#if defined(COMPILER_HAS_OPTIMIZER_HIDE_VAR) && \
> > +        !defined(ARCH_NEEDS_READ_BARRIER_DEPENDS)
> > +
> > +#define dependent_ptr_mb(ptr, val) ({ \
> > +        long dependent_ptr_mb_val = (long)(val); \
> > +        long dependent_ptr_mb_ptr = (long)(ptr) - dependent_ptr_mb_val; \
> > + \
> > +        BUILD_BUG_ON(sizeof(val) > sizeof(long)); \
> > +        OPTIMIZER_HIDE_VAR(dependent_ptr_mb_val); \
> > +        (typeof(ptr))(dependent_ptr_mb_ptr + dependent_ptr_mb_val); \
> > +})
> > +
> > +#else
> > +
> > +#define dependent_ptr_mb(ptr, val) ({ mb(); (ptr); })
>
>
> So for the example of patch 4, we'd better fall back to rmb() or need a
> dependent_ptr_rmb()?
>
> Thanks

You mean for strongly ordered architectures like Intel?
Yes, maybe it makes sense to have dependent_ptr_smp_rmb,
dependent_ptr_dma_rmb and dependent_ptr_virt_rmb.

mb variant is unused right now so I'll remove it.

>
> > +
> > +#endif
> > +
> >  #ifdef CONFIG_SMP
> >  #ifndef smp_mb
> > diff --git a/include/linux/compiler.h b/include/linux/compiler.h
> > index 6601d39e8c48..f599c30f1b28 100644
> > --- a/include/linux/compiler.h
> > +++ b/include/linux/compiler.h
> > @@ -152,9 +152,13 @@ void ftrace_likely_update(struct ftrace_likely_data *f, int val,
> >  #endif
> >  #ifndef OPTIMIZER_HIDE_VAR
> > +
> >  /* Make the optimizer believe the variable can be manipulated arbitrarily. */
> >  #define OPTIMIZER_HIDE_VAR(var) \
> >          __asm__ ("" : "=rm" (var) : "0" (var))
> > +
> > +#define COMPILER_HAS_OPTIMIZER_HIDE_VAR 1
> > +
> >  #endif
> >  /* Not-quite-unique ID. */
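The read-barrier variant discussed in this exchange could look roughly like the following. This is a user-space sketch under stated assumptions, not code from the series: `dependent_ptr_rmb`, `DEMO_FORCE_FENCE` and `read_payload` are hypothetical names, GCC or Clang is required, and a C11 acquire fence is used as a stand-in for the kernel's rmb()-style barriers. It shows the shape of the idea: use the data dependency where that is allowed, and otherwise fall back to a read-ordering fence rather than a full mb().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* DEMO_FORCE_FENCE is a hypothetical switch standing in for the patch's
 * COMPILER_HAS_OPTIMIZER_HIDE_VAR / ARCH_NEEDS_READ_BARRIER_DEPENDS test. */
#ifndef DEMO_FORCE_FENCE

/* Dependency version: return ptr as (ptr - val) + val, with val hidden
 * from the optimizer by an empty asm so the arithmetic is not folded. */
#define dependent_ptr_rmb(ptr, val) ({                          \
	uintptr_t dp_val = (uintptr_t)(val);                    \
	uintptr_t dp_ptr = (uintptr_t)(ptr) - dp_val;           \
	__asm__ ("" : "=r" (dp_val) : "0" (dp_val));            \
	(__typeof__(ptr))(dp_ptr + dp_val);                     \
})

#else

/* Fallback version: no dependency, so order the earlier load against
 * later accesses with a fence. An acquire fence is the assumed
 * user-space substitute for a read barrier here. */
#define dependent_ptr_rmb(ptr, val) ({                          \
	(void)(val);                                            \
	atomic_thread_fence(memory_order_acquire);              \
	(ptr);                                                  \
})

#endif

/* Hypothetical reader: load a sequence number, then load the payload
 * through the pointer returned by dependent_ptr_rmb(). */
static int read_payload(const unsigned *seq, const int *payload)
{
	unsigned s = *(const volatile unsigned *)seq;

	payload = dependent_ptr_rmb(payload, s);
	return *(const volatile int *)payload;
}
```

Both branches return the same pointer value, so callers are written once and the choice between dependency and fence stays inside the macro. Separate smp, dma and virt flavours, as suggested in the reply, would differ only in which barrier the fallback branch uses.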
Jason Wang
2019-Jan-07 06:50 UTC
[PATCH RFC 3/4] barriers: convert a control to a data dependency
On 2019/1/7 12:23, Michael S. Tsirkin wrote:
> On Mon, Jan 07, 2019 at 11:58:23AM +0800, Jason Wang wrote:
>> On 2019/1/3 4:57, Michael S. Tsirkin wrote:
>>> It's not uncommon to have two access two unrelated memory locations in a
>>> specific order. At the moment one has to use a memory barrier for this.
>>>
>>> However, if the first access was a read and the second used an address
>>> depending on the first one we would have a data dependency and no
>>> barrier would be necessary.
>>>
>>> This adds a new interface: dependent_ptr_mb which does exactly this: it
>>> returns a pointer with a data dependency on the supplied value.
>>>
>>> Signed-off-by: Michael S. Tsirkin <mst at redhat.com>
>>> ---
>>>  Documentation/memory-barriers.txt | 20 ++++++++++++++++++++
>>>  arch/alpha/include/asm/barrier.h  |  1 +
>>>  include/asm-generic/barrier.h     | 18 ++++++++++++++++++
>>>  include/linux/compiler.h          |  4 ++++
>>>  4 files changed, 43 insertions(+)
>>>
>>> diff --git a/Documentation/memory-barriers.txt b/Documentation/memory-barriers.txt
>>> index c1d913944ad8..9dbaa2e1dbf6 100644
>>> --- a/Documentation/memory-barriers.txt
>>> +++ b/Documentation/memory-barriers.txt
>>> @@ -691,6 +691,18 @@ case what's actually required is:
>>>                  p = READ_ONCE(b);
>>>          }
>>> +Alternatively, a control dependency can be converted to a data dependency,
>>> +e.g.:
>>> +
>>> +        q = READ_ONCE(a);
>>> +        if (q) {
>>> +                b = dependent_ptr_mb(b, q);
>>> +                p = READ_ONCE(b);
>>> +        }
>>> +
>>> +Note how the result of dependent_ptr_mb must be used with the following
>>> +accesses in order to have an effect.
>>> +
>>>  However, stores are not speculated. This means that ordering -is- provided
>>>  for load-store control dependencies, as in the following example:
>>> @@ -836,6 +848,12 @@ out-guess your code. More generally, although READ_ONCE() does force
>>>  the compiler to actually emit code for a given load, it does not force
>>>  the compiler to use the results.
>>> +Converting to a data dependency helps with this too:
>>> +
>>> +        q = READ_ONCE(a);
>>> +        b = dependent_ptr_mb(b, q);
>>> +        WRITE_ONCE(b, 1);
>>> +
>>>  In addition, control dependencies apply only to the then-clause and
>>>  else-clause of the if-statement in question. In particular, it does
>>>  not necessarily apply to code following the if-statement:
>>> @@ -875,6 +893,8 @@ to the CPU containing it. See the section on "Multicopy atomicity"
>>>  for more information.
>>> +
>>> +
>>>  In summary:
>>>    (*) Control dependencies can order prior loads against later stores.
>>> diff --git a/arch/alpha/include/asm/barrier.h b/arch/alpha/include/asm/barrier.h
>>> index 92ec486a4f9e..b4934e8c551b 100644
>>> --- a/arch/alpha/include/asm/barrier.h
>>> +++ b/arch/alpha/include/asm/barrier.h
>>> @@ -59,6 +59,7 @@
>>>   * as Alpha, "y" could be set to 3 and "x" to 0. Use rmb()
>>>   * in cases like this where there are no data dependencies.
>>>   */
>>> +#define ARCH_NEEDS_READ_BARRIER_DEPENDS 1
>>>  #define read_barrier_depends() __asm__ __volatile__("mb": : :"memory")
>>>  #ifdef CONFIG_SMP
>>> diff --git a/include/asm-generic/barrier.h b/include/asm-generic/barrier.h
>>> index 2cafdbb9ae4c..fa2e2ef72b68 100644
>>> --- a/include/asm-generic/barrier.h
>>> +++ b/include/asm-generic/barrier.h
>>> @@ -70,6 +70,24 @@
>>>  #define __smp_read_barrier_depends() read_barrier_depends()
>>>  #endif
>>> +#if defined(COMPILER_HAS_OPTIMIZER_HIDE_VAR) && \
>>> +        !defined(ARCH_NEEDS_READ_BARRIER_DEPENDS)
>>> +
>>> +#define dependent_ptr_mb(ptr, val) ({ \
>>> +        long dependent_ptr_mb_val = (long)(val); \
>>> +        long dependent_ptr_mb_ptr = (long)(ptr) - dependent_ptr_mb_val; \
>>> + \
>>> +        BUILD_BUG_ON(sizeof(val) > sizeof(long)); \
>>> +        OPTIMIZER_HIDE_VAR(dependent_ptr_mb_val); \
>>> +        (typeof(ptr))(dependent_ptr_mb_ptr + dependent_ptr_mb_val); \
>>> +})
>>> +
>>> +#else
>>> +
>>> +#define dependent_ptr_mb(ptr, val) ({ mb(); (ptr); })
>> So for the example of patch 4, we'd better fall back to rmb() or need a
>> dependent_ptr_rmb()?
>>
>> Thanks
> You mean for strongly ordered architectures like Intel?
> Yes, maybe it makes sense to have dependent_ptr_smp_rmb,
> dependent_ptr_dma_rmb and dependent_ptr_virt_rmb.
>
> mb variant is unused right now so I'll remove it.
>
>

Yes.

Thanks