Displaying 20 results from an estimated 44 matches for "lwsync".
2016 Jan 26
2
[v3,11/41] mips: reuse asm-generic/barrier.h
...> > > > (* When the Group-B sets from two different barriers involve instructions in
> > > > the same thread, within that thread one set must contain the other.
> > > >
> > > > P0 P1 P2
> > > > Rx=1 Wy=1 Wz=2
> > > > dep. lwsync lwsync
> > > > Ry=0 Wz=1 Wx=1
> > > > Rz=1
> > > >
> > > > assert(!(z=2))
> > > >
> > > > Forbidden by ppcmem, allowed by herd.
> > > > *)
> > > > {
> > > > 0:r1=x; 0:r2=y; 0:r3=z;
>...
2016 Jan 15
2
[v3,11/41] mips: reuse asm-generic/barrier.h
...are connected.
But I will admit that there are some rather strange litmus tests that
challenge this cycle-centric view, for example, the one shown below.
It turns out that herd and ppcmem disagree on the outcome. (The Power
architects side with ppcmem.)
> And I think I'm still confused on LWSYNC (in the smp_wmb case) when one
> of the stores loses a conflict, and if that scenario matters. If it
> does, we should inspect the same case for other barriers.
Indeed. I am still working on how these should be described. My
current thought is to be quite conservative on what ordering is...
2016 Jan 26
5
[v3,11/41] mips: reuse asm-generic/barrier.h
...me rather strange litmus tests that
> > challenge this cycle-centric view, for example, the one shown below.
> > It turns out that herd and ppcmem disagree on the outcome. (The Power
> > architects side with ppcmem.)
> >
> > > And I think I'm still confused on LWSYNC (in the smp_wmb case) when one
> > > of the stores loses a conflict, and if that scenario matters. If it
> > > does, we should inspect the same case for other barriers.
> >
> > Indeed. I am still working on how these should be described. My
> > current though...
2016 Jan 25
0
[v3,11/41] mips: reuse asm-generic/barrier.h
...l admit that there are some rather strange litmus tests that
> challenge this cycle-centric view, for example, the one shown below.
> It turns out that herd and ppcmem disagree on the outcome. (The Power
> architects side with ppcmem.)
>
> > And I think I'm still confused on LWSYNC (in the smp_wmb case) when one
> > of the stores loses a conflict, and if that scenario matters. If it
> > does, we should inspect the same case for other barriers.
>
> Indeed. I am still working on how these should be described. My
> current thought is to be quite conserva...
2016 Jan 26
0
[v3,11/41] mips: reuse asm-generic/barrier.h
...> > > ""
> > > (* When the Group-B sets from two different barriers involve instructions in
> > > the same thread, within that thread one set must contain the other.
> > >
> > > P0 P1 P2
> > > Rx=1 Wy=1 Wz=2
> > > dep. lwsync lwsync
> > > Ry=0 Wz=1 Wx=1
> > > Rz=1
> > >
> > > assert(!(z=2))
> > >
> > > Forbidden by ppcmem, allowed by herd.
> > > *)
> > > {
> > > 0:r1=x; 0:r2=y; 0:r3=z;
> > > 1:r1=x; 1:r2=y; 1:r3=z; 1:r4=1;...
2016 Jan 27
0
[v3,11/41] mips: reuse asm-generic/barrier.h
...n the Group-B sets from two different barriers involve instructions in
> > > > > the same thread, within that thread one set must contain the other.
> > > > >
> > > > > P0 P1 P2
> > > > > Rx=1 Wy=1 Wz=2
> > > > > dep. lwsync lwsync
> > > > > Ry=0 Wz=1 Wx=1
> > > > > Rz=1
> > > > >
> > > > > assert(!(z=2))
> > > > >
> > > > > Forbidden by ppcmem, allowed by herd.
> > > > > *)
> > > > > {
> >...
2016 Jan 15
2
[v3,11/41] mips: reuse asm-generic/barrier.h
On Fri, Jan 15, 2016 at 10:13:48AM +0100, Peter Zijlstra wrote:
> On Fri, Jan 15, 2016 at 09:55:54AM +0100, Peter Zijlstra wrote:
> > On Thu, Jan 14, 2016 at 01:29:13PM -0800, Paul E. McKenney wrote:
> > > So smp_mb() provides transitivity, as do pairs of smp_store_release()
> > > and smp_read_acquire(),
> >
> > But they provide different grades of
2016 Jan 25
2
[v3,11/41] mips: reuse asm-generic/barrier.h
...1:r1=1; 1:r2=u; 1:r3=v; 1:r4=x; 1:r5=y; 1:r6=z;
> 2:r1=1; 2:r2=u; 2:r3=v; 2:r4=x; 2:r5=y; 2:r6=z;
> 3:r1=1; 3:r2=u; 3:r3=v; 3:r4=x; 3:r5=y; 3:r6=z;
> }
> P0 | P1 | P2 | P3 ;
> lwz r9,0(r4) | lwz r9,0(r5) | lwz r9,0(r6) | stw r1,0(r3) ;
> lwsync | lwsync | lwsync | sync ;
> stw r1,0(r2) | lwz r8,0(r3) | stw r1,0(r7) | lwz r9,0(r2) ;
> lwsync | lwz r7,0(r2) | | ;
> stw r1,0(r5) | lwsync | | ;
> | stw r1,0(r6) |...
2016 Jan 15
5
[v3,11/41] mips: reuse asm-generic/barrier.h
On Thu, Jan 14, 2016 at 01:29:13PM -0800, Paul E. McKenney wrote:
> So smp_mb() provides transitivity, as do pairs of smp_store_release()
> and smp_read_acquire(),
But they provide different grades of transitivity, which is where all
the confusion lies.
smp_mb() is strongly/globally transitive, all CPUs will agree on the order.
Whereas the RCpc release+acquire is weakly so, only the two
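For readers unfamiliar with the release+acquire pairing this excerpt refers to, here is a minimal user-space sketch. It is an assumption-laden illustration, not the kernel's code: it uses C11 atomics and pthreads in place of the kernel's smp_store_release() and acquire primitives, and every name in it (payload, ready, writer, reader, run_message_passing) is hypothetical. It covers only the two-thread pairing and says nothing about what a third CPU may observe, which is the question the thread is debating.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical names throughout; this is user-space C11, not kernel code. */
static int payload;        /* plain data published by the writer */
static atomic_int ready;   /* flag carrying the release/acquire pairing */

static void *writer(void *unused)
{
    (void)unused;
    payload = 42;                                   /* plain store */
    /* Release store: orders the payload store before the flag store. */
    atomic_store_explicit(&ready, 1, memory_order_release);
    return NULL;
}

static void *reader(void *out)
{
    /* Acquire load: once it observes 1, the payload store is visible. */
    while (atomic_load_explicit(&ready, memory_order_acquire) == 0)
        ;                                           /* spin until published */
    *(int *)out = payload;
    return NULL;
}

/* Runs one writer/reader pair and returns the value the reader saw. */
int run_message_passing(void)
{
    pthread_t w, r;
    int seen = 0;
    int rc;

    payload = 0;
    atomic_store(&ready, 0);
    rc = pthread_create(&r, NULL, reader, &seen);
    assert(rc == 0);
    rc = pthread_create(&w, NULL, writer, NULL);
    assert(rc == 0);
    pthread_join(w, NULL);
    pthread_join(r, NULL);
    return seen;
}
```

Built with something like `cc -std=c11 -pthread`, a call to run_message_passing() should return the value the writer published, because the reader only reads payload after its acquire load has seen the writer's release store.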
2016 Jan 15
0
[v3,11/41] mips: reuse asm-generic/barrier.h
...:r3=v; 0:r4=x; 0:r5=y; 0:r6=z;
1:r1=1; 1:r2=u; 1:r3=v; 1:r4=x; 1:r5=y; 1:r6=z;
2:r1=1; 2:r2=u; 2:r3=v; 2:r4=x; 2:r5=y; 2:r6=z;
3:r1=1; 3:r2=u; 3:r3=v; 3:r4=x; 3:r5=y; 3:r6=z;
}
P0 | P1 | P2 | P3 ;
lwz r9,0(r4) | lwz r9,0(r5) | lwz r9,0(r6) | stw r1,0(r3) ;
lwsync | lwsync | lwsync | sync ;
stw r1,0(r2) | lwz r8,0(r3) | stw r1,0(r7) | lwz r9,0(r2) ;
lwsync | lwz r7,0(r2) | | ;
stw r1,0(r5) | lwsync | | ;
| stw r1,0(r6) | | ;
ex...
2016 Jan 05
2
[PATCH v2 15/32] powerpc: define __smp_xxx
...tions(+), 16 deletions(-)
>
> diff --git a/arch/powerpc/include/asm/barrier.h b/arch/powerpc/include/asm/barrier.h
> index 980ad0c..c0deafc 100644
> --- a/arch/powerpc/include/asm/barrier.h
> +++ b/arch/powerpc/include/asm/barrier.h
> @@ -44,19 +44,11 @@
> #define dma_rmb() __lwsync()
> #define dma_wmb() __asm__ __volatile__ (stringify_in_c(SMPWMB) : : :"memory")
>
> -#ifdef CONFIG_SMP
> -#define smp_lwsync() __lwsync()
> +#define __smp_lwsync() __lwsync()
>
so __smp_lwsync() is always mapped to lwsync, right?
> -#define smp_mb() mb()
>...
2016 Jan 26
0
[v3,11/41] mips: reuse asm-generic/barrier.h
...x; 1:r5=y; 1:r6=z;
> > 2:r1=1; 2:r2=u; 2:r3=v; 2:r4=x; 2:r5=y; 2:r6=z;
> > 3:r1=1; 3:r2=u; 3:r3=v; 3:r4=x; 3:r5=y; 3:r6=z;
> > }
> > P0 | P1 | P2 | P3 ;
> > lwz r9,0(r4) | lwz r9,0(r5) | lwz r9,0(r6) | stw r1,0(r3) ;
> > lwsync | lwsync | lwsync | sync ;
> > stw r1,0(r2) | lwz r8,0(r3) | stw r1,0(r7) | lwz r9,0(r2) ;
> > lwsync | lwz r7,0(r2) | | ;
> > stw r1,0(r5) | lwsync | | ;
> > | stw r1,...
2014 Aug 08
2
[LLVMdev] Plan to optimize atomics in LLVM
> Longer term, I hope to improve the fence elimination of the ARM backend with
> a kind of PRE algorithm. Both of these improvements to the ARM backend
> should be fairly straightforward to port to the POWER architecture later,
> and I hope to also do that.
>
> Any reason these couldn't be done at the IR level?
I definitely agree here. At the time, it was a plausible idea