thr3ads.net - llvm dev - [llvm-dev] Memory barrier problem [Feb 2021]

Jeroen Dobbelaere via llvm-dev

2021-Feb-04 08:04 UTC

[llvm-dev] Memory barrier problem

> >So a weaker `noalias` or a way to mark uses seems therefore required for
> >`noalias` deduction.
> 
> Appears to be that way. Can we do that w/o having a weaker restrict in the
> language spec?
The full restrict[0] implementation does not depend on the 'noalias' attribute
on function arguments. The attribute is even too strong for just mapping a
'C99 restrict pointer argument' to a 'LLVM-IR noalias pointer argument'.
For backwards compatibility, I kept the default mapping of restrict pointer
arguments onto 'noalias' and provided the '-fno-noalias-arguments' option
to disable this mapping.
For some code, this can result in a wrong 'based on' deduction.[1]

Given that, IMHO, it still makes sense to have a strong and a weaker version of
the 'noalias argument attribute'. At least, the stronger (current 'noalias')
version can be converted to the noalias scope/intrinsics mapping during
inlining, keeping the strong guarantees.
Converting a weaker version likely will need some more tweaking.

The noalias attribute is also used for a struct pointer argument when a function
returns a struct.
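
As a small illustration of that last point (a sketch, not taken from the patches): on common ABIs such as x86-64 SysV, a struct of this size is usually returned through a hidden pointer parameter, and clang typically marks that parameter `sret` and `noalias` in the IR. The names `struct Big`, `make_big` and `check_make_big` are made up for this example, and the exact lowering depends on target and struct size.

```c
#include <assert.h>

/* Hypothetical struct, large enough that it is normally returned in memory. */
struct Big {
  long v[8];
};

/* The caller provides the storage for the result; the compiler passes its
   address as a hidden argument, which is where 'noalias' usually shows up. */
struct Big make_big(long seed) {
  struct Big r;
  for (int i = 0; i < 8; ++i)
    r.v[i] = seed + i;
  return r;
}

/* Plain usage check; says nothing about the attributes themselves. */
void check_make_big(void) {
  struct Big b = make_big(10);
  assert(b.v[0] == 10 && b.v[7] == 17);
}
```

Compiling with `clang -S -emit-llvm` and looking at the signature of `make_big` should show whether the hidden argument carries `noalias` on a given target.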

Greetings,

Jeroen Dobbelaere
[0] Full Restrict patches: https://reviews.llvm.org/D68484
[1] 'clang/test/CodeGen/restrict/arg_reuse.c' testcase in:
https://reviews.llvm.org/D68521
> 
> -----Original Message-----
> From: Johannes Doerfert <johannesdoerfert at gmail.com>
> Sent: Wednesday, February 3, 2021 10:52 AM
> To: Jeroen Dobbelaere <Jeroen.Dobbelaere at synopsys.com>; Saito, Hideki
> <hideki.saito at intel.com>; Kaylor, Andrew <andrew.kaylor at intel.com>
> Cc: llvm-dev at lists.llvm.org
> Subject: Re: [llvm-dev] Memory barrier problem
> 
> 
> On 2/3/21 12:44 PM, Jeroen Dobbelaere wrote:
> >>>> W.r.t. restrict, I'd like to hear more from the language lawyers on
> >>>> their original intent when the language construct was born and the
> >>>> current interpretation of it in the presence of threading.
> >>> I would have assumed `__restrict` predates "common" multi-processing in C.
> >> Since the language of restrict is to this day implying other threads
> >> cannot access those pointers, I would not dare to argue we should
> >> weaken it in order to deduce `noalias`.
> >>> ~ Johannes
> >>>
> > Having interacted recently with wg14 to get a better understanding
> > about some of the corner cases around restrict, I can add the following:
> >
> > One way to look at a restrict pointer[1], is as if you get a local array.
> > That means that following code:
> >
> >    void foo_a(int *restrict rpDest, int *restrict rpSrc, int n) {
> >       for (int i=0; i<n; ++i)
> >         rpDest[i] = rpSrc[i]+1;
> >    }
> >
> > is allowed to behave as if it was written as follows:
> >    void foo_b(int *pDest, int *pSrc, int n) {
> >      int localDest[n];
> >      int localSrc[n];
> >      memcpy(&localDest[0], pDest, n*sizeof(int));
> >      memcpy(&localSrc[0], pSrc, n*sizeof(int));
> >      for (int i=0; i<n; ++i)
> >         localDest[i] = localSrc[i]+1;
> >      memcpy(pDest, &localDest[0], n*sizeof(int));
> >    }
> >
> > Calling foo_a and foo_b with overlapping arrays can show different
> > results, depending on how the loop was optimized. That is an
> > indication that this usage of 'foo_a' is triggering undefined behavior
> > and should not be done.
> 
> The way I interpret this is consistent with Eli's opinion and what we
> basically do so far: restrict is stronger than synchronization since the
> local arrays are not synchronized across threads. If two threads access the
> same memory (even well synchronized) it breaks the restrict requirement and
> is therefore UB.
> 
> So a weaker `noalias` or a way to mark uses seems therefore required for
> `noalias` deduction.
> 
> ~ Johannes
> 
> 
> > Wrt threading: as long as the restrict pointer (rpDest, rpSrc;
> > localDest, localSrc) is not escaping, a different thread should not be
> > able to access the memory, as there is no way it can get a pointer
> > 'based on' the restrict pointer.
> >
> > Note [1]: things get more interesting when having a 'pointer to a
> > restrict pointer' (aka int *restrict *prp).
> >
> > Greetings,
> >
> > Jeroen Dobbelaere
> >
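
For anyone who wants to try the quoted "local array" model, here is a compilable sketch of the two functions. The helper `agree_on_disjoint_input` is a hypothetical addition for this example, and the `assert` on `n` is only there because a variable-length array needs a positive size. The sketch only shows that both versions agree when the arrays do not overlap; it deliberately never calls `foo_a` with overlapping arrays, since that is the undefined case discussed above.

```c
#include <assert.h>
#include <string.h>

/* Restrict version, as in the quoted message. */
void foo_a(int *restrict rpDest, int *restrict rpSrc, int n) {
  for (int i = 0; i < n; ++i)
    rpDest[i] = rpSrc[i] + 1;
}

/* "Local array" model: copy in, compute on private copies, copy out. */
void foo_b(int *pDest, int *pSrc, int n) {
  assert(n > 0); /* a variable-length array needs a positive size */
  int localDest[n];
  int localSrc[n];
  memcpy(localDest, pDest, n * sizeof(int));
  memcpy(localSrc, pSrc, n * sizeof(int));
  for (int i = 0; i < n; ++i)
    localDest[i] = localSrc[i] + 1;
  memcpy(pDest, localDest, n * sizeof(int));
}

/* Hypothetical helper: returns 1 if both versions agree on disjoint arrays. */
int agree_on_disjoint_input(void) {
  int src[4] = {1, 2, 3, 4};
  int a[4] = {0};
  int b[4] = {0};
  foo_a(a, src, 4);
  foo_b(b, src, 4);
  return memcmp(a, b, sizeof a) == 0;
}
```

Note that `foo_b` stays well defined for overlapping arguments, because every element is read from the private copy made before the loop starts. A plain forward loop over the same overlapping buffers could instead pick up values it has just written, which is the kind of difference the quoted message refers to.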

Johannes Doerfert via llvm-dev

2021-Feb-12 19:41 UTC

[llvm-dev] Memory barrier problem

On 2/4/21 2:04 AM, Jeroen Dobbelaere wrote:
>>> So a weaker `noalias` or a way to mark uses seems therefore required for
>>> `noalias` deduction.
>> Appears to be that way. Can we do that w/o having a weaker restrict in the
>> language spec?
> The full restrict[0] implementation does not depend on the 'noalias'
> attribute on function arguments. The attribute is even too strong for just
> mapping a 'C99 restrict pointer argument' to a 'LLVM-IR noalias pointer
> argument'.
> For backwards compatibility, I kept the default mapping of restrict pointer
> arguments onto 'noalias' and provided the '-fno-noalias-arguments' option
> to disable this mapping.
> For some code, this can result in a wrong 'based on' deduction.[1]
>
> Given that, IMHO, it still makes sense to have a strong and a weaker version
> of the 'noalias argument attribute'. At least, the stronger (current
> 'noalias') version can be converted to the noalias scope/intrinsics mapping
> during inlining, keeping the strong guarantees.
> Converting a weaker version likely will need some more tweaking.
>
> The noalias attribute is also used for a struct pointer argument when a
> function returns a struct.
Interesting. I guess we would not keep it for restrict if it is too strong,
but there are other uses where the guarantees are useful, I believe.

Given that full restrict will make the `__restrict` problem go away, let's
look at the deduction one.

What if we make `nosync` a value/pointer attribute as well and then have:

   `noalias`
   Does not alias other pointers in scope but synchronizing events might still
   change the value because other threads might have the same "expression".
   That is, we declare the deductions as correct by weakening `noalias`.

   `nosync`
   The value is not modified in this scope by another thread.

   `noalias` + `nosync`
   Matches the `__restrict` guarantee that nothing not based on the pointer
   can modify it, so this is "stronger" than synchronization events and you
   can forward over fences/barriers.
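
To make "forward over fences/barriers" concrete, here is a rough C sketch. It assumes the proposed split is adopted and that a `restrict` argument would then map to `noalias` + `nosync`. The function names are invented for this example, and nothing here claims this is how LLVM behaves today. Under that assumption, the second load in `load_twice` could be replaced by the first one, giving the hand-forwarded `load_once`. With the weakened `noalias` alone, the fence would block that rewrite.

```c
#include <assert.h>
#include <stdatomic.h>

/* Two loads of *p separated by a full fence. */
int load_twice(int *restrict p) {
  int before = *p;
  atomic_thread_fence(memory_order_seq_cst);
  int after = *p; /* candidate for forwarding if p is noalias + nosync */
  return before + after;
}

/* What the function looks like after forwarding the first load by hand. */
int load_once(int *restrict p) {
  int before = *p;
  atomic_thread_fence(memory_order_seq_cst);
  return before + before;
}

/* Single-threaded check: both forms must agree when no other thread exists. */
void single_threaded_check(void) {
  int x = 21;
  assert(load_twice(&x) == load_once(&x));
}
```

The two functions only differ observably if another thread writes `*p` around the fence, which is exactly the situation the `nosync` part is meant to rule out.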

WDYT?

~ Johannes

