thr3ads.net - llvm dev - [LLVMdev] LLVM Concurrency and Undef [Aug 2011]

If this information is useful, please help other people find it:
Share via:

Jianzhou Zhao

2011-Aug-22 23:20 UTC

[LLVMdev] LLVM Concurrency and Undef

On Mon, Aug 22, 2011 at 6:49 PM, Eli Friedman <eli.friedman at gmail.com>
wrote:> On Mon, Aug 22, 2011 at 3:40 PM, Jianzhou Zhao <jianzhou at
seas.upenn.edu> wrote:
>> On Mon, Aug 22, 2011 at 6:08 PM, Eli Friedman <eli.friedman at
gmail.com> wrote:
>>> On Mon, Aug 22, 2011 at 2:49 PM, Santosh Nagarakatte
>>> <santosh.nagarakatte at gmail.com> wrote:
>>>> Hi all,
>>>>
>>>> I have been trying to understand the use of undef in both
sequential
>>>> and concurrent programs.
>>>>
>>>> >From the LLVM Language Reference Manual, I see the
following
>>>> definition of undef.
>>>> "Undef can be used anywhere a constant is expected, and
indicates that
>>>> the user of the value may receive an unspecified
bit-pattern".
>>>>  LLVM Language Reference manual also demonstrates how
optimizers can
>>>> use these undef values to  optimize the program.
>>>>
>>>> However, on the other hand, with the LLVM Atomics and
Concurrency
>>>> Guide states that
>>>> If code accesses a memory location from multiple threads at the
same
>>>> time, the resulting loads return 'undef'.
>>>> This is different from the C++ memory model, which provides
undefined
>>>> behavior. What is the rationale for returning an undef on
racing
>>>> reads?
>>>>
>>>> LLVM Atomics and Concurrency guide also states the following
>>>> "Note that speculative loads are allowed; a load which is
part of a
>>>> race returns undef, but does not have undefined behavior"
>>>>
>>>> If the speculative loads returns an undef and the returned
value is
>>>> used, then it results in an undefined behavior. Am I correct?
>>>
>>> It behaves like any other undef value... which do often lead to
>>> undefined behavior.
>>>
>>>> If so, what is the purpose of returning an undef with a
speculative load?
>>>> Is it to ensure that the subsequent uses of the value of the
>>>> speculatively introduced load is caught/detected by the
optimization?
>>>
>>> The point is primarily to allow optimizations like LICM to
introduce
>>> loads whose value is never used.  It also keeps consistent
semantics
>>> through CodeGen, where some targets widen loads.
>>>
>>>> Is it possible to separate the "undef" in a
sequential setting and
>>>> "undef" with speculative loads in a concurrent
setting with separate
>>>> undefs?
>>>
>>> The intention is that they should have the same semantics.
>>
>> Suppose there are three threads T1, T2 and T3,
>> T1 (S1 )stores to a location l as non-atomic,
>> T2 then (S2)stores to l as SC-atomic,
>> later T3 (L3)loads from l as SC-atomic.
>>
>> I think the load @ T3 should return undef, since it can see both
>> writes from T1 T2. Then the question is if the SC store @ T2 --- S2
>> and the SC load @ T3 --- L3 introduces an acq/rel (synchronized-with)
>> edge.
>>
>> This will affect if later conflicting accesses are ordered or not, and
>> whether memory accesses are ordered makes load return undef or not.
>>
>> If the S2 and L3 still introduces an SW edge, the next question is
>> suppose there is a later SC-load L3' @ T3, does it also return
undef?
>> But this potentially makes the SC atomic sequence S2 L3 L3'
>> inconsistent --- later SC loads can read different writes from earlier
>> loads if there are no SC-stores in between.
>>
>> So I think data-races SC/acq/rel atomics cannot introduce SW edges.
>> Intuitively if the locations designed for locks are used by non-atomic
>> memory accesses, the locks cannot behave correctly. Is it correct?
>
> Yes, that is correct.  I am now convinced that it is worthwhile to
> include the definition of how SW edges form in LangRef.
Some of the descriptions in LangRef and
http://llvm.org/docs/Atomics.html are like
1) monotonic atomic to the same location are totally ordered
2) all SC atomic to to any location are totally ordered
3) monotonic loads should see monotonic stores monotonically
...

However, these conditions can be broken if any of them are
'interferenced' by non-atomic memory accesses, since that can turn
some of the atomic loads to be undef (with any values). So it is
unlikely forming a total order any more.

So, do these properties hold only for data race free atomics?  For example,

T1:                                         T2:
a:  sc store 1 to location x
                                              b: non-atomic store 2 to x
                                              c: sc store 3 to y
d:  r1 = sc load from y
e:  r2 = sc load from x

Here, r2 still returns undef, since e can see a non-atomic store
although from happens-before. Then a c d e may not be totally ordered.
>
> -Eli
>


-- 
Jianzhou

Jeffrey Yasskin

2011-Aug-22 23:35 UTC

head link

[LLVMdev] LLVM Concurrency and Undef

On Mon, Aug 22, 2011 at 4:20 PM, Jianzhou Zhao <jianzhou at
seas.upenn.edu> wrote:> On Mon, Aug 22, 2011 at 6:49 PM, Eli Friedman <eli.friedman at
gmail.com> wrote:
>> On Mon, Aug 22, 2011 at 3:40 PM, Jianzhou Zhao <jianzhou at
seas.upenn.edu> wrote:
>>> On Mon, Aug 22, 2011 at 6:08 PM, Eli Friedman <eli.friedman at
gmail.com> wrote:
>>>> On Mon, Aug 22, 2011 at 2:49 PM, Santosh Nagarakatte
>>>> <santosh.nagarakatte at gmail.com> wrote:
>>>>> Hi all,
>>>>>
>>>>> I have been trying to understand the use of undef in both
sequential
>>>>> and concurrent programs.
>>>>>
>>>>> >From the LLVM Language Reference Manual, I see the
following
>>>>> definition of undef.
>>>>> "Undef can be used anywhere a constant is expected,
and indicates that
>>>>> the user of the value may receive an unspecified
bit-pattern".
>>>>>  LLVM Language Reference manual also demonstrates how
optimizers can
>>>>> use these undef values to  optimize the program.
>>>>>
>>>>> However, on the other hand, with the LLVM Atomics and
Concurrency
>>>>> Guide states that
>>>>> If code accesses a memory location from multiple threads at
the same
>>>>> time, the resulting loads return 'undef'.
>>>>> This is different from the C++ memory model, which provides
undefined
>>>>> behavior. What is the rationale for returning an undef on
racing
>>>>> reads?
>>>>>
>>>>> LLVM Atomics and Concurrency guide also states the
following
>>>>> "Note that speculative loads are allowed; a load which
is part of a
>>>>> race returns undef, but does not have undefined
behavior"
>>>>>
>>>>> If the speculative loads returns an undef and the returned
value is
>>>>> used, then it results in an undefined behavior. Am I
correct?
>>>>
>>>> It behaves like any other undef value... which do often lead to
>>>> undefined behavior.
>>>>
>>>>> If so, what is the purpose of returning an undef with a
speculative load?
>>>>> Is it to ensure that the subsequent uses of the value of
the
>>>>> speculatively introduced load is caught/detected by the
optimization?
>>>>
>>>> The point is primarily to allow optimizations like LICM to
introduce
>>>> loads whose value is never used.  It also keeps consistent
semantics
>>>> through CodeGen, where some targets widen loads.
>>>>
>>>>> Is it possible to separate the "undef" in a
sequential setting and
>>>>> "undef" with speculative loads in a concurrent
setting with separate
>>>>> undefs?
>>>>
>>>> The intention is that they should have the same semantics.
>>>
>>> Suppose there are three threads T1, T2 and T3,
>>> T1 (S1 )stores to a location l as non-atomic,
>>> T2 then (S2)stores to l as SC-atomic,
>>> later T3 (L3)loads from l as SC-atomic.
>>>
>>> I think the load @ T3 should return undef, since it can see both
>>> writes from T1 T2. Then the question is if the SC store @ T2 --- S2
>>> and the SC load @ T3 --- L3 introduces an acq/rel
(synchronized-with)
>>> edge.
>>>
>>> This will affect if later conflicting accesses are ordered or not,
and
>>> whether memory accesses are ordered makes load return undef or not.
>>>
>>> If the S2 and L3 still introduces an SW edge, the next question is
>>> suppose there is a later SC-load L3' @ T3, does it also return
undef?
>>> But this potentially makes the SC atomic sequence S2 L3 L3'
>>> inconsistent --- later SC loads can read different writes from
earlier
>>> loads if there are no SC-stores in between.
>>>
>>> So I think data-races SC/acq/rel atomics cannot introduce SW edges.
>>> Intuitively if the locations designed for locks are used by
non-atomic
>>> memory accesses, the locks cannot behave correctly. Is it correct?
>>
>> Yes, that is correct.  I am now convinced that it is worthwhile to
>> include the definition of how SW edges form in LangRef.
>
> Some of the descriptions in LangRef and
> http://llvm.org/docs/Atomics.html are like
> 1) monotonic atomic to the same location are totally ordered
> 2) all SC atomic to to any location are totally ordered
> 3) monotonic loads should see monotonic stores monotonically
> ...
>
> However, these conditions can be broken if any of them are
> 'interferenced' by non-atomic memory accesses, since that can turn
> some of the atomic loads to be undef (with any values). So it is
> unlikely forming a total order any more.
>
> So, do these properties hold only for data race free atomics?  For example,
>
> T1:                                         T2:
> a:  sc store 1 to location x
>                                              b: non-atomic store 2 to x
>                                              c: sc store 3 to y
> d:  r1 = sc load from y
> e:  r2 = sc load from x
>
> Here, r2 still returns undef, since e can see a non-atomic store
> although from happens-before. Then a c d e may not be totally ordered.
Say r1==3, so we're guaranteed that 'e' happens after 'b'.
Since
neither a nor b happens before the other, and at least one of them is
not atomic, r2==undef. The seq_cst ordering a, c, d, e is consistent
with the observable results. That's true whether printing r2 prints 1,
2, or 47 since undef can be observed as any value.

Jeffrey

Jianzhou Zhao

2011-Aug-23 00:22 UTC

head link

[LLVMdev] LLVM Concurrency and Undef

On Mon, Aug 22, 2011 at 7:35 PM, Jeffrey Yasskin <jyasskin at google.com>
wrote:> On Mon, Aug 22, 2011 at 4:20 PM, Jianzhou Zhao <jianzhou at
seas.upenn.edu> wrote:
>> On Mon, Aug 22, 2011 at 6:49 PM, Eli Friedman <eli.friedman at
gmail.com> wrote:
>>> On Mon, Aug 22, 2011 at 3:40 PM, Jianzhou Zhao <jianzhou at
seas.upenn.edu> wrote:
>>>> On Mon, Aug 22, 2011 at 6:08 PM, Eli Friedman <eli.friedman
at gmail.com> wrote:
>>>>> On Mon, Aug 22, 2011 at 2:49 PM, Santosh Nagarakatte
>>>>> <santosh.nagarakatte at gmail.com> wrote:
>>>>>> Hi all,
>>>>>>
>>>>>> I have been trying to understand the use of undef in
both sequential
>>>>>> and concurrent programs.
>>>>>>
>>>>>> >From the LLVM Language Reference Manual, I see the
following
>>>>>> definition of undef.
>>>>>> "Undef can be used anywhere a constant is
expected, and indicates that
>>>>>> the user of the value may receive an unspecified
bit-pattern".
>>>>>>  LLVM Language Reference manual also demonstrates how
optimizers can
>>>>>> use these undef values to  optimize the program.
>>>>>>
>>>>>> However, on the other hand, with the LLVM Atomics and
Concurrency
>>>>>> Guide states that
>>>>>> If code accesses a memory location from multiple
threads at the same
>>>>>> time, the resulting loads return 'undef'.
>>>>>> This is different from the C++ memory model, which
provides undefined
>>>>>> behavior. What is the rationale for returning an undef
on racing
>>>>>> reads?
>>>>>>
>>>>>> LLVM Atomics and Concurrency guide also states the
following
>>>>>> "Note that speculative loads are allowed; a load
which is part of a
>>>>>> race returns undef, but does not have undefined
behavior"
>>>>>>
>>>>>> If the speculative loads returns an undef and the
returned value is
>>>>>> used, then it results in an undefined behavior. Am I
correct?
>>>>>
>>>>> It behaves like any other undef value... which do often
lead to
>>>>> undefined behavior.
>>>>>
>>>>>> If so, what is the purpose of returning an undef with a
speculative load?
>>>>>> Is it to ensure that the subsequent uses of the value
of the
>>>>>> speculatively introduced load is caught/detected by the
optimization?
>>>>>
>>>>> The point is primarily to allow optimizations like LICM to
introduce
>>>>> loads whose value is never used.  It also keeps consistent
semantics
>>>>> through CodeGen, where some targets widen loads.
>>>>>
>>>>>> Is it possible to separate the "undef" in a
sequential setting and
>>>>>> "undef" with speculative loads in a
concurrent setting with separate
>>>>>> undefs?
>>>>>
>>>>> The intention is that they should have the same semantics.
>>>>
>>>> Suppose there are three threads T1, T2 and T3,
>>>> T1 (S1 )stores to a location l as non-atomic,
>>>> T2 then (S2)stores to l as SC-atomic,
>>>> later T3 (L3)loads from l as SC-atomic.
>>>>
>>>> I think the load @ T3 should return undef, since it can see
both
>>>> writes from T1 T2. Then the question is if the SC store @ T2
--- S2
>>>> and the SC load @ T3 --- L3 introduces an acq/rel
(synchronized-with)
>>>> edge.
>>>>
>>>> This will affect if later conflicting accesses are ordered or
not, and
>>>> whether memory accesses are ordered makes load return undef or
not.
>>>>
>>>> If the S2 and L3 still introduces an SW edge, the next question
is
>>>> suppose there is a later SC-load L3' @ T3, does it also
return undef?
>>>> But this potentially makes the SC atomic sequence S2 L3 L3'
>>>> inconsistent --- later SC loads can read different writes from
earlier
>>>> loads if there are no SC-stores in between.
>>>>
>>>> So I think data-races SC/acq/rel atomics cannot introduce SW
edges.
>>>> Intuitively if the locations designed for locks are used by
non-atomic
>>>> memory accesses, the locks cannot behave correctly. Is it
correct?
>>>
>>> Yes, that is correct.  I am now convinced that it is worthwhile to
>>> include the definition of how SW edges form in LangRef.
>>
>> Some of the descriptions in LangRef and
>> http://llvm.org/docs/Atomics.html are like
>> 1) monotonic atomic to the same location are totally ordered
>> 2) all SC atomic to to any location are totally ordered
>> 3) monotonic loads should see monotonic stores monotonically
>> ...
>>
>> However, these conditions can be broken if any of them are
>> 'interferenced' by non-atomic memory accesses, since that can
turn
>> some of the atomic loads to be undef (with any values). So it is
>> unlikely forming a total order any more.
>>
>> So, do these properties hold only for data race free atomics?  For
example,
>>
>> T1:                                         T2:
>> a:  sc store 1 to location x
>>                                              b: non-atomic store 2 to x
>>                                              c: sc store 3 to y
>> d:  r1 = sc load from y
>> e:  r2 = sc load from x
>>
>> Here, r2 still returns undef, since e can see a non-atomic store
>> although from happens-before. Then a c d e may not be totally ordered.
>
> Say r1==3, so we're guaranteed that 'e' happens after
'b'. Since
> neither a nor b happens before the other, and at least one of them is
> not atomic, r2==undef. The seq_cst ordering a, c, d, e is consistent
> with the observable results. That's true whether printing r2 prints 1,
> 2, or 47 since undef can be observed as any value.
My confusion is the difference between SC and AcquireRelease. The text says

...
SequentiallyConsistent

SequentiallyConsistent (seq_cst in IR) provides Acquire semantics for
loads and Release semantics for stores. Additionally, it guarantees
that a total ordering exists between all SequentiallyConsistent
operations.
...

Here, what does the total ordering mean? Does it mean given a sequence
of SC atomic in the total order, an SC load should return the recent
SC store from the sequence? This is like the meaning of Sequential
consistent memory model---any SC stores should be visible to all SC
loads immediately in the same order.

In the above program, we have the SC atomics

sc store 1 to x
sc store 3 to y
sc load from y = 3
sc load from x = 47 or undef

Here, no matter how we order them, there is no way to place a store 47
to x before load x with 47. Did I make a wrong assumption?
>
> Jeffrey
>


-- 
Jianzhou

Apparently Analagous Threads

Search for more apparently analagous threads

llvm dev - Aug 2011 - [LLVMdev] LLVM Concurrency and Undef

[LLVMdev] LLVM Concurrency and Undef

[LLVMdev] LLVM Concurrency and Undef

[LLVMdev] LLVM Concurrency and Undef

Apparently Analagous Threads