thr3ads.net - llvm dev - [LLVMdev] LLVM Concurrency and Undef [Aug 2011]

Jeffrey Yasskin

2011-Aug-22 23:27 UTC

[LLVMdev] LLVM Concurrency and Undef

On Mon, Aug 22, 2011 at 3:49 PM, Eli Friedman <eli.friedman at gmail.com> wrote:
> On Mon, Aug 22, 2011 at 3:40 PM, Jianzhou Zhao <jianzhou at seas.upenn.edu> wrote:
>> On Mon, Aug 22, 2011 at 6:08 PM, Eli Friedman <eli.friedman at gmail.com> wrote:
>>> On Mon, Aug 22, 2011 at 2:49 PM, Santosh Nagarakatte
>>> <santosh.nagarakatte at gmail.com> wrote:
>>>> Hi all,
>>>>
>>>> I have been trying to understand the use of undef in both sequential
>>>> and concurrent programs.
>>>>
>>>> From the LLVM Language Reference Manual, I see the following
>>>> definition of undef.
>>>> "Undef can be used anywhere a constant is expected, and indicates that
>>>> the user of the value may receive an unspecified bit-pattern".
>>>> The LLVM Language Reference Manual also demonstrates how optimizers can
>>>> use these undef values to optimize the program.
>>>>
>>>> However, on the other hand, the LLVM Atomics and Concurrency
>>>> Guide states that
>>>> if code accesses a memory location from multiple threads at the same
>>>> time, the resulting loads return 'undef'.
>>>> This is different from the C++ memory model, which provides undefined
>>>> behavior. What is the rationale for returning an undef on racing
>>>> reads?
>>>>
>>>> The LLVM Atomics and Concurrency Guide also states the following:
>>>> "Note that speculative loads are allowed; a load which is part of a
>>>> race returns undef, but does not have undefined behavior"
>>>>
>>>> If a speculative load returns an undef and the returned value is
>>>> used, then it results in undefined behavior. Am I correct?
>>>
>>> It behaves like any other undef value... which do often lead to
>>> undefined behavior.
>>>
>>>> If so, what is the purpose of returning an undef with a speculative load?
>>>> Is it to ensure that the subsequent uses of the value of the
>>>> speculatively introduced load is caught/detected by the optimization?
>>>
>>> The point is primarily to allow optimizations like LICM to introduce
>>> loads whose value is never used.  It also keeps consistent semantics
>>> through CodeGen, where some targets widen loads.
>>>
>>>> Is it possible to separate the "undef" in a sequential setting and
>>>> "undef" with speculative loads in a concurrent setting with separate
>>>> undefs?
>>>
>>> The intention is that they should have the same semantics.
>>
>> Suppose there are three threads T1, T2 and T3:
>> T1 (S1) stores to a location l as non-atomic,
>> T2 then (S2) stores to l as SC-atomic,
>> later T3 (L3) loads from l as SC-atomic.
>>
>> I think the load @ T3 should return undef, since it can see both
>> writes from T1 and T2. Then the question is whether the SC store @ T2 --- S2
>> and the SC load @ T3 --- L3 introduce an acq/rel (synchronizes-with)
>> edge.
>>
>> This affects whether later conflicting accesses are ordered, and
>> whether memory accesses are ordered determines whether a load returns undef.
>>
>> If S2 and L3 still introduce an SW edge, the next question is:
>> suppose there is a later SC load L3' @ T3; does it also return undef?
>> But this potentially makes the SC atomic sequence S2 L3 L3'
>> inconsistent --- later SC loads can read different writes from earlier
>> loads if there are no SC stores in between.
>>
>> So I think data-racing SC/acq/rel atomics cannot introduce SW edges.
>> Intuitively, if the locations designed for locks are used by non-atomic
>> memory accesses, the locks cannot behave correctly. Is that correct?
>
> Yes, that is correct.  I am now convinced that it is worthwhile to
> include the definition of how SW edges form in LangRef.

It's already there:

"We define a happens-before partial order as the least partial order that

* Is a superset of single-thread program order, and
* When a synchronizes-with b, includes an edge from a to b.

Synchronizes-with pairs are introduced by platform-specific
techniques, like pthread locks, thread creation, thread joining, etc.,
and by atomic instructions. (See also Atomic Memory Ordering
Constraints)."

...

"acquire - In addition to the guarantees of monotonic, if this
operation reads a value written by a release atomic operation, it
synchronizes-with that operation."

(Strictly, the release operation synchronizes-with the acquire, not
the other way around.)

Since atomic/non-atomic races are defined to return undef from the
load, even if the load has seq_cst ordering, the load never reads a
value written, so none of the stores synchronize with the load.
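
As a rough illustration of that argument, here is a minimal sketch in Python, not LangRef's actual definition. It assumes every access targets the same location, treats "atomic" as seq_cst, and uses hypothetical names (`Access`, `races`, `observed`, `synchronizes_with`, `UNDEF`) invented for this example. A load that races with a non-atomic store observes `UNDEF` rather than any particular store, so no synchronizes-with edge can form.

```python
from dataclasses import dataclass

UNDEF = object()  # sentinel standing in for undef; it is never a store


@dataclass(frozen=True)
class Access:
    name: str
    is_write: bool
    atomic: bool  # assumption: atomic means seq_cst in this toy model


def transitive_closure(edges):
    """Smallest transitive relation containing `edges` (pairs of names)."""
    closure = set(edges)
    while True:
        new = {(a, d) for (a, b) in closure for (c, d) in closure if b == c}
        if new <= closure:
            return closure
        closure |= new


def races(x, y, hb):
    """Conflicting accesses, at least one non-atomic, unordered by hb."""
    conflicting = x.is_write or y.is_write
    both_atomic = x.atomic and y.atomic
    ordered = (x.name, y.name) in hb or (y.name, x.name) in hb
    return conflicting and not both_atomic and not ordered


def observed(load, source, accesses, hb):
    """The store `load` reads from, or UNDEF if `load` is part of a race."""
    if any(races(load, other, hb) for other in accesses if other is not load):
        return UNDEF
    return source


def synchronizes_with(store, load, result):
    """An edge forms only if the load actually read what `store` wrote."""
    return store.atomic and load.atomic and result is store


S1 = Access("S1", is_write=True, atomic=False)  # T1: non-atomic store
S2 = Access("S2", is_write=True, atomic=True)   # T2: seq_cst store
L3 = Access("L3", is_write=False, atomic=True)  # T3: seq_cst load

hb = transitive_closure(set())                  # threads are unordered
racy = observed(L3, S2, [S1, S2, L3], hb)       # S1 races with L3
clean = observed(L3, S2, [S2, L3], hb)          # same scenario without S1
```

In this model `racy` is `UNDEF`, so `synchronizes_with(S2, L3, racy)` is false, while dropping the non-atomic store S1 lets L3 read from S2 and form the edge.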

The text does say that all seq_cst loads and stores participate in the
global seq_cst ordering that's compatible with the happens-before
ordering, but that doesn't imply that happens-before is a superset of
the seq_cst ordering. I'm not sure whether any paradoxes arise from
allowing racy seq_cst loads to return undef, but I haven't seen any
examples so far.

I don't object to clarifying the text, of course.

Jeffrey

Jianzhou Zhao

2011-Aug-23 00:29 UTC

On Mon, Aug 22, 2011 at 7:27 PM, Jeffrey Yasskin <jyasskin at google.com> wrote:
> Since atomic/non-atomic races are defined to return undef from the
> load, even if the load has seq_cst ordering, the load never reads a
> value written, so none of the stores synchronize with the load.

An undef can be replaced by any concrete value. If the undef returned
from the racy SC load happens to be instantiated with the value of the
latest SC store, does that count as "... reads a value written by
..."? I think the answer is still no, right?


-- 
Jianzhou

Eli Friedman

2011-Aug-23 00:46 UTC

On Mon, Aug 22, 2011 at 5:29 PM, Jianzhou Zhao <jianzhou at seas.upenn.edu> wrote:
> An undef can be replaced by any concrete value. If the undef returned
> from the racy SC load happens to be instantiated with the value of the
> latest SC store, does that count as "... reads a value written by
> ..."? I think the answer is still no, right?

Right; semantically, undef isn't any concrete value, even though
backends are forced to eventually turn it into one.

-Eli
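
The point made earlier in this thread, that returning undef lets passes like LICM introduce loads whose value is never used, can be illustrated with a toy model. This is only a sketch in Python under stated assumptions: `undef_load`, `trapping_load`, `original` and `hoisted` are hypothetical names, and undefined behavior is modeled as a raised exception, which is an illustration device rather than how any compiler behaves.

```python
UNDEF = object()  # sentinel for an undef result


def undef_load():
    """A racy load under the undef rule: yields undef, nothing else happens."""
    return UNDEF


def trapping_load():
    """A racy load if the race itself were undefined behavior (modeled as a raise)."""
    raise RuntimeError("undefined behavior")


def original(n, load):
    """The load sits inside the loop, so it runs only when the body runs."""
    last = 0
    for _ in range(n):
        last = load()
    return last


def hoisted(n, load):
    """After a LICM-style hoist: the load runs once, even when n == 0."""
    value = load()  # speculative: its result is unused if the loop is skipped
    last = 0
    for _ in range(n):
        last = value
    return last
```

With `n == 0`, `original` and `hoisted` agree when the racy load merely yields undef, because the hoisted value is never used. If the racy load were undefined behavior instead, `hoisted(0, trapping_load)` would fail where `original(0, trapping_load)` does not, so the hoist would not be a valid transformation.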
