thr3ads.net - llvm dev - [llvm-dev] [RFC] Coroutine and pthread

Xun Li via llvm-dev

2020-Nov-18 22:06 UTC

[llvm-dev] [RFC] Coroutine and pthread_self

Hi,

I would like to propose a potential solution to a bug that involves
coroutine and pthread_self().

Description of the bug can be found in
https://bugs.llvm.org/show_bug.cgi?id=47833. Below is a summary:
pthread_self() from glibc is declared with "__attribute__
((__const__))". The const attribute tells the compiler that the
function neither reads nor writes any global state and therefore
always returns the same result. Hence in the following code:

auto x1 = pthread_self();
...
auto x2 = pthread_self();

the second call to pthread_self() can be optimized out. This was
correct until coroutines were introduced. With coroutines, we can have
code like this:

auto x1 = pthread_self();
co_await ...
auto x2 = pthread_self();

Now because of the co_await, the function can suspend and resume in a
different thread, in which case the second call to pthread_self()
should return a different result than the first one. Unfortunately
LLVM will still optimize out the second call in the case of
coroutines.
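
As a rough illustration (not taken from the bug report), the effect can be
shown without compiler coroutine support. The sketch below uses a
hand-written stand-in for a coroutine frame: start() plays the role of the
code before co_await and resume() the code after it. ManualFrame and
ids_differ_across_suspend are hypothetical names, and
std::this_thread::get_id() is used instead of pthread_self() so that the
result does not depend on how the optimizer treats the const attribute.

```cpp
#include <cassert>
#include <thread>

// Hand-written stand-in for a coroutine frame (hypothetical, for illustration).
// start() is the code before the suspension point, resume() the code after it.
struct ManualFrame {
    std::thread::id before;  // corresponds to x1 in the example above
    std::thread::id after;   // corresponds to x2 in the example above

    void start()  { before = std::this_thread::get_id(); }
    void resume() { after  = std::this_thread::get_id(); }
};

// Runs the first half on the calling thread and the second half on a new
// thread, then reports whether the two observed thread ids differ.
bool ids_differ_across_suspend() {
    ManualFrame frame;
    frame.start();                                     // runs on the caller
    std::thread worker([&frame] { frame.resume(); });  // "resume" elsewhere
    worker.join();
    return frame.before != frame.after;
}
```

If the value stored by start() were reused in resume(), which is what
eliminating the second call amounts to, the frame would report the wrong
thread after the resume.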

I tried to simply discard all value reuse whenever a coro.suspend is
seen in all CSE-related passes (https://reviews.llvm.org/D89711), but
that does not seem scalable and it puts a burden on pass writers. So I
would like to propose a new solution.

Proposed Solution:
First of all, we need to update the Clang front-end to special-case
the attributes of the pthread_self function: replace the ConstAttr
attribute of pthread_self with a new attribute, say "ThreadConstAttr".
Next, in the emitted IR, functions with "ThreadConstAttr" will have a
new IR attribute, say "thread_readnone".
Finally, there are two possible sub-solutions for handling this new IR
attribute:
a) We add a new Pass after CoroSplitPass that changes all the
"thread_readnone" attributes back to "readnone". This will allow it to
work properly prior to CoroSplit, and still provide a chance to do CSE
after CoroSplit. This approach is simplest to implement.
b) We never remove "thread_readnone". However, we teach memory alias
analysis to understand that functions with "thread_readnone" attribute
will only interfere with coro.suspend intrinsics but nothing else.
Hopefully this will still enable CSE. Not sure how feasible this is.
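
To make the intended semantics of (b) concrete, here is a toy model in plain
C++. It is only a sketch under stated assumptions: it does not use any LLVM
API, and Kind, Inst, and calls_after_toy_cse are hypothetical names. Every
call in the model stands for a call to a "thread_readnone" function, and a
suspend point is the only thing that invalidates a previously computed
result.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Toy instruction kinds (hypothetical; not LLVM IR).
enum class Kind { ThreadConstCall, Suspend };

struct Inst {
    Kind kind;
    std::string callee;  // ignored for Suspend
};

// Toy CSE: a repeated call to the same "thread_readnone" callee is dropped
// unless a suspend point lies between the two calls.
// Returns how many calls survive.
int calls_after_toy_cse(const std::vector<Inst>& body) {
    std::set<std::string> available;  // callees whose result can be reused
    int kept = 0;
    for (const Inst& inst : body) {
        if (inst.kind == Kind::Suspend) {
            available.clear();  // the thread may change across the suspend
            continue;
        }
        if (available.insert(inst.callee).second) {
            ++kept;  // first call since the last suspend: must be kept
        }
    }
    return kept;
}
```

In this model two back-to-back pthread_self calls collapse into one, while
two calls separated by a suspend are both kept. Under (a), the same reuse
would become legal everywhere once the function has been split and no
suspend points remain.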

Does the above solution (especially (a)) sound reasonable? Any feedback
is appreciated. Thank you!

A related issue, which may require separate solutions, is that
coroutines also do not work properly with thread-local storage. This
is because an access to thread-local storage in LLVM IR is simply a
reference to the variable. However, the address behind such a
reference can change after a coro.suspend. This is not taken care of
today.
In this thread I would like to focus on the issue with pthread_self
first, but it's good to have context regarding the thread-local
storage issue when discussing the solution space.
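
For context, the following sketch shows why a cached address of a
thread-local variable is not safe to reuse on another thread. It uses plain
std::thread rather than coroutines, and tls_counter and
tls_address_differs_across_threads are hypothetical names chosen for
illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <thread>

// Each thread gets its own instance of this variable.
thread_local int tls_counter = 0;

// Takes the address of tls_counter on the calling thread and on a second
// thread, and reports whether the two addresses differ. The addresses are
// stored as integers so that no pointer to the worker's (by then destroyed)
// thread-local instance is used after the join.
bool tls_address_differs_across_threads() {
    std::uintptr_t on_caller = reinterpret_cast<std::uintptr_t>(&tls_counter);
    std::uintptr_t on_worker = 0;
    std::thread worker([&on_worker] {
        on_worker = reinterpret_cast<std::uintptr_t>(&tls_counter);
    });
    worker.join();
    return on_caller != on_worker;
}
```

A coroutine that computes such an address before a suspend and reuses it
after resuming on another thread would be touching the first thread's copy
of the variable.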
-- 
Xun
