thr3ads.net - llvm dev - [llvm-dev] [RFC] Coroutine and pthread

If this information is useful, please help other people find it:
Share via:

Xun Li via llvm-dev

2020-Dec-04 16:54 UTC

[llvm-dev] [RFC] Coroutine and pthread_self

Hi,

To help with the discussion, I drafted two patches:
https://reviews.llvm.org/D92662 to address the pthread_self issue by
dropping the readnone attribute;
https://reviews.llvm.org/D92661 to address the TLS issue by turning
TLS access into an intrinsics.
Please take a look and let me know what you think and whether this is
along the right direction.
Thanks!

On Wed, Nov 25, 2020 at 11:38 PM Nicolai Hähnle via llvm-dev
<llvm-dev at lists.llvm.org> wrote:>
> On Thu, Nov 26, 2020 at 12:50 AM James Y Knight <jyknight at
google.com> wrote:
> >> The tricky part is that the "thread self identity" is a
piece of state
> >> that used to be immutable, but with the introduction of
coroutines, it
> >> has become mutable. The example @foo reads the "thread self
identity"
> >> piece of state. As long as that state was immutable, it was
correct to
> >> mark the function as readnone. Now that the state has become
mutable,
> >> this is no longer correct.
> >>
> >> That may seem annoying, but I suspect it's best to just
swallow that
> >> pill and take it to its logical conclusion.
> >>
> >> Doing so ends up treating pthread_self(), @foo, etc. more
> >> conservatively, but correctly. Treating them correctly and less
> >> conservatively is really a kind of alias analysis problem. I would
> >> hesitate to just add a new ad-hoc attribute for it, we already
have so
> >> many of those, and would prefer adding a mechanism for this kind
of
> >> thing generically.
> >
> >
> > While it is certainly important to correctly model the fiction that
the coroutine intrinsics present, I do think it's important to remember that
they are a fiction within the compiler. Yes, in the high-level coroutine IR,
before CoroSplit runs, thread identity acts as if it were mutable across the
"llvm.coro.suspend" intrinsic. However, the suspend intrinsic is
simply a placeholder for the transform which will ultimately occur, to turn the
entire function body inside-out. This is an important distinction, because there
remains no way to "secretly" mutate the thread identity. E.g. calling
a function cannot result in the thread identity of the caller being mutated.
>
> This is an important property in so far as it means we won't need a
> `does_not_change_thread_identity` function attribute. We can instead
> just hardcode that the only instruction which may change it is the
> llvm.coro.suspend intrinsic, and that is really helpful. However, I
> don't see how it affects the discussion around the meaning of
> `readnone`.
>
>
> > That we've introduced this set of coroutine intrinsics (and the
CoroSplit pass to flip a function inside out) does not imply that we ought to
change the meaning of 'readnone' retroactively to mean that it needs to
be invariant on thread-identity. We should continue to model it as meaning
"doesn't read mutable memory or mutable state, but may depend on the
thread identity", which is what the meaning has been up until now, even if
not explicitly stated. Given the existence of an intrinsic which
"changes" the thread identity, we should, now, clarify the
documentation.
> >
> > We also do need to fix any optimization passes that run prior to the
CoroSplit pass understand the effect that a coroutine suspend-point has on
thread-identity. Note that we need to do much of the same work anyways, to fix
access to TLS variables within a coroutine. (Or, as an alternative, we could
move the CoroSplit pass first in the optimization pipeline, so that there are no
such optimization passes which run before it. This would fix the correctness
issues, but greatly reduce the potential for optimization -- similar to OpenMP)
>
> It seems to me that all of us here are trying to claim that we don't
> want to change the meaning of "readnone", and yet we disagree on
what
> that implies. I would argue that the interpretation outlined by Joerg
> and me does not require a _normative_ change to the LangRef text,
> although adding a note to clarify the situation wrt coroutines would
> of course help.
>
> In practice, the main reason I want "readnone" to mean "does
not
> depend on thread identity" is that names matter and I don't want
us to
> design a long-term risk into the LangRef. The attribute is called
> "readnone", not "readalmostnone". I expect that the
majority of LLVM
> developers don't usually consider coroutines in their work, which
> means that redefining "readnone" as "there's this one
piece of mutable
> state which a readnone function _is_ allowed to read" is going to lead
> to bugs. (Of course, renaming the attribute would address that
> concern, but involves massive churn in the codebase.)
>
> Another factor is comparing impacts on optimizations in practice. I
> would expect:
>
> 1. "readnone" _may_ depend on thread identity: all functions that
are
> readnone today can remain readnone; _none_ of them can be optimized
> across coroutine suspend points.
>
> 2. "readnone" may _not_ depend on thread identity: very few
functions
> that are readnone today must remove the attribute; all remaining
> readnone functions (i.e., the vast majority of them) can still be
> optimized across coroutine suspend points.
>
> I acknowledge that for the C++ frontend, the situation with
> __attribute__((const)) is a concern. The long-term risk of bad names
> seems to me quite similar, i.e., arguably the fact that pthread_self()
> is declared __attribute__((const)) is a bug in pthreads if pthreads is
> meant to be compiled as C++20. C++20 is young enough that such an
> argument may still win out, but I acknowledge that ecosystem
> considerations may lead you to a different conclusion. For C++, the
> harsh reality of code in the wild using __attribute__((const)) is
> relevant, and that may lead you to define __attribute__((const)) as
> "may depend on mutable thread identity". I understand that and I
don't
> have a stake in what's decided for C++.
>
> By contrast, we don't have to worry about backwards compatibility with
> external code in LLVM IR, and C++ is only one of many frontends. A
> clean design for what makes sense within the internal logic of LLVM IR
> should be higher priority.
>
> Cheers,
> Nicolai
>
>
>
> > Finally, it turns out to be important, it may be reasonable to add a
new function attribute to indicate "doesn't depend on current thread
identity", which could unlock additional optimization potential even across
a suspend-point. But, I would not suggest actually adding it, unless a real need
has been identified and justified.
>
>
>
> --
> Lerne, wie die Welt wirklich ist,
> aber vergiss niemals, wie sie sein sollte.
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev


-- 
Xun

Nicolai Hähnle via llvm-dev

2020-Dec-09 15:53 UTC

head link

[llvm-dev] [RFC] Coroutine and pthread_self

Hi,

The direction of adding an intrinsic for "taking the address of" a
thread_local variable in https://reviews.llvm.org/D92661 seems reasonable
to me. This new instruction "materializes the constant" as it were.

However, the patch does require updates to the LangRef.

Cheers,
Nicolai


On Fri, Dec 4, 2020 at 5:54 PM Xun Li <lxfind at gmail.com> wrote:
> Hi,
>
> To help with the discussion, I drafted two patches:
> https://reviews.llvm.org/D92662 to address the pthread_self issue by
> dropping the readnone attribute;
> https://reviews.llvm.org/D92661 to address the TLS issue by turning
> TLS access into an intrinsics.
> Please take a look and let me know what you think and whether this is
> along the right direction.
> Thanks!
>
> On Wed, Nov 25, 2020 at 11:38 PM Nicolai Hähnle via llvm-dev
> <llvm-dev at lists.llvm.org> wrote:
> >
> > On Thu, Nov 26, 2020 at 12:50 AM James Y Knight <jyknight at
google.com>
> wrote:
> > >> The tricky part is that the "thread self identity"
is a piece of state
> > >> that used to be immutable, but with the introduction of
coroutines, it
> > >> has become mutable. The example @foo reads the "thread
self identity"
> > >> piece of state. As long as that state was immutable, it was
correct to
> > >> mark the function as readnone. Now that the state has become
mutable,
> > >> this is no longer correct.
> > >>
> > >> That may seem annoying, but I suspect it's best to just
swallow that
> > >> pill and take it to its logical conclusion.
> > >>
> > >> Doing so ends up treating pthread_self(), @foo, etc. more
> > >> conservatively, but correctly. Treating them correctly and
less
> > >> conservatively is really a kind of alias analysis problem. I
would
> > >> hesitate to just add a new ad-hoc attribute for it, we
already have so
> > >> many of those, and would prefer adding a mechanism for this
kind of
> > >> thing generically.
> > >
> > >
> > > While it is certainly important to correctly model the fiction
that
> the coroutine intrinsics present, I do think it's important to remember
> that they are a fiction within the compiler. Yes, in the high-level
> coroutine IR, before CoroSplit runs, thread identity acts as if it were
> mutable across the "llvm.coro.suspend" intrinsic. However, the
suspend
> intrinsic is simply a placeholder for the transform which will ultimately
> occur, to turn the entire function body inside-out. This is an important
> distinction, because there remains no way to "secretly" mutate
the thread
> identity. E.g. calling a function cannot result in the thread identity of
> the caller being mutated.
> >
> > This is an important property in so far as it means we won't need
a
> > `does_not_change_thread_identity` function attribute. We can instead
> > just hardcode that the only instruction which may change it is the
> > llvm.coro.suspend intrinsic, and that is really helpful. However, I
> > don't see how it affects the discussion around the meaning of
> > `readnone`.
> >
> >
> > > That we've introduced this set of coroutine intrinsics (and
the
> CoroSplit pass to flip a function inside out) does not imply that we ought
> to change the meaning of 'readnone' retroactively to mean that it
needs to
> be invariant on thread-identity. We should continue to model it as meaning
> "doesn't read mutable memory or mutable state, but may depend on
the thread
> identity", which is what the meaning has been up until now, even if
not
> explicitly stated. Given the existence of an intrinsic which
"changes" the
> thread identity, we should, now, clarify the documentation.
> > >
> > > We also do need to fix any optimization passes that run prior to
the
> CoroSplit pass understand the effect that a coroutine suspend-point has on
> thread-identity. Note that we need to do much of the same work anyways, to
> fix access to TLS variables within a coroutine. (Or, as an alternative, we
> could move the CoroSplit pass first in the optimization pipeline, so that
> there are no such optimization passes which run before it. This would fix
> the correctness issues, but greatly reduce the potential for optimization
> -- similar to OpenMP)
> >
> > It seems to me that all of us here are trying to claim that we
don't
> > want to change the meaning of "readnone", and yet we
disagree on what
> > that implies. I would argue that the interpretation outlined by Joerg
> > and me does not require a _normative_ change to the LangRef text,
> > although adding a note to clarify the situation wrt coroutines would
> > of course help.
> >
> > In practice, the main reason I want "readnone" to mean
"does not
> > depend on thread identity" is that names matter and I don't
want us to
> > design a long-term risk into the LangRef. The attribute is called
> > "readnone", not "readalmostnone". I expect that
the majority of LLVM
> > developers don't usually consider coroutines in their work, which
> > means that redefining "readnone" as "there's this
one piece of mutable
> > state which a readnone function _is_ allowed to read" is going to
lead
> > to bugs. (Of course, renaming the attribute would address that
> > concern, but involves massive churn in the codebase.)
> >
> > Another factor is comparing impacts on optimizations in practice. I
> > would expect:
> >
> > 1. "readnone" _may_ depend on thread identity: all functions
that are
> > readnone today can remain readnone; _none_ of them can be optimized
> > across coroutine suspend points.
> >
> > 2. "readnone" may _not_ depend on thread identity: very few
functions
> > that are readnone today must remove the attribute; all remaining
> > readnone functions (i.e., the vast majority of them) can still be
> > optimized across coroutine suspend points.
> >
> > I acknowledge that for the C++ frontend, the situation with
> > __attribute__((const)) is a concern. The long-term risk of bad names
> > seems to me quite similar, i.e., arguably the fact that pthread_self()
> > is declared __attribute__((const)) is a bug in pthreads if pthreads is
> > meant to be compiled as C++20. C++20 is young enough that such an
> > argument may still win out, but I acknowledge that ecosystem
> > considerations may lead you to a different conclusion. For C++, the
> > harsh reality of code in the wild using __attribute__((const)) is
> > relevant, and that may lead you to define __attribute__((const)) as
> > "may depend on mutable thread identity". I understand that
and I don't
> > have a stake in what's decided for C++.
> >
> > By contrast, we don't have to worry about backwards compatibility
with
> > external code in LLVM IR, and C++ is only one of many frontends. A
> > clean design for what makes sense within the internal logic of LLVM IR
> > should be higher priority.
> >
> > Cheers,
> > Nicolai
> >
> >
> >
> > > Finally, it turns out to be important, it may be reasonable to
add a
> new function attribute to indicate "doesn't depend on current
thread
> identity", which could unlock additional optimization potential even
across
> a suspend-point. But, I would not suggest actually adding it, unless a real
> need has been identified and justified.
> >
> >
> >
> > --
> > Lerne, wie die Welt wirklich ist,
> > aber vergiss niemals, wie sie sein sollte.
> > _______________________________________________
> > LLVM Developers mailing list
> > llvm-dev at lists.llvm.org
> > https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>
>
>
> --
> Xun
>

-- 
Lerne, wie die Welt wirklich ist,
aber vergiss niemals, wie sie sein sollte.
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20201209/89b823f6/attachment-0001.html>

llvm dev - Dec 2020 - [RFC] Coroutine and pthread_self

[llvm-dev] [RFC] Coroutine and pthread_self

[llvm-dev] [RFC] Coroutine and pthread_self