James Y Knight via llvm-dev
2020-Dec-10 20:00 UTC
[llvm-dev] [RFC] Coroutine and pthread_self
I view what happened is that we've introduced effectively a new form of IR (pre-coroutine-split IR), yet when doing so, we failed to realize the impact that a "suspend" allowing "thread identity" to change would have on the rest of LLVM. So, oops! And now we need to design what things really should be looking like here going forward. Let me try to make the two proposals so far more concrete. Assumptions: - We're not going to move CoroSplit early in the pipeline, so this issue really does need to be solved. - We should not change the meaning of the C `attribute((const))` to prohibit reading the thread-identity, but, we can change what LLVM IR that transforms into. - As a prerequisite, we have already solved the TLS-address-is-a-constant issue -- e.g. by moving TLS address lookup to an intrinsic (I have some other comments about that issue, separately). Proposal 1: - The readnone attribute is redefined to prohibit depending on thread id. - Create a new attribute "readthreadidonly", which is similar to readnone, except that it may access the current thread identity. - The new @llvm.threadlocaladdr intrinsic (or whatever it ends up as) is given this attribute. - Clang's `attribute((const))` emits this attribute, instead of readnone. - Any other frontends which emit the "readnone" attribute may also need to be updated and emit readthreadidonly instead of readonly. - Optimization passes that are oblivious to the new readthreadidonly will fail correct -- they won't *misoptimize* by moving such a call across an llvm.coro.suspend call (good) -- but they also will regress performance of straightline code (bad). - To avoid regressing performance, we'll need to update all optimization passes which currently handle readnone, to also handle readthreadidonly similarly. - Option 1: (quick'n'dirty) treat readthreadidonly as readnone (allowing arbitrary movement) on any function which doesn't contain an llvm.coro.suspend, and treat readthreadidonly as inaccessiblememonly if there are any suspends. - Option 2: actually understand that readthreadid can be moved anywhere except across an llvm.coro.suspend call. Proposal 2: - The readnone attribute (continues to) allow depending on thread id. - C frontend's `attribute((const))` continues to emit the readnone attribute. - The new @llvm.threadlocaladdr intrinsic is marked readnone. - (Optional) add a not_threadid_dependent attribute, which says that a function doesn't care which thread it's invoked on. - Neither clang nor other frontends need to change which IR attributes they emit. - Optimization passes must be taught not to move readnone calls across llvm.coro.suspend, or they will be incorrect. - Option 1: (quick'n'dirty) treat 'readnone' as 'inaccessiblememonly' in any function which contains a llvm.coro.suspend. - Option 2: actually understand that readnone can move anywhere in the function except across an llvm.coro.suspend call (unless not_threadid_dependent is present, in which case it can also move across that). I'm still slightly in favor of proposal 2, but either seems OK. Both seem like they'll be effectively the same amount of work to implement in LLVM. I feel like having the "not_threadid_dependent" attribute be separate from "readnone" (doesn't access memory) is logical and consistent -- and having them be distinct may potentially also be useful in other circumstances. -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20201210/716888fe/attachment.html>
Reid Kleckner via llvm-dev
2020-Dec-10 23:20 UTC
[llvm-dev] [RFC] Coroutine and pthread_self
I kind of like proposal 2, option 1, quick'n'dirty. Consider that Instruction::mayReadFromMemory could be taught to check if its parent function is a coroutine, and power down in those cases. Then it's a matter of ensuring that all code motion passes use the central utility. On Thu, Dec 10, 2020 at 12:01 PM James Y Knight via llvm-dev < llvm-dev at lists.llvm.org> wrote:> I view what happened is that we've introduced effectively a new form of IR > (pre-coroutine-split IR), yet when doing so, we failed to realize > the impact that a "suspend" allowing "thread identity" to change would have > on the rest of LLVM. So, oops! And now we need to design what things really > should be looking like here going forward. > > Let me try to make the two proposals so far more concrete. > > Assumptions: > - We're not going to move CoroSplit early in the pipeline, so this issue > really does need to be solved. > - We should not change the meaning of the C `attribute((const))` to > prohibit reading the thread-identity, but, we can change what LLVM IR that > transforms into. > - As a prerequisite, we have already solved the TLS-address-is-a-constant > issue -- e.g. by moving TLS address lookup to an intrinsic (I have some > other comments about that issue, separately). > > Proposal 1: > - The readnone attribute is redefined to prohibit depending on thread id. > - Create a new attribute "readthreadidonly", which is similar to readnone, > except that it may access the current thread identity. > - The new @llvm.threadlocaladdr intrinsic (or whatever it ends up as) is > given this attribute. > - Clang's `attribute((const))` emits this attribute, instead of readnone. > - Any other frontends which emit the "readnone" attribute may also need to > be updated and emit readthreadidonly instead of readonly. > - Optimization passes that are oblivious to the new readthreadidonly will > fail correct -- they won't *misoptimize* by moving such a call across an > llvm.coro.suspend call (good) -- but they also will regress performance of > straightline code (bad). > - To avoid regressing performance, we'll need to update all optimization > passes which currently handle readnone, to also handle readthreadidonly > similarly. > - Option 1: (quick'n'dirty) treat readthreadidonly as readnone > (allowing arbitrary movement) on any function which doesn't contain an > llvm.coro.suspend, and treat readthreadidonly as inaccessiblememonly if > there are any suspends. > - Option 2: actually understand that readthreadid can be moved anywhere > except across an llvm.coro.suspend call. > > Proposal 2: > - The readnone attribute (continues to) allow depending on thread id. > - C frontend's `attribute((const))` continues to emit the readnone > attribute. > - The new @llvm.threadlocaladdr intrinsic is marked readnone. > - (Optional) add a not_threadid_dependent attribute, which says that a > function doesn't care which thread it's invoked on. > - Neither clang nor other frontends need to change which IR attributes > they emit. > - Optimization passes must be taught not to move readnone calls across > llvm.coro.suspend, or they will be incorrect. > - Option 1: (quick'n'dirty) treat 'readnone' as 'inaccessiblememonly' > in any function which contains a llvm.coro.suspend. > - Option 2: actually understand that readnone can move anywhere in the > function except across an llvm.coro.suspend call (unless > not_threadid_dependent is present, in which case it can also move across > that). > > I'm still slightly in favor of proposal 2, but either seems OK. Both seem > like they'll be effectively the same amount of work to implement in LLVM. I > feel like having the "not_threadid_dependent" attribute be separate from > "readnone" (doesn't access memory) is logical and consistent -- and having > them be distinct may potentially also be useful in other circumstances. > _______________________________________________ > LLVM Developers mailing list > llvm-dev at lists.llvm.org > https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20201210/cb7d12ed/attachment.html>
Nicolai Hähnle via llvm-dev
2020-Dec-14 17:00 UTC
[llvm-dev] [RFC] Coroutine and pthread_self
Thank you for the summary, James. On Thu, Dec 10, 2020 at 9:01 PM James Y Knight <jyknight at google.com> wrote:> I view what happened is that we've introduced effectively a new form of IR > (pre-coroutine-split IR), yet when doing so, we failed to realize > the impact that a "suspend" allowing "thread identity" to change would have > on the rest of LLVM. So, oops! And now we need to design what things really > should be looking like here going forward. > > Let me try to make the two proposals so far more concrete. > > Assumptions: > - We're not going to move CoroSplit early in the pipeline, so this issue > really does need to be solved. > - We should not change the meaning of the C `attribute((const))` to > prohibit reading the thread-identity, but, we can change what LLVM IR that > transforms into. > - As a prerequisite, we have already solved the TLS-address-is-a-constant > issue -- e.g. by moving TLS address lookup to an intrinsic (I have some > other comments about that issue, separately). >Sounds right. Proposal 1:> - The readnone attribute is redefined to prohibit depending on thread id. >As I've explained before, I don't think this is a redefinition. It is the context in which the definition is interpreted that has changed, not the definition itself.> - Create a new attribute "readthreadidonly", which is similar to readnone, > except that it may access the current thread identity. > - The new @llvm.threadlocaladdr intrinsic (or whatever it ends up as) is > given this attribute. > - Clang's `attribute((const))` emits this attribute, instead of readnone. > - Any other frontends which emit the "readnone" attribute may also need to > be updated and emit readthreadidonly instead of readonly. > - Optimization passes that are oblivious to the new readthreadidonly will > fail correct -- they won't *misoptimize* by moving such a call across an > llvm.coro.suspend call (good) -- but they also will regress performance of > straightline code (bad). > - To avoid regressing performance, we'll need to update all optimization > passes which currently handle readnone, to also handle readthreadidonly > similarly. > - Option 1: (quick'n'dirty) treat readthreadidonly as readnone > (allowing arbitrary movement) on any function which doesn't contain an > llvm.coro.suspend, and treat readthreadidonly as inaccessiblememonly if > there are any suspends. > - Option 2: actually understand that readthreadid can be moved anywhere > except across an llvm.coro.suspend call. >I wonder if there's an: Option 3: After (or during) coroutine splitting, upgrade readthreadidonly to readnone. This is based on the assumption that after coroutine splitting, threadid is no longer mutable state. Doing the upgrade any earlier is incorrect because of function inlining. This is functionally a subset of what Option 1 is doing, but requires changes to fewer transforms. Proposal 2:> - The readnone attribute (continues to) allow depending on thread id. > - C frontend's `attribute((const))` continues to emit the readnone > attribute. > - The new @llvm.threadlocaladdr intrinsic is marked readnone. > - (Optional) add a not_threadid_dependent attribute, which says that a > function doesn't care which thread it's invoked on. > - Neither clang nor other frontends need to change which IR attributes > they emit. > - Optimization passes must be taught not to move readnone calls across > llvm.coro.suspend, or they will be incorrect. > - Option 1: (quick'n'dirty) treat 'readnone' as 'inaccessiblememonly' > in any function which contains a llvm.coro.suspend. > - Option 2: actually understand that readnone can move anywhere in the > function except across an llvm.coro.suspend call (unless > not_threadid_dependent is present, in which case it can also move across > that). > > I'm still slightly in favor of proposal 2, but either seems OK. Both seem > like they'll be effectively the same amount of work to implement in LLVM. I > feel like having the "not_threadid_dependent" attribute be separate from > "readnone" (doesn't access memory) is logical and consistent -- and having > them be distinct may potentially also be useful in other circumstances. >If we compare the two proposals, they ultimately offer the same level of expressiveness: p1 readnone == p2 readnone+not_threadid_dependent p1 readthreadidonly == p2 readnone p1 _no attribute_ == p2 _no attribute_ ... and so yes, it's plausible that they're effectively the same amount of work in the end. The arguments against each of the proposals as far as I'm aware: Downside of proposal 1: it introduces an attribute "readthreadidonly", which seems unappealing, presumably because it feels like too much special-casing of threadid. Downside of proposal 2: it changes the meaning of "read none" from "reads no mutable state" to "reads almost no mutable state", which is unintuitive and invites bugs due to misunderstandings. I agree that a "readthreadidonly" seems unappealing, but the downside of proposal 2 weighs far heavier to me. It adds permanent mental baggage that everybody working with LLVM IR will have to carry around with them for the indefinite future. Perhaps there are ways to address the downside of proposal 1? For example, could we make a parameterized version of `readnone` that reads like `readnoneexcept(threadid)`? This could also fit nicely with a `readnoneexcept(inaccessiblemem)`. Cheers, Nicolai -- Lerne, wie die Welt wirklich ist, aber vergiss niemals, wie sie sein sollte. -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20201214/a307f4d8/attachment-0001.html>