Brian Gesiak via llvm-dev
2019-Dec-26 14:29 UTC
[llvm-dev] [RFC] Coroutines passes in the new pass manager
Hello all, It's been a month since my previous email on the topic, and since then I've done some initial work on porting the coroutines passes to the new pass manager. In total there are 6 patches -- that's a lot to review, so allow me to introduce the changes being made in each of them. # What's finished In these first 6 patches, I focused on lowering coroutine intrinsics correctly. With the patches applied, Clang is able to successfully use the new pass manager to build and test a major open source C++20 coroutines library, https://github.com/lewissbaker/cppcoro. 1. https://reviews.llvm.org/D71898 New pass manager implementation of the coro-early function pass, with LLVM regression tests for this pass updated to test both the new and legacy implementations. 2. https://reviews.llvm.org/D71899 Same thing, but for coro-split CGSCC pass. This patch adds support to the new pass manager for only the C++20 switch-based coroutines ABI. I'd like to implement support for the Swift returned-continuation ABI in a future patch. 3. https://reviews.llvm.org/D71900 4. https://reviews.llvm.org/D71901 Same thing, but for the coro-elide and coro-cleanup function passes. 5. https://reviews.llvm.org/D71902 The first 4 patches allow users to run coroutine passes by invoking, for example 'opt -passes=coro-early'. However, most of LLVM's tests for coroutines use 'opt -enable-coroutines', which adds all 4 coroutines passes, in the correct order, to the pass pipeline. This 5th patch adds a similar feature but in a way that can be used by the new pass manager: a pipeline parser that understands 'opt -passes=coroutines'. 6. https://reviews.llvm.org/D71903 Finally, this patch modifies Clang to run the new coroutine passes when the experimental pass manager is being used with coroutines enabled (either via '-fcoroutines-ts' or '-std=c++2a'). With all 6 patches applied, the cppcoro library builds and tests successfully with a Clang built with 'ENABLE_EXPERIMENTAL_NEW_PASS_MANAGER=On'. # What's yet to be done: Swift "returned-continuation" coroutines & the "devirtualization trigger" Two things are missing from these initial 6 patches: the first is, as I mentioned above, support for returned-continuation coroutines, which are used by Swift. I think this will be fairly straightforward for me (or others, if interested) to add in an upcoming patch. The second missing feature has to do with how CGSCC passes are re-run by the pass manager infrastructure. The legacy coro-split and coro-elide passes work in tandem: the coro-split pass first introduces an indirect call to a dummy "devirtualization trigger" function, and then the coro-elide pass devirtualizes it. The legacy CGSCC pass manager checks for devirtualizations after each pass and, if any occur, it re-runs the CGSCC pass pipeline. In other words: coro-split is run, a check is made for devirtualization, then coro-elide is run, and another check is made. The new pass manager allows for a series of CGSCC passes to be wrapped in a DevirtSCCRepeatedPass, but this only checks for devirtualizations after all passes are run. In the case of coro-split and coro-elide, the indirect function call is added and then devirtualized within a single pass of the repeater, so DevirtSCCRepeatedPass never sees the devirtualization and thus doesn't perform a repeat iteration. In other words: a check is made, coro-split is run and it adds an indirect call, coro-elide is run and it devirtualizes that call, and then another check is made. From the repeater pass's point of view, nothing has changed. This is something I'd like to tweak in future patches (my thinking is to add a member to CGSCCUpdateResult to allow passes to manually inform the pass manager about devirtualizations, but I'm very open to alternative ideas). But for now, I simply have Clang manually schedule two iterations of the coro-split pass, rather than have it rely on the repeater detecting a devirtualization and automatically scheduling another coro-split iteration. As a result, the new pass manager implementation of coroutines realizes correct program behavior, but fails to realize some of the optimization patterns that are tested for in the regression tests in 'llvm/test/Transforms/Coroutines'. I'd greatly appreciate code review on my patches! Or, please reply here with any questions/comments. - Brian Gesiak On Mon, Nov 25, 2019 at 8:39 PM Brian Gesiak <modocache at gmail.com> wrote:> > Hi all! > > I'm working on porting the LLVM passes for C++20 coroutines over to > the new pass manager infrastructure. Of the 4 passes, 3 are function > passes, and so porting them is straightforward and easy (my thanks to > those involved -- the design is great!). However, I'm struggling with > 'coro-split', which is an SCC pass. Specifically, I'd like advice on > how to appropriately update the new pass manager's preferred > representation of the SCC: 'llvm::LazyCallGraph'. > > Before I ask my specific questions about 'LazyCallGraph', it may help > to explain my understanding of the 'coro-split' pass and how it > modifies the call graph: > > The coro-split pass "clones" coroutine functions. For example, for a > coroutine function 'foo', it creates declarations for 3 new functions > ('foo.resume', 'foo.destroy', and 'foo.cleanup') using the static > member function 'llvm::Function::Create'. It then uses > 'llvm::CloneFunctionInto' to copy the 'foo' function's attributes, > arguments, basic blocks, and instructions, into each of the 3 new > functions. Finally, the coro-split pass replaces the entry basic > blocks of the function to read a value from a global store of > coroutine state called the coroutine frame, and uses that value to > determine which part of the cloned coroutine function should be > executed upon "resumption" of the coroutine. > > Of course, once the coro-split SCC pass has done all this, it must > update LLVM's representation of the call graph. It does so using ~40 > lines of code that you may read here: > https://github.com/llvm/llvm-project/blob/890c6ef1fb/llvm/lib/Transforms/Coroutines/Coroutines.cpp#L186-L224 > > To explain my understanding of the code in the above link: the > coro-split pass completely re-initializes the 'CallGraphSCC' object, > using the member function > 'CallGraphSCC::initialize(ArrayRef<CallGraphNode*>)'. The array of > nodes it uses to re-initialize the SCC includes nodes for each of the > 3 new functions, which it adds to the call graph using > 'CallGraph::getOrInsertFunction'. For each of the new nodes, and for > the node representing the original function, the coro-split pass > iterates over each of the instructions in the function, and uses > 'CallGraphNode::addCalledFunction' to add edges to the call graph. > > The difficulty I'm having here in porting coro-split to the new pass > manager is that its SCC passes use 'LazyCallGraph' and > 'LazyCallGraph::SCC'. These classes' documentation explains that they > are designed with the constraint that optimization passes shall not > delete, remove, or add edges that invalidate a bottom-up traversal of > the SCC DAG. Unfortunately, I understand the coro-split pass to be > doing exactly those things. > > As a result, if I attempt to mimic the coro-split pass's logic by > inserting functions into the call graph using 'LazyCallGraph::get', > and then adding call edges with > 'LazyCallGraph::RefSCC::insertTrivialRefEdge' and > 'LazyCallGraph::RefSCC::switchInternalEdgeToCall', I'm met with an > assertion: llvm/lib/Analysis/CGSCCPassManager.cpp:463: [...]: > Assertion `E && "No function transformations should introduce *new* " > "call edges! Any new calls should be modeled as " "promoted existing > ref edges!"' failed. > > Does anyone have any suggestions on how I can better work within the > constraints of the 'LazyCallGraph'? > > In case it helps, you can see the code that currently hits this > assertion here: > https://github.com/modocache/llvm-project/commit/02c10528e9. Of > particular interest may be the functions 'buildLazyCallGraphNode' and > 'updateCallGraphAfterSplit'. > > One idea a colleague of mine suggested was to have an earlier > coroutine function pass insert declarations of 'foo.resume', > 'foo.destroy', and 'foo.cleanup', which we could then promote to call > edges. However, they also found this FIXME in CGSCCPassManager.cpp: > "We should really handle adding new calls. While it will make > downstream usage more complex, there is no fundamental limitation and > it will allow passes within the CGSCC to be a bit more flexible in > what transforms they can do. Until then, we verify that new calls > haven't been introduced." > > As a result, I'm now unsure whether I ought to modify coro-split's > implementation for the new pass manager, or modify 'CGSCCPassManager' > to allow for the insertion of new calls. Any and all advice would be > greatly appreciated! > > - Brian Gesiak
Brian Gesiak via llvm-dev
2020-Jan-21 23:17 UTC
[llvm-dev] Reviews needed for LazyCallGraph patches (and coroutines)
Hey all, Thanks to reviews by Wenlei He and JunMa (junparser), I have a stack of patches that implement the C++20 coroutine transformations in the new LLVM pass manager. The stack builds and optimizes coroutines in major open source coroutine libraries like http://github.com/lewissbaker/cppcoro. I also use the patches in my employer's downstream fork of LLVM/Clang, and it successfully compiles large C++ codebases that make use of C++17 and coroutines. The coroutine patches exist as a stack of 6 Phabricator revisions [1]. Crucially, they make use of 3 patches related to LazyCallGraph, the call graph representation used by the new pass manager: 1. https://reviews.llvm.org/D72025 (by Johannes Doerfert) 2. https://reviews.llvm.org/D70927 (ditto) 3. https://reviews.llvm.org/D72226 (this one by me) I only have moderate experience using the new pass manager, so I'd greatly appreciate a review of these patches. My coroutine patches rely on the interfaces they add to the LazyCallGraph abstraction. Could someone chime in with whether those interfaces could be merged to trunk? - Brian Gesiak [1] Sequentially, https://reviews.llvm.org/D71898 through https://reviews.llvm.org/D71903. Reviews are welcome here as well! On Thu, Dec 26, 2019 at 9:29 AM Brian Gesiak <modocache at gmail.com> wrote:> > Hello all, > > It's been a month since my previous email on the topic, and since then > I've done some initial work on porting the coroutines passes to the > new pass manager. In total there are 6 patches -- that's a lot to > review, so allow me to introduce the changes being made in each of > them. > > # What's finished > > In these first 6 patches, I focused on lowering coroutine intrinsics > correctly. With the patches applied, Clang is able to successfully use > the new pass manager to build and test a major open source C++20 > coroutines library, https://github.com/lewissbaker/cppcoro. > > 1. https://reviews.llvm.org/D71898 > New pass manager implementation of the coro-early function pass, with > LLVM regression tests for this pass updated to test both the new and > legacy implementations. > > 2. https://reviews.llvm.org/D71899 > Same thing, but for coro-split CGSCC pass. This patch adds support to > the new pass manager for only the C++20 switch-based coroutines ABI. > I'd like to implement support for the Swift returned-continuation ABI > in a future patch. > > 3. https://reviews.llvm.org/D71900 > 4. https://reviews.llvm.org/D71901 > Same thing, but for the coro-elide and coro-cleanup function passes. > > 5. https://reviews.llvm.org/D71902 > The first 4 patches allow users to run coroutine passes by invoking, > for example 'opt -passes=coro-early'. However, most of LLVM's tests > for coroutines use 'opt -enable-coroutines', which adds all 4 > coroutines passes, in the correct order, to the pass pipeline. This > 5th patch adds a similar feature but in a way that can be used by the > new pass manager: a pipeline parser that understands 'opt > -passes=coroutines'. > > 6. https://reviews.llvm.org/D71903 > Finally, this patch modifies Clang to run the new coroutine passes > when the experimental pass manager is being used with coroutines > enabled (either via '-fcoroutines-ts' or '-std=c++2a'). With all 6 > patches applied, the cppcoro library builds and tests successfully > with a Clang built with 'ENABLE_EXPERIMENTAL_NEW_PASS_MANAGER=On'. > > # What's yet to be done: Swift "returned-continuation" coroutines & > the "devirtualization trigger" > > Two things are missing from these initial 6 patches: the first is, as > I mentioned above, support for returned-continuation coroutines, which > are used by Swift. I think this will be fairly straightforward for me > (or others, if interested) to add in an upcoming patch. > > The second missing feature has to do with how CGSCC passes are re-run > by the pass manager infrastructure. > > The legacy coro-split and coro-elide passes work in tandem: the > coro-split pass first introduces an indirect call to a dummy > "devirtualization trigger" function, and then the coro-elide pass > devirtualizes it. The legacy CGSCC pass manager checks for > devirtualizations after each pass and, if any occur, it re-runs the > CGSCC pass pipeline. In other words: coro-split is run, a check is > made for devirtualization, then coro-elide is run, and another check > is made. > > The new pass manager allows for a series of CGSCC passes to be wrapped > in a DevirtSCCRepeatedPass, but this only checks for devirtualizations > after all passes are run. In the case of coro-split and coro-elide, > the indirect function call is added and then devirtualized within a > single pass of the repeater, so DevirtSCCRepeatedPass never sees the > devirtualization and thus doesn't perform a repeat iteration. In other > words: a check is made, coro-split is run and it adds an indirect > call, coro-elide is run and it devirtualizes that call, and then > another check is made. From the repeater pass's point of view, nothing > has changed. > > This is something I'd like to tweak in future patches (my thinking is > to add a member to CGSCCUpdateResult to allow passes to manually > inform the pass manager about devirtualizations, but I'm very open to > alternative ideas). But for now, I simply have Clang manually schedule > two iterations of the coro-split pass, rather than have it rely on the > repeater detecting a devirtualization and automatically scheduling > another coro-split iteration. As a result, the new pass manager > implementation of coroutines realizes correct program behavior, but > fails to realize some of the optimization patterns that are tested for > in the regression tests in 'llvm/test/Transforms/Coroutines'. > > I'd greatly appreciate code review on my patches! Or, please reply > here with any questions/comments. > > - Brian Gesiak > > On Mon, Nov 25, 2019 at 8:39 PM Brian Gesiak <modocache at gmail.com> wrote: > > > > Hi all! > > > > I'm working on porting the LLVM passes for C++20 coroutines over to > > the new pass manager infrastructure. Of the 4 passes, 3 are function > > passes, and so porting them is straightforward and easy (my thanks to > > those involved -- the design is great!). However, I'm struggling with > > 'coro-split', which is an SCC pass. Specifically, I'd like advice on > > how to appropriately update the new pass manager's preferred > > representation of the SCC: 'llvm::LazyCallGraph'. > > > > Before I ask my specific questions about 'LazyCallGraph', it may help > > to explain my understanding of the 'coro-split' pass and how it > > modifies the call graph: > > > > The coro-split pass "clones" coroutine functions. For example, for a > > coroutine function 'foo', it creates declarations for 3 new functions > > ('foo.resume', 'foo.destroy', and 'foo.cleanup') using the static > > member function 'llvm::Function::Create'. It then uses > > 'llvm::CloneFunctionInto' to copy the 'foo' function's attributes, > > arguments, basic blocks, and instructions, into each of the 3 new > > functions. Finally, the coro-split pass replaces the entry basic > > blocks of the function to read a value from a global store of > > coroutine state called the coroutine frame, and uses that value to > > determine which part of the cloned coroutine function should be > > executed upon "resumption" of the coroutine. > > > > Of course, once the coro-split SCC pass has done all this, it must > > update LLVM's representation of the call graph. It does so using ~40 > > lines of code that you may read here: > > https://github.com/llvm/llvm-project/blob/890c6ef1fb/llvm/lib/Transforms/Coroutines/Coroutines.cpp#L186-L224 > > > > To explain my understanding of the code in the above link: the > > coro-split pass completely re-initializes the 'CallGraphSCC' object, > > using the member function > > 'CallGraphSCC::initialize(ArrayRef<CallGraphNode*>)'. The array of > > nodes it uses to re-initialize the SCC includes nodes for each of the > > 3 new functions, which it adds to the call graph using > > 'CallGraph::getOrInsertFunction'. For each of the new nodes, and for > > the node representing the original function, the coro-split pass > > iterates over each of the instructions in the function, and uses > > 'CallGraphNode::addCalledFunction' to add edges to the call graph. > > > > The difficulty I'm having here in porting coro-split to the new pass > > manager is that its SCC passes use 'LazyCallGraph' and > > 'LazyCallGraph::SCC'. These classes' documentation explains that they > > are designed with the constraint that optimization passes shall not > > delete, remove, or add edges that invalidate a bottom-up traversal of > > the SCC DAG. Unfortunately, I understand the coro-split pass to be > > doing exactly those things. > > > > As a result, if I attempt to mimic the coro-split pass's logic by > > inserting functions into the call graph using 'LazyCallGraph::get', > > and then adding call edges with > > 'LazyCallGraph::RefSCC::insertTrivialRefEdge' and > > 'LazyCallGraph::RefSCC::switchInternalEdgeToCall', I'm met with an > > assertion: llvm/lib/Analysis/CGSCCPassManager.cpp:463: [...]: > > Assertion `E && "No function transformations should introduce *new* " > > "call edges! Any new calls should be modeled as " "promoted existing > > ref edges!"' failed. > > > > Does anyone have any suggestions on how I can better work within the > > constraints of the 'LazyCallGraph'? > > > > In case it helps, you can see the code that currently hits this > > assertion here: > > https://github.com/modocache/llvm-project/commit/02c10528e9. Of > > particular interest may be the functions 'buildLazyCallGraphNode' and > > 'updateCallGraphAfterSplit'. > > > > One idea a colleague of mine suggested was to have an earlier > > coroutine function pass insert declarations of 'foo.resume', > > 'foo.destroy', and 'foo.cleanup', which we could then promote to call > > edges. However, they also found this FIXME in CGSCCPassManager.cpp: > > "We should really handle adding new calls. While it will make > > downstream usage more complex, there is no fundamental limitation and > > it will allow passes within the CGSCC to be a bit more flexible in > > what transforms they can do. Until then, we verify that new calls > > haven't been introduced." > > > > As a result, I'm now unsure whether I ought to modify coro-split's > > implementation for the new pass manager, or modify 'CGSCCPassManager' > > to allow for the insertion of new calls. Any and all advice would be > > greatly appreciated! > > > > - Brian Gesiak
Shoaib Meenai via llvm-dev
2020-Jan-21 23:49 UTC
[llvm-dev] Reviews needed for LazyCallGraph patches (and coroutines)
CCing Chandler and Fedor for feedback on the new pass manager. On 1/21/20, 3:18 PM, "llvm-dev on behalf of Brian Gesiak via llvm-dev" <llvm-dev-bounces at lists.llvm.org on behalf of llvm-dev at lists.llvm.org> wrote: Hey all, Thanks to reviews by Wenlei He and JunMa (junparser), I have a stack of patches that implement the C++20 coroutine transformations in the new LLVM pass manager. The stack builds and optimizes coroutines in major open source coroutine libraries like http://github.com/lewissbaker/cppcoro. I also use the patches in my employer's downstream fork of LLVM/Clang, and it successfully compiles large C++ codebases that make use of C++17 and coroutines. The coroutine patches exist as a stack of 6 Phabricator revisions [1]. Crucially, they make use of 3 patches related to LazyCallGraph, the call graph representation used by the new pass manager: 1. https://urldefense.proofpoint.com/v2/url?u=https-3A__reviews.llvm.org_D72025&d=DwIGaQ&c=5VD0RTtNlTh3ycd41b3MUw&r=o3kDXzdBUE3ljQXKeTWOMw&m=7kZTN83WVu9bhfqIZtRR15pR0tbwTrUmjZ3Ajjfa9wY&s=kpLjtimhsMhNjVPLzvuD8eGmq-jcSv20Uju_AdarTTE&e= (by Johannes Doerfert) 2. https://urldefense.proofpoint.com/v2/url?u=https-3A__reviews.llvm.org_D70927&d=DwIGaQ&c=5VD0RTtNlTh3ycd41b3MUw&r=o3kDXzdBUE3ljQXKeTWOMw&m=7kZTN83WVu9bhfqIZtRR15pR0tbwTrUmjZ3Ajjfa9wY&s=XRnY-8Zn_EoSKY6Wu5mmuDVf8TKU07Au19I4L2dt1kU&e= (ditto) 3. https://urldefense.proofpoint.com/v2/url?u=https-3A__reviews.llvm.org_D72226&d=DwIGaQ&c=5VD0RTtNlTh3ycd41b3MUw&r=o3kDXzdBUE3ljQXKeTWOMw&m=7kZTN83WVu9bhfqIZtRR15pR0tbwTrUmjZ3Ajjfa9wY&s=P08bstAo5XXdmeDExMssmt7dU_favQOQ2k634si0BC0&e= (this one by me) I only have moderate experience using the new pass manager, so I'd greatly appreciate a review of these patches. My coroutine patches rely on the interfaces they add to the LazyCallGraph abstraction. Could someone chime in with whether those interfaces could be merged to trunk? - Brian Gesiak [1] Sequentially, https://urldefense.proofpoint.com/v2/url?u=https-3A__reviews.llvm.org_D71898&d=DwIGaQ&c=5VD0RTtNlTh3ycd41b3MUw&r=o3kDXzdBUE3ljQXKeTWOMw&m=7kZTN83WVu9bhfqIZtRR15pR0tbwTrUmjZ3Ajjfa9wY&s=lxA19e74ycegcH6AbWGuBs13m5WDDqf5TPcTAmYCYNs&e= through https://urldefense.proofpoint.com/v2/url?u=https-3A__reviews.llvm.org_D71903&d=DwIGaQ&c=5VD0RTtNlTh3ycd41b3MUw&r=o3kDXzdBUE3ljQXKeTWOMw&m=7kZTN83WVu9bhfqIZtRR15pR0tbwTrUmjZ3Ajjfa9wY&s=xKsheT47ZKP7XdEGNXzQeNCRsCNoZYmKFXAZjYsgkfU&e= . Reviews are welcome here as well! On Thu, Dec 26, 2019 at 9:29 AM Brian Gesiak <modocache at gmail.com> wrote: > > Hello all, > > It's been a month since my previous email on the topic, and since then > I've done some initial work on porting the coroutines passes to the > new pass manager. In total there are 6 patches -- that's a lot to > review, so allow me to introduce the changes being made in each of > them. > > # What's finished > > In these first 6 patches, I focused on lowering coroutine intrinsics > correctly. With the patches applied, Clang is able to successfully use > the new pass manager to build and test a major open source C++20 > coroutines library, https://github.com/lewissbaker/cppcoro. > > 1. https://urldefense.proofpoint.com/v2/url?u=https-3A__reviews.llvm.org_D71898&d=DwIGaQ&c=5VD0RTtNlTh3ycd41b3MUw&r=o3kDXzdBUE3ljQXKeTWOMw&m=7kZTN83WVu9bhfqIZtRR15pR0tbwTrUmjZ3Ajjfa9wY&s=lxA19e74ycegcH6AbWGuBs13m5WDDqf5TPcTAmYCYNs&e= > New pass manager implementation of the coro-early function pass, with > LLVM regression tests for this pass updated to test both the new and > legacy implementations. > > 2. https://urldefense.proofpoint.com/v2/url?u=https-3A__reviews.llvm.org_D71899&d=DwIGaQ&c=5VD0RTtNlTh3ycd41b3MUw&r=o3kDXzdBUE3ljQXKeTWOMw&m=7kZTN83WVu9bhfqIZtRR15pR0tbwTrUmjZ3Ajjfa9wY&s=bRU9JkSkk6pfLJK5wQrLUNw2wyTr_UB_ax0bMBTNkX0&e= > Same thing, but for coro-split CGSCC pass. This patch adds support to > the new pass manager for only the C++20 switch-based coroutines ABI. > I'd like to implement support for the Swift returned-continuation ABI > in a future patch. > > 3. https://urldefense.proofpoint.com/v2/url?u=https-3A__reviews.llvm.org_D71900&d=DwIGaQ&c=5VD0RTtNlTh3ycd41b3MUw&r=o3kDXzdBUE3ljQXKeTWOMw&m=7kZTN83WVu9bhfqIZtRR15pR0tbwTrUmjZ3Ajjfa9wY&s=_oDoj_tqXuEzl0dGUSpTadrfW1HZRajKAAMagUc7SxE&e= > 4. https://urldefense.proofpoint.com/v2/url?u=https-3A__reviews.llvm.org_D71901&d=DwIGaQ&c=5VD0RTtNlTh3ycd41b3MUw&r=o3kDXzdBUE3ljQXKeTWOMw&m=7kZTN83WVu9bhfqIZtRR15pR0tbwTrUmjZ3Ajjfa9wY&s=tyCSqoUBLCnZ7g-6CdTdcrJc7HyAChKl-8kd72mumGs&e= > Same thing, but for the coro-elide and coro-cleanup function passes. > > 5. https://urldefense.proofpoint.com/v2/url?u=https-3A__reviews.llvm.org_D71902&d=DwIGaQ&c=5VD0RTtNlTh3ycd41b3MUw&r=o3kDXzdBUE3ljQXKeTWOMw&m=7kZTN83WVu9bhfqIZtRR15pR0tbwTrUmjZ3Ajjfa9wY&s=P093yCZSMla0daFd5wn5zpNB7s9-NlQtIPyzWL_051Q&e= > The first 4 patches allow users to run coroutine passes by invoking, > for example 'opt -passes=coro-early'. However, most of LLVM's tests > for coroutines use 'opt -enable-coroutines', which adds all 4 > coroutines passes, in the correct order, to the pass pipeline. This > 5th patch adds a similar feature but in a way that can be used by the > new pass manager: a pipeline parser that understands 'opt > -passes=coroutines'. > > 6. https://urldefense.proofpoint.com/v2/url?u=https-3A__reviews.llvm.org_D71903&d=DwIGaQ&c=5VD0RTtNlTh3ycd41b3MUw&r=o3kDXzdBUE3ljQXKeTWOMw&m=7kZTN83WVu9bhfqIZtRR15pR0tbwTrUmjZ3Ajjfa9wY&s=xKsheT47ZKP7XdEGNXzQeNCRsCNoZYmKFXAZjYsgkfU&e= > Finally, this patch modifies Clang to run the new coroutine passes > when the experimental pass manager is being used with coroutines > enabled (either via '-fcoroutines-ts' or '-std=c++2a'). With all 6 > patches applied, the cppcoro library builds and tests successfully > with a Clang built with 'ENABLE_EXPERIMENTAL_NEW_PASS_MANAGER=On'. > > # What's yet to be done: Swift "returned-continuation" coroutines & > the "devirtualization trigger" > > Two things are missing from these initial 6 patches: the first is, as > I mentioned above, support for returned-continuation coroutines, which > are used by Swift. I think this will be fairly straightforward for me > (or others, if interested) to add in an upcoming patch. > > The second missing feature has to do with how CGSCC passes are re-run > by the pass manager infrastructure. > > The legacy coro-split and coro-elide passes work in tandem: the > coro-split pass first introduces an indirect call to a dummy > "devirtualization trigger" function, and then the coro-elide pass > devirtualizes it. The legacy CGSCC pass manager checks for > devirtualizations after each pass and, if any occur, it re-runs the > CGSCC pass pipeline. In other words: coro-split is run, a check is > made for devirtualization, then coro-elide is run, and another check > is made. > > The new pass manager allows for a series of CGSCC passes to be wrapped > in a DevirtSCCRepeatedPass, but this only checks for devirtualizations > after all passes are run. In the case of coro-split and coro-elide, > the indirect function call is added and then devirtualized within a > single pass of the repeater, so DevirtSCCRepeatedPass never sees the > devirtualization and thus doesn't perform a repeat iteration. In other > words: a check is made, coro-split is run and it adds an indirect > call, coro-elide is run and it devirtualizes that call, and then > another check is made. From the repeater pass's point of view, nothing > has changed. > > This is something I'd like to tweak in future patches (my thinking is > to add a member to CGSCCUpdateResult to allow passes to manually > inform the pass manager about devirtualizations, but I'm very open to > alternative ideas). But for now, I simply have Clang manually schedule > two iterations of the coro-split pass, rather than have it rely on the > repeater detecting a devirtualization and automatically scheduling > another coro-split iteration. As a result, the new pass manager > implementation of coroutines realizes correct program behavior, but > fails to realize some of the optimization patterns that are tested for > in the regression tests in 'llvm/test/Transforms/Coroutines'. > > I'd greatly appreciate code review on my patches! Or, please reply > here with any questions/comments. > > - Brian Gesiak > > On Mon, Nov 25, 2019 at 8:39 PM Brian Gesiak <modocache at gmail.com> wrote: > > > > Hi all! > > > > I'm working on porting the LLVM passes for C++20 coroutines over to > > the new pass manager infrastructure. Of the 4 passes, 3 are function > > passes, and so porting them is straightforward and easy (my thanks to > > those involved -- the design is great!). However, I'm struggling with > > 'coro-split', which is an SCC pass. Specifically, I'd like advice on > > how to appropriately update the new pass manager's preferred > > representation of the SCC: 'llvm::LazyCallGraph'. > > > > Before I ask my specific questions about 'LazyCallGraph', it may help > > to explain my understanding of the 'coro-split' pass and how it > > modifies the call graph: > > > > The coro-split pass "clones" coroutine functions. For example, for a > > coroutine function 'foo', it creates declarations for 3 new functions > > ('foo.resume', 'foo.destroy', and 'foo.cleanup') using the static > > member function 'llvm::Function::Create'. It then uses > > 'llvm::CloneFunctionInto' to copy the 'foo' function's attributes, > > arguments, basic blocks, and instructions, into each of the 3 new > > functions. Finally, the coro-split pass replaces the entry basic > > blocks of the function to read a value from a global store of > > coroutine state called the coroutine frame, and uses that value to > > determine which part of the cloned coroutine function should be > > executed upon "resumption" of the coroutine. > > > > Of course, once the coro-split SCC pass has done all this, it must > > update LLVM's representation of the call graph. It does so using ~40 > > lines of code that you may read here: > > https://github.com/llvm/llvm-project/blob/890c6ef1fb/llvm/lib/Transforms/Coroutines/Coroutines.cpp#L186-L224 > > > > To explain my understanding of the code in the above link: the > > coro-split pass completely re-initializes the 'CallGraphSCC' object, > > using the member function > > 'CallGraphSCC::initialize(ArrayRef<CallGraphNode*>)'. The array of > > nodes it uses to re-initialize the SCC includes nodes for each of the > > 3 new functions, which it adds to the call graph using > > 'CallGraph::getOrInsertFunction'. For each of the new nodes, and for > > the node representing the original function, the coro-split pass > > iterates over each of the instructions in the function, and uses > > 'CallGraphNode::addCalledFunction' to add edges to the call graph. > > > > The difficulty I'm having here in porting coro-split to the new pass > > manager is that its SCC passes use 'LazyCallGraph' and > > 'LazyCallGraph::SCC'. These classes' documentation explains that they > > are designed with the constraint that optimization passes shall not > > delete, remove, or add edges that invalidate a bottom-up traversal of > > the SCC DAG. Unfortunately, I understand the coro-split pass to be > > doing exactly those things. > > > > As a result, if I attempt to mimic the coro-split pass's logic by > > inserting functions into the call graph using 'LazyCallGraph::get', > > and then adding call edges with > > 'LazyCallGraph::RefSCC::insertTrivialRefEdge' and > > 'LazyCallGraph::RefSCC::switchInternalEdgeToCall', I'm met with an > > assertion: llvm/lib/Analysis/CGSCCPassManager.cpp:463: [...]: > > Assertion `E && "No function transformations should introduce *new* " > > "call edges! Any new calls should be modeled as " "promoted existing > > ref edges!"' failed. > > > > Does anyone have any suggestions on how I can better work within the > > constraints of the 'LazyCallGraph'? > > > > In case it helps, you can see the code that currently hits this > > assertion here: > > https://github.com/modocache/llvm-project/commit/02c10528e9. Of > > particular interest may be the functions 'buildLazyCallGraphNode' and > > 'updateCallGraphAfterSplit'. > > > > One idea a colleague of mine suggested was to have an earlier > > coroutine function pass insert declarations of 'foo.resume', > > 'foo.destroy', and 'foo.cleanup', which we could then promote to call > > edges. However, they also found this FIXME in CGSCCPassManager.cpp: > > "We should really handle adding new calls. While it will make > > downstream usage more complex, there is no fundamental limitation and > > it will allow passes within the CGSCC to be a bit more flexible in > > what transforms they can do. Until then, we verify that new calls > > haven't been introduced." > > > > As a result, I'm now unsure whether I ought to modify coro-split's > > implementation for the new pass manager, or modify 'CGSCCPassManager' > > to allow for the insertion of new calls. Any and all advice would be > > greatly appreciated! > > > > - Brian Gesiak _______________________________________________ LLVM Developers mailing list llvm-dev at lists.llvm.org https://urldefense.proofpoint.com/v2/url?u=https-3A__lists.llvm.org_cgi-2Dbin_mailman_listinfo_llvm-2Ddev&d=DwIGaQ&c=5VD0RTtNlTh3ycd41b3MUw&r=o3kDXzdBUE3ljQXKeTWOMw&m=7kZTN83WVu9bhfqIZtRR15pR0tbwTrUmjZ3Ajjfa9wY&s=2GifqvSsdIjIaqSixL5HUqvVF_46BH1ckSo3b9Qieak&e=
Possibly Parallel Threads
- Suggestions for how coroutines and UBSan codegen can play nice with one another?
- Suggestions for how coroutines and UBSan codegen can play nice with one another?
- RFC: LLVM Coroutine Representation, Round 2
- RFC: Coroutine Optimization Passes
- RFC: Coroutine Optimization Passes