Hi Eli: Thank you very much for your comments!>> If you need some sort of unusual control flow construct, make it a >> proper terminator instructionI would love to. I was going by the advice in "docs/ExtendingLLVM.rst": "WARNING: Adding instructions changes the bitcode format, and it will take some effort to maintain compatibility with the previous version. Only add an instruction if it is absolutely necessary. Having coro.fork and coro.suspend to be proper terminators (as they are) will be much cleaner. Something like: corobegin to label %start suspend label %retblock corosuspend [final] [save %token] resume label %resume cleanup label %cleanup I did consider "repurposing" 'invoke', but felt it is too much of an abuse. If there is a support for making corobegin/corosuspend instructions, I would love to do it. Also, if there is a support for invoke abuse :-), I will be glad to try it out and come back with the report on how it went.>> Maybe you could reduce the amount of magic involved by splitting the >> initialization into a separate function? So the initialization is always >> just something like "f(int x) { frame = init(); f_inner(x, frame); return >> frame; }", and you don't have to deal with the weirdness of fork().I think outlining the "start" part will be sufficient (starting at the coro.fork and cutting all of the branches ending in coro.end). In this case llvm won't see coro.fork and coro.end as the clang will do the outlining work. Something like T f(int x, A y) { auto mem = coro.elide(); if (!mem) mem = malloc(llvm.frame.size(&f_inner)); auto hdl = coro.init(mem); T result = <whatever>; f_inner(hdl, x, &y); return result; } f will be a normal function and f_inner will be magic function that will require to be split at suspend points. It will be processed in the IPO order, so by the time f_inner is considered for inlining into f, it will be a normal function. I think this can work, but we are trading off one complexity for another. As presented: coroutine is defined as a magic function (that has weird intrinsics) and requires splitting during IPO With coro.fork / coro.end removed coroutine is defined as a combination of normal function + magic function that requires splitting during IPO. If there are other languages that acquire coroutines they would need to have the same outlining pass as clang. I need to think about it more, but, I am currently leaning toward the first option (with coro.fork and coro.end). The "f_inner" is still part of the f, but nicely delineated with corobegin and coro.end. corobegin to label %coro.start suspend label %retblock corosuspend [final] [save %token] resume label %resume cleanup label %cleanup call void @llvm.coro.end(); Does it look better? Gor On Thu, Jun 9, 2016 at 1:33 AM, Eli Friedman <eli.friedman at gmail.com> wrote:> On Wed, Jun 8, 2016 at 10:57 PM, Gor Nishanov via llvm-dev > <llvm-dev at lists.llvm.org> wrote: >> >> Hi all: >> >> Below is a proposal to add experimental coroutine support to LLVM. Though >> this >> proposal is motivated primarily by the desire to support C++ Coroutines >> [1], >> the llvm representation is language neutral and can be used to support >> coroutines in other languages as well. > > > I looked over the proposal... a few comments: > > 1. This adds a lot of intrinsics, and they have weird semantics. This is > bad in general; the more weirdness you add to the IR, the more you'll > struggle to make the optimizer actually understand what you're doing. And > magic pattern matching *will* break. You've already pointed out the issue > with branches. (It's actually worse than you think: there isn't any > guarantee there will even be a branch when you've thrown the optimizer at > it.) The save intrinsic says "Its return value should be consumed by > exactly one `coro.suspend` intrinsic.", but it might have zero uses by the > time your lowering pass runs. You could end up with code getting inserted > before llvm.experimental.coro.fork. > 2. If you need some sort of unusual control flow construct, make it a proper > terminator instruction; don't try to glue your intrinsic to a normal branch > instruction. A new kind of terminator instruction might be appropriate. Or > maybe you can make "invoke" work for you. > 3. Maybe you could reduce the amount of magic involved by splitting the > initialization into a separate function? So the initialization is always > just something like "f(int x) { frame = init(); f_inner(x, frame); return > frame; }", and you don't have to deal with the weirdness of fork(). > > -Eli
On Thu, Jun 9, 2016 at 10:16 AM, Gor Nishanov <gornishanov at gmail.com> wrote:> Hi Eli: > > Thank you very much for your comments! > > >> If you need some sort of unusual control flow construct, make it a > >> proper terminator instruction > > I would love to. I was going by the advice in "docs/ExtendingLLVM.rst": > > "WARNING: Adding instructions changes the bitcode format, and it > will > take some effort to maintain compatibility with the previous > version. > Only add an instruction if it is absolutely necessary. >The point of this is that we want to strongly encourage people to look for other solutions first... but on the other hand this isn't a blanket ban on new instructions; pushing intrinsics too far consistently caused trouble for exception handling in LLVM for years. I need to think about it more, but, I am currently leaning toward the first> option (with coro.fork and coro.end). The "f_inner" is still part of the f, > but nicely delineated with corobegin and coro.end. > > corobegin to label %coro.start > suspend label %retblock > > corosuspend [final] [save %token] resume label %resume > cleanup label %cleanup > > call void @llvm.coro.end(); > > Does it look better? >Yes, that seems fine. There's still the potential for non-initialization instructions sneaking into the initialization block, but you can probably handle it somehow. That said, thinking about it a bit more, I'm not really sure why you need to tie the suspend call to the branch in the first place. What exactly is your algorithm doing that requires it? I mean, a naive implementation of CoroSplit based on your llvm.experimental.coro.suspend intrinsic clones the whole function, replaces the result of suspend calls with "true" in one version and "false" in the other, and runs SimplifyCFG to kill the dead code. And even with a an algorithm that tries to be more clever, you probably need to do some cloning anyway: presumably frontends will want to share code between the destroy and unwinding codepaths. -Eli -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20160609/c0d22a76/attachment.html>
Hi Eli:>> corobegin to label %coro.start >> suspend label %retblock >> >> corosuspend [final] [save %token] >> resume label %resume >> cleanup label %cleanup> Yes, that seems fine. There's still the potential for non-initialization > instructions sneaking into the initialization block, but you can probably > handle it somehow.I don't mind non-initialization instructions sneaking in at all. I want them to. All of the code that runs until it hits the suspends can stay in `f`. Only post suspend code need to go to `f.resume` and `f.destroy`. For example, consider fire and forget coroutine: void f(int n) { while (n-- > 0) { do1(); <suspend> -- will subscribe to some async continuation do2(); } } Post split, it will look something like this: struct f.frame { Fn* ResumeFn, Fn* DestroyFn, int n }; void f(int n) { f.frame* state = <init-stuff>; state->n = n; if (state->n-- > 0) do1(); else <destroy> } void f.resume(f.frame* state) { do2(); if (state->n-- > 0) do1(); else <destroy-state> } void f.destroy(f.frame* state) { <destroy-state> }> That said, thinking about it a bit more, I'm not really sure why you need to > tie the suspend call to the branch in the first place.I am not sure I understand the question. Suspend intrinsic is replaced with a branch to a return block in the 'f' and with 'ret void' in resume.> What exactly is your algorithm doing that requires it? I mean, a naive > implementation of CoroSplit based on your llvm.experimental.coro.suspend > intrinsic clones the whole function, replaces the result of suspend calls > with "true" in one version and "false" in the other, and runs SimplifyCFG > to kill the dead code.Very close. We do clone function body twice, once for resume and once for destroy. We make a new entry block with a switch instruction that will branch to all resume branches in `f.resume` and to all cleanup branches in `f.destroy` and then let SimplifyCFG to remove the rest. Changing the subject a little bit. I was thinking we can get rid of the coro.fork altogether it we add a third label to the corosuspend. Namely: corosuspend [final] [save %token] to label %return.block resume label %resume cleanup label %cleanup corosuspend is lowered as follows: in 'f': corosuspend is replaced with `br %return.block` in 'f.resume': add a new entry block with a switch jumping to all resume blocks corosuspend is replaced with `ret void` in 'f.destroy': add a new entry block with a switch jumping to all cleanup blocks corosuspend is replaced with `ret void` I think this makes understanding of the model clearer. The only negative side to a corosuspend with three branching targets is that it is very likely that to label block will be the same in all corosuspend's thus wasting valuable bits. :-) Gor P.S. Return block in 'f' may contain a lot of stuff, like destructors for parameters passed by value (if ABI requires), possibly some code related to return value. `f.resume` does not have any of that, so the return block is just: return.block: ret void P.P.S I did consider making resume part return something other than void. One scheme would be: i1 false - at final suspend point (or destroyed) i1 true - can be resumed again If that case: corosuspend => ret i1 true corosuspend final => ret i1 false control flow reaches the end of the `f.resume` ret i1 false. I wasn't sure that it is a clear win, so I went with 'void' as a return value.