thr3ads.net - llvm dev - [llvm-dev] Fwd: [RFC] LLVM Coroutines [Jun 2016]

If this information is useful, please help other people find it:
Share via:

Gor Nishanov via llvm-dev

2016-Jun-09 17:16 UTC

[llvm-dev] Fwd: [RFC] LLVM Coroutines

Hi Eli:

Thank you very much for your comments!
>> If you need some sort of unusual control flow construct, make it a
>> proper terminator instruction
I would love to. I was going by the advice in
"docs/ExtendingLLVM.rst":

       "WARNING: Adding instructions changes the bitcode format, and it
will
        take some effort to maintain compatibility with the previous version.
        Only add an instruction if it is absolutely necessary.

Having coro.fork and coro.suspend to be proper terminators (as they are) will
be much cleaner. Something like:

  corobegin to label %start
         suspend label %retblock

   corosuspend [final] [save %token]
           resume label %resume
            cleanup label %cleanup

I did consider "repurposing" 'invoke', but felt it is too much
of an abuse.
If there is a support for making corobegin/corosuspend instructions, I would
love to do it. Also, if there is a support for invoke abuse :-), I will be
glad to try it out and come back with the report on how it went.
>> Maybe you could reduce the amount of magic involved by splitting the
>> initialization into a separate function?  So the initialization is
always
>> just something like "f(int x) { frame = init(); f_inner(x, frame);
return
>> frame; }", and you don't have to deal with the weirdness of
fork().
I think outlining the "start" part will be sufficient (starting at the
coro.fork
and cutting all of the branches ending in coro.end). In this case llvm won't
see
coro.fork and coro.end as the clang will do the outlining work. Something like

 T f(int x, A y) {
   auto mem = coro.elide();
   if (!mem) mem = malloc(llvm.frame.size(&f_inner));
  auto hdl = coro.init(mem);
  T result = <whatever>;

  f_inner(hdl, x, &y);

  return result;
 }

f will be a normal function and f_inner will be magic function that will require
to be split at suspend points. It will be processed in the IPO order, so by the
time f_inner is considered for inlining into f, it will be a normal function.

I think this can work, but we are trading off one complexity for another.

As presented:

 coroutine is defined as a magic function (that has weird intrinsics) and
 requires splitting during IPO

With coro.fork / coro.end removed

 coroutine is defined as a combination of normal function + magic function that
 requires splitting during IPO. If there are other languages that acquire
 coroutines they would need to have the same outlining pass as clang.

I need to think about it more, but, I am currently leaning toward the first
option (with coro.fork and coro.end). The "f_inner" is still part of
the f,
but nicely delineated with corobegin and coro.end.

  corobegin to label %coro.start
         suspend label %retblock

   corosuspend [final] [save %token] resume label %resume
            cleanup label %cleanup

    call void @llvm.coro.end();

Does it look better?

Gor

On Thu, Jun 9, 2016 at 1:33 AM, Eli Friedman <eli.friedman at gmail.com>
wrote:> On Wed, Jun 8, 2016 at 10:57 PM, Gor Nishanov via llvm-dev
> <llvm-dev at lists.llvm.org> wrote:
>>
>> Hi all:
>>
>> Below is a proposal to add experimental coroutine support to LLVM.
Though
>> this
>> proposal is motivated primarily by the desire to support C++ Coroutines
>> [1],
>> the llvm representation is language neutral and can be used to support
>> coroutines in other languages as well.
>
>
> I looked over the proposal... a few comments:
>
> 1. This adds a lot of intrinsics, and they have weird semantics.  This is
> bad in general; the more weirdness you add to the IR, the more you'll
> struggle to make the optimizer actually understand what you're doing. 
And
> magic pattern matching *will* break.  You've already pointed out the
issue
> with branches.  (It's actually worse than you think: there isn't
any
> guarantee there will even be a branch when you've thrown the optimizer
at
> it.)  The save intrinsic says "Its return value should be consumed by
> exactly one `coro.suspend` intrinsic.", but it might have zero uses by
the
> time your lowering pass runs.  You could end up with code getting inserted
> before llvm.experimental.coro.fork.
> 2. If you need some sort of unusual control flow construct, make it a
proper
> terminator instruction; don't try to glue your intrinsic to a normal
branch
> instruction.  A new kind of terminator instruction might be appropriate. 
Or
> maybe you can make "invoke" work for you.
> 3. Maybe you could reduce the amount of magic involved by splitting the
> initialization into a separate function?  So the initialization is always
> just something like "f(int x) { frame = init(); f_inner(x, frame);
return
> frame; }", and you don't have to deal with the weirdness of
fork().
>
> -Eli

Eli Friedman via llvm-dev

2016-Jun-09 18:29 UTC

head link

[llvm-dev] Fwd: [RFC] LLVM Coroutines

On Thu, Jun 9, 2016 at 10:16 AM, Gor Nishanov <gornishanov at gmail.com>
wrote:
> Hi Eli:
>
> Thank you very much for your comments!
>
> >> If you need some sort of unusual control flow construct, make it a
> >> proper terminator instruction
>
> I would love to. I was going by the advice in
"docs/ExtendingLLVM.rst":
>
>        "WARNING: Adding instructions changes the bitcode format, and
it
> will
>         take some effort to maintain compatibility with the previous
> version.
>         Only add an instruction if it is absolutely necessary.
>
The point of this is that we want to strongly encourage people to look for
other solutions first... but on the other hand this isn't a blanket ban on
new instructions; pushing intrinsics too far consistently caused trouble
for exception handling in LLVM for years.

I need to think about it more, but, I am currently leaning toward the
first> option (with coro.fork and coro.end). The "f_inner" is still part
of the f,
> but nicely delineated with corobegin and coro.end.
>
>   corobegin to label %coro.start
>          suspend label %retblock
>
>    corosuspend [final] [save %token] resume label %resume
>             cleanup label %cleanup
>
>     call void @llvm.coro.end();
>
> Does it look better?
>
Yes, that seems fine.  There's still the potential for non-initialization
instructions sneaking into the initialization block, but you can probably
handle it somehow.

That said, thinking about it a bit more, I'm not really sure why you need
to tie the suspend call to the branch in the first place.  What exactly is
your algorithm doing that requires it?  I mean, a naive implementation of
CoroSplit based on your llvm.experimental.coro.suspend intrinsic clones the
whole function, replaces the result of suspend calls with "true" in
one
version and "false" in the other, and runs SimplifyCFG to kill the
dead
code.  And even with a an algorithm that tries to be more clever, you
probably need to do some cloning anyway: presumably frontends will want to
share code between the destroy and unwinding codepaths.

-Eli
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20160609/c0d22a76/attachment.html>

Gor Nishanov via llvm-dev

2016-Jun-09 20:17 UTC

head link

[llvm-dev] Fwd: [RFC] LLVM Coroutines

Hi Eli:
>>  corobegin to label %coro.start
>>       suspend label %retblock
>>
>>  corosuspend [final] [save %token]
>>      resume label %resume
>>      cleanup label %cleanup
> Yes, that seems fine.  There's still the potential for
non-initialization
> instructions sneaking into the initialization block, but you can probably
> handle it somehow.
I don't mind non-initialization instructions sneaking in at all. I want them
to.
All of the code that runs until it hits the suspends can stay in `f`.
Only post suspend code need to go to `f.resume` and `f.destroy`.

For example, consider fire and forget coroutine:

  void f(int n) {
    while (n-- > 0) {
      do1();
      <suspend> -- will subscribe to some async continuation
      do2();
    }
  }

Post split, it will look something like this:

  struct f.frame { Fn* ResumeFn, Fn* DestroyFn, int n };

  void f(int n) {
    f.frame* state = <init-stuff>;
    state->n = n;
    if (state->n-- > 0)
      do1();
    else
      <destroy>
  }

  void f.resume(f.frame* state) {
     do2();
     if (state->n-- > 0)
       do1();
     else
       <destroy-state>
  }

  void f.destroy(f.frame* state) { <destroy-state> }
> That said, thinking about it a bit more, I'm not really sure why you
need to
> tie the suspend call to the branch in the first place.
I am not sure I understand the question. Suspend intrinsic is replaced with
a branch to a return block in the 'f' and with 'ret void' in
resume.
> What exactly is your algorithm doing that requires it?  I mean, a naive
> implementation of CoroSplit based on your llvm.experimental.coro.suspend
> intrinsic clones the whole function, replaces the result of suspend calls
> with "true" in one version and "false" in the other,
and runs SimplifyCFG
> to kill the dead code.
Very close. We do clone function body twice, once for resume and once for
destroy. We make a new entry block with a switch instruction that will branch to
all resume branches in `f.resume` and to all cleanup branches in `f.destroy` and
then let SimplifyCFG to remove the rest.

Changing the subject a little bit. I was thinking we can get rid of the
coro.fork altogether it we add a third label to the corosuspend.

Namely:

  corosuspend [final] [save %token] to label %return.block
    resume label %resume
    cleanup label %cleanup

corosuspend is lowered as follows:
  in 'f': corosuspend is replaced with `br %return.block`

  in 'f.resume':
    add a new entry block with a switch jumping to all resume blocks
    corosuspend is replaced with `ret void`

  in 'f.destroy':
    add a new entry block with a switch jumping to all cleanup blocks
    corosuspend is replaced with `ret void`

I think this makes understanding of the model clearer. The only negative side
to a corosuspend with three branching targets is that it is very likely that
to label block will be the same in all corosuspend's thus wasting valuable
bits. :-)

Gor

P.S.

Return block in 'f' may contain a lot of stuff, like destructors for
parameters
passed by value (if ABI requires), possibly some code related to return value.

`f.resume` does not have any of that, so the return block is just:

return.block:
  ret void

P.P.S

I did consider making resume part return something other than void. One scheme
would be:

i1 false - at final suspend point (or destroyed)
i1 true - can be resumed again

If that case:
  corosuspend => ret i1 true
  corosuspend final => ret i1 false
  control flow reaches the end of the `f.resume` ret i1 false.

I wasn't sure that it is a clear win, so I went with 'void' as a
return value.

llvm dev - Jun 2016 - Fwd: [RFC] LLVM Coroutines

[llvm-dev] Fwd: [RFC] LLVM Coroutines

[llvm-dev] Fwd: [RFC] LLVM Coroutines

[llvm-dev] Fwd: [RFC] LLVM Coroutines