thr3ads.net - llvm dev - [llvm-dev] [RFC] LLVM Coroutines [Jun 2016]

If this information is useful, please help other people find it:
Share via:

Eli Friedman via llvm-dev

2016-Jun-10 23:08 UTC

[llvm-dev] [RFC] LLVM Coroutines

On Fri, Jun 10, 2016 at 6:36 AM, Gor Nishanov <gornishanov at gmail.com>
wrote:
> >> If you're going down that route, that still leaves the
question of the
> >> semantics of the fork intrinsic... thinking about it a bit more, I
think
> >> you're going to run into problems with trying to keep around a
return
> block
> >> through optimizations:
>
> Thinking about it a bit more, it is even simpler, I don't have to
involve
> coro.end in this case at all and should not worry about branches, RAUW-ing
> coro.suspend with an appropriate constant takes care of it all.
>
> Here is an updated version.
> 1) coro.fork is gone, gone gone!!!!
> 2) coro.suspend is changed as follows:
>
>     declare i8 coro.suspend([...])
>
>     returns:
>       0 - in resume clone
>       1 - in destroy clone
>       -1 - in the original function
>
> Usage:
>
>     [...]
>     %0 = call i8 @llvm.coro.suspend([...])
>     switch i8 %0, label %return [ i32 0, label %resume,
>                                   i32 1, label %cleanup]
>
> Lowering of coro.suspend X
>
>   In the original function:
>     * RAUW coro.suspend to -1
>     * Remove coro.suspend
>
>   In both clones:
>     * RAUW coro.suspend to 0 in `f.resume` and 1 in `f.destroy`
>     * Split block after coro.suspend, new block becomes the resume label X
>     * replace coro.suspend with
>       - `ret void` in resume clone
>       - @llvm.trap() in destroy clone
>         (in destroy clone none of the suspends points should be reachable,
>          if they are it is a front end bug)
>
> Much prettier than before :-)
>
>
>I'm not sure this quite works the way you want it to in terms of the
optimizer.  For example:

block1:
suspend
switch (resume, destroy, return)

resume:
store zero to global @g
[...]

destroy:
store zero to global @g
[...]

return:
store zero to global @g
[...]

Naively, you would expect that it would be legal to hoist the store... but
that breaks your coroutine semantics because the global could be
mutated between
the first return and the resume.

Probably the most natural way to solve this is the two-function approach I
suggested earlier... that way, the first call to suspend isn't special, so
everything just works.

There might be some other way to solve it, but I think it just becomes more
awkward. For example, you could add a boolean that tracks whether the
function has been suspended yet, and explicitly reroute the control flow to
conditionally perform one-time operations before each call to suspend.  But
then you need a giant switch statement or a separate one-time cleanup
function to actually route the control flow correctly.

-Eli
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20160610/4344a44c/attachment.html>

Gor Nishanov via llvm-dev

2016-Jun-11 00:25 UTC

head link

[llvm-dev] [RFC] LLVM Coroutines

Hi Eli:
>> Naively, you would expect that it would be legal to hoist the store...
>> but that breaks your coroutine semantics because the global could be
mutated
>> between the first return and the resume.
Hmmm... I don't see the problem. I think hoisting the store is perfectly
legal
transformation with all semantics preserved.

Let's look at your example:
>> block1:
>>   suspend
>>   switch (resume, destroy, return)
>>
>> resume:
>>   store zero  to global @g
>>   doA()
>>   [...]
>>
>> destroy:
>>   store zero  to global @g
>>   doB()
>>   [...]
>>
>> return:
>>   store zero to global @g
>>   doC
>>   [...]
As written the behavior is:

  1) when we encounter a suspend during the first pass through the function,
     store 0 to @g and doC()

  2) when we encounter a suspend after coroutine was resumed
     ret void

  3) When coroutine is resumed:
     store 0 to @g and doA()

  4) When coroutine is destroyed:
     store 0 to @g and doB()

If this is a coroutine that can be resumed asynchronously from a different
thread there is a data race. For example, if resume happens before 'f'
returns,
doA() can write something useful to @g, and 'f' will clobber it with
zero.
But, this race is already present in the code and is not introduced by LLVM.

Let's hoist the store and see what happens:
>> block1:
>>   suspend
>>   store zero  to global @g
>>   switch (resume, destroy, return)
>>
>> resume:
>>   doA()
>>   [...]
>>
>> destroy:
>>   doB()
>>   [...]
>>
>> return:
>>   doC()
>>   [...]
Now, let's split it up:
  1. RAUW coro.suspend with -1,0,1 in f, f.resume and f.destroy
  2. remove coro.suspend in f, replace with ret void in f.resume

void f() {
  [...]
  store zero  to global @g
  doC();
  [...]
}

void @f.resume() {
entry:
  store zero  to global @g
  doA();
  [....]
}

void @f.destroy() {
entry:
  store zero  to global @g
  doB();
  [....]
}

Behavior looks exactly as before. What am I missing?

Thank you,
Gor

P.S.

Eli, thank you very much for your input. I struggled with the coro.fork for
months and in just a few days here it seems to be getting so much better!

Eli Friedman via llvm-dev

2016-Jun-11 02:40 UTC

head link

[llvm-dev] [RFC] LLVM Coroutines

On Fri, Jun 10, 2016 at 5:25 PM, Gor Nishanov <gornishanov at gmail.com>
wrote:
> Hi Eli:
>
> >> Naively, you would expect that it would be legal to hoist the
store...
> >> but that breaks your coroutine semantics because the global could
be
> mutated
> >> between the first return and the resume.
>
> Hmmm... I don't see the problem. I think hoisting the store is
perfectly
> legal
> transformation with all semantics preserved.
>
> Let's look at your example:
>
> >> block1:
> >>   suspend
> >>   switch (resume, destroy, return)
> >>
> >> resume:
> >>   store zero  to global @g
> >>   doA()
> >>   [...]
> >>
> >> destroy:
> >>   store zero  to global @g
> >>   doB()
> >>   [...]
> >>
> >> return:
> >>   store zero to global @g
> >>   doC
> >>   [...]
>
> As written the behavior is:
>
>   1) when we encounter a suspend during the first pass through the
> function,
>      store 0 to @g and doC()
>
>   2) when we encounter a suspend after coroutine was resumed
>      ret void
>
>   3) When coroutine is resumed:
>      store 0 to @g and doA()
>
>   4) When coroutine is destroyed:
>      store 0 to @g and doB()
>
> If this is a coroutine that can be resumed asynchronously from a different
> thread there is a data race. For example, if resume happens before
'f'
> returns,
> doA() can write something useful to @g, and 'f' will clobber it
with zero.
> But, this race is already present in the code and is not introduced by
> LLVM.
>
> Let's hoist the store and see what happens:
>
> >> block1:
> >>   suspend
> >>   store zero  to global @g
> >>   switch (resume, destroy, return)
> >>
> >> resume:
> >>   doA()
> >>   [...]
> >>
> >> destroy:
> >>   doB()
> >>   [...]
> >>
> >> return:
> >>   doC()
> >>   [...]
>
> Now, let's split it up:
>   1. RAUW coro.suspend with -1,0,1 in f, f.resume and f.destroy
>   2. remove coro.suspend in f, replace with ret void in f.resume
>
> void f() {
>   [...]
>   store zero  to global @g
>   doC();
>   [...]
> }
>
> void @f.resume() {
> entry:
>   store zero  to global @g
>   doA();
>   [....]
> }
>
> void @f.destroy() {
> entry:
>   store zero  to global @g
>   doB();
>   [....]
> }
>
> Behavior looks exactly as before. What am I missing?
>
>Err... I guess nothing; sorry, I got myself confused.  Essentially
executing a return statement, then jumping back, seems wrong, but I'm
having trouble coming up with a good example of how it would actually break
something.  I guess there aren't really any issues on a pure control-flow
basis: you can't execute a global side-effect a different number of times
than it would otherwise happen.

You might run into problems with allocas: LLVM's optimizer will assume the
lifetime of any alloca in the current function ends when you hit a
"ret"
instruction, so jumping back from the ret to the middle of the function
could lead to wrong results.  Silly example:

x = alloca...

block1:
  suspend
  ; increment could get moved here.
  switch (resume, destroy, return)

resume:
  x += 1
   [...]

destroy:
  x += 1
   [...]

return:
  (don't touch x)
  [...]

The increment is only supposed to execute once, but instead executes twice.

This isn't a great example... and I'm not sure issues are limited to
allocas... but that's the idea, anyway.

-Eli
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20160610/39f19df7/attachment.html>

llvm dev - Jun 2016 - [RFC] LLVM Coroutines

[llvm-dev] [RFC] LLVM Coroutines

[llvm-dev] [RFC] LLVM Coroutines

[llvm-dev] [RFC] LLVM Coroutines