thr3ads.net - llvm dev - [LLVMdev] [WinEH] Cloning blocks that reference non-cloned PHI nodes [Apr 2015]

If this information is useful, please help other people find it:
Share via:

Kaylor, Andrew

2015-Apr-10 23:36 UTC

[LLVMdev] [WinEH] Cloning blocks that reference non-cloned PHI nodes

Just as an off-the-cuff idea that I haven’t thought through at all, what would
you think of the possibility of replacing this:

  %12 = call i32 @llvm.eh.typeid.for(i8* bitcast (%eh.CatchHandlerType*
@llvm.eh.handlertype.H.0 to i8*))
  %matches7 = icmp eq i32 %sel6, %12

with this:

  %matches7 = call i32 @llvm.eh.cmp.selector(i32 %sel6, i8* bitcast
(%eh.CatchHandlerType* @llvm.eh.handlertype.H.0 to i8*))

If we did that, the outlining code could just assume %sel6 was the current
selector (because we don’t clone into nested landing pads) and we wouldn’t need
to track it at all.

-Andy

From: Reid Kleckner [mailto:rnk at google.com]
Sent: Friday, April 10, 2015 4:06 PM
To: Kaylor, Andrew
Cc: David Majnemer <david.majnemer at gmail.com> (david.majnemer at
gmail.com); LLVM Developers Mailing List
Subject: Re: [WinEH] Cloning blocks that reference non-cloned PHI nodes

Sounds reasonable. Honestly, we've been having some serious whiteboard
discussions about how to get out of doing all this selector pattern matching. I
think that's basically the most fragile part of winehprepare, and if we can
add some new constructs that wrap that up in something opaque to mid-level
passes, that would be great.

Unfortunately, I haven't really been able to get enough consensus to write
something down. If I can't convince my coworkers that we need a new
construct in person, there's not much point in writing up a detailed RFC. :(

Here's the gist of the new idea:
- Leave invoke alone, too many optimizations care about it
- Add a new 'landingswitch' (name under debate) instruction that fills
the landingpad role and is also a terminator
- landingswitch clauses have some fixed operands (RTTI, handler label) and some
variable number of operands as required by the EH personality (catch object,
adjectives)
- Allow the 'resume' instruction to have 0, 1, or 2 successors, one of
which can be a landingpad.
- Add a 'resume' clause to landingswitches to allow them to continue the
unwind process at another landingpad in the same function

I've tried to sell this idea, but I'm getting push back suggesting that
we fix it a different way. The most compelling suggestion is to have the
existing landingpad instruction produce an i8* selector instead of an i32
selector, and then we indirectbr on it. This would save the selector pattern
matching that we do today, but I don't particularly like it. Optimization
passes can do basic block commoning, tail merging, and hoist instructions
between the landingpad and the indirectbr, and I don't like dealing with
that.

I also don't like it because we'd need to teach the inliner about
indirectbr, which today it refuses to inline.

On Fri, Apr 10, 2015 at 3:45 PM, Kaylor, Andrew <andrew.kaylor at
intel.com<mailto:andrew.kaylor at intel.com>> wrote:
Hi Reid and David,

I just wanted to give you a heads up that I’m currently working on a problem
where the WinEHPrepare outlining code stumbles and asserts while cloning code
that references extracted landing pad values via PHI nodes that are in
intermediate blocks that aren’t being cloned.  The test I’m looking at fails
with an assertion claiming that llvm.eh.begincatch was called twice inside a
handler.

I have an idea for how to address this (giving the cloning director a chance to
replace operands before an instruction is remapped).  I’ll put something up for
review as soon as I have it working.  In the mean time I wanted to make sure you
weren’t working on the same problem.

I’m seeing it with some C++ EH tests I’m working on, but I think it may be
possible for it to happen with SEH cleanup.

If you’re curious, here’s a reproducer.

int main(void) {
  try {
    try {
      throw 'a';
    } catch (char c) {
      printf("%c\n", c);
    }
    throw 1;
  } catch(int x) {
    printf("%d\n", x);
  } catch(...) {
    printf("...\n");
  }

  try {
    try {
      throw 'b';
    } catch (char c) {
      printf("%c\n", c);
    }
    throw 2;
  } catch(int x) {
    printf("%d\n", x);
  } catch (char c) {
    printf("%c\n", c);
  } catch(...) {
    printf("...\n");
  }
  return 0;
}

-Andy


-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20150410/623e5180/attachment.html>

Reid Kleckner

2015-Apr-11 00:00 UTC

head link

[LLVMdev] [WinEH] Cloning blocks that reference non-cloned PHI nodes

Yeah, wrapping up the whole selector comparison into one intrinsic would
help. Thanks for the idea. :)

The current strategy also seems pretty fragile around complex cleanups. If
the cleanup isn't some simple destructor call with easily traceable
unconditional branch control flow, we get lost and can't find our way to
the next selector comparison. Having an explicit 'resume' or some other
intrinsic at the end of the cleanup would be pretty helpful here.

---

We also talked a lot about making WinEHPrepare try to less work all at
once, and split it into phases. This is separable from the landingswitch
proposal, which is more about making it easier to identify handler start
points. Here's a sketch of the phases we came up with:

Step 1: Identify selector dispatch conditional branches. Identify all
cleanup actions, and split basic blocks to make them start and end on a
basic block boundary. Build a list of all basic blocks that start a handler.

Step 2: Do a forwards-backwards dataflow analysis to mark basic blocks as
reachable and unreachable from each handler start block. The forwards part
is an easy DFS, the backwards part is basically about pruning unreachable
blocks like 'ret' from a handler. We can probably skip the backwards
part
initially.

Step 3: Clone all blocks that are reachable from more than one handler
until all blocks belong to exactly one handler, and sink code from
landingpads back into handlers if necessary. We don't have many code
commoning optimizations, so this step is more of a defense against
theoretical future optimization passes that do more than tail merging.

Step 4: Demote all SSA values live from one handler into or out of another.
The only SSA values referenceable across handlers are static allocas.

Step 5: Extract the blocks belonging to each handler into their own
functions. Rewrite static alloca references in each new handler to use
llvm.framerecover.

Step 6: Profit?

Does this seem like a reasonable plan? IMO simplifying the overarching
design is a higher priority right now than fixing individual test cases.

On Fri, Apr 10, 2015 at 4:36 PM, Kaylor, Andrew <andrew.kaylor at
intel.com>
wrote:
>  Just as an off-the-cuff idea that I haven’t thought through at all, what
> would you think of the possibility of replacing this:
>
>
>
>   %12 = call i32 @llvm.eh.typeid.for(i8* bitcast (%eh.CatchHandlerType*
> @llvm.eh.handlertype.H.0 to i8*))
>
>   %matches7 = icmp eq i32 %sel6, %12
>
>
>
> with this:
>
>
>
>   %matches7 = call i32 @llvm.eh.cmp.selector(i32 %sel6, i8* bitcast
> (%eh.CatchHandlerType* @llvm.eh.handlertype.H.0 to i8*))
>
>
>
> If we did that, the outlining code could just assume %sel6 was the current
> selector (because we don’t clone into nested landing pads) and we wouldn’t
> need to track it at all.
>
>
>
> -Andy
>
>
>
> *From:* Reid Kleckner [mailto:rnk at google.com]
> *Sent:* Friday, April 10, 2015 4:06 PM
> *To:* Kaylor, Andrew
> *Cc:* David Majnemer <david.majnemer at gmail.com> (david.majnemer at
gmail.com);
> LLVM Developers Mailing List
> *Subject:* Re: [WinEH] Cloning blocks that reference non-cloned PHI nodes
>
>
>
> Sounds reasonable. Honestly, we've been having some serious whiteboard
> discussions about how to get out of doing all this selector pattern
> matching. I think that's basically the most fragile part of
winehprepare,
> and if we can add some new constructs that wrap that up in something opaque
> to mid-level passes, that would be great.
>
>
>
> Unfortunately, I haven't really been able to get enough consensus to
write
> something down. If I can't convince my coworkers that we need a new
> construct in person, there's not much point in writing up a detailed
RFC. :(
>
>
>
> Here's the gist of the new idea:
>
> - Leave invoke alone, too many optimizations care about it
>
> - Add a new 'landingswitch' (name under debate) instruction that
fills the
> landingpad role and is also a terminator
>
> - landingswitch clauses have some fixed operands (RTTI, handler label) and
> some variable number of operands as required by the EH personality (catch
> object, adjectives)
>
> - Allow the 'resume' instruction to have 0, 1, or 2 successors, one
of
> which can be a landingpad.
>
> - Add a 'resume' clause to landingswitches to allow them to
continue the
> unwind process at another landingpad in the same function
>
>
>
> I've tried to sell this idea, but I'm getting push back suggesting
that we
> fix it a different way. The most compelling suggestion is to have the
> existing landingpad instruction produce an i8* selector instead of an i32
> selector, and then we indirectbr on it. This would save the selector
> pattern matching that we do today, but I don't particularly like it.
> Optimization passes can do basic block commoning, tail merging, and hoist
> instructions between the landingpad and the indirectbr, and I don't
like
> dealing with that.
>
>
>
> I also don't like it because we'd need to teach the inliner about
> indirectbr, which today it refuses to inline.
>
>
>
> On Fri, Apr 10, 2015 at 3:45 PM, Kaylor, Andrew <andrew.kaylor at
intel.com>
> wrote:
>
> Hi Reid and David,
>
>
>
> I just wanted to give you a heads up that I’m currently working on a
> problem where the WinEHPrepare outlining code stumbles and asserts while
> cloning code that references extracted landing pad values via PHI nodes
> that are in intermediate blocks that aren’t being cloned.  The test I’m
> looking at fails with an assertion claiming that llvm.eh.begincatch was
> called twice inside a handler.
>
>
>
> I have an idea for how to address this (giving the cloning director a
> chance to replace operands before an instruction is remapped).  I’ll put
> something up for review as soon as I have it working.  In the mean time I
> wanted to make sure you weren’t working on the same problem.
>
>
>
> I’m seeing it with some C++ EH tests I’m working on, but I think it may be
> possible for it to happen with SEH cleanup.
>
>
>
> If you’re curious, here’s a reproducer.
>
>
>
> int main(void) {
>
>   try {
>
>     try {
>
>       throw 'a';
>
>     } catch (char c) {
>
>       printf("%c\n", c);
>
>     }
>
>     throw 1;
>
>   } catch(int x) {
>
>     printf("%d\n", x);
>
>   } catch(...) {
>
>     printf("...\n");
>
>   }
>
>
>
>   try {
>
>     try {
>
>       throw 'b';
>
>     } catch (char c) {
>
>       printf("%c\n", c);
>
>     }
>
>     throw 2;
>
>   } catch(int x) {
>
>     printf("%d\n", x);
>
>   } catch (char c) {
>
>     printf("%c\n", c);
>
>   } catch(...) {
>
>     printf("...\n");
>
>   }
>
>   return 0;
>
> }
>
>
>
> -Andy
>
>
>
>
>-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20150410/93a60489/attachment.html>

Kaylor, Andrew

2015-Apr-13 18:25 UTC

head link

[LLVMdev] [WinEH] Cloning blocks that reference non-cloned PHI nodes

Comments below…

From: Reid Kleckner [mailto:rnk at google.com]
Sent: Friday, April 10, 2015 5:01 PM
To: Kaylor, Andrew
Cc: David Majnemer <david.majnemer at gmail.com> (david.majnemer at
gmail.com); LLVM Developers Mailing List
Subject: Re: [WinEH] Cloning blocks that reference non-cloned PHI nodes

Yeah, wrapping up the whole selector comparison into one intrinsic would help.
Thanks for the idea. :)

The current strategy also seems pretty fragile around complex cleanups. If the
cleanup isn't some simple destructor call with easily traceable
unconditional branch control flow, we get lost and can't find our way to the
next selector comparison. Having an explicit 'resume' or some other
intrinsic at the end of the cleanup would be pretty helpful here.

---

We also talked a lot about making WinEHPrepare try to less work all at once, and
split it into phases. This is separable from the landingswitch proposal, which
is more about making it easier to identify handler start points. Here's a
sketch of the phases we came up with:

Step 1: Identify selector dispatch conditional branches. Identify all cleanup
actions, and split basic blocks to make them start and end on a basic block
boundary. Build a list of all basic blocks that start a handler.

[Andy Kaylor] This sounds an awful lot like what we’re doing now except for the
block splitting part.  I guess that you are talking about doing this for all
landing pads before we go any further, right?  I agree that the current code to
identify cleanup blocks is too fragile.  We can probably fix it though. 
Intrinsics to mark the start and/or end of cleanup code would help immensely.

Step 2: Do a forwards-backwards dataflow analysis to mark basic blocks as
reachable and unreachable from each handler start block. The forwards part is an
easy DFS, the backwards part is basically about pruning unreachable blocks like
'ret' from a handler. We can probably skip the backwards part initially.

[Andy Kaylor] I have some concerns about this part.   When blocks have been
merged so that handlers are sharing blocks with non-exception code things are
going to get fairly messy.  The existing cloning code takes care of this (in
theory at least) by scrubbing things after resolving PHIs based on which blocks
didn’t get cloned.  I suppose this is a potential fault line if the code we
didn’t want loops back into the shared block.  I’m guessing that this is the
sort of problem you’re thinking of that has you wanting to change things.  Do
you have a test case that exposes the problem?  More importantly, do you have a
solution in mind?

With catch handlers we have clear markers for the beginning and end of the catch
(possibly including multiple end calls).  Having similar markers for cleanup
would help a lot.  I still think there’s potential for getting lost.

Step 3: Clone all blocks that are reachable from more than one handler until all
blocks belong to exactly one handler, and sink code from landingpads back into
handlers if necessary. We don't have many code commoning optimizations, so
this step is more of a defense against theoretical future optimization passes
that do more than tail merging.

[Andy Kaylor] I’m not sure I see how this is better than the current cloning
approach.

Step 4: Demote all SSA values live from one handler into or out of another. The
only SSA values referenceable across handlers are static allocas.

[Andy Kaylor] Again, if I underststand correctly we’re doing this now, just with
a fairly distributed mechanism.

Step 5: Extract the blocks belonging to each handler into their own functions.
Rewrite static alloca references in each new handler to use llvm.framerecover.

Step 6: Profit?

Does this seem like a reasonable plan? IMO simplifying the overarching design is
a higher priority right now than fixing individual test cases.

[Andy Kaylor] I think there’s significant value in getting some core suite of
tests working so that we can get out of our current pattern of rewriting all of
our tests every time we make a significant change and into a situation where
we’ll have some stable tests that will verify that any redesign isn’t breaking
something.  I also think that in chasing down problems with the implementation
we have now we’ll learn more about the problems that need to be solved.

I don’t have a strong objection to your working on a redesign while I try to
work through some bugs in the current implementation, but I’d like to ask you to
hold off on committing significant restructuring for a little while unless I run
into something that can’t be solved within the current design.  I feel like I
can have the small test suite that I’m working on passing within a week or so.

On Fri, Apr 10, 2015 at 4:36 PM, Kaylor, Andrew <andrew.kaylor at
intel.com<mailto:andrew.kaylor at intel.com>> wrote:
Just as an off-the-cuff idea that I haven’t thought through at all, what would
you think of the possibility of replacing this:

  %12 = call i32 @llvm.eh.typeid.for(i8* bitcast (%eh.CatchHandlerType*
@llvm.eh.handlertype.H.0 to i8*))
  %matches7 = icmp eq i32 %sel6, %12

with this:

  %matches7 = call i32 @llvm.eh.cmp.selector(i32 %sel6, i8* bitcast
(%eh.CatchHandlerType* @llvm.eh.handlertype.H.0 to i8*))

If we did that, the outlining code could just assume %sel6 was the current
selector (because we don’t clone into nested landing pads) and we wouldn’t need
to track it at all.

-Andy

From: Reid Kleckner [mailto:rnk at google.com<mailto:rnk at google.com>]
Sent: Friday, April 10, 2015 4:06 PM
To: Kaylor, Andrew
Cc: David Majnemer <david.majnemer at gmail.com<mailto:david.majnemer at
gmail.com>> (david.majnemer at gmail.com<mailto:david.majnemer at
gmail.com>); LLVM Developers Mailing List
Subject: Re: [WinEH] Cloning blocks that reference non-cloned PHI nodes

Sounds reasonable. Honestly, we've been having some serious whiteboard
discussions about how to get out of doing all this selector pattern matching. I
think that's basically the most fragile part of winehprepare, and if we can
add some new constructs that wrap that up in something opaque to mid-level
passes, that would be great.

Unfortunately, I haven't really been able to get enough consensus to write
something down. If I can't convince my coworkers that we need a new
construct in person, there's not much point in writing up a detailed RFC. :(

Here's the gist of the new idea:
- Leave invoke alone, too many optimizations care about it
- Add a new 'landingswitch' (name under debate) instruction that fills
the landingpad role and is also a terminator
- landingswitch clauses have some fixed operands (RTTI, handler label) and some
variable number of operands as required by the EH personality (catch object,
adjectives)
- Allow the 'resume' instruction to have 0, 1, or 2 successors, one of
which can be a landingpad.
- Add a 'resume' clause to landingswitches to allow them to continue the
unwind process at another landingpad in the same function

I've tried to sell this idea, but I'm getting push back suggesting that
we fix it a different way. The most compelling suggestion is to have the
existing landingpad instruction produce an i8* selector instead of an i32
selector, and then we indirectbr on it. This would save the selector pattern
matching that we do today, but I don't particularly like it. Optimization
passes can do basic block commoning, tail merging, and hoist instructions
between the landingpad and the indirectbr, and I don't like dealing with
that.

I also don't like it because we'd need to teach the inliner about
indirectbr, which today it refuses to inline.

On Fri, Apr 10, 2015 at 3:45 PM, Kaylor, Andrew <andrew.kaylor at
intel.com<mailto:andrew.kaylor at intel.com>> wrote:
Hi Reid and David,

I just wanted to give you a heads up that I’m currently working on a problem
where the WinEHPrepare outlining code stumbles and asserts while cloning code
that references extracted landing pad values via PHI nodes that are in
intermediate blocks that aren’t being cloned.  The test I’m looking at fails
with an assertion claiming that llvm.eh.begincatch was called twice inside a
handler.

I have an idea for how to address this (giving the cloning director a chance to
replace operands before an instruction is remapped).  I’ll put something up for
review as soon as I have it working.  In the mean time I wanted to make sure you
weren’t working on the same problem.

I’m seeing it with some C++ EH tests I’m working on, but I think it may be
possible for it to happen with SEH cleanup.

If you’re curious, here’s a reproducer.

int main(void) {
  try {
    try {
      throw 'a';
    } catch (char c) {
      printf("%c\n", c);
    }
    throw 1;
  } catch(int x) {
    printf("%d\n", x);
  } catch(...) {
    printf("...\n");
  }

  try {
    try {
      throw 'b';
    } catch (char c) {
      printf("%c\n", c);
    }
    throw 2;
  } catch(int x) {
    printf("%d\n", x);
  } catch (char c) {
    printf("%c\n", c);
  } catch(...) {
    printf("...\n");
  }
  return 0;
}

-Andy

-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20150413/29bb7294/attachment.html>

llvm dev - Apr 2015 - [LLVMdev] [WinEH] Cloning blocks that reference non-cloned PHI nodes

[LLVMdev] [WinEH] Cloning blocks that reference non-cloned PHI nodes

[LLVMdev] [WinEH] Cloning blocks that reference non-cloned PHI nodes

[LLVMdev] [WinEH] Cloning blocks that reference non-cloned PHI nodes