thr3ads.net - llvm dev - [LLVMdev] RFC: Exception Handling Proposal II [Nov 2010]

If this information is useful, please help other people find it:
Share via:

John McCall

2010-Nov-28 00:57 UTC

[LLVMdev] RFC: Exception Handling Proposal II

On Nov 25, 2010, at 3:03 AM, Duncan Sands wrote:> I'm pointing out that if the invoke instruction
> is removed and catch information is attached to entire basic blocks, then
if no
> care is taken then it is perfectly possible to use %x before it is defined
as
> explained in my previous email, blowing up the entire LLVM system.  Clearly
the
> solution is to not allow this by not allowing values defined in a basic
block
> to be used in a handler for that block;
If we take this route — and I think we should, although I'd like to see
region
chaining in first — I see two reasonable solutions.  The first is what
you've
said, that effectively there's an edge from the beginning of the block;  the
second is a slight twist, that the edge leaves from the end of the phis.  I
think the latter will greatly simplify every transformation which ever inserts
a phi, and in particular mem2reg.  Since phis can't throw, it should be
equivalent anyway.
> In Ada you can throw and exception inside a destructor and it does not lead
> to program termination.
Interesting.  I assume that the personality still sees these as just cleanups,
so this must be implemented by running the destructor in a handler which
aborts both unwinds and throws the Program_Error?

John.

Bill Wendling

2010-Nov-28 10:20 UTC

head link

[LLVMdev] RFC: Exception Handling Proposal II

On Nov 27, 2010, at 4:57 PM, John McCall wrote:
> On Nov 25, 2010, at 3:03 AM, Duncan Sands wrote:
>> I'm pointing out that if the invoke instruction
>> is removed and catch information is attached to entire basic blocks,
then if no
>> care is taken then it is perfectly possible to use %x before it is
defined as
>> explained in my previous email, blowing up the entire LLVM system. 
Clearly the
>> solution is to not allow this by not allowing values defined in a basic
block
>> to be used in a handler for that block;
> 
> If we take this route — and I think we should, although I'd like to see
region
> chaining in first — I see two reasonable solutions.  The first is what
you've
> said, that effectively there's an edge from the beginning of the block;
the
> second is a slight twist, that the edge leaves from the end of the phis.  I
> think the latter will greatly simplify every transformation which ever
inserts
> a phi, and in particular mem2reg.  Since phis can't throw, it should be
> equivalent anyway.
Wouldn't this route make otherwise value programs invalid? For instance:

entry:
  %x = alloca i32
  br label try

try: unwinds to %landing.pad
  %x1 = add i32 37, 927
  call @foo()
  br label %normal.edge

landing.pad: landingpad
  ; Code that prints the value of %x1

We expect the value printed to be 964, but if we don't allow %x1 to be
accessible then we get garbage or an assert...

Or am I missing something? :-)

-bw

John McCall

2010-Nov-28 10:59 UTC

head link

[LLVMdev] RFC: Exception Handling Proposal II

On Nov 28, 2010, at 2:20 AM, Bill Wendling wrote:
> On Nov 27, 2010, at 4:57 PM, John McCall wrote:
> 
>> On Nov 25, 2010, at 3:03 AM, Duncan Sands wrote:
>>> I'm pointing out that if the invoke instruction
>>> is removed and catch information is attached to entire basic
blocks, then if no
>>> care is taken then it is perfectly possible to use %x before it is
defined as
>>> explained in my previous email, blowing up the entire LLVM system. 
Clearly the
>>> solution is to not allow this by not allowing values defined in a
basic block
>>> to be used in a handler for that block;
>> 
>> If we take this route — and I think we should, although I'd like to
see region
>> chaining in first — I see two reasonable solutions.  The first is what
you've
>> said, that effectively there's an edge from the beginning of the
block;  the
>> second is a slight twist, that the edge leaves from the end of the
phis.  I
>> think the latter will greatly simplify every transformation which ever
inserts
>> a phi, and in particular mem2reg.  Since phis can't throw, it
should be
>> equivalent anyway.
> 
> Wouldn't this route make otherwise value programs invalid? For
instance:
> 
> entry:
>  %x = alloca i32
>  br label try
> 
> try: unwinds to %landing.pad
>  %x1 = add i32 37, 927
>  call @foo()
>  br label %normal.edge
> 
> landing.pad: landingpad
>  ; Code that prints the value of %x1
> 
> We expect the value printed to be 964, but if we don't allow %x1 to be
accessible then we get garbage or an assert...
What do you mean by "otherwise valid"?  Just that we can prove that
%x1 has been computed by the time a throw can occur?  We really want SSA
well-formedness to be a simple structural condition;  having to examine
instruction-specific semantics to compute dominance would make it absurdly
expensive to compute and keep valid.  For example, if any instruction were valid
to reference up to the first throwing instruction, then re-ordering the loads in
the following code would break SSA:

bb: unwinds to %lp
  %x1 = throwing load i32* %v1
  %x2 = throwing load i32* %v2
  ...
lp:
  call void @print_int(i32 %x1)

Put another way, this would change SSA well-formedness from a local problem to a
global one.

Of course, this will be a very challenging change regardless of dominance
criteria because of the imprecision of edges.  For example, we would not be able
to mem2reg %v in the following example without splitting %bb:

define void @foo()
entry:
  %v = alloca i32
  br label %bb
bb: unwinds to %lp
  store i32 0, i32* %v
  call void @bar()
  store i32 1, i32* %v
  call void @bar()
  ret void
lp:
  %x = load i32* %v
  ...

That's inherent in the design, unfortunately.

John.

Renato Golin

2010-Nov-28 14:23 UTC

head link

[LLVMdev] RFC: Exception Handling Proposal II

On 28 November 2010 10:20, Bill Wendling <wendling at apple.com>
wrote:> Or am I missing something? :-)
Hi Bill, John,

There still seems to be a confusion between clean-ups and catch areas.
What you both describe are catch areas, on which your arguments
(AFAICS) are perfectly valid. The distinction is between catch and
clean-up areas.

You would never print the value of %x in a clean-up area. The sole
purpose of clean-up areas is to cleanly destroy variables that
wouldn't otherwise because of exception handling. During normal flow,
the destruction code doesn't need to be in a clean-up area, it can
easily be at the end of the scope, and it normally is on both places.

The destruction code itself can print the value of %x (if it has
access to it), and the validity of such value in the destructor code
is up to the language + the user code. For instance, accessing a null
pointer in C++ is allowed by the language (nobody stops you from doing
so in compile time) but it's illegal during execution on most
platforms.

But, under no circumstances, a clean-up area can access a user
variable to print it on the screen. It's like calling an intrinsic and
expecting it to print the value of a random variable inside your code.
It doesn't even make sense.

Catch areas, on the other hand, are user code. Like destructors, the
user can print the value of %x if it has access to, and if the
variable was never initialised, it's the user's problem of relying on
such condition. Catch areas are NOT unwinding basic blocks, they are
the first user code blocks that, in case of a match, it's where
execution returns to normal flow. They can also throw again, and make
the flow go back to unwinding, but per se, they're user code.

As was pointed out, some optimizations in LLVM can move user code to
clean-up areas. The compiler may prove it valid and the execution
might even work, but that's an artefact of how the compiler works and
how other optimizations work around the same issue (such as inlining).

Moving code from try to catch areas (and vice-versa) is fine, as both
are user blocks. But moving user code to clean-up areas can lead to
undefined behaviour. For example, during the unwinding of several
functions without a single match to the exception, local variables in
all intermediate functions have to be cleaned up, and that's done via
the clean-up areas. The personality routine is controlling this flow,
so if you move user code that could have side effects to a clean-up
area (say only on -O3), a perfectly valid unwinding can break
completely and terminate.

That breaking is not inside the destructor, nor inside the catch
areas, but inside a clean-up area, on which the user has no access nor
control. This is a compiler bug.

cheers,
--renato

Duncan Sands

2010-Nov-28 17:50 UTC

head link

[LLVMdev] RFC: Exception Handling Proposal II

Hi John,
>> I'm pointing out that if the invoke instruction
>> is removed and catch information is attached to entire basic blocks,
then if no
>> care is taken then it is perfectly possible to use %x before it is
defined as
>> explained in my previous email, blowing up the entire LLVM system. 
Clearly the
>> solution is to not allow this by not allowing values defined in a basic
block
>> to be used in a handler for that block;
>
> If we take this route — and I think we should, although I'd like to see
region
> chaining in first — I see two reasonable solutions.  The first is what
you've
> said, that effectively there's an edge from the beginning of the block;
the
> second is a slight twist, that the edge leaves from the end of the phis.  I
> think the latter will greatly simplify every transformation which ever
inserts
> a phi, and in particular mem2reg.  Since phis can't throw, it should be
> equivalent anyway.
that makes sense to me, but needs to be thought about carefully.
>> In Ada you can throw and exception inside a destructor and it does not
lead
>> to program termination.
>
> Interesting.  I assume that the personality still sees these as just
cleanups,
> so this must be implemented by running the destructor in a handler which
> aborts both unwinds and throws the Program_Error?
Right.  The dwarf exception library doesn't mind if you throw a new
exception
inside a cleanup, it just unwinds it.  ["Cleanups don't throw" is
a C++ specific
concept that is built on top of the basic unwinder facilities; gcc does it by
wrapping cleanup code in a no-throw filter, either explicitly or implicitly via
no-throw regions, which are an optimization but conceptually no different to a
no-throw filter].  So the Ada stuff can just wrap the running of destructors in
a catch-all, and if a destructor throws it then finalizes the original
exception (finalizing exceptions never throws in Ada) and throws a new one
(Program_Error) instead, which then unwinds in the usual way, which may mean
being caught in the containing function if the scope being left is wrapped in
one which catches Program_Error.  Of course this means that in LLVM it would be
wrong to bake in a rule like "cleanup code never throws", but
hopefully no-one
was suggesting that.

Ciao,

Duncan.

PS: If you and Bill came to a consensus over the definition of regions etc,
it would be nice to see a revised proposal from Bill that incorporates it.

John McCall

2010-Nov-28 23:53 UTC

head link

[LLVMdev] RFC: Exception Handling Proposal II

On Nov 28, 2010, at 9:50 AM, Duncan Sands wrote:>>> In Ada you can throw and exception inside a destructor and it does
not lead
>>> to program termination.
>> 
>> Interesting.  I assume that the personality still sees these as just
cleanups,
>> so this must be implemented by running the destructor in a handler
which
>> aborts both unwinds and throws the Program_Error?
> 
> Right.  The dwarf exception library doesn't mind if you throw a new
exception
> inside a cleanup, it just unwinds it.  ["Cleanups don't
throw" is a C++ specific
> concept that is built on top of the basic unwinder facilities; gcc does it
by
> wrapping cleanup code in a no-throw filter, either explicitly or implicitly
via
> no-throw regions, which are an optimization but conceptually no different
to a
> no-throw filter].
Right.  In clang (and llvm-gcc, I think) we generate a catch-all which just
calls
terminate.
> So the Ada stuff can just wrap the running of destructors in
> a catch-all, and if a destructor throws it then finalizes the original
> exception (finalizing exceptions never throws in Ada) and throws a new one
> (Program_Error) instead, which then unwinds in the usual way, which may
mean
> being caught in the containing function if the scope being left is wrapped
in
> one which catches Program_Error.  Of course this means that in LLVM it
would be
> wrong to bake in a rule like "cleanup code never throws", but
hopefully no-one
> was suggesting that.
I had actually thought of something like that at one point, in an effort to
document
the expectation of the unwinder, but I'm totally willing to change it to
"cleanup code
that throws or returns is responsible for aborting the current unwind".
> PS: If you and Bill came to a consensus over the definition of regions etc,
> it would be nice to see a revised proposal from Bill that incorporates it.
I'm not sure if there's a consensus yet.

John.

Apparently Analagous Threads

Search for more possibly parallel threads

llvm dev - Nov 2010 - [LLVMdev] RFC: Exception Handling Proposal II

[LLVMdev] RFC: Exception Handling Proposal II

[LLVMdev] RFC: Exception Handling Proposal II

[LLVMdev] RFC: Exception Handling Proposal II

[LLVMdev] RFC: Exception Handling Proposal II

[LLVMdev] RFC: Exception Handling Proposal II

[LLVMdev] RFC: Exception Handling Proposal II

Apparently Analagous Threads