thr3ads.net - llvm dev - [LLVMdev] RFC: Exception Handling Proposal II [Nov 2010]


Renato Golin

2010-Nov-28 14:23 UTC

[LLVMdev] RFC: Exception Handling Proposal II

On 28 November 2010 10:20, Bill Wendling <wendling at apple.com> wrote:
> Or am I missing something? :-)

Hi Bill, John,

There still seems to be a confusion between clean-ups and catch areas.
What you both describe are catch areas, on which your arguments
(AFAICS) are perfectly valid. The distinction is between catch and
clean-up areas.

You would never print the value of %x in a clean-up area. The sole
purpose of clean-up areas is to cleanly destroy variables that
wouldn't otherwise be destroyed because of exception handling. During
normal flow, the destruction code doesn't need to be in a clean-up
area; it can simply sit at the end of the scope, and it normally is in
both places.

The destruction code itself can print the value of %x (if it has
access to it), and the validity of such value in the destructor code
is up to the language + the user code. For instance, accessing a null
pointer in C++ is allowed by the language (nobody stops you from doing
so at compile time) but it's illegal during execution on most
platforms.

But under no circumstances can a clean-up area access a user
variable to print it on the screen. It's like calling an intrinsic and
expecting it to print the value of a random variable inside your code.
It doesn't even make sense.

Catch areas, on the other hand, are user code. As in destructors, the
user can print the value of %x if the code has access to it, and if the
variable was never initialised, relying on it is the user's problem.
Catch areas are NOT unwinding basic blocks; they are the first user
code blocks where, in case of a match, execution returns to normal
flow. They can also throw again and send the flow back to unwinding,
but per se they're user code.
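
To make the distinction concrete, here is a minimal C++ sketch (not code from the thread). The names Guard, mayThrow, run and the events log are hypothetical, introduced only for illustration. It shows the destructor call being reached both at the end of the scope and via the clean-up on the exceptional path, with the catch handler running afterwards as ordinary user code.

```cpp
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical event log so the order of execution can be inspected.
static std::vector<std::string> events;

struct Guard {
    std::string name;
    explicit Guard(std::string n) : name(std::move(n)) {}
    // The destructor body is user code; the compiler only arranges the call,
    // once at the end of the scope and once in the clean-up for unwinding.
    ~Guard() { events.push_back("destroy " + name); }
};

// Hypothetical callee that may or may not throw.
void mayThrow(bool doThrow) {
    if (doThrow) throw std::runtime_error("boom");
}

std::vector<std::string> run(bool doThrow) {
    events.clear();
    try {
        Guard g("g");
        mayThrow(doThrow);              // unwinding may start here
        events.push_back("body done");
        // Normal flow: ~Guard runs here, at the end of the scope.
    } catch (const std::runtime_error&) {
        // Catch area: user code, entered only after the clean-up has run.
        events.push_back("caught");
    }
    return events;
}
```

Calling run(false) records "body done" and then "destroy g"; calling run(true) records "destroy g" and then "caught", so the clean-up has already finished by the time the handler starts.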

As was pointed out, some optimizations in LLVM can move user code to
clean-up areas. The compiler may prove it valid and the execution
might even work, but that's an artefact of how the compiler works and
how other optimizations work around the same issue (such as inlining).

Moving code from try to catch areas (and vice-versa) is fine, as both
are user blocks. But moving user code to clean-up areas can lead to
undefined behaviour. For example, during the unwinding of several
functions without a single match to the exception, local variables in
all intermediate functions have to be cleaned up, and that's done via
the clean-up areas. The personality routine is controlling this flow,
so if you move user code that could have side effects to a clean-up
area (say only on -O3), a perfectly valid unwinding can break
completely and terminate.
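
The multi-frame case can be sketched in C++ as follows. This is only an illustration under assumed names (Local, inner, middle, outer and the trace log are all hypothetical): two intermediate frames have no matching handler, so only their clean-ups run before the handler in the outermost frame is reached.

```cpp
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical trace of what runs while the exception propagates.
static std::vector<std::string> trace;

struct Local {
    std::string owner;
    explicit Local(const std::string& o) : owner(o) {}
    ~Local() { trace.push_back("cleanup in " + owner); }
};

// No handler here: only the clean-up for `l` runs during unwinding.
void inner() {
    Local l("inner");
    throw std::runtime_error("no handler in this frame");
}

// No handler here either: again only a clean-up.
void middle() {
    Local l("middle");
    inner();
}

// The first frame with a matching catch clause.
std::vector<std::string> outer() {
    trace.clear();
    try {
        Local l("outer");
        middle();
    } catch (const std::exception&) {
        trace.push_back("handler in outer");
    }
    return trace;
}
```

Calling outer() records the clean-ups of inner, middle and outer, in that order, followed by the handler. In C++, if one of those destructors itself let an exception escape while unwinding was in progress, std::terminate would be called, which is the kind of breakage described above.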

That breaking is not inside the destructor, nor inside the catch
areas, but inside a clean-up area, over which the user has neither
access nor control. This is a compiler bug.

cheers,
--renato

Bill Wendling

2010-Nov-28 22:14 UTC

On Nov 28, 2010, at 6:23 AM, Renato Golin wrote:
> There still seems to be a confusion between clean-ups and catch areas.
> What you both describe are catch areas, on which your arguments
> (AFAICS) are perfectly valid. The distinction is between catch and
> clean-up areas.
> 
> You would never print the value of %x in a clean-up area. The sole
> purpose of clean-up areas is to cleanly destroy variables that
> wouldn't otherwise be destroyed because of exception handling. During
> normal flow, the destruction code doesn't need to be in a clean-up
> area; it can simply sit at the end of the scope, and it normally is in
> both places.
> 
> The destruction code itself can print the value of %x (if it has
> access to it), and the validity of such value in the destructor code
> is up to the language + the user code. For instance, accessing a null
> pointer in C++ is allowed by the language (nobody stops you from doing
> so at compile time) but it's illegal during execution on most
> platforms.
> 
> But under no circumstances can a clean-up area access a user
> variable to print it on the screen. It's like calling an intrinsic and
> expecting it to print the value of a random variable inside your code.
> It doesn't even make sense.
> 
> Catch areas, on the other hand, are user code. As in destructors, the
> user can print the value of %x if the code has access to it, and if the
> variable was never initialised, relying on it is the user's problem.
> Catch areas are NOT unwinding basic blocks; they are the first user
> code blocks where, in case of a match, execution returns to normal
> flow. They can also throw again and send the flow back to unwinding,
> but per se they're user code.
> 
> As was pointed out, some optimizations in LLVM can move user code to
> clean-up areas. The compiler may prove it valid and the execution
> might even work, but that's an artefact of how the compiler works and
> how other optimizations work around the same issue (such as inlining).
> 
> Moving code from try to catch areas (and vice-versa) is fine, as both
> are user blocks. But moving user code to clean-up areas can lead to
> undefined behaviour. For example, during the unwinding of several
> functions without a single match to the exception, local variables in
> all intermediate functions have to be cleaned up, and that's done via
> the clean-up areas. The personality routine is controlling this flow,
> so if you move user code that could have side effects to a clean-up
> area (say only on -O3), a perfectly valid unwinding can break
> completely and terminate.
> 
> That breaking is not inside the destructor, nor inside the catch
> areas, but inside a clean-up area, over which the user has neither
> access nor control. This is a compiler bug.

My confusion could be in what Duncan was talking about. If we have a basic block
that may throw and is caught by some landing pad, what variables may be used in
that landing pad – both the cleanup part and the catch handler? Certainly the
cleanup cannot access the user variables directly (but can indirectly if a
pointer is passed into an object which is then dereferenced by the d'tor).
But the catch handlers are dominated by the landing pad, which would need to be
dominated by the throwing basic block in order to use any values calculated in
that basic block.

One possibility (perhaps this is what he meant) is that if a value is used in
the catch handler, then it cannot reside in the throwing block.

-bw

[Quoting Duncan's original email here]

"I think everyone wants to get rid of invoke, but that is hard.  One
problem is that you want to keep the SSA property "definitions dominate uses".  Now
suppose you have a basic block

  bb: [when throws, branch to XYZ]
     ...
     %x = ... (define %x)
     ...

  XYZ:
     ...use %x...

If you got to XYZ because an instruction threw an exception before %x was
defined, then in XYZ you are using %x which was never defined.  In effect
the definition of %x in bb does not dominate the use in XYZ.  I think the
solution is to say that in XYZ you are not allowed to use any values defined
in bb: in the dominator tree, bb is not considered to dominate XYZ.

These kinds of issues touch fundamental design points of LLVM, so need to be
dealt with carefully.

Ciao,

Duncan."
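
Duncan's dominance point can be checked mechanically. The sketch below is an illustration, not LLVM code. It assumes bb is split at the throwing instruction into two hypothetical nodes, "bb.before" (everything up to and including the instruction that may throw) and "bb.after" (where %x is defined), and then computes dominator sets with the textbook iterative algorithm. The function names and the graph encoding are made up for this example.

```cpp
#include <algorithm>
#include <iterator>
#include <map>
#include <set>
#include <string>
#include <vector>

using Node = std::string;
using Graph = std::map<Node, std::vector<Node>>;  // node -> successors

// Iterative dominator sets: Dom(n) = {n} united with the intersection of
// Dom(p) over all predecessors p of n; Dom(entry) = {entry}.
std::map<Node, std::set<Node>> dominators(const Graph& succ, const Node& entry) {
    std::set<Node> all;
    std::map<Node, std::vector<Node>> pred;
    for (const auto& kv : succ) {
        all.insert(kv.first);
        for (const auto& s : kv.second) {
            all.insert(s);
            pred[s].push_back(kv.first);
        }
    }
    std::map<Node, std::set<Node>> dom;
    for (const auto& n : all) dom[n] = all;
    dom[entry] = std::set<Node>{entry};

    bool changed = true;
    while (changed) {
        changed = false;
        for (const auto& n : all) {
            if (n == entry) continue;
            std::set<Node> next = all;
            for (const auto& p : pred[n]) {
                std::set<Node> tmp;
                std::set_intersection(next.begin(), next.end(),
                                      dom[p].begin(), dom[p].end(),
                                      std::inserter(tmp, tmp.begin()));
                next = tmp;
            }
            next.insert(n);
            if (next != dom[n]) { dom[n] = next; changed = true; }
        }
    }
    return dom;
}

// True if every path from entry to b passes through a.
bool dominates(const Graph& g, const Node& entry, const Node& a, const Node& b) {
    return dominators(g, entry)[b].count(a) > 0;
}

// Hypothetical CFG for the example above, with bb split at the throw point.
Graph splitBlockExample() {
    Graph g;
    g["entry"]     = {"bb.before"};
    g["bb.before"] = {"bb.after", "XYZ"};  // may throw: unwind edge to XYZ
    g["bb.after"]  = {"exit"};             // %x is defined here
    g["XYZ"]       = {"exit"};             // wants to use %x
    g["exit"]      = {};
    return g;
}
```

With g = splitBlockExample(), dominates(g, "entry", "bb.before", "XYZ") is true while dominates(g, "entry", "bb.after", "XYZ") is false: the part of bb that defines %x does not dominate XYZ, so a use of %x there would violate "definitions dominate uses".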

Renato Golin

2010-Nov-28 22:34 UTC

On 28 November 2010 22:14, Bill Wendling <wendling at apple.com> wrote:
> If we have a basic block that may throw and is caught by some landing pad,
> what variables may be used in that landing pad – both the cleanup part and the
> catch handler? Certainly the cleanup cannot access the user variables directly
> (but can indirectly if a pointer is passed into an object which is then
> dereferenced by the d'tor).

A clean-up landing pad should never (IMHO, not even after
optimizations) handle user variables, especially not indirectly.

But again, that's perhaps my C++-istic view of exception handling. It
might make sense in other languages...

> But the catch handlers are dominated by the landing pad, which would need
> to be dominated by the throwing basic block in order to use any values
> calculated in that basic block.

That's a fair point. Ignoring them just because they're special would
complicate the CFG analysis for little gain, and could even get in the
way of inlining the whole EH block.

> One possibility (perhaps this is what he meant) is that if a value is used
> in the catch handler, then it cannot reside in the throwing block.

If computing the value is guaranteed never to throw an exception and
not to have any side-effects, I guess that's ok. Otherwise, it could
change the state of the program in unpredictable ways.

cheers,
--renato

John McCall

2010-Nov-29 00:15 UTC

On Nov 28, 2010, at 6:23 AM, Renato Golin wrote:
> On 28 November 2010 10:20, Bill Wendling <wendling at apple.com> wrote:
>> Or am I missing something? :-)
> 
> Hi Bill, John,
> 
> There still seems to be a confusion between clean-ups and catch areas.
> What you both describe are catch areas, on which your arguments
> (AFAICS) are perfectly valid. The distinction is between catch and
> clean-up areas.

Two points.

First, we're talking about how basic dominance rules work.  Catch handlers
will be dominated by protected blocks *through* the landing pads, so whether
a value is legal to access in all the EH cleanups determines whether that
value is legal to access in the handler.  We are not going to have different
dominance rules for "user code" vs. "non-user code"; frontends and
optimizers are required to maintain well-formedness for all.

Second, while you are correct to say that code motion into cleanup areas
is generally unsafe, you are ignoring more likely ways that cleanups would
become dependent on values not defined in the entry block.  The first is
that there are several places in C++ (and presumably also Ada) where
a cleanup is required on the return value of a function call;  most notably,
a new-expression which throws after allocating memory must call the
paired 'operator delete' on the allocated pointer.  The second is the
inliner in combination with mem2reg or SROA.
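
The new-expression case can be observed directly in C++. The sketch below is only an illustration: Widget, allocate and the steps log are hypothetical names. It gives a class its own operator new and operator delete so that the implicit clean-up becomes visible. The pointer passed to operator delete is the return value of the earlier call to operator new, so it is a value the clean-up depends on that was not defined in the entry block.

```cpp
#include <cstddef>
#include <cstdlib>
#include <new>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical log of the steps taken by the new-expression.
static std::vector<std::string> steps;

struct Widget {
    explicit Widget(bool fail) {
        steps.push_back("constructor");
        if (fail) throw std::runtime_error("constructor failed");
    }
    // Class-specific allocation function, called first by `new Widget(...)`.
    static void* operator new(std::size_t size) {
        steps.push_back("operator new");
        void* p = std::malloc(size);
        if (!p) throw std::bad_alloc();
        return p;
    }
    // Paired deallocation function. If the constructor throws, the
    // compiler-generated clean-up calls this on the pointer returned above.
    static void operator delete(void* p) noexcept {
        steps.push_back("operator delete");
        std::free(p);
    }
};

std::vector<std::string> allocate(bool fail) {
    steps.clear();
    try {
        Widget* w = new Widget(fail);  // allocate, then construct
        delete w;                      // reached only if construction succeeded
    } catch (const std::runtime_error&) {
        steps.push_back("caught");
    }
    return steps;
}
```

Calling allocate(true) records "operator new", "constructor", "operator delete", "caught": the deallocation happens in the clean-up, before the handler runs, and no delete-expression in user code is involved.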

John.
