thr3ads.net - llvm dev - [LLVMdev] RFC: Exception Handling Proposal II [Nov 2010]

Renato Golin

2010-Nov-25 10:33 UTC

[LLVMdev] RFC: Exception Handling Proposal II

On 25 November 2010 07:51, Duncan Sands <baldrick at free.fr> wrote:
> If you got to XYZ because an instruction threw an exception before %x was
> defined, then in XYZ you are using %x which was never defined.  In effect
> the definition of %x in bb does not dominate the use in XYZ.  I think the
> solution is to say that in XYZ you are not allowed to use any values defined
> in bb: in the dominator tree, bb is not considered to dominate XYZ.

Hi Duncan,

I don't see how you can have dominance between a normal block and a
cleanup block. Clean-up landing pads should never use user code (since
they don't exist in userland).

Catch landing pads, on the other hand, have the same dominance
relationship that the rest of user code has (and the same problems).

Since you should never branch to XYZ under normal circumstances, you
should never rely on its predecessor's values anyway. That's the whole
point of having @llvm.eh.exception and @llvm.eh.selector, as it's the
role of the personality routine to pass information between the user
code and unwinding code.

In essence, in compiler generated landing pads, you should never
generate a use of user values. But if XYZ is user code, it's user
problem. ;)

cheers,
--renato

PS: abnormal cases like throwing on a destructor when previously
thrown inside a constructor leads to termination, so even if you "use"
the value in the catch area, you won't get there anyway. ;)

Duncan Sands

2010-Nov-25 11:03 UTC

[LLVMdev] RFC: Exception Handling Proposal II

Hi Renato,
> On 25 November 2010 07:51, Duncan Sands <baldrick at free.fr> wrote:
>> If you got to XYZ because an instruction threw an exception before %x was
>> defined, then in XYZ you are using %x which was never defined.  In effect
>> the definition of %x in bb does not dominate the use in XYZ.  I think the
>> solution is to say that in XYZ you are not allowed to use any values defined
>> in bb: in the dominator tree, bb is not considered to dominate XYZ.
>
> Hi Duncan,
>
> I don't see how you can have dominance between a normal block and a
> cleanup block. Clean-up landing pads should never use user code (since
> they don't exist in userland).
I don't understand what you are saying.  Cleanups (e.g. destructors)
can execute arbitrary user code, access arbitrary local variables etc.
For example, you can pass the address of a local variable to a class
which reads the value of that variable in a destructor.  Note also
that LLVM is not just used by C++, it is also used by Ada, which makes huge
(and subtle) use of exception handling, and doesn't always work the same as
C++.  For example, throwing an exception in a destructor does not terminate
a program in Ada.
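
The destructor case can be illustrated with a small C++ sketch. The `Recorder` type and the function name are hypothetical, made up for this illustration; the point is only that the cleanup run during unwinding reads a local of the enclosing frame through a pointer, and sees the value it held at the time of the throw.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical guard type: remembers the address of a caller's local and
// reads it when the guard is destroyed.
struct Recorder {
    const int *watched;  // address of a local in the enclosing frame
    int *sink;           // where the destructor stores what it observed
    ~Recorder() { *sink = *watched; }
};

// Returns the value of `counter` that the destructor saw during unwinding.
int observed_during_unwind() {
    int seen = -1;
    try {
        int counter = 1;
        Recorder r{&counter, &seen};
        counter = 42;                      // updated after the guard is built
        throw std::runtime_error("boom");  // unwinding runs ~Recorder()
    } catch (const std::runtime_error &) {
        // The cleanup has already run by the time control arrives here.
    }
    return seen;
}
```

Calling `observed_during_unwind()` yields the value written just before the throw, so the compiler-generated cleanup path does depend on user state, if only indirectly.
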
> Catch landing pads, on the other hand, have the same dominance
> relationship that the rest of user code has (and the same problems).
>
> Since you should never branch to XYZ under normal circumstances, you
> should never rely on its predecessor's values anyway. That's the whole
> point of having @llvm.eh.exception and @llvm.eh.selector, as it's the
> role of the personality routine to pass information between the user
> code and unwinding code.
I don't get what you are talking about here.  You can access any variables
you like, whether local or global, in a catch handler.  They are not passed
to the handler via llvm.eh.exception or llvm.eh.selector, they are simply
accessed directly (the unwinder restores registers etc making this possible).
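
As a minimal C++ sketch of that point (the function name is invented for illustration), a catch handler can read a local of the enclosing function directly, with nothing passed through the exception object:

```cpp
#include <cassert>
#include <stdexcept>

// Returns the value of `progress` as seen from inside the catch handler.
int read_local_in_handler() {
    int progress = 0;  // ordinary local, declared outside the try block
    try {
        progress = 1;
        progress = 2;
        throw std::runtime_error("boom");
    } catch (const std::runtime_error &) {
        return progress;  // read directly; the exception carries no copy of it
    }
    return -1;  // not reached in this sketch
}
```

The handler observes the last value stored before the throw, which is exactly why the IR has to be clear about which definitions are visible on the exceptional edge.
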
Anyway, I'm not talking about what users should or shouldn't do, I'm talking
about fundamental rules for LLVM IR like "definitions must dominate uses".

What does this rule mean exactly and why does it exist?  It is actually
fundamental to SSA form and is what makes the whole thing work.  For example,
beginners to LLVM often ask how to get the RHS in "%x = icmp i32 %a, %b".  Of
course there is no right-hand side because in SSA form the value of %x cannot
change and (this is the important bit for this discussion) %x *is never used
before it is defined*.  Thus there is no point in distinguishing between %x
and the RHS, %x *is* the RHS.

I'm pointing out that if the invoke instruction is removed and catch
information is attached to entire basic blocks, then if no care is taken it is
perfectly possible to use %x before it is defined, as explained in my previous
email, blowing up the entire LLVM system.  Clearly the solution is to not
allow this by not allowing values defined in a basic block to be used in a
handler for that block; this in turn means that basic blocks cannot be
considered to dominate their handlers even if the only way to get to the
handler is via that basic block; this in turn means that all kinds of
transforms that muck around with basic blocks (e.g. SimplifyCFG) need to be
audited to make sure they don't break the new rule.  And so on.
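
The dominance argument can be checked on a toy graph. The sketch below is an assumption-laden model, not LLVM code: it uses the textbook iterative dominator computation, and it models the block `bb` as two hypothetical nodes, `BbStart` (the point before anything in `bb` has been defined) and `BbBody` (where %x is defined), with an exceptional edge to the handler `XYZ` from both.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// dom[n][d] == true means node d dominates node n. Node 0 is the entry.
std::vector<std::vector<bool>> compute_dominators(
        const std::vector<std::vector<int>> &preds) {
    const std::size_t n = preds.size();
    // Start from "everything dominates everything", except for the entry
    // node, which is dominated only by itself.
    std::vector<std::vector<bool>> dom(n, std::vector<bool>(n, true));
    dom[0].assign(n, false);
    dom[0][0] = true;

    bool changed = true;
    while (changed) {
        changed = false;
        for (std::size_t v = 1; v < n; ++v) {
            // Intersect the dominator sets of all predecessors ...
            std::vector<bool> next(n, true);
            for (int p : preds[v])
                for (std::size_t d = 0; d < n; ++d)
                    next[d] = next[d] && dom[p][d];
            next[v] = true;  // ... and every node dominates itself.
            if (next != dom[v]) {
                dom[v] = next;
                changed = true;
            }
        }
    }
    return dom;
}

// Hypothetical node names for the modelled graph.
enum Node { Entry = 0, BbStart = 1, BbBody = 2, Next = 3, XYZ = 4 };

std::vector<std::vector<bool>> handler_example() {
    std::vector<std::vector<int>> preds(5);
    preds[BbStart] = {Entry};
    preds[BbBody]  = {BbStart};
    preds[Next]    = {BbBody};
    preds[XYZ]     = {BbStart, BbBody};  // throw before or after %x is defined
    return compute_dominators(preds);
}
```

In this model `BbStart` dominates `XYZ` but `BbBody` does not, so a use of %x in the handler would violate "definitions must dominate uses", while the normal successor `Next` is still dominated by `BbBody`.
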
> In essence, in compiler generated landing pads, you should never
> generate a use of user values. But if XYZ is user code, it's user
> problem. ;)
The compiler crashing is a compiler problem, and that's exactly what is going
to happen if care is not taken about such details as dominance.
>
> cheers,
> --renato
>
> PS: abnormal cases like throwing on a destructor when previously
> thrown inside a constructor leads to termination, so even if you "use"
> the value in the catch area, you won't get there anyway. ;)
In Ada you can throw an exception inside a destructor and it does not lead
to program termination.

Ciao,

Duncan.

Renato Golin

2010-Nov-25 11:25 UTC

[LLVMdev] RFC: Exception Handling Proposal II

On 25 November 2010 11:03, Duncan Sands <baldrick at free.fr> wrote:
> I don't understand what you are saying.  Cleanups (e.g. destructors)

Hi Duncan,

Cleanup landing pads normally call destructors, but they're not a
destructor themselves. I'm simply saying that compiler generated
blocks (such as cleanups) should never depend on user variables.

But I get what you're saying. If a cleanup area calls a destructor,
and destructors use user values, they have an indirect conditional
dependency that cannot be solved at compile time, unless you have a
way to normalize the call graph to disambiguate all dominance
analysis.

> I don't get what you are talking about here.  You can access any variables
> you like, whether local or global, in a catch handler.  They are not passed
> to the handler via llvm.eh.exception or llvm.eh.selector, they are simply
> accessed directly (the unwinder restores registers etc making this
> possible).
Destructors and catch areas are user code, so you can access any
variable you please.

Cleanup areas are compiler code, those were the ones I was talking about.

> In Ada you can throw an exception inside a destructor and it does not lead
> to program termination.
Ok, sorry. I should stop thinking C++ here.... it's difficult, but
I'll try... ;)

cheers,
--renato

Eric Schweitz

2010-Nov-27 20:24 UTC

[LLVMdev] RFC: Exception Handling Proposal II

Are you suggesting having different semantics for different chunks of the IR
graph?

--
Eric

On Thu, Nov 25, 2010 at 2:33 AM, Renato Golin <rengolin at systemcall.org> wrote:
> On 25 November 2010 07:51, Duncan Sands <baldrick at free.fr> wrote:
> > If you got to XYZ because an instruction threw an exception before %x was
> > defined, then in XYZ you are using %x which was never defined.  In effect
> > the definition of %x in bb does not dominate the use in XYZ.  I think the
> > solution is to say that in XYZ you are not allowed to use any values defined
> > in bb: in the dominator tree, bb is not considered to dominate XYZ.
>
> Hi Duncan,
>
> I don't see how you can have dominance between a normal block and a
> cleanup block. Clean-up landing pads should never use user code (since
> they don't exist in userland).
>
> Catch landing pads, on the other hand, have the same dominance
> relationship that the rest of user code has (and the same problems).
>
> Since you should never branch to XYZ under normal circumstances, you
> should never rely on its predecessor's values anyway. That's the whole
> point of having @llvm.eh.exception and @llvm.eh.selector, as it's the
> role of the personality routine to pass information between the user
> code and unwinding code.
>
> In essence, in compiler generated landing pads, you should never
> generate a use of user values. But if XYZ is user code, it's user
> problem. ;)
>
> cheers,
> --renato
>
> PS: abnormal cases like throwing on a destructor when previously
> thrown inside a constructor leads to termination, so even if you "use"
> the value in the catch area, you won't get there anyway. ;)

Renato Golin

2010-Nov-27 21:12 UTC

[LLVMdev] RFC: Exception Handling Proposal II

On 27 November 2010 20:24, Eric Schweitz <eric.schweitz at gmail.com> wrote:
> Are you suggesting having different semantics for different chunks of the IR
> graph?

Hi Eric,

Not at all, why do you say that?



-- 
cheers,
--renato


John McCall

2010-Nov-28 00:57 UTC

[LLVMdev] RFC: Exception Handling Proposal II

On Nov 25, 2010, at 3:03 AM, Duncan Sands wrote:
> I'm pointing out that if the invoke instruction
> is removed and catch information is attached to entire basic blocks, then if no
> care is taken then it is perfectly possible to use %x before it is defined as
> explained in my previous email, blowing up the entire LLVM system.  Clearly the
> solution is to not allow this by not allowing values defined in a basic block
> to be used in a handler for that block;
If we take this route (and I think we should, although I'd like to see region
chaining in first), I see two reasonable solutions.  The first is what you've
said, that effectively there's an edge from the beginning of the block; the
second is a slight twist, that the edge leaves from the end of the phis.  I
think the latter will greatly simplify every transformation which ever inserts
a phi, and in particular mem2reg.  Since phis can't throw, it should be
equivalent anyway.
> In Ada you can throw an exception inside a destructor and it does not lead
> to program termination.
Interesting.  I assume that the personality still sees these as just cleanups,
so this must be implemented by running the destructor in a handler which
aborts both unwinds and throws the Program_Error?

John.
