On 25 November 2010 07:51, Duncan Sands <baldrick at free.fr> wrote:
> If you got to XYZ because an instruction threw an exception before %x was
> defined, then in XYZ you are using %x which was never defined. In effect
> the definition of %x in bb does not dominate the use in XYZ. I think the
> solution is to say that in XYZ you are not allowed to use any values defined
> in bb: in the dominator tree, bb is not considered to dominate XYZ.

Hi Duncan,

I don't see how you can have dominance between a normal block and a
cleanup block. Clean-up landing pads should never use user code (since
they don't exist in userland).

Catch landing pads, on the other hand, have the same dominance
relationship that the rest of user code has (and the same problems).

Since you should never branch to XYZ under normal circumstances, you
should never rely on its predecessor's values anyway. That's the whole
point of having @llvm.eh.exception and @llvm.eh.selector, as it's the
role of the personality routine to pass information between the user
code and unwinding code.

In essence, in compiler-generated landing pads, you should never
generate a use of user values. But if XYZ is user code, it's the user's
problem. ;)

cheers,
--renato

PS: abnormal cases like throwing in a destructor when an exception was
previously thrown inside a constructor lead to termination, so even if
you "use" the value in the catch area, you won't get there anyway. ;)

Hi Renato,

> On 25 November 2010 07:51, Duncan Sands <baldrick at free.fr> wrote:
>> If you got to XYZ because an instruction threw an exception before %x was
>> defined, then in XYZ you are using %x which was never defined. In effect
>> the definition of %x in bb does not dominate the use in XYZ. I think the
>> solution is to say that in XYZ you are not allowed to use any values defined
>> in bb: in the dominator tree, bb is not considered to dominate XYZ.
>
> Hi Duncan,
>
> I don't see how you can have dominance between a normal block and a
> cleanup block. Clean-up landing pads should never use user code (since
> they don't exist in userland).

I don't understand what you are saying. Cleanups (e.g. destructors) can
execute arbitrary user code, access arbitrary local variables etc. For
example, you can pass the address of a local variable to a class which
reads the value of that variable in a destructor. Note also that LLVM is
not just used by C++, it is also used by Ada, which makes huge (and subtle)
use of exception handling, and doesn't always work the same as C++. For
example, throwing an exception in a destructor does not terminate a program
in Ada.

> Catch landing pads, on the other hand, have the same dominance
> relationship that the rest of user code has (and the same problems).
>
> Since you should never branch to XYZ under normal circumstances, you
> should never rely on its predecessor's values anyway. That's the whole
> point of having @llvm.eh.exception and @llvm.eh.selector, as it's the
> role of the personality routine to pass information between the user
> code and unwinding code.

I don't get what you are talking about here. You can access any variables
you like, whether local or global, in a catch handler. They are not passed
to the handler via llvm.eh.exception or llvm.eh.selector, they are simply
accessed directly (the unwinder restores registers etc., making this
possible).

Anyway, I'm not talking about what users should or shouldn't do, I'm talking
about fundamental rules for LLVM IR like "definitions must dominate uses".
What does this rule mean exactly and why does it exist? It is actually
fundamental to SSA form and is what makes the whole thing work. For example,
beginners to LLVM often ask how to get the RHS in "%x = icmp i32 %a, %b".
Of course there is no right-hand side, because in SSA form the value of %x
cannot change and (this is the important bit for this discussion) %x *is
never used before it is defined*. Thus there is no point in distinguishing
between %x and the RHS: %x *is* the RHS.

I'm pointing out that if the invoke instruction is removed and catch
information is attached to entire basic blocks, then, if no care is taken,
it is perfectly possible to use %x before it is defined, as explained in my
previous email, blowing up the entire LLVM system. Clearly the solution is
to not allow this, by not allowing values defined in a basic block to be
used in a handler for that block. This in turn means that basic blocks
cannot be considered to dominate their handlers, even if the only way to get
to the handler is via that basic block. This in turn means that all kinds of
transforms that muck around with basic blocks (e.g. SimplifyCFG) need to be
audited to make sure they don't break the new rule. And so on.

> In essence, in compiler generated landing pads, you should never
> generate a use of user values. But if XYZ is user code, it's user
> problem. ;)

The compiler crashing is a compiler problem, and that's exactly what is
going to happen if care is not taken about such details as dominance.

> cheers,
> --renato
>
> PS: abnormal cases like throwing on a destructor when previously
> thrown inside a constructor leads to termination, so even if you "use"
> the value in the catch area, you won't get there anyway. ;)

In Ada you can throw an exception inside a destructor and it does not lead
to program termination.

Ciao,

Duncan.
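
The dominance argument in the message above can be illustrated with a small sketch. The Python below is not LLVM code; it is a toy dominator computation over a hypothetical CFG. The block names `entry`, `next`, and the split halves `bb.start` / `bb.defs` are assumptions made for illustration (only `bb` and `XYZ` come from the thread). It contrasts a naive model, where the handler edge leaves from `bb` as a whole, with a model where the edge leaves from the start of `bb`, before %x is defined.

```python
def dominators(cfg, entry):
    """Iterative dataflow: dom[n] = {n} | intersection of dom[p] over preds p.

    cfg maps every block name to its list of successors.
    """
    nodes = set(cfg)
    preds = {n: set() for n in nodes}
    for n, succs in cfg.items():
        for s in succs:
            preds[s].add(n)

    # Start from "everything dominates everything" and shrink to a fixpoint.
    dom = {n: set(nodes) for n in nodes}
    dom[entry] = {entry}
    changed = True
    while changed:
        changed = False
        for n in nodes - {entry}:
            incoming = [dom[p] for p in preds[n]]
            new = {n} | (set.intersection(*incoming) if incoming else set())
            if new != dom[n]:
                dom[n] = new
                changed = True
    return dom


# Naive model: the exceptional edge leaves from bb as a whole, so bb
# appears to dominate its handler XYZ and %x looks usable there.
naive = {"entry": ["bb"], "bb": ["next", "XYZ"], "next": [], "XYZ": []}

# Conservative model (hypothetical split of bb): the exceptional edge leaves
# from the start of bb, before any value such as %x has been defined.
split = {
    "entry": ["bb.start"],
    "bb.start": ["bb.defs", "XYZ"],
    "bb.defs": ["next"],
    "next": [],
    "XYZ": [],
}

print(sorted(dominators(naive, "entry")["XYZ"]))
print(sorted(dominators(split, "entry")["XYZ"]))
```

Under the second model the part of the block that defines %x is not among the dominators of `XYZ`, so a verifier applying "definitions must dominate uses" would reject a use of %x in the handler.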

On 25 November 2010 11:03, Duncan Sands <baldrick at free.fr> wrote:
> I don't understand what you are saying. Cleanups (e.g. destructors)

Hi Duncan,

Cleanup landing pads normally call destructors, but they are not
destructors themselves. I'm simply saying that compiler-generated blocks
(such as cleanups) should never depend on user variables.

But I get what you're saying. If a cleanup area calls a destructor, and
destructors use user values, they have an indirect conditional
dependency that cannot be resolved at compile time, unless you have a way
to normalize the call graph to disambiguate all dominance analysis.

> I don't get what you are talking about here. You can access any variables
> you like, whether local or global, in a catch handler. They are not passed
> to the handler via llvm.eh.exception or llvm.eh.selector, they are simply
> accessed directly (the unwinder restores registers etc making this
> possible).

Destructors and catch areas are user code, so you can access any variable
you please. Cleanup areas are compiler code; those were the ones I was
talking about.

> In Ada you can throw an exception inside a destructor and it does not lead
> to program termination.

Ok, sorry. I should stop thinking C++ here.... it's difficult, but I'll try... ;)

cheers,
--renato

Are you suggesting having different semantics for different chunks of the IR
graph?

-- Eric

On Thu, Nov 25, 2010 at 2:33 AM, Renato Golin <rengolin at systemcall.org> wrote:
> On 25 November 2010 07:51, Duncan Sands <baldrick at free.fr> wrote:
> > If you got to XYZ because an instruction threw an exception before %x was
> > defined, then in XYZ you are using %x which was never defined. In effect
> > the definition of %x in bb does not dominate the use in XYZ. I think the
> > solution is to say that in XYZ you are not allowed to use any values defined
> > in bb: in the dominator tree, bb is not considered to dominate XYZ.
>
> Hi Duncan,
>
> I don't see how you can have dominance between a normal block and a
> cleanup block. Clean-up landing pads should never use user code (since
> they don't exist in userland).
>
> Catch landing pads, on the other hand, have the same dominance
> relationship that the rest of user code has (and the same problems).
>
> Since you should never branch to XYZ under normal circumstances, you
> should never rely on its predecessor's values anyway. That's the whole
> point of having @llvm.eh.exception and @llvm.eh.selector, as it's the
> role of the personality routine to pass information between the user
> code and unwinding code.
>
> In essence, in compiler generated landing pads, you should never
> generate a use of user values. But if XYZ is user code, it's user
> problem. ;)
>
> cheers,
> --renato
>
> PS: abnormal cases like throwing on a destructor when previously
> thrown inside a constructor leads to termination, so even if you "use"
> the value in the catch area, you won't get there anyway. ;)

On 27 November 2010 20:24, Eric Schweitz <eric.schweitz at gmail.com> wrote:
> Are you suggesting having different semantics for different chunks of the IR
> graph?

Hi Eric,

Not at all, why do you say that?

--
cheers,
--renato

http://systemcall.org/

Reclaim your digital rights, eliminate DRM, learn more at
http://www.defectivebydesign.org/what_is_drm

On Nov 25, 2010, at 3:03 AM, Duncan Sands wrote:
> I'm pointing out that if the invoke instruction
> is removed and catch information is attached to entire basic blocks, then if no
> care is taken then it is perfectly possible to use %x before it is defined as
> explained in my previous email, blowing up the entire LLVM system. Clearly the
> solution is to not allow this by not allowing values defined in a basic block
> to be used in a handler for that block;

If we take this route (and I think we should, although I'd like to see region
chaining in first), I see two reasonable solutions. The first is what you've
said, that effectively there's an edge from the beginning of the block; the
second is a slight twist, that the edge leaves from the end of the phis. I
think the latter will greatly simplify every transformation which ever inserts
a phi, and in particular mem2reg. Since phis can't throw, it should be
equivalent anyway.

> In Ada you can throw an exception inside a destructor and it does not lead
> to program termination.

Interesting. I assume that the personality still sees these as just cleanups,
so this must be implemented by running the destructor in a handler which
aborts both unwinds and throws the Program_Error?

John.
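
The two options in the message above differ only in which values of the protected block a handler may legally use. As a rough sketch (plain Python, not an LLVM API; the function name `values_visible_to_handler`, the tuple representation of instructions, and the value names are all hypothetical), the difference can be written as:

```python
def values_visible_to_handler(block, edge_after_phis):
    """Return the values defined in `block` that its handler may use.

    block: list of (value_name, opcode) in program order, phis first.
    edge_after_phis: False models the exceptional edge leaving from the
    very start of the block; True models it leaving after the last phi.
    """
    if not edge_after_phis:
        # Edge from the block start: nothing defined in the block is
        # guaranteed to exist when the handler runs.
        return []
    visible = []
    for name, opcode in block:
        if opcode != "phi":
            break  # first non-phi instruction may throw; stop here
        visible.append(name)
    return visible


# Hypothetical block: two phis followed by instructions that may throw.
bb = [("%p", "phi"), ("%q", "phi"), ("%x", "call"), ("%y", "add")]

print(values_visible_to_handler(bb, edge_after_phis=False))
print(values_visible_to_handler(bb, edge_after_phis=True))
```

In both models %x stays invisible to the handler. The second model additionally lets the handler see phi results, which is what would allow a pass such as mem2reg to insert a phi at the top of the block without special-casing handlers.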