On Nov 25, 2010, at 3:03 AM, Duncan Sands wrote:> I'm pointing out that if the invoke instruction > is removed and catch information is attached to entire basic blocks, then if no > care is taken then it is perfectly possible to use %x before it is defined as > explained in my previous email, blowing up the entire LLVM system. Clearly the > solution is to not allow this by not allowing values defined in a basic block > to be used in a handler for that block;If we take this route — and I think we should, although I'd like to see region chaining in first — I see two reasonable solutions. The first is what you've said, that effectively there's an edge from the beginning of the block; the second is a slight twist, that the edge leaves from the end of the phis. I think the latter will greatly simplify every transformation which ever inserts a phi, and in particular mem2reg. Since phis can't throw, it should be equivalent anyway.> In Ada you can throw and exception inside a destructor and it does not lead > to program termination.Interesting. I assume that the personality still sees these as just cleanups, so this must be implemented by running the destructor in a handler which aborts both unwinds and throws the Program_Error? John.
On Nov 27, 2010, at 4:57 PM, John McCall wrote:> On Nov 25, 2010, at 3:03 AM, Duncan Sands wrote: >> I'm pointing out that if the invoke instruction >> is removed and catch information is attached to entire basic blocks, then if no >> care is taken then it is perfectly possible to use %x before it is defined as >> explained in my previous email, blowing up the entire LLVM system. Clearly the >> solution is to not allow this by not allowing values defined in a basic block >> to be used in a handler for that block; > > If we take this route — and I think we should, although I'd like to see region > chaining in first — I see two reasonable solutions. The first is what you've > said, that effectively there's an edge from the beginning of the block; the > second is a slight twist, that the edge leaves from the end of the phis. I > think the latter will greatly simplify every transformation which ever inserts > a phi, and in particular mem2reg. Since phis can't throw, it should be > equivalent anyway.Wouldn't this route make otherwise value programs invalid? For instance: entry: %x = alloca i32 br label try try: unwinds to %landing.pad %x1 = add i32 37, 927 call @foo() br label %normal.edge landing.pad: landingpad ; Code that prints the value of %x1 We expect the value printed to be 964, but if we don't allow %x1 to be accessible then we get garbage or an assert... Or am I missing something? :-) -bw
On Nov 28, 2010, at 2:20 AM, Bill Wendling wrote:> On Nov 27, 2010, at 4:57 PM, John McCall wrote: > >> On Nov 25, 2010, at 3:03 AM, Duncan Sands wrote: >>> I'm pointing out that if the invoke instruction >>> is removed and catch information is attached to entire basic blocks, then if no >>> care is taken then it is perfectly possible to use %x before it is defined as >>> explained in my previous email, blowing up the entire LLVM system. Clearly the >>> solution is to not allow this by not allowing values defined in a basic block >>> to be used in a handler for that block; >> >> If we take this route — and I think we should, although I'd like to see region >> chaining in first — I see two reasonable solutions. The first is what you've >> said, that effectively there's an edge from the beginning of the block; the >> second is a slight twist, that the edge leaves from the end of the phis. I >> think the latter will greatly simplify every transformation which ever inserts >> a phi, and in particular mem2reg. Since phis can't throw, it should be >> equivalent anyway. > > Wouldn't this route make otherwise value programs invalid? For instance: > > entry: > %x = alloca i32 > br label try > > try: unwinds to %landing.pad > %x1 = add i32 37, 927 > call @foo() > br label %normal.edge > > landing.pad: landingpad > ; Code that prints the value of %x1 > > We expect the value printed to be 964, but if we don't allow %x1 to be accessible then we get garbage or an assert...What do you mean by "otherwise valid"? Just that we can prove that %x1 has been computed by the time a throw can occur? We really want SSA well-formedness to be a simple structural condition; having to examine instruction-specific semantics to compute dominance would make it absurdly expensive to compute and keep valid. For example, if any instruction were valid to reference up to the first throwing instruction, then re-ordering the loads in the following code would break SSA: bb: unwinds to %lp %x1 = throwing load i32* %v1 %x2 = throwing load i32* %v2 ... lp: call void @print_int(i32 %x1) Put another way, this would change SSA well-formedness from a local problem to a global one. Of course, this will be a very challenging change regardless of dominance criteria because of the imprecision of edges. For example, we would not be able to mem2reg %v in the following example without splitting %bb: define void @foo() entry: %v = alloca i32 br label %bb bb: unwinds to %lp store i32 0, i32* %v call void @bar() store i32 1, i32* %v call void @bar() ret void lp: %x = load i32* %v ... That's inherent in the design, unfortunately. John.
On 28 November 2010 10:20, Bill Wendling <wendling at apple.com> wrote:> Or am I missing something? :-)Hi Bill, John, There still seems to be a confusion between clean-ups and catch areas. What you both describe are catch areas, on which your arguments (AFAICS) are perfectly valid. The distinction is between catch and clean-up areas. You would never print the value of %x in a clean-up area. The sole purpose of clean-up areas is to cleanly destroy variables that wouldn't otherwise because of exception handling. During normal flow, the destruction code doesn't need to be in a clean-up area, it can easily be at the end of the scope, and it normally is on both places. The destruction code itself can print the value of %x (if it has access to it), and the validity of such value in the destructor code is up to the language + the user code. For instance, accessing a null pointer in C++ is allowed by the language (nobody stops you from doing so in compile time) but it's illegal during execution on most platforms. But, under no circumstances, a clean-up area can access a user variable to print it on the screen. It's like calling an intrinsic and expecting it to print the value of a random variable inside your code. It doesn't even make sense. Catch areas, on the other hand, are user code. Like destructors, the user can print the value of %x if it has access to, and if the variable was never initialised, it's the user's problem of relying on such condition. Catch areas are NOT unwinding basic blocks, they are the first user code blocks that, in case of a match, it's where execution returns to normal flow. They can also throw again, and make the flow go back to unwinding, but per se, they're user code. As was pointed out, some optimizations in LLVM can move user code to clean-up areas. The compiler may prove it valid and the execution might even work, but that's an artefact of how the compiler works and how other optimizations work around the same issue (such as inlining). Moving code from try to catch areas (and vice-versa) is fine, as both are user blocks. But moving user code to clean-up areas can lead to undefined behaviour. For example, during the unwinding of several functions without a single match to the exception, local variables in all intermediate functions have to be cleaned up, and that's done via the clean-up areas. The personality routine is controlling this flow, so if you move user code that could have side effects to a clean-up area (say only on -O3), a perfectly valid unwinding can break completely and terminate. That breaking is not inside the destructor, nor inside the catch areas, but inside a clean-up area, on which the user has no access nor control. This is a compiler bug. cheers, --renato
Hi John,>> I'm pointing out that if the invoke instruction >> is removed and catch information is attached to entire basic blocks, then if no >> care is taken then it is perfectly possible to use %x before it is defined as >> explained in my previous email, blowing up the entire LLVM system. Clearly the >> solution is to not allow this by not allowing values defined in a basic block >> to be used in a handler for that block; > > If we take this route — and I think we should, although I'd like to see region > chaining in first — I see two reasonable solutions. The first is what you've > said, that effectively there's an edge from the beginning of the block; the > second is a slight twist, that the edge leaves from the end of the phis. I > think the latter will greatly simplify every transformation which ever inserts > a phi, and in particular mem2reg. Since phis can't throw, it should be > equivalent anyway.that makes sense to me, but needs to be thought about carefully.>> In Ada you can throw and exception inside a destructor and it does not lead >> to program termination. > > Interesting. I assume that the personality still sees these as just cleanups, > so this must be implemented by running the destructor in a handler which > aborts both unwinds and throws the Program_Error?Right. The dwarf exception library doesn't mind if you throw a new exception inside a cleanup, it just unwinds it. ["Cleanups don't throw" is a C++ specific concept that is built on top of the basic unwinder facilities; gcc does it by wrapping cleanup code in a no-throw filter, either explicitly or implicitly via no-throw regions, which are an optimization but conceptually no different to a no-throw filter]. So the Ada stuff can just wrap the running of destructors in a catch-all, and if a destructor throws it then finalizes the original exception (finalizing exceptions never throws in Ada) and throws a new one (Program_Error) instead, which then unwinds in the usual way, which may mean being caught in the containing function if the scope being left is wrapped in one which catches Program_Error. Of course this means that in LLVM it would be wrong to bake in a rule like "cleanup code never throws", but hopefully no-one was suggesting that. Ciao, Duncan. PS: If you and Bill came to a consensus over the definition of regions etc, it would be nice to see a revised proposal from Bill that incorporates it.
On Nov 28, 2010, at 9:50 AM, Duncan Sands wrote:>>> In Ada you can throw and exception inside a destructor and it does not lead >>> to program termination. >> >> Interesting. I assume that the personality still sees these as just cleanups, >> so this must be implemented by running the destructor in a handler which >> aborts both unwinds and throws the Program_Error? > > Right. The dwarf exception library doesn't mind if you throw a new exception > inside a cleanup, it just unwinds it. ["Cleanups don't throw" is a C++ specific > concept that is built on top of the basic unwinder facilities; gcc does it by > wrapping cleanup code in a no-throw filter, either explicitly or implicitly via > no-throw regions, which are an optimization but conceptually no different to a > no-throw filter].Right. In clang (and llvm-gcc, I think) we generate a catch-all which just calls terminate.> So the Ada stuff can just wrap the running of destructors in > a catch-all, and if a destructor throws it then finalizes the original > exception (finalizing exceptions never throws in Ada) and throws a new one > (Program_Error) instead, which then unwinds in the usual way, which may mean > being caught in the containing function if the scope being left is wrapped in > one which catches Program_Error. Of course this means that in LLVM it would be > wrong to bake in a rule like "cleanup code never throws", but hopefully no-one > was suggesting that.I had actually thought of something like that at one point, in an effort to document the expectation of the unwinder, but I'm totally willing to change it to "cleanup code that throws or returns is responsible for aborting the current unwind".> PS: If you and Bill came to a consensus over the definition of regions etc, > it would be nice to see a revised proposal from Bill that incorporates it.I'm not sure if there's a consensus yet. John.