On 28 November 2010 10:20, Bill Wendling <wendling at apple.com> wrote:
> Or am I missing something? :-)

Hi Bill, John,

There still seems to be a confusion between clean-ups and catch areas.
What you both describe are catch areas, on which your arguments
(AFAICS) are perfectly valid. The distinction is between catch and
clean-up areas.

You would never print the value of %x in a clean-up area. The sole
purpose of clean-up areas is to cleanly destroy variables that
wouldn't otherwise be destroyed because of exception handling. During
normal flow, the destruction code doesn't need to be in a clean-up
area; it can easily be at the end of the scope, and it normally is in
both places.

The destruction code itself can print the value of %x (if it has
access to it), and the validity of such a value in the destructor code
is up to the language + the user code. For instance, accessing a null
pointer in C++ is allowed by the language (nobody stops you from doing
so at compile time) but it's illegal during execution on most
platforms.

But under no circumstances can a clean-up area access a user variable
to print it on the screen. It's like calling an intrinsic and
expecting it to print the value of a random variable inside your code.
It doesn't even make sense.

Catch areas, on the other hand, are user code. As in destructors, the
user can print the value of %x there if the code has access to it, and
if the variable was never initialised, relying on its value is the
user's problem. Catch areas are NOT unwinding basic blocks; they are
the first user code blocks where, in case of a match, execution
returns to normal flow. They can also throw again, and make the flow
go back to unwinding, but per se, they're user code.

As was pointed out, some optimizations in LLVM can move user code to
clean-up areas. The compiler may prove it valid and the execution
might even work, but that's an artefact of how the compiler works and
how other optimizations work around the same issue (such as inlining).

Moving code from try to catch areas (and vice-versa) is fine, as both
are user blocks. But moving user code to clean-up areas can lead to
undefined behaviour. For example, during the unwinding of several
functions without a single match to the exception, local variables in
all intermediate functions have to be cleaned up, and that's done via
the clean-up areas. The personality routine is controlling this flow,
so if you move user code that could have side effects to a clean-up
area (say, only at -O3), a perfectly valid unwinding can break
completely and terminate.

That breaking is not inside the destructor, nor inside the catch
areas, but inside a clean-up area, on which the user has no access nor
control. This is a compiler bug.

cheers,
--renato
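To make the distinction above concrete, here is a minimal C++ sketch of an exception passing through an intermediate function that has no handler. It only illustrates the source-level behaviour; it says nothing about how LLVM lays out landing pads. The Guard type, the events log and all function names are hypothetical, introduced for illustration only.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical event log so the order of events can be inspected.
static std::vector<std::string> events;

struct Guard {
    std::string name;
    explicit Guard(std::string n) : name(std::move(n)) {}
    // The destructor body is user code. The *call* to it during unwinding
    // is what the compiler places in a clean-up area of the enclosing
    // function.
    ~Guard() { events.push_back("destroy " + name); }
};

void thrower() {
    Guard g("inner");
    throw std::runtime_error("boom");
}

// No try/catch here: an exception passing through this function only
// runs its clean-up (the destructor of g) and keeps unwinding.
void intermediate() {
    Guard g("middle");
    thrower();
    assert(false && "not reached: thrower() always throws");
}

std::vector<std::string> run_unwind_demo() {
    events.clear();
    try {
        intermediate();
    } catch (const std::runtime_error& e) {
        // Catch area: user code, execution returns to normal flow here.
        events.push_back(std::string("caught ") + e.what());
    }
    return events;
}
```

Calling run_unwind_demo() yields the log "destroy inner", "destroy middle", "caught boom": both clean-ups run, innermost first, before the handler body executes.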
On Nov 28, 2010, at 6:23 AM, Renato Golin wrote:
> There still seems to be a confusion between clean-ups and catch areas.
> What you both describe are catch areas, on which your arguments
> (AFAICS) are perfectly valid. The distinction is between catch and
> clean-up areas.
>
> [...]
>
> That breaking is not inside the destructor, nor inside the catch
> areas, but inside a clean-up area, on which the user has no access nor
> control. This is a compiler bug.
>

My confusion could be in what Duncan was talking about.

If we have a basic block that may throw and is caught by some landing
pad, what variables may be used in that landing pad – both the cleanup
part and the catch handler? Certainly the cleanup cannot access the
user variables directly (but can indirectly if a pointer is passed
into an object which is then dereferenced by the d'tor). But the catch
handlers are dominated by the landing pad, which would need to be
dominated by the throwing basic block in order to use any values
calculated in that basic block.

One possibility (perhaps this is what he meant) is that if a value is
used in the catch handler, then it cannot reside in the throwing
block.

-bw

[Quoting Duncan's original email here]

"I think everyone wants to get rid of invoke, but that is hard. One
problem is that you want to keep the SSA property "definitions
dominate uses". Now suppose you have a basic block

  bb: [when throws, branch to XYZ]
    ...
    %x = ... (define %x)
    ...
  XYZ:
    ...use %x...

If you got to XYZ because an instruction threw an exception before %x
was defined, then in XYZ you are using %x which was never defined. In
effect the definition of %x in bb does not dominate the use in XYZ.
I think the solution is to say that in XYZ you are not allowed to use
any values defined in bb: in the dominator tree, bb is not considered
to dominate XYZ.

These kind of issues touch fundamental design points of LLVM, so need
to be dealt with carefully.

Ciao,

Duncan."
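Duncan's bb/XYZ situation can be imitated at the C++ source level. This is only an analogy, not LLVM IR: an empty std::optional (C++17) stands in for "%x was never defined", and the handler stands in for XYZ. The names may_throw and value_seen_by_handler are hypothetical.

```cpp
#include <cassert>
#include <optional>
#include <stdexcept>

// Hypothetical helper: throws when asked to, otherwise does nothing.
void may_throw(bool do_throw) {
    if (do_throw) throw std::runtime_error("thrown");
}

// Returns what the handler observes about x.
// An empty optional plays the role of "%x was never defined".
std::optional<int> value_seen_by_handler(bool throw_early) {
    std::optional<int> x;     // declared outside the protected region
    std::optional<int> seen;  // what the handler saw
    try {
        may_throw(throw_early);  // a throwing instruction before the definition
        x = 42;                  // corresponds to "%x = ..."
        may_throw(true);         // a throwing instruction after the definition
        assert(false && "not reached: the previous call always throws");
    } catch (const std::runtime_error&) {
        // Corresponds to XYZ: reachable both before and after x was set,
        // so it cannot assume the definition has executed.
        seen = x;
    }
    return seen;
}
```

value_seen_by_handler(true) returns an empty optional, while value_seen_by_handler(false) returns 42: the same handler is reached from two points in the block, and only one of them is after the definition.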
On 28 November 2010 22:14, Bill Wendling <wendling at apple.com> wrote:
> If we have a basic block that may throw and is caught by some landing
> pad, what variables may be used in that landing pad – both the cleanup
> part and the catch handler? Certainly the cleanup cannot access the
> user variables directly (but can indirectly if a pointer is passed
> into an object which is then dereferenced by the d'tor).

A clean-up landing pad should never (IMHO, not even after
optimizations) handle user variables, especially not indirectly. But
again, that's perhaps my C++-istic view of exception handling. It
might make sense in other languages...

> But the catch handlers are dominated by the landing pad, which would
> need to be dominated by the throwing basic block in order to use any
> values calculated in that basic block.

That's a fair point. Ignoring them just because they're special would
complicate the CFG analysis for little gain, and could even get in the
way of inlining the whole EH block.

> One possibility (perhaps this is what he meant) is that if a value is
> used in the catch handler, then it cannot reside in the throwing
> block.

If the value is guaranteed never to throw an exception and not to have
any side-effects, I guess that's ok. Otherwise, it could change the
state of the program in unpredictable ways.

cheers,
--renato
On Nov 28, 2010, at 6:23 AM, Renato Golin wrote:
> On 28 November 2010 10:20, Bill Wendling <wendling at apple.com> wrote:
>> Or am I missing something? :-)
>
> Hi Bill, John,
>
> There still seems to be a confusion between clean-ups and catch areas.
> What you both describe are catch areas, on which your arguments
> (AFAICS) are perfectly valid. The distinction is between catch and
> clean-up areas.

Two points.

First, we're talking about how basic dominance rules work. Catch
handlers will be dominated by protected blocks *through* the landing
pads, so whether a value is legal to access in all the EH cleanups
determines whether that value is legal to access in the handler. We
are not going to have different dominance rules for "user code" vs.
"non-user code"; frontends and optimizers are required to maintain
well-formedness for all.

Second, while you are correct to say that code motion into cleanup
areas is generally unsafe, you are ignoring more likely ways that
cleanups would become dependent on values not defined in the entry
block. The first is that there are several places in C++ (and
presumably also Ada) where a cleanup is required on the return value
of a function call; most notably, a new-expression which throws after
allocating memory must call the paired 'operator delete' on the
allocated pointer. The second is the inliner in combination with
mem2reg or SROA.

John.
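John's new-expression case can be observed directly in C++. The sketch below assumes a hypothetical Widget class whose constructor always throws and which provides class-specific operator new and operator delete that record the pointer they handle. The recorded addresses are kept as integers so that nothing inspects a pointer after it has been freed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <new>
#include <stdexcept>

// Hypothetical bookkeeping: addresses seen by the allocation functions.
static std::uintptr_t last_allocated = 0;
static std::uintptr_t last_freed = 0;

struct Widget {
    // The constructor fails after the memory has already been allocated.
    Widget() { throw std::runtime_error("constructor failed"); }

    static void* operator new(std::size_t size) {
        void* p = std::malloc(size);
        if (!p) throw std::bad_alloc();
        last_allocated = reinterpret_cast<std::uintptr_t>(p);
        return p;
    }

    static void operator delete(void* p) noexcept {
        last_freed = reinterpret_cast<std::uintptr_t>(p);
        std::free(p);
    }
};

bool new_expression_cleans_up() {
    last_allocated = 0;
    last_freed = 0;
    try {
        Widget* w = new Widget;  // allocation succeeds, constructor throws
        assert(false && "not reached: the constructor always throws");
        delete w;
    } catch (const std::runtime_error&) {
        // By the time the handler runs, the clean-up emitted for the
        // new-expression has already passed the pointer returned by
        // operator new to the paired operator delete.
    }
    return last_allocated != 0 && last_freed == last_allocated;
}
```

new_expression_cleans_up() returns true. The clean-up here depends on a value, the result of the call to operator new, that only exists once the protected region has started executing, which is the kind of dependency John describes.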