On Nov 28, 2010, at 1:57 PM, Bill Wendling wrote:> On Nov 28, 2010, at 2:59 AM, John McCall wrote: > >> On Nov 28, 2010, at 2:20 AM, Bill Wendling wrote: >> >>> On Nov 27, 2010, at 4:57 PM, John McCall wrote: >>> >>>> On Nov 25, 2010, at 3:03 AM, Duncan Sands wrote: >>>>> I'm pointing out that if the invoke instruction >>>>> is removed and catch information is attached to entire basic blocks, then if no >>>>> care is taken then it is perfectly possible to use %x before it is defined as >>>>> explained in my previous email, blowing up the entire LLVM system. Clearly the >>>>> solution is to not allow this by not allowing values defined in a basic block >>>>> to be used in a handler for that block; >>>> >>>> If we take this route — and I think we should, although I'd like to see region >>>> chaining in first — I see two reasonable solutions. The first is what you've >>>> said, that effectively there's an edge from the beginning of the block; the >>>> second is a slight twist, that the edge leaves from the end of the phis. I >>>> think the latter will greatly simplify every transformation which ever inserts >>>> a phi, and in particular mem2reg. Since phis can't throw, it should be >>>> equivalent anyway. >>> >>> Wouldn't this route make otherwise value programs invalid? For instance: >>> >>> entry: >>> %x = alloca i32 >>> br label try >>> >>> try: unwinds to %landing.pad >>> %x1 = add i32 37, 927 >>> call @foo() >>> br label %normal.edge >>> >>> landing.pad: landingpad >>> ; Code that prints the value of %x1 >>> >>> We expect the value printed to be 964, but if we don't allow %x1 to be accessible then we get garbage or an assert... >> >> What do you mean by "otherwise valid"? > > ?? Uh...in the current EH implementation, if foo() throws, then %x1 would have 964 as the value, even in "landing.pad". And the value could be printed out with no problems. In other words, this is completely valid code with proper semantics: > > #include <iostream> > extern void foo(); > int main() { > int x = 0; > try { > x = 37 + 927; > foo(); > } catch (...) { > std::cout << x << '\n'; > } > } > > Any solution that violates this isn't a solution.These well-formedness constraints on IR are not constraints on actual user code, they're constraints on frontends and transforms. Frontends have to translate actual user code into well-formed IR, and transforms have to preserve well-formedness. In your example, a C++ frontend is going to generate this code with %x as an alloca, like so: (I've taken the liberty of changing "x = 37 + 927;" to "extern int count; x = count;". This makes the operation non-constant but still non-throwing. I don't think it substantively changes the example other than to remove a potential source of confusion, but if you disagree, the same argument works with the original example) entry: %x = alloca i32 store i32 0, i32* x br label %try try: unwinds to %lp %count = load i32* @count store i32 %count, i32* x call void @foo() br label %return lp: dispatch [ catchall i8* null, label %catch ] catch: # I've removed the begin/end catch calls for simplicity, if you think they're relevant I can elaborate %t = load i32* %x call void @operator<<(%"std::ostream"* @std::cout, i32 %t) br label %return return: ret i32 0 This is well-formed SSA; the alloca instruction %x is in the entry block and thus dominates both the store in %try and the load in %catch. mem2reg wants to eliminate %x and replace the load in %catch with a fixed value. This involves looking at the value stored in %x at all predecessor points, which is a challenge because one of those predecessors is an arbitrary point within the block %try, and %x actually has two values over that extent. So mem2reg would have to split %try and make a phi in %lp, like so: try1: unwinds to %lp %count = load i32* @count br label %try2 try2: unwinds to %lp call void foo() br label %return lp: %t = phi i32 [ i32 0, label %try1 ], [ i32 %count, label %try2 ] #etc. That's a lot of added complexity for mem2reg (and other transformations that use it as a subroutine), but it's pretty much inherent in any design where edges don't always come from terminators. Now, in this specific case, mem2reg could say "hey, %count can't throw, I don't need the phi at all". But it would still need to split %lp to make the IR well-formed; it's just that %lp1 wouldn't have an "unwinds" clause. And in the general case, it needs a phi. John.
On Nov 28, 2010, at 3:47 PM, John McCall wrote:> So mem2reg would have to split %try and make a phi in %lp, like so: > > try1: unwinds to %lp > %count = load i32* @count > br label %try2 > try2: unwinds to %lp > call void foo() > br label %return > lp: > %t = phi i32 [ i32 0, label %try1 ], [ i32 %count, label %try2 ] > #etc. > > That's a lot of added complexity for mem2reg (and other transformations that use it as a subroutine), but it's pretty much inherent in any design where edges don't always come from terminators. > > Now, in this specific case, mem2reg could say "hey, %count can't throw, I don't need the phi at all". But it would still need to split %lp to make the IR well-formed; it's just that %lp1 wouldn't have an "unwinds" clause. And in the general case, it needs a phi.This problem reappears during register allocation. If a value is live-in to a landing pad, it must be kept in the same location before any instruction that may throw - a callee-saved register or a stack slot. That means the register allocator is no longer free to move values between registers and the stack inside a basic block. It would at least have to ensure that a copy is kept in a stack slot or in a fixed callee-saved register. This would be particularly bad for register classes that have no callee-saved registers, such as vector registers. This will be less of a problem if the code generator knows which instructions may throw. /jakob -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20101128/d1e9ae57/attachment.html> -------------- next part -------------- A non-text attachment was scrubbed... Name: smime.p7s Type: application/pkcs7-signature Size: 1929 bytes Desc: not available URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20101128/d1e9ae57/attachment.bin>
On Nov 28, 2010, at 4:15 PM, Jakob Stoklund Olesen wrote:> > On Nov 28, 2010, at 3:47 PM, John McCall wrote: > >> So mem2reg would have to split %try and make a phi in %lp, like so: >> >> try1: unwinds to %lp >> %count = load i32* @count >> br label %try2 >> try2: unwinds to %lp >> call void foo() >> br label %return >> lp: >> %t = phi i32 [ i32 0, label %try1 ], [ i32 %count, label %try2 ] >> #etc. >> >> That's a lot of added complexity for mem2reg (and other transformations that use it as a subroutine), but it's pretty much inherent in any design where edges don't always come from terminators. >> >> Now, in this specific case, mem2reg could say "hey, %count can't throw, I don't need the phi at all". But it would still need to split %lp to make the IR well-formed; it's just that %lp1 wouldn't have an "unwinds" clause. And in the general case, it needs a phi. > > This problem reappears during register allocation. > > If a value is live-in to a landing pad, it must be kept in the same location before any instruction that may throw - a callee-saved register or a stack slot. > > That means the register allocator is no longer free to move values between registers and the stack inside a basic block. It would at least have to ensure that a copy is kept in a stack slot or in a fixed callee-saved register. This would be particularly bad for register classes that have no callee-saved registers, such as vector registers. > > This will be less of a problem if the code generator knows which instructions may throw.Yeah, I think any proposal for attaching unwind destinations to blocks will have to include a "can unwind" bit on Instruction so that this can be tested extremely efficiently. John. -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20101128/965a8de2/attachment.html>
On Nov 28, 2010, at 3:47 PM, John McCall wrote:> This is well-formed SSA; the alloca instruction %x is in the entry block and thus dominates both the store in %try and the load in %catch. mem2reg wants to eliminate %x and replace the load in %catch with a fixed value. This involves looking at the value stored in %x at all predecessor points, which is a challenge because one of those predecessors is an arbitrary point within the block %try, and %x actually has two values over that extent. So mem2reg would have to split %try and make a phi in %lp, like so: > > try1: unwinds to %lp > %count = load i32* @count > br label %try2 > try2: unwinds to %lp > call void foo() > br label %return > lp: > %t = phi i32 [ i32 0, label %try1 ], [ i32 %count, label %try2 ] > #etc. > > That's a lot of added complexity for mem2reg (and other transformations that use it as a subroutine), but it's pretty much inherent in any design where edges don't always come from terminators. > > Now, in this specific case, mem2reg could say "hey, %count can't throw, I don't need the phi at all". But it would still need to split %lp to make the IR well-formed; it's just that %lp1 wouldn't have an "unwinds" clause. And in the general case, it needs a phi.Thanks. Your initial statement was confusing. :-) Yes, this is one scenario (and possibly the only one??) for how things would work with adorning BBs with "unwinds to" attributes. -bw
On Nov 28, 2010, at 4:27 PM, Bill Wendling wrote:> On Nov 28, 2010, at 3:47 PM, John McCall wrote: > >> This is well-formed SSA; the alloca instruction %x is in the entry block and thus dominates both the store in %try and the load in %catch. mem2reg wants to eliminate %x and replace the load in %catch with a fixed value. This involves looking at the value stored in %x at all predecessor points, which is a challenge because one of those predecessors is an arbitrary point within the block %try, and %x actually has two values over that extent. So mem2reg would have to split %try and make a phi in %lp, like so: >> >> try1: unwinds to %lp >> %count = load i32* @count >> br label %try2 >> try2: unwinds to %lp >> call void foo() >> br label %return >> lp: >> %t = phi i32 [ i32 0, label %try1 ], [ i32 %count, label %try2 ] >> #etc. >> >> That's a lot of added complexity for mem2reg (and other transformations that use it as a subroutine), but it's pretty much inherent in any design where edges don't always come from terminators. >> >> Now, in this specific case, mem2reg could say "hey, %count can't throw, I don't need the phi at all". But it would still need to split %lp to make the IR well-formed; it's just that %lp1 wouldn't have an "unwinds" clause. And in the general case, it needs a phi. > > Thanks. Your initial statement was confusing. :-) Yes, this is one scenario (and possibly the only one??) for how things would work with adorning BBs with "unwinds to" attributes.Yeah, I think there are two reasonable possibilities here. The first is that nothing in an adorned block dominates the edge; the second is that the phi nodes do. Consider code like this: extern void print(int); extern void bar(); void foo(int *var) { int x = 0; if (var) x = *var; try { bar(); } catch (...) { print(x); } } Is the following well-formed, or do we need the phi in its own, unadorned block? entry: %cond = icmp ne i32* %var, null br i1 %cond, label %yes, label %cont yes: %ld = load i32* %var br label %cont cont: unwinds to %lp %x = phi i32 [i32 0, label %entry], [i32 %ld, label %yes] call void @bar() ret void lp: dispatch [catchall, label %catch] catch: #simplified call void @print(i32 %x) ret void My intuition is that passes which insert phis will be significantly simplified if phis dominate the EH edge, but it's a slightly more complicated rule, and it might be obnoxious to work with in the end. John.
Hi John,> This is well-formed SSA; the alloca instruction %x is in the entry block and thus dominates both the store in %try and the load in %catch...this brings up the question of whether we should allow a catch handler to be attached to the entry block. My feeling is that it should be disallowed. If it was allowed then values defined in the entry block (like alloca's) would no longer be guaranteed to dominate every other block in the function. Ciao, Duncan.
On Nov 29, 2010, at 12:17 AM, Duncan Sands wrote:>> This is well-formed SSA; the alloca instruction %x is in the entry block and thus dominates both the store in %try and the load in %catch... > > this brings up the question of whether we should allow a catch handler to be > attached to the entry block. My feeling is that it should be disallowed. If > it was allowed then values defined in the entry block (like alloca's) would no > longer be guaranteed to dominate every other block in the function.Agreed. John.