Since llvm-gcc is a rather large code base, which I have never looked at (or even run), could you give me a starting point of where to look?

One thing I'd be interested in knowing is whether the llvm.eh.exception() intrinsic can be called more than once in a landing pad.

Say for example I have a nested try block, so that there are two landing pads, one for the inner try block, and one for the outer. Let's say that the inner landing pad has a "finally" block - a cleanup handler. This means that the inner try block must catch every exception type so that it can execute the cleanup even if there was no specific catch handler for that exception type.

At the end of the finally block, we need to jump to different destinations depending on how the finally block was entered - return, fall through, throw, etc. In the case where we failed to catch an exception in the inner try block, but the outer try block has a catch handler for that exception, the finally block needs to jump to the outer landing pad. There are three ways this could happen:

1) Call _Unwind_Resume, using an "invoke" IR instruction whose unwind block points to the outer landing pad. This seems like it would be expensive, however.

2) Branch directly to the start of the outer landing pad, and restart the exception dispatch all over. The problem with this is that the outer landing pad calls llvm.eh.exception() and llvm.eh.selector(), and I don't know if it's valid to call them at this point. Normally you don't jump to a landing pad directly; you get there via the personality function forcing a jump to that label. I don't know whether it is legal for the inner landing pad to jump to the start of the outer landing pad.

3) Branch directly to the individual catch blocks in the outer landing pad. In order to know which catch block to branch to, we need to add additional selectors to the inner landing pad representing the possible catch blocks in the outer landing pad that might catch the inner exception.

The problem with this scheme is that now the liveness of the current exception object is all tangled up. The jump from the inner landing pad to the outer catch block passes through one or more cleanup blocks, each of which ends with a conditional jump. Each cleanup block has multiple predecessors, each predecessor setting a state variable telling the cleanup block where to jump to after finishing the cleanup. For a given cleanup block, "return" might set state = 0, "fall through" = 1, "catch block 1" = 2, "catch block 2" = 3 and so on. Notice that for some of those states (2, 3), there is a current active exception object, and for some (0, 1) there is not. That means that the value of the exception object is undefined for some predecessor blocks, so a regular phi node won't work, and even storing the exception pointer in a local variable won't work because the compiler will notice that there are some code paths where the exception variable never gets set. Now, I happen to know that the code will never jump to a catch handler when there is no active throwable, but I am not sure that the compiler knows this.

Duncan Sands wrote:
> Hi Talin,
>
>> Now that I've got exceptions working, I'm kind of wondering how to
>> handle the case of nested "try" blocks. Say I have some code that looks
>> like this:
>
> take a look at what llvm-gcc does for this kind of thing.
>
> Ciao,
>
> Duncan.
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
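To make the liveness concern in option 3 concrete, here is a minimal C++ sketch of a single cleanup block with the state numbering from the message. The names `Exit`, `Throwable`, and `run_cleanup` are hypothetical and exist only for illustration. The sketch assumes one possible workaround that is not stated in the thread: paths with no active exception pass an explicit null pointer, so every predecessor supplies a defined value.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Exit states for one cleanup block, numbered as in the message above.
enum class Exit { Return = 0, FallThrough = 1, Catch1 = 2, Catch2 = 3 };

// Hypothetical stand-in for the in-flight exception object.
struct Throwable { std::string what; };

// Runs the shared cleanup, then dispatches on the state chosen by the
// predecessor. `exc` is null for Return and FallThrough, so every path into
// the cleanup carries a defined value even when no exception is active.
std::vector<std::string> run_cleanup(Exit state, const Throwable* exc) {
    std::vector<std::string> trace;
    trace.push_back("cleanup");  // body of the "finally" block
    switch (state) {             // the conditional jump that ends the cleanup
    case Exit::Return:
        trace.push_back("return");
        break;
    case Exit::FallThrough:
        trace.push_back("fall through");
        break;
    case Exit::Catch1:
    case Exit::Catch2:
        // The front end knows a throwable is active on these paths; the
        // assert records that invariant for debug builds.
        assert(exc != nullptr);
        trace.push_back(
            std::string(state == Exit::Catch1 ? "catch 1: " : "catch 2: ") +
            exc->what);
        break;
    }
    return trace;
}
```

The null value plays the role of the "never read" incoming value on the return and fall-through edges; whether an optimizer can prove the catch paths never see it is exactly the open question raised in the message.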
Hi Talin,

> Since llvm-gcc is a rather large code base, which I have never looked at
> (or even run), could you give me a starting point of where to look?

I meant: compile some nested C++ with llvm-gcc to see what it does.
Otherwise, look in llvm-convert.cpp, especially EmitLandingPads.

> One thing I'd be interested in knowing is whether the
> llvm.eh.exception() intrinsic can be called more than once in a landing pad.

I committed some stuff a week or so ago which means that you can now
call llvm.eh.exception as many times as you like, from wherever you
like, and get the right result.

> Say for example I have a nested try block, so that there are two landing
> pads, one for the inner try block, and one for the outer. Let's say that
> the inner landing pad has a "finally" block - a cleanup handler. This
> means that the inner try block must catch every exception type so that
> it can execute the cleanup even if there was no specific catch handler
> for that exception type.
>
> At the end of the finally block, we need to jump to different
> destinations depending on how the finally block was entered - return,
> fall through, throw, etc. In the case where we failed to catch an
> exception in the inner try block, but the outer try block has a catch
> handler for that exception, the finally block needs to jump to the outer
> landing pad. There are three ways this could happen:
>
> 1) Call _Unwind_Resume, using an "invoke" IR instruction whose unwind
> block points to the outer landing pad. This seems like it would be
> expensive, however.

You need to be very careful with _Unwind_Resume. Due to a change in
libgcc (starting from gcc-4.3) you can only use _Unwind_Resume on an
exception if it didn't match anything (i.e. you can use it only if you
matched a "cleanup"). This is extremely annoying.

> 2) Branch directly to the start of the outer landing pad, and restart
> the exception dispatch all over. The problem with this is that the outer
> landing pad calls llvm.eh.exception() and llvm.eh.selector(), and I
> don't know if it's valid to call them at this point. Normally you don't
> jump to a landing pad directly; you get there via the personality
> function forcing a jump to that label. I don't know whether it is legal
> for the inner landing pad to jump to the start of the outer landing pad.

LLVM doesn't yet support calling llvm.eh.selector far away from a
landing pad, or multiple times. This is something I would like to
support but it is hard.

> 3) Branch directly to the individual catch blocks in the outer landing
> pad. In order to know which catch block to branch to, we need to add
> additional selectors to the inner landing pad representing the possible
> catch blocks in the outer landing pad that might catch the inner exception.

This is what llvm-gcc does, if I understand you right.

> The problem with this scheme is that now the liveness of the current
> exception object is all tangled up. The jump from the inner landing pad
> to the outer catch block passes through one or more cleanup blocks, each
> of which ends with a conditional jump. Each cleanup block has multiple
> predecessors, each predecessor setting a state variable telling the
> cleanup block where to jump to after finishing the cleanup. For a given
> cleanup block, "return" might set state = 0, "fall through" = 1, "catch
> block 1" = 2, "catch block 2" = 3 and so on. Notice that for some of
> those states (2, 3), there is a current active exception object, and for
> some (0, 1) there is not. That means that the value of the exception
> object is undefined for some predecessor blocks, so a regular phi node
> won't work, and even storing the exception pointer in a local variable
> won't work because the compiler will notice that there are some code
> paths where the exception variable never gets set. Now, I happen to know
> that the code will never jump to a catch handler when there is no active
> throwable, but I am not sure that the compiler knows this.

IIRC, llvm-gcc duplicates cleanups a lot. Less than gcc though!

Good luck,

Duncan.
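As a possible starting point for the "compile some nested C++" suggestion, here is a small test case one could feed to a compiler and inspect. It is only a sketch: the names `Cleanup`, `InnerError`, `OuterError`, `thrower`, and `nested` are made up for illustration, and a destructor stands in for the "finally" block, since C++ has no such construct.

```cpp
#include <stdexcept>
#include <string>
#include <vector>

// Records the order in which cleanups and handlers run.
static std::vector<std::string> g_trace;

// A destructor is the closest C++ analogue of a "finally" block: it must run
// whether or not an inner handler matches the exception.
struct Cleanup {
    ~Cleanup() { g_trace.push_back("inner cleanup"); }
};

struct InnerError : std::runtime_error {
    using std::runtime_error::runtime_error;
};
struct OuterError : std::runtime_error {
    using std::runtime_error::runtime_error;
};

void thrower(bool inner) {
    if (inner) throw InnerError("inner");
    throw OuterError("outer");
}

// Nested try blocks: the inner one handles InnerError only, so an OuterError
// has to run the inner cleanup and then reach the outer handler.
std::vector<std::string> nested(bool throw_inner) {
    g_trace.clear();
    try {
        try {
            Cleanup c;
            thrower(throw_inner);
        } catch (const InnerError&) {
            g_trace.push_back("inner catch");
        }
    } catch (const OuterError&) {
        g_trace.push_back("outer catch");
    }
    return g_trace;
}
```

Calling `nested(false)` exercises the case discussed in the thread: no inner handler matches, the cleanup runs, and control ends up in the outer catch block.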
All right, that is what I needed to know. Thanks very much!

I decided to go with _Unwind_Resume for the moment, just to get something up and running, since that option is by far the easiest to implement - I can always optimize later if needed. With that change, nested try blocks are working :)

Thanks for the warning about _Unwind_Resume. Fortunately, it appears that my code was already correct in this regard, although that was just an accident.

For cleanups, my plan is as follows: As I generate the CFG, I maintain a stack of cleanup handlers. Each time I generate a return, break, continue, or other statement that leaves a block, I insert a special "local call" instruction (one of my instructions, not LLVM IR) for each currently active cleanup handler. The cleanup handler ends with a "local return" block terminator. After the CFG is generated, I will run a transform over the CFG that converts all local calls into "set a state variable and jump", and all local returns into a switch on that variable.

Of course, what would be even more efficient is if labels could be first-class values :)
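The cleanup plan described in this message (local calls rewritten to "set a state variable and jump", local returns rewritten to a switch) could be sketched roughly as follows. Everything here is an assumption made for illustration: the `Block` and `CFG` types, the helper names, and the simplification that each cleanup is a single block whose name is also its entry label. It is not the actual front-end data structure.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Toy block terminators for a hypothetical front-end CFG (not LLVM IR).
struct Block {
    enum Kind { Other, LocalCall, LocalReturn, SetStateAndJump, Switch };
    Kind kind = Other;
    std::string target;              // cleanup block to enter
    std::string resume;              // where to continue after the cleanup
    int state = -1;                  // value stored in the state variable
    std::vector<std::string> cases;  // Switch: cases[i] is the target for state i
};

using CFG = std::map<std::string, Block>;

Block local_call(const std::string& cleanup, const std::string& resume) {
    Block b;
    b.kind = Block::LocalCall;
    b.target = cleanup;
    b.resume = resume;
    return b;
}

Block local_return() {
    Block b;
    b.kind = Block::LocalReturn;
    return b;
}

// Pass 1 numbers the call sites of each cleanup and rewrites them to
// "set state and jump". Pass 2 turns each local return into a switch whose
// i-th case is the resume point of the call site that stored state i.
void lower_local_calls(CFG& cfg) {
    std::map<std::string, std::vector<std::string>> resume_points;
    for (auto& entry : cfg) {
        Block& b = entry.second;
        if (b.kind != Block::LocalCall) continue;
        assert(cfg.count(b.target) == 1 && "local call must target a known cleanup");
        std::vector<std::string>& points = resume_points[b.target];
        b.state = static_cast<int>(points.size());
        points.push_back(b.resume);
        b.kind = Block::SetStateAndJump;
    }
    for (auto& entry : cfg) {
        Block& b = entry.second;
        if (b.kind != Block::LocalReturn) continue;
        b.cases = resume_points[entry.first];
        b.kind = Block::Switch;
    }
}

// Example: a "break" and a "return" both leave through the same cleanup.
CFG example_cfg() {
    CFG cfg;
    cfg["break_site"] = local_call("finally", "after_loop");
    cfg["return_site"] = local_call("finally", "exit");
    cfg["finally"] = local_return();
    return cfg;
}
```

After `lower_local_calls` runs on `example_cfg()`, the two call sites store states 0 and 1 and the `finally` block ends in a two-way switch. With first-class labels, the state variable could hold the resume label itself and the switch would become an indirect jump, which is the efficiency gain the message alludes to.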