Sohail Somani
2011-Jun-12 04:01 UTC
[LLVMdev] Is LLVM expressive enough to represent asynchronous exceptions?
Is LLVM expressive enough to represent asynchronous exceptions?
---------------------------------------------------------------

Summary: This needs either new LLVM instructions or an extension of all
existing instructions.

C++ exceptions are synchronous: the compiler knows when and where they
are being raised.

Asynchronous exceptions can be raised at any time. For example, an
integer divide-by-zero may raise an asynchronous exception.

Windows structured exception handling (SEH) is an example of
asynchronous exceptions. UNIX signals are another.

Chip Davis is working on implementing GNU-style C++ exceptions on top
of table-based Windows SEH for the COFF format. That is, implementing
synchronous exceptions using the native platform's asynchronous
exception framework.

I am concerned with representing the handling of asynchronous
exceptions in LLVM, as there is language-level support in Windows C++
compilers via __try/__except/__finally blocks (Clang already supports
this at the AST level). I believe that this is not currently possible
and needs new no-op instructions or a change in syntax.

An SEH block in C++ consists of a __try block followed by one of these
two blocks:

* A __except block consisting of a filter and body

* A __finally (cleanup) block consisting of a body

The __except filter is essentially a generalized catch, as it is a
function call that determines how to handle the exception. It is
different from llvm.eh.selector (although llvm.eh.selector could be
modified to support it).

An example:

    DWORD filter(int code,int*p) {
        printf("code: %x, p: %x\n",code,p);
        *p = 10;
        return EXCEPTION_EXECUTE_HANDLER;
    }

    void whatever() {
        int p = 5;
        __try {
            printf("p: %d\n",p);
            p /= 0; // SEH
            printf("unreachable\n");
        }
        __except( filter(GetExceptionCode(),&p) ) {
            printf("p: %d\n",p);
        }
    }

The output of calling whatever() is:

    p: 5                       <-- in __try block
    code: c0000094, p: 1af8d0  <-- in filter function
    p: 10                      <-- handler (notice value of p changed)

To summarize:

* Exceptions can be raised at any point (asynchronous).

* Control will jump back and forth between multiple contexts, generally
  in a user-defined manner, but managed by the runtime.

* The user-defined handling needs to be defined in LLVM IR somehow.

The last one means that some changes to LLVM IR are necessary.

Option 0: Re-using existing machinery
-------------------------------------

This is unfortunately not really an option as far as I can tell,
because there is no way to delineate where different handlers are
active without using one of the options below.

I'd be very happy to be wrong, however.

Option 1: Extend LLVM syntax
----------------------------

For synchronous (C++) exceptions, we use the following syntax when
calling a function:

    invoke ... to label %continue unwind label %cleanup

LLVM assumes exceptions can only arise from function calls, and this is
made explicit with the invoke syntax.

One option, which would be consistent with this syntax, is to extend
/every/ single instruction with similar syntax:

    %result = udiv i32 %p, 0 to label %continue unwind label %cleanup

Personally, I like this because of the consistency, but I think it may
be a bit too verbose (and it requires a lot more changes to syntax).

There is a major problem though: it still assumes that only LLVM
instructions can cause SEH exceptions. This is *not* true in the case
of UNIX signals: SIGINT, for example, can arrive at any point. So it
probably does not fully capture asynchronous exceptions.

Option 2: Add a no-op
---------------------

Another option is to transliterate the SEH handling code and add some
no-ops like bitcast.

    define void @whatever()
    {
    entry:
      ...
    aeh.try.enter0:
      ;; -----> this would be the no-op <-----
      llvm.aeh.enter blockaddress(@whatever,%aeh.try.enter0),
                     blockaddress(@whatever,%aeh.try.except.filter0)
      ...
      br label %aeh.try.exit0
    aeh.try.except.filter0: ; no predecessor
      %result = <user code here>
      call llvm.aeh.continue(%result,blockaddress(@whatever,%aeh.try.except.body0))
      unreachable
    aeh.try.except.body0: ; no predecessor
      call printf "handler\n"
      br label %aeh.try.exit0
    aeh.try.exit0:
      ;; -----> this would be the no-op <-----
      llvm.aeh.exit
      br label %aeh.try.cont0
    aeh.try.cont0:
      <whatever>
    }

I like this one because it is explicit.

All options here would require a little extra futzing to integrate
with C++ exceptions.

So my question to you is: given that there is interest in proper
Windows support, and that asynchronous exceptions are generally
useful, which option above (or one of your own choosing) would you
use?

Thanks for your time!

Sohail
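
For readers without a Windows toolchain, the __try/__except example in this message can be roughly approximated on POSIX with a signal handler and sigsetjmp/siglongjmp. This is only a sketch under stated assumptions: the names whatever_posix, on_signal, filter, g_handler_ctx and g_code are made up for illustration, raise(SIGFPE) stands in for the faulting division so the code does not depend on how the target CPU treats integer divide-by-zero, and the filter here runs after control has already left the guarded region, whereas a real SEH filter runs before unwinding.

```cpp
#include <cassert>
#include <setjmp.h>
#include <signal.h>

// Hypothetical names throughout; none of this is an LLVM or Win32 API.
static sigjmp_buf g_handler_ctx;          // saved context of the "__except" side
static volatile sig_atomic_t g_code = 0;  // last signal number seen

// Plays the role of the exception dispatcher: record the code, then
// transfer control to the context saved by sigsetjmp.
extern "C" void on_signal(int signo) {
    g_code = signo;
    siglongjmp(g_handler_ctx, 1);
}

// Plays the role of the __except filter: it may inspect and modify state.
static bool filter(int code, volatile int *p) {
    *p = 10;
    return code == SIGFPE;  // loosely analogous to EXCEPTION_EXECUTE_HANDLER
}

// Returns the value of p seen in the handler body, or -1 if no handler ran.
int whatever_posix() {
    volatile int p = 5;  // volatile because it is read after siglongjmp
    struct sigaction sa = {}, old = {};
    sa.sa_handler = on_signal;
    sigemptyset(&sa.sa_mask);
    int rc = sigaction(SIGFPE, &sa, &old);
    assert(rc == 0);
    (void)rc;

    int result = -1;
    if (sigsetjmp(g_handler_ctx, 1) == 0) {
        // "__try" body: raise() stands in for the faulting instruction.
        raise(SIGFPE);
    } else if (filter(g_code, &p)) {
        // "__except" body: observes the filter's change to p.
        result = p;
    }
    sigaction(SIGFPE, &old, nullptr);  // leave the guarded region
    return result;
}
```

Calling whatever_posix() installs the handler, raises the signal, and returns the value of p as modified by the filter. The pair of sigaction calls marks where the handler is active, which is the same information the llvm.aeh.enter and llvm.aeh.exit no-ops in Option 2 are meant to carry.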
Rafael Ávila de Espíndola
2011-Jun-12 06:54 UTC
[LLVMdev] Is LLVM expressive enough to represent asynchronous exceptions?
On 11-06-12 12:01 AM, Sohail Somani wrote:
> Is LLVM expressive enough to represent asynchronous exceptions?
> ---------------------------------------------------------------
>
> Summary: Need new LLVM instructions or extending of all instructions.
>
> [...]
>
> So my question to you is: given that there is interest in proper
> Windows support, and that asynchronous exceptions are generally
> useful, which option above (or one of your own choosing) would you
> use?

If you search the list you will find some discussion on better ways to
represent EH. The focus has been on C++ for now. You may also want to see

http://www.nondot.org/sabre/LLVMNotes/ExceptionHandlingChanges.txt

Cheers,
Rafael
Duncan Sands
2011-Jun-12 08:25 UTC
[LLVMdev] Is LLVM expressive enough to represent asynchronous exceptions?
Hi Sohail,

> Is LLVM expressive enough to represent asynchronous exceptions?

Not currently. The first step in this direction is to get rid of the invoke
instruction and attach exception handling information to basic blocks. See
http://nondot.org/sabre/LLVMNotes/ExceptionHandlingChanges.txt
for a discussion.

Ciao, Duncan.
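
To make the "attach exception handling information to basic blocks" idea concrete, here is a toy C++ model. It is a sketch only: Block, unwind_dest, handler_for and demo_block_unwind are hypothetical names, not LLVM types or APIs, and the linked notes may describe the design differently.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy model only: none of these types are LLVM's.
struct Block {
    std::string name;
    std::vector<std::string> instructions;
    const Block *unwind_dest;  // handler for anything raised here, or nullptr
};

// With per-block unwind info, the handler is a property of the enclosing
// block, so every instruction in it is covered without an invoke-style suffix.
const Block *handler_for(const Block &faulting) {
    return faulting.unwind_dest;
}

// Usage: a guarded block containing a plain udiv, and an unguarded handler.
bool demo_block_unwind() {
    Block handler{"handler", {"call @printf"}, nullptr};
    Block body{"try.body", {"%r = udiv i32 %p, 0"}, &handler};
    assert(handler_for(body) == &handler);
    return handler_for(handler) == nullptr;
}
```

In this model the udiv needs no special syntax; the block it sits in says where control goes if it faults.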
Sohail Somani
2011-Jun-12 14:47 UTC
[LLVMdev] Is LLVM expressive enough to represent asynchronous exceptions?
On 11-06-12 2:54 AM, Rafael Ávila de Espíndola wrote:
> On 11-06-12 12:01 AM, Sohail Somani wrote:
>> Is LLVM expressive enough to represent asynchronous exceptions?
>> ---------------------------------------------------------------
>
> If you search the list you will find some discussion on better ways to
> represent EH. The focus has been on C++ for now. You may also want to see
>
> http://www.nondot.org/sabre/LLVMNotes/ExceptionHandlingChanges.txt

That's encouraging: both people who responded pointed to the same thing.
So it's a known problem with a "preferred" solution.

I also prefer the "attach unwind information to a block" method, but I'd
do it in a manner that also supports resuming exceptions (so as to
support Windows SEH as well). I guess that was option 2.

So would it be acceptable to implement something like what the notes
linked above describe, with an eye towards async exceptions? If so, I'll
write up how I'll do it, at least in terms of IR-related changes.

Thanks a lot!
Sohail Somani
2011-Jun-12 14:49 UTC
[LLVMdev] Is LLVM expressive enough to represent asynchronous exceptions?
On 11-06-12 4:25 AM, Duncan Sands wrote:
> Hi Sohail,
>
>> Is LLVM expressive enough to represent asynchronous exceptions?
>
> not currently. The first step in this direction is to get rid of the invoke
> instruction and attach exception handling information to basic blocks. See
> http://nondot.org/sabre/LLVMNotes/ExceptionHandlingChanges.txt
> for a discussion.
>
> Ciao, Duncan.

Thanks Duncan, I've continued the discussion regarding the same changes
on an earlier thread (in response to Rafael) and would love your input
there!
Cameron Zwarich
2011-Jun-12 21:14 UTC
[LLVMdev] Is LLVM expressive enough to represent asynchronous exceptions?
On Jun 12, 2011, at 1:25 AM, Duncan Sands wrote:
> Hi Sohail,
>
>> Is LLVM expressive enough to represent asynchronous exceptions?
>
> not currently. The first step in this direction is to get rid of the invoke
> instruction and attach exception handling information to basic blocks. See
> http://nondot.org/sabre/LLVMNotes/ExceptionHandlingChanges.txt
> for a discussion.

Is this really a good idea? Why have a control flow graph if it doesn't
actually capture control flow? There are lots of compilers for languages
with more pervasive exceptions that represent them explicitly, e.g. the
HotSpot server compiler for Java or several ML compilers (where integer
overflow throws an exception). I can't see many advantages of implicit
exceptions besides a nicer-looking IR dump.

Cameron
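
As a rough illustration of what "representing exceptions explicitly" can mean, the C++ sketch below turns each way an integer division can fail into a visible branch, so the control flow to the failure path is in the code rather than implied. It is an analogy at the source level, not how HotSpot or any ML compiler is actually implemented; DivStatus, checked_div and demo_checked_div are hypothetical names.

```cpp
#include <cassert>
#include <climits>

enum class DivStatus { Ok, DivideByZero, Overflow };

// Every exceptional outcome is an explicit edge out of the function:
// nothing can "escape" from the division itself.
DivStatus checked_div(int a, int b, int &out) {
    if (b == 0) return DivStatus::DivideByZero;           // would trap on x86
    if (a == INT_MIN && b == -1) return DivStatus::Overflow;  // quotient unrepresentable
    out = a / b;
    return DivStatus::Ok;
}

// Usage: the caller branches on the status instead of relying on a handler.
bool demo_checked_div() {
    int q = 0;
    assert(checked_div(7, 2, q) == DivStatus::Ok && q == 3);
    assert(checked_div(1, 0, q) == DivStatus::DivideByZero);
    return checked_div(INT_MIN, -1, q) == DivStatus::Overflow;
}
```

The trade-off being debated in this thread is visible here: the explicit form costs extra branches in the representation, while the implicit form keeps the division bare and records the handler elsewhere.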