Eli Friedman via llvm-dev
2020-Apr-01 20:19 UTC
[llvm-dev] [RFC] [Windows SEH] Local_Unwind (Jumping out of a _finally) and -EHa (Hardware Exception Handling)
Resending; I accidentally dropped llvm-dev.

-Eli

From: Eli Friedman
Sent: Wednesday, April 1, 2020 1:01 PM
To: Ten Tzen <tentzen at microsoft.com>
Cc: aaron.smith at microsoft.com
Subject: RE: [EXT] [llvm-dev] [RFC] [Windows SEH] Local_Unwind (Jumping out of a _finally) and -EHa (Hardware Exception Handling)

This looks like it outlines the implementation pretty well.

For goto in finally, why are you inventing a completely new mechanism for handling this sort of construct? What makes this different from our existing handling of goto out of catch blocks? Maybe there's something obvious here I'm missing, but it looks like essentially the same problem, and I don't see any reason why we can't use the existing solution.

For hardware exceptions, the proposal seems to have big fundamental problems. I see two basic problems:

1. How do you actually generate an exception? In general, UB means the program can do anything. So unless you define some rule that says otherwise, the only defined way to trigger an exception is using Windows API calls. If you want something else, we need to define new rules. At the C level, we need to redefine some specific constructs to trigger an exception instead of UB. And at the IR level, we need to annotate specific IR instructions in a way that passes can reasonably check, and add new LangRef rules describing those semantics. I mean, you can try to sort of hand-wave this and say it should "just work" if code happens to trigger a hardware exception. But if there aren't actually any rules, I'm afraid we'll end up with an infinitely long tail of "optimization X breaks some customer's code, so add a hack to disable it in EHa mode".

2. If we're not modeling the control flow implied by an exception, how do we ensure that local variables and SSA registers have the right values when the exception is caught? Sure, invoke is clunky, but it at least makes control flow well-defined. Adding "volatile" to every IR load and store instruction, including accesses to local variables, seems terrible for both optimization and correctness. Our handling of setjmp is already a complete mess; I don't want to add another way for unmodeled control flow to break code.

(See also http://nondot.org/sabre/LLVMNotes/ExceptionHandlingChanges.txt, for a proposal to make invoke less messy.)

-Eli

From: llvm-dev <llvm-dev-bounces at lists.llvm.org> On Behalf Of Ten Tzen via llvm-dev
Sent: Tuesday, March 31, 2020 9:13 PM
To: llvm-dev at lists.llvm.org
Cc: aaron.smith at microsoft.com
Subject: [EXT] [llvm-dev] [RFC] [Windows SEH] Local_Unwind (Jumping out of a _finally) and -EHa (Hardware Exception Handling)

Hi, all,

The intent of this thread is to complete the support for Windows SEH. Currently there are two major missing features: jumping out of a _finally and hardware exception handling.

The document below is my proposed design and implementation to fully support SEH in LLVM. I have completely implemented this design on a branch in this repo: https://github.com/tentzen/llvm-project. It now passes MSVC's in-house SEH suite.

Sorry for this long write-up. For better readability, please read it on https://github.com/tentzen/llvm-project/wiki

Special thanks to Joseph Tremoulet for his earlier comments and suggestions.

Note: I just subscribed to llvm-dev and am probably not on the list yet, so please include my email address (tentzen at microsoft.com) explicitly in the To list.

Thanks,
--Ten

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20200401/185a6c87/attachment.html>
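
A minimal sketch of the unmodeled-control-flow concern raised in the reply above. It uses setjmp/longjmp, which the reply cites as an already troublesome case, instead of SEH, so that it stays portable; it is an analogy, not code from the RFC. All names (recovery_point, fault, progress_at_fault) are hypothetical. The local is declared volatile because the C and C++ standards leave a non-volatile local that is modified between setjmp and longjmp with an indeterminate value after the jump.

```cpp
#include <cassert>
#include <csetjmp>

// Hypothetical names throughout; this is not code from the RFC.
static std::jmp_buf recovery_point;

// Stands in for an operation that "faults": control leaves through a path
// that the caller's control-flow graph does not show.
static void fault() { std::longjmp(recovery_point, 1); }

// Returns the last value written to `progress` before the non-local jump.
int progress_at_fault() {
    // volatile forces every store to memory, so the value is well-defined
    // after longjmp. Without volatile, a local modified between setjmp and
    // longjmp has an indeterminate value once control comes back.
    volatile int progress = 0;

    if (setjmp(recovery_point) == 0) {
        progress = 1;
        progress = 2;
        fault();
        assert(false && "not reached: fault() never returns");
    }
    return progress;
}
```

Calling progress_at_fault() from main gives 2. The point of the sketch is the cost: the guarantee comes only from making the variable opaque to the optimizer, which is the trade-off the reply objects to when it is applied to every access in a region.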
Ten Tzen via llvm-dev
2020-Apr-01 22:54 UTC
[llvm-dev] [RFC] [Windows SEH] Local_Unwind (Jumping out of a _finally) and -EHa (Hardware Exception Handling)
> For goto in finally, why are you inventing a completely new mechanism for handling this sort of construct? What makes this different from our existing handling of goto out of catch blocks? Maybe there's something obvious here I'm missing, but it looks like essentially the same problem, and I don't see any reason why we can't use the existing solution.

No, no new mechanism is invented. The design employs the existing mechanism to model the third exception path caused by _local_unwind (in addition to normal execution and exception handling flow). In an earlier discussion with Joseph, adding a second EH edge to InvokeInst was briefly discussed, but was quickly dropped as it's clearly a long shot.

The extended model is intended to represent a third control-flow path that doesn't seem representable today. Take case #2 of the first example on the wiki page: control flowing from normal execution of the inner _finally, passing through the outer _finally, and landing in $t10 cannot be represented in LLVM IR. Or could you elaborate on how to achieve it? (Bear with me, as I'm new to the Clang & LLVM world.)

> ..In general, UB means the program can do anything.

Sorry, what is UB?

Right, we are not modeling HW exceptions in control flow, as it's not necessary. For C++ code, we don't care about the values in registers, local variables, SSA values and so on. All we need is that "live local objects get destroyed properly when a HW exception is unwound and handled". For C code, only the code under a _try construct is affected. I agree that making memory accesses there volatile is sub-optimal, but it should not cause correctness issues. In MSVC, there is a less restrictive "write-through" concept for memory accesses inside a _try. But I think its benefit is minor and not worth it, as the amount of code directly under _try is very small and is usually not performance-critical.

> ..I don't want to add another way for unmodeled control flow to break code.
I would really love to hear (and find a way to improve) if there is any place in this design & implementation which is not sound or robust.

Thanks,
--Ten
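
A minimal sketch of the property this message asks for: that live local objects are destroyed when an exception unwinds through their scope. It uses an ordinary C++ throw as a stand-in, because a real hardware fault under -EHa needs a Windows toolchain; whether the same cleanup happens for hardware exceptions is exactly what the RFC proposes, and this sketch does not demonstrate it. The names Guard, live_objects, work and unwinds_cleanly are hypothetical.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical counter of objects whose destructors have not yet run.
static int live_objects = 0;

// Hypothetical RAII type standing in for any local object with a destructor.
struct Guard {
    Guard() { ++live_objects; }
    ~Guard() {
        assert(live_objects > 0);
        --live_objects;
    }
};

// The exception leaves this function while `g` is still alive.
void work() {
    Guard g;
    throw std::runtime_error("software stand-in for a fault");
}

// Reports whether `g` had been destroyed by the time the handler ran.
bool unwinds_cleanly() {
    try {
        work();
    } catch (const std::exception&) {
        return live_objects == 0;
    }
    return false;
}
```

With a thrown C++ exception, unwinds_cleanly() returns true, because the compiler sees the throw and emits the cleanup for `g` on the unwind edge.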
Eli Friedman via llvm-dev
2020-Apr-02 00:40 UTC
[llvm-dev] [RFC] [Windows SEH] Local_Unwind (Jumping out of a _finally) and -EHa (Hardware Exception Handling)
Reply inline

From: Ten Tzen <tentzen at microsoft.com>
Sent: Wednesday, April 1, 2020 3:54 PM
To: Eli Friedman <efriedma at quicinc.com>; llvm-dev <llvm-dev at lists.llvm.org>
Cc: aaron.smith at microsoft.com
Subject: [EXT] RE: [llvm-dev] [RFC] [Windows SEH] Local_Unwind (Jumping out of a _finally) and -EHa (Hardware Exception Handling)

>> For goto in finally, why are you inventing a completely new mechanism for handling this sort of construct? What makes this different from our existing handling of goto out of catch blocks? Maybe there's something obvious here I'm missing, but it looks like essentially the same problem, and I don't see any reason why we can't use the existing solution.
>
> No, no new mechanism is invented. The design employs the existing mechanism to model the third exception path caused by _local_unwind (in addition to normal execution and exception handling flow). In an earlier discussion with Joseph, adding a second EH edge to InvokeInst was briefly discussed, but was quickly dropped as it's clearly a long shot.

Yes, right, it's not really a big extension of the fundamental model. It still seems like you're doing more than what's necessary.

> The extended model is intended to represent a third control-flow path that doesn't seem representable today. Take case #2 of the first example on the wiki page: control flowing from normal execution of the inner _finally, passing through the outer _finally, and landing in $t10 cannot be represented in LLVM IR. Or could you elaborate on how to achieve it? (Bear with me, as I'm new to the Clang & LLVM world.)

Take your example, replace "_try" with C++ "try", replace the "_finally" with "catch(...)" with a "throw;" at the end of the catch block, replace the "_except()" with "catch(...)", and see what clang currently generates. That seems roughly equivalent to what you're trying to do. Extending this scheme to encompass try/finally seems like it shouldn't require new data structures in clang's AST, or new entry points in the C runtime. But I could be missing something; I'm not deeply familiar with the differences between C++ and SEH unwind handlers.

>> ..In general, UB means the program can do anything.
>
> Sorry, what is UB?

Undefined behavior.

> Right, we are not modeling HW exceptions in control flow, as it's not necessary. For C++ code, we don't care about the values in registers, local variables, SSA values and so on. All we need is that "live local objects get destroyed properly when a HW exception is unwound and handled". For C code, only the code under a _try construct is affected. I agree that making memory accesses there volatile is sub-optimal, but it should not cause correctness issues.

To be clear, we're talking about making all memory accesses, including accesses to local variables, in the try block "volatile"? So the compiler can't do any optimization on them? That gets you some fraction of the way there; there are no issues with SSA registers if there aren't any live SSA values across the edge. And the compiler can't move volatile operations around each other. That leaves open the question of what to do about calls; we don't have any generic way to mark a call "volatile". I guess we could add something. At that point, basically every memory operation and variable would be completely opaque to the compiler, which would sort of force everything to work, I guess. But at the cost of terrible performance if there's any non-trivial code in the block. (And it's still not theoretically sound, because the compiler can introduce local variables.)

> In MSVC, there is a less restrictive "write-through" concept for memory accesses inside a _try. But I think its benefit is minor and not worth it, as the amount of code directly under _try is very small and is usually not performance-critical.

>> ..I don't want to add another way for unmodeled control flow to break code.
>
> I would really love to hear (and find a way to improve) if there is any place in this design & implementation which is not sound or robust.
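
A minimal sketch of the translation suggested in this message, written in portable C++ so it can be compiled anywhere. It is not the wiki example, which is not reproduced in this thread; the nesting, the trace log and all function names are hypothetical. Note one limitation, in line with the message calling the schemes only "roughly equivalent": a catch(...) that rethrows models only the exceptional path through a _finally, while a real _finally also runs on normal exit. The second function shows the existing construct the thread refers to, a goto that leaves a catch block, which C++ already permits.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical trace of which regions ran, in order.
static std::vector<std::string> trace;

// Nested handlers written the way the message suggests.
void nested_handlers() {
    try {                                   // outer "_try"
        try {                               // inner "_try"
            trace.push_back("body");
            throw std::runtime_error("stand-in for an exception");
        } catch (...) {                     // in place of "_finally"
            trace.push_back("inner cleanup");
            throw;                          // keep unwinding outward
        }
    } catch (...) {                         // in place of "_except()"
        trace.push_back("outer handler");
    }
}

// A goto that leaves a handler, which C++ already permits.
int goto_out_of_catch() {
    int result = 0;
    try {
        throw 42;
    } catch (...) {
        result = 1;
        goto done;                          // transfers control out of the handler
    }
    result = 2;                             // skipped by the goto
done:
    return result;
}

// Checks the order in which nested_handlers() visits its regions.
void check_order() {
    trace.clear();
    nested_handlers();
    const std::vector<std::string> expected{"body", "inner cleanup", "outer handler"};
    assert(trace == expected);
}
```

Compiling nested_handlers() with clang for a Windows target and inspecting the emitted IR is the experiment the message proposes; the sketch only fixes the source shape to start from.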