Eli Friedman via llvm-dev
2020-Apr-02  20:48 UTC
[llvm-dev] [RFC] [Windows SEH] Local_Unwind (Jumping out of a _finally) and -EHa (Hardware Exception Handling)
  *   When a goto in a _finally occurs, we must "unwind" to the target code, not just "jump" to target label

I'm not sure what you're trying to say here.  In the Microsoft ABI, goto out of a catch block also calls into the unwinder.  We have to run any destructors, and return from the funclet (catchret/cleanupret).

  *   The call inside a _try is an invoke with EH edge.  So it's perfectly modeled.

If you call a nounwind function, the invoke will be transformed to a plain call.  And we're likely to infer nounwind in many cases (for example, functions that don't call any other functions).  There isn't any way to stop this currently; I guess we could add one.

I'm sort of unhappy with the fact that this is theoretically unsound, but maybe the extra effort isn't worthwhile, as long as it doesn't impact any transforms we realistically perform.  How much extra effort it would be sort of depends on what conclusion we reach for the "undefined behavior" part of this, which is really the part I'm more concerned about.

-Eli

From: Ten Tzen <tentzen at microsoft.com>
Sent: Wednesday, April 1, 2020 7:55 PM
To: Eli Friedman <efriedma at quicinc.com>; llvm-dev <llvm-dev at lists.llvm.org>
Cc: aaron.smith at microsoft.com
Subject: [EXT] RE: [llvm-dev] [RFC] [Windows SEH] Local_Unwind (Jumping out of a _finally) and -EHa (Hardware Exception Handling)

  *   Take your example, replace "_try" with C++ "try", replace the "_finally" with "catch(...)" with a "throw;" at the end of the catch block, replace the "_except()" with "catch(...)", and see what clang currently generates.  That seems roughly equivalent to what you're trying to do.  Extending this scheme to encompass try/finally seems like it shouldn't require new datastructures in clang's AST, or new entrypoints in the C runtime.
  *   But I could be missing something; I'm not deeply familiar with the differences between C++ and SEH unwind handlers.

Right, you are missing something.  The semantics of a "goto" out of an SEH _finally are totally different from those of a goto out of an EH catch handler.  That's why I illustrated the semantics of "jumping-out-of-a _finally" in the first example in the document.

When a goto in a _finally occurs, we must "unwind" to the target code, not just "jump" to the target label.  This is why it's called "local_unwind()": depending on the EH state of the target, the local_unwind() runtime invokes each _finally properly along the way to the final target.  Again, taking case #2 as an example, the outer _finally must be invoked before control goes to $t10.

  *   To be clear, we're talking about making all memory accesses, including accesses to local variables, in the try block "volatile"?  So the compiler can't do any optimization on them?  That gets you some fraction of the way there; there are no issues with SSA registers if there aren't any live SSA across the edge.  And the compiler can't move volatile operations around each other.  That leaves open the question about what to do about calls; we don't have any generic way to mark a call "volatile".  I guess we could add something.  At that point, basically every memory operation and variable would be completely opaque to the compiler, which would sort of force everything to work, I guess.  But at the cost of terrible performance if there's any non-trivial code in the block.  (And it's still not theoretically sound, because the compiler can introduce local variables.)

The call inside a _try is an invoke with EH edge.  So it's perfectly modeled.  A HW exception that occurs in a callee will be properly caught and handled.

Volatizing the _try block is done in the Clang FE.  So LLVM BE temporary variables will not be volatile.

Finally, I would not say it's at the cost of terrible performance because:
  (1)  Again, in real-world code, only a very small amount of code is directly inside a _try, and it is mostly not performance critical.
  (2)  If the HW exception flow is perfectly modeled with iload/istore or with a pointer-test explicit flow model, optimizations will likely be severely hindered.  The resulting code will probably not be much better than volatile code.

Thanks,
--Ten

From: Eli Friedman <efriedma at quicinc.com>
Sent: Wednesday, April 1, 2020 5:41 PM
To: Ten Tzen <tentzen at microsoft.com>; llvm-dev <llvm-dev at lists.llvm.org>
Cc: Aaron Smith <aaron.smith at microsoft.com>
Subject: [EXTERNAL] RE: [llvm-dev] [RFC] [Windows SEH] Local_Unwind (Jumping out of a _finally) and -EHa (Hardware Exception Handling)

Reply inline

From: Ten Tzen <tentzen at microsoft.com>
Sent: Wednesday, April 1, 2020 3:54 PM
To: Eli Friedman <efriedma at quicinc.com>; llvm-dev <llvm-dev at lists.llvm.org>
Cc: aaron.smith at microsoft.com
Subject: [EXT] RE: [llvm-dev] [RFC] [Windows SEH] Local_Unwind (Jumping out of a _finally) and -EHa (Hardware Exception Handling)

  *   For goto in finally, why are you inventing a completely new mechanism for handling this sort of construct?  What makes this different from our existing handling of goto out of catch blocks?  Maybe there's something obvious here I'm missing, but it looks like essentially the same problem, and I don't see any reason why we can't use the existing solution.

No, no new mechanism is invented.  The design employs the existing mechanism to model the third exception path caused by _local_unwind (in addition to normal execution and exception handling flow).  In earlier discussion with Joseph, adding a second EH edge to InvokeInst was briefly discussed, but was quickly dropped as it's clearly a long shot.

Yes, right, it's not really a big extension of the fundamental model.  It still seems like you're doing more than what's necessary.

The extended model intends to solve the third control-flow that doesn't seem representable today.  Take case #2 of the first example in the wiki page as an example: the control flowing from normal execution of the inner _finally, passing through the outer _finally, and landing in $t10 cannot be represented by LLVM IR.  Or could you elaborate how to achieve it?  (Bear with me as I'm new to the Clang & LLVM world.)

Take your example, replace "_try" with C++ "try", replace the "_finally" with "catch(...)" with a "throw;" at the end of the catch block, replace the "_except()" with "catch(...)", and see what clang currently generates.  That seems roughly equivalent to what you're trying to do.  Extending this scheme to encompass try/finally seems like it shouldn't require new datastructures in clang's AST, or new entrypoints in the C runtime.

But I could be missing something; I'm not deeply familiar with the differences between C++ and SEH unwind handlers.

  *   ..In general, UB means the program can do anything.

Sorry, what is UB?

Undefined behavior.

Right, we are not modeling HW exceptions in control flow as it's not necessary.  For C++ code, we don't care about the value in register, local variable, SSA and so on.  All we need is that "live local-objects got dtored properly when HW exception is unwound and handled".  For C code, only code under a _try construct is affected.  Agree that making memory accesses there volatile is sub-optimal.  But it should not cause correctness issues.

To be clear, we're talking about making all memory accesses, including accesses to local variables, in the try block "volatile"?  So the compiler can't do any optimization on them?  That gets you some fraction of the way there; there are no issues with SSA registers if there aren't any live SSA across the edge.  And the compiler can't move volatile operations around each other.  That leaves open the question about what to do about calls; we don't have any generic way to mark a call "volatile".  I guess we could add something.  At that point, basically every memory operation and variable would be completely opaque to the compiler, which would sort of force everything to work, I guess.  But at the cost of terrible performance if there's any non-trivial code in the block.  (And it's still not theoretically sound, because the compiler can introduce local variables.)

In MSVC, there is a less restrictive "write-through" concept for memory access inside a _try.  But I think the benefit of it is minor and it's not worth it, as the amount of code directly under _try is very small and usually is not performance critical.

  *   ..I don't want to add another way for unmodeled control flow to break code.

I would really love to hear (and find a way to improve) if there is any place in this design & implementation which is not sound or robust.

Thanks,
--Ten

From: Eli Friedman <efriedma at quicinc.com>
Sent: Wednesday, April 1, 2020 1:20 PM
To: Ten Tzen <tentzen at microsoft.com>; llvm-dev <llvm-dev at lists.llvm.org>
Cc: Aaron Smith <aaron.smith at microsoft.com>
Subject: [EXTERNAL] RE: [llvm-dev] [RFC] [Windows SEH] Local_Unwind (Jumping out of a _finally) and -EHa (Hardware Exception Handling)

Resending; I accidentally dropped llvm-dev.

-Eli

From: Eli Friedman
Sent: Wednesday, April 1, 2020 1:01 PM
To: Ten Tzen <tentzen at microsoft.com>
Cc: aaron.smith at microsoft.com
Subject: RE: [EXT] [llvm-dev] [RFC] [Windows SEH] Local_Unwind (Jumping out of a _finally) and -EHa (Hardware Exception Handling)

This looks like it outlines the implementation pretty well.

For goto in finally, why are you inventing a completely new mechanism for handling this sort of construct?  What makes this different from our existing handling of goto out of catch blocks?  Maybe there's something obvious here I'm missing, but it looks like essentially the same problem, and I don't see any reason why we can't use the existing solution.

For hardware exceptions, the proposal seems to have big fundamental problems.  I see two basic problems:

  1.  How do you actually generate an exception?  In general, UB means the program can do anything.  So unless you define some rule that says otherwise, the only defined way to trigger an exception is using Windows API calls.  If you want something else, we need to define new rules.  At the C level, we need to redefine some specific constructs to trigger an exception instead of UB.  And at the IR level, we need to annotate specific IR instructions in a way that passes can reasonably check, and add new LangRef rules describing those semantics.  I mean, you can try to sort of hand-wave this and say it should "just work" if code happens to trigger a hardware exception.  But if there aren't actually any rules, I'm afraid we'll end up with an infinitely long tail of "optimization X breaks some customer's code, so add a hack to disable it in EHa mode".
  2.  If we're not modeling the control flow implied by an exception, how do we ensure that local variables and SSA registers have the right values when the exception is caught?  Sure, invoke is clunky, but it at least makes control flow well-defined.  Adding "volatile" to every IR load and store instruction, including accesses to local variables, seems terrible for both optimization and correctness.  Our handling of setjmp is already a complete mess; I don't want to add another way for unmodeled control flow to break code.  (See also http://nondot.org/sabre/LLVMNotes/ExceptionHandlingChanges.txt, for a proposal to make invoke less messy.)

-Eli

From: llvm-dev <llvm-dev-bounces at lists.llvm.org> On Behalf Of Ten Tzen via llvm-dev
Sent: Tuesday, March 31, 2020 9:13 PM
To: llvm-dev at lists.llvm.org
Cc: aaron.smith at microsoft.com
Subject: [EXT] [llvm-dev] [RFC] [Windows SEH] Local_Unwind (Jumping out of a _finally) and -EHa (Hardware Exception Handling)

Hi, all,

The intent of this thread is to complete the support for Windows SEH.  Currently there are two major missing features: jumping out of a _finally and hardware exception handling.

The document below is my proposed design and implementation to fully support SEH on LLVM.  I have completely implemented this design on a branch in repo: https://github.com/tentzen/llvm-project.  It now passes MSVC's in-house SEH suite.

Sorry for this long write-up.  For better readability, please read it on https://github.com/tentzen/llvm-project/wiki

Special thanks to Joseph Tremoulet for his earlier comments and suggestions.

Note: I just subscribed to llvm-dev and am probably not on the list yet.  So please reply with my email address (tentzen at microsoft.com) explicitly in the To-list.

Thanks,
--Ten
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20200402/3d15911d/attachment.html>
Ten Tzen via llvm-dev
2020-Apr-03  01:01 UTC
[llvm-dev] [RFC] [Windows SEH] Local_Unwind (Jumping out of a _finally) and -EHa (Hardware Exception Handling)
From SEH's perspective, unwinding means invoking the outer _finally.  Take the
simple example below:
    volatile int* Fault = 0;
    __try {
      __try {
        *Fault += 1;
      }
      __finally {
        printf(" inner finally:  Counter = %d\n\r", ++Counter);
        goto t10;
      }
    }
    __finally {
      printf(" outer finally  Counter = %d\n\r", ++Counter);
    }
    printf(" after outer try_finally: Counter = %d\n\r", Counter);
    t10:;
  ...
Before control reaches "t10:", the runtime invokes the outer _finally
funclet.  Detailed steps:
  1.  On "goto t10", call the _local_unwind() runtime
  2.  _local_unwind() invokes the outer _finally funclet
  3.  _local_unwind() then passes control back to "t10:".
So with the existing IR model, the re-entry from the runtime to "t10:" is
not visible to the optimizer.
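The ordering in the steps above can be sketched with a toy model in portable C++.  This is only an illustration of the sequence of events, not the Windows runtime: FinallyStack, local_unwind and simulate_goto_from_inner_finally are hypothetical names, and each pending _finally is modeled as a plain callback.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Toy model only: each still-pending _finally is a callback on a stack,
// outermost first. None of these names come from the Windows runtime.
struct FinallyStack {
  std::vector<std::function<void()>> pending;

  // Hypothetical stand-in for _local_unwind(): run every pending _finally
  // nested deeper than target_depth, innermost first, then return so the
  // caller can continue at the goto target.
  void local_unwind(std::size_t target_depth) {
    while (pending.size() > target_depth) {
      std::function<void()> handler = std::move(pending.back());
      pending.pop_back();
      handler();
    }
  }
};

std::vector<std::string> simulate_goto_from_inner_finally() {
  std::vector<std::string> log;
  FinallyStack stack;
  stack.pending.push_back([&log] { log.push_back("outer finally"); });

  // The inner _finally is already running, so it is no longer pending.
  log.push_back("inner finally");
  // "goto t10" leaves both _try levels, so the target depth is 0.
  stack.local_unwind(0);
  log.push_back("t10");
  return log;
}
```

In this model the log reads inner finally, outer finally, t10, which is the order described in the steps: the outer _finally runs before control lands on the label.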
Our proposed solution is to add a pseudo _try-except like below so the
reentrance control-flow is represented in IR:
  __try {   //  a pseudo try level to dispatch Local_unwind flow
    __try {
      __try {
        *Fault += 1;
      }
      __finally {
        printf(" inner finally:  Counter = %d\n\r", ++Counter);
        goto t10;
      }
    }
    __finally {
      printf(" outer finally  Counter = %d\n\r", ++Counter);
    }
  } __except (_IsLocalUnwind()) {
    goto t10;
  }
  printf(" after outer try_finally: Counter = %d\n\r", Counter);
  t10:;
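As a rough portable analogue of this pseudo try level, the sketch below uses ordinary C++ exceptions: the "goto" in the inner _finally is modeled as throwing a marker, the outer _finally as a scope-exit object, and the pseudo level as a handler that performs the final jump.  LocalUnwind, ScopeFinally and simulate_pseudo_try_dispatch are hypothetical names, and this is not how the proposed implementation lowers SEH; it only shows why an explicit handler makes the re-entry edge visible.

```cpp
#include <functional>
#include <string>
#include <utility>
#include <vector>

struct LocalUnwind {};  // hypothetical marker for "goto out of a _finally"

// Runs its callback at scope exit, standing in for a _finally block.
struct ScopeFinally {
  std::function<void()> body;
  explicit ScopeFinally(std::function<void()> b) : body(std::move(b)) {}
  ~ScopeFinally() { body(); }
};

std::vector<std::string> simulate_pseudo_try_dispatch() {
  std::vector<std::string> log;
  try {  // pseudo level that dispatches the local unwind
    ScopeFinally outer([&log] { log.push_back("outer finally"); });
    log.push_back("inner finally");
    throw LocalUnwind{};  // stands in for "goto t10" in the inner _finally
  } catch (const LocalUnwind&) {
    goto t10;  // the re-entry edge is now an ordinary handler exit
  }
  log.push_back("after outer try_finally");
t10:
  log.push_back("t10");
  return log;
}
```

The jump to t10 now originates from a handler that the compiler can see, rather than from a runtime call that returns to a label behind the optimizer's back.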
For C++ code, going out of a catch handler is simple.  In the similar example
below, the outer catch handler is NOT invoked.  At the end of the inner catch
handler, control passes directly to t10:.
    try {
      try {
        throw(++Counter);
      }
      catch (...) {
        printf(" inner catch: goto : Counter = %d\n\r", ++Counter);
        goto t10;
      }
    }
    catch(int i) {
      printf(" outer catch: Counter = %d\n\r", ++Counter);
    }
    printf(" after outer try_catch: Counter = %d\n\r", Counter);
    t10:;
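A self-contained version of this example, written so the order of events can be checked, might look like the following.  The function name run_goto_out_of_catch and the log vector are illustrative additions, not part of the original snippet.

```cpp
#include <string>
#include <vector>

// Records which handlers ran, so the order can be inspected afterwards.
std::vector<std::string> run_goto_out_of_catch() {
  std::vector<std::string> log;
  int counter = 0;
  try {
    try {
      throw ++counter;
    } catch (...) {
      log.push_back("inner catch");
      goto t10;  // leaves the handler; the exception is already handled
    }
  } catch (int) {
    log.push_back("outer catch");  // never reached in this example
  }
  log.push_back("after outer try_catch");
t10:
  log.push_back("t10");
  return log;
}
```

Only the inner handler and the label are recorded: the outer catch clause is a handler for exceptions, not a cleanup, so leaving the inner handler with goto does not run it.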
  *   If you call a nounwind function, the invoke will be transformed to a plain
call.  And we're likely to infer nounwind in many cases (for example,
functions that don't call any other functions).  There isn't any way to
stop this currently; I guess we could add one.
For -EHa where HW exception must be handled, nounwind-attribute is ignored (or
reset) for callees directly inside a _try.
  *   I'm sort of unhappy with the fact that this is theoretically unsound,
but maybe the extra effort isn't worthwhile, as long as it doesn't
impact any transforms we realistically perform.  How much extra effort it would
be sort of depends on what conclusion we reach for the "undefined
behavior" part of this, which is really the part I'm more concerned
about.
Which part (-EHa or Local_unwind) is theoretically unsound to you?  Could you be
more specific about what UB problem could arise in this design?
Thanks,
--Ten
Eli Friedman via llvm-dev
2020-Apr-03  03:47 UTC
[llvm-dev] [RFC] [Windows SEH] Local_Unwind (Jumping out of a _finally) and -EHa (Hardware Exception Handling)
Reply inline.
From: Ten Tzen <tentzen at microsoft.com>
Sent: Thursday, April 2, 2020 6:01 PM
To: Eli Friedman <efriedma at quicinc.com>; llvm-dev <llvm-dev at
lists.llvm.org>
Cc: aaron.smith at microsoft.com
Subject: [EXT] RE: [llvm-dev] [RFC] [Windows SEH] Local_Unwind (Jumping out of a
_finally) and -EHa (Hardware Exception Handling)
  *   Unwinding from SEH's perspective is to invoke outer _finally.
  *   For C++ code, At the end of inner catch-handler, control directly passes
back to t10:.
If you have local variables with destructors, it doesn't.  The destructors
have to run first.
If you have a local variable with a destructor, clang emits two calls to the
destructor: one along the normal path, and one along the unwind path.   The goto
jumps to the "normal" path destructor call, it calls the destructor,
and then the code jumps from there to the final destination.
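The source-level effect of this can be seen in a small portable C++ sketch.  Guard and run_goto_with_destructor are hypothetical names chosen for illustration; the sketch shows only the observable ordering, not the IR that clang emits.

```cpp
#include <string>
#include <vector>

// Logs when it is destroyed, so the destructor call becomes observable.
struct Guard {
  std::vector<std::string>& log;
  explicit Guard(std::vector<std::string>& l) : log(l) {}
  ~Guard() { log.push_back("~Guard"); }
};

std::vector<std::string> run_goto_with_destructor() {
  std::vector<std::string> log;
  {
    Guard g(log);  // local with a destructor, live across the try/catch
    try {
      throw 1;
    } catch (...) {
      log.push_back("in catch");
      goto done;  // leaves g's scope, so ~Guard runs before the label
    }
  }
done:
  log.push_back("done");
  return log;
}
```

The destructor entry appears between the handler and the label, so the goto is not a bare jump even in plain C++.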
Currently, we do the same thing for SEH: there's a normal path and an unwind
path.  We outline the code into a separate function, and call it from both
paths.  This is essentially identical to what we do for try/catch.  There
isn't any obvious reason we can't extend this to handle goto the same
way.  In fact, clang already supports goto across a finally block:
void f(int a);
void f(int ex, int lu, int lu2, int lu3) {
  __try {
    __try {
      f(ex);
    } __except (ex) {
      if (lu3) goto T;
      f(lu);
    }
  } __finally {
    f(lu);
  }
T:;
}
(If the goto itself is in a finally block, it currently doesn't work, but
that's a relatively minor detail.)
This is not the same as what MSVC implements, but it isn't obviously wrong. 
If you're going to write a bunch of new code to implement something else,
you need to justify it.
  *   If you call a nounwind function, the invoke will be transformed to a plain
call.  And we're likely to infer nounwind in many cases (for example,
functions that don't call any other functions).  There isn't any way to
stop this currently; I guess we could add one.
For -EHa where HW exception must be handled, nounwind-attribute is ignored (or
reset) for callees directly inside a _try.
In other words, you need to mark the calls "volatile".  (You could try
to track the region that's inside the try block for transforms that care,
but that's more complicated for no benefit.)
Also, even if you block directly removing the unwind edge, passes like IPSCCP
could still prove that the edge isn't feasible and reason based on that.  So
you really need to block all interprocedural transforms, not just ones that mess
with the unwind edge.
  *   I'm sort of unhappy with the fact that this is theoretically unsound, but
maybe the extra effort isn't worthwhile, as long as it doesn't impact
any transforms we realistically perform.  How much extra effort it would be sort
of depends on what conclusion we reach for the "undefined behavior"
part of this, which is really the part I'm more concerned about.
Which part (-EHa or Local_unwind) is theoretically unsound to you?  Could you be
more specific what UB problem could arise in this design?
The unsoundness is the possibility of optimizations introducing local variables,
which I mentioned before.  The resulting variables won't use volatile
load/store operations, so they won't be properly preserved.  Actually,
thinking about it a bit more, I'm not sure it's completely theoretical;
you could run into trouble with constant hoisting.
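A loose standard-library analogy for why volatile matters under unmodeled control flow is setjmp/longjmp, where the language rules only guarantee the value of a local modified between the two calls if it is volatile-qualified.  The sketch below is that analogy only; it says nothing about how -EHa is implemented, and value_seen_after_longjmp is a hypothetical name.

```cpp
#include <csetjmp>

static std::jmp_buf env;

// The local is modified after setjmp and read after longjmp returns here.
// Because it is volatile, its value at that point is well-defined; without
// volatile the value would be indeterminate.
int value_seen_after_longjmp() {
  volatile int counter = 0;
  if (setjmp(env) == 0) {
    counter = 42;
    std::longjmp(env, 1);  // control flow the optimizer does not model
  }
  return counter;
}
```

The guarantee covers only variables the programmer declared volatile; a temporary introduced by an optimization pass has no such qualifier, which is the gap described above.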
The UB problem is what I outlined in my very first reply.  You need to define
some way that isn't UB to trigger an exception, or else handling the
resulting exception is formally meaningless.  If the behavior is undefined, it
doesn't matter what happens at that point.
-Eli