John Reagan via llvm-dev
2018-Jan-26 15:54 UTC
[llvm-dev] [RFC] Improving compact x86-64 compact unwind descriptors
Here is our proposal to extend/enhance the x86-64 compact unwind descriptors to fully describe the prologue/epilogue for asynchronous unwinding. I believe there are missing/lacking CFI directives as well, but I'll save that for another thread. Asynchronous Compact Unwind Descriptors Ron Brender, VMS Software, Inc. Revised January 25, 2018 1 Introduction This document proposes means to extend so-called compact unwind descriptors to support fully asynchronous exception handling. This will make extended compact unwind descriptors an alternative to DWARF CFI (call frame information) for achieving asynchronous exception support. Compact unwind descriptors can and have been used in both 64- and 32-bit environments. However, this proposal addresses only 64-bit environments. While the ideas presented here can be readily adapted for use in a 32-bit environment, for simplicity this document makes no attempt to do so. There are generally three kinds of information that together form the heart of modern software exception handling systems: 1. Information that is used to divide the remaining unwind information into groups that are specific to particular regions of memory (often, but not necessarily, associated with a single function) as well as provide a way to efficiently search for and identify the grouping that is associated with a particular address in memory. 2. Information that can be used to virtually or actually unwind from the call frame of an executing function to the call frame of its caller (at the point of the call). 3. Identification of an associated personality routine that is invoked by the general exception handling mechanization to guide how processing should proceed for a given function, as well as additional “language specific data” needed for the personality routine to do its job. Note that the personality routine and its related data are specified as an adjunct to compiled code and totally opaque to the general mechanism (other than the specified interface). Note in particular that C++ exception handling is built on top of a personality routine and language specific data area ABI that itself can be implemented using either DWARE CFI or extended compact unwind information as described here. The choice between the two is transparent to C++. 2 Compact Unwind Overview This section provides a brief overview of key features of the LLVM compact unwind design. It does not attempt a comprehensive re-statement of all aspects of the design except to the extent necessary to motivate and understand later proposed changes and enhancements. 2.1 Compact Unwind Group Description A compact unwind group consist of five fields, as follows: | 63 32 31 0 | |-------------------------------------------------------------------| | STARTING-ADDRESS | | LENGTH | COMPACT-UNWIND-DESCRIPTION | | PERSONALITY-FUNCTION-POINTER | | LANGUAGE-SPECIFIC-DATA-ADDRESS | |-------------------------------------------------------------------| STARTING-ADDRESS (64-bits) is the lowest address of a region of memory occupied by some code, typically the entry point of a function. LENGTH (32-bits) is the number of bytes included in this group, typically including all and only the code of a function. COMPACT-UNWIND-DESCRIPTION (32-bits) is a description of the fully formed frame of a function and how to unwind it. This is described further following. PERSONALITY-FUNCTION-POINTER (64-bit) is a pointer to the personality routine. LANGUAGE-SPECIFIC-DATA-ADDRESS (64-bits, sometimes abbreviated LSDA) is a pointer to some data to be passed to the personality routine when it is called. A key observation is that the starting address plus length way of describing a group means that the set of groups for a compilation unit need not describe all of the code in that unit. In particular, it appears to be expected that no unwind information need be generated for leaf functions. On the other hand, it is reasonable to expect that the groups that are emitted are ordered by the starting address. This means that a simple and fast binary search can be used to map an address to the group that applies to that address, if any. It is useful to note that the run-time representation of unwind information can vary from little more than a simple concatenation of the compile-time information to a substantial rewriting of unwind information by the linker. The proposal favors simple concatenation while maintaining the same ordering of groups as their associated code. 2.2 Compact Unwind Frame Description A compact unwind frame description describes a frame in sufficient detail to be able to unwind that frame to the frame of its caller. | 31 28 27 24 23 0 | |-------------------------------------------------------------------| | FLAGS | MODE | | |-------------------------------------------------------------------| At the top most level, there are four bits that are not of further interest here. Interpretation of these bits is neither used nor changed. Also at the top-level is a 4-bit mode field. This is the tag of a discriminated (tagged, variant) union that selects the interpretation of the remaining 24 bits. Of the 16 possible modes, only 5 are defined : Code Meaning Description 0 Old “Old” is presumed to refer to some historical design that is no longer of interest. It is treated here as Reserved. 1 RBP-based The frame uses the RBP register as a frame pointer. The size of the frame can frame vary during execution. 2 RSP-based The frame uses RSP as the frame pointer. The size of the frame is fixed (at frame compilation time). 3 Large RSP- The frame uses RSP as the frame pointer, The size of the frame is fixed (at based frame compilation time); however, that size is too large to express within this 32-bit descriptor encoding. 4 DWARF The frame, for whatever reason, cannot be adequately described using the escape compact unwind frame description. The remaining 24-bits are an index into what the DWARF standard calls the .debug_frame section (__eh_frame in LLVM). 2.2.1 RBP-based Frame (MODE=1) For a RBP-based frame, the remaining 24 bits are encoded as follows: | 23 16 | 15 | 14 0 | |-------------------------------------------------------------------| | OFFSET | 0 | REGS | |-------------------------------------------------------------------| In a RBP-based frame the RBP register is pushed on the stack immediately after the return address, then RSP is moved to RBP. To unwind, RSP is restored with the current RPB value, then RBP is restored by popping off the stack, and the return is done by popping the stack once more into the instruction pointer. All preserved registers are saved in a small range in the stack that starts at RBP-8 to RBP-2040. The offset/8 relative to RBP is encoded in the 8-bit OFFSET field. The registers saved are encoded in the 15-bit REGS field as five 3-bit entries. 2.2.2 RSP-Based Frame (MODE=2) For a RSP-based frame, the remaining 24 bits are encoded as follows: |23 16 | 15 13 | 12 10 | 9 0 | |-------------------------------------------------------------------| | SIZE | | CNT | REG_PERM | |-------------------------------------------------------------------| In a RSP-based frame the stack pointer serves directly as the frame pointer and RBP is available for use as a general register. Upon entry, the stack pointer is decremented by 8*SIZE bytes (the maximum stack allocation is thus 2040 bytes). To unwind, the stack size is added to the stack pointer, and completed by popping the stack once more into the instruction pointer. All preserved registers are saved on the stack immediately after the return address. The number of registers saved (up to 6) is encoded in the 3-bit CNT field. The 11-bit REG_PERM field encodes which registers were saved and in what order. 2.2.3 Large RSP-Based Frame (MODE=3) For a large RSP-based frame, the remaining 24 bits are encoded as follows: |23 16 | 15 13 | 12 10 | 9 0 | |-------------------------------------------------------------------| | SIZE | ADJ | CNT | REG_PERM | |-------------------------------------------------------------------| This case is like the previous, except the stack size is too large to encode in the compact unwind encoding. Instead, the function must include a "subq $nnnnnnnn, RSP" instruction in its prologue to allocate the stack. The offset from the entry point of the function to the nnnnnnnn value in the function is given in the SIZE field. Depending on the exact instructions used to save registers (PUSH versus MOV), the nnnnnnnn value in the instruction stream may not be quite the full stack size. ADJ * 8 is the additional adjustment needed to get the actual size. 2.2.4 DWARF Escape (MODE=4) The frame, for whatever reason, cannot be adequately described using a compact unwind frame description. The remaining 24-bits are an index into what the DWARF standard calls the .debug_frame section (called __eh_frame in LLVM). 3 Asynchronous Changes and Enhancements It is immediately obvious that omission of unwind information for leaf functions (with any kind of frame) precludes handling an exception that might occur during its execution. It follows that unwind information must cover all of the code of a module (with one exception discussed below). But if successive unwind groups are ordered (as previously assumed) and also leave no gaps, then the LENGTH field is redundant and can be omitted. The beginning address of a following group is always one byte past the end of the predecessor group. There remains only the question of how to identify the last group of a set. It should also be clear that the unwind representation described in the prior section is not sufficient to unwind from an asynchronous exception that might occur in either the prologue or epilogue of a function. To see this consider what would happen if an exception occurred at either the entry point or the return instruction of either a RBP- or RSP-frame function. To be able to handle asynchronous exceptions at any point during function execution, it is necessary to add additional information to each unwind group. These two considerations can be combined. The result is simply to repurpose the LENGTH field to encode prologue and epilogue information. 3.1 Extended MODEs To preserve backward compatibility and to allow intermixing of traditional and extended compact unwind groups, new MODEs are defined as follows: Code Meaning Description ---------------------------------------------------------------------------------------------- 8 Null frame See below. 9 Extended RBP-based frame Like mode 1 but combined with extended prologue/epilogue information in place of a LENGTH field 10 Extended RSP-based frame Like mode 2 but combined with extended prologue/epilogue information in place of a LENGTH field 11 Extended Large RSP-based Like mode 3 but combined with extended prologue/epilogue frame information in place of a LENGTH field 12 Extended parameter(s) See below. A null frame (MODE = 8) is the simplest possible frame, with no allocated stack of either kind (hence no saved registers). A null frame might be considered just a special case of a RSP-based frame with a stack size of zero. But unlike other frames, its frame pointer is usually not 16-byte aligned. It appears technically feasible for a null frame function to have a personality routine. However, the utility of such a capability seems too meager to justify allowing this. We propose to not support this. There remains the question of whether it is necessary or at least desirable to strictly apply the requirement that “all” code be covered by an unwind group. Based on successful experience in OpenVMS on the Itanium architecture (as well as Windows systems we think), this alternative is proposed: If the first attempt to lookup an unwind group for an exception address fails, then it is (tentatively) assumed to have occurred within a null frame function or in a part of a function that is adequately described by a null frame. The presumed return address is (virtually or actually) popped from the top of stack and looked up. This second attempted lookup must succeed, in which case processing continues normally. A failure is a fatal error. This concept of a “default null frame group” is especially convenient for dealing with disconnected code sequences such as trampolines, PLTs, and the like. The extended RBP-based and RSP-based frame modes (MODEs = 9 and 10) are simple supersets of their more traditional predecessors. The large RSP-based frame mode (MODE = 11) is also a superset of mode 3 except that instead of finding the stack size in the instruction stream, it is found in a preceding extended parameter unwind group (MODE = 12). This difference is essential for support of execute-only code. To understand the extended parameter group, suppose that two groups occur one after the other but have the same STARTING-ADDRESS value. A binary search using the STARTING- ADDRESS field will ignore the first of the pair because the address being sought can never be at least as large as the first but less than the second. Such a group can be used as an escape convention that allows inserting additional information into the sequence of groups that otherwise cannot be easily included. The extended modes are described more fully in the following sections. 3.2 Function Parts Functions are generally considered to consist of three distinct parts: 1. A prologue, which allocates local storage, saves registers that must be preserved for the benefit of callers, and possibly other housekeeping (such as initializing any local state necessary for the correct management of the function’s execution). 2. A body, which performs the main work of the function. 3. An epilogue, which makes the result available to the caller, and undoes the effects of the prologue (restores preserved registers, deallocates local storage, and so on). Here a word of caution is in order. This vocabulary tends to be applied at multiple levels in software architecture and each level has its own set of issues to consider and resulting requirements to be observed. Note that a null frame function has no distinct prologue, body or epilogue. Every instruction can be viewed as simultaneously in all three parts, or in none of them. This leads to the following proposal for the upper part of the extended compact unwind quadword for use in combination with MODE = 8 in the lower part. | 63 48 | 47 32 | |-------------------------------------------------------------------| | RESERVED | 0 ... 0 | |-------------------------------------------------------------------| | 31 16 | 15 0 | |-------------------------------------------------------------------| | 0 0 0 0 | 1 0 0 0 | 0 ... 0 | 0 ... 0 | |-------------------------------------------------------------------| 3.2.1 Function Prologue For the purposes of exception handling, the key steps are: 1. Allocate space on the stack in which to save preserved registers and for other purposes 2. Save the registers that must be preserved. The DWARF call frame information (CFI) provides a description that is precise and accurate at each and every instruction in a function (not limited to just the prologue). But experience shows that this representation is bulky in space and expensive in time to decode and interpret. Indeed, compact unwind descriptors as described in Section 2 were created to avoid these issues. Experience with OpenVMS on the Alpha and Itanium architectures shows that a key constraint can greatly simplify the unwind description: require that no preserved register can be changed until all of them have been saved. The last such save ends the prologue and the next instruction begins the body. An unwind does not need (must not) restore any preserved registers because they are still valid. A simple length from the entry point suffices to describe this boundary. [The body does not necessarily need to begin until at least one of the preserved registers is actually changed.] 3.2.1.1 RSP-Based Frame (MODE = 10) For a RSP-based frame, it is also necessary to know which instruction adjusts the stack pointer: before this instruction an unwind must only (virtually or actually) pop the return address; and after this instruction, the stack pointer must first be incremented to deallocate the stack frame, then the instruction pointer restored. This leads to the following proposal for the upper part of the extended compact unwind quadword for use in combination with MODE = 10 in the lower part. | 63 48 47 46 40 39 32 | |-------------------------------------------------------------------| | RESERVED | | PROLOGUE-SIZE2 | PROLOGUE-SIZE1 | |-------------------------------------------------------------------| Except for the MODE, the lower part is exactly like it would be for MODE = 2. PROLOGUE-SIZE1 is the length in bytes of the first part of the prologue relative to the STARTING-ADDRESS, up to and including the instruction that allocates the stack. PROLOGUE-SIZE2 is the length in bytes of the second part of the prologue relative to the end of the first part, up to a point after the last preserved register is saved and before the first preserved register is changed. (Recall, this point may not be unique.) PROLOGUE-SIZE1 + PROLOGUE-SIZE2 gives the total size of the prologue. The maximum prologue size allowed here is much greater than will be typical. This is deliberate to allow compilers freedom to use shrink-wrap type optimizations in which safe operations that need no body state are moved from the body to the beginning of the prologue. This avoids the cost of stack frame setup and teardown for simple special case handling that often leads to an early exit. 3.2.1.2 Large RSP-based Frame (MODE = 11) This case is like the previous, except the stack size is too large to encode in the compact unwind encoding. The SIZE field is interpreted as an offset relative to the beginning of the containing unwind group to (what is necessarily) an earlier extended parameter group. 3.2.1.3 RBP-Based Frame (MODE = 9) The RBP-based frame is the same as for a RSP-based frame except that there are two instructions that mark the transition between the two parts of the prologue: the push of the prior RBP value onto the stack followed by the copy of RSP to RBP to establish the new frame pointer. There appears to be no reason to not make this simplifying requirement: the push and copy always occur together. This leads to the following proposal for the upper part of the extended compact unwind quadword for use in combination with MODE = 9 in the lower part. | 63 48 47 46 40 39 32 | |-------------------------------------------------------------------| | RESERVED | | PROLOGUE-SIZE2 | PROLOGUE-SIZE1 | |-------------------------------------------------------------------| Except for the MODE, the lower part is exactly like it would be for MODE = 1. PROLOGUE-SIZE1 is the length in bytes of the first part of the prologue relative to the STARTING-ADDRESS, up to and including the instruction that pushes the prior RBP contents. PROLOGUE-SIZE2 is the length in bytes of the second part of the prologue relative to the end of the first part, up to a point after the last preserved register is saved and before the first preserved register is changed. (Recall, this point may not be unique.) PROLOGUE-SIZE1 + PROLOGUE-SIZE2 gives the total size of the prologue. 3.2.2 Function Epilogue For the purposes of exception handling, the key steps in an epilogue are: 1. Restore the registers that were preserved 2. Deallocate space on the stack that was used to preserve registers and for other purposes A key observation is that restoring registers is idempotent. That is, if in the midst of restoring registers when an exception occurs, it will do no harm if all of the preserved registers are restored again when unwinding. In effect, restoration of registers need not be distinguished from the body of a function. It is not until reaching the first instruction that deallocates stack is executed that the “real” epilogue begins. 3.2.2.1 RSP-Based Frame (MODE = 10 or 11) For a RSP-based frame, whether small or large, the only unwind action remaining after stack deallocation is to pop the return address into the instruction pointer (RIP). In the vast majority of cases this means that the epilogue of the code described by an unwind group is just a 1-byte RET instruction that occurs at the end of the unwind group. There are other possibilities but they are rare and con be dealt with separately (see below). It suffices to include a single 1-bit field, E, in the upper part of the extended compact unwind description: | 63 48 47 46 40 39 32 | |-------------------------------------------------------------------| | RESERVED | E | PROLOGUE-SIZE2 | PROLOGUE-SIZE1 | |-------------------------------------------------------------------| The E flag indicates that the unwind group ends with a standard 1-byte RET instruction. 3.2.2.2 RBP-Based Frame (MODE = 9) The RBP-based frame is much the same as for a RSP-based frame except that there are two instructions that mark the transition between the two parts of the prologue: the copy of RBP to RSP to cut back the stack, followed by the pop of RBP to restore what may be the callers frame pointer. Thus the epilogue, for the purposes of unwinding, begins immediately after the copy of RBP to RSP. In the vast majority of cases, the immediately following two instructions will consist of POP RBP and RET. The POP is not idempotent because it changes the stack pointer, but the unwind processing code can distinguish whether the stack has been popped or not based on the address (3 versus 1 byte less than the highest address of the group). It suffices to include a single 1-bit field, E, in the upper part of the extended compact unwind description: | 63 48 47 46 40 39 32 | |-------------------------------------------------------------------| | RESERVED | E | PROLOGUE-SIZE2 | PROLOGUE-SIZE1 | |-------------------------------------------------------------------| The E flag indicates that the unwind group ends with a standard 3-instruction return sequence. 3.3 Special Issues Up until this point, it might appear that each unwind group corresponds one-for-one to the code for a single function. While common, this is not required. Important variations are illustrated here. Issue/Problem Possible Resolution ------------------------------------------------------------------------------------------------- The first part of the prologue is too large to be Use two unwind groups: described by the 8-bit RxP-FRAME-SIZE1 field. 1) a null frame group of any appropriate size, 2) a suitable RxP-based frame. The second part of the prologue is too large to be Use three unwind groups: described by the 8-bit RxP-FRAME-SIZE2 field. 1) a RSP-frame group to describe the first part of the prologue only (PROLOGUE-SIZE1 = the length of the entire group and PROLOGUE-SIZE2 = 0, 2) a RSP-frame group to describe the second part of the prologue (PROLOGUE-SIZE1 & 2 both = 0 and no registers saved), 3) a suitable RxP-based frame group (with PROLOGUE-SIZE1 & 2 both = 0). There is more than one return location (without a Generally use one unwind group for each sequence shared epilogue sequence). of code ending with a RET. A group following a RET will have no prologue (PROLOGUE-SIZE1 & 2 both = 0). There is more than one entry point to a function Generally use one unwind group for each sequence (for example, a FORTRAN multiple entry point of code beginning at an entry point. Each group subroutine or analogous assembly language may have a normal prologue and all but the last equivalents) might have NO epilogue (E flag clear). These examples are not exhaustive, of course. But they illustrate the flexibility of the mechanism, which should be suitable for the vast majority of cases in practice. 3.4 Number of Unwind Groups A simple concatenation of unwind groups by the linker that combines unwind group sections from multiple modules does not, of itself, provide direct information to the unwind software regarding how many unwind groups are present. However, the image information (ELF segments) should suffice to determine this number based on the size of the segment and the known size of each group. Alternatively, the linker might create special symbols that can be used to help determine the number of groups. This proposal leaves this detail to target-specific linker, image loader and exception handling system to specify. 3.5 Interaction with Code Generation The relatively simple and fixed size nature of the extended compact unwind information proposed here depends on observing certain restrictions on optimization that might affect code in prologues and epilogues. These restrictions were noted where relevant throughout this section. Here is a summary of those requirements: 1. Use of any preserved register must be delayed until all of the preserved registers have been saved. 2. In the prologue of a function with a RBP-based frame, the two instructions that save the prior RBP contents and copy the stack pointer to RBP must be kept adjacent. Similarly, in the epilogue of a function with a RBP-based frame, the two instructions that cut back the stack and restore the prior RBP contents must be kept adjacent. [This adjacency requirement could be relaxed by introducing an additional MODE that corresponds to having the RBP saved in memory but not yet updated from RSP. This does not seem worthwhile.] 3. For shrink wrap optimization in the presence of a handler/personality routine, care must be taken to not move instructions into the prologue that might cause an exception. Floating point exceptions and access violations are of particular concern in this regard. 4 System Specific Extensions for OpenVMS Omitted since nobody other than OpenVMS should care but it describes adding an additional four byte extension to the extended compact unwind group to pass along some information from the VAX Macro-32 compiler about VAX register usage. -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: OpenPGP digital signature URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20180126/54f5c06d/attachment.sig>
Jason Molenda via llvm-dev
2018-Jan-27 00:19 UTC
[llvm-dev] [RFC] Improving compact x86-64 compact unwind descriptors
Hi John & Ron, I read through the proposal and had a couple of quick observations. 1. The proposed encoding assumes that the epilogue instructions always come at the end of the function -- or rather, just before the next function. If there is a stack protector __stack_chk_fail sequence, or there is NOP padding between functions, then the epilogue cannot be expressed. The proposed encoding allows for instructions scheduled before the prologue, as long as they're guaranteed to never throw an exception or spill registers, but there's no similar affordance for after the epilogue. 2. In section 3.1, "A null frame (MODE = 8) is the simplest possible frame, with no allocated stack of either kind (hence no saved registers). A null frame might be considered just a special case of a RSP-based frame with a stack size of zero. But unlike other frames, its frame pointer is usually not 16-byte aligned." Did you meant stack pointer here? The x86_64 SysV abi requires 16-byte alignment of the stack pointer before a CALL instruction, but I don't know of any alignment requirements on the frame pointer. 3. In the same section, you propose an alternative "default null frame group" be defined, "If the first attempt to lookup an unwind group for an exception address fails, then it is (tentatively) assumed to have occurred within a null frame function or in a part of a function that is adequately described by a null frame. The presumed return address is (virtually or actually) popped from the top of stack and looked up. This second attempted lookup must succeed, in which case processing continues normally. A failure is a fatal error." I suspect I missed a part of the proposal covering this, but if entries are described by only start address, is there a problem of the previous non-null function's range extending & picking up these functions if they don't have entries? 4. For frameless functions -- small-stack RSP functions -- you're assuming that the compiler saves and restores spilled registers with MOV instructions instead of PUSH/POP instructions. I think this is the normal behavior of the compiler, something like subq $0x98, %rsp movq %rsi, 0x10(%rsp) movq %rdi, 0x18(%rsp) or whatever, but the proposed encoding can only allow for one change in stack size. The current encoding requires this for large-stack RSP functions (we get the offset to the RSP instruction where we read the amount being decremented in the compact unwind encoding), but for small stack frameless functions the compiler can today express in compact unwind a combination of sub and push instructions as long as it's a fixed amount within the range that the small-stack RSP encoding can express.> On Jan 26, 2018, at 7:54 AM, John Reagan <john.reagan at vmssoftware.com> wrote: > > Here is our proposal to extend/enhance the x86-64 compact unwind > descriptors to fully describe the prologue/epilogue for asynchronous > unwinding. I believe there are missing/lacking CFI directives as well, > but I'll save that for another thread. > > > Asynchronous Compact Unwind Descriptors > Ron Brender, VMS Software, Inc. > Revised January 25, 2018 > > 1 Introduction > This document proposes means to extend so-called compact unwind > descriptors to support fully > asynchronous exception handling. This will make extended compact unwind > descriptors an > alternative to DWARF CFI (call frame information) for achieving > asynchronous exception > support. > > Compact unwind descriptors can and have been used in both 64- and 32-bit > environments. > However, this proposal addresses only 64-bit environments. While the > ideas presented here can > be readily adapted for use in a 32-bit environment, for simplicity this > document makes no > attempt to do so. > > There are generally three kinds of information that together form the > heart of modern software > exception handling systems: > > 1. Information that is used to divide the remaining unwind information > into groups that are > specific to particular regions of memory (often, but not > necessarily, associated with a > single function) as well as provide a way to efficiently search for > and identify the > grouping that is associated with a particular address in memory. > > 2. Information that can be used to virtually or actually unwind from > the call frame of an > executing function to the call frame of its caller (at the point of > the call). > > 3. Identification of an associated personality routine that is invoked > by the general exception > handling mechanization to guide how processing should proceed for a > given function, as > well as additional “language specific data” needed for the > personality routine to do its > job. Note that the personality routine and its related data are > specified as an adjunct to > compiled code and totally opaque to the general mechanism (other > than the specified > interface). > > Note in particular that C++ exception handling is built on top of a > personality routine and > language specific data area ABI that itself can be implemented using > either DWARE CFI or > extended compact unwind information as described here. The choice > between the two is > transparent to C++. > > > 2 Compact Unwind Overview > This section provides a brief overview of key features of the LLVM > compact unwind design. It > does not attempt a comprehensive re-statement of all aspects of the > design except to the extent > necessary to motivate and understand later proposed changes and > enhancements. > > 2.1 Compact Unwind Group Description > A compact unwind group consist of five fields, as follows: > > | 63 32 31 0 | > |-------------------------------------------------------------------| > | STARTING-ADDRESS | > | LENGTH | COMPACT-UNWIND-DESCRIPTION | > | PERSONALITY-FUNCTION-POINTER | > | LANGUAGE-SPECIFIC-DATA-ADDRESS | > |-------------------------------------------------------------------| > > STARTING-ADDRESS (64-bits) is the lowest address of a region of memory > occupied by some > code, typically the entry point of a function. > > LENGTH (32-bits) is the number of bytes included in this group, > typically including all and only > the code of a function. > > COMPACT-UNWIND-DESCRIPTION (32-bits) is a description of the fully > formed frame of a > function and how to unwind it. This is described further following. > > PERSONALITY-FUNCTION-POINTER (64-bit) is a pointer to the personality > routine. > > LANGUAGE-SPECIFIC-DATA-ADDRESS (64-bits, sometimes abbreviated LSDA) is a > pointer to some data to be passed to the personality routine when it is > called. > > A key observation is that the starting address plus length way of > describing a group means that > the set of groups for a compilation unit need not describe all of the > code in that unit. In > particular, it appears to be expected that no unwind information need be > generated for leaf > functions. > > On the other hand, it is reasonable to expect that the groups that are > emitted are ordered by the > starting address. This means that a simple and fast binary search can be > used to map an address > to the group that applies to that address, if any. > > It is useful to note that the run-time representation of unwind > information can vary from little > more than a simple concatenation of the compile-time information to a > substantial rewriting of > unwind information by the linker. The proposal favors simple > concatenation while maintaining > the same ordering of groups as their associated code. > > 2.2 Compact Unwind Frame Description > A compact unwind frame description describes a frame in sufficient > detail to be able to unwind > that frame to the frame of its caller. > > | 31 28 27 24 23 0 | > |-------------------------------------------------------------------| > | FLAGS | MODE | | > |-------------------------------------------------------------------| > > At the top most level, there are four bits that are not of further > interest here. Interpretation of > these bits is neither used nor changed. > > Also at the top-level is a 4-bit mode field. This is the tag of a > discriminated (tagged, variant) > union that selects the interpretation of the remaining 24 bits. > > Of the 16 possible modes, only 5 are defined : > > Code Meaning Description > 0 Old “Old” is presumed to refer to some historical > design that is no longer of interest. > It is treated here as Reserved. > 1 RBP-based The frame uses the RBP register as a frame pointer. > The size of the frame can > frame vary during execution. > 2 RSP-based The frame uses RSP as the frame pointer. The size > of the frame is fixed (at > frame compilation time). > 3 Large RSP- The frame uses RSP as the frame pointer, The size > of the frame is fixed (at > based frame compilation time); however, that size is too large > to express within this 32-bit > descriptor encoding. > 4 DWARF The frame, for whatever reason, cannot be > adequately described using the > escape compact unwind frame description. The remaining > 24-bits are an index into > what the DWARF standard calls the .debug_frame > section (__eh_frame in > LLVM). > > 2.2.1 RBP-based Frame (MODE=1) > For a RBP-based frame, the remaining 24 bits are encoded as follows: > > | 23 16 | 15 | 14 0 | > |-------------------------------------------------------------------| > | OFFSET | 0 | REGS | > |-------------------------------------------------------------------| > > In a RBP-based frame the RBP register is pushed on the stack immediately > after the return > address, then RSP is moved to RBP. To unwind, RSP is restored with the > current RPB value, > then RBP is restored by popping off the stack, and the return is done by > popping the stack once > more into the instruction pointer. > > All preserved registers are saved in a small range in the stack that > starts at RBP-8 to RBP-2040. > The offset/8 relative to RBP is encoded in the 8-bit OFFSET field. The > registers saved are > encoded in the 15-bit REGS field as five 3-bit entries. > > 2.2.2 RSP-Based Frame (MODE=2) > For a RSP-based frame, the remaining 24 bits are encoded as follows: > > |23 16 | 15 13 | 12 10 | 9 0 | > |-------------------------------------------------------------------| > | SIZE | | CNT | REG_PERM | > |-------------------------------------------------------------------| > > In a RSP-based frame the stack pointer serves directly as the frame > pointer and RBP is available > for use as a general register. Upon entry, the stack pointer is > decremented by 8*SIZE bytes (the > maximum stack allocation is thus 2040 bytes). To unwind, the stack size > is added to the stack > pointer, and completed by popping the stack once more into the > instruction pointer. > > All preserved registers are saved on the stack immediately after the > return address. The number > of registers saved (up to 6) is encoded in the 3-bit CNT field. The > 11-bit REG_PERM field > encodes which registers were saved and in what order. > > 2.2.3 Large RSP-Based Frame (MODE=3) > For a large RSP-based frame, the remaining 24 bits are encoded as follows: > > |23 16 | 15 13 | 12 10 | 9 0 | > |-------------------------------------------------------------------| > | SIZE | ADJ | CNT | REG_PERM | > |-------------------------------------------------------------------| > > This case is like the previous, except the stack size is too large to > encode in the compact unwind > encoding. Instead, the function must include a "subq $nnnnnnnn, RSP" > instruction in its > prologue to allocate the stack. The offset from the entry point of the > function to the nnnnnnnn > value in the function is given in the SIZE field. > > Depending on the exact instructions used to save registers (PUSH versus > MOV), the nnnnnnnn > value in the instruction stream may not be quite the full stack size. > ADJ * 8 is the additional > adjustment needed to get the actual size. > > 2.2.4 DWARF Escape (MODE=4) > The frame, for whatever reason, cannot be adequately described using a > compact unwind frame > description. The remaining 24-bits are an index into what the DWARF > standard calls the > .debug_frame section (called __eh_frame in LLVM). > > > 3 Asynchronous Changes and Enhancements > It is immediately obvious that omission of unwind information for leaf > functions (with any kind > of frame) precludes handling an exception that might occur during its > execution. It follows that > unwind information must cover all of the code of a module (with one > exception discussed > below). But if successive unwind groups are ordered (as previously > assumed) and also leave no > gaps, then the LENGTH field is redundant and can be omitted. The > beginning address of a > following group is always one byte past the end of the predecessor > group. There remains only the > question of how to identify the last group of a set. > > It should also be clear that the unwind representation described in the > prior section is not > sufficient to unwind from an asynchronous exception that might occur in > either the prologue or > epilogue of a function. To see this consider what would happen if an > exception occurred at either > the entry point or the return instruction of either a RBP- or RSP-frame > function. To be able to > handle asynchronous exceptions at any point during function execution, > it is necessary to add > additional information to each unwind group. > > These two considerations can be combined. The result is simply to > repurpose the LENGTH field > to encode prologue and epilogue information. > > 3.1 Extended MODEs > To preserve backward compatibility and to allow intermixing of > traditional and extended > compact unwind groups, new MODEs are defined as follows: > > Code Meaning Description > ---------------------------------------------------------------------------------------------- > 8 Null frame See below. > 9 Extended RBP-based frame Like mode 1 but combined with > extended prologue/epilogue > information in place of a LENGTH field > 10 Extended RSP-based frame Like mode 2 but combined with > extended prologue/epilogue > information in place of a LENGTH field > 11 Extended Large RSP-based Like mode 3 but combined with > extended prologue/epilogue > frame information in place of a LENGTH field > 12 Extended parameter(s) See below. > > A null frame (MODE = 8) is the simplest possible frame, with no > allocated stack of either kind > (hence no saved registers). A null frame might be considered just a > special case of a RSP-based > frame with a stack size of zero. But unlike other frames, its frame > pointer is usually not 16-byte > aligned. > > It appears technically feasible for a null frame function to have a > personality routine. However, > the utility of such a capability seems too meager to justify allowing > this. We propose to not > support this. > > There remains the question of whether it is necessary or at least > desirable to strictly apply the > requirement that “all” code be covered by an unwind group. Based on > successful experience in > OpenVMS on the Itanium architecture (as well as Windows systems we > think), this alternative is > proposed: > > If the first attempt to lookup an unwind group for an exception > address fails, then it is > (tentatively) assumed to have occurred within a null frame function > or in a part of a function > that is adequately described by a null frame. The presumed return > address is (virtually or > actually) popped from the top of stack and looked up. This second > attempted lookup must > succeed, in which case processing continues normally. A failure is a > fatal error. > > This concept of a “default null frame group” is especially convenient > for dealing with > disconnected code sequences such as trampolines, PLTs, and the like. > > The extended RBP-based and RSP-based frame modes (MODEs = 9 and 10) are > simple > supersets of their more traditional predecessors. > > The large RSP-based frame mode (MODE = 11) is also a superset of mode 3 > except that instead > of finding the stack size in the instruction stream, it is found in a > preceding extended parameter > unwind group (MODE = 12). This difference is essential for support of > execute-only code. > > To understand the extended parameter group, suppose that two groups > occur one after the other > but have the same STARTING-ADDRESS value. A binary search using the > STARTING- > ADDRESS field will ignore the first of the pair because the address > being sought can never be at > least as large as the first but less than the second. Such a group can > be used as an escape > convention that allows inserting additional information into the > sequence of groups that > otherwise cannot be easily included. > > The extended modes are described more fully in the following sections. > > 3.2 Function Parts > Functions are generally considered to consist of three distinct parts: > > 1. A prologue, which allocates local storage, saves registers that must > be preserved for the > benefit of callers, and possibly other housekeeping (such as > initializing any local state > necessary for the correct management of the function’s execution). > > 2. A body, which performs the main work of the function. > > 3. An epilogue, which makes the result available to the caller, and > undoes the effects of the > prologue (restores preserved registers, deallocates local storage, > and so on). > > Here a word of caution is in order. This vocabulary tends to be applied > at multiple levels in > software architecture and each level has its own set of issues to > consider and resulting > requirements to be observed. > > Note that a null frame function has no distinct prologue, body or > epilogue. Every instruction can > be viewed as simultaneously in all three parts, or in none of them. > > This leads to the following proposal for the upper part of the extended > compact unwind > quadword for use in combination with MODE = 8 in the lower part. > > | 63 48 | 47 32 | > |-------------------------------------------------------------------| > | RESERVED | 0 ... 0 | > |-------------------------------------------------------------------| > > > | 31 16 | 15 0 | > |-------------------------------------------------------------------| > | 0 0 0 0 | 1 0 0 0 | 0 ... 0 | 0 ... 0 | > |-------------------------------------------------------------------| > > > > 3.2.1 Function Prologue > For the purposes of exception handling, the key steps are: > > 1. Allocate space on the stack in which to save preserved registers and > for other purposes > 2. Save the registers that must be preserved. > > The DWARF call frame information (CFI) provides a description that is > precise and accurate at > each and every instruction in a function (not limited to just the > prologue). But experience shows > that this representation is bulky in space and expensive in time to > decode and interpret. Indeed, > compact unwind descriptors as described in Section 2 were created to > avoid these issues. > > Experience with OpenVMS on the Alpha and Itanium architectures shows > that a key constraint > can greatly simplify the unwind description: require that no preserved > register can be changed > until all of them have been saved. The last such save ends the prologue > and the next instruction > begins the body. An unwind does not need (must not) restore any > preserved registers because > they are still valid. A simple length from the entry point suffices to > describe this boundary. > [The body does not necessarily need to begin until at least one of the > preserved registers is > actually changed.] > > 3.2.1.1 RSP-Based Frame (MODE = 10) > For a RSP-based frame, it is also necessary to know which instruction > adjusts the stack pointer: > before this instruction an unwind must only (virtually or actually) pop > the return address; and > after this instruction, the stack pointer must first be incremented to > deallocate the stack frame, > then the instruction pointer restored. > > This leads to the following proposal for the upper part of the extended > compact unwind > quadword for use in combination with MODE = 10 in the lower part. > > | 63 48 47 46 40 39 32 | > |-------------------------------------------------------------------| > | RESERVED | | PROLOGUE-SIZE2 | PROLOGUE-SIZE1 | > |-------------------------------------------------------------------| > > > Except for the MODE, the lower part is exactly like it would be for MODE > = 2. > > PROLOGUE-SIZE1 is the length in bytes of the first part of the prologue > relative to the > STARTING-ADDRESS, up to and including the instruction that allocates the > stack. > > PROLOGUE-SIZE2 is the length in bytes of the second part of the prologue > relative to the end > of the first part, up to a point after the last preserved register is > saved and before the first > preserved register is changed. (Recall, this point may not be unique.) > > PROLOGUE-SIZE1 + PROLOGUE-SIZE2 gives the total size of the prologue. > > The maximum prologue size allowed here is much greater than will be > typical. This is deliberate > to allow compilers freedom to use shrink-wrap type optimizations in > which safe operations that > need no body state are moved from the body to the beginning of the > prologue. This avoids the > cost of stack frame setup and teardown for simple special case handling > that often leads to an > early exit. > > 3.2.1.2 Large RSP-based Frame (MODE = 11) > This case is like the previous, except the stack size is too large to > encode in the compact unwind > encoding. The SIZE field is interpreted as an offset relative to the > beginning of the containing > unwind group to (what is necessarily) an earlier extended parameter group. > > 3.2.1.3 RBP-Based Frame (MODE = 9) > The RBP-based frame is the same as for a RSP-based frame except that > there are two instructions > that mark the transition between the two parts of the prologue: the push > of the prior RBP value > onto the stack followed by the copy of RSP to RBP to establish the new > frame pointer. > > There appears to be no reason to not make this simplifying requirement: > the push and copy > always occur together. > > This leads to the following proposal for the upper part of the extended > compact unwind > quadword for use in combination with MODE = 9 in the lower part. > > | 63 48 47 46 40 39 32 | > |-------------------------------------------------------------------| > | RESERVED | | PROLOGUE-SIZE2 | PROLOGUE-SIZE1 | > |-------------------------------------------------------------------| > > Except for the MODE, the lower part is exactly like it would be for MODE > = 1. > > PROLOGUE-SIZE1 is the length in bytes of the first part of the prologue > relative to the > STARTING-ADDRESS, up to and including the instruction that pushes the > prior RBP contents. > > PROLOGUE-SIZE2 is the length in bytes of the second part of the prologue > relative to the end > of the first part, up to a point after the last preserved register is > saved and before the first > preserved register is changed. (Recall, this point may not be unique.) > > PROLOGUE-SIZE1 + PROLOGUE-SIZE2 gives the total size of the prologue. > > 3.2.2 Function Epilogue > For the purposes of exception handling, the key steps in an epilogue are: > > 1. Restore the registers that were preserved > 2. Deallocate space on the stack that was used to preserve registers > and for other purposes > > A key observation is that restoring registers is idempotent. That is, if > in the midst of restoring > registers when an exception occurs, it will do no harm if all of the > preserved registers are > restored again when unwinding. In effect, restoration of registers need > not be distinguished from > the body of a function. It is not until reaching the first instruction > that deallocates stack is > executed that the “real” epilogue begins. > > 3.2.2.1 RSP-Based Frame (MODE = 10 or 11) > For a RSP-based frame, whether small or large, the only unwind action > remaining after stack > deallocation is to pop the return address into the instruction pointer > (RIP). In the vast majority of > cases this means that the epilogue of the code described by an unwind > group is just a 1-byte RET > instruction that occurs at the end of the unwind group. There are other > possibilities but they are > rare and con be dealt with separately (see below). > > It suffices to include a single 1-bit field, E, in the upper part of the > extended compact unwind > description: > > | 63 48 47 46 40 39 32 | > |-------------------------------------------------------------------| > | RESERVED | E | PROLOGUE-SIZE2 | PROLOGUE-SIZE1 | > |-------------------------------------------------------------------| > > > The E flag indicates that the unwind group ends with a standard 1-byte > RET instruction. > > 3.2.2.2 RBP-Based Frame (MODE = 9) > The RBP-based frame is much the same as for a RSP-based frame except > that there are two > instructions that mark the transition between the two parts of the > prologue: the copy of RBP to > RSP to cut back the stack, followed by the pop of RBP to restore what > may be the callers frame > pointer. Thus the epilogue, for the purposes of unwinding, begins > immediately after the copy of > RBP to RSP. In the vast majority of cases, the immediately following two > instructions will > consist of POP RBP and RET. The POP is not idempotent because it changes > the stack pointer, > but the unwind processing code can distinguish whether the stack has > been popped or not based > on the address (3 versus 1 byte less than the highest address of the group). > > It suffices to include a single 1-bit field, E, in the upper part of the > extended compact unwind > description: > > | 63 48 47 46 40 39 32 | > |-------------------------------------------------------------------| > | RESERVED | E | PROLOGUE-SIZE2 | PROLOGUE-SIZE1 | > |-------------------------------------------------------------------| > > The E flag indicates that the unwind group ends with a standard > 3-instruction return sequence. > > 3.3 Special Issues > Up until this point, it might appear that each unwind group corresponds > one-for-one to the code > for a single function. While common, this is not required. Important > variations are illustrated > here. > > Issue/Problem Possible Resolution > ------------------------------------------------------------------------------------------------- > The first part of the prologue is too large to be Use two unwind groups: > described by the 8-bit RxP-FRAME-SIZE1 field. 1) a null frame > group of any appropriate size, > 2) a suitable > RxP-based frame. > > The second part of the prologue is too large to be Use three unwind > groups: > described by the 8-bit RxP-FRAME-SIZE2 field. 1) a RSP-frame > group to describe the first part of > the prologue only > (PROLOGUE-SIZE1 = the > length of the > entire group and PROLOGUE-SIZE2 > = 0, > 2) a RSP-frame > group to describe the second part > of the prologue > (PROLOGUE-SIZE1 & 2 both = 0 > and no registers > saved), > 3) a suitable > RxP-based frame group (with > PROLOGUE-SIZE1 & 2 > both = 0). > > There is more than one return location (without a Generally use one > unwind group for each sequence > shared epilogue sequence). of code ending with > a RET. A group following a > RET will have no > prologue (PROLOGUE-SIZE1 > & 2 both = 0). > > There is more than one entry point to a function Generally use one > unwind group for each sequence > (for example, a FORTRAN multiple entry point of code beginning > at an entry point. Each group > subroutine or analogous assembly language may have a normal > prologue and all but the last > equivalents) might have NO > epilogue (E flag clear). > > > These examples are not exhaustive, of course. But they illustrate the > flexibility of the > mechanism, which should be suitable for the vast majority of cases in > practice. > > 3.4 Number of Unwind Groups > A simple concatenation of unwind groups by the linker that combines > unwind group sections > from multiple modules does not, of itself, provide direct information to > the unwind software > regarding how many unwind groups are present. However, the image > information (ELF > segments) should suffice to determine this number based on the size of > the segment and the > known size of each group. Alternatively, the linker might create special > symbols that can be used > to help determine the number of groups. > > This proposal leaves this detail to target-specific linker, image loader > and exception handling > system to specify. > > 3.5 Interaction with Code Generation > The relatively simple and fixed size nature of the extended compact > unwind information > proposed here depends on observing certain restrictions on optimization > that might affect code in > prologues and epilogues. These restrictions were noted where relevant > throughout this section. > Here is a summary of those requirements: > > 1. Use of any preserved register must be delayed until all of the > preserved registers have > been saved. > 2. In the prologue of a function with a RBP-based frame, the two > instructions that save the > prior RBP contents and copy the stack pointer to RBP must be kept > adjacent. Similarly, > in the epilogue of a function with a RBP-based frame, the two > instructions that cut back > the stack and restore the prior RBP contents must be kept adjacent. > [This adjacency > requirement could be relaxed by introducing an additional MODE that > corresponds to having > the RBP saved in memory but not yet updated from RSP. This does not > seem worthwhile.] > 3. For shrink wrap optimization in the presence of a > handler/personality routine, care must > be taken to not move instructions into the prologue that might cause > an exception. > Floating point exceptions and access violations are of particular > concern in this regard. > > 4 System Specific Extensions for OpenVMS > Omitted since nobody other than OpenVMS should care but it describes > adding an additional > four byte extension to the extended compact unwind group to pass along > some information from > the VAX Macro-32 compiler about VAX register usage. > > >
Nick Kledzik via llvm-dev
2018-Jan-27 02:54 UTC
[llvm-dev] [RFC] Improving compact x86-64 compact unwind descriptors
John and Ron, I developed the original compact unwind implementation for macOS 10.6 back in 2009. I tried to leave space in the design to support finer grain exception handling such as for asynchronous or for the shrink wrap optimization. The idea I had at the time was instead of having just one 32-bit compact unwind info per function, there could be an array of them each covering a different range. All but the first would have the UNWIND_IS_NOT_FUNCTION_START bit set. On macOS, the linker takes the 32-byte “compact unwind group description” records and folds them down into a complete different (read only) runtime table which basically maps an offset in the DSO into a 32-bit compact info for that address. It seems like from your proposal that your linker keeps the “compact unwind group description” records as is and just concatenates them, and the runtime would need to understand the prolog/epilog encoding. I would think that just widening the group record (as you suggest in 5.1) would be much simpler (like SHT_REL was widened to SHT_RELA). -Nick> On Jan 26, 2018, at 7:54 AM, John Reagan via llvm-dev <llvm-dev at lists.llvm.org> wrote: > > Here is our proposal to extend/enhance the x86-64 compact unwind > descriptors to fully describe the prologue/epilogue for asynchronous > unwinding. I believe there are missing/lacking CFI directives as well, > but I'll save that for another thread. > > > Asynchronous Compact Unwind Descriptors > Ron Brender, VMS Software, Inc. > Revised January 25, 2018 > > 1 Introduction > This document proposes means to extend so-called compact unwind > descriptors to support fully > asynchronous exception handling. This will make extended compact unwind > descriptors an > alternative to DWARF CFI (call frame information) for achieving > asynchronous exception > support. > > Compact unwind descriptors can and have been used in both 64- and 32-bit > environments. > However, this proposal addresses only 64-bit environments. While the > ideas presented here can > be readily adapted for use in a 32-bit environment, for simplicity this > document makes no > attempt to do so. > > There are generally three kinds of information that together form the > heart of modern software > exception handling systems: > > 1. Information that is used to divide the remaining unwind information > into groups that are > specific to particular regions of memory (often, but not > necessarily, associated with a > single function) as well as provide a way to efficiently search for > and identify the > grouping that is associated with a particular address in memory. > > 2. Information that can be used to virtually or actually unwind from > the call frame of an > executing function to the call frame of its caller (at the point of > the call). > > 3. Identification of an associated personality routine that is invoked > by the general exception > handling mechanization to guide how processing should proceed for a > given function, as > well as additional “language specific data” needed for the > personality routine to do its > job. Note that the personality routine and its related data are > specified as an adjunct to > compiled code and totally opaque to the general mechanism (other > than the specified > interface). > > Note in particular that C++ exception handling is built on top of a > personality routine and > language specific data area ABI that itself can be implemented using > either DWARE CFI or > extended compact unwind information as described here. The choice > between the two is > transparent to C++. > > > 2 Compact Unwind Overview > This section provides a brief overview of key features of the LLVM > compact unwind design. It > does not attempt a comprehensive re-statement of all aspects of the > design except to the extent > necessary to motivate and understand later proposed changes and > enhancements. > > 2.1 Compact Unwind Group Description > A compact unwind group consist of five fields, as follows: > > | 63 32 31 0 | > |-------------------------------------------------------------------| > | STARTING-ADDRESS | > | LENGTH | COMPACT-UNWIND-DESCRIPTION | > | PERSONALITY-FUNCTION-POINTER | > | LANGUAGE-SPECIFIC-DATA-ADDRESS | > |-------------------------------------------------------------------| > > STARTING-ADDRESS (64-bits) is the lowest address of a region of memory > occupied by some > code, typically the entry point of a function. > > LENGTH (32-bits) is the number of bytes included in this group, > typically including all and only > the code of a function. > > COMPACT-UNWIND-DESCRIPTION (32-bits) is a description of the fully > formed frame of a > function and how to unwind it. This is described further following. > > PERSONALITY-FUNCTION-POINTER (64-bit) is a pointer to the personality > routine. > > LANGUAGE-SPECIFIC-DATA-ADDRESS (64-bits, sometimes abbreviated LSDA) is a > pointer to some data to be passed to the personality routine when it is > called. > > A key observation is that the starting address plus length way of > describing a group means that > the set of groups for a compilation unit need not describe all of the > code in that unit. In > particular, it appears to be expected that no unwind information need be > generated for leaf > functions. > > On the other hand, it is reasonable to expect that the groups that are > emitted are ordered by the > starting address. This means that a simple and fast binary search can be > used to map an address > to the group that applies to that address, if any. > > It is useful to note that the run-time representation of unwind > information can vary from little > more than a simple concatenation of the compile-time information to a > substantial rewriting of > unwind information by the linker. The proposal favors simple > concatenation while maintaining > the same ordering of groups as their associated code. > > 2.2 Compact Unwind Frame Description > A compact unwind frame description describes a frame in sufficient > detail to be able to unwind > that frame to the frame of its caller. > > | 31 28 27 24 23 0 | > |-------------------------------------------------------------------| > | FLAGS | MODE | | > |-------------------------------------------------------------------| > > At the top most level, there are four bits that are not of further > interest here. Interpretation of > these bits is neither used nor changed. > > Also at the top-level is a 4-bit mode field. This is the tag of a > discriminated (tagged, variant) > union that selects the interpretation of the remaining 24 bits. > > Of the 16 possible modes, only 5 are defined : > > Code Meaning Description > 0 Old “Old” is presumed to refer to some historical > design that is no longer of interest. > It is treated here as Reserved. > 1 RBP-based The frame uses the RBP register as a frame pointer. > The size of the frame can > frame vary during execution. > 2 RSP-based The frame uses RSP as the frame pointer. The size > of the frame is fixed (at > frame compilation time). > 3 Large RSP- The frame uses RSP as the frame pointer, The size > of the frame is fixed (at > based frame compilation time); however, that size is too large > to express within this 32-bit > descriptor encoding. > 4 DWARF The frame, for whatever reason, cannot be > adequately described using the > escape compact unwind frame description. The remaining > 24-bits are an index into > what the DWARF standard calls the .debug_frame > section (__eh_frame in > LLVM). > > 2.2.1 RBP-based Frame (MODE=1) > For a RBP-based frame, the remaining 24 bits are encoded as follows: > > | 23 16 | 15 | 14 0 | > |-------------------------------------------------------------------| > | OFFSET | 0 | REGS | > |-------------------------------------------------------------------| > > In a RBP-based frame the RBP register is pushed on the stack immediately > after the return > address, then RSP is moved to RBP. To unwind, RSP is restored with the > current RPB value, > then RBP is restored by popping off the stack, and the return is done by > popping the stack once > more into the instruction pointer. > > All preserved registers are saved in a small range in the stack that > starts at RBP-8 to RBP-2040. > The offset/8 relative to RBP is encoded in the 8-bit OFFSET field. The > registers saved are > encoded in the 15-bit REGS field as five 3-bit entries. > > 2.2.2 RSP-Based Frame (MODE=2) > For a RSP-based frame, the remaining 24 bits are encoded as follows: > > |23 16 | 15 13 | 12 10 | 9 0 | > |-------------------------------------------------------------------| > | SIZE | | CNT | REG_PERM | > |-------------------------------------------------------------------| > > In a RSP-based frame the stack pointer serves directly as the frame > pointer and RBP is available > for use as a general register. Upon entry, the stack pointer is > decremented by 8*SIZE bytes (the > maximum stack allocation is thus 2040 bytes). To unwind, the stack size > is added to the stack > pointer, and completed by popping the stack once more into the > instruction pointer. > > All preserved registers are saved on the stack immediately after the > return address. The number > of registers saved (up to 6) is encoded in the 3-bit CNT field. The > 11-bit REG_PERM field > encodes which registers were saved and in what order. > > 2.2.3 Large RSP-Based Frame (MODE=3) > For a large RSP-based frame, the remaining 24 bits are encoded as follows: > > |23 16 | 15 13 | 12 10 | 9 0 | > |-------------------------------------------------------------------| > | SIZE | ADJ | CNT | REG_PERM | > |-------------------------------------------------------------------| > > This case is like the previous, except the stack size is too large to > encode in the compact unwind > encoding. Instead, the function must include a "subq $nnnnnnnn, RSP" > instruction in its > prologue to allocate the stack. The offset from the entry point of the > function to the nnnnnnnn > value in the function is given in the SIZE field. > > Depending on the exact instructions used to save registers (PUSH versus > MOV), the nnnnnnnn > value in the instruction stream may not be quite the full stack size. > ADJ * 8 is the additional > adjustment needed to get the actual size. > > 2.2.4 DWARF Escape (MODE=4) > The frame, for whatever reason, cannot be adequately described using a > compact unwind frame > description. The remaining 24-bits are an index into what the DWARF > standard calls the > .debug_frame section (called __eh_frame in LLVM). > > > 3 Asynchronous Changes and Enhancements > It is immediately obvious that omission of unwind information for leaf > functions (with any kind > of frame) precludes handling an exception that might occur during its > execution. It follows that > unwind information must cover all of the code of a module (with one > exception discussed > below). But if successive unwind groups are ordered (as previously > assumed) and also leave no > gaps, then the LENGTH field is redundant and can be omitted. The > beginning address of a > following group is always one byte past the end of the predecessor > group. There remains only the > question of how to identify the last group of a set. > > It should also be clear that the unwind representation described in the > prior section is not > sufficient to unwind from an asynchronous exception that might occur in > either the prologue or > epilogue of a function. To see this consider what would happen if an > exception occurred at either > the entry point or the return instruction of either a RBP- or RSP-frame > function. To be able to > handle asynchronous exceptions at any point during function execution, > it is necessary to add > additional information to each unwind group. > > These two considerations can be combined. The result is simply to > repurpose the LENGTH field > to encode prologue and epilogue information. > > 3.1 Extended MODEs > To preserve backward compatibility and to allow intermixing of > traditional and extended > compact unwind groups, new MODEs are defined as follows: > > Code Meaning Description > ---------------------------------------------------------------------------------------------- > 8 Null frame See below. > 9 Extended RBP-based frame Like mode 1 but combined with > extended prologue/epilogue > information in place of a LENGTH field > 10 Extended RSP-based frame Like mode 2 but combined with > extended prologue/epilogue > information in place of a LENGTH field > 11 Extended Large RSP-based Like mode 3 but combined with > extended prologue/epilogue > frame information in place of a LENGTH field > 12 Extended parameter(s) See below. > > A null frame (MODE = 8) is the simplest possible frame, with no > allocated stack of either kind > (hence no saved registers). A null frame might be considered just a > special case of a RSP-based > frame with a stack size of zero. But unlike other frames, its frame > pointer is usually not 16-byte > aligned. > > It appears technically feasible for a null frame function to have a > personality routine. However, > the utility of such a capability seems too meager to justify allowing > this. We propose to not > support this. > > There remains the question of whether it is necessary or at least > desirable to strictly apply the > requirement that “all” code be covered by an unwind group. Based on > successful experience in > OpenVMS on the Itanium architecture (as well as Windows systems we > think), this alternative is > proposed: > > If the first attempt to lookup an unwind group for an exception > address fails, then it is > (tentatively) assumed to have occurred within a null frame function > or in a part of a function > that is adequately described by a null frame. The presumed return > address is (virtually or > actually) popped from the top of stack and looked up. This second > attempted lookup must > succeed, in which case processing continues normally. A failure is a > fatal error. > > This concept of a “default null frame group” is especially convenient > for dealing with > disconnected code sequences such as trampolines, PLTs, and the like. > > The extended RBP-based and RSP-based frame modes (MODEs = 9 and 10) are > simple > supersets of their more traditional predecessors. > > The large RSP-based frame mode (MODE = 11) is also a superset of mode 3 > except that instead > of finding the stack size in the instruction stream, it is found in a > preceding extended parameter > unwind group (MODE = 12). This difference is essential for support of > execute-only code. > > To understand the extended parameter group, suppose that two groups > occur one after the other > but have the same STARTING-ADDRESS value. A binary search using the > STARTING- > ADDRESS field will ignore the first of the pair because the address > being sought can never be at > least as large as the first but less than the second. Such a group can > be used as an escape > convention that allows inserting additional information into the > sequence of groups that > otherwise cannot be easily included. > > The extended modes are described more fully in the following sections. > > 3.2 Function Parts > Functions are generally considered to consist of three distinct parts: > > 1. A prologue, which allocates local storage, saves registers that must > be preserved for the > benefit of callers, and possibly other housekeeping (such as > initializing any local state > necessary for the correct management of the function’s execution). > > 2. A body, which performs the main work of the function. > > 3. An epilogue, which makes the result available to the caller, and > undoes the effects of the > prologue (restores preserved registers, deallocates local storage, > and so on). > > Here a word of caution is in order. This vocabulary tends to be applied > at multiple levels in > software architecture and each level has its own set of issues to > consider and resulting > requirements to be observed. > > Note that a null frame function has no distinct prologue, body or > epilogue. Every instruction can > be viewed as simultaneously in all three parts, or in none of them. > > This leads to the following proposal for the upper part of the extended > compact unwind > quadword for use in combination with MODE = 8 in the lower part. > > | 63 48 | 47 32 | > |-------------------------------------------------------------------| > | RESERVED | 0 ... 0 | > |-------------------------------------------------------------------| > > > | 31 16 | 15 0 | > |-------------------------------------------------------------------| > | 0 0 0 0 | 1 0 0 0 | 0 ... 0 | 0 ... 0 | > |-------------------------------------------------------------------| > > > > 3.2.1 Function Prologue > For the purposes of exception handling, the key steps are: > > 1. Allocate space on the stack in which to save preserved registers and > for other purposes > 2. Save the registers that must be preserved. > > The DWARF call frame information (CFI) provides a description that is > precise and accurate at > each and every instruction in a function (not limited to just the > prologue). But experience shows > that this representation is bulky in space and expensive in time to > decode and interpret. Indeed, > compact unwind descriptors as described in Section 2 were created to > avoid these issues. > > Experience with OpenVMS on the Alpha and Itanium architectures shows > that a key constraint > can greatly simplify the unwind description: require that no preserved > register can be changed > until all of them have been saved. The last such save ends the prologue > and the next instruction > begins the body. An unwind does not need (must not) restore any > preserved registers because > they are still valid. A simple length from the entry point suffices to > describe this boundary. > [The body does not necessarily need to begin until at least one of the > preserved registers is > actually changed.] > > 3.2.1.1 RSP-Based Frame (MODE = 10) > For a RSP-based frame, it is also necessary to know which instruction > adjusts the stack pointer: > before this instruction an unwind must only (virtually or actually) pop > the return address; and > after this instruction, the stack pointer must first be incremented to > deallocate the stack frame, > then the instruction pointer restored. > > This leads to the following proposal for the upper part of the extended > compact unwind > quadword for use in combination with MODE = 10 in the lower part. > > | 63 48 47 46 40 39 32 | > |-------------------------------------------------------------------| > | RESERVED | | PROLOGUE-SIZE2 | PROLOGUE-SIZE1 | > |-------------------------------------------------------------------| > > > Except for the MODE, the lower part is exactly like it would be for MODE > = 2. > > PROLOGUE-SIZE1 is the length in bytes of the first part of the prologue > relative to the > STARTING-ADDRESS, up to and including the instruction that allocates the > stack. > > PROLOGUE-SIZE2 is the length in bytes of the second part of the prologue > relative to the end > of the first part, up to a point after the last preserved register is > saved and before the first > preserved register is changed. (Recall, this point may not be unique.) > > PROLOGUE-SIZE1 + PROLOGUE-SIZE2 gives the total size of the prologue. > > The maximum prologue size allowed here is much greater than will be > typical. This is deliberate > to allow compilers freedom to use shrink-wrap type optimizations in > which safe operations that > need no body state are moved from the body to the beginning of the > prologue. This avoids the > cost of stack frame setup and teardown for simple special case handling > that often leads to an > early exit. > > 3.2.1.2 Large RSP-based Frame (MODE = 11) > This case is like the previous, except the stack size is too large to > encode in the compact unwind > encoding. The SIZE field is interpreted as an offset relative to the > beginning of the containing > unwind group to (what is necessarily) an earlier extended parameter group. > > 3.2.1.3 RBP-Based Frame (MODE = 9) > The RBP-based frame is the same as for a RSP-based frame except that > there are two instructions > that mark the transition between the two parts of the prologue: the push > of the prior RBP value > onto the stack followed by the copy of RSP to RBP to establish the new > frame pointer. > > There appears to be no reason to not make this simplifying requirement: > the push and copy > always occur together. > > This leads to the following proposal for the upper part of the extended > compact unwind > quadword for use in combination with MODE = 9 in the lower part. > > | 63 48 47 46 40 39 32 | > |-------------------------------------------------------------------| > | RESERVED | | PROLOGUE-SIZE2 | PROLOGUE-SIZE1 | > |-------------------------------------------------------------------| > > Except for the MODE, the lower part is exactly like it would be for MODE > = 1. > > PROLOGUE-SIZE1 is the length in bytes of the first part of the prologue > relative to the > STARTING-ADDRESS, up to and including the instruction that pushes the > prior RBP contents. > > PROLOGUE-SIZE2 is the length in bytes of the second part of the prologue > relative to the end > of the first part, up to a point after the last preserved register is > saved and before the first > preserved register is changed. (Recall, this point may not be unique.) > > PROLOGUE-SIZE1 + PROLOGUE-SIZE2 gives the total size of the prologue. > > 3.2.2 Function Epilogue > For the purposes of exception handling, the key steps in an epilogue are: > > 1. Restore the registers that were preserved > 2. Deallocate space on the stack that was used to preserve registers > and for other purposes > > A key observation is that restoring registers is idempotent. That is, if > in the midst of restoring > registers when an exception occurs, it will do no harm if all of the > preserved registers are > restored again when unwinding. In effect, restoration of registers need > not be distinguished from > the body of a function. It is not until reaching the first instruction > that deallocates stack is > executed that the “real” epilogue begins. > > 3.2.2.1 RSP-Based Frame (MODE = 10 or 11) > For a RSP-based frame, whether small or large, the only unwind action > remaining after stack > deallocation is to pop the return address into the instruction pointer > (RIP). In the vast majority of > cases this means that the epilogue of the code described by an unwind > group is just a 1-byte RET > instruction that occurs at the end of the unwind group. There are other > possibilities but they are > rare and con be dealt with separately (see below). > > It suffices to include a single 1-bit field, E, in the upper part of the > extended compact unwind > description: > > | 63 48 47 46 40 39 32 | > |-------------------------------------------------------------------| > | RESERVED | E | PROLOGUE-SIZE2 | PROLOGUE-SIZE1 | > |-------------------------------------------------------------------| > > > The E flag indicates that the unwind group ends with a standard 1-byte > RET instruction. > > 3.2.2.2 RBP-Based Frame (MODE = 9) > The RBP-based frame is much the same as for a RSP-based frame except > that there are two > instructions that mark the transition between the two parts of the > prologue: the copy of RBP to > RSP to cut back the stack, followed by the pop of RBP to restore what > may be the callers frame > pointer. Thus the epilogue, for the purposes of unwinding, begins > immediately after the copy of > RBP to RSP. In the vast majority of cases, the immediately following two > instructions will > consist of POP RBP and RET. The POP is not idempotent because it changes > the stack pointer, > but the unwind processing code can distinguish whether the stack has > been popped or not based > on the address (3 versus 1 byte less than the highest address of the group). > > It suffices to include a single 1-bit field, E, in the upper part of the > extended compact unwind > description: > > | 63 48 47 46 40 39 32 | > |-------------------------------------------------------------------| > | RESERVED | E | PROLOGUE-SIZE2 | PROLOGUE-SIZE1 | > |-------------------------------------------------------------------| > > The E flag indicates that the unwind group ends with a standard > 3-instruction return sequence. > > 3.3 Special Issues > Up until this point, it might appear that each unwind group corresponds > one-for-one to the code > for a single function. While common, this is not required. Important > variations are illustrated > here. > > Issue/Problem Possible Resolution > ------------------------------------------------------------------------------------------------- > The first part of the prologue is too large to be Use two unwind groups: > described by the 8-bit RxP-FRAME-SIZE1 field. 1) a null frame > group of any appropriate size, > 2) a suitable > RxP-based frame. > > The second part of the prologue is too large to be Use three unwind > groups: > described by the 8-bit RxP-FRAME-SIZE2 field. 1) a RSP-frame > group to describe the first part of > the prologue only > (PROLOGUE-SIZE1 = the > length of the > entire group and PROLOGUE-SIZE2 > = 0, > 2) a RSP-frame > group to describe the second part > of the prologue > (PROLOGUE-SIZE1 & 2 both = 0 > and no registers > saved), > 3) a suitable > RxP-based frame group (with > PROLOGUE-SIZE1 & 2 > both = 0). > > There is more than one return location (without a Generally use one > unwind group for each sequence > shared epilogue sequence). of code ending with > a RET. A group following a > RET will have no > prologue (PROLOGUE-SIZE1 > & 2 both = 0). > > There is more than one entry point to a function Generally use one > unwind group for each sequence > (for example, a FORTRAN multiple entry point of code beginning > at an entry point. Each group > subroutine or analogous assembly language may have a normal > prologue and all but the last > equivalents) might have NO > epilogue (E flag clear). > > > These examples are not exhaustive, of course. But they illustrate the > flexibility of the > mechanism, which should be suitable for the vast majority of cases in > practice. > > 3.4 Number of Unwind Groups > A simple concatenation of unwind groups by the linker that combines > unwind group sections > from multiple modules does not, of itself, provide direct information to > the unwind software > regarding how many unwind groups are present. However, the image > information (ELF > segments) should suffice to determine this number based on the size of > the segment and the > known size of each group. Alternatively, the linker might create special > symbols that can be used > to help determine the number of groups. > > This proposal leaves this detail to target-specific linker, image loader > and exception handling > system to specify. > > 3.5 Interaction with Code Generation > The relatively simple and fixed size nature of the extended compact > unwind information > proposed here depends on observing certain restrictions on optimization > that might affect code in > prologues and epilogues. These restrictions were noted where relevant > throughout this section. > Here is a summary of those requirements: > > 1. Use of any preserved register must be delayed until all of the > preserved registers have > been saved. > 2. In the prologue of a function with a RBP-based frame, the two > instructions that save the > prior RBP contents and copy the stack pointer to RBP must be kept > adjacent. Similarly, > in the epilogue of a function with a RBP-based frame, the two > instructions that cut back > the stack and restore the prior RBP contents must be kept adjacent. > [This adjacency > requirement could be relaxed by introducing an additional MODE that > corresponds to having > the RBP saved in memory but not yet updated from RSP. This does not > seem worthwhile.] > 3. For shrink wrap optimization in the presence of a > handler/personality routine, care must > be taken to not move instructions into the prologue that might cause > an exception. > Floating point exceptions and access violations are of particular > concern in this regard. > > 4 System Specific Extensions for OpenVMS > Omitted since nobody other than OpenVMS should care but it describes > adding an additional > four byte extension to the extended compact unwind group to pass along > some information from > the VAX Macro-32 compiler about VAX register usage. > > > > _______________________________________________ > LLVM Developers mailing list > llvm-dev at lists.llvm.org > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
Ron Brender via llvm-dev
2018-Jan-28 23:14 UTC
[llvm-dev] [RFC] Improving compact x86-64 compact unwind descriptors
>Hi John & Ron, I read through the proposal and had a couple of quickobservations. Hi Jason. Thanks for the interesting observations and questions.>1. The proposed encoding assumes that the epilogue instructions alwayscome at the end of the function -- or rather, just before the next function. If there is a stack protector __stack_chk_fail sequence, or there is NOP padding between functions, then the epilogue cannot be expressed. The proposed encoding allows for instructions scheduled before the prologue, as long as they're guaranteed to never throw an exception or spill registers, but there's no similar affordance for after the epilogue. There are at least two alternatives for this situation: a. The main group can be ended at the RET instruction and a new group started to cover the following bytes. Interestingly, both NOPs and a __stack_chk_fail call sequence are not really unwindable instructions. Perhaps a new NOUNWIND mode should be added for such cases; this would protect against an asynchronous exception attempting what is likely a dangerous stack walk. b. Starting a new group is admittedly a rather large hammer that is perhaps OK if rare but not if it is common. If common, an alternative might be to widen the E (epilogue) field to allow more idiomatic alternatives. 4 bits might be enough: 8 codes for a RET followed by 0 to 7 NOPs + 1 code for RET + __stack_chk_fail call + 1 code for just a __stack_chk_fail call + 1 code for no RET and nothing following. For anything else, fall back to a. Personally I rather like the combination of a. + b.>2. In section 3.1, "A null frame (MODE = 8) is the simplest possibleframe, with no allocated stack of either kind (hence no saved registers). A null frame might be considered just a special case of a RSP-based frame with a stack size of zero. But unlike other frames, its frame pointer is usually not 16-byte aligned." Did you meant stack pointer here? The x86_64 SysV abi requires 16-byte alignment of the stack pointer before a CALL instruction, but I don't know of any alignment requirements on the frame pointer. Yes, it would be clearer to say the stack pointer is not 16-byte aligned. Aside: in my vocabulary, the frame pointer for a RSP-based or null frame function is the stack pointer. (And I find it an abuse of the English language to refer to a RSP-based frame as "frameless".) So what I wrote strictly is correct, but it does tend to confuse. Sorry... 3. In the same section, you propose an alternative "default null frame group" be defined,>"If the first attempt to lookup an unwind group for an exceptionaddress fails, then it is (tentatively) assumed to have occurred within a null frame function or in a part of a function that is adequately described by a null frame. The presumed return address is (virtually or actually) popped from the top of stack and looked up. This second attempted lookup must succeed, in which case processing continues normally. A failure is a fatal error." I suspect I missed a part of the proposal covering this, but if entries are described by only start address, is there a problem of the previous non-null function's range extending & picking up these functions if they don't have entries? For normal entries, I admit I sorta hand waved in Section 3.4 over how to determine the end address of the last group. There are lots of alternatives. Most involve some kind of dummy extended parameter group (MODE =12) where the STARTING-ADDRESS is one more than the last address of the last "real" group. There might also be linker assistance of some kind. In any case I think it a problem that need not be solved upfront but can wait for broader issues to settle. The default unwind group has no starting address, of course. It is defined by what is left over after all other normal groups have been considered.>4. For frameless functions -- small-stack RSP functions -- you'reassuming that the compiler saves and restores spilled registers with MOV instructions instead of PUSH/POP instructions. I think this is the normal behavior of the compiler, something like subq $0x98, %rsp movq %rsi, 0x10(%rsp) movq %rdi, 0x18(%rsp) or whatever, but the proposed encoding can only allow for one change in stack size. The current encoding requires this for large-stack RSP functions (we get the offset to the RSP instruction where we read the amount being decremented in the compact unwind encoding), but for small stack frameless functions the compiler can today express in compact unwind a combination of sub and push instructions as long as it's a fixed amount within the range that the small-stack RSP encoding can express. You are quite right, with the understanding that "the range that the small-stack RSP encoding can express" is in fact the body of the function and excludes the prologue and epilogue. Our proposal, of course, is trying to describe the prologue and epilogue as well. My understanding of modern x86 processors is that there is no meaningful performance difference between using pushes/pops versus moves. That is, using only moves in prologues and epilogues is a negligible price to pay in our quest for a simple and compact unwind representation. But I do take the point that this should be mentioned in Section 3.5 regarding code generation implications. Thanks again, Ron On Fri, Jan 26, 2018 at 7:19 PM, Jason Molenda via llvm-dev < llvm-dev at lists.llvm.org> wrote:> Hi John & Ron, I read through the proposal and had a couple of quick > observations. > ... > >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20180128/989cf751/attachment-0001.html>
Ron Brender via llvm-dev
2018-Jan-29 21:36 UTC
[llvm-dev] [RFC] Improving compact x86-64 compact unwind descriptors
Hi Nick, It is a pleasure to be in contact with the creator of the compact unwind approach! I can see how an array of 32-bit unwind blocks could be used to describe each distinct point within a function (within a prolog in particular). But then you end up with six or seven or more such blocks for a large percentage of functions, don't you? Seems like a lot of additional space for something that is usually simple, compact (sic) and idiomatic and dilutes the original benefit. We are seeking a summary description in a small fixed size to supplement the base unwind block. Since you mention it, why have the UNWIND_IS_NOT_FUNCTION_START flag? Seems like you don't need it for unwinding purposes. Especially in a function that has a code layout where the entry point is not the first/lowest addressed instruction of the function (which I have seen in some GEM produced code for Alpha). An aside, but just curious. Not being a MAC guy the macOS two-level lookup scheme has always been rather a mystery. What little I know is gleaned from the compact_unwind_encoding.h file found around the net. But I have never seen the .c/.cxx/.cpp that goes with it to better understand how the mapping works. Always figured the was considered proprietary but maybe I just don't know where/how to look. Is there public code and/or design note information to look at? You are correct that this paper tends toward trying to create a scheme with little or no linker involvement. There are good reasons to ask more of the linker for better space economy, so this is still under discussion. You are also correct that the unwinder needs to interpret the additional prolog/epilog information. But the rest of your comment has me confused: the extra 2-bytes is just a widening as you suggest, right? And there is no Section 5.1--did you mean 3.1? Thanks again, Ron On Fri, Jan 26, 2018 at 9:54 PM, Nick Kledzik via llvm-dev < llvm-dev at lists.llvm.org> wrote:> John and Ron, > > I developed the original compact unwind implementation for macOS 10.6 back > in 2009. I tried to leave space in the design to support finer grain > exception handling such as for asynchronous or for the shrink wrap > optimization. The idea I had at the time was instead of having just one > 32-bit compact unwind info per function, there could be an array of them > each covering a different range. All but the first would have the > UNWIND_IS_NOT_FUNCTION_START bit set. > > On macOS, the linker takes the 32-byte “compact unwind group description” > records and folds them down into a complete different (read only) runtime > table which basically maps an offset in the DSO into a 32-bit compact info > for that address. > > It seems like from your proposal that your linker keeps the “compact > unwind group description” records as is and just concatenates them, and the > runtime would need to understand the prolog/epilog encoding. I would think > that just widening the group record (as you suggest in 5.1) would be much > simpler (like SHT_REL was widened to SHT_RELA). > > -Nick > > > > > On Jan 26, 2018, at 7:54 AM, John Reagan via llvm-dev < > llvm-dev at lists.llvm.org> wrote: > > > > Here is our proposal to extend/enhance the x86-64 compact unwind > > descriptors to fully describe the prologue/epilogue for asynchronous > > unwinding. I believe there are missing/lacking CFI directives as well, > > but I'll save that for another thread. > > > > > > Asynchronous Compact Unwind Descriptors > > Ron Brender, VMS Software, Inc. > > Revised January 25, 2018 > > > > 1 Introduction > > This document proposes means to extend so-called compact unwind > > descriptors to support fully > > asynchronous exception handling. This will make extended compact unwind > > descriptors an > > alternative to DWARF CFI (call frame information) for achieving > > asynchronous exception > > support. > > > > Compact unwind descriptors can and have been used in both 64- and 32-bit > > environments. > > However, this proposal addresses only 64-bit environments. While the > > ideas presented here can > > be readily adapted for use in a 32-bit environment, for simplicity this > > document makes no > > attempt to do so. > > > > There are generally three kinds of information that together form the > > heart of modern software > > exception handling systems: > > > > 1. Information that is used to divide the remaining unwind information > > into groups that are > > specific to particular regions of memory (often, but not > > necessarily, associated with a > > single function) as well as provide a way to efficiently search for > > and identify the > > grouping that is associated with a particular address in memory. > > > > 2. Information that can be used to virtually or actually unwind from > > the call frame of an > > executing function to the call frame of its caller (at the point of > > the call). > > > > 3. Identification of an associated personality routine that is invoked > > by the general exception > > handling mechanization to guide how processing should proceed for a > > given function, as > > well as additional “language specific data” needed for the > > personality routine to do its > > job. Note that the personality routine and its related data are > > specified as an adjunct to > > compiled code and totally opaque to the general mechanism (other > > than the specified > > interface). > > > > Note in particular that C++ exception handling is built on top of a > > personality routine and > > language specific data area ABI that itself can be implemented using > > either DWARE CFI or > > extended compact unwind information as described here. The choice > > between the two is > > transparent to C++. > > > > > > 2 Compact Unwind Overview > > This section provides a brief overview of key features of the LLVM > > compact unwind design. It > > does not attempt a comprehensive re-statement of all aspects of the > > design except to the extent > > necessary to motivate and understand later proposed changes and > > enhancements. > > > > 2.1 Compact Unwind Group Description > > A compact unwind group consist of five fields, as follows: > > > > | 63 32 31 0 | > > |-------------------------------------------------------------------| > > | STARTING-ADDRESS | > > | LENGTH | COMPACT-UNWIND-DESCRIPTION | > > | PERSONALITY-FUNCTION-POINTER | > > | LANGUAGE-SPECIFIC-DATA-ADDRESS | > > |-------------------------------------------------------------------| > > > > STARTING-ADDRESS (64-bits) is the lowest address of a region of memory > > occupied by some > > code, typically the entry point of a function. > > > > LENGTH (32-bits) is the number of bytes included in this group, > > typically including all and only > > the code of a function. > > > > COMPACT-UNWIND-DESCRIPTION (32-bits) is a description of the fully > > formed frame of a > > function and how to unwind it. This is described further following. > > > > PERSONALITY-FUNCTION-POINTER (64-bit) is a pointer to the personality > > routine. > > > > LANGUAGE-SPECIFIC-DATA-ADDRESS (64-bits, sometimes abbreviated LSDA) is a > > pointer to some data to be passed to the personality routine when it is > > called. > > > > A key observation is that the starting address plus length way of > > describing a group means that > > the set of groups for a compilation unit need not describe all of the > > code in that unit. In > > particular, it appears to be expected that no unwind information need be > > generated for leaf > > functions. > > > > On the other hand, it is reasonable to expect that the groups that are > > emitted are ordered by the > > starting address. This means that a simple and fast binary search can be > > used to map an address > > to the group that applies to that address, if any. > > > > It is useful to note that the run-time representation of unwind > > information can vary from little > > more than a simple concatenation of the compile-time information to a > > substantial rewriting of > > unwind information by the linker. The proposal favors simple > > concatenation while maintaining > > the same ordering of groups as their associated code. > > > > 2.2 Compact Unwind Frame Description > > A compact unwind frame description describes a frame in sufficient > > detail to be able to unwind > > that frame to the frame of its caller. > > > > | 31 28 27 24 23 0 | > > |-------------------------------------------------------------------| > > | FLAGS | MODE | | > > |-------------------------------------------------------------------| > > > > At the top most level, there are four bits that are not of further > > interest here. Interpretation of > > these bits is neither used nor changed. > > > > Also at the top-level is a 4-bit mode field. This is the tag of a > > discriminated (tagged, variant) > > union that selects the interpretation of the remaining 24 bits. > > > > Of the 16 possible modes, only 5 are defined : > > > > Code Meaning Description > > 0 Old “Old” is presumed to refer to some historical > > design that is no longer of interest. > > It is treated here as Reserved. > > 1 RBP-based The frame uses the RBP register as a frame pointer. > > The size of the frame can > > frame vary during execution. > > 2 RSP-based The frame uses RSP as the frame pointer. The size > > of the frame is fixed (at > > frame compilation time). > > 3 Large RSP- The frame uses RSP as the frame pointer, The size > > of the frame is fixed (at > > based frame compilation time); however, that size is too large > > to express within this 32-bit > > descriptor encoding. > > 4 DWARF The frame, for whatever reason, cannot be > > adequately described using the > > escape compact unwind frame description. The remaining > > 24-bits are an index into > > what the DWARF standard calls the .debug_frame > > section (__eh_frame in > > LLVM). > > > > 2.2.1 RBP-based Frame (MODE=1) > > For a RBP-based frame, the remaining 24 bits are encoded as follows: > > > > | 23 16 | 15 | 14 0 | > > |-------------------------------------------------------------------| > > | OFFSET | 0 | REGS | > > |-------------------------------------------------------------------| > > > > In a RBP-based frame the RBP register is pushed on the stack immediately > > after the return > > address, then RSP is moved to RBP. To unwind, RSP is restored with the > > current RPB value, > > then RBP is restored by popping off the stack, and the return is done by > > popping the stack once > > more into the instruction pointer. > > > > All preserved registers are saved in a small range in the stack that > > starts at RBP-8 to RBP-2040. > > The offset/8 relative to RBP is encoded in the 8-bit OFFSET field. The > > registers saved are > > encoded in the 15-bit REGS field as five 3-bit entries. > > > > 2.2.2 RSP-Based Frame (MODE=2) > > For a RSP-based frame, the remaining 24 bits are encoded as follows: > > > > |23 16 | 15 13 | 12 10 | 9 0 | > > |-------------------------------------------------------------------| > > | SIZE | | CNT | REG_PERM | > > |-------------------------------------------------------------------| > > > > In a RSP-based frame the stack pointer serves directly as the frame > > pointer and RBP is available > > for use as a general register. Upon entry, the stack pointer is > > decremented by 8*SIZE bytes (the > > maximum stack allocation is thus 2040 bytes). To unwind, the stack size > > is added to the stack > > pointer, and completed by popping the stack once more into the > > instruction pointer. > > > > All preserved registers are saved on the stack immediately after the > > return address. The number > > of registers saved (up to 6) is encoded in the 3-bit CNT field. The > > 11-bit REG_PERM field > > encodes which registers were saved and in what order. > > > > 2.2.3 Large RSP-Based Frame (MODE=3) > > For a large RSP-based frame, the remaining 24 bits are encoded as > follows: > > > > |23 16 | 15 13 | 12 10 | 9 0 | > > |-------------------------------------------------------------------| > > | SIZE | ADJ | CNT | REG_PERM | > > |-------------------------------------------------------------------| > > > > This case is like the previous, except the stack size is too large to > > encode in the compact unwind > > encoding. Instead, the function must include a "subq $nnnnnnnn, RSP" > > instruction in its > > prologue to allocate the stack. The offset from the entry point of the > > function to the nnnnnnnn > > value in the function is given in the SIZE field. > > > > Depending on the exact instructions used to save registers (PUSH versus > > MOV), the nnnnnnnn > > value in the instruction stream may not be quite the full stack size. > > ADJ * 8 is the additional > > adjustment needed to get the actual size. > > > > 2.2.4 DWARF Escape (MODE=4) > > The frame, for whatever reason, cannot be adequately described using a > > compact unwind frame > > description. The remaining 24-bits are an index into what the DWARF > > standard calls the > > .debug_frame section (called __eh_frame in LLVM). > > > > > > 3 Asynchronous Changes and Enhancements > > It is immediately obvious that omission of unwind information for leaf > > functions (with any kind > > of frame) precludes handling an exception that might occur during its > > execution. It follows that > > unwind information must cover all of the code of a module (with one > > exception discussed > > below). But if successive unwind groups are ordered (as previously > > assumed) and also leave no > > gaps, then the LENGTH field is redundant and can be omitted. The > > beginning address of a > > following group is always one byte past the end of the predecessor > > group. There remains only the > > question of how to identify the last group of a set. > > > > It should also be clear that the unwind representation described in the > > prior section is not > > sufficient to unwind from an asynchronous exception that might occur in > > either the prologue or > > epilogue of a function. To see this consider what would happen if an > > exception occurred at either > > the entry point or the return instruction of either a RBP- or RSP-frame > > function. To be able to > > handle asynchronous exceptions at any point during function execution, > > it is necessary to add > > additional information to each unwind group. > > > > These two considerations can be combined. The result is simply to > > repurpose the LENGTH field > > to encode prologue and epilogue information. > > > > 3.1 Extended MODEs > > To preserve backward compatibility and to allow intermixing of > > traditional and extended > > compact unwind groups, new MODEs are defined as follows: > > > > Code Meaning Description > > ------------------------------------------------------------ > ---------------------------------- > > 8 Null frame See below. > > 9 Extended RBP-based frame Like mode 1 but combined with > > extended prologue/epilogue > > information in place of a LENGTH > field > > 10 Extended RSP-based frame Like mode 2 but combined with > > extended prologue/epilogue > > information in place of a LENGTH > field > > 11 Extended Large RSP-based Like mode 3 but combined with > > extended prologue/epilogue > > frame information in place of a LENGTH > field > > 12 Extended parameter(s) See below. > > > > A null frame (MODE = 8) is the simplest possible frame, with no > > allocated stack of either kind > > (hence no saved registers). A null frame might be considered just a > > special case of a RSP-based > > frame with a stack size of zero. But unlike other frames, its frame > > pointer is usually not 16-byte > > aligned. > > > > It appears technically feasible for a null frame function to have a > > personality routine. However, > > the utility of such a capability seems too meager to justify allowing > > this. We propose to not > > support this. > > > > There remains the question of whether it is necessary or at least > > desirable to strictly apply the > > requirement that “all” code be covered by an unwind group. Based on > > successful experience in > > OpenVMS on the Itanium architecture (as well as Windows systems we > > think), this alternative is > > proposed: > > > > If the first attempt to lookup an unwind group for an exception > > address fails, then it is > > (tentatively) assumed to have occurred within a null frame function > > or in a part of a function > > that is adequately described by a null frame. The presumed return > > address is (virtually or > > actually) popped from the top of stack and looked up. This second > > attempted lookup must > > succeed, in which case processing continues normally. A failure is a > > fatal error. > > > > This concept of a “default null frame group” is especially convenient > > for dealing with > > disconnected code sequences such as trampolines, PLTs, and the like. > > > > The extended RBP-based and RSP-based frame modes (MODEs = 9 and 10) are > > simple > > supersets of their more traditional predecessors. > > > > The large RSP-based frame mode (MODE = 11) is also a superset of mode 3 > > except that instead > > of finding the stack size in the instruction stream, it is found in a > > preceding extended parameter > > unwind group (MODE = 12). This difference is essential for support of > > execute-only code. > > > > To understand the extended parameter group, suppose that two groups > > occur one after the other > > but have the same STARTING-ADDRESS value. A binary search using the > > STARTING- > > ADDRESS field will ignore the first of the pair because the address > > being sought can never be at > > least as large as the first but less than the second. Such a group can > > be used as an escape > > convention that allows inserting additional information into the > > sequence of groups that > > otherwise cannot be easily included. > > > > The extended modes are described more fully in the following sections. > > > > 3.2 Function Parts > > Functions are generally considered to consist of three distinct parts: > > > > 1. A prologue, which allocates local storage, saves registers that must > > be preserved for the > > benefit of callers, and possibly other housekeeping (such as > > initializing any local state > > necessary for the correct management of the function’s execution). > > > > 2. A body, which performs the main work of the function. > > > > 3. An epilogue, which makes the result available to the caller, and > > undoes the effects of the > > prologue (restores preserved registers, deallocates local storage, > > and so on). > > > > Here a word of caution is in order. This vocabulary tends to be applied > > at multiple levels in > > software architecture and each level has its own set of issues to > > consider and resulting > > requirements to be observed. > > > > Note that a null frame function has no distinct prologue, body or > > epilogue. Every instruction can > > be viewed as simultaneously in all three parts, or in none of them. > > > > This leads to the following proposal for the upper part of the extended > > compact unwind > > quadword for use in combination with MODE = 8 in the lower part. > > > > | 63 48 | 47 32 | > > |-------------------------------------------------------------------| > > | RESERVED | 0 ... 0 | > > |-------------------------------------------------------------------| > > > > > > | 31 16 | 15 0 | > > |-------------------------------------------------------------------| > > | 0 0 0 0 | 1 0 0 0 | 0 ... 0 | 0 ... 0 | > > |-------------------------------------------------------------------| > > > > > > > > 3.2.1 Function Prologue > > For the purposes of exception handling, the key steps are: > > > > 1. Allocate space on the stack in which to save preserved registers and > > for other purposes > > 2. Save the registers that must be preserved. > > > > The DWARF call frame information (CFI) provides a description that is > > precise and accurate at > > each and every instruction in a function (not limited to just the > > prologue). But experience shows > > that this representation is bulky in space and expensive in time to > > decode and interpret. Indeed, > > compact unwind descriptors as described in Section 2 were created to > > avoid these issues. > > > > Experience with OpenVMS on the Alpha and Itanium architectures shows > > that a key constraint > > can greatly simplify the unwind description: require that no preserved > > register can be changed > > until all of them have been saved. The last such save ends the prologue > > and the next instruction > > begins the body. An unwind does not need (must not) restore any > > preserved registers because > > they are still valid. A simple length from the entry point suffices to > > describe this boundary. > > [The body does not necessarily need to begin until at least one of the > > preserved registers is > > actually changed.] > > > > 3.2.1.1 RSP-Based Frame (MODE = 10) > > For a RSP-based frame, it is also necessary to know which instruction > > adjusts the stack pointer: > > before this instruction an unwind must only (virtually or actually) pop > > the return address; and > > after this instruction, the stack pointer must first be incremented to > > deallocate the stack frame, > > then the instruction pointer restored. > > > > This leads to the following proposal for the upper part of the extended > > compact unwind > > quadword for use in combination with MODE = 10 in the lower part. > > > > | 63 48 47 46 40 39 32 | > > |-------------------------------------------------------------------| > > | RESERVED | | PROLOGUE-SIZE2 | PROLOGUE-SIZE1 | > > |-------------------------------------------------------------------| > > > > > > Except for the MODE, the lower part is exactly like it would be for MODE > > = 2. > > > > PROLOGUE-SIZE1 is the length in bytes of the first part of the prologue > > relative to the > > STARTING-ADDRESS, up to and including the instruction that allocates the > > stack. > > > > PROLOGUE-SIZE2 is the length in bytes of the second part of the prologue > > relative to the end > > of the first part, up to a point after the last preserved register is > > saved and before the first > > preserved register is changed. (Recall, this point may not be unique.) > > > > PROLOGUE-SIZE1 + PROLOGUE-SIZE2 gives the total size of the prologue. > > > > The maximum prologue size allowed here is much greater than will be > > typical. This is deliberate > > to allow compilers freedom to use shrink-wrap type optimizations in > > which safe operations that > > need no body state are moved from the body to the beginning of the > > prologue. This avoids the > > cost of stack frame setup and teardown for simple special case handling > > that often leads to an > > early exit. > > > > 3.2.1.2 Large RSP-based Frame (MODE = 11) > > This case is like the previous, except the stack size is too large to > > encode in the compact unwind > > encoding. The SIZE field is interpreted as an offset relative to the > > beginning of the containing > > unwind group to (what is necessarily) an earlier extended parameter > group. > > > > 3.2.1.3 RBP-Based Frame (MODE = 9) > > The RBP-based frame is the same as for a RSP-based frame except that > > there are two instructions > > that mark the transition between the two parts of the prologue: the push > > of the prior RBP value > > onto the stack followed by the copy of RSP to RBP to establish the new > > frame pointer. > > > > There appears to be no reason to not make this simplifying requirement: > > the push and copy > > always occur together. > > > > This leads to the following proposal for the upper part of the extended > > compact unwind > > quadword for use in combination with MODE = 9 in the lower part. > > > > | 63 48 47 46 40 39 32 | > > |-------------------------------------------------------------------| > > | RESERVED | | PROLOGUE-SIZE2 | PROLOGUE-SIZE1 | > > |-------------------------------------------------------------------| > > > > Except for the MODE, the lower part is exactly like it would be for MODE > > = 1. > > > > PROLOGUE-SIZE1 is the length in bytes of the first part of the prologue > > relative to the > > STARTING-ADDRESS, up to and including the instruction that pushes the > > prior RBP contents. > > > > PROLOGUE-SIZE2 is the length in bytes of the second part of the prologue > > relative to the end > > of the first part, up to a point after the last preserved register is > > saved and before the first > > preserved register is changed. (Recall, this point may not be unique.) > > > > PROLOGUE-SIZE1 + PROLOGUE-SIZE2 gives the total size of the prologue. > > > > 3.2.2 Function Epilogue > > For the purposes of exception handling, the key steps in an epilogue are: > > > > 1. Restore the registers that were preserved > > 2. Deallocate space on the stack that was used to preserve registers > > and for other purposes > > > > A key observation is that restoring registers is idempotent. That is, if > > in the midst of restoring > > registers when an exception occurs, it will do no harm if all of the > > preserved registers are > > restored again when unwinding. In effect, restoration of registers need > > not be distinguished from > > the body of a function. It is not until reaching the first instruction > > that deallocates stack is > > executed that the “real” epilogue begins. > > > > 3.2.2.1 RSP-Based Frame (MODE = 10 or 11) > > For a RSP-based frame, whether small or large, the only unwind action > > remaining after stack > > deallocation is to pop the return address into the instruction pointer > > (RIP). In the vast majority of > > cases this means that the epilogue of the code described by an unwind > > group is just a 1-byte RET > > instruction that occurs at the end of the unwind group. There are other > > possibilities but they are > > rare and con be dealt with separately (see below). > > > > It suffices to include a single 1-bit field, E, in the upper part of the > > extended compact unwind > > description: > > > > | 63 48 47 46 40 39 32 | > > |-------------------------------------------------------------------| > > | RESERVED | E | PROLOGUE-SIZE2 | PROLOGUE-SIZE1 | > > |-------------------------------------------------------------------| > > > > > > The E flag indicates that the unwind group ends with a standard 1-byte > > RET instruction. > > > > 3.2.2.2 RBP-Based Frame (MODE = 9) > > The RBP-based frame is much the same as for a RSP-based frame except > > that there are two > > instructions that mark the transition between the two parts of the > > prologue: the copy of RBP to > > RSP to cut back the stack, followed by the pop of RBP to restore what > > may be the callers frame > > pointer. Thus the epilogue, for the purposes of unwinding, begins > > immediately after the copy of > > RBP to RSP. In the vast majority of cases, the immediately following two > > instructions will > > consist of POP RBP and RET. The POP is not idempotent because it changes > > the stack pointer, > > but the unwind processing code can distinguish whether the stack has > > been popped or not based > > on the address (3 versus 1 byte less than the highest address of the > group). > > > > It suffices to include a single 1-bit field, E, in the upper part of the > > extended compact unwind > > description: > > > > | 63 48 47 46 40 39 32 | > > |-------------------------------------------------------------------| > > | RESERVED | E | PROLOGUE-SIZE2 | PROLOGUE-SIZE1 | > > |-------------------------------------------------------------------| > > > > The E flag indicates that the unwind group ends with a standard > > 3-instruction return sequence. > > > > 3.3 Special Issues > > Up until this point, it might appear that each unwind group corresponds > > one-for-one to the code > > for a single function. While common, this is not required. Important > > variations are illustrated > > here. > > > > Issue/Problem Possible Resolution > > ------------------------------------------------------------ > ------------------------------------- > > The first part of the prologue is too large to be Use two unwind > groups: > > described by the 8-bit RxP-FRAME-SIZE1 field. 1) a null frame > > group of any appropriate size, > > 2) a suitable > > RxP-based frame. > > > > The second part of the prologue is too large to be Use three unwind > > groups: > > described by the 8-bit RxP-FRAME-SIZE2 field. 1) a RSP-frame > > group to describe the first part of > > the prologue only > > (PROLOGUE-SIZE1 = the > > length of the > > entire group and PROLOGUE-SIZE2 > > = 0, > > 2) a RSP-frame > > group to describe the second part > > of the prologue > > (PROLOGUE-SIZE1 & 2 both = 0 > > and no registers > > saved), > > 3) a suitable > > RxP-based frame group (with > > PROLOGUE-SIZE1 & 2 > > both = 0). > > > > There is more than one return location (without a Generally use one > > unwind group for each sequence > > shared epilogue sequence). of code ending with > > a RET. A group following a > > RET will have no > > prologue (PROLOGUE-SIZE1 > > & 2 both = 0). > > > > There is more than one entry point to a function Generally use one > > unwind group for each sequence > > (for example, a FORTRAN multiple entry point of code beginning > > at an entry point. Each group > > subroutine or analogous assembly language may have a normal > > prologue and all but the last > > equivalents) might have NO > > epilogue (E flag clear). > > > > > > These examples are not exhaustive, of course. But they illustrate the > > flexibility of the > > mechanism, which should be suitable for the vast majority of cases in > > practice. > > > > 3.4 Number of Unwind Groups > > A simple concatenation of unwind groups by the linker that combines > > unwind group sections > > from multiple modules does not, of itself, provide direct information to > > the unwind software > > regarding how many unwind groups are present. However, the image > > information (ELF > > segments) should suffice to determine this number based on the size of > > the segment and the > > known size of each group. Alternatively, the linker might create special > > symbols that can be used > > to help determine the number of groups. > > > > This proposal leaves this detail to target-specific linker, image loader > > and exception handling > > system to specify. > > > > 3.5 Interaction with Code Generation > > The relatively simple and fixed size nature of the extended compact > > unwind information > > proposed here depends on observing certain restrictions on optimization > > that might affect code in > > prologues and epilogues. These restrictions were noted where relevant > > throughout this section. > > Here is a summary of those requirements: > > > > 1. Use of any preserved register must be delayed until all of the > > preserved registers have > > been saved. > > 2. In the prologue of a function with a RBP-based frame, the two > > instructions that save the > > prior RBP contents and copy the stack pointer to RBP must be kept > > adjacent. Similarly, > > in the epilogue of a function with a RBP-based frame, the two > > instructions that cut back > > the stack and restore the prior RBP contents must be kept adjacent. > > [This adjacency > > requirement could be relaxed by introducing an additional MODE that > > corresponds to having > > the RBP saved in memory but not yet updated from RSP. This does not > > seem worthwhile.] > > 3. For shrink wrap optimization in the presence of a > > handler/personality routine, care must > > be taken to not move instructions into the prologue that might cause > > an exception. > > Floating point exceptions and access violations are of particular > > concern in this regard. > > > > 4 System Specific Extensions for OpenVMS > > Omitted since nobody other than OpenVMS should care but it describes > > adding an additional > > four byte extension to the extended compact unwind group to pass along > > some information from > > the VAX Macro-32 compiler about VAX register usage. > > > > > > > > _______________________________________________ > > LLVM Developers mailing list > > llvm-dev at lists.llvm.org > > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev > > _______________________________________________ > LLVM Developers mailing list > llvm-dev at lists.llvm.org > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20180129/93e4346b/attachment.html>