Rahman Lavaee via llvm-dev
2020-Jul-17 20:30 UTC
[llvm-dev] A Section-Metadata-Based Approach for Mapping Binary Addresses to Machine Basic Blocks
Greetings, TLDR: We propose emitting a new section in the binary which would contain information required to associate executable PC addresses to basic blocks. The new proposal decouples the BB information from the symbol table, allowing for more flexibility (stripping the section, adding new BB information fields). Furthermore, the new approach cuts the binary size overhead by ~60% compared to the current implementation (using special BB symbols to pass the information). Background Today Propeller (proposed in https://lists.llvm.org/pipermail/llvm-dev/2019-September/135393.html) uses the *-fbasicblock-sections=labels* option to create a symbol for every basic block to be able to map instruction addresses to individual basic blocks. To lower the .strtab bloat, these symbols are encoded using a unary naming scheme which allows for string compression. For instance, a function “foo” with four basic blocks will have three basic block labels (the entry block doesn’t need a BB label since it can be found using the function symbol). foo: … je ra.BB.foo a.BB.foo: … je rra.BB.foo ra.BB.foo: … ret rra.BB.foo: … ret While this serves the functionality of mapping addresses to basic blocks, it poses several challenges: 1. Unary symbol names are great for compression, but require changes to the demangler for readability. 2. Stripping BB symbols from the binary requires special linker support as they are placed in the symbol table along with other symbols. 3. Profilers and debuggers need to be changed to accommodate these symbols. For instance, the profiles from all BBs of a function must be aggregated and the debugger must be able to map to the function symbol, rather than the BB symbol, to show a function-level backtrace. 4. Relying on the ELF symbol format provides little flexibility for extension of BB information for passing other information about basic blocks (e.g., whether this is a return block, or an exception handling block). Today Propeller encodes this information using special characters in the symbol name (‘r’ for return blocks as in the above example). The BB Info Section Design To solve these problems, we propose emitting the BB information in a separate “.bb_info” section. Each function will have its own BB info table emitted into this section. The structure of this table is as follows. The table has a header which includes the address of the function and the number of basic blocks in that function. +-----------------------------------------------+ | Address of the function | -> 8 bytes (pointer size) +-----------------------------------------------+ | Number of basic blocks in this function (>0) | -> ULEB128 +-----------------------------------------------+ The header is followed by the BB information record for every basic block. These records are ordered in the same order as MBBs are placed in the function. Each BB Info is structured as follows: +-------------------------------------------------------+ | Offset of the basic block relative to function begin | -> ULEB128 +-------------------------------------------------------+ | Binary size of the basic block | -> ULEB128 +-------------------------------------------------------+ | BB metadata | | [MBB.isReturn() | MBB.hasTailCall() << 1 | MBB.isEHPad() << 2 ] | -> ULEB128 +-------------------------------------------------------+ The BB record for the entry basic block excludes the offset field, as it is always zero. Along with this information, Propeller needs the corresponding MBB number of every BB info record. Instead of emitting another field in the record, we implicitly derive the index assuming that MBBs are renumbered from 0 to N-1 prior to bb-info section emission. Remapping PC addresses to BBs using the .bb_info section Propeller can use the BB Info table, along with function addresses to map sampled PCs to their corresponding basic blocks. Given an address X, the workflow is as follows: 1. Use the symbol table to find the function F containing X. 2. Lookup the BB info table for function F. 3. Use the offset fields of the BB info table to find the BB which contains address X. Complexity All steps can use binary search. 1. Step 1 uses a binary search into the symbol table and has time complexity of O(lg N), with N being the number of symbols in the binary. 2. Step 2 also uses binary search to find the BB info table corresponding to a function address and it has time complexity of O(lg M), with M being the number of functions. 3. Finally, step 3 uses a binary search into the BB info table to find the offset containing the address offset relative to the function entry and has time complexity of O(lg N_F), with N_F being the number of basic blocks in function F. Advantage over the BB symbols approach The section-based approach solves the problems mentioned above by decoupling the BB information from the symbol table. Furthermore, the new approach significantly reduces the binary and object size overheads. The following table lists the total binary size of Clang for vanilla, BB-symbols, and BB-info builds, all using PGO. Flavor Binary size PGO 86MB PGO with BB Info section 97MB PGO with BB symbols 146MB We see that using the .bb_info section increases the binary size by about 13% while BB symbols (with strtab compression) amounts to about 70% binary size (60MBs) overhead. Most of this overhead comes from the .symtab bloat (56MBs). The .strtab bloat is lower (4MB) due to string compression. Implementation and challenges Our implementation of the .bb_info section is inspired by the .stack_sizes section feature in LLVM. Each function will have its BB info table emitted into a section which is linked to the section containing that function (The BB info section uses ELF:SHF_LINK_ORDER and is linked to the beginning symbol of the section containing that function). With -function-sections this emits a unique .bb_info section for each function (we don’t need a unique id for sections since they are already distinguished by the linked-to symbol). Eventually, these sections are concatenated by the linker in an order consistent with how their function sections are laid out. Several linker and compiler features may affect the .bb_info section: 1. COMDATs: For COMDAT functions, we add their .bb_info sections to the COMDAT group too, so that conflicts between .bb_info sections of the COMDAT group are resolved the same way as it is resolved for the function sections. Besides this, since we make the BB info table header of every function refer not to the global function symbol, but to the local .Lfunc_begin symbol, the treatment of the bb info section for every function is independent of symbol resolution for COMDAT functions (This is the same as how stack_sizes section is emitted: https://reviews.llvm.org/D44669). 1. Garbage Collection: If a function section is garbage-collected by the linker its .bb_info section must be removed as well, since the section only refers to that particular function. 1. Code Folding: If two functions are identical, their section is folded by identical code folding and their .bb_info sections must be folded as well. This is a safe assumption as the .bb_info sections should be identical as well, except for the function symbol that they refer to, and ICF can correctly identify that those function symbols are folded together. -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20200717/137e6d32/attachment.html>