Kristof Beyls via llvm-dev
2020-Oct-15 12:48 UTC
[llvm-dev] Round table on AArch64 Pauth ABI - minutes
Thank you to everyone who contributed to the round table on AArch64 Pauth ABI at the LLVM developers meeting last week. Please find the minutes below.

As we discussed at the round table, we’ll have more calls in the future on this topic. The next one is scheduled for Monday 9th of November at 17:00 UTC (which should be 9am US Pacific time).
Zoom link: https://armltd.zoom.us/j/96797724255?pwd=ZEpuMmJndTBlSVU5OVIxTjN3ZzQ3Zz09
I’ve attached an ical file with an invite to that call for convenience.

Thanks,
Kristof

When: Wed. Oct 7, 2020 8:55 AM - 9:30 AM
Where: https://whova.com/portal/webapp/llvm_202010/Agenda/1274481

Intro:
The Arm instruction set has introduced a “Pointer Authentication” extension in the Armv8.3-A architecture. All cores implementing the Armv8.3 or later architecture implement this extension. The extension enables putting a cryptographic hash of the address and other “salt” info into the upper bits of the pointer, effectively “signing” the pointer. Later in the execution, just before the pointer is used, the pointer can be authenticated, i.e. the cryptographic hash stored in the upper bits of the pointer can be verified, checking with high probability that the pointer has not been tampered with by an attacker.

Protecting return addresses (i.e. hardening backward control flow integrity, in other words hardening against ROP attacks) can be done without breaking ABI. This has been implemented in clang/llvm/gcc and is further being implemented in other parts of the system software stack.

Protecting indirect calls/jumps (i.e. hardening forward control flow integrity, in other words hardening against JOP attacks) can NOT be done without breaking ABI. We are starting to define a new ELF ABI to enable using the Armv8.3 PAuth extension. The topic of this round table is to discuss the design of that ABI and its implementation in LLVM.
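To make the sign/authenticate idea from the intro concrete, here is a minimal toy model in Python. It is only a sketch under stated assumptions: the 48-bit address width, the use of HMAC-SHA256 as the keyed hash, and the names `sign`, `authenticate` and `compute_pac` are all illustrative choices, not what the hardware or any ABI actually specifies. Real hardware uses its own algorithm, keys and bit layout.

```python
import hashlib
import hmac

VA_BITS = 48                     # assumed virtual-address width (illustrative)
ADDR_MASK = (1 << VA_BITS) - 1   # low bits hold the real address
PAC_BITS = 64 - VA_BITS          # upper bits reused for the signature in this toy


def compute_pac(address: int, salt: int, key: bytes) -> int:
    """Return a PAC_BITS-wide keyed hash of (address, salt)."""
    message = address.to_bytes(8, "little") + salt.to_bytes(8, "little")
    digest = hmac.new(key, message, hashlib.sha256).digest()
    # Truncate the digest to the number of spare upper bits.
    return int.from_bytes(digest[:8], "little") >> (64 - PAC_BITS)


def sign(pointer: int, salt: int, key: bytes) -> int:
    """Place the keyed hash in the upper bits of the pointer."""
    address = pointer & ADDR_MASK
    return (compute_pac(address, salt, key) << VA_BITS) | address


def authenticate(signed: int, salt: int, key: bytes) -> int:
    """Verify the upper bits and return the plain address.

    This toy raises an exception on mismatch; hardware reports
    failure through its own mechanism.
    """
    address = signed & ADDR_MASK
    if (signed >> VA_BITS) != compute_pac(address, salt, key):
        raise ValueError("pointer authentication failed")
    return address
```

In this model, `authenticate(sign(p, salt, key), salt, key)` returns `p`, while flipping any of the upper (signature) bits of the signed value makes `authenticate` raise. With only 16 signature bits, a tampered address or wrong salt is rejected with high probability rather than with certainty, which matches the "with high probability" wording above.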

The ELF ABI under discussion is similar to the Apple arm64e ABI, which has similar goals for the Darwin platform (see https://www.youtube.com/watch?v=C1nZvpEBfYA and https://github.com/apple/llvm-project/blob/a63a81bd9911f87a0b5dcd5bdd7ccdda7124af87/clang/docs/PointerAuthentication.rst).

The draft ELF ABI is being developed at https://github.com/ARM-software/abi-aa/blob/master/pauthabielf64/pauthabielf64.rst. There is a pull request open at the moment to take the document to version 0.2. To view version 0.2 of the document, please use https://github.com/ARM-software/abi-aa/blob/ed1151099f52c8caf5e575e8d8c00450d43dcbc2/pauthabielf64/pauthabielf64.rst
Feel free to comment on the pull request: https://github.com/ARM-software/abi-aa/pull/41

Minutes:
* Breaking the ABI: we’ll need multiple iterations to design this new ABI. How can we record version information for the different variants of this ABI that a binary file assumes?
  * Current proposal: a note section in the ELF file with a (vendor, version) tuple.
  * As a mitigation, version information needs to be recorded. Indeed, the ABI is expected to evolve as we understand the tradeoffs (security and others) over time.
  * No better solution is known than just assuming every variant is fully incompatible with every other variant.
  * Also good to help make sure that all binary objects are actually built with the right ABI variant.
  * Versioning at the ELF object level may not be needed, as it’s “just” indications/relocations describing how to sign specific pointers? The ELF part stabilises early but the language-level mapping to it does not. That implies the compiler should encode the signing scheme it uses in ELF object files?
  * Even within a single OS, there are probably going to be different ABI variants, e.g. firmware using a stricter variant (but harder to use/deploy) than user-space software.
  * Version info should probably contain the platform (e.g. “FreeBSD 2.0 signing scheme”).
* Use cases for deeply bare metal, or is this mostly OS-level?
  * A bare-metal system may not support RELRO; lazy binding cannot be RELRO either. What do we do about the GOT? Ideally it is RELRO, so it does not need to be signed.
  * You don’t need to specify how to construct a signed GOT - basically the compiler constructs fragments of a “GOT” without using GOT-generating relocations.
  * Complexity of the draft spec comes mostly from a signed GOT. It would be good to get rid of that complexity.
  * You should expect a section of the GOT to be signed and another section not signed. All these presumably would need to go on different pages? If you use RELRO and non-lazy binding for all GOTs, they can all live on the same set of pages. But maybe on ELF this turns out not to be a big problem? Typically in AArch64 ELF the GOT and PLT GOT are separate sections, with the GOT placed in the RELRO segment and a lazy-binding-enabled PLT GOT in an RW segment.
  * Lazy binding needs the PLT GOT to be writable, which may be a security issue if it is not signed, as an attacker may modify the pointers in there to redirect function calls.
  * A signed PLT GOT is a contract between the static linker and the dynamic linker; we already have a dynamic tag and a signing schema defined for this. LLD and ld.bfd implement support, but there is as yet no support in dynamic linkers.
* dlopen/dlsym - what to do about it?
  * arm64e found that from process launch either the process is “signed” or “unsigned”. So a signed application loading a shared library needs that library to be signed. Loading a signed library from a non-signed application works, though.
  * Could we use fat binaries in ELF for packing a signed and a non-signed version in a single library?
  * dlsym: whether you get a signed pointer or not depends on whether you are in a signed segment in arm64e?
  * Peter Collingbourne has implemented a prototype where an extra side table is present to hold the extra per-symbol signing information.
  * Another approach is to have the compiler re-sign pointers when casting from void* to a function type.
* Tools to measure “how much security did you get” by using a particular ABI version? E.g. post-processing tools.
  * Front-end diagnostics (some of them have been implemented)?
  * In the backend there are cases where a compiler bug might cause an issue. Optimization remarks emitted very late in the backend could help catch these.
  * There’s a whole community of security folks who are interested and have their own (non-LLVM) tools like IDA, Ghidra etc.
  * The binary analysis part (last part) is actually really important. Patterns are easy to recognize if there is something problematic (e.g. catching corner cases in the compiler, or bad assembly code).
  * The compiler can emit standard code patterns to make spotting these easier.
* Upstreaming for arm64e is starting.
* Debugging these things is a major issue (Ahmed); downstream LLDB for arm64e has enough support to debug this.
  * Could introduce new DWARF extensions to make the debugging experience better.
  * A spec for this? It would be a separate document for ELF AArch64 at least.
  * Could the debugger just strip the PAC codes? It gets complicated (similar to TBI) because the size of the PAC may be different based on whether it is a data or code pointer, and the debugger doesn’t always know if something is a code or data pointer.
  * LLDB’s UI tries to print both the signed pointer and the pointer without PAC bits.
  * Since the debugger runs as a different process, it cannot do anything with PAC bits (different keys).
  * LLDB uses cross-process JIT to evaluate expressions. Different processes have different crypto keys. How to make that work?
* Kristof will look into setting up a monthly Zoom call to discuss this topic further in the future.

-------------- next part --------------
A non-text attachment was scrubbed...
Name: LLVM Pointer Authentication ABI.ics
Type: text/calendar
Size: 2996 bytes
Desc: LLVM Pointer Authentication ABI.ics
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20201015/b6fdc37c/attachment.ics>