David Chisnall
2014-Nov-04 09:07 UTC
[LLVMdev] [PATCH] Protection against stack-based memory corruption errors using SafeStack
On 4 Nov 2014, at 00:36, Kostya Serebryany <kcc at google.com> wrote:

> You at least increase the memory footprint by doubling the stack sizes.

Not quite. The space overhead is constant for each stack frame - you just need to keep track of the top of two stacks, rather than one. The important overhead is that you reduce locality of reference. You will need a minimum of two cache lines for each stack frame instead of one. In practice, this is not a huge problem, because you need several cache lines live for good performance of the stack and the total number of lines is not much different.

There are likely to be some pathological cases though, when both the safe and unsafe stacks have the same alignment for the top and you are dealing with some other heap data with the same alignment. This will increase the contention in set-associative cache lines and may cause more misses.

David
Volodymyr Kuznetsov
2014-Nov-04 12:47 UTC
[LLVMdev] [PATCH] Protection against stack-based memory corruption errors using SafeStack
Yes, indeed, the increase in the memory footprint is minimal and constant for each stack frame that uses the unsafe stack - it's just a single unsafe stack frame pointer per unsafe stack frame. The space for each stack object is still allocated only once: either on the normal or on the unsafe stack, but not both. In practice, we didn't observe any measurable increase in the memory footprint due to the safe stack in our experiments.

As for the cache locality, we actually observed that the safe stack sometimes improves the cache hit rate. This is especially the case for programs that allocate large arrays or long-lived objects on the stack that would normally be evicted from the cache, but are kept there only because they share cache lines with, e.g., spilled registers. With the safe stack, such objects are moved elsewhere, which results in the frequently accessed objects on the normal stack being closer to each other and occupying fewer cache lines in total.

Of course there might be pathological negative cases as well, but as we show in our paper, both the average and the maximum overheads look quite good in practice (see Figures 3 and 4 in http://dslab.epfl.ch/pubs/cpi.pdf).

- Vova
Volodymyr Kuznetsov
2014-Nov-04 19:14 UTC
[LLVMdev] [PATCH] Protection against stack-based memory corruption errors using SafeStack
Hi Kostya,

Thanks for your comments! We are great fans of AddressSanitizer ourselves; we use it extensively and plan to contribute to it in the future.

Our understanding is that ASAN's main use case is during testing; the goal of SafeStack is to run in production, so it offers a specific form of protection that can be delivered at near-zero overhead. In particular, we don't focus much on bug detection, but rather on making it much harder to write exploits against code that has bugs. With SafeStack enabled, the distance between an on-stack buffer that might overflow and the return addresses (or sensitive spilled registers) that the attacker might want to overwrite becomes much less predictable (or even randomized, if ASLR is employed), as they're now stored on two separate stacks.

Our current SafeStack implementation is just a first step in that direction. In the future, we plan to further protect the regular stack using leak-proof randomization (as described in our paper): the regular stack will be allocated at a random offset and the instrumentation will ensure that neither the %rsp value nor any other pointer to the regular stack is ever stored on the heap or on the unsafe stack. This would mostly require changes to libc/glibc to instrument setjmp/longjmp, stack unwinding, and other code that accesses %rsp directly. With such protection in place, overwriting the return addresses or pivoting the stack would become nearly impossible in practice, along with many ROP attacks that are based on it.

Please find answers to some of your specific questions below:

On 4 Nov 2014, at 00:36, Kostya Serebryany <kcc at google.com> wrote:

> Hi Volodymyr,
>
> disclaimer: my opinion is biased because I've co-authored AddressSanitizer
> and SafeStack is doing very similar things.
>
> The functionality of SafeStack is limited in scope, but given the near-zero
> overhead and non-zero benefits I'd still like to see it in LLVM trunk.
> SafeStack, both the LLVM and compiler-rt parts, is very similar to what we
> do in AddressSanitizer, so I would like to see more code reuse, especially
> in compiler-rt.

That would be great indeed and could simplify the SafeStack code in several places. We will try to figure out how to do it without increasing the overhead or complexity of SafeStack (e.g., without requiring linking with pthreads, etc.).

> What about user-visible interface? Do we want it to be more similar to
> asan/tsan/msan/lsan/ubsan/dfsan flags, e.g. -fsanitize=safe-stack ?

We've picked the -fsafe-stack option as it feels more similar to the -fstack-protector option, whose usage model we follow. The -fsanitize options feel more associated with testing/debugging than with production use (or at least we perceive it this way).

> I am puzzled why you are doing transformations on the CodeGen level, as
> opposed to doing it in LLVM IR pass.

As I explained on Phabricator, we want to apply the SafeStack transformation as the very last step before code generation, to make sure that it operates on the final stack layout. Doing so earlier might prevent some other optimizations from succeeding (e.g., it complicates alias analysis and breaks the mem2reg pass) or might force the SafeStack pass to move more objects to the unsafe stack than necessary (e.g., if the operations on such objects that SafeStack considered potentially unsafe are later optimized away). In principle, in some pathological cases, it might even break correctness, e.g., if SafeStack decides to keep some object on the normal stack, but subsequent optimization or instrumentation passes add potentially unsafe operations on that object.

> LLVM code base is c++11 now, so in the new code please use c++11, at least
> where it leads to simpler code (e.g. "for" loops).

Great point! I've fixed the code to use c++11 (along with many other issues raised on Phabricator) and will update the patch ASAP.

> compiler-rt part lacks tests. same for clang part.

Yes, we plan to add such tests in the future.

> Are you planning to support this feature in LLVM long term?

We certainly want to see SafeStack used in the real world and will do our best to support it in LLVM. That said, please keep in mind that we're just a small group in a research institution with very limited resources, so we can hardly make any promises and would greatly appreciate any help from the community in supporting and improving SafeStack.

> You say that SafeStack is a superset of stack cookies.
> What are the downsides?
> You at least increase the memory footprint by doubling the stack sizes.
> You also add some (minor) incompatibility and the need for the new
> attributes to disable SafeStack.
> What else?

Please see an earlier email <http://lists.cs.uiuc.edu/pipermail/llvmdev/2014-November/078475.html> for a discussion of the memory footprint. The main (minor) incompatibility that we observed was related to the mark-and-sweep garbage collection implementation for C++ that we saw in Chromium: we had to change it to scan the unsafe stack as well (in addition to the regular stack) when searching for pointers to the heap. The change was rather small and well isolated though, and pretty much aligned with the already existing support for AddressSanitizer in that garbage collector.

> I've also left a few specific comments in phabricator.

Thank you for the comments! We plan to submit the updated patch ASAP.

- Volodymyr Kuznetsov
Kostya Serebryany
2014-Nov-04 20:42 UTC
[LLVMdev] [PATCH] Protection against stack-based memory corruption errors using SafeStack
On Tue, Nov 4, 2014 at 11:14 AM, Volodymyr Kuznetsov <vova.kuznetsov at epfl.ch> wrote:

> Our understanding is that ASAN's main use case is during testing; the goal
> of SafeStack is to run in production, so it therefore offers a specific
> form of protection that can be delivered at near-zero overhead. In
> particular, we don't focus much on bug detection, but rather on making it
> much harder to write exploits against code that has bugs. With the
> SafeStack enabled, the distance between on-stack buffer that might overflow
> and the return addresses (or sensitive spilled registers) that the attacker
> might want to overwrite becomes much less predictable (or even randomized,
> if ASLR is employed), as they're now stored on two separate stacks.

No disagreement here. We do want to use asan in production (and we have some results!) but asan's overhead will remain much higher than SafeStack's.

>> What about user-visible interface? Do we want it to be more similar to
>> asan/tsan/msan/lsan/ubsan/dfsan flags, e.g. -fsanitize=safe-stack ?
>
> We've picked the -fsafe-stack option as it feels more similar to
> -fstack-protector option, whose usage model we follow. The -fsanitize
> options feel more associated with testing/debugging than the production use
> (or at least we perceive it this way).

I have no strong opinion here.

>> I am puzzled why you are doing transformations on the CodeGen level, as
>> opposed to doing it in LLVM IR pass.
>
> As I explained on Phabricator, we want to apply the SafeStack
> transformation as the very last step before code generation, to make sure
> that it operate on the final stack layout. Doing so earlier might prevent
> some other optimizations from succeeding (as it e.g., complicates the alias
> analysis, breaks mem2reg pass, etc.) or might force the SafeStack pass move
> more objects to the unsafe stack than necessary (e.g., if the operations on
> such objects that the SafeStack considered potentially unsafe are actually
> later optimized away). In principle, in some pathological cases, it might
> even break correctness, e.g., if the SafeStack decides to keep some object
> on the normal stack, but the subsequent optimization or instrumentation
> passes add potentially unsafe operations on such objects.

The asan instrumentation happens at the very end of the optimization chain, so it effectively achieves what you need (running after all optimizations). I would still suggest that you at least explore this possibility.

> The main (minor) incompatibility that
> we observed were related to the mark-and-sweep garbage collection
> implementation for C++ that we saw in Chromuim: we had to change it to scan
> the unsafe stack as well (in addition to the regular stack) when searching
> for pointers to the heap. The change was rather small and well isolated
> though, and pretty much aligned with already existing support for
> AddressSanitizer in that garbage collector.

I wonder if SafeStack can hook into the existing asan<=>gc interface?
Stephen Checkoway
2014-Nov-14 04:39 UTC
[LLVMdev] [PATCH] Protection against stack-based memory corruption errors using SafeStack
Hi Volodymyr,

On Nov 4, 2014, at 2:14 PM, Volodymyr Kuznetsov <vova.kuznetsov at epfl.ch> wrote:

> With such protection in place, overwriting the return addresses or pivoting the stack would become nearly impossible in practice, along with many ROP attacks that are based on it.

I agree that overwriting the return address becomes more difficult, but stack pivoting is still doable. On x86, overwriting a function pointer to point to

    xchg eax, esp
    ret

which is just two bytes, 0x94 0xc3, will pivot the stack to eax and start a return-oriented program when the function pointer is called.

--
Stephen Checkoway