Peter Boström
2012-Sep-26 11:17 UTC
[LLVMdev] Modifying address-sanitizer to prevent threads from sharing memory
Hi llvm-dev!

I'm writing my master's thesis on sandboxing/isolation of plugins
running in a multithreaded environment. These plugins run in a real-time
environment where paying the cost of IPC/context switching and being at
the scheduler's mercy is not an option. There can be a lot of plugin
instances running, and each has to perform some computations and return
the result to the main thread on an audio-buffer callback.

These need to be isolated, primarily from the main thread, but
preferably from each other as well. I'm thinking that modifying
address-sanitizer for this purpose could be feasible.

The shadow byte could be split so that part of it holds a 'short-id'
identifying which thread/plugin the memory belongs to. This would of
course limit the plugin/thread short-ids available, and in theory some
false negatives could arise if two plugins are given the same short-id
and then access each other's memory, though this error would be detected
if it occurs another time, when they don't have the same short-id.

The modification would be slightly different depending on whether n
threads drive n plugins or a thread pool of n threads drives m plugins.

This would of course not work for globals, which naturally would be
owned by the main thread. Also, any kind of communication between plugins
or back to the main thread would have to be mediated by the main thread
or use uninstrumented/unsafe code.

It's intended that the main code and plugin code are instrumented with
different asan passes.

From what I have thought of so far (but I'd love feedback!), these
changes are needed:

1. Storing+checking thread/plugin id in shadow byte.
2. Modified stack instrumentation to set up these shadow bytes.
3. Graceful shutdown of plugins preferred, free'ing heap and signaling
   back to main thread instead of shutting down.

Also, an optional compile flag could be used to modify the
instrumentation's granularity, i.e. whether to assume memory blocks are
allocated in multiples of 8, giving less code blowup.
The shadow bytes would essentially be booleans then. Though this isn't
directly related to the changes I'd need to make.


=== Heap part ===

shadow_byte k:  0 0 0 0 0 0 0 0
                <short_id><shadow>

short-id part: 0: main thread
               1-30: plugin/thread short-ids
               31 = 0x1F, all bits set: unallocated

shadow part: 0-7, same encoding as original.


== Original instrumentation code (ASan USENIX2012 paper) ==

* All instrumented code:

  ShadowAddr = (Addr >> 3) + offset;
  k = *ShadowAddr;

  if (k != 0 && (Addr & 7) + AccessSize > k)
    ReportAndCrash(Addr);

== Concept code (code blowup, though) ==

  ShadowAddr = (Addr >> 3) + offset;
  k = *ShadowAddr;
  alloc_id = k >> 3;     // upper 5 bits: short-id
  shadow = k & 0x07;     // lower 3 bits: shadow part (0-7)

* Thread/plugin code:

  if (alloc_id != my_short_id || // alloc belongs to other thread
      (shadow != 0 && (Addr & 7) + AccessSize > shadow))
    ReportAndCrash(Addr);

* Main code:

  if (alloc_id == 0x1F || // unallocated memory
      (shadow != 0 && (Addr & 7) + AccessSize > shadow))
    ReportAndCrash(Addr);


== Less granularity: assume+enforce multiples of 8, quicker/smaller ==

shadow byte = short-id: 0 = main id
                        1-254: short-ids
                        255 = 0xFF: unallocated

  ShadowAddr = (Addr >> 3) + offset;
  k = *ShadowAddr;

* Thread/plugin code:

  if (k != my_short_id) // allocated/set from different thread
    ReportAndCrash(Addr);

* Main code:

  if (k == 0xFF) // unallocated memory
    ReportAndCrash(Addr);


=== Stack part ===

This part would be different depending on whether there's a 1-to-1
mapping between threads and plugins.

* 1-to-1 mapping:

  Since the plugin owns the thread stack, all of the corresponding
  shadow can be initially filled with the shadow byte indicating that
  that thread can access all of it.

  Poisoning the redzones would still have to be done, but unpoisoning
  (and initial setup) would not set the shadow to zero (except for the
  main stack), but rather set each byte (memset) back to
  (short_id << 3), which would indicate that the plugin with that
  short_id can read/write all corresponding bytes.

* n-to-m mapping:

  If the stack is shared, it can't be poisoned/unpoisoned back to a
  state readable by the next plugin using that stack space. When
  allocating stack variables, all corresponding shadow bytes have to be
  set to readable by the plugin currently using that stack. Though it
  may be possible to have different stacks for each plugin, and use the
  same mapping as above.


===

What do you think? Does this sound feasible? This would of course not be
changes to the existing -faddress-sanitizer flag, but part of the thesis
project.

Very thankful for feedback!

Kind regards,
- Peter
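
For readers who want to experiment with the heap-part checks proposed in
the message above, here is a minimal C sketch of the split shadow byte.
It is an illustration only, not ASan code: every function and constant
name is hypothetical, the shadow byte is passed in directly instead of
being loaded from (Addr >> 3) + offset, and the mask 0x07 assumes the
5-bit short-id / 3-bit shadow split described in the message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed layout: upper 5 bits hold the short-id, lower 3 bits the shadow. */
enum {
    SHADOW_BITS    = 3,
    SHADOW_MASK    = 0x07,
    ID_MAIN        = 0,     /* short-id of the main thread */
    ID_UNALLOCATED = 0x1F   /* all id bits set: unallocated memory */
};

/* Pack a short-id and a shadow value (0-7) into one shadow byte. */
static uint8_t make_shadow_byte(uint8_t short_id, uint8_t shadow) {
    assert(short_id <= ID_UNALLOCATED && shadow <= SHADOW_MASK);
    return (uint8_t)((short_id << SHADOW_BITS) | shadow);
}

/* True if the access runs past the addressable prefix of its 8-byte
 * granule. shadow == 0 means all 8 bytes are addressable. */
static bool partial_overrun(uintptr_t addr, size_t size, uint8_t shadow) {
    return shadow != 0 && (addr & 7) + size > shadow;
}

/* Check performed by plugin/thread code: the memory must carry the
 * plugin's own short-id, and the access must fit in the granule. */
static bool plugin_access_faults(uint8_t k, uint8_t my_short_id,
                                 uintptr_t addr, size_t size) {
    uint8_t alloc_id = (uint8_t)(k >> SHADOW_BITS);
    uint8_t shadow   = (uint8_t)(k & SHADOW_MASK);
    return alloc_id != my_short_id || partial_overrun(addr, size, shadow);
}

/* Check performed by main code: any allocated memory is acceptable,
 * only unallocated memory or a granule overrun is an error. */
static bool main_access_faults(uint8_t k, uintptr_t addr, size_t size) {
    uint8_t alloc_id = (uint8_t)(k >> SHADOW_BITS);
    uint8_t shadow   = (uint8_t)(k & SHADOW_MASK);
    return alloc_id == ID_UNALLOCATED || partial_overrun(addr, size, shadow);
}
```

With this layout, plugin 5 reading memory tagged make_shadow_byte(6, 0)
is reported, while the main code may touch the same memory. The extra
shift and mask on every access are the code blowup the message mentions;
the coarser one-byte-per-id variant reduces each check to a single
comparison.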
Dmitry Vyukov
2012-Sep-26 20:01 UTC
[LLVMdev] Modifying address-sanitizer to prevent threads from sharing memory
On Wed, Sep 26, 2012 at 4:17 AM, Peter Boström <pbos at kth.se> wrote:
> What do you think? Does this sound feasible? This would of course not be
> changes to the existing -faddress-sanitizer flag, but part of the thesis
> project.
>
> Very thankful for feedback!

Hi Peter,

Have you looked at ThreadSanitizer (-fthread-sanitizer)? It does not do
exactly what you want, but something similar: it detects data races
between threads (accesses to data without proper synchronization).
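
To make the kind of bug mentioned in the reply concrete, the following C
sketch shows two threads incrementing a shared counter, with and without
a mutex. It is only an illustration: all names are hypothetical and the
program is not taken from the thread. The unsynchronized path is the sort
of access a race detector such as ThreadSanitizer is designed to flag
(current Clang releases spell the flag -fsanitize=thread rather than
-fthread-sanitizer).

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

enum { INCREMENTS = 100000 };

struct counter {
    long value;
    pthread_mutex_t lock;
    bool use_lock;   /* false: both threads touch value unsynchronized */
};

/* Thread body: bump the shared counter INCREMENTS times. */
static void *bump(void *arg) {
    struct counter *c = arg;
    for (int i = 0; i < INCREMENTS; i++) {
        if (c->use_lock) pthread_mutex_lock(&c->lock);
        c->value++;  /* a data race whenever use_lock is false */
        if (c->use_lock) pthread_mutex_unlock(&c->lock);
    }
    return NULL;
}

/* Run two threads over one counter and return the final value. */
static long run_two_threads(bool use_lock) {
    struct counter c = { .value = 0, .use_lock = use_lock };
    pthread_t a, b;
    int rc;

    rc = pthread_mutex_init(&c.lock, NULL);
    assert(rc == 0);
    rc = pthread_create(&a, NULL, bump, &c);
    assert(rc == 0);
    rc = pthread_create(&b, NULL, bump, &c);
    assert(rc == 0);

    pthread_join(a, NULL);
    pthread_join(b, NULL);
    pthread_mutex_destroy(&c.lock);
    return c.value;
}
```

With the mutex enabled, run_two_threads(true) returns 2 * INCREMENTS.
With it disabled the result is unpredictable. Note that this is a
different property from the ownership check proposed in the original
message: a race detector accepts cross-thread access as long as it is
synchronized, whereas the short-id scheme rejects it regardless.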