Many VMs focus on performance, optimizations, memory consumption, etc., but very few, if any, focus on fault isolation and security. Given memory safety, any VM reduces to capability security, which is sufficient to implement most security policies of interest; however, most such VMs still ignore two main attack vectors from malicious code: DoS attacks on memory allocation, and DoS attacks against the CPU.

I've been mulling over how LLVM could be extended to provide a degree of isolation from these two attack vectors [3].

Preventing a DoS against memory allocation involves controlling access to allocation in some way. Fine-grained control over every single allocation is likely infeasible [1]. Similarly, preventing a DoS against the CPU involves controlling the execution time of certain code blocks, by introducing concurrency or flow control of some sort.

There is a single abstraction which has solved the above two problems for over 40 years: the process, which provides an isolated memory space, and an independently schedulable execution context.

A VM process would run in its own heap and manage its own memory. The memory allocation routines are scoped to the process, which can itself potentially call out to a "space bank" to allocate more space for its heap. Memory faults in a process can be handled by "keepers" [4].

Concurrency is still an open question, because a kernel thread per VM process is actually overkill. A mix of kernel threads and Erlang-style preemptive green threads might be optimal, but this isn't the interesting part of the proposal IMO.

There must also be some sort of interprocess communication (IPC), either via copying between heaps, or an "exchange heap". The exchange heap is the approach taken by the Singularity OS [2], where they add "software isolated processes" to the .NET VM and make it an operating system.

There are two approaches I currently foresee for adding process constructs to LLVM:

1. Add process management instructions to the core instructions, and modify the runtime to rebind the allocation routines depending on some VM-level context that names which process is actually executing (perhaps in thread-local storage).

2. Add intrinsics to launch an entirely new VM instance (ExecutionEngine?) as if it were the process. This would involve modifying the VM to accept allocation primitives as function pointers, and potentially adding some scheduling awareness.

At the moment, I'm not primarily interested in making LLVM itself a secure VM, but I think that too might be possible, and suggests possible future work.

For instance, unsafe pointer operations can be made safe if the casting operation from integer to pointer implements a dynamic check that the result is within the bounds of the heap. This is potentially an expensive operation, but such casts only penalize heavily unsafe programs, which should hopefully be rare. I believe LLVM programs that do not use these casting instructions are inherently memory safe, so they incur no such penalties (please correct me if I'm wrong). Using this approach, LLVM could support the safe execution of unsafe programs by running them in an isolated VM process.

Alternatively, one could actually launch the unsafe code in a completely separate OS process with a new LLVM instance, and the VM-level IPC instructions would transparently perform OS-level IPC to the separate process. This maintains the isolation properties, with the full execution speed (no need for dynamic heap bound checks), at the cost of using slightly heavier OS processes.

Any comments on the feasibility of this approach? I'm definitely not familiar with the LLVM internals, and I wrote the above given only my understanding from reading the LLVM reference manual.

Sandro

[1] except perhaps using some sort of region-based approach with region inference, etc. I'm still reading the literature on this.
[2] http://research.microsoft.com/os/singularity/
[3] I realize that LLVM is unsafe in other ways, but I believe it currently lacks the base constructs necessary to even build a secure VM on top of it.
[4] I can explain the space bank and keeper concepts further, but just think of them as stateful exception handlers specific to a process. The concepts come from the KeyKOS/EROS and Coyotos secure operating systems.

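To make the "process-scoped allocation with a space bank and keeper" idea in the message above a little more concrete, here is a minimal sketch in C. Everything in it is an assumption made for illustration: vm_heap, vm_alloc, vm_free, keeper_fn, space_bank and bank_keeper are hypothetical names, not LLVM, KeyKOS or EROS APIs, and the keeper is reduced to a callback that may grant extra quota when an allocation would exceed it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical names throughout: none of this is LLVM API. */
typedef struct vm_heap vm_heap;

/* A "keeper" is asked for help when an allocation would exceed the quota.
 * It returns the number of extra bytes it grants (0 to refuse). */
typedef size_t (*keeper_fn)(vm_heap *heap, size_t shortfall, void *state);

struct vm_heap {
    size_t quota;        /* maximum bytes this VM process may hold */
    size_t used;         /* bytes currently charged to it (used <= quota) */
    keeper_fn keeper;    /* may be NULL: over-quota requests simply fail */
    void *keeper_state;  /* private state passed back to the keeper */
};

/* Each block is prefixed by its size so vm_free can credit the heap.
 * The union keeps the user pointer maximally aligned. */
typedef union { size_t size; max_align_t align; } block_header;

void vm_heap_init(vm_heap *h, size_t quota, keeper_fn keeper, void *state) {
    assert(h != NULL);
    h->quota = quota;
    h->used = 0;
    h->keeper = keeper;
    h->keeper_state = state;
}

void *vm_alloc(vm_heap *h, size_t size) {
    if (size > SIZE_MAX - sizeof(block_header)) return NULL;
    if (size > h->quota - h->used) {
        /* Over quota: let the keeper decide whether to extend it. */
        size_t shortfall = size - (h->quota - h->used);
        size_t granted = h->keeper ? h->keeper(h, shortfall, h->keeper_state) : 0;
        if (granted < shortfall || granted > SIZE_MAX - h->quota) return NULL;
        h->quota += granted;
    }
    block_header *b = malloc(sizeof *b + size);
    if (b == NULL) return NULL;
    b->size = size;
    h->used += size;
    return b + 1;
}

void vm_free(vm_heap *h, void *p) {
    if (p == NULL) return;
    block_header *b = (block_header *)p - 1;
    h->used -= b->size;
    free(b);
}

/* A toy "space bank": grants exactly the shortfall from a fixed reserve. */
typedef struct { size_t reserve; } space_bank;

size_t bank_keeper(vm_heap *h, size_t shortfall, void *state) {
    space_bank *bank = state;
    (void)h;
    if (bank->reserve < shortfall) return 0;
    bank->reserve -= shortfall;
    return shortfall;
}
```

A host would create one vm_heap per VM process and route that process's allocations through vm_alloc, so a runaway allocator exhausts only its own quota (plus whatever the bank is willing to grant). This sketch only does the accounting; it does not isolate the address ranges of different heaps.
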
We have a research project that is developing a Secure Virtual Architecture (SVA) using LLVM as the instruction set, implemented via a VM which we call a Secure Virtual Machine. The memory safety foundations of this work are based on Dinakar Dhurjati's thesis and publications:

http://llvm.org/pubs/

SVA is at a very preliminary stage but some slides about it are attached.

[Attachment scrubbed: 2007-SVAOverview.ppt, application/vnd.ms-powerpoint, 827904 bytes]
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20070602/cd89eaf5/attachment.ppt>

--Vikram
http://www.cs.uiuc.edu/~vadve
http://llvm.org

On Jun 2, 2007, at 12:45 PM, Sandro Magi wrote:
> Many VMs focus on performance, optimizations, memory consumption, etc.
> but very few, if any, focus on fault isolation and security.
> [...]
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev

SVA looks very promising. It would be great to be able to run unmodified C safely! However, it does not seem to address my original question: how can I ensure that code cannot DoS either the memory subsystem, or the CPU? In my proposal, I could execute said code in a concurrent process with a memory quota. How would SVA address that problem?

Sandro

On 6/2/07, Vikram S. Adve <vadve at uiuc.edu> wrote:
> We have a research project that is developing a Secure Virtual
> Architecture using LLVM as the instruction set, and implementing via
> a VM which we call a Secure Virtual Machine. The memory safety
> foundations of this work are based on Dinakar Dhurjati's thesis and
> publications:
> http://llvm.org/pubs/
>
> SVA is at a very preliminary stage but some slides about it are
> attached.

Let me cut it down to the core problem: I'm asking about the feasibility of extending LLVM with constructs to manage separate heaps. Given my current understanding of LLVM, I can see this done in two ways:

1. Add heap management instructions to the core instructions, and either modify the allocation routines to explicitly name heaps or modify the runtime to rebind the allocation routines depending on some VM-level context that names a heap (thread-local storage?).

2. Add intrinsics to start a new heap (via a new ExecutionEngine?). This would involve modifying the VM to accept allocation primitives as function pointers.

So a program or language with real-time constraints, where an incremental GC is preferable for some tasks and an efficient, non-incremental GC is used for others, can be expressed as partitioned heaps, each with its own GC.

Sandro

On 6/2/07, Sandro Magi <naasking at gmail.com> wrote:
> Many VMs focus on performance, optimizations, memory consumption, etc.
> but very few, if any, focus on fault isolation and security.
> [...]

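As a rough illustration of the "rebind the allocation routines through a context that names a heap" idea from the follow-up above, here is a small C sketch. It assumes a C11 compiler (for _Thread_local), and every name in it (heap_ops, heap_bind, rt_malloc, rt_free, counting_state) is hypothetical; it shows neither how LLVM's runtime works nor how an ExecutionEngine would actually be parameterized.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical allocator interface: a heap is a pair of allocation
 * primitives passed as function pointers, plus private state. */
typedef struct heap_ops {
    void *(*alloc)(void *state, size_t size);
    void (*release)(void *state, void *ptr);
    void *state;
} heap_ops;

/* The heap the running thread is currently bound to.
 * NULL means "use the default heap" (plain malloc/free). */
static _Thread_local const heap_ops *current_heap = NULL;

/* Rebind the calling thread to heap h and return the previous binding,
 * so callers can restore it when the guest code returns. */
const heap_ops *heap_bind(const heap_ops *h) {
    assert(h == NULL || (h->alloc != NULL && h->release != NULL));
    const heap_ops *prev = current_heap;
    current_heap = h;
    return prev;
}

/* The routines that compiled code would call instead of malloc/free. */
void *rt_malloc(size_t size) {
    if (current_heap != NULL)
        return current_heap->alloc(current_heap->state, size);
    return malloc(size);
}

void rt_free(void *ptr) {
    if (current_heap != NULL)
        current_heap->release(current_heap->state, ptr);
    else
        free(ptr);
}

/* Example heap: counts live allocations on top of malloc. */
typedef struct { size_t live; } counting_state;

static void *counting_alloc(void *state, size_t size) {
    counting_state *s = state;
    void *p = malloc(size);
    if (p != NULL) s->live++;
    return p;
}

static void counting_release(void *state, void *ptr) {
    counting_state *s = state;
    if (ptr != NULL) {
        s->live--;
        free(ptr);
    }
}
```

One design caveat: a pointer must be released under the same binding it was allocated under. This sketch does not record which heap owns a block, so it cannot detect a mismatch.
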
Sandro Magi wrote:
> Let me cut it down to the core problem: I'm asking about the
> feasibility of extending LLVM with constructs to manage separate
> heaps. Given my current understanding of LLVM, I can see this done in
> two ways:

If you just need to partition the heap into multiple heaps, then the easiest thing to do would be to replace the use of malloc/free instructions with calls to library functions that implement your segmented heap allocation/free functions.

For example, in the Automatic Pool Allocation work (http://llvm.org/pubs/2005-05-21-PLDI-PoolAlloc.html), we have an LLVM pass that changes malloc instructions:

    %tmp = malloc struct {i8} ...

into calls to an allocation function that takes a pool identifier and an allocation size as arguments (in this work, we segregated the heap based upon pointer analysis results):

    %tmp = call %poolalloc (sbyte * PoolID, uint 8)

The poolalloc function is then implemented as a run-time library (written in C) that is compiled and linked into the program (either as a native code library or an LLVM bytecode library).

You could do something similar to implement multiple heaps. Your proposed methods below (adding intrinsics or new core instructions) would work too, but using memory allocator functions does the same thing with less work. Adding intrinsics or new core instructions is only useful in a few rare cases, such as when you need special code generator support or need to extend the type system.

> 1. Add heap management instructions to the core instructions, modify
> allocation routines to explicitly name heaps or modify the runtime to
> rebind the allocation routines depending on some VM-level context that
> names a heap (thread-local storage?).
>
> 2. Add intrinsics to start a new heap (via a new ExecutionEngine?).
> This would involve modifying the VM to accept allocation primitives as
> function pointers.
>
> So a program or language with real-time constraints where an
> incremental GC is preferable, and where an efficient, non-incremental
> GC is used for other tasks, can be expressed as partitioned heaps each
> with their own GC.

Doing GC may require using the LLVM GC intrinsics as described in this document (http://llvm.org/docs/GarbageCollection.html), but just segmenting the heap into multiple heaps should not require any new instructions or intrinsics to be added.

-- John T.

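For readers wondering what a run-time library of the kind John describes might look like, here is a minimal C sketch of an allocation function that takes a pool identifier and a size. It is not the Automatic Pool Allocation runtime: seg_pool, seg_poolinit, seg_poolalloc, seg_pool_contains and seg_pooldestroy are hypothetical names, and a simple bump-pointer scheme with no per-object free is assumed to keep it short.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical pool descriptor; its address serves as the pool identifier
 * that rewritten allocation sites would pass in. */
typedef struct seg_pool {
    unsigned char *base;  /* start of the pool's backing memory */
    size_t capacity;      /* total bytes available in the pool */
    size_t offset;        /* next free byte (bump pointer) */
} seg_pool;

/* Reserve the pool's memory up front. Returns 1 on success, 0 on failure. */
int seg_poolinit(seg_pool *pool, size_t capacity) {
    assert(pool != NULL);
    pool->base = malloc(capacity);
    pool->capacity = pool->base != NULL ? capacity : 0;
    pool->offset = 0;
    return pool->base != NULL;
}

/* Allocate size bytes from the given pool, or return NULL when the pool
 * is exhausted. Sizes are rounded up to keep every block aligned. */
void *seg_poolalloc(seg_pool *pool, size_t size) {
    const size_t align = _Alignof(max_align_t);
    if (size > SIZE_MAX - (align - 1)) return NULL;
    size_t rounded = (size + align - 1) & ~(align - 1);
    if (rounded > pool->capacity - pool->offset) return NULL;
    void *p = pool->base + pool->offset;
    pool->offset += rounded;
    return p;
}

/* Bounds test: does p point into the allocated part of this pool? */
int seg_pool_contains(const seg_pool *pool, const void *p) {
    uintptr_t addr = (uintptr_t)p;
    uintptr_t lo = (uintptr_t)pool->base;
    return addr >= lo && addr - lo < pool->offset;
}

/* Release the whole pool at once. */
void seg_pooldestroy(seg_pool *pool) {
    free(pool->base);
    pool->base = NULL;
    pool->capacity = 0;
    pool->offset = 0;
}
```

Because each pool has a fixed capacity, it also acts as a crude memory quota, and a range test like seg_pool_contains is the kind of dynamic check the original message suggests for integer-to-pointer casts.
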