Hi all, I am trying to identify when a write to memory (store) is safe or not (based on LLVM IR). I refer "safe" to spatial safety (no out of bounds write) and not temporal safety (no double free() etc). My current approach would be to declare every pointer as unsafe that is computed somewhere by a GEP instruction with non constant indices, as well as constant indices that can be proven to be out of bounds (but that would be more a thing for the compiler to complain). Is that approach about right? Are there other situations where spatial safety might be violated? Are there already analysis that can determine something like this? Thanks you. Regards, Fredi
Jonas Wagner via llvm-dev
2016-Oct-04 07:54 UTC
[llvm-dev] When is a store not (memory) safe?
Hi, My current approach would be to declare every pointer as unsafe that is> computed somewhere by a GEP instruction with non constant indices, as well > as constant indices that can be proven to be out of > bounds (but that would be more a thing for the compiler to complain). >Are you assuming that you see the entire program (something like using LTO)? Otherwise one of the main difficulties will be that the provenance of pointers are unknown. For example, you might see a function that receives a pointer argument, and have no idea where that function is called from. Or some code might use a global, and you don't know what the value of the global is. Or a pointer is read from memory, and you don't know which instruction modified it. There is definitely a lot of work already in this area. For large real-world software, I believe the idea of proving memory safety is not feasible yet; for those people tend to use dynamic (run-time) checks instead, e.g., AddressSanitizer <http://clang.llvm.org/docs/AddressSanitizer.html>. For statically proving accesses, there are ideas such as abstract interpretation, model checking, CEGAR, separation logic... I think a good place to get an overview of current tools is the software verification competition <https://sv-comp.sosy-lab.org/2016/>. Some of the participants there are open-source and based on LLVM. Cheers, Jonas -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20161004/48d1a008/attachment.html>
Thanks for your hints, they give me a good overview. I am not trying to achieve complete memory safety. I wrote a pass that does some safety checks for every store instruction. Now I want to optimize my pass by avoiding to instrument checks for store instructions that are somehow provable "memory safe". So, I am only looking for some easy and obvious cases where I can optimize for performance. On 10/04/2016 09:54 AM, Jonas Wagner wrote:> Hi, > > My current approach would be to declare every pointer as unsafe that is computed somewhere by a GEP instruction with non constant indices, as well as constant indices that can be proven to be out of > bounds (but that would be more a thing for the compiler to complain). > > > Are you assuming that you see the entire program (something like using LTO)? Otherwise one of the main difficulties will be that the provenance of pointers are unknown. For example, you might see a > function that receives a pointer argument, and have no idea where that function is called from. Or some code might use a global, and you don't know what the value of the global is. Or a pointer is > read from memory, and you don't know which instruction modified it. > > There is definitely a lot of work already in this area. For large real-world software, I believe the idea of proving memory safety is not feasible yet; for those people tend to use dynamic > (run-time) checks instead, e.g., AddressSanitizer <http://clang.llvm.org/docs/AddressSanitizer.html>. > > For statically proving accesses, there are ideas such as abstract interpretation, model checking, CEGAR, separation logic... I think a good place to get an overview of current tools is the software > verification competition <https://sv-comp.sosy-lab.org/2016/>. Some of the participants there are open-source and based on LLVM. > > Cheers, > Jonas >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20161004/e8d1afa7/attachment.html>