Hi,

I am a final-year French student doing an internship at the University of Portsmouth. While getting hands-on with AddressSanitizer I took a look at BoundsChecking (both are in the lib/Transforms/Instrumentation folder). I found nothing on it except the LLVM documentation and references to BaggyBoundsCheck (which is not the same project; as far as I understand it is part of the SAFECode project).

Does anyone know about it (BoundsChecking)? I have some questions, which I will try to explain below.

I slightly modified the registration of the pass (the BoundsChecking one) so that a .so file is generated when LLVM is rebuilt. I then ran LLVM opt, loading the .so, on a C program that performs both a stack and a heap overflow:
- clang -emit-llvm overflow.c -c -o overflow.bc
- opt -load path-to-so/LLVMBoundsChecking.so -options < overflow.bc > overflow_instrumented.bc

I then ran llc and gcc to get an executable:
- llc -filetype=obj overflow_instrumented.bc (generates a .o file with the same name)
- gcc overflow_instrumented.o -o overflow_instrumented

Once launched, the executable detects the stack access and crashes the program (you can see the checks in the assembly code, each followed by a conditional jump to a UD2 instruction, which basically crashes the program), but nothing is instrumented for the heap access. The BoundsChecking file says that run-time checks are made, but I don't see them.

So my questions are:
- is any heap checking done?
- if yes, where is it?

I am interested in this because I think I am going to try to do for the heap the same work that was done for the stack.

Thank you for your help, any information or advice is welcome :)

Pierre
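The test program itself is not included in the message. For readers who want to reproduce the experiment, the following is only a guess at what such an overflow.c might contain: the function names, the array size N and the index parameter are all hypothetical. Each function allocates a small buffer and reads one element, so an index equal to N reads one element past the end.

```c
#include <stddef.h>
#include <stdlib.h>

enum { N = 4 }; /* hypothetical buffer length, not from the original post */

/* Stack case: the array is a local, so its size is known in this function. */
int stack_read(size_t i) {
  int buf[N] = {1, 2, 3, 4};
  return buf[i]; /* i == N would read one element past the end */
}

/* Heap case: the buffer comes from malloc in the same function. */
int heap_read(size_t i) {
  int *buf = malloc(N * sizeof *buf);
  if (buf == NULL)
    return -1;
  for (size_t k = 0; k < N; k++)
    buf[k] = (int)k + 1;
  int value = buf[i]; /* i == N would read one element past the end */
  free(buf);
  return value;
}
```

Calling stack_read(N) or heap_read(N) from main would be the out-of-bounds access; any index below N is a valid read.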
Hi Pierre,

I'm the author of the BoundsChecking pass. It's true there's little documentation about it (it is only mentioned in: http://clang.llvm.org/docs/UndefinedBehaviorSanitizer.html#availablle-checks).
You can run it with 'clang -fsanitize=bounds' or 'opt -bounds-checking'.

The BoundsChecking pass, AddressSanitizer and BaggyBoundsCheck are all different code bases, each exploring a different set of tradeoffs. The goal of the BoundsChecking pass was that the runtime penalty should be low enough to enable usage in production.

Some information about the BoundsChecking pass:
- It is intra-procedural only. If you dereference a pointer that was passed as an argument, then it is not checked (with some exceptions).
- It supports heap allocations, provided that these allocations are done using 1) standard functions that LLVM recognizes (malloc, new, strdup, etc.) or 2) functions that are annotated with alloc_size (https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html).
- It's helpful to compile with -O2; otherwise the pass will get confused very quickly. The design of the analysis assumes that at least a few simplifications were done beforehand.
- Sometimes LLVM transforms loops into intrinsics, like memcpy or memset. Right now these are not checked (but they should be).
- Guards are mostly not hoisted out of loops by LLVM; this needs improvement, otherwise performance may suffer quite a bit.
- The analysis code is in lib/Analysis/MemoryBuiltins.cpp

Hope this helps. Please let us know if you have more questions.

Nuno

-----Original Message-----
From: Pierre Gagelin via llvm-dev
Sent: Friday, May 20, 2016 11:16 AM
To: llvm-dev at lists.llvm.org
Subject: [llvm-dev] BoundsChecking Pass
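Two of the points in Nuno's list (intra-procedural analysis and alloc_size) can be illustrated with a small C sketch. This is an illustration under assumptions, not code from LLVM or from either poster: my_alloc, sum_local and read_arg are hypothetical names, and whether a given access actually gets a check depends on the compiler version and optimization level.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical allocator wrapper. alloc_size(1) tells the compiler that
   the returned block is as large as the first argument. */
__attribute__((alloc_size(1)))
void *my_alloc(size_t n) {
  return malloc(n);
}

/* Allocation and access live in the same function, so an intra-procedural
   analysis can relate each buf[i] to the allocated size. */
int sum_local(size_t count) {
  int *buf = my_alloc(count * sizeof *buf);
  if (buf == NULL)
    return -1;
  int sum = 0;
  for (size_t i = 0; i < count; i++) {
    buf[i] = (int)i;
    sum += buf[i];
  }
  free(buf);
  return sum;
}

/* The pointer arrives as an argument: the size of the object behind p is
   not visible here, which is the unchecked case described above. */
int read_arg(const int *p, size_t i) {
  return p[i];
}
```

The contrast between sum_local and read_arg is the point: the same buf[i] expression is analyzable in one and opaque in the other, purely because of where the allocation happens.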
Hi Nuno,

On 22 May 2016 at 22:33, Nuno Lopes <nunoplopes at sapo.pt> wrote:

> Hi Pierre,
>
> I'm the author of the BoundsChecking pass.

Wow, I am happily surprised to get an answer from you directly!

> It's true there's little documentation about it (only mentioned in:
> http://clang.llvm.org/docs/UndefinedBehaviorSanitizer.html#availablle-checks).
> You can run it with 'clang -fsanitize=bounds' or 'opt -bounds-checking'.
> The BoundsChecking pass, AddressSanitizer and BaggyBoundsCheck are all
> different code bases, each exploring a different set of tradeoffs. The
> goal of the BoundsChecking pass was that the runtime penalty should be low
> enough to enable usage in production.
>
> Some information about the BoundsChecking pass:
> - It is intra-procedural only. If you dereference a pointer that was
> passed as argument, then it is not checked (with some exceptions).
> - It supports heap allocations, provided that these allocations are done
> using 1) standard functions that LLVM recognizes (malloc, new, strdup, etc)
> or 2) functions are annotated with alloc_size (
> https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html)
> - It's helpful to compile with -O2, otherwise the pass will get confused
> very quickly. The design of the analysis assumes at least a few
> simplifications were done before.

OK, I just compiled it with -O2 and the heap overflow protection was triggered. However, I don't know which simplifications the pass requires in order to run correctly.

> - Sometimes LLVM transforms loops into intrinsics, like memcpy or memset.
> Right now these are not checked (but should, though)
> - Guards are mostly not hoisted out of loops by LLVM; this needs
> improvement otherwise perf may suffer quite a bit.

Are you still working on it? If so, what are you trying to do? I would like to work on this pass during the summer (until the end of August). It would be great if you could guide me a little bit =)

> - The analysis code is in lib/Analysis/MemoryBuiltins.cpp

I have a question about this. As I read the code I was wondering how the run-time part is implemented. I was looking for something like a redefinition of the malloc and free functions, but I found none. Now I'm wondering whether it comes down to the run-time action of the ObjectSizeOffsetEvaluator class, which is used to get the size and offset of the current array pointer?

> Hope this helps. Please let us know if you have more questions.

This already helped a lot, thank you!

> Nuno

Pierre
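On the run-time question: the thread itself only establishes that the checks show up inline in the assembly, each followed by a conditional jump to UD2. To the best of my understanding of BoundsChecking.cpp, the pass compares the object size and the pointer's offset (both obtained through ObjectSizeOffsetEvaluator) against the size of the access, rather than replacing malloc and free. The C function below is only a model of that comparison under this assumption; access_in_bounds and its parameter names are hypothetical, and the real pass emits LLVM IR, not C.

```c
#include <stdbool.h>
#include <stdint.h>

/* Model of the inline guard. Returns true when an access of `needed` bytes
   at byte `offset` from the start of an object of `size` bytes stays inside
   the object. In instrumented code, the false case branches to a trap. */
bool access_in_bounds(uint64_t size, int64_t offset, uint64_t needed) {
  if (offset < 0)
    return false; /* pointer moved before the start of the object */
  uint64_t off = (uint64_t)offset;
  if (size < off)
    return false; /* pointer already past the end */
  return size - off >= needed; /* enough bytes left for the access */
}
```

For a 16-byte object, a 4-byte access at offset 12 passes and the same access at offset 13 fails. Writing the last test as size - off >= needed, after checking size >= off, avoids the unsigned overflow that off + needed <= size could hit.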