Dear all,

I want to implement a pass that provides some kind of data flow integrity similar to Write Integrity Testing (https://www.doc.ic.ac.uk/~cristic/papers/wit-sp-ieee-08.pdf).

This approach statically determines, for each memory write, the (conservative, overapproximated) points-to set of locations that can be written by the instruction. Further, it instruments the memory write instruction to prevent a write to a location not in the points-to set.

How can I get the points-to set, including locations from stack/heap/static variables?
How do I approach this problem in general?

I am new to LLVM.

Thank you!

Regards,
– Fredi
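To make the scheme in the question concrete, here is a minimal sketch of the run-time side only. It assumes a color-table design in which every object and every write instruction is given a small integer color derived from the static points-to sets, and a write is allowed only when the two colors match. The names `Memory`, `allocate`, `checked_write`, and `WriteIntegrityError` are hypothetical and exist only for this illustration; they are not part of LLVM or of any WIT implementation.

```python
class WriteIntegrityError(Exception):
    """Raised when a write targets memory outside its static points-to class."""


class Memory:
    """Toy memory model with a per-address color table (illustrative only)."""

    def __init__(self):
        self.cells = {}        # address -> stored value
        self.color_table = {}  # address -> color of the object living there

    def allocate(self, base, size, color):
        """Register an object of `size` cells at `base` with the given color."""
        for addr in range(base, base + size):
            self.cells[addr] = 0
            self.color_table[addr] = color

    def checked_write(self, addr, value, expected_color):
        """Instrumented store: succeeds only if the target has the expected color.

        Color 0 is reserved for unallocated memory; write instructions are
        assumed to be assigned colors >= 1.
        """
        actual = self.color_table.get(addr, 0)
        if actual != expected_color:
            raise WriteIntegrityError(
                f"write to {addr}: color {actual}, expected {expected_color}")
        self.cells[addr] = value
```

For example, if a buffer occupies addresses 100 to 103 with color 1 and an unrelated variable sits at address 104 with color 2, a write instruction whose points-to set contains only the buffer carries color 1. `checked_write(101, 7, expected_color=1)` succeeds, while an overflowing `checked_write(104, 9, expected_color=1)` raises `WriteIntegrityError` before the store happens.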
John Criswell via llvm-dev
2016-Aug-01 19:58 UTC
[llvm-dev] Implementing Data Flow Integrity
On 7/31/16 11:48 AM, Fee via llvm-dev wrote:
> Dear all,
>
> I want to implement a pass that provides some kind of data flow
> integrity similar to Write Integrity Testing
> (https://www.doc.ic.ac.uk/~cristic/papers/wit-sp-ieee-08.pdf).
>
> This approach statically determines for each memory write the
> (conservative, overapproximated) points-to set of locations that can be
> written by the instruction. Further, it instruments the memory write
> instruction to prevent a write to a location not in the points-to set.

Correct. I would also point out that their use of Andersen's analysis is (most likely) unnecessary. Because they unify points-to sets before instrumenting, they are modifying the end result of the inclusion-based analysis to be what a unification-based points-to analysis would have computed. It is not clear to me that anything can be gained by using inclusion-based analysis over unification-based analysis.

> How can I get the points-to set, including locations from
> stack/heap/static variables?
> How do I approach this problem in general?

To the best of my knowledge, the existing LLVM alias analysis passes only provide a mod/ref and aliasing query interface. I don't believe they provide a shape graph or points-to sets that can be easily used. You might want to check CFL-AA to see what it provides, but I have personally never used it.

You could use DSA, located in the poolalloc project, which provides a shape graph. The original SAFECode essentially did what WIT does (except that it also protected memory reads and used a very different run-time check mechanism, plus it could optimize away provably type-safe checks). SAFECode used DSA's shape graphs to segregate the heap, find points-to sets, and learn memory object type information. However, in its current state, you'd need to run DSA prior to most LLVM optimizations to get good field sensitivity. Otherwise, DSA will lose field sensitivity and provide poor precision in its results.

As I need something similar for my research work, my research group will be working on either improving or replacing DSA. However, it will be a while, so if you need something now, either CFL-AA or DSA will be your best bet.

> I am new to LLVM.

Welcome to the club.

Regards,

John Criswell

--
John Criswell
Assistant Professor
Department of Computer Science, University of Rochester
http://www.cs.rochester.edu/u/criswell
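The point about unifying points-to sets can be illustrated with a small sketch. It assumes the inclusion-based analysis has already produced one points-to set per write instruction, and it merges every pair of overlapping sets into a single class, which is the step that discards the extra precision. The function name `unify_points_to_sets` and the input format are hypothetical and chosen only for this example; the sketch does not claim to reproduce what WIT or any particular unification-based analysis computes in full.

```python
def unify_points_to_sets(points_to):
    """Merge overlapping points-to sets into disjoint equivalence classes.

    points_to maps a (hypothetical) write-instruction id to the set of
    abstract objects that the write may target.
    """
    classes = []  # invariant: the sets in this list are pairwise disjoint
    for objs in points_to.values():
        if not objs:
            continue
        merged = set(objs)
        remaining = []
        for cls in classes:
            if cls & merged:
                merged |= cls          # overlapping class is absorbed
            else:
                remaining.append(cls)  # untouched class is kept as-is
        remaining.append(merged)
        classes = remaining
    return classes


# Inclusion-based result: w1 and w2 have different sets that share object "b".
inclusion_result = {
    "w1": {"a", "b"},
    "w2": {"b", "c"},
    "w3": {"d"},
}
classes = unify_points_to_sets(inclusion_result)
```

After merging, `w1` and `w2` both fall into the class `{a, b, c}`, so the distinction the inclusion-based analysis drew between them no longer affects which writes are permitted at run time.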
Hi John Criswell,

Thank you for your helpful answer.

I think CFL Alias Analysis is not the right way to go, because it seems to avoid building whole points-to sets in order to be more efficient (correct me if I am wrong). So it would favor compile-time performance over run-time security, which is bad.

DSA from poolalloc seems promising; I need to check it.

Do you think that an implementation of Andersen's pointer analysis (like https://github.com/grievejia/andersen) would work? It does not seem to be field-sensitive. Do you have any idea how hard it is to make an existing analysis field-sensitive?

Regards,
Fredi
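As a rough illustration of what field sensitivity changes, the following sketch runs a tiny flow-insensitive analysis over hypothetical store and load facts. In the field-insensitive mode all fields of an object share one abstract location; in the field-sensitive mode each (object, field) pair is its own location. The function `analyze` and the tuple formats are assumptions made for this example and do not correspond to any existing LLVM or DSA interface.

```python
from collections import defaultdict


def analyze(stores, loads, field_sensitive):
    """Compute what each loaded pointer may point to.

    stores: iterable of (obj, field, target), meaning obj.field = &target
    loads:  iterable of (dest, obj, field),   meaning dest = obj.field
    """
    def location(obj, field):
        # Field-insensitive mode collapses every field of obj into one location.
        return (obj, field) if field_sensitive else (obj, None)

    contents = defaultdict(set)
    for obj, field, target in stores:
        contents[location(obj, field)].add(target)

    return {dest: set(contents[location(obj, field)])
            for dest, obj, field in loads}


# s.f = &x; s.g = &y; p = s.f
stores = [("s", "f", "x"), ("s", "g", "y")]
loads = [("p", "s", "f")]

insensitive = analyze(stores, loads, field_sensitive=False)
sensitive = analyze(stores, loads, field_sensitive=True)
```

Without field sensitivity, `p` may point to both `x` and `y`; with it, `p` may point only to `x`. For a write check, the smaller set means fewer objects that a write through `p` is allowed to reach.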