Hi,

A few days ago I became interested in detecting undefined behavior in C programs using Clang. After several days of investigation, I think checking for bounds overflow bugs is more interesting, because bounds overflow is one of the most frequently encountered errors in C programs. For example, performing pointer arithmetic without checking bounds can cause a bounds overflow.

To increase the accuracy of bug finding, I want to write several passes, based on slicing, inlining, and summary functions / (partial) transition functions, to implement inter-procedural analysis.

Is anyone interested in this project? I need a mentor and look forward to your reply.

Best regards,
Qiuping Yi
John Regehr
2010-Mar-30 14:36 UTC
[LLVMdev] summer of code idea — checking bounds overflow bugs
Qiuping,

Have you looked at what has already been done? I would expect that taking previous work such as this:

http://llvm.org/pubs/2006-05-24-SAFECode-BoundsCheck.html

and integrating it into current LLVM would be a better idea than starting over.

John
John Criswell
2010-Mar-30 14:42 UTC
[LLVMdev] summer of code idea — checking bounds overflow bugs
John Regehr wrote:
> Have you looked at what has already been done? I would expect that taking
> previous work such as this:
>
> http://llvm.org/pubs/2006-05-24-SAFECode-BoundsCheck.html
>
> and integrating into current LLVM would be a better idea than starting
> over.

This code is publicly available from the SAFECode project (see http://safecode.cs.illinois.edu to see how to get it). However, it has not been maintained well over the years and is currently disabled. Getting it to work again with LLVM 2.6 or replacing it with something better would be nice.

I'm writing up a response to this project idea as I'm willing to mentor it; I'll send it out shortly.

-- John T.
John Criswell
2010-Mar-30 15:42 UTC
[LLVMdev] summer of code idea — checking bounds overflow bugs
易秋萍 wrote:
> To increase the accuracy of finding bugs, I want to write several passes,
> based on slicing, inlining, and summary functions / (partial) transition
> functions, to implement inter-procedural analysis.
>
> Does some person have interest in the project? I need a mentor,
> and wait for your reply.

The SAFECode project (http://safecode.cs.illinois.edu) is very interested in having a static array bounds checking analysis pass. However, we're not interested in a static analysis tool, and we're not interested in something that works on Clang's IR. What we want is one or more LLVM IR analysis passes that tell us which getelementptr (GEP) instructions in a program are statically known not to overflow (this allows us to eliminate run-time checks for such instructions).

Such a pass exists in the SAFECode project's source tree, but it has not been maintained well over the years and has fallen into disuse (it also had some engineering limitations, such as using exec() to repeatedly execute the Omega compiler for constraint solving). Either getting that code to work again or replacing it with something better would be beneficial.

If you'd be willing to work on something that works with SAFECode, I'd be willing to mentor your project. I do, however, have one condition: you need to find a specific static array bounds checking algorithm and understand how it works. I don't see a specific algorithm or paper reference above.

A good starting point for algorithms might be Dinakar's paper (see Section 5): http://llvm.org/pubs/2005-02-TECS-SAFECode.html. There has also been talk of implementing the ABCD algorithm in LLVM (http://portal.acm.org/citation.cfm?id=349342); you may want to read about that algorithm as well.

As an aside, I've written a pass that finds the inter-procedural static backwards slice of an LLVM value (disclaimer: it uses DSA to find the targets of indirect function calls). I'm pretty sure we could release it to the public if you find that you need it for your project.

-- John T.
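
The goal stated above, finding the GEP instructions that are statically known not to overflow so their run-time checks can be dropped, can be illustrated with a minimal sketch in standard C++. This is not LLVM code and not the SAFECode pass: `IndexRange`, `ArrayAccess`, `provablyInBounds`, and `needsRuntimeCheck` are hypothetical names, and the sketch assumes some earlier analysis has already produced a conservative range for each index.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the value range an analysis computed for an index.
struct IndexRange {
    std::int64_t lo;  // smallest value the index may take
    std::int64_t hi;  // largest value the index may take
};

// Hypothetical stand-in for one GEP into an array of known length.
struct ArrayAccess {
    const char *name;
    std::int64_t length;  // number of elements in the array
    IndexRange index;
};

// True when every possible index lies in [0, length), so the
// run-time bounds check for this access is redundant.
bool provablyInBounds(const ArrayAccess &a) {
    assert(a.length >= 0 && a.index.lo <= a.index.hi);
    return a.index.lo >= 0 && a.index.hi < a.length;
}

// Returns the accesses that still need a run-time check.
std::vector<ArrayAccess> needsRuntimeCheck(const std::vector<ArrayAccess> &all) {
    std::vector<ArrayAccess> remaining;
    for (const ArrayAccess &a : all) {
        if (!provablyInBounds(a))
            remaining.push_back(a);
    }
    return remaining;
}
```

With a 10-element array, an index range of [0, 9] is provably safe, while [0, 10] or [-1, 3] keeps its run-time check. The hard part of a real pass is computing tight ranges in the first place, which is where the algorithms cited above come in.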
John Regehr
2010-Mar-30 16:33 UTC
[LLVMdev] summer of code idea — checking bounds overflow bugs
John,

A couple of questions:

Can you explain the SAFECode model in a bit more detail? I am getting conflicting information. On one hand, some of the papers describe a system that is primarily designed to hide safety violations. On the other hand, the 2006 ICSE paper that I cited earlier today seems to be talking about catching violations. These are very different goals! What does the code in the SAFECode repository actually do?

Can you comment on the speed of LLVM when shelling out to Omega? My guess would be that this will result in unacceptable compile times for large software, and that something fast and relatively simple like ABCD is a better choice for general usage.

Finally, a comment: it's clear that a comprehensive system for trapping undefined behavior in Clang is a multi-year project. Some parts of this must live in Clang. Some parts, such as bounds check optimizations, should go into LLVM passes. Anyway, I'm just saying that the project you outline seems to fit very well into the overall vision of detecting undefined behavior in C programs.

John
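
To give a feel for why a graph-based approach can be fast and simple compared with calling an external constraint solver, here is a minimal sketch in standard C++ of proving a bounds check from difference constraints of the form `v - u <= c`. It is not the ABCD algorithm as published (which, as far as I know, works on demand over an SSA-based inequality graph and treats control-flow merges specially); it only shows the underlying idea that the shortest path between two variables gives the tightest implied bound. `DifferenceGraph` and its methods are hypothetical names, and the sketch assumes the constraint set is satisfiable (no negative cycle).

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <vector>

// One constraint of the form  value[to] - value[from] <= weight.
struct Constraint {
    int from;
    int to;
    std::int64_t weight;
};

class DifferenceGraph {
public:
    explicit DifferenceGraph(int numVars) : numVars_(numVars) {
        assert(numVars > 0);
    }

    // Records the fact  value[to] - value[from] <= weight.
    void add(int from, int to, std::int64_t weight) {
        assert(valid(from) && valid(to));
        edges_.push_back({from, to, weight});
    }

    // True if the recorded facts imply  value[to] - value[from] <= bound.
    // Bellman-Ford: the shortest path from `from` to `to` is the tightest
    // bound that follows from chaining the constraints.
    bool proves(int from, int to, std::int64_t bound) const {
        assert(valid(from) && valid(to));
        const std::int64_t inf = std::numeric_limits<std::int64_t>::max();
        std::vector<std::int64_t> dist(numVars_, inf);
        dist[from] = 0;
        for (int round = 0; round + 1 < numVars_; ++round) {
            for (const Constraint &e : edges_) {
                if (dist[e.from] != inf && dist[e.from] + e.weight < dist[e.to])
                    dist[e.to] = dist[e.from] + e.weight;
            }
        }
        return dist[to] != inf && dist[to] <= bound;
    }

private:
    bool valid(int v) const { return v >= 0 && v < numVars_; }

    int numVars_;
    std::vector<Constraint> edges_;
};
```

For a loop such as `for (i = 0; i < n; ++i) a[i]` with `n <= len`, the facts `n - len <= 0` and `i - n <= -1` chain into `i - len <= -1`, so the upper bounds check on `a[i]` is redundant. Everything stays in-process, with no external solver invocation.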
罗勇刚(Yonggang Luo)
2010-Mar-31 01:19 UTC
[LLVMdev] summer of code idea — checking bounds overflow bugs
Sounds like a good idea. Does that mean lowering the SAFECode project from the higher level (Clang) to a lower level, for more general bounds checking?

I also want to know whether it is possible to detect memory leaks at the very low (LLVM IR) level, or to provide stack hooks at the LLVM IR level. Such a feature would be very useful: stack hooks could print extra stack information at run time without modifying the original code, which would make bugs easier to find :)

I think that for memory leaks, we just need to modify the internal method llvm.malloc and add hooks to it. For stack hooks, we could modify the calling conventions :)

Yours sincerely,
Yonggang Luo
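
The allocation-hook idea can be sketched in standard C++ as follows. This is only an illustration under stated assumptions: `AllocationTracker`, `trackedMalloc`, and `trackedFree` are hypothetical names for what an instrumentation pass might call in place of the program's allocation and deallocation sites; nothing here is an LLVM API.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <map>

// Records every live heap block; whatever is still recorded at exit leaked.
class AllocationTracker {
public:
    // Hook to run after each successful allocation.
    void onAlloc(const void *p, std::size_t size) {
        if (p != nullptr)
            live_[p] = size;
    }

    // Hook to run before each deallocation. Returns false when the pointer
    // is not a tracked live block (for example, a double free).
    bool onFree(const void *p) {
        return p == nullptr || live_.erase(p) == 1;
    }

    std::size_t liveBlocks() const { return live_.size(); }

    std::size_t liveBytes() const {
        std::size_t total = 0;
        for (const auto &entry : live_)
            total += entry.second;
        return total;
    }

private:
    std::map<const void *, std::size_t> live_;
};

// Hypothetical wrapper that instrumented code would call instead of malloc.
void *trackedMalloc(AllocationTracker &t, std::size_t size) {
    void *p = std::malloc(size);
    t.onAlloc(p, size);
    return p;
}

// Hypothetical wrapper that instrumented code would call instead of free.
void trackedFree(AllocationTracker &t, void *p) {
    bool known = t.onFree(p);
    assert(known && "freeing a pointer that was never tracked");
    (void)known;
    std::free(p);
}
```

At program exit, a non-zero `liveBlocks()` indicates blocks that were allocated but never freed. Note that this detects leaks at run time only; it says nothing about which blocks are still reachable.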
yiqiuping1986
2010-Apr-01 00:56 UTC
[LLVMdev] summer of code idea — checking bounds overflow bugs
John Criswell wrote (2010-03-30 23:44:58):
> If you'd be willing to work on something that works with SAFECode, I'd
> be willing to mentor your project. I do, however, have one condition:
> you need to find a specific static array bounds checking algorithm and
> understand how it works. I don't see a specific algorithm or paper
> reference above.

Did you mean that implementing a new static array bounds checking algorithm with SAFECode is a better idea than implementing it with LLVM? I am not sure whether it is feasible to finish it within a summer, given that I have little knowledge of the SAFECode project.
Adve, Vikram Sadanand
2010-Apr-01 19:33 UTC
[LLVMdev] summer of code idea — checking bounds overflow bugs
On Mar 31, 2010, at 7:56 PM, yiqiuping1986 wrote:
> Did you mean implementing a new static array bounds checking algorithm with
> SAFECode is a better idea than that with LLVM? I am not sure whether it's
> feasible to finish it within a summer, under the condition that I have
> little knowledge of SAFECode project.

SAFECode is built as a set of LLVM passes, so there is no difference. E.g., Dinakar's earlier work that John mentioned is an LLVM pass used by SAFECode.

--Vikram
Associate Professor, Computer Science
University of Illinois at Urbana-Champaign
http://llvm.org/~vadve