Hi Vikram, I think it's worth continuing to discuss this on-list. I'm interested in different kinds of embedded software, but specifically in TinyOS applications.> We (more accurately, John Criswell and Brice Lin) are working on a > debugging version of SAFECode right now, which should be robust enough > to play with soon.This is great to hear. I know the SAFECode paper and like it. However, my intuition fails when I try to imagine what your transformations will do to a TinyOS application. These applications contain some substantially ugly pointer manipulation in the network stack. The applications will clearly violate some of the assumptions from your 2005 paper. On the other hand, these apps are definitely no worse than the Linux kernel. They are substantially more static in nature, containing few or no function pointers and little or no heap allocation. Our Safe TinyOS work (memory safety via Deputy) suffers from code size problems. Some architectures, such as msp430, are quite flash-poor, and so even a few percent code bloat will stop large applications from compiling. This is too bad since these large applications are the ones thst most need safety. Since we have deployed Safe TinyOS as part of a recent release of TinyOS, the problem is not just a theoretical one! Making things worse, msp430-gcc is not one of the stronger gcc ports. So I am very interested in approaches that can reduce or eliminate dynamic checks. However, this legacy code base is not going to lend itself to heavy-handed restrictions. I have been expecting that dragging a heavyweight decision procedure into the bounds check optimizer will be necessary. More broadly, I'm intersted in exploring whole-program compilation of embedded codes using LLVM. The transformations that are most interesting to me are those that do not make sense (and hence, have not been explored deeply) by the general-purpose compiler community. A good example is getting rid of the call stack (we have an upcoming LCTES paper about this). Clearly it's stupid to do this for Firefox or MSWord, but it can win for microcontroller codes. Our prototype was partially via source-to-source transformation, partially via gcc hacks. Overall this was not very fun-- we should have done the work in LLVM (but nobody wrote AVR or msp430 backends for us to use :). John Regehr
I understand. They only way you can keep the code bloat at a few percent or less is to aggressively eliminate nearly all run-time checks. We haven't measured our code size increases but if you have any codes to send us, we can try to get you some preliminary numbers. We do have a static array bounds checking algorithm based on the Omega integer programming library, but it is not hugely effective. I think this can be strengthened a *lot*. We'd have to see the pointer manipulation in your codes to figure out what (if any) run-time checks they might introduce. --Vikram Associate Professor, Computer Science University of Illinois at Urbana-Champaign http://llvm.org/~vadve On Mar 25, 2009, at 10:35 PM, John Regehr wrote:> Hi Vikram, > > I think it's worth continuing to discuss this on-list. I'm > interested in > different kinds of embedded software, but specifically in TinyOS > applications. > >> We (more accurately, John Criswell and Brice Lin) are working on a >> debugging version of SAFECode right now, which should be robust >> enough >> to play with soon. > > This is great to hear. > > I know the SAFECode paper and like it. However, my intuition fails > when I > try to imagine what your transformations will do to a TinyOS > application. > > These applications contain some substantially ugly pointer > manipulation in > the network stack. The applications will clearly violate some > of the assumptions from your 2005 paper. On the other hand, these > apps > are definitely no worse than the Linux kernel. They are > substantially more static in nature, containing few or no function > pointers and little or no heap allocation. > > Our Safe TinyOS work (memory safety via Deputy) suffers from code size > problems. Some architectures, such as msp430, are quite flash-poor, > and so even a few percent code bloat will stop large applications from > compiling. This is too bad since these large applications are the > ones thst most need safety. Since we have deployed Safe TinyOS as > part of > a recent release of TinyOS, the problem is not just a > theoretical one! Making things worse, msp430-gcc is not one of the > stronger gcc ports. > > So I am very interested in approaches that can reduce or eliminate > dynamic > checks. However, this legacy code base is not going to lend > itself to heavy-handed restrictions. I have been expecting that > dragging > a heavyweight decision procedure into the bounds check optimizer > will be necessary. > > More broadly, I'm intersted in exploring whole-program compilation of > embedded codes using LLVM. The transformations that are most > interesting to me are those that do not make sense (and hence, have > not > been explored deeply) by the general-purpose compiler community. A > good example is getting rid of the call stack (we have an upcoming > LCTES > paper about this). Clearly it's stupid to do this for Firefox or > MSWord, but it can win for microcontroller codes. Our prototype was > partially via source-to-source transformation, partially via gcc > hacks. Overall this was not very fun-- we should have done the work > in > LLVM (but nobody wrote AVR or msp430 backends for us to use :). > > John Regehr > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
> We do have a static array bounds checking algorithm based on the Omega > integer programming library, but it is not hugely effective. I think > this can be strengthened a *lot*.This is great to hear. I now have extremely high expectations :). I have put a few apps here: http://www.cs.utah.edu/~regehr/hacked_tinyos_apps/ These are for compiling only-- they will not run in any useful sense. They are embedded C code from which I have deleted processor-specific constructs so that llvm-gcc for x86 can compile them. Lacking actual LLVM backends for the processors of interest, this seemed the easiest way to get started. These apps contain multiple entry points for interrupt handlers, which llvm-gcc for x86 will ignore. That is OK, since quite a bit of code is reachable from main(). John
On Wed, 25 Mar 2009, Vikram S. Adve wrote:> We do have a static array bounds checking algorithm based on the Omega > integer programming library, but it is not hugely effective. I think > this can be strengthened a *lot*.I should add that I would be interested in helping with hooking a good decision procedure into LLVM. This will be useful far beyond array bounds check elimination. What is the current thinking? SMT solver? John