On 5/24/12 3:51 AM, Duncan Sands wrote:
> Hi Nuno,
>
>> I'm implementing the alloc_size function attribute in clang.
> does anyone actually use this attribute? And if they do, can it really buy
> them anything? How about "implementing" it by ignoring it!

Tools like ASan and SAFECode *could* use this attribute to determine the size of memory objects created by allocators. This is needed for things like SAFECode's fastcheck optimization (which replaces expensive checks that need to look up object bounds with a simpler check that has the object bounds passed in as arguments) as well as its instrumentation to register heap object bounds with the SAFECode run-time.

Currently, SAFECode has a pass which just recognizes certain functions as allocators and knows how to interpret the arguments to find the size. If we want SAFECode to work with another allocator (like a program's custom allocator, the Objective-C allocator, the Boehm garbage collector, etc.), then that pass needs to be modified to recognize it. Having to update this pass for every allocator name and type is one of the few reasons why SAFECode only works with C/C++ and not just any old language that is compiled down to LLVM IR.

Nuno's proposed feature would allow programmers to communicate the relevant information about allocators to tools like SAFECode and ASan. I think it might also make some of the optimizations in LLVM that require knowing about allocators work on non-C/C++ code.

So, no, I don't think anything's using it now, but it looks like something very useful to add, and I think some of those useful things are already a part of core LLVM.

-- John T.

>
> Ciao, Duncan.
>
>> This attribute exists in gcc since version 4.3.
>> The attribute specifies that a function returns a buffer with the size
>> given by the multiplication of the specified arguments. I would like
>> to add new metadata to pass this information to LLVM IR.
>>
>> For example, take the following C code:
>>
>> void* my_calloc(size_t, size_t) __attribute__((alloc_size(1,2)));
>>
>> void* fn() {
>>   return my_calloc(1, 3);
>> }
>>
>>
>> which would get translated to:
>>
>>
>> define i8* @fn() {
>> entry:
>>   %call = call i8* @my_calloc(i32 1, i32 3), !alloc_size !0
>>   ret i8* %call
>> }
>>
>> declare i8* @my_calloc(i32, i32)
>>
>> !0 = metadata !{i32 0, i32 1}
>>
>>
>> The downside is that the metadata is added to all call sites, since
>> it's not possible to attach metadata to function declarations.
>>
>> Any comment, suggestion, etc?
>>
>> Thanks,
>> Nuno
>> _______________________________________________
>> LLVM Developers mailing list
>> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu
>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
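
For readers who want to see what the attribute can buy on the C side, here is a minimal sketch. It assumes a compiler that supports both alloc_size and __builtin_object_size (GCC 4.3 or later, or a Clang release that implements the attribute). The allocator body and the name known_size are illustrative only and are not part of Nuno's proposal.

```c
#include <stddef.h>
#include <stdlib.h>

/* Same declaration as in the quoted example: the returned buffer holds
   (argument 1) * (argument 2) bytes. Attribute positions are 1-based. */
void *my_calloc(size_t count, size_t size) __attribute__((alloc_size(1, 2)));

/* Illustrative body (an assumption): simply forwards to calloc. */
void *my_calloc(size_t count, size_t size) {
    return calloc(count, size);
}

/* Asks the compiler how large the object returned by my_calloc(1, 3) is.
   Yields 3 when the compiler can determine the size, or (size_t)-1 when
   it cannot (for example, in an unoptimized build). */
size_t known_size(void) {
    void *p = my_calloc(1, 3);
    size_t n = __builtin_object_size(p, 0);
    free(p);
    return n;
}
```

Whether the size is actually resolved depends on the compiler and optimization level, so callers should treat (size_t)-1 as "unknown" rather than as an error.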
On Thu, May 24, 2012 at 7:43 PM, John Criswell <criswell at illinois.edu> wrote:
> On 5/24/12 3:51 AM, Duncan Sands wrote:
> > Hi Nuno,
> >
> >> I'm implementing the alloc_size function attribute in clang.
> > does anyone actually use this attribute? And if they do, can it really buy
> > them anything? How about "implementing" it by ignoring it!
>
> Tools like ASan and SAFECode *could* use this attribute

A case where this may be useful for asan:
  size_t n, m; ...
  int *x = new int [n]; ...
  x[m] // here we can check "m < n" instead of a more expensive shadow memory lookup.
For asan such optimization is possible only if 'x' does not escape the current function before the use, otherwise we may lose a use-after-free.

I don't know whether such optimization will fire often enough to pay the price for the added complexity of the implementation.
It would be interesting to see statistics on some huge app.

--kcc

> to determine the size of memory objects created by allocators. This is
> needed for things like SAFECode's fastcheck optimization (which replaces
> expensive checks that need to lookup object bounds with a simpler check
> that has the object bounds passed in as arguments) as well as its
> instrumentation to register heap object bounds with the SAFECode run-time.
>
> Currently, SAFECode has a pass which just recognizes certain functions
> as allocators and knows how to interpret the arguments to find the
> size. If we want SAFECode to work with another allocator (like a
> program's custom allocator, the Objective-C allocator, the Boehm garbage
> collector, etc), then that pass needs to be modified to recognize it.
> Having to update this pass for every allocator name and type is one of
> the few reasons why SAFECode only works with C/C++ and not just any old
> language that is compiled down to LLVM IR.
>
> Nuno's proposed feature would allow programmers to communicate the
> relevant information about allocators to tools like SAFECode and ASan.
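
A rough C rendering of the case Kostya describes, offered only as a sketch: when the element count n is visible at the access, the instrumentation could compare the index against it directly. The names index_in_bounds and read_element are hypothetical, calloc stands in for `new int [n]`, and nothing here reflects how ASan is actually implemented.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper: the cheap "m < n" comparison that could replace
   a shadow-memory lookup when the allocation size is known. */
static bool index_in_bounds(size_t n, size_t m) {
    return m < n;
}

/* Allocates n zeroed ints, reads element m if the index is in bounds,
   and frees the buffer. Returns 0 and stores the element in *out on
   success; returns -1 if allocation fails or the index is rejected.
   The pointer never escapes this function, which is the precondition
   Kostya mentions for the optimization to be safe. */
int read_element(size_t n, size_t m, int *out) {
    int *x = calloc(n, sizeof *x);  /* stands in for `new int [n]` */
    if (x == NULL)
        return -1;
    int status = -1;
    if (index_in_bounds(n, m)) {
        *out = x[m];
        status = 0;
    }
    free(x);
    return status;
}
```

The sketch only covers the bounds check; it says nothing about use-after-free detection, which is why the escape condition matters.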
Hi John,

>>> I'm implementing the alloc_size function attribute in clang.
>> does anyone actually use this attribute? And if they do, can it really buy
>> them anything? How about "implementing" it by ignoring it!
>
...
>
> Currently, SAFECode has a pass which just recognizes certain functions as
> allocators and knows how to interpret the arguments to find the size. If we want
> SAFECode to work with another allocator (like a program's custom allocator, the
> Objective-C allocator, the Boehm garbage collector, etc), then that pass needs
> to be modified to recognize it. Having to update this pass for every allocator
> name and type is one of the few reasons why SAFECode only works with C/C++ and
> not just any old language that is compiled down to LLVM IR.

> Nuno's proposed feature would allow programmers to communicate the relevant
> information about allocators to tools like SAFECode and ASan. I think it might
> also make some of the optimizations in LLVM that require knowing about
> allocators work on non-C/C++ code.

these are good points. The attribute and proposed implementation feel pretty clunky though, which is my main gripe.

Since LLVM already has utility functions for recognizing allocators (i.e. that know about malloc, realloc and -fno-builtin etc) can't SAFECode just make use of them? Then either (1) something like alloc_size is implemented, the LLVM utility learns about it, and SAFECode benefits automagically, or (2) the LLVM utility is taught about other allocators like Ada's, and SAFECode benefits automagically.

Ciao, Duncan.
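
To make the "recognize certain functions as allocators" idea concrete, here is a small C sketch of a name-based lookup table that maps allocator names to the arguments carrying the size. The struct, table, and function names are hypothetical and are not LLVM's or SAFECode's API. Argument positions are 0-based, like the `!{i32 0, i32 1}` metadata in Nuno's example.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical table entry describing where an allocator keeps its size. */
struct allocator_info {
    const char *name;
    int first_arg;   /* 0-based index of the size (or count) argument */
    int second_arg;  /* 0-based index of a multiplier, or -1 if none */
};

static const struct allocator_info known_allocators[] = {
    {"malloc", 0, -1},
    {"calloc", 0, 1},
    {"realloc", 1, -1},
};

/* Returns 1 and stores the byte count in *size if callee is a known
   allocator and enough arguments were supplied; returns 0 otherwise. */
int allocation_size(const char *callee, const size_t *args, size_t nargs,
                    size_t *size) {
    size_t count = sizeof known_allocators / sizeof known_allocators[0];
    for (size_t i = 0; i < count; i++) {
        const struct allocator_info *a = &known_allocators[i];
        if (strcmp(callee, a->name) != 0)
            continue;
        if ((size_t)a->first_arg >= nargs)
            return 0;
        size_t bytes = args[a->first_arg];
        if (a->second_arg >= 0) {
            if ((size_t)a->second_arg >= nargs)
                return 0;
            bytes *= args[a->second_arg];  /* overflow is not checked in this sketch */
        }
        *size = bytes;
        return 1;
    }
    return 0;  /* unknown allocator: the table must be extended by hand */
}
```

Supporting a new allocator here means adding a row to the table, which is the maintenance cost John describes. An attribute such as alloc_size would let the declaration itself carry the same two indices.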
Hi Kostya,

> Tools like ASan and SAFECode *could* use this attribute
>
> A case where this may be useful for asan:
>   size_t n, m; ...
>   int *x = new int [n]; ...
>   x[m] // here we can check "m < n" instead of a more expensive shadow memory lookup.

I don't think you need the attribute for this: LLVM already has utilities that know about malloc and C++'s new. Can't you just use them?

Ciao, Duncan.
On Fri, May 25, 2012 at 12:03 AM, Kostya Serebryany <kcc at google.com> wrote:
>
> On Thu, May 24, 2012 at 7:43 PM, John Criswell <criswell at illinois.edu> wrote:
>
>> On 5/24/12 3:51 AM, Duncan Sands wrote:
>> > Hi Nuno,
>> >
>> >> I'm implementing the alloc_size function attribute in clang.
>> > does anyone actually use this attribute? And if they do, can it really buy
>> > them anything? How about "implementing" it by ignoring it!
>>
>> Tools like ASan and SAFECode *could* use this attribute
>
> A case where this may be useful for asan:
>   size_t n, m; ...
>   int *x = new int [n]; ...
>   x[m] // here we can check "m < n" instead of a more expensive shadow memory lookup.
> For asan such optimization is possible only if 'x' does not escape the
> current function before the use, otherwise we may lose a use-after-free.
>
> I don't know whether such optimization will fire often enough to pay the
> price for the added complexity of the implementation.
> It would be interesting to see statistics on some huge app.

I think this is key -- there should be some clear numbers and evidence that this is a really important semantic extension in order to get accurate and efficient results. And as Duncan points out, we should be confident that there is no existing mechanism to get the same optimization improvements.
On 5/25/12 2:16 AM, Duncan Sands wrote:
> Hi John,
>
>>>> I'm implementing the alloc_size function attribute in clang.
>>> does anyone actually use this attribute? And if they do, can it really buy
>>> them anything? How about "implementing" it by ignoring it!
>>
> ...
>>
>> Currently, SAFECode has a pass which just recognizes certain functions as
>> allocators and knows how to interpret the arguments to find the size. If we want
>> SAFECode to work with another allocator (like a program's custom allocator, the
>> Objective-C allocator, the Boehm garbage collector, etc), then that pass needs
>> to be modified to recognize it. Having to update this pass for every allocator
>> name and type is one of the few reasons why SAFECode only works with C/C++ and
>> not just any old language that is compiled down to LLVM IR.
>
>> Nuno's proposed feature would allow programmers to communicate the relevant
>> information about allocators to tools like SAFECode and ASan. I think it might
>> also make some of the optimizations in LLVM that require knowing about
>> allocators work on non-C/C++ code.
>
> these are good points. The attribute and proposed implementation feel pretty
> clunky though, which is my main gripe.

Hrm. I haven't formed an opinion on what the attributes should look like. I think supporting the ones established by GCC would be important for compatibility, and on the surface, they look reasonable. Devising better ones for Clang is fine with me.

What about them feels klunky?

>
> Since LLVM already has utility functions for recognizing allocators (i.e. that
> know about malloc, realloc and -fno-builtin etc) can't SAFECode just make use
> of them?

It probably could. It doesn't simply because SAFECode was written before these features existed within LLVM. :)

> Then either (1) something like alloc_size is implemented, the LLVM
> utility learns about it, and SAFECode benefits automagically, or (2) the LLVM
> utility is taught about other allocators like Ada's, and SAFECode benefits
> automagically.

I'm not sure what you mean by "LLVM utility," but I think we're thinking along the same lines. Clang/LLVM implement the alloc_size attributes, we change SAFECode to recognize it, and so when people use it, SAFECode benefits automagically.

Am I right that we're thinking the same thing, or did I completely misunderstand you?

-- John T.

>
> Ciao, Duncan.