John Criswell

2012-May-24 15:43 UTC

[LLVMdev] alloc_size metadata

On 5/24/12 3:51 AM, Duncan Sands wrote:
> Hi Nuno,
>
>> I'm implementing the alloc_size function attribute in clang.
> does anyone actually use this attribute?  And if they do, can it really buy
> them anything?  How about "implementing" it by ignoring it!
Tools like ASan and SAFECode *could* use this attribute to determine the 
size of memory objects created by allocators.  This is needed for things 
like SAFECode's fastcheck optimization (which replaces expensive checks
that need to look up object bounds with a simpler check that has the
object bounds passed in as arguments) as well as its instrumentation to 
register heap object bounds with the SAFECode run-time.

Currently, SAFECode has a pass which just recognizes certain functions 
as allocators and knows how to interpret the arguments to find the 
size.  If we want SAFECode to work with another allocator (like a 
program's custom allocator, the Objective-C allocator, the Boehm garbage 
collector, etc), then that pass needs to be modified to recognize it.  
Having to update this pass for every allocator name and type is one of 
the few reasons why SAFECode only works with C/C++ and not just any old 
language that is compiled down to LLVM IR.
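A minimal sketch of the name-based recognition described above, written in plain C rather than as an LLVM pass. The table, the struct, and both function names are hypothetical and are not taken from SAFECode; the point is only that every new allocator needs a new table entry.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical record for one known allocator: which 0-based arguments
 * multiply together to give the allocation size.  A second index of -1
 * means the allocator takes a single size argument. */
struct alloc_info {
    const char *name;
    int size_arg0;
    int size_arg1;
};

/* Every allocator the tool understands must be listed here by name. */
static const struct alloc_info known_allocators[] = {
    { "malloc",  0, -1 },
    { "calloc",  0,  1 },
    { "realloc", 1, -1 },
};

/* Return the table entry for `name`, or NULL if it is not a known allocator. */
const struct alloc_info *lookup_allocator(const char *name)
{
    assert(name != NULL);
    size_t count = sizeof known_allocators / sizeof known_allocators[0];
    for (size_t i = 0; i < count; i++) {
        if (strcmp(known_allocators[i].name, name) == 0)
            return &known_allocators[i];
    }
    return NULL;
}

/* Compute the object size for a call to `callee` with the given arguments.
 * Returns 1 and stores the size in *size_out on success; returns 0 if the
 * callee is unknown, an argument is missing, or the product overflows. */
int allocation_size(const char *callee, const size_t *args,
                    size_t nargs, size_t *size_out)
{
    const struct alloc_info *info = lookup_allocator(callee);
    if (info == NULL || (size_t)info->size_arg0 >= nargs)
        return 0;

    size_t size = args[info->size_arg0];
    if (info->size_arg1 >= 0) {
        if ((size_t)info->size_arg1 >= nargs)
            return 0;
        size_t other = args[info->size_arg1];
        if (other != 0 && size > SIZE_MAX / other)
            return 0;  /* multiplication would overflow */
        size *= other;
    }
    *size_out = size;
    return 1;
}
```

A call to an allocator that is not in the table, such as the Boehm collector's `GC_malloc`, is simply not recognized until someone edits the table, which is the maintenance burden described above.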

Nuno's proposed feature would allow programmers to communicate the 
relevant information about allocators to tools like SAFECode and ASan.  
I think it might also make some of the optimizations in LLVM that 
require knowing about allocators work on non-C/C++ code.
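For illustration, this is roughly how a programmer would annotate a custom allocator with the GCC-style attribute and then ask the compiler what size it inferred. It is a sketch under assumptions: `my_calloc` mirrors the example quoted later in this message, `known_size` is a hypothetical helper, and whether `__builtin_object_size` reports the real size depends on the compiler and optimization level.

```c
#include <stddef.h>
#include <stdlib.h>

/* Arguments 1 and 2 (1-based, as in GCC) multiply to give the buffer size. */
void *my_calloc(size_t count, size_t size) __attribute__((alloc_size(1, 2)));

void *my_calloc(size_t count, size_t size)
{
    return calloc(count, size);
}

/* Hypothetical helper: report the size the compiler believes the buffer has.
 * __builtin_object_size yields (size_t)-1 when the size is unknown, which is
 * typical when optimization is disabled. */
size_t known_size(void)
{
    void *p = my_calloc(1, 3);
    size_t n = __builtin_object_size(p, 0);
    free(p);
    return n;
}
```

A checking tool could consume the same annotation instead of matching on the function's name.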

So, no, I don't think anything's using it now, but it looks like 
something very useful to add, and I think some of those useful things 
are already a part of core LLVM.

-- John T.
>
> Ciao, Duncan.
>
>> This attribute exists in gcc since version 4.3.
>> The attribute specifies that a function returns a buffer with the size
>> given by the multiplication of the specified arguments.  I would like
>> to add new metadata to pass this information to LLVM IR.
>>
>> For example, take the following C code:
>>
>> void* my_calloc(size_t, size_t) __attribute__((alloc_size(1,2)));
>>
>> void* fn() {
>>      return my_calloc(1, 3);
>> }
>>
>>
>> which would get translated to:
>>
>>
>> define i8* @fn() {
>> entry:
>>      %call = call i8* @my_calloc(i32 1, i32 3), !alloc_size !0
>>      ret i8* %call
>> }
>>
>> declare i8* @my_calloc(i32, i32)
>>
>> !0 = metadata !{i32 0, i32 1}
>>
>>
>> The downside is that the metadata is added to all call sites, since
>> it's not possible to attach metadata to function declarations.
>>
>> Any comment, suggestion, etc?
>>
>> Thanks,
>> Nuno
>> _______________________________________________
>> LLVM Developers mailing list
>> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev

Kostya Serebryany

2012-May-25 07:03 UTC

On Thu, May 24, 2012 at 7:43 PM, John Criswell <criswell at illinois.edu> wrote:
> On 5/24/12 3:51 AM, Duncan Sands wrote:
> > Hi Nuno,
> >
> >> I'm implementing the alloc_size function attribute in clang.
> > does anyone actually use this attribute?  And if they do, can it really buy
> > them anything?  How about "implementing" it by ignoring it!
>
> Tools like ASan and SAFECode *could* use this attribute

A case where this may be useful for asan:
   size_t n, m; ...
   int *x = new int [n]; ...
   x[m]  // here we can check "m < n" instead of a more expensive shadow memory lookup.
For asan, such an optimization is possible only if 'x' does not escape the
current function before the use;
otherwise we may miss a use-after-free.

I don't know whether such optimization will fire often enough to pay the
price for the added complexity of the implementation.
It would be interesting to see statistics on some huge app.
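A rough C rendering of the case above, assuming the allocation size `n` is known at the access and the pointer never leaves the function. The names `load_checked` and `index_after_alloc` are hypothetical, and `load_checked` merely stands in for the comparison an instrumentation pass would emit inline; this is not how ASan itself is implemented.

```c
#include <stddef.h>
#include <stdlib.h>

/* Stand-in for the cheap check: compare the index against the known
 * element count instead of consulting shadow memory.
 * Returns 1 and stores x[m] in *out when m is in bounds, 0 otherwise. */
int load_checked(const int *x, size_t n, size_t m, int *out)
{
    if (m >= n)
        return 0;
    *out = x[m];
    return 1;
}

/* x is allocated, used and freed inside this one function, so it cannot
 * have been freed elsewhere before the access.
 * Returns 1 for an in-bounds access, 0 for out of bounds, -1 if the
 * allocation failed. */
int index_after_alloc(size_t n, size_t m)
{
    int *x = calloc(n, sizeof *x);
    if (x == NULL)
        return -1;
    int value = 0;
    int ok = load_checked(x, n, m, &value);
    free(x);
    return ok;
}
```

If `x` were stored in a global or passed to another function before the access, the comparison alone could not rule out a use-after-free, which is the restriction noted above.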


--kcc



Duncan Sands

2012-May-25 07:16 UTC

Hi John,
>>> I'm implementing the alloc_size function attribute in clang.
>> does anyone actually use this attribute? And if they do, can it really buy
>> them anything? How about "implementing" it by ignoring it!
>
> ...
>
> Currently, SAFECode has a pass which just recognizes certain functions as
> allocators and knows how to interpret the arguments to find the size. If we want
> SAFECode to work with another allocator (like a program's custom allocator, the
> Objective-C allocator, the Boehm garbage collector, etc), then that pass needs
> to be modified to recognize it. Having to update this pass for every allocator
> name and type is one of the few reasons why SAFECode only works with C/C++ and
> not just any old language that is compiled down to LLVM IR.
>
> Nuno's proposed feature would allow programmers to communicate the relevant
> information about allocators to tools like SAFECode and ASan. I think it might
> also make some of the optimizations in LLVM that require knowing about
> allocators work on non-C/C++ code.
these are good points.  The attribute and proposed implementation feel pretty
clunky though, which is my main gripe.

Since LLVM already has utility functions for recognizing allocators (i.e. that
know about malloc, realloc and -fno-builtin etc) can't SAFECode just make use
of them?  Then either (1) something like alloc_size is implemented, the LLVM
utility learns about it, and SAFECode benefits automagically, or (2) the LLVM
utility is taught about other allocators like Ada's, and SAFECode benefits
automagically.

Ciao, Duncan.

Duncan Sands

2012-May-25 07:17 UTC

Hi Kostya,
>     Tools like ASan and SAFECode *could* use this attribute
>
>
> A case where this may be useful for asan:
>     size_t n, m; ...
>     int *x = new int [n]; ...
>     x[m]  // here we can check "m < n" instead of a more expensive shadow memory
> lookup.
I don't think you need the attribute for this: LLVM already has utilities that
know about malloc and C++'s new.  Can't you just use them?

Ciao, Duncan.

Chandler Carruth

2012-May-25 07:22 UTC

On Fri, May 25, 2012 at 12:03 AM, Kostya Serebryany <kcc at google.com> wrote:
>
> On Thu, May 24, 2012 at 7:43 PM, John Criswell <criswell at illinois.edu> wrote:
>
>> On 5/24/12 3:51 AM, Duncan Sands wrote:
>> > Hi Nuno,
>> >
>> >> I'm implementing the alloc_size function attribute in clang.
>> > does anyone actually use this attribute?  And if they do, can it really buy
>> > them anything?  How about "implementing" it by ignoring it!
>>
>> Tools like ASan and SAFECode *could* use this attribute
>
> A case where this may be useful for asan:
>    size_t n, m; ...
>    int *x = new int [n]; ...
>    x[m]  // here we can check "m < n" instead of a more expensive shadow
> memory lookup.
> For asan such optimization is possible only if 'x' does not escape the
> current function before the use,
> otherwise we may lose a use-after-free.
>
> I don't know whether such optimization will fire often enough to pay the
> price for the added complexity of the implementation.
> It would be interesting to see statistics on some huge app.
>
I think this is key -- there should be some clear numbers and evidence that
this is a really important semantic extension in order to get accurate and
efficient results.

And as Duncan points out, we should be confident that there is no existing
mechanism to get the same optimization improvements.

John Criswell

2012-May-25 15:22 UTC

On 5/25/12 2:16 AM, Duncan Sands wrote:
> Hi John,
>
>>>> I'm implementing the alloc_size function attribute in clang.
>>> does anyone actually use this attribute? And if they do, can it really buy
>>> them anything? How about "implementing" it by ignoring it!
>>
> ...
>>
>> Currently, SAFECode has a pass which just recognizes certain functions as
>> allocators and knows how to interpret the arguments to find the size. If we want
>> SAFECode to work with another allocator (like a program's custom allocator, the
>> Objective-C allocator, the Boehm garbage collector, etc), then that pass needs
>> to be modified to recognize it. Having to update this pass for every allocator
>> name and type is one of the few reasons why SAFECode only works with C/C++ and
>> not just any old language that is compiled down to LLVM IR.
>
>> Nuno's proposed feature would allow programmers to communicate the relevant
>> information about allocators to tools like SAFECode and ASan. I think it might
>> also make some of the optimizations in LLVM that require knowing about
>> allocators work on non-C/C++ code.
>
> these are good points.  The attribute and proposed implementation feel pretty
> clunky though, which is my main gripe.
Hrm.  I haven't formed an opinion on what the attributes should look 
like.  I think supporting the ones established by GCC would be important 
for compatibility, and on the surface, they look reasonable.  Devising 
better ones for Clang is fine with me.  What about them feels clunky?
>
> Since LLVM already has utility functions for recognizing allocators (i.e. that
> know about malloc, realloc and -fno-builtin etc) can't SAFECode just make use
> of them?
It probably could.  It doesn't simply because SAFECode was written
before these features existed within LLVM. :)
> Then either (1) something like alloc_size is implemented, the LLVM
> utility learns about it, and SAFECode benefits automagically, or (2) the LLVM
> utility is taught about other allocators like Ada's, and SAFECode benefits
> automagically.
I'm not sure what you mean by "LLVM utility," but I think we're thinking
along the same lines.  Clang/LLVM implement the alloc_size attribute,
we change SAFECode to recognize it, and so when people use it, SAFECode
benefits automagically.

Am I right that we're thinking the same thing, or did I completely 
misunderstand you?

-- John T.
>
> Ciao, Duncan.
