Theresia Hansson
2010-Oct-15 11:37 UTC
[LLVMdev] How do I find all memory allocations in an llvm ir code file?
I tried to compile this snippet of C++ code:

    void FuncTest() {
        int* a = new int;
        int* b = new int[2];
    }

using:

    clang test.cpp -S -emit-llvm -o - > test.llvm

and obtained this:

    define void @_Z8FuncTestv() {
    entry:
      %a = alloca i32*, align 4
      %b = alloca i32*, align 4
      %call = call noalias i8* @_Znwj(i32 4)
      %0 = bitcast i8* %call to i32*
      store i32* %0, i32** %a, align 4
      %call1 = call noalias i8* @_Znaj(i32 8)
      %1 = bitcast i8* %call1 to i32*
      store i32* %1, i32** %b, align 4
      ret void
    }

    declare noalias i8* @_Znwj(i32)
    declare noalias i8* @_Znaj(i32)

What I am wondering now is: where do the _Znwj and _Znaj symbols come from? Are they just randomly assigned or is there a system to it? I would like to be able to tell that the lines

    %call = call noalias i8* @_Znwj(i32 4)

and

    %call1 = call noalias i8* @_Znaj(i32 8)

perform memory allocations. But it does not look that promising... Some llvm expert here who has an idea?
James Molloy
2010-Oct-15 12:17 UTC
[LLVMdev] How do I find all memory allocations in an llvm ir codefile?
Hi,

_Znwj and friends are the C++ name-mangled versions of operator new. Because operator new is so common, the Itanium (IA-64) C++ ABI provides a shorthand for it. The name can be parsed as follows:

_Z: Prefix of all C++ mangled names.
nw: operator new. The other version is "na": operator new[] ("na" -> new array).
j: unsigned int, the type of the size parameter.

All of these can be found in the C++ ABI: http://www.codesourcery.com/public/cxx-abi/abi.html#mangling

You can run an identifier through the "c++filt" tool (part of GNU binutils) to get a human-readable representation:

$ c++filt _Znwj
operator new(unsigned int)

Cheers,

James
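The three-part breakdown above (_Z prefix, two-letter operator code, one-letter parameter type) can be sketched in a few lines of C++. This is only an illustration under stated assumptions: describe_operator_new is a hypothetical helper, it handles just the five-character names of the form discussed in this thread, and the 'm' (unsigned long) case is included on the assumption that the target's size_t is unsigned long, as on typical 64-bit Linux systems. It is not a general demangler.

```cpp
#include <string>

// Hypothetical helper: decodes names of the exact form "_Z" + op + type,
// e.g. "_Znwj". Returns an empty string for anything else.
std::string describe_operator_new(const std::string& mangled) {
    // Must be exactly: 2-char prefix + 2-char operator code + 1-char type.
    if (mangled.size() != 5 || mangled.compare(0, 2, "_Z") != 0) {
        return "";
    }

    // Operator code: "nw" is operator new, "na" is operator new[].
    const std::string op = mangled.substr(2, 2);
    std::string name;
    if (op == "nw") {
        name = "operator new";
    } else if (op == "na") {
        name = "operator new[]";
    } else {
        return "";
    }

    // Parameter type code: 'j' is unsigned int, 'm' is unsigned long.
    std::string param;
    switch (mangled[4]) {
        case 'j': param = "unsigned int"; break;
        case 'm': param = "unsigned long"; break;
        default: return "";
    }
    return name + "(" + param + ")";
}
```

Calling describe_operator_new("_Znaj") yields the same text that c++filt prints for that symbol. Longer names such as _Z8FuncTestv are deliberately rejected, since decoding them requires the full grammar from the ABI document.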
Olivier Meurant
2010-Oct-15 12:19 UTC
[LLVMdev] How do I find all memory allocations in an llvm ir code file?
echo "_Znwj" | c++filt => operator new(unsigned int)
echo "_Znaj" | c++filt => operator new[](unsigned int)

So yes, they are memory allocators. Names are just mangled.

Olivier.
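If the demangling has to happen inside a program rather than at the shell, the GCC and Clang C++ runtimes expose abi::__cxa_demangle in <cxxabi.h>. The sketch below wraps it; the wrapper name demangle is an assumption made for this example, and the code presumes a toolchain that follows the Itanium C++ ABI (it will not build with MSVC).

```cpp
#include <cxxabi.h>  // abi::__cxa_demangle (GCC / Clang runtimes)
#include <cstdlib>   // std::free
#include <string>

// Returns the human-readable form of a mangled name, or the input
// unchanged if the runtime reports that it could not demangle it.
std::string demangle(const std::string& mangled) {
    int status = 0;
    // With a null output buffer, the runtime allocates the result with
    // malloc, so the caller must release it with free.
    char* buffer =
        abi::__cxa_demangle(mangled.c_str(), nullptr, nullptr, &status);
    if (status != 0 || buffer == nullptr) {
        return mangled;
    }
    std::string result(buffer);
    std::free(buffer);
    return result;
}
```

For the symbols in this thread, demangle("_Znwj") and demangle("_Znaj") give the same strings as the c++filt invocations above.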
Matthieu Wipliez
2010-Oct-15 12:42 UTC
[LLVMdev] Re : How do I find all memory allocations in an llvm ir code file?
Hi Theresia,

I am no LLVM expert, but c++filt indicates that _Znwj and _Znaj are the mangled names for new and new[] operators respectively:

$ c++filt __Znwj
operator new(unsigned int)
$ c++filt __Znaj
operator new[](unsigned int)

Hope this helps,
Matthieu
John Criswell
2010-Oct-15 14:12 UTC
[LLVMdev] How do I find all memory allocations in an llvm ir code file?
As others have mentioned, C++ mangles names (i.e., it changes the name of a symbol into a string that contains the name, scope, and type of the variable or function), so if you know the mangled name of your allocator, you can recognize it.

Additionally, I believe that functions with return values marked with the noalias attribute are, essentially, memory allocators because the return value is guaranteed not to alias with anything not based on the return value. See http://llvm.org/docs/LangRef.html#pointeraliasing for more details.

As an aside, I've been thinking for a while that we should have a "memory allocator" analysis group that identifies different allocators for different source-level languages (e.g., one analysis would recognize malloc, free, realloc, and calloc while another would recognize new, new[], delete, and delete[]). There are even analyses you can do to determine if a function is a memory allocator. I have not yet had enough time to implement such an analysis group, but if others think it's a good idea, feel free to write it. :)

-- John T.
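The "recognize the allocator by its mangled name" idea can be sketched as a scan over the textual IR. This is a rough illustration, not the analysis group described above: find_allocation_calls is a hypothetical function, it matches text line by line rather than using the LLVM API, it looks only at "call" instructions (not "invoke"), and the caller has to supply the set of allocator names it cares about.

```cpp
#include <cstddef>
#include <set>
#include <sstream>
#include <string>
#include <vector>

// Returns every line of textual LLVM IR that calls one of `allocators`.
// Hypothetical sketch: plain string matching, direct calls only.
std::vector<std::string> find_allocation_calls(
        const std::string& ir, const std::set<std::string>& allocators) {
    std::vector<std::string> hits;
    std::istringstream in(ir);
    std::string line;
    while (std::getline(in, line)) {
        // Pad with a space so "call" at the very start of a line matches,
        // while a value name such as "%call" does not.
        const std::string padded = " " + line;
        const std::size_t call = padded.find(" call ");
        if (call == std::string::npos) continue;  // e.g. "declare" lines

        // The callee is the text between '@' and the following '('.
        const std::size_t at = padded.find('@', call);
        if (at == std::string::npos) continue;
        const std::size_t paren = padded.find('(', at);
        if (paren == std::string::npos) continue;

        const std::string callee = padded.substr(at + 1, paren - at - 1);
        if (allocators.count(callee) != 0) {
            hits.push_back(line);
        }
    }
    return hits;
}
```

For the IR in the original post, passing the set {"_Znwj", "_Znaj"} returns the two call lines and skips the declare lines. A set for C code might instead hold malloc, calloc, and realloc; on targets where size_t is unsigned long, the operator new symbols end in "m" rather than "j".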
Theresia Hansson
2010-Oct-15 15:17 UTC
[LLVMdev] How do I find all memory allocations in an llvm ir code file?
Ah ok, thank you for that. I guess I was simply confused by the fact that I got two different new functions (to me they seemed randomly named). Had I tried some more allocations I would probably have noticed this, my bad :). Again, thank you very much.