On 13 Jan 2010, at 20:34, Nick Lewycky wrote:

> On 13 January 2010 12:05, Mark Muir <mark.i.r.muir at gmail.com> wrote:
>
> > But... now there's a small problem with library calls. Symbols such as
> > 'memset', 'malloc', etc. are being removed by global dead code
> > elimination. They are implemented in one of the bitcode modules that
> > are linked together (implementations are based on newlib).
>
> And what problems does that cause? If malloc is linked in, we're free to
> inline it everywhere and delete the symbol. If you meant for it to be
> visible to the optimizers but you don't want it to be part of the code
> generated for your program (ie., you'll link it against newlib later),
> you should mark the functions with available_externally linkage.

Sorry, I should've been more clear - the calls to _malloc and _free weren't being inlined (see example below). I'm not sure why (happens with or without -simplify-libcalls). So, the resulting .bc file from 'opt' contains live references to symbols that were in its input .bc, but for some reason it stripped them.

    #include <stdlib.h>

    int entries = 3;
    int result;

    int main()
    {
      int i;

      // Allocate and populate the initial array.
      int* values = malloc(entries * sizeof(int));
      for (i = 0; i < entries; i ++)
        values[i] = i + 1;

      // Calculate the sum, using a dynamically allocated accumulator.
      int* acc = malloc(sizeof(int));
      *acc = 0;
      for (i = 0; i < entries; i ++)
        *acc += values[i];
      result = *acc;

      // Deallocate the memory.
      free(values);
      free(acc);

      return 0;
    }

Here's a fragment of the final machine assembly (with -O3):

    _main:
      ADDCOMP out=r1 in1=r1 in2=4 conf=`ADDCOMP_SUB
      WMEM in=r2 in_addr=r1 conf=`WMEM_SI
      CONST_16B out=r3 conf=12
      JUMP nl_out=r2/*RA*/ addr_in=&_malloc conf=`JUMP_ALWAYS_ABS // Call

In case this is important, here is the relevant declarations from the 'stdlib.h' that is in use:

    _PTR _EXFUN(malloc,(size_t __size));
    _VOID _EXFUN(free,(_PTR));

where:

    #define _PTR void *
    #define _EXFUN(name, proto) name proto

and from 'newlib.c':

    void *
    malloc (size_t sz)
    {
      ...
    }

i.e. They look like any other function call, which is why I suspect it has something to do with special behaviour given to built-ins.

> Alternately, if you wanted malloc, memset and friends to be externally
> visible (compiled as part of your program and dlsym'able), you could
> create a public api file which contains a one per line list of the names
> of the functions that may not be marked internal linkage by internalize.
> Pass that in to opt with -internalize-public-api-file filename ...other
> flags...

I saw that. I was thinking of only using that option as a last resort, due to maintainability.

> > I guess I need help with the concept of built-ins, and what code is
> > related to them in the Clang driver and back-end.

Thanks.

- Mark
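Editorial note: the claim that the newlib declarations "look like any other function call" can be checked by expanding the two macros by hand. The sketch below is only an illustration under stated assumptions: DEMO_PTR, DEMO_EXFUN, demo_malloc, demo_free and demo_roundtrip are hypothetical names chosen to avoid clashing with the real headers, and the bodies simply forward to the host allocator rather than reproducing newlib.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the _PTR and _EXFUN macros quoted above. */
#define DEMO_PTR void *
#define DEMO_EXFUN(name, proto) name proto

/* Expands to: void * demo_malloc (size_t size); */
DEMO_PTR DEMO_EXFUN(demo_malloc, (size_t size));
/* Expands to: void demo_free (void *); */
void DEMO_EXFUN(demo_free, (DEMO_PTR));

/* Definitions in the same shape as the 'newlib.c' fragment.
   Assumption: they forward to the host allocator for illustration only. */
void *
demo_malloc (size_t sz)
{
  return malloc (sz);
}

void
demo_free (void *p)
{
  free (p);
}

/* Returns 1 if both functions can be used through ordinary function
   pointers, i.e. the macros produced plain prototypes. */
int
demo_roundtrip (void)
{
  void *(*alloc_fn) (size_t) = demo_malloc;
  void (*free_fn) (void *) = demo_free;
  int *p = alloc_fn (sizeof (int));
  int ok;

  assert (p != NULL);
  *p = 42;
  ok = (*p == 42);
  free_fn (p);
  return ok;
}
```

Calling demo_roundtrip() from a test program exercises both declarations. Nothing in the expanded prototypes carries an attribute or annotation, which is consistent with the observation that the header alone does not mark these functions as special.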
Mark Muir wrote:

> On 13 Jan 2010, at 20:34, Nick Lewycky wrote:
>
>> On 13 January 2010 12:05, Mark Muir <mark.i.r.muir at gmail.com
>> <mailto:mark.i.r.muir at gmail.com>> wrote:
>>
>>     But... now there's a small problem with library calls. Symbols
>>     such as 'memset', 'malloc', etc. are being removed by global dead
>>     code elimination. They are implemented in one of the bitcode
>>     modules that are linked together (implementations are based on
>>     newlib).
>>
>> And what problems does that cause? If malloc is linked in, we're free
>> to inline it everywhere and delete the symbol. If you meant for it to
>> be visible to the optimizers but you don't want it to be part of the
>> code generated for your program (ie., you'll link it against newlib
>> later), you should mark the functions with available_externally linkage.
>
> Sorry, I should've been more clear - the calls to _malloc and _free
> weren't being inlined (see example below). I'm not sure why (happens
> with or without -simplify-libcalls). So, the resulting .bc file from
> 'opt' contains live references to symbols that were in its input .bc,
> but for some reason it stripped them.

Okay. Could you post an .ll (run 'llvm-dis < foo.bc') example of where this happens? Just the input and opt commands to run is fine. It's very frustrating to look at C and assembly when the problem is in the IR -> IR transform itself.

Nick
On 14 Jan 2010, at 05:20, Nick Lewycky wrote:

>> calls to _malloc and _free
>> weren't being inlined (see example below). I'm not sure why (happens
>> with or without -simplify-libcalls). So, the resulting .bc file from
>> 'opt' contains live references to symbols that were in its input .bc,
>> but for some reason it stripped them.
>
> Okay. Could you post an .ll (run 'llvm-dis < foo.bc') example of where
> this happens? Just the input and opt commands to run is fine. It's very
> frustrating to look at C and assembly when the problem is in the
> IR -> IR transform itself.

I've attached the relevant IR (stripped down to the bare minimum). The following commands will reproduce the problem (using vanilla 2.6 versions of the LLVM tools):

    llvm-as test_malloc.ll -o - | opt -std-link-opts -o - | llvm-dis -o -

That strips everything except for @main. The stripping of the two global variables is fine, and there are no references to them left in the IR. But there are live references to @malloc and @free.

The minimum options required for this behaviour are:

    llvm-as test_malloc.ll -o - | opt -internalize -globaldce -o - | llvm-dis -o -

If I use -disable-internalize with -std-link-opts, then global dead code elimination doesn't remove anything, but inlining still takes place. So that is the solution I'm using at the moment. But I'd like to know why this behaviour is happening, and it would be nice to have global DCE so that the resulting machine assembly is easier to work with (for manual debugging on this architecture).

Thanks for looking at this.

Regards,
- Mark

-------------- next part --------------
A non-text attachment was scrubbed...
Name: test_malloc.ll.bz2
Type: application/x-bzip2
Size: 2417 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20100114/f2ce3118/attachment.bin>
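Editorial note: the interaction between -internalize and -globaldce described in this message can be pictured with a toy model. The sketch below is not LLVM code; struct sym, toy_internalize, toy_globaldce and toy_demo are hypothetical names, and the model assumes only that internalize gives internal linkage to everything outside a keep list (with 'main' always kept) and that global DCE removes whatever is unreachable from the remaining external symbols. It does not explain the reported behaviour; it only shows how sensitive the result is to the reference graph and to the keep list.

```c
#include <assert.h>
#include <string.h>

#define MAX_SYMS 8

/* Hypothetical toy symbol table entry; not an LLVM data structure. */
struct sym {
  const char *name;
  int internal;         /* 1 once the internalize step has hidden it */
  int live;             /* 1 if reachable from an external symbol */
  int ncalls;           /* number of valid entries in calls[] */
  int calls[MAX_SYMS];  /* indices of the symbols this one references */
};

/* Step 1: mark every symbol internal unless its name is in the keep list. */
static void toy_internalize(struct sym *s, int n, const char **keep, int nkeep)
{
  for (int i = 0; i < n; i++) {
    s[i].internal = 1;
    for (int k = 0; k < nkeep; k++)
      if (strcmp(s[i].name, keep[k]) == 0)
        s[i].internal = 0;
  }
}

/* Depth-first marking of everything reachable from symbol i. */
static void toy_mark(struct sym *s, int n, int i)
{
  assert(i >= 0 && i < n);
  if (s[i].live)
    return;
  s[i].live = 1;
  for (int c = 0; c < s[i].ncalls; c++)
    toy_mark(s, n, s[i].calls[c]);
}

/* Step 2: external symbols are roots; count the symbols left unreachable. */
static int toy_globaldce(struct sym *s, int n)
{
  int removed = 0;
  for (int i = 0; i < n; i++)
    s[i].live = 0;
  for (int i = 0; i < n; i++)
    if (!s[i].internal)
      toy_mark(s, n, i);
  for (int i = 0; i < n; i++)
    if (!s[i].live)
      removed++;
  return removed;
}

/* Builds a four-symbol module (main, malloc, free, memset) and returns how
   many symbols the two steps would remove. */
int toy_demo(int main_references_allocator, int keep_allocator)
{
  struct sym s[4] = {
    {"main", 0, 0, 0, {0}},
    {"malloc", 0, 0, 0, {0}},
    {"free", 0, 0, 0, {0}},
    {"memset", 0, 0, 0, {0}},
  };
  /* Plays the role of a public API list; 'main' is always kept. */
  const char *keep[] = {"main", "malloc", "free"};
  int nkeep = keep_allocator ? 3 : 1;

  if (main_references_allocator) {
    s[0].ncalls = 2;
    s[0].calls[0] = 1;  /* main -> malloc */
    s[0].calls[1] = 2;  /* main -> free */
  }
  toy_internalize(s, 4, keep, nkeep);
  return toy_globaldce(s, 4);
}
```

In this model toy_demo(1, 0) removes only memset, because main's explicit references keep malloc and free alive. toy_demo(0, 0) removes all three library symbols, and toy_demo(0, 1) keeps malloc and free through the keep list, which mirrors the -internalize-public-api-file suggestion earlier in the thread. Whether the attached IR corresponds to the first or the second situation is not established here.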