Hi Evan,

Thanks for the pointers. We found a simple test case that causes the
problem (thanks to Tom in my group):

    #include <stdio.h>
    #include <stdlib.h>

    void test();
    void (*funcPtr)();

    int main(int argc, char **argv) {
      funcPtr = test;
      test();
    }

    void test() {
      if (funcPtr == test) {
        printf("OK!\n");
      } else {
        fprintf(stderr, "Bad!\n");
        exit(1);
      }
    }

    $ llvm-gcc -emit-llvm -o FPtrEqTest.bc -c FPtrEqTest.c
    $ llc -f FPtrEqTest.bc
    $ gcc -o FPtrEqTest FPtrEqTest.s
    $ ./FPtrEqTest
    OK!

    $ lli FPtrEqTest.bc
    Bad!

The above test case is just a smaller version of the one in Python's
subtype_traverse, which also tests a function pointer and calls itself.
It seems the problem arises from a comparison with the stub's address
when a comparison with the actual address of the compiled function is
intended.

thanks,
Prakash

On Mon, Nov 3, 2008 at 2:07 AM, Evan Cheng <evan.cheng at apple.com> wrote:
> Hi Prakash,
>
> Unfortunately it looks like you need to do quite a bit of investigation
> into this. However, I hope I can provide some useful tips.
>
> 1. In general, lli and llc generate exactly the same code, except that
>    lli defaults to static codegen while llc defaults to dynamic-no-pic
>    codegen. So try passing -relocation-model=dynamic-no-pic to lli. If
>    this works, that means there are issues with static codegen.
> 2. It could be a JIT encoding bug. If you can identify a problematic
>    function, it's possible to examine the generated code in gdb and
>    compare it with llc generated assembly.
> 3. It could be a bug in the app that is exposed when running under the
>    JIT. You can try enabling additional debugging output.
>
> Hope this helps.
>
> Evan
>
> On Nov 2, 2008, at 2:55 PM, Prakash Prabhu wrote:
>
> Hi Eli,
>
> Thanks for the reply. I tried with -Xlinker="-ldl ". However it does not
> seem to make a difference. It seems that when bugpoint is run with
> --run-jit, the linker args are not passed to gcc (from
> tools/bugpoint/ExecutionDriver.cpp) :
>
>   if (InterpreterSel == RunLLC || InterpreterSel == RunCBE ||
>       InterpreterSel == CBE_bug || InterpreterSel == LLC_Safe)
>     RetVal = AI->ExecuteProgram(BitcodeFile, InputArgv, InputFile,
>                                 OutputFile, AdditionalLinkerArgs,
>                                 SharedObjs,
>                                 Timeout, MemoryLimit);
>   else
>     RetVal = AI->ExecuteProgram(BitcodeFile, InputArgv, InputFile,
>                                 OutputFile, std::vector<std::string>(),
>                                 SharedObjs, Timeout, MemoryLimit);
>
> I tried the following after this:
>
> (1) Firstly, instead of running Gap
> (http://www.gap-system.org/Download/UNIXInst.html), I am now trying to run
> python with lli (http://www.python.org/download/releases/2.5.2/). I
> managed to compile python.bc and here again I face the same problem:
>
> llc and gcc can get python.exe to run (which is great :)!) :
>
> $ llc -f python.bc
> $ gcc -o python.exe python.s -ldl -lutil -lm -lrt
> $ ./python.exe
> Python 2.5.2 (r252:60911, Oct 31 2008, 14:41:11)
> [GCC 4.2.1 (Based on Apple Inc. build 5623) (LLVM build)] on linux2
> Type "help", "copyright", "credits" or "license" for more information.
> >>>
>
> however, when I try to run python.bc using lli it crashes with a
> segmentation fault:
>
> $ lli -load=/usr/lib/libdl.so -load=/usr/lib/libutil.so
>   -load=/usr/lib/libm.so -load=/usr/lib/librt.so python.bc
>
> When I try it with gdb, it seems that the crash is somewhere inside python
> code (since bt only shows ?? ). Before the crash, I could see that the
> memory consumption (VM) reaches somewhere near 80% of my 2GB RAM (seen via
> top, and that is a sudden increase from the 2-3% of VM it was occupying
> before). I tried to run this on a 64-bit machine which has 8GB RAM and
> still have the same issue wrt memory.
>
> (2) Finally I wrote a pass (and loaded it through opt) to instrument each
> function's (in python code) entry and exit and then ran the instrumented
> program with both the [llc ; gcc] combination and lli. In the lli version a
> single method (subtype_traverse) is recursively called (about 2 million
> times) until the program runs out of memory, while the statically compiled
> code (llc + gcc) calls this method (I am comparing the calls in the same
> context in both cases) only once:
>
> python with llc + gcc :
> ...
> (tupletraverse (visit_decref (type_is_gc)))
> (subtype_traverse (visit_decref (type_is_gc))
> (type_traverse (visit_decref) (visit_decref) ...
>
> python with lli:
>
> (tupletraverse (visit_decref (type_is_gc)))
> (subtype_traverse (visit_decref (type_is_gc)) (subtype_traverse
> (visit_decref (type_is_gc))(subtype_traverse (visit_decref (type_is_gc))
> ..... about 2 million times
>
> Looking at the code (Objects/typeobject.c:
> http://google.com/codesearch?hl=en&q=show:VK_wUSuAZto:jHKC99mjNVM:4z02hQcYQRY&sa=N&ct=rd&cs_p=http://gentoo.osuosl.org/distfiles/Python-2.5.tar.bz2&cs_f=Python-2.5/Objects/typeobject.c
> )
>
> it seems the last call (through a function pointer) in subtype_traverse
> results in this never-ending recursive call.
>
> Has anyone tried compiling python to bitcode and running it with the LLVM
> JIT before?
>
> Thanks for your time.
>
> - Prakash
>
> On Tue, Oct 28, 2008 at 3:02 PM, Eli Friedman <eli.friedman at gmail.com> wrote:
>
>> On Tue, Oct 28, 2008 at 12:17 PM, Prakash Prabhu
>> <prakash.prabhu at gmail.com> wrote:
>> > Generating reference output from raw program: <cbe><gcc>
>> > Error running tool:
>> [snip]
>> > /tmp/cc08IpX8.o: In function `SyLoadModule':
>> > bugpoint-test-program.bc.cbe.c:(.text+0x25705): undefined reference to
>> > `dlopen'
>> [snip]
>>
>> This is saying that compilation with CBE is failing. Try something
>> like -Xlinker -ldl?
>>
>> -Eli
>> _______________________________________________
>> LLVM Developers mailing list
>> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu
>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
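
The stub-versus-real-address mismatch suspected in the message above can be modelled in a few lines of plain C. This is only a sketch and not LLVM code: stub_test, real_test and the two address_* helpers are hypothetical names standing in for the JIT's lazy-compilation stub and for the compiled body of test. It shows that a call through the stub still reaches the right function, while an equality test against the real address fails.

```c
/* Toy model of a lazily compiled function. Nothing here is LLVM API;
   every name is hypothetical and exists only for illustration. */

typedef void (*fn_ptr)(void);

int real_test_calls = 0;

/* Stands in for the compiled body of test(). */
void real_test(void) { real_test_calls++; }

/* Stands in for the stub: a separate piece of code that forwards
   to the real body, so calls through it still work. */
void stub_test(void) { real_test(); }

/* Address handed out when &test is taken before test is compiled. */
fn_ptr address_taken_lazily(void) { return stub_test; }

/* Address that compiled code inside test itself sees for &test. */
fn_ptr address_after_compile(void) { return real_test; }

/* Returns 1 when both views of "&test" agree, 0 otherwise. */
int pointers_agree(void) {
    fn_ptr stored = address_taken_lazily();
    stored();                      /* the call reaches real_test */
    return stored == address_after_compile();
}
```

In this model pointers_agree() returns 0 even though the call succeeded, which mirrors the "Bad!" branch of the test case: calling through a stub is harmless, comparing against it is not.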
Sorry about the tardiness. I'll take a look.

Thanks,

Evan

On Nov 3, 2008, at 4:00 PM, Prakash Prabhu wrote:
> Hi Evan,
>
> Thanks for the pointers. We found a simple test case that causes the
> problem (thanks to Tom in my group):
> [snip]
I've filed PR3043 for this.

Evan

On Nov 3, 2008, at 4:00 PM, Prakash Prabhu wrote:
> Hi Evan,
>
> Thanks for the pointers. We found a simple test case that causes the
> problem (thanks to Tom in my group):
> [snip]
Thanks, Evan.

We've finally got a working version of python.bc which runs most
non-multithreaded Python scripts (those that do not use the
'threadmodule') with lli. It needed a few more workarounds (apart from the
one to make sure that functions whose addresses are taken are compiled
before their addresses are taken):

(a) When any shared library (.so) is loaded using dlopen() by the program
currently being JIT'ed by lli, and the initialization code in the library
makes a call back to a function in the main process, the program crashes.
The statically compiled code (llc + gcc) works, however, when using the
-disable-internalize and -rdynamic flags (to llc and gcc respectively).
The -rdynamic flag makes sure that the function being called is exported
to the dynamic symbol table of the main process (in the case of python, a
single function, Py_InitModule4(), is always called by any module loaded
as a .so in response to 'import' statements). I do not know a way to
achieve this using lli. I changed the python code to statically link all
the modules that are required for some of the scripts that we are running.

(b) Since the LLVM JIT does not support inline asm, I had to redefine some
_byteswap() macros in C rather than inline asm.

With these changes, python.bc runs smoothly with lli for most of the
python scripts that I have tested with :).

regards,
Prakash

On Tue, Nov 11, 2008 at 11:40 AM, Evan Cheng <evan.cheng at apple.com> wrote:
> I've filed PR3043 for this.
> Evan
> [snip]
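
As an illustration of workaround (b), a byte swap can be written with shifts and masks only, so the JIT never has to handle inline asm. This is a minimal sketch under assumptions: the thread does not show Python's actual _byteswap() macros, so the names byteswap16 and byteswap32 and their signatures are hypothetical.

```c
#include <stdint.h>

/* Hypothetical replacements for inline-asm byte swaps; these names
   are illustrative and are not the macros from the Python sources. */

/* Swap the two bytes of a 16-bit value. */
uint16_t byteswap16(uint16_t x) {
    return (uint16_t)((x << 8) | (x >> 8));
}

/* Reverse the four bytes of a 32-bit value. */
uint32_t byteswap32(uint32_t x) {
    return ((x & 0x000000FFu) << 24) |
           ((x & 0x0000FF00u) << 8)  |
           ((x & 0x00FF0000u) >> 8)  |
           ((x & 0xFF000000u) >> 24);
}
```

For example, byteswap16(0x1234) yields 0x3412 and byteswap32(0x12345678) yields 0x78563412. A static compiler will typically turn these patterns back into a single swap instruction, so moving away from inline asm need not cost anything in the llc + gcc build.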
Ok, the problem has to do with lazy compilation. In main, when the address
of "test" is taken, "test" hasn't been compiled. So the store ended up
storing the address of its stub instead. If you run the test with
-disable-lazy-compilation, it will work.

I think the current solution would be something like this:

1. When code emitter sees a function address, emit it as a relocation and
   remember the function.
2. After the function has been emitted, compile and emit all functions
   whose addresses are taken.
3. Relocate all references to function addresses.

Step 2 is the only real enhancement required. It's not terribly difficult.
Unfortunately I don't have time to deal with this right now. Would someone
care to take this up?

Thanks,

Evan

On Nov 3, 2008, at 4:00 PM, Prakash Prabhu wrote:
> Hi Evan,
>
> Thanks for the pointers. We found a simple test case that causes the
> problem (thanks to Tom in my group):
> [snip]
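
The three steps Evan outlines can be sketched as a toy model in C. This is not the LLVM JIT's code or API: struct jit, note_address_taken, compile_address_taken and apply_relocations are all hypothetical names, and "compiling" a function is reduced to copying a pointer. The sketch only shows the ordering: record the reference, compile every address-taken function, then patch the recorded slots with final addresses.

```c
#include <stddef.h>

#define MAX_FUNCS  8
#define MAX_RELOCS 16

typedef void (*code_ptr)(void);

/* One entry per function known to the toy JIT (hypothetical layout). */
struct func {
    code_ptr body;        /* stands in for the function's bitcode */
    code_ptr compiled;    /* NULL until the function is "compiled" */
    int address_taken;    /* set when emitted code references &func */
};

/* A pending reference: *slot must end up holding &funcs[target]. */
struct reloc {
    code_ptr *slot;
    int target;
};

struct jit {
    struct func funcs[MAX_FUNCS];
    struct reloc relocs[MAX_RELOCS];
    int nrelocs;
};

/* Step 1: record a relocation instead of writing a stub address. */
int note_address_taken(struct jit *j, int target, code_ptr *slot) {
    if (j->nrelocs >= MAX_RELOCS) return -1;
    j->funcs[target].address_taken = 1;
    j->relocs[j->nrelocs].slot = slot;
    j->relocs[j->nrelocs].target = target;
    j->nrelocs++;
    return 0;
}

/* Step 2: compile every function whose address was taken. */
void compile_address_taken(struct jit *j) {
    for (int i = 0; i < MAX_FUNCS; i++)
        if (j->funcs[i].address_taken && j->funcs[i].compiled == NULL)
            j->funcs[i].compiled = j->funcs[i].body; /* toy "compile" */
}

/* Step 3: patch each recorded slot with the final address. */
void apply_relocations(struct jit *j) {
    for (int i = 0; i < j->nrelocs; i++)
        *j->relocs[i].slot = j->funcs[j->relocs[i].target].compiled;
    j->nrelocs = 0;
}

void sample_test(void) {}

/* Returns 1 if the patched slot holds the compiled address. */
int demo(void) {
    struct jit j = {0};
    code_ptr func_ptr_slot = NULL;   /* plays the role of funcPtr */
    j.funcs[0].body = sample_test;
    note_address_taken(&j, 0, &func_ptr_slot);
    compile_address_taken(&j);
    apply_relocations(&j);
    return func_ptr_slot != NULL && func_ptr_slot == j.funcs[0].compiled;
}
```

Because the slot is only written in step 3, it can never observe a stub address, which is the property the FPtrEqTest case depends on. The cost in a real JIT would be that address-taken functions are compiled eagerly rather than on first call.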