Hi all. I'm working on the recently-announced unladen-swallow project, and I'm having a bit of trouble getting gdb to step into functions I've compiled with LLVM's JIT compiler. The attached a_module.ll is the module I produce from compiling

    def foo(r):
        for i in r:
            pass

I'm JIT-compiling and running foo() with:

    typedef PyObject *(*NativeFunction)(PyFrameObject *);
    llvm::ExecutionEngine *engine = ...->getExecutionEngine();
    NativeFunction native =
        (NativeFunction)engine->getPointerToFunction(function);
    return native(frame);

However, when I try to step into the call with gdb, I get:

    Breakpoint 1, eval_llvm_function (function_obj=0x142f6e0,
    frame=0x1350b98) at ../src/Python/ceval.cc:2549
    2549        (NativeFunction)engine->getPointerToFunction(function);
    (gdb) n
    Current language:  auto; currently c++
    2550      return native(frame);
    (gdb) p native
    $1 = (NativeFunction) 0x2080010
    (gdb) b *0x2080010
    Breakpoint 2 at 0x2080010
    (gdb) s
    Breakpoint 2, 0x02080010 in ?? ()

If I don't set the second breakpoint, that last step skips the call entirely. To see whether I'm just emitting totally wrong debugging information, I compiled the module into a binary with a stub main, and gdb'ed into that. Trying to set a breakpoint on "foo" from there crashed Apple's gdb, which isn't ideal but at least indicates that something's happening with the dwarf information.

Do I need to do anything extra to get the debug information the JIT produces hooked into gdb?

Thanks,
Jeffrey

-------------- next part --------------
A non-text attachment was scrubbed...
Name: a_module.ll
Type: application/octet-stream
Size: 16971 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20090327/34e782e3/attachment.obj>
Hi, Jeffrey

> Do I need to do anything extra to get the debug information the JIT
> produces hooked into gdb?

I'm not sure whether debug information is ever emitted for code being JITed. Most probably only EH info is honored. Even if it is emitted, you need to "register" it with gdb somehow. Unfortunately, I don't remember offhand how you can do this.

--
With best regards, Anton Korobeynikov.
Faculty of Mathematics & Mechanics, Saint Petersburg State University.
Run with -debug-only=jit.

Break on line 1148 of JITEmitter.cpp. The debugging message will tell you the address and size of the function that was jitted. You can then tell gdb to disassemble the code.

On Mar 26, 2009, at 11:35 PM, Jeffrey Yasskin wrote:

> Do I need to do anything extra to get the debug information the JIT
> produces hooked into gdb?

That isn't available. Currently the JIT does not produce dwarf information.

Evan

> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
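Evan's suggestion (take the address and size from the JIT's debug output, then disassemble that range) can be wrapped in a small helper. The sketch below is illustrative only: GdbDisassembleCommand is a hypothetical name, not part of LLVM or gdb, and it assumes a gdb that accepts the "disassemble START,END" form. Older gdb builds may expect the two addresses in a different form, so check "help disassemble" first.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <sstream>
#include <string>

// Hypothetical helper (not an LLVM or gdb API): given the start address and
// size that the JIT's debug output reports for a function, build the gdb
// command that disassembles exactly that address range.
std::string GdbDisassembleCommand(std::uintptr_t start, std::size_t size) {
  assert(size > 0 && "a JITted function should occupy at least one byte");
  std::ostringstream cmd;
  // std::hex is sticky, so both the start and the end address print in hex.
  cmd << "disassemble 0x" << std::hex << start << ",0x" << (start + size);
  return cmd.str();
}
```

With the address from the session earlier in the thread and an assumed size of 0x40 bytes, GdbDisassembleCommand(0x2080010, 0x40) produces "disassemble 0x2080010,0x2080050", which can be pasted at the gdb prompt.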
On Fri, Mar 27, 2009 at 3:48 PM, Evan Cheng <evan.cheng at apple.com> wrote:

> Run with -debug-only=jit.

OT: I take it the recommended model for tools that embed LLVM is for them to accept all of LLVM's command line arguments on their own command lines? For Python, it'd be much nicer to make this stuff tweakable through a module at runtime, or even, for thread-safety reasons, as a parameter to each call that cares about it. The command line route will work for our development, but I don't think we'll be able to release without a better story. (Luckily, we don't have anything scheduled for 3ish months, and we're happy to send you patches and pull them into our tree before an official LLVM release if you don't have this done before we need it.)

> Break on line 1148 of JITEmitter.cpp. The debugging message will tell
> you the address and size of the function that was jitted. You can then
> tell gdb to disassemble the code.

I also want to step through the code and print variables. I can find the address to read for each variable by reading the assembly, or maybe by watching the JIT's debug output, but that's significantly more painful than the standard gdb interface. This will block our release too, since we can't ask most people to read assembly. We'll need to get work done for this (or do it ourselves) on both your end and gdb's end, since gdb doesn't yet have hooks to register debug info like the exception system does.

I've started a page on the wiki to track the state of the art for this: http://wiki.llvm.org/HowTo:_Tell_GDB_about_JITted_code

Thanks!
I'm adding the gdb list because it appears there's currently no way to tell gdb about newly-JITted code. That is, it's not an LLVM-specific problem.

There appear to be two techniques in common use to debug dynamically-generated code despite this. First, as Evan suggests below, we can have the JIT print the address range that it's written a function into, have gdb disassemble that, set breakpoints at particular addresses, and print variables by knowing what register they live in. Second, as described at http://www.mono-project.com/Debugging#Debugging_with_GDB_in_XDEBUG_mode, we can write out a full object file with dwarf tables, and use add-symbol-file to get gdb to load that on demand.

Neither of these is ideal. add-symbol-file is better, but it doesn't allow us to set breakpoints inside the JITted code until it's generated, and it doesn't let those breakpoints follow the code as it's re-optimized and re-translated. It currently also requires user interaction, but it's possible that we could write a -gdb.py file to reload the debug information every time the user gets to the gdb prompt. There may be other problems I haven't thought of.

It would be better to have an interface through which a JITting library could tell gdb about newly-generated code. This could resemble the overlay interface (http://sourceware.org/gdb/current/onlinedocs/gdb_13.html#SEC108) or the interface through which dynamic loaders tell gdb about newly-loaded code. There are a couple of considerations that are specific to JITting, of course:

1. A JIT compiler generates new code frequently, and having to do lots of extra work, especially while the debugger isn't attached, may hurt performance.
2. Translated code gets duplicated, replaced, and freed, and gdb needs to modify its breakpoints to keep up.

I don't really know enough about the internals of LLVM or gdb to make any recommendations, but I think it would be useful to find some way for them (and other debuggers and JITs) to talk to each other.
Jeffrey

On Fri, Mar 27, 2009 at 1:48 PM, Evan Cheng <evan.cheng at apple.com> wrote:
> Run with -debug-only=jit.
>
> Break on line 1148 of JITEmitter.cpp. The debugging message will tell
> you the address and size of the function that was jitted. You can then
> tell gdb to disassemble the code.
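One possible shape for the JIT-to-debugger interface proposed above, in the spirit of the dynamic-loader hook the message mentions, is sketched here. Everything in it is an assumption made for illustration: the names (JitCodeEntry, JitDescriptor, g_jit_descriptor, JitDebugHook, RegisterCode, UnregisterCode) are hypothetical and are not an existing LLVM or gdb API, the noinline attribute and empty asm statement are GCC/Clang specific, and no locking is shown. The idea is that the JIT keeps a list of in-memory object files describing its code and calls an empty function after each change, so a debugger could break on that function and re-read the list.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical: one entry per chunk of JITted code, pointing at an in-memory
// object file (with dwarf tables) that describes it.
struct JitCodeEntry {
  JitCodeEntry* next = nullptr;
  JitCodeEntry* prev = nullptr;
  const char* symfile_addr = nullptr;
  std::uint64_t symfile_size = 0;
};

enum class JitAction : std::uint32_t { kNone, kRegister, kUnregister };

// Hypothetical well-known global a debugger would read when the hook is hit.
struct JitDescriptor {
  JitAction action = JitAction::kNone;
  JitCodeEntry* relevant = nullptr;  // entry the last action refers to
  JitCodeEntry* first = nullptr;     // head of the list of live entries
};

JitDescriptor g_jit_descriptor;

// Deliberately empty and never inlined, so a debugger has a stable place to
// set a breakpoint. Without a debugger this costs one call and return.
extern "C" __attribute__((noinline)) void JitDebugHook() { asm volatile(""); }

// Announce newly generated code: push the entry on the list, then hit the hook.
void RegisterCode(JitCodeEntry* entry) {
  assert(entry != nullptr);
  entry->prev = nullptr;
  entry->next = g_jit_descriptor.first;
  if (entry->next) entry->next->prev = entry;
  g_jit_descriptor.first = entry;
  g_jit_descriptor.relevant = entry;
  g_jit_descriptor.action = JitAction::kRegister;
  JitDebugHook();
}

// Announce that code was freed or replaced: unlink the entry, then hit the hook.
void UnregisterCode(JitCodeEntry* entry) {
  assert(entry != nullptr);
  if (entry->prev) entry->prev->next = entry->next;
  else g_jit_descriptor.first = entry->next;
  if (entry->next) entry->next->prev = entry->prev;
  entry->next = nullptr;
  entry->prev = nullptr;
  g_jit_descriptor.relevant = entry;
  g_jit_descriptor.action = JitAction::kUnregister;
  JitDebugHook();
}
```

This layout speaks to both considerations in the message: registration is a few pointer writes plus an empty call when no debugger is attached, and the unregister path gives the debugger a chance to drop or move breakpoints when translated code is freed or replaced. Whether gdb could resolve pending breakpoints this way is left open here, as it is in the message.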