I didn't get to work on this more last week, but I'll look at incorporating that suggestion.

The other question, of course, is how to do this in LLDB. Right now, what I'm doing is going through and adjusting the load address of every leaf in the section tree. That basically works and gets me backtraces with the correct function names and the ability to set breakpoints at functions in JITed modules. What it doesn't get me yet is line numbers. I suspect that is because the DWARF still refers to the old addresses. I thought relocations should take care of that, but apparently they don't, so I'll have to look at whether to solve this in LLDB or in LLVM. Suggestions are most welcome.

On Wed, May 28, 2014 at 12:53 PM, Greg Clayton <gclayton at apple.com> wrote:

> > On May 28, 2014, at 8:57 AM, Keno Fischer <kfischer at college.harvard.edu> wrote:
> >
> > Hello,
> >
> > I'm finally getting back to getting JIT debugging to work for MCJIT. This has worked for ELF for a while in LLVM and support in lldb was added in January (for ELF). I'm now trying to add support for Mach-O and would appreciate some feedback (though I'm fighting my way through learning the format, I'm still just a novice).
> >
> > My current patchset for llvm is here: https://gist.github.com/loladiro/8d909ddd04e6d7e9a5d0 . I have a corresponding patch for lldb and I basically got this working (modulo line table information, though I'm sure I'm doing something stupid in lldb here).
> >
> > The basic approach is, when a section gets allocated, to rewrite the section's `addr` and update every symbol's `n_value` correspondingly. This is very much in line with what is done for ELF, but I'm not sure if it's the right approach, so I'd appreciate if somebody who has more experience with Mach-O could look at the above patch and give some feedback. If this approach looks sane in general, I'll finish up and post both the LLVM and the LLDB patch for formal review.
>
> The one thing you might want to look into is that the n_value only needs to be updated "if ((N_TYPE & n_type) == N_SECT)" (the symbol is in a section and therefore it has an address value). Other symbols have values that usually don't need to be modified. You might also need to watch out for absolute symbols (if ((N_TYPE & n_type) == N_ABS)) as there are a few that sometimes don't claim to be a symbol that has a valid address, but they actually do point to an address. The symbol named "mach_header" is one such absolute symbol.
>
> If this is all new code, get it as close as you can and then we can work the kinks out once it is in the codebase.
>
> Greg
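The symbol filter Greg describes can be sketched as follows. This is a minimal illustration, not the code from the patch: `Nlist64` mirrors the layout of `struct nlist_64` and the three constants mirror `<mach-o/nlist.h>`, while `SectionMove` and `RelocateSymbols` are hypothetical names introduced here. It assumes each symbol's `n_sect` is a 1-based index into the list of section moves, and it does not treat debugging (stab) entries or the special absolute symbols specially.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Mach-O symbol type constants, mirroring <mach-o/nlist.h>.
constexpr std::uint8_t kNType = 0x0e;  // N_TYPE mask
constexpr std::uint8_t kNAbs  = 0x02;  // N_ABS: absolute symbol
constexpr std::uint8_t kNSect = 0x0e;  // N_SECT: defined in section n_sect

// Same field layout as struct nlist_64.
struct Nlist64 {
  std::uint32_t n_strx;
  std::uint8_t  n_type;
  std::uint8_t  n_sect;   // 1-based section ordinal, 0 means "no section"
  std::uint16_t n_desc;
  std::uint64_t n_value;
};

// Hypothetical record of where one section was placed by the JIT.
struct SectionMove {
  std::uint64_t old_addr;  // address recorded in the object file
  std::uint64_t new_addr;  // address the section was actually loaded at
};

// Slides n_value for symbols that live in a section and leaves every
// other symbol (N_ABS, undefined, ...) untouched. Returns the number of
// symbols that were updated.
std::size_t RelocateSymbols(std::vector<Nlist64>& syms,
                            const std::vector<SectionMove>& moves) {
  std::size_t updated = 0;
  for (Nlist64& sym : syms) {
    if ((sym.n_type & kNType) != kNSect)
      continue;  // not section-relative, so n_value is not an address to slide
    if (sym.n_sect == 0 || sym.n_sect > moves.size())
      continue;  // section ordinal out of range for this sketch
    const SectionMove& m = moves[sym.n_sect - 1];
    sym.n_value = sym.n_value - m.old_addr + m.new_addr;
    ++updated;
  }
  return updated;
}
```

Absolute symbols such as "mach_header" fall through the first check here, so anything that needs them adjusted would have to handle `kNAbs` as a separate, explicit case.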
I think I'm getting closer. The debug_info section is being relocated correctly (I think):

0x00000000: Compile Unit: length = 0x00000045  version = 0x0003  abbr_offset = 0x00000000  addr_size = 0x08  (next CU at 0x00000049)

0x0000000b: TAG_compile_unit [1] *
             AT_producer( "julia" )
             AT_language( DW_LANG_C89 )
             AT_name( "string.jl" )
             AT_stmt_list( 0x00000000 )
             AT_comp_dir( "." )
             AT_APPLE_optimized( 0x01 )
             AT_low_pc( 0x0000000112f5f1c0 )
             AT_high_pc( 0x000006fb )

0x0000002b:     TAG_subprogram [2]
                 AT_low_pc( 0x0000000112f5f1c0 )
                 AT_high_pc( 0x0000000112f5f8bb )
                 AT_frame_base( rbp )
                 AT_MIPS_linkage_name( "julia_parseint_nocheck;18749" )
                 AT_name( "parseint_nocheck" )
                 AT_external( 0x01 )
                 AT_accessibility( DW_ACCESS_private )

0x00000048:     NULL

but lldb is still showing it at the original location:

0x7ff3afca9280: SymbolVendor
0x7ff3afcafa20: Type{0x0000002b} , name = "parseint_nocheck", clang_type = 0x00007ff3ab548df0 void (void)
0x7ff3afca93e0: CompileUnit{0x00000000}, language = "Language(language = 0xafca93e0)", file = './string.jl'
0x7ff3afcafe20: Function{0x0000002b}, mangled = julia_parseint_nocheck;18749, type = 0x7ff3afcafa20

even though the section seems to be loaded correctly:

Sections for 'JIT(0x7fc4230f4e00)(0x00007fc4230f4e00)' (x86_64):
SectID     Type             Load Address                             File Off.  File Size  Flags      Section Name
---------- ---------------- ---------------------------------------  ---------- ---------- ---------- ----------------------------
0x00000100 container        [0x0000000112efccf8-0x0000000112f5f8fb)* 0x000003b0 0x00000950 0x00000000 JIT(0x7fc4230f4e00).__TEXT
0x00000001 code             [0x0000000112f5f1c0-0x0000000112f5f8fb)  0x000003b0 0x0000073b 0x80000400 JIT(0x7fc4230f4e00).__TEXT.__text
0x00000009 eh-frame         [0x0000000112efccf8-0x0000000112efcd68)  0x00000c90 0x00000070 0x6800000b JIT(0x7fc4230f4e00).__TEXT.__eh_frame
0x00000200 container        [0x0000000000000784-0x0000000112efce75)* 0x00000aeb 0x00000160 0x00000000 JIT(0x7fc4230f4e00).__DWARF
0x00000002 dwarf-info       [0x0000000112efcd68-0x0000000112efcdb1)  0x00000aeb 0x00000049 0x02000000 JIT(0x7fc4230f4e00).__DWARF.__debug_info
0x00000003 dwarf-abbrev     [0x00007fc4230f5934-0x00007fc4230f595f)  0x00000b34 0x0000002b 0x02000000 JIT(0x7fc4230f4e00).__DWARF.__debug_abbrev
0x00000004 dwarf-line       [0x0000000112efcdc9-0x0000000112efce75)  0x00000b5f 0x000000ac 0x02000000 JIT(0x7fc4230f4e00).__DWARF.__debug_line
0x00000005 dwarf-str        [0x00007fc4230f5a0b-0x00007fc4230f5a4b)  0x00000c0b 0x00000040 0x02000000 JIT(0x7fc4230f4e00).__DWARF.__debug_str
0x00000006 dwarf-loc                                                 0x00000c4b 0x00000000 0x02000000 JIT(0x7fc4230f4e00).__DWARF.__debug_loc
0x00000007 dwarf-ranges                                              0x00000c4b 0x00000000 0x02000000 JIT(0x7fc4230f4e00).__DWARF.__debug_ranges
0x00000300 container        [0x0000000112efce80-0x0000000112efcec0)* 0x00000c50 0x00000040 0x00000000 JIT(0x7fc4230f4e00).__LD
0x00000008 regular          [0x0000000112efce80-0x0000000112efcec0)  0x00000c50 0x00000040 0x02000000 JIT(0x7fc4230f4e00).__LD.__compact_unwind

(the relocated address is

julia> datapointer(filter(s->s.sectname == "__debug_info",sects)[1])
Ptr{Uint8} @0x0000000112efcd68

)

so it seems like despite knowing the correct load address for the __debug_info section, it's still somehow picking up on the old addresses. I'll keep looking, but if something springs to mind, please let me know.

On Mon, Jun 2, 2014 at 11:47 AM, Keno Fischer <kfischer at college.harvard.edu> wrote:

> I didn't get to work on this more last week, but I'll look at incorporating that suggestion.
>
> The other question, of course, is how to do this in LLDB. Right now, what I'm doing is going through and adjusting the load address of every leaf in the section tree. That basically works and gets me backtraces with the correct function names and the ability to set breakpoints at functions in JITed modules. What it doesn't get me yet is line numbers. I suspect that is because the DWARF still refers to the old addresses. I thought relocations should take care of that, but apparently they don't, so I'll have to look at whether to solve this in LLDB or in LLVM. Suggestions are most welcome.
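The "adjust the load address of every leaf in the section tree" step, together with the check that a DWARF address lands inside a loaded section, can be sketched roughly like this. `Section`, `LoadLeaves` and `FindLeaf` are hypothetical stand-ins written for this illustration and are not LLDB's classes or API; the map of placements is assumed to come from wherever the JIT reports its allocations.

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for one node of a debugger's section tree.
struct Section {
  std::string name;
  std::uint64_t size = 0;
  std::uint64_t load_addr = 0;
  bool loaded = false;
  std::vector<Section> children;  // empty for a leaf section
};

// Assigns a load address to every leaf whose name appears in `placed`.
// Containers are only traversed; they get no address of their own.
void LoadLeaves(Section& s, const std::map<std::string, std::uint64_t>& placed) {
  if (s.children.empty()) {
    auto it = placed.find(s.name);
    if (it != placed.end()) {
      s.load_addr = it->second;
      s.loaded = true;
    }
    return;
  }
  for (Section& child : s.children)
    LoadLeaves(child, placed);
}

// Returns the loaded leaf whose half-open range [load_addr, load_addr + size)
// contains `addr`, or nullptr if no leaf does.
const Section* FindLeaf(const Section& s, std::uint64_t addr) {
  if (s.children.empty()) {
    bool hit = s.loaded && addr >= s.load_addr && addr - s.load_addr < s.size;
    return hit ? &s : nullptr;
  }
  for (const Section& child : s.children)
    if (const Section* found = FindLeaf(child, addr))
      return found;
  return nullptr;
}
```

With the values from the dump above, `__text` placed at 0x0000000112f5f1c0 with size 0x73b contains the AT_low_pc of `parseint_nocheck`, which is the kind of consistency check that separates "the section is loaded at the wrong place" from "the debug info is being read with the wrong addresses".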
We don't currently apply any relocations (that I know of) for debug info in LLDB.

> On Jun 2, 2014, at 12:35 PM, Keno Fischer <kfischer at college.harvard.edu> wrote:
>
> I think I'm getting closer. The debug_info section is being relocated correctly (I think):
>
> 0x00000000: Compile Unit: length = 0x00000045  version = 0x0003  abbr_offset = 0x00000000  addr_size = 0x08  (next CU at 0x00000049)
>
> 0x0000000b: TAG_compile_unit [1] *
>              AT_producer( "julia" )
>              AT_language( DW_LANG_C89 )
>              AT_name( "string.jl" )
>              AT_stmt_list( 0x00000000 )
>              AT_comp_dir( "." )
>              AT_APPLE_optimized( 0x01 )
>              AT_low_pc( 0x0000000112f5f1c0 )
>              AT_high_pc( 0x000006fb )
>
> 0x0000002b:     TAG_subprogram [2]
>                  AT_low_pc( 0x0000000112f5f1c0 )
>                  AT_high_pc( 0x0000000112f5f8bb )
>                  AT_frame_base( rbp )
>                  AT_MIPS_linkage_name( "julia_parseint_nocheck;18749" )
>                  AT_name( "parseint_nocheck" )
>                  AT_external( 0x01 )
>                  AT_accessibility( DW_ACCESS_private )
>
> 0x00000048:     NULL
>
> but lldb is still showing it at the original location:
>
> 0x7ff3afca9280: SymbolVendor
> 0x7ff3afcafa20: Type{0x0000002b} , name = "parseint_nocheck", clang_type = 0x00007ff3ab548df0 void (void)
> 0x7ff3afca93e0: CompileUnit{0x00000000}, language = "Language(language = 0xafca93e0)", file = './string.jl'
> 0x7ff3afcafe20: Function{0x0000002b}, mangled = julia_parseint_nocheck;18749, type = 0x7ff3afcafa20
>
> even though the section seems to be loaded correctly:
>
> Sections for 'JIT(0x7fc4230f4e00)(0x00007fc4230f4e00)' (x86_64):
> SectID     Type             Load Address                             File Off.  File Size  Flags      Section Name
> ---------- ---------------- ---------------------------------------  ---------- ---------- ---------- ----------------------------
> 0x00000100 container        [0x0000000112efccf8-0x0000000112f5f8fb)* 0x000003b0 0x00000950 0x00000000 JIT(0x7fc4230f4e00).__TEXT
> 0x00000001 code             [0x0000000112f5f1c0-0x0000000112f5f8fb)  0x000003b0 0x0000073b 0x80000400 JIT(0x7fc4230f4e00).__TEXT.__text
> 0x00000009 eh-frame         [0x0000000112efccf8-0x0000000112efcd68)  0x00000c90 0x00000070 0x6800000b JIT(0x7fc4230f4e00).__TEXT.__eh_frame
> 0x00000200 container        [0x0000000000000784-0x0000000112efce75)* 0x00000aeb 0x00000160 0x00000000 JIT(0x7fc4230f4e00).__DWARF
> 0x00000002 dwarf-info       [0x0000000112efcd68-0x0000000112efcdb1)  0x00000aeb 0x00000049 0x02000000 JIT(0x7fc4230f4e00).__DWARF.__debug_info
> 0x00000003 dwarf-abbrev     [0x00007fc4230f5934-0x00007fc4230f595f)  0x00000b34 0x0000002b 0x02000000 JIT(0x7fc4230f4e00).__DWARF.__debug_abbrev
> 0x00000004 dwarf-line       [0x0000000112efcdc9-0x0000000112efce75)  0x00000b5f 0x000000ac 0x02000000 JIT(0x7fc4230f4e00).__DWARF.__debug_line
> 0x00000005 dwarf-str        [0x00007fc4230f5a0b-0x00007fc4230f5a4b)  0x00000c0b 0x00000040 0x02000000 JIT(0x7fc4230f4e00).__DWARF.__debug_str
> 0x00000006 dwarf-loc                                                 0x00000c4b 0x00000000 0x02000000 JIT(0x7fc4230f4e00).__DWARF.__debug_loc
> 0x00000007 dwarf-ranges                                              0x00000c4b 0x00000000 0x02000000 JIT(0x7fc4230f4e00).__DWARF.__debug_ranges
> 0x00000300 container        [0x0000000112efce80-0x0000000112efcec0)* 0x00000c50 0x00000040 0x00000000 JIT(0x7fc4230f4e00).__LD
> 0x00000008 regular          [0x0000000112efce80-0x0000000112efcec0)  0x00000c50 0x00000040 0x02000000 JIT(0x7fc4230f4e00).__LD.__compact_unwind
>
> (the relocated address is
>
> julia> datapointer(filter(s->s.sectname == "__debug_info",sects)[1])
> Ptr{Uint8} @0x0000000112efcd68
>
> )
>
> so it seems like despite knowing the correct load address for the __debug_info section, it's still somehow picking up on the old addresses. I'll keep looking, but if something springs to mind, please let me know.
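For readers unfamiliar with what "applying relocations for debug info" would involve, here is a rough sketch of the core operation: patching absolute 64-bit addresses inside a copy of a debug section by a fixed slide. Everything here is an assumption made for illustration. `AbsAddr64` is a hypothetical, simplified fixup record and not the Mach-O `relocation_info` encoding, `SlideAddresses` is not an LLDB or LLVM function, and the code assumes host and target share the same byte order.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical fixup record: an 8-byte absolute address stored at `offset`
// bytes into the section contents (for example an AT_low_pc value).
struct AbsAddr64 {
  std::size_t offset;
};

// Adds `slide` to every address named by `fixups`. All records are validated
// first, so a bad record leaves the buffer untouched and returns false.
bool SlideAddresses(std::vector<std::uint8_t>& section,
                    const std::vector<AbsAddr64>& fixups,
                    std::uint64_t slide) {
  for (const AbsAddr64& f : fixups) {
    if (f.offset > section.size() ||
        section.size() - f.offset < sizeof(std::uint64_t))
      return false;  // address field would run past the end of the section
  }
  for (const AbsAddr64& f : fixups) {
    std::uint64_t value;
    // memcpy avoids unaligned access; assumes host byte order == target's.
    std::memcpy(&value, section.data() + f.offset, sizeof value);
    value += slide;
    std::memcpy(section.data() + f.offset, &value, sizeof value);
  }
  return true;
}
```

Whether such patching belongs in the JIT (so the debugger sees already-relocated DWARF) or in the debugger itself is exactly the open question in this thread; the sketch only shows the mechanics.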