thr3ads.net - llvm dev - [LLVMdev] MCJIT debugger registration interface. [Aug 2014]

Eric Christopher

2014-Aug-11 17:06 UTC

[LLVMdev] MCJIT debugger registration interface.

On Sun, Aug 10, 2014 at 3:37 PM, Filip Pizlo <fpizlo at apple.com> wrote:
>
>> On Aug 10, 2014, at 3:07 PM, Eric Christopher <echristo at gmail.com> wrote:
>>
>>> On Sun, Aug 10, 2014 at 1:43 PM, Filip Pizlo <fpizlo at apple.com> wrote:
>>> I think this ignores the real problem with the MCJIT debugging
>>> interface: it doesn't give MCJIT clients any way of directly accessing and
>>> parsing the debug metadata.
>>
>> Parsing the existing debug metadata isn't necessarily a good idea
>> anyhow. It's not a stable format and is quite large.
>
> I agree. I suspect that a better solution is to have the smarts for
> grokking the debug data inside LLVM, possibly borrowing logic from lldb. For
> starters clients like WebKit will want a machine-offset-to-debug-info map, which
> ain't rocket science - but currently parsing dwarf inside the LLVM client is
> the only way to do this afaict.

There's some support (originally forked from lldb) already in llvm to
do this. Look at lib/DebugInfo, it's what llvm-dwarfdump, etc are
based upon.
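
A minimal sketch of the machine-offset-to-debug-info map mentioned above, assuming the client has already obtained (offset, file, line) rows from the line table by some means. It does not use lib/DebugInfo or any LLVM header; `SourceLoc`, `OffsetToLineMap`, `addRow`, `setEnd` and `lookup` are hypothetical names chosen for illustration only.

```cpp
#include <cassert>
#include <cstdint>
#include <iterator>
#include <map>
#include <string>
#include <utility>

// Hypothetical client-side types; none of these names come from LLVM.
struct SourceLoc {
  std::string file;
  unsigned line;
};

class OffsetToLineMap {
public:
  // Record that the code starting at `offset` (bytes from the start of the
  // JIT'd function) belongs to `loc`, in the spirit of a line-table row.
  void addRow(std::uint64_t offset, SourceLoc loc) {
    rows_[offset] = std::move(loc);
  }

  // Mark the first offset past the covered code. Until this is called the
  // map covers nothing and every lookup returns nullptr.
  void setEnd(std::uint64_t endOffset) {
    assert(rows_.empty() || endOffset > rows_.rbegin()->first);
    end_ = endOffset;
  }

  // Return the location of the last row that starts at or before `offset`,
  // or nullptr if `offset` is outside the covered range.
  const SourceLoc *lookup(std::uint64_t offset) const {
    if (offset >= end_)
      return nullptr;
    auto it = rows_.upper_bound(offset); // first row strictly after offset
    if (it == rows_.begin())
      return nullptr;
    return &std::prev(it)->second;
  }

private:
  std::map<std::uint64_t, SourceLoc> rows_;
  std::uint64_t end_ = 0;
};
```

A client would fill the map once per JIT'd function and then call `lookup` with a return address or sample offset. An ordered map keeps the lookup logarithmic without requiring the rows to arrive sorted.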
>>
>>> WebKit, and likely other non-C/C++ clients of MCJIT, will not want
>>> the MCJIT to register anything with the system debugger. Non-C languages usually
>>> have a different set of debugging interfaces and it's up to the client of
>>> LLVM to arrange to glue the debugging information that the MCJIT knows about to
>>> the debugging interface that the LLVM client knows about. The mcjit's
>>> current architecture makes this extremely awkward.
>>>
>>> This is part of a bigger problem in the MCJIT API: it is designed
>>> to work like an execution engine for C programs despite the fact that the most
>>> compelling use of MCJIT is a higher-tier JIT that is part of a mixed-mode or
>>> tiered runtime for non-C languages. Is there some client of the MCJIT that
>>> actually benefits from the MCJIT pretending to be an execution engine for C
>>> programs?  Is there a reason why this client should get more attention than the
>>> seemingly more compelling non-C use cases?
>>
>> The debug metadata is largely based around dwarf debug information,
>> but it isn't a C language based format. I think this is a misleading
>> assertion you make.
>
> That would be a misleading assertion indeed, but it's not the one
> I'm making. Let me restate.
>
> Clients of optimizing JIT compilers are usually going to want to have some
> finer-grained control over how that JIT presents debug data to the debugger.
> Probably all that we want is: the JIT offers its debug data to its client, and
> the client decides if, and how, this data is presented to any debugger (lldb,
> gdb, or whatever). A reasonable default can of course be provided, if it leads
> to a good API.
>
> The MCJIT is currently ill suited to this kind of thing because it pretends
> to be a black box execution engine for LLVM IR. This black box then makes
> further assumptions that make sense for programs that target the C runtime. I
> believe that life would be easier if the task of generating code and the task of
> linking and executing it were better separated in the API.

I think there are two things here, dwarf level support for things like
line numbers, variable locations, and even some basic type
information. Then there's language support like you'd want to see
debugging a high level language that can't be fully described or has
run time effects - a debugging interface that can be called into for
that could be useful, but I'm not seeing that as necessarily something
that MCJIT would vend but something on top of it. I.e. how a debugger
would handle (bad example here, but...) something like Obj-C or Swift.
>
>>
>> Also, it's your most compelling use case, not the most compelling.
>
> If it isn't the most compelling, then can you provide an example of an
> MCJIT client that benefits from the current design?
>
> I suspect that most other MCJIT clients will do some similar things to what
> WebKit does:
>
> - custom runtime that doesn't behave like a C linker.
>
> - custom debugging infrastructure; even if lldb integration is provided,
>   the client's runtime will want lots of control.
>
> - multiple compiler tiers or mixed-mode execution.
>
> - source language that is not like C.
>
> These four things apply to many systems and it would be cool if LLVM became
> easier to use for those. If you believe that these things are not compelling,
> then can you describe what kind of system you envision MCJIT being used for?
>
Oh, I agree they'd be cool to have as well, but there's also languages
like Swift and Julia that use the JIT. There are all of the
OpenGL/OpenCL/OpenACC accelerator type compilation uses, etc. Just
saying that the Webkit JavaScript compilation strategy isn't the only
compelling use case.

Mostly I think we're in agreement that this sort of functionality
would be useful, just where it goes and whether or not the existing
information that we can vend is also useful.

-eric

>>
>> -eric
>>
>>>
>>> -Filip
>>>
>>>> On Aug 1, 2014, at 6:10 PM, Lang Hames <lhames at gmail.com> wrote:
>>>>
>>>> Hi All,
>>>>
>>>> I'd like to revisit the MCJIT debugger-registration system,
>>>> as the existing system has a few flaws, some of which are seriously problematic.
>>>>
>>>> The 20,000 foot overview of the existing scheme (implemented in
>>>> llvm/lib/ExecutionEngine/RuntimeDyld/GDBRegistrar.cpp and friends), as I
>>>> understand it, is as follows:
>>>>
>>>> We have two symbols in MCJIT that act as fixed points for the
>>>> debugger to latch on to:
>>>>
>>>> __jit_debug_register_code is a no-op function that the debugger
>>>> can set a breakpoint on.  MCJIT will call this function to notify the debugger
>>>> when an object file is loaded.
>>>>
>>>> __jit_debug_descriptor is the head of a C linked list data
>>>> structure that contains pointers to in-memory object files. The ELF/MachO
>>>> headers of the in memory object files will have had their vaddrs fixed up by the
>>>> JIT to point to where each of the linked sections reside in memory.
>>>>
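
For readers unfamiliar with the two symbols, here is a sketch of how they fit together. The struct layouts follow the JIT compilation interface documented in the GDB manual; `registerObject` is a hypothetical helper written for this illustration and is not the code in GDBRegistrar.cpp. It assumes a GCC-compatible compiler and ignores locking.

```cpp
#include <cassert>
#include <stdint.h>

extern "C" {
typedef enum {
  JIT_NOACTION = 0,
  JIT_REGISTER_FN,
  JIT_UNREGISTER_FN
} jit_actions_t;

// One node per in-memory object file.
struct jit_code_entry {
  struct jit_code_entry *next_entry;
  struct jit_code_entry *prev_entry;
  const char *symfile_addr; // start of the in-memory object file
  uint64_t symfile_size;    // its size in bytes
};

struct jit_descriptor {
  uint32_t version;
  uint32_t action_flag; // a jit_actions_t value
  struct jit_code_entry *relevant_entry;
  struct jit_code_entry *first_entry;
};

// The debugger puts a breakpoint on this function. The noinline attribute
// and the empty asm keep calls to it from being optimized away.
void __attribute__((noinline)) __jit_debug_register_code() {
  asm volatile("" ::: "memory");
}

// The debugger reads this variable by name. Version must be 1.
struct jit_descriptor __jit_debug_descriptor = {1, 0, nullptr, nullptr};
}

// Hypothetical helper: link `entry` at the head of the list, describe the
// action in the descriptor, then hit the breakpoint location.
inline void registerObject(jit_code_entry *entry, const char *buf,
                           uint64_t size) {
  assert(entry && buf);
  entry->symfile_addr = buf;
  entry->symfile_size = size;
  entry->prev_entry = nullptr;
  entry->next_entry = __jit_debug_descriptor.first_entry;
  if (entry->next_entry)
    entry->next_entry->prev_entry = entry;
  __jit_debug_descriptor.first_entry = entry;
  __jit_debug_descriptor.relevant_entry = entry;
  __jit_debug_descriptor.action_flag = JIT_REGISTER_FN;
  __jit_debug_register_code();
}
```

Note that the entry only carries a pointer and a size, which is why the load addresses end up patched into the object file's own headers in the existing scheme.
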
>>>> There are a couple of problems with this system: (1) Modifying
>>>> object-file headers in-place violates some internal LLVM contracts. In
>>>> particular, the object files may be backed by read-only memory. This has caused
>>>> crashes in the JIT that have forced me to revert support for debugger
>>>> registration on the MachO side (We really want to replace this on the ELF side
>>>> soon too). (2) The JIT has no way of knowing whether a debugger is attached,
>>>> which means keeping object files in memory even if they're not being used,
>>>> just in case there is an attached debugger that needs them.
>>>>
>>>> We'd really like to come up with a system that doesn't
>>>> have these drawbacks. That is, a system where the object files remain
>>>> unmodified, and the JIT knows if/when a debugger attaches so that it can
>>>> generate the relevant information on the fly.
>>>>
>>>> It would be great if the debugger experts (and particularly
>>>> anyone who has experience on both the debugger and the JIT side of things) could
>>>> weigh in on these issues. In particular:
>>>>
>>>> (1) Is there a reason we bake the vmaddrs into the object file
>>>> headers, or could they just as easily be passed in a side-table so as to keep
>>>> the object untouched?
>>>>
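
One way to picture the side-table idea in question (1) is sketched below, under the assumption that the JIT knows each section's name and final load address at link time. `SectionAddressTable` and `LoadedObjectView` are hypothetical names, not LLVM types; the point is only that the object bytes stay `const` while the addresses travel alongside them.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical side table: section name -> address the JIT loaded it at.
class SectionAddressTable {
public:
  void record(const std::string &section, std::uint64_t loadAddr) {
    addrs_[section] = loadAddr;
  }

  // Writes the load address to `out` and returns true if the section was
  // recorded; leaves `out` untouched and returns false otherwise.
  bool lookup(const std::string &section, std::uint64_t &out) const {
    auto it = addrs_.find(section);
    if (it == addrs_.end())
      return false;
    out = it->second;
    return true;
  }

private:
  std::map<std::string, std::uint64_t> addrs_;
};

// What a JIT could hand to a debugger or client: the untouched object bytes
// plus the table. Both members are const references, so nothing here can
// write to memory that may be read-only.
struct LoadedObjectView {
  const std::vector<char> &objectBytes;
  const SectionAddressTable &addresses;
};
```

A consumer would parse section headers from `objectBytes` as usual and consult `addresses` wherever it would otherwise have read a patched address field.
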
>>>> (2) Is there a canonical way for the debugger to communicate to
>>>> a JIT that it's interested in inspecting the JIT's output? If we're
>>>> going to use breakpoints (or something like that) to signal to the debugger when
>>>> objects have been linked, is it reasonable to have an API that the debugger can
>>>> call in to to request the information it's looking for? If the JIT actually
>>>> receives a call then it would give us a chance to lazily populate the necessary
>>>> data structures.
>>>>
>>>> Regards,
>>>> Lang.
>>>>
>>>>
>>>> _______________________________________________
>>>> LLVM Developers mailing list
>>>> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
>>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev

David Chisnall

2014-Aug-11 21:02 UTC


[LLVMdev] MCJIT debugger registration interface.

On 11 Aug 2014, at 18:06, Eric Christopher <echristo at gmail.com> wrote:

> There's some support (originally forked from lldb) already in llvm to
> do this. Look at lib/DebugInfo, it's what llvm-dwarfdump, etc are
> based upon.

Now that lldb is following trunk, it would be really great if some of this could
be unified.  Every time we find a bug, we end up fixing it in both places
(sometimes we remember, sometimes we find the same bug twice).

David

Lang Hames

2014-Aug-12 06:04 UTC


[LLVMdev] MCJIT debugger registration interface.

Hi All,

> I think this ignores the real problem with the MCJIT debugging interface:
> it doesn't give MCJIT clients any way of directly accessing and parsing the
> debug metadata.

Sorry - it wasn't clear from my original post, but I'm hoping to improve
debugging APIs in general, not just for the system debugger.

I think there are two orthogonal concerns here - (1) the debug info format (and
tools for parsing it), and (2) the APIs for getting the debug info to the people
who need it.

I want to keep these two things separate to allow for clients passing through
debug-info or other annotations that LLVM/LLDB has no idea how to parse.

So, here's a sketch of a partial solution for MCJIT clients (we'll leave
the system debugger to a follow-up email):

On point one, my inclination is that we should use an existing stable debug info
format. Dwarf seems an obvious candidate, given the level of support in LLVM. As
noted, this shouldn't matter to the client - I think there's general
agreement that the debug info parsing support should be available in LLVM/LLDB.
The client shouldn't have to care about debug-info format specifics unless
they want to. (If anybody has a use-case where that wouldn't work, please
speak up).

Regarding the second point, my current (vague) plan is to introduce a utility
class that, when attached to the execution engine, records the relocated debug
info sections for each JIT'd object. Clients should be able to query this
object to access the debug info sections. We would provide, either in LLVM or
LLDB, debug info parsers that wrap this class to parse the contained debug info.

My intent is that use of this API would look something like:

ExecutionEngine *EE = ...;
DebugInfoListener *DI = new DebugInfoListener(...);
EE->addEventListener(DI);
EE->addModule(Foo);
EE->addModule(Bar);

MCJITDebugInfoParser *DIP = createMCJITDebugInfoParser(DI);
DIP...;
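
To make the shape of that proposal concrete, here is a self-contained toy version of the listener half. It uses no LLVM headers: `ToyEngine`, `EventListener`, `DebugSection`, `objectEmitted` and `find` are hypothetical stand-ins, and only the names `DebugInfoListener` and `addEventListener` are borrowed from the sketch above. It shows the recording and querying pattern only and does not parse anything.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// A relocated debug section as the listener would see it.
struct DebugSection {
  std::string name;                // e.g. ".debug_line"
  std::vector<std::uint8_t> bytes; // contents after relocation
};

// Hypothetical notification interface the engine calls into.
class EventListener {
public:
  virtual ~EventListener() {}
  virtual void objectEmitted(const std::string &objectName,
                             const std::vector<DebugSection> &sections) = 0;
};

// Records the debug sections of every object it is told about.
class DebugInfoListener : public EventListener {
public:
  void objectEmitted(const std::string &objectName,
                     const std::vector<DebugSection> &sections) override {
    recorded_[objectName] = sections;
  }

  // Returns the recorded section, or nullptr if the object or section is
  // unknown. The pointer stays valid until that object is emitted again.
  const DebugSection *find(const std::string &objectName,
                           const std::string &sectionName) const {
    auto it = recorded_.find(objectName);
    if (it == recorded_.end())
      return nullptr;
    for (const DebugSection &s : it->second)
      if (s.name == sectionName)
        return &s;
    return nullptr;
  }

private:
  std::map<std::string, std::vector<DebugSection>> recorded_;
};

// Stand-in for the execution engine: "emitting" an object just notifies
// the registered listeners. It does not own them.
class ToyEngine {
public:
  void addEventListener(EventListener *l) {
    assert(l);
    listeners_.push_back(l);
  }
  void emitObject(const std::string &name,
                  const std::vector<DebugSection> &sections) {
    for (EventListener *l : listeners_)
      l->objectEmitted(name, sections);
  }

private:
  std::vector<EventListener *> listeners_;
};
```

A parser in the role of `MCJITDebugInfoParser` would then be constructed from the listener and call `find` for the sections it understands, which keeps the format question separate from the delivery question.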

Any thoughts/comments on this (admittedly very vague) proposal are very welcome.
Assuming it sounds reasonable so far, I'm going to start hacking up some
patches and basic use cases to serve as a basis for further discussion (and a
tutorial if the eventual proposal is adopted).

Cheers,
Lang.
