On Sun, Aug 10, 2014 at 3:37 PM, Filip Pizlo <fpizlo at apple.com> wrote:
>
>> On Aug 10, 2014, at 3:07 PM, Eric Christopher <echristo at gmail.com> wrote:
>>
>>> On Sun, Aug 10, 2014 at 1:43 PM, Filip Pizlo <fpizlo at apple.com> wrote:
>>> I think this ignores the real problem with the MCJIT debugging
>>> interface: it doesn't give MCJIT clients any way of directly accessing
>>> and parsing the debug metadata.
>>
>> Parsing the existing debug metadata isn't necessarily a good idea
>> anyhow. It's not a stable format and is quite large.
>
> I agree. I suspect that a better solution is to have the smarts for
> grokking the debug data inside LLVM, possibly borrowing logic from lldb.
> For starters clients like WebKit will want a machine-offset-to-debug-info
> map, which ain't rocket science - but currently parsing dwarf inside the
> LLVM client is the only way to do this afaict.

There's some support (originally forked from lldb) already in llvm to
do this. Look at lib/DebugInfo, it's what llvm-dwarfdump, etc are
based upon.
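
For illustration, the client-side half of the machine-offset-to-debug-info map mentioned above can be sketched without any LLVM dependency. This assumes the line-table rows have already been decoded by something else (for example code built on lib/DebugInfo); `LineEntry` and `OffsetToLineMap` are hypothetical names, not LLVM API.

```cpp
#include <algorithm>
#include <cstdint>
#include <iterator>
#include <string>
#include <utility>
#include <vector>

// Hypothetical row of an already-decoded line table: code starting at
// `offset` (relative to the start of the JITed code) maps to file:line.
struct LineEntry {
  uint64_t offset;
  std::string file;
  uint32_t line;
};

class OffsetToLineMap {
public:
  // Rows may arrive in any order; sort once so lookups can binary-search.
  explicit OffsetToLineMap(std::vector<LineEntry> rows)
      : rows_(std::move(rows)) {
    std::sort(rows_.begin(), rows_.end(),
              [](const LineEntry &a, const LineEntry &b) {
                return a.offset < b.offset;
              });
  }

  // Returns the last row starting at or before `offset`, or nullptr if
  // `offset` lies before the first row.
  const LineEntry *lookup(uint64_t offset) const {
    auto it = std::upper_bound(
        rows_.begin(), rows_.end(), offset,
        [](uint64_t off, const LineEntry &e) { return off < e.offset; });
    if (it == rows_.begin())
      return nullptr;
    return &*std::prev(it);
  }

private:
  std::vector<LineEntry> rows_;
};
```

A runtime would build one such map per compiled function and query it with a return address minus the function's load address. A sorted vector is used here only because the table is built once and read many times.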
>>
>>> WebKit, and likely other non-C/C++ clients of MCJIT, will not want
>>> the MCJIT to register anything with the system debugger. Non-C
>>> languages usually have a different set of debugging interfaces and
>>> it's up to the client of LLVM to arrange to glue the debugging
>>> information that the MCJIT knows about to the debugging interface
>>> that the LLVM client knows about. The MCJIT's current architecture
>>> makes this extremely awkward.
>>>
>>> This is part of a bigger problem in the MCJIT API: it is designed to
>>> work like an execution engine for C programs despite the fact that
>>> the most compelling use of MCJIT is a higher-tier JIT that is part of
>>> a mixed-mode or tiered runtime for non-C languages. Is there some
>>> client of the MCJIT that actually benefits from the MCJIT pretending
>>> to be an execution engine for C programs? Is there a reason why this
>>> client should get more attention than the seemingly more compelling
>>> non-C use cases?
>>
>> The debug metadata is largely based around dwarf debug information,
>> but it isn't a C language based format. I think this is a misleading
>> assertion you make.
>
> That would be a misleading assertion indeed, but it's not the one I'm
> making. Let me restate.
>
> Clients of optimizing JIT compilers are usually going to want to have
> some finer-grained control over how that JIT presents debug data to the
> debugger. Probably all that we want is: the JIT offers its debug data to
> its client, and the client decides if, and how, this data is presented
> to any debugger (lldb, gdb, or whatever). A reasonable default can of
> course be provided, if it leads to a good API.
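
One possible shape for "the JIT offers its debug data to its client, and the client decides" is a handler callback with a default fallback. The sketch below is purely illustrative: `SketchJIT`, `EmittedDebugData` and `DebugDataHandler` are hypothetical names and do not correspond to anything in MCJIT.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <string>
#include <utility>

// Hypothetical description of one freshly compiled object. The JIT owns
// the bytes; a handler must copy anything it wants to keep.
struct EmittedDebugData {
  std::string objectName;
  const uint8_t *bytes;
  std::size_t size;
};

using DebugDataHandler = std::function<void(const EmittedDebugData &)>;

class SketchJIT {
public:
  // The client installs a handler to take over presentation entirely.
  void setDebugDataHandler(DebugDataHandler handler) {
    handler_ = std::move(handler);
  }

  // Called by the (imaginary) code generator after each object is emitted.
  void objectEmitted(const EmittedDebugData &data) {
    if (handler_)
      handler_(data);      // client decides if and how to present it
    else
      ++defaultPresented_; // stand-in for a default debugger registration
  }

  unsigned defaultPresented() const { return defaultPresented_; }

private:
  DebugDataHandler handler_;
  unsigned defaultPresented_ = 0;
};
```

With no handler installed the default path runs; once a client installs one, nothing reaches a debugger unless the client forwards it.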
>
> The MCJIT is currently ill suited to this kind of thing because it
> pretends to be a black box execution engine for LLVM IR. This black box
> then makes further assumptions that make sense for programs that target
> the C runtime. I believe that life would be easier if the task of
> generating code and the task of linking and executing it were better
> separated in the API.

I think there are two things here. First, dwarf level support for things
like line numbers, variable locations, and even some basic type
information. Second, language support like you'd want to see when
debugging a high level language that can't be fully described or has
run time effects. A debugging interface that can be called into for
that could be useful, but I'm not seeing it as necessarily something
that MCJIT would vend, rather as something on top of it. I.e. how a
debugger would handle (bad example here, but...) something like Obj-C
or Swift.
>
>>
>> Also, it's your most compelling use case, not the most compelling.
>
> If it isn't the most compelling, then can you provide an example of an
> MCJIT client that benefits from the current design?
>
> I suspect that most other MCJIT clients will do some similar things to
> what WebKit does:
>
> - custom runtime that doesn't behave like a C linker.
>
> - custom debugging infrastructure; even if lldb integration is provided,
>   the client's runtime will want lots of control.
>
> - multiple compiler tiers or mixed-mode execution.
>
> - source language that is not like C.
>
> These four things apply to many systems and it would be cool if LLVM
> became easier to use for those. If you believe that these things are not
> compelling, then can you describe what kind of system you envision MCJIT
> being used for?
>
Oh, I agree they'd be cool to have as well, but there are also languages
like Swift and Julia that use the JIT. There are all of the
OpenGL/OpenCL/OpenACC accelerator type compilation uses, etc. Just
saying that the WebKit JavaScript compilation strategy isn't the only
compelling use case.

Mostly I think we're in agreement that this sort of functionality
would be useful; the open questions are just where it goes and whether
or not the existing information that we can vend is also useful.

-eric
>>
>> -eric
>>
>>>
>>> -Filip
>>>
>>>> On Aug 1, 2014, at 6:10 PM, Lang Hames <lhames at gmail.com> wrote:
>>>>
>>>> Hi All,
>>>>
>>>> I'd like to revisit the MCJIT debugger-registration system, as the
>>>> existing system has a few flaws, some of which are seriously
>>>> problematic.
>>>>
>>>> The 20,000 foot overview of the existing scheme (implemented in
>>>> llvm/lib/ExecutionEngine/RuntimeDyld/GDBRegistrar.cpp and friends),
>>>> as I understand it, is as follows:
>>>>
>>>> We have two symbols in MCJIT that act as fixed points for the
>>>> debugger to latch on to:
>>>>
>>>> __jit_debug_register_code is a no-op function that the debugger can
>>>> set a breakpoint on. MCJIT will call this function to notify the
>>>> debugger when an object file is loaded.
>>>>
>>>> __jit_debug_descriptor is the head of a C linked list data structure
>>>> that contains pointers to in-memory object files. The ELF/MachO
>>>> headers of the in memory object files will have had their vaddrs
>>>> fixed up by the JIT to point to where each of the linked sections
>>>> reside in memory.
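
For readers unfamiliar with the two symbols, the sketch below follows the declarations given in the GDB manual's description of its JIT compilation interface, wrapped in `extern "C"` for C++. The `registerObjectFile` helper is a hypothetical name added for illustration and is not the code in GDBRegistrar.cpp.

```cpp
#include <cstdint>

extern "C" {
typedef enum {
  JIT_NOACTION = 0,
  JIT_REGISTER_FN,
  JIT_UNREGISTER_FN
} jit_actions_t;

// One node per in-memory object file.
struct jit_code_entry {
  struct jit_code_entry *next_entry;
  struct jit_code_entry *prev_entry;
  const char *symfile_addr;
  uint64_t symfile_size;
};

struct jit_descriptor {
  uint32_t version;
  uint32_t action_flag; // holds a jit_actions_t value
  struct jit_code_entry *relevant_entry;
  struct jit_code_entry *first_entry;
};

// The debugger sets a breakpoint in this otherwise empty function.
void __attribute__((noinline)) __jit_debug_register_code() {}

// The version is set statically so the debugger can check it at any time.
struct jit_descriptor __jit_debug_descriptor = {1, 0, nullptr, nullptr};
}

// Hypothetical helper: link `entry` at the head of the list, then notify
// the debugger by calling the breakpoint function.
void registerObjectFile(jit_code_entry *entry, const char *data,
                        uint64_t size) {
  entry->symfile_addr = data;
  entry->symfile_size = size;
  entry->prev_entry = nullptr;
  entry->next_entry = __jit_debug_descriptor.first_entry;
  if (entry->next_entry)
    entry->next_entry->prev_entry = entry;
  __jit_debug_descriptor.first_entry = entry;
  __jit_debug_descriptor.relevant_entry = entry;
  __jit_debug_descriptor.action_flag = JIT_REGISTER_FN;
  __jit_debug_register_code();
}
```

Note that the entry only carries a pointer and a size for the object file, which is why section addresses currently have to be written into the object's own headers.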
>>>>
>>>> There are a couple of problems with this system: (1) Modifying
>>>> object-file headers in-place violates some internal LLVM contracts.
>>>> In particular, the object files may be backed by read-only memory.
>>>> This has caused crashes in the JIT that have forced me to revert
>>>> support for debugger registration on the MachO side (We really want
>>>> to replace this on the ELF side soon too). (2) The JIT has no way of
>>>> knowing whether a debugger is attached, which means keeping object
>>>> files in memory even if they're not being used, just in case there is
>>>> an attached debugger that needs them.
>>>>
>>>> We'd really like to come up with a system that doesn't have these
>>>> drawbacks. That is, a system where the object files remain
>>>> unmodified, and the JIT knows if/when a debugger attaches so that it
>>>> can generate the relevant information on the fly.
>>>>
>>>> It would be great if the debugger experts (and particularly anyone
>>>> who has experience on both the debugger and the JIT side of things)
>>>> could weigh in on these issues. In particular:
>>>>
>>>> (1) Is there a reason we bake the vmaddrs into the object file
>>>> headers, or could they just as easily be passed in a side-table so as
>>>> to keep the object untouched?
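
To make the side-table idea in question (1) concrete, here is a minimal sketch in which the object buffer is only ever read and the load addresses are stored next to it. `LoadedObjectSideTable` and `SectionLoadAddress` are hypothetical names, and no existing debugger understands this layout; it only illustrates the data that would need to be passed.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Hypothetical side-table row: where the JIT placed one section.
struct SectionLoadAddress {
  std::string sectionName;
  uint64_t loadAddress;
};

class LoadedObjectSideTable {
public:
  // The object is held through a pointer-to-const, so this class cannot
  // patch its headers; the buffer may live in read-only memory.
  LoadedObjectSideTable(const uint8_t *object, std::size_t size)
      : object_(object), size_(size) {}

  void recordSection(std::string name, uint64_t loadAddress) {
    sections_.push_back({std::move(name), loadAddress});
  }

  // Writes the load address to `out` and returns true if `name` is known.
  bool findLoadAddress(const std::string &name, uint64_t &out) const {
    for (const SectionLoadAddress &s : sections_) {
      if (s.sectionName == name) {
        out = s.loadAddress;
        return true;
      }
    }
    return false;
  }

  const uint8_t *object() const { return object_; }
  std::size_t size() const { return size_; }

private:
  const uint8_t *object_;
  std::size_t size_;
  std::vector<SectionLoadAddress> sections_;
};
```

A linear scan is used because an object file typically has only a handful of sections; a debugger-facing version would also need a stable binary layout rather than C++ containers.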
>>>>
>>>> (2) Is there a canonical way for the debugger to communicate to a
>>>> JIT that it's interested in inspecting the JIT's output? If we're
>>>> going to use breakpoints (or something like that) to signal to the
>>>> debugger when objects have been linked, is it reasonable to have an
>>>> API that the debugger can call into to request the information it's
>>>> looking for? If the JIT actually receives a call then it would give
>>>> us a chance to lazily populate the necessary data structures.
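
The lazy population described in question (2) could look roughly like the following, assuming the debugger has some way to call into the JIT process. `LazyDebugInfoRegistry` is a hypothetical name; the point of the sketch is only that the cost of building debug data is paid on the first request, not at link time.

```cpp
#include <cstdint>
#include <functional>
#include <map>
#include <string>
#include <utility>
#include <vector>

class LazyDebugInfoRegistry {
public:
  using Builder = std::function<std::vector<uint8_t>()>;

  // Cheap: remember how to build the data, but build nothing yet.
  void addObject(const std::string &name, Builder builder) {
    Slot &slot = slots_[name];
    slot.builder = std::move(builder);
    slot.data.clear();
    slot.built = false;
  }

  // Entry point a debugger would call into. Builds the data on the first
  // request and caches it; returns nullptr for an unknown object.
  const std::vector<uint8_t> *request(const std::string &name) {
    auto it = slots_.find(name);
    if (it == slots_.end())
      return nullptr;
    Slot &slot = it->second;
    if (!slot.built) {
      if (slot.builder)
        slot.data = slot.builder();
      slot.built = true;
    }
    return &slot.data;
  }

private:
  struct Slot {
    Builder builder;
    std::vector<uint8_t> data;
    bool built = false;
  };
  std::map<std::string, Slot> slots_;
};
```

If no debugger ever calls `request`, no builder runs, so nothing has to be kept in memory "just in case" beyond the closures themselves.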
>>>>
>>>> Regards,
>>>> Lang.
>>>>
>>>>
>>>> _______________________________________________
>>>> LLVM Developers mailing list
>>>> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu
>>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev