thr3ads.net - llvm dev - [LLVMdev] Function binding [Sep 2005]

Andreas Fredriksson

2005-Sep-10 11:07 UTC

[LLVMdev] Function binding

Hey list,

I'm looking for information on how programs that span multiple LLVM
modules work at runtime, especially wrt. symbol handling when running
in a JIT setting. To give some background, I'm developing a language
that targets LLVM as a backend, and I'd like my translation units to
map to LLVM modules as closely as possible.

What I'm looking for here is something similar to how Java or Python
handles inter-module dependencies at runtime, where they load modules (or
classes, in the Java case) as necessary, and where different modules
can cooperate during different runs of the same program depending on
the code path that is taken.

Is it possible to get a hook call when a JITed module encounters a
symbol reference it can't resolve locally? My current solution is
based upon pessimizations that force the loading of all dependent
modules, but that's wasteful in many cases when only some of those
dependencies are actually required for execution.

That said, I would also like to examine the possibility to recompile
modules in the running system on the fly from source, so that it is
possible to update modules as long as their interfaces stay
compatible. Can LLVM freeze the JIT in a safe place and unload
modules?

I'm also curious to find out how the external symbols referenced from
the C frontend are resolved (such as printf or other functions in
libc). I assume there is a dlsym() call somewhere depending on the
libs listed in the module, is this correct? Does this happen at module
load time or at some later point while executing?

Finally, is the LLVM linker really required for a system like this? I
know the JIT is happy executing my bytecode modules as long as they
are self-contained, but on-demand loading is a requirement for this
(test) project. Currently all I'm getting is a hard error from the
runtime complaining that a referenced symbol is undefined.

Any information (or pointers to information) on the above would be
very helpful to me.

Thanks,
Andreas

-- 
"Give a man a fire and he's warm for a day; set him on fire and
he's warm for the rest of his life" -- Terry Pratchett

Chris Lattner

2005-Sep-12 06:12 UTC

On Sat, 10 Sep 2005, Andreas Fredriksson wrote:
> Hey list,
>
> I'm looking for information on how programs that span multiple LLVM
> modules work at runtime, especially wrt. symbol handling when running
> in a JIT setting. To give some background, I'm developing a language
> that targets LLVM as a backend, and I'd like my translation units to
> map to LLVM modules as closely as possible.

Ok.  Currently the LLVM JIT just knows about a single module.  I think it
would be very useful to extend this to support multiple modules at a time,
where a function reference consults a symbol table to determine the right
module to compile from.
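One way to picture the symbol-table idea is the sketch below. It is not LLVM code: ModuleStub, SymbolIndex, buildIndex, and findDefiningModule are hypothetical names, and a module is reduced to a name plus the list of functions it defines.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for an LLVM module: a name and the functions it defines.
struct ModuleStub {
    std::string name;
    std::vector<std::string> definedFunctions;
};

// Maps a function name to the index of the module that defines it.
using SymbolIndex = std::map<std::string, std::size_t>;

// Build the index over all modules. A duplicate definition is reported as an
// error, roughly what a linker would do (an assumption of this sketch).
SymbolIndex buildIndex(const std::vector<ModuleStub>& modules) {
    SymbolIndex index;
    for (std::size_t i = 0; i < modules.size(); ++i) {
        for (const std::string& fn : modules[i].definedFunctions) {
            if (!index.emplace(fn, i).second)
                throw std::runtime_error("duplicate definition of " + fn);
        }
    }
    return index;
}

// Return the module to compile `fn` from, or nullptr if no module defines it.
const ModuleStub* findDefiningModule(const std::vector<ModuleStub>& modules,
                                     const SymbolIndex& index,
                                     const std::string& fn) {
    auto it = index.find(fn);
    return it == index.end() ? nullptr : &modules[it->second];
}
```

A reference to a function that no module defines comes back as nullptr, which is the point where a fallback such as a process-level lookup would take over.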

In the context of C/C++, imagine completely skipping the link step. 
Instead of linking, you could just present the JIT with a list of .o files 
to load and execute.  If it could execute from multiple modules at a time, 
it would "just work" as if linking had occurred.

> What I'm looking for here is something similar to how Java or Python
> handles inter-module dependencies at runtime, where they load modules (or
> classes, in the Java case) as necessary, and where different modules
> can cooperate during different runs of the same program depending on
> the code path that is taken.

I think this is another very logical application of this idea.

> Is it possible to get a hook call when a JITed module encounters a
> symbol reference it can't resolve locally?

Yes, sort of.  Look at lib/ExecutionEngine/JIT/Intercept.cpp.
getPointerToNamedFunction contains logic that works like this:

1. If this is one of the very few functions the JIT knows about, handle
    it.
2. Otherwise, call 'dlsym' on the local process to resolve the address.
3. Otherwise abort.
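The three steps above can be sketched as follows. This is not the code in Intercept.cpp: resolveNamedFunction, Lookup, and the builtins map are hypothetical names, the process-level lookup is passed in as a callback where the JIT would call dlsym, and the sketch throws instead of aborting so the failure path can be exercised.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <stdexcept>
#include <string>

// Hypothetical callback type: returns an address, or nullptr if not found.
using Lookup = std::function<void*(const std::string&)>;

// Resolve `name` in the order described above:
//   1. functions the resolver knows about directly,
//   2. a process-level lookup (dlsym in the JIT; a callback in this sketch),
//   3. failure (the JIT aborts; this sketch throws instead).
void* resolveNamedFunction(const std::string& name,
                           const std::map<std::string, void*>& builtins,
                           const Lookup& processLookup) {
    auto it = builtins.find(name);
    if (it != builtins.end())
        return it->second;                       // step 1
    if (processLookup) {
        if (void* addr = processLookup(name))
            return addr;                         // step 2
    }
    throw std::runtime_error("unresolved symbol: " + name);  // step 3
}
```

Under these assumptions, searching additional modules would amount to one more lookup step placed before the failure case.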

It would be pretty straightforward to extend that code, or the callers of
that code, to search multiple modules.

> My current solution is
> based upon pessimizations that force the loading of all dependent
> modules, but that's wasteful in many cases when only some of those
> dependencies are actually required for execution.

Yup.

> That said, I would also like to examine the possibility to recompile
> modules in the running system on the fly from source, so that it is
> possible to update modules as long as their interfaces stay
> compatible. Can LLVM freeze the JIT in a safe place and unload
> modules?

Not really.  However, it can do the equivalent thing: it can replace code
for functions that have already been compiled with new code (see
ExecutionEngine::recompileAndRelinkFunction).  The semantics of this are
that any future invocations of the function will call the newly compiled
function.  If there are any invocations of the function on the stack
(currently executing), they will finish executing the old function; only
new calls into the function will get the new code.  This avoids having
the JIT keep track of potentially very expensive mapping information.

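The replacement semantics can be illustrated with one level of indirection. The sketch below is only a model of the behaviour described above, not a description of how recompileAndRelinkFunction is implemented; currentImpl, callFunction, oldVersion, and newVersion are made-up names.

```cpp
#include <cassert>

using Impl = int (*)(int);

// Hypothetical indirection slot: every call to the function goes through it.
static Impl currentImpl = nullptr;

// The "recompiled" body that future calls should reach.
int newVersion(int x) { return x * 10; }

int oldVersion(int x) {
    // Simulate a recompile that happens while this invocation is still on
    // the stack: the slot is repointed, but this frame keeps running old code.
    currentImpl = &newVersion;
    return x + 1;
}

// Callers never bind to oldVersion or newVersion directly.
int callFunction(int x) { return currentImpl(x); }
```

With currentImpl initially pointing at oldVersion, the first callFunction(2) finishes in the old body and returns 3, and the next callFunction(2) reaches the new body and returns 20. No frame on the stack has to be rewritten.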
> I'm also curious to find out how the external symbols referenced from
> the C frontend are resolved (such as printf or other functions in
> libc). I assume there is a dlsym() call somewhere depending on the
> libs listed in the module, is this correct? Does this happen at module
> load time or at some later point while executing?

Yup, see above.  These happen lazily as the process needs the symbols.
The address of 'printf' is inserted into the JIT's symbol table just like
any JIT'd function's address.

> Finally, is the LLVM linker really required for a system like this? I
> know the JIT is happy executing my bytecode modules as long as they
> are self-contained, but on-demand loading is a requirement for this
> (test) project. Currently all I'm getting is a hard error from the
> runtime complaining that a referenced symbol is undefined.

Currently, yes, it does require this.  However, I think it would be great
for the JIT to have a list of Modules that are currently 'open' that it
can generate code for, and for this list to be dynamically mutable.  Any
help adding the functionality to the JIT would be greatly appreciated!
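A minimal sketch of what such a mutable module list might look like, including a hook that fires when no open module defines a symbol. None of this exists in the JIT as described here: ModuleSet, OpenModule, and MissHook are hypothetical, and a module is reduced to a name plus the set of functions it defines.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for an open module.
struct OpenModule {
    std::string name;
    std::set<std::string> functions;
};

class ModuleSet {
public:
    // Called when no open module defines `fn`. The hook may open another
    // module on demand and returns true if it did.
    using MissHook = std::function<bool(ModuleSet&, const std::string&)>;

    void open(OpenModule m) { modules_.push_back(std::move(m)); }

    // Remove a module by name; returns false if it was not open.
    bool close(const std::string& name) {
        auto it = std::find_if(modules_.begin(), modules_.end(),
            [&](const OpenModule& m) { return m.name == name; });
        if (it == modules_.end()) return false;
        modules_.erase(it);
        return true;
    }

    void setMissHook(MissHook h) { hook_ = std::move(h); }

    // Name of the open module defining `fn`, or "" if it stays unresolved.
    std::string moduleFor(const std::string& fn) {
        std::string found = search(fn);
        if (found.empty() && hook_ && hook_(*this, fn))
            found = search(fn);  // retry once after the hook opened something
        return found;
    }

private:
    std::string search(const std::string& fn) const {
        for (const OpenModule& m : modules_)
            if (m.functions.count(fn)) return m.name;
        return "";
    }

    std::vector<OpenModule> modules_;
    MissHook hook_;
};
```

In this model the hook plays the role of the on-demand loader asked about in the original question: a front end could use it to locate and open the module for a missing symbol instead of loading every dependency up front.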

-Chris

-- 
http://nondot.org/sabre/
http://llvm.org/
