thr3ads.net - llvm dev - [LLVMdev] Redefining function [Jan 2010]

Conrado Miranda

2010-Jan-31 02:22 UTC

[LLVMdev] Redefining function

Albert Graef wrote:
> The way I do this in Pure is to always call global functions in an
> indirect fashion, using an internal global variable which holds the
> current function pointer. When a function definition gets updated, the
> Pure interpreter just jits the new function, changes the global variable
> accordingly, and frees the old code.
>
> Compared to Duncan's suggestion, this has the advantage that you only
> have to recompile the function which was changed. AFAICT, if you use
> replaceAllUsesWith, then the changes ripple through so that you might
> end up re-jiting most of your program.
>
Thought of that before, but I was trying to do it more elegantly and
transparently to the program (which is being written in C/C++). Maybe I'll
go back to that.

Thank you both for the quick replies.

Miranda

PS:
If it's any help, got the svn version and, while running the program, got
this:
The JIT doesn't know how to handle a RAUW on a value it has emitted.
UNREACHABLE executed at
/home/conrado/engines/llvm/lib/ExecutionEngine/JIT/JITEmitter.cpp:1542!

I looked at the function and it's a dummy function. Just looking forward to
seeing that corrected.

Jeffrey Yasskin

2010-Jan-31 05:25 UTC

[LLVMdev] Redefining function

On Sat, Jan 30, 2010 at 6:22 PM, Conrado Miranda
<miranda.conrado at gmail.com> wrote:
> Albert Graef wrote:
>>
>> The way I do this in Pure is to always call global functions in an
>> indirect fashion, using an internal global variable which holds the
>> current function pointer. When a function definition gets updated, the
>> Pure interpreter just jits the new function, changes the global variable
>> accordingly, and frees the old code.
>>
>> Compared to Duncan's suggestion, this has the advantage that you only
>> have to recompile the function which was changed. AFAICT, if you use
>> replaceAllUsesWith, then the changes ripple through so that you might
>> end up re-jiting most of your program.
>
> Thought of that before, but I was trying to do it more elegantly and
> transparently to the program (which is being written in C/C++). Maybe I'll
> go back to that.
>
> Thank you both for the quick replies.
>
> Miranda
>
> PS:
> If it's any help, got the svn version and, while running the program, got
> this:
> The JIT doesn't know how to handle a RAUW on a value it has emitted.
> UNREACHABLE executed at
> /home/conrado/engines/llvm/lib/ExecutionEngine/JIT/JITEmitter.cpp:1542!
>
> I looked at the function and it's a dummy function. Just looking forward to
> seeing that corrected.

The problem here is reasonably complicated. With the JIT, you have two
different worlds that aren't automatically in sync: the IR in your
program, and the machine code generated for one version of that IR.

runFunction(F) is a wrapper around getPointerToFunction(F), which
returns the address of some machine code implementing the function.
runFunction() does _not_ free this machine code when it returns, so
subsequent runFunction() calls don't need to re-compile it, but they
also get the original definition even if the function has changed. The
JIT will automatically destroy the machine code when F is destroyed,
or you can destroy it manually with freeMachineCodeForFunction().
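
To make the caching behaviour concrete, here is a minimal toy model in plain C++. It does not use LLVM at all: ToyFunction, ToyJIT and demo() are hypothetical stand-ins, and only the three method names are borrowed from the ExecutionEngine calls mentioned above. "Compiling" is modelled as taking a snapshot of the function body at that moment.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>

// Hypothetical stand-in for an IR function: its "body" is just x + addend.
struct ToyFunction {
    std::string name;
    int addend;
};

// Hypothetical stand-in for the JIT: "compiling" snapshots the current body.
class ToyJIT {
public:
    using Code = std::function<int(int)>;

    // Modelled on getPointerToFunction(): compile on first request, then reuse.
    const Code& getPointerToFunction(const ToyFunction& f) {
        auto it = cache_.find(f.name);
        if (it == cache_.end()) {
            int snapshot = f.addend;  // the body as it is right now
            it = cache_.emplace(f.name,
                                [snapshot](int x) { return x + snapshot; }).first;
        }
        return it->second;
    }

    // Modelled on runFunction(): a thin wrapper that never frees the code.
    int runFunction(const ToyFunction& f, int arg) {
        return getPointerToFunction(f)(arg);
    }

    // Modelled on freeMachineCodeForFunction(): drop the cached code.
    void freeMachineCodeForFunction(const ToyFunction& f) {
        cache_.erase(f.name);
    }

private:
    std::map<std::string, Code> cache_;
};

// Results of running the scenario described in the text.
struct Observed { int first, stale, fresh; };

inline Observed demo() {
    ToyJIT jit;
    ToyFunction f{"f", 1};
    Observed r{};
    r.first = jit.runFunction(f, 10);  // compiled while addend == 1
    f.addend = 100;                    // the "IR" changes
    r.stale = jit.runFunction(f, 10);  // cached code still has the old body
    jit.freeMachineCodeForFunction(f);
    r.fresh = jit.runFunction(f, 10);  // recompiled from the current "IR"
    return r;
}
```

In this model the second run returns the same value as the first even though the function body changed, and only the run after freeMachineCodeForFunction() picks up the new body.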

If you have an existing IR function A which calls function B, and
you've emitted A to machine code, then you have a machine code call to
B in there. Now you want A to call C instead. Without the above
assert, it would be relatively easy to change the IR to call C: call
B->replaceAllUsesWith(C). However, you still have the machine code for
A, which calls B, and there could be a thread concurrently executing
A, making it unsafe to modify A's code. So what should the JIT do when
it sees you replacing B with C?
 1. It could do nothing. Then it would be your responsibility to wait
for all threads to finish running A, free its machine code, and then
recompile it with the new call. (You can do the recompile without
freeing the old code by calling
ExecutionEngine::recompileAndRelinkFunction(A), but that'll
permanently leak the old code.) If you destroy B while a thread is
still in A, its machine code gets freed, leaving you with a latent
crash.
 2. It could compile C, and either replace B's machine code with a
jump to C, or replace all calls to B with calls to C. Aside from not
having the infrastructure to do this, it's not thread-safe:
http://llvm.org/PR5184.
 3. ???

You'd have an extra option if machine code lifetimes weren't tied to
llvm::Function lifetimes, but I haven't spent the time to get that
working.

Since I didn't have a use for RAUW on a compiled function, I resisted
the temptation to guess at the right behavior and put in that assert.
If you think you know what the right behavior is, feel free to file a
bug asking for it.

You can work around this by using freeMachineCodeForFunction yourself
on the whole call tree, then using RAUW to replace the functions, and
then re-compiling them.
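
One way to read "the whole call tree" is: every function whose compiled code might reach the replaced function, directly or through other callers. The sketch below computes that set over a hand-built call graph. CallGraph and callersToInvalidate are hypothetical names, not LLVM API, and treating indirect callers as stale is a conservative assumption; whether they really need recompiling depends on how the JIT links calls.

```cpp
#include <cassert>
#include <map>
#include <queue>
#include <set>
#include <string>
#include <vector>

// Hypothetical call graph: maps each function name to the functions it calls.
using CallGraph = std::map<std::string, std::vector<std::string>>;

// Returns every function that calls `target` directly or transitively.
std::set<std::string> callersToInvalidate(const CallGraph& calls,
                                          const std::string& target) {
    // Invert the edges once so we can walk from callee to caller.
    std::map<std::string, std::vector<std::string>> callers;
    for (const auto& entry : calls)
        for (const auto& callee : entry.second)
            callers[callee].push_back(entry.first);

    // Breadth-first walk up the inverted graph, starting at the target.
    std::set<std::string> result;
    std::queue<std::string> work;
    work.push(target);
    while (!work.empty()) {
        std::string current = work.front();
        work.pop();
        auto it = callers.find(current);
        if (it == callers.end()) continue;
        for (const auto& caller : it->second)
            if (result.insert(caller).second)  // enqueue on first visit only
                work.push(caller);
    }
    return result;
}
```

With a graph where main calls A and D, and A calls B, asking for the callers of B yields A and main but not D. Those would be the functions to free before the RAUW and to recompile afterwards.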

Or you can take Albert's advice to make all calls through function
pointers. This will be a bit slower, but should Just Work.
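
For reference, the function-pointer approach looks roughly like this in plain C++. All names here (impl_v1, impl_v2, current_impl, caller, redefine) are made up for illustration. In a real JIT the slot would be a global variable in the IR and the replacement address would come from the JIT rather than from a statically compiled function.

```cpp
#include <cassert>

// Two hypothetical versions of the same function.
static int impl_v1(int x) { return x + 1; }
static int impl_v2(int x) { return x * 2; }

// The global slot holding the current definition.
static int (*current_impl)(int) = impl_v1;

// Callers never name a version directly; they always go through the slot.
int caller(int x) { return current_impl(x); }

// Redefining the function is a single store; caller() is not recompiled.
void redefine(int (*replacement)(int)) { current_impl = replacement; }
```

Each call pays for one extra load, which is the "bit slower" part. As written the store is not synchronized, so a multithreaded program would need an atomic slot or a lock around it.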

Duncan Sands

2010-Jan-31 14:44 UTC

[LLVMdev] Redefining function

Hi Jeffrey,
>  2. It could compile C, and either replace B's machine code with a
> jump to C, or replace all calls to B with calls to C. Aside from not
> having the infrastructure to do this, it's not thread-safe:
> http://llvm.org/PR5184.
if all calls were via a handle (i.e. load the function pointer out of
some memory location then jump to it), then you could compile C,
atomically replace the pointer-to-B with the pointer-to-C in the memory
location, and later free B using some kind of user-space read-copy-update
type logic.  This could be managed transparently by the JIT (i.e. in the
IR you would have direct calls, that are implemented by a jump to a place
that loads the function pointer then calls it).  If you don't want to use
handles, then there are also various possibilities for thread-safe code
patching (e.g. the Linux kernel does this kind of thing in various places),
but this is of course more complicated.  That said, if the IR optimizers
have inlined your original function everywhere, then replacing the function
later won't have any effect...
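
A rough sketch of the handle idea in plain C++ follows. It is not real read-copy-update: it swaps in a mutex and reference counting (std::shared_ptr) to get the same effect, namely that old code stays alive until the last caller that picked it up has returned. FunctionHandle and its members are hypothetical names, and std::function stands in for a pointer to JIT-generated code.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <mutex>
#include <utility>

// Hypothetical handle for one replaceable function.
class FunctionHandle {
public:
    using Code = std::function<int(int)>;

    explicit FunctionHandle(Code initial)
        : current_(std::make_shared<Code>(std::move(initial))) {}

    // Pin the current code under the lock, then run it outside the lock.
    int call(int arg) const {
        std::shared_ptr<const Code> pinned;
        {
            std::lock_guard<std::mutex> lock(mutex_);
            pinned = current_;
        }
        return (*pinned)(arg);
    }

    // Publish a replacement. After the swap, `next` holds the old code; it is
    // released when this function returns (after the lock is dropped), and the
    // code itself is destroyed once no in-flight call() still pins it.
    void replace(Code replacement) {
        std::shared_ptr<const Code> next =
            std::make_shared<Code>(std::move(replacement));
        std::lock_guard<std::mutex> lock(mutex_);
        current_.swap(next);
    }

private:
    mutable std::mutex mutex_;
    std::shared_ptr<const Code> current_;
};
```

A caller that entered the old version before replace() keeps running it safely, while every later call() sees the new version. A lock-free variant would replace the mutex with an atomic pointer and a separate reclamation scheme.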

Ciao,

Duncan.

Conrado Miranda

2010-Jan-31 15:35 UTC

[LLVMdev] Redefining function

Great! It just worked. I was a bit worried about using pointers to call
functions because it's a little too overwhelming in a big project, I think.

Just for the record, if the function code isn't freed with
freeMachineCodeForFunction, I get a segmentation fault during
recompileAndRelinkFunction with this stack dump:
Running pass 'X86 Machine Code Emitter' on function '@do_print'

I know no one should do this, but it's good to know LLVM doesn't allow you
to leak (or it's just a good side effect of something else).

Although this method can stop the whole program for quite some time, it
doesn't require a reboot (which can be costly) and doesn't have the constant
cost of pointers (it allows me to choose when I can afford the cost of the
change).

Thanks for the explanation. The code works just as wanted now.

