thr3ads.net - llvm dev - [LLVMdev] Memory clean for applications using LLVM for JIT compilation [Jan 2013]

Dirkjan Bussink

2013-Jan-14 09:56 UTC

[LLVMdev] Memory clean for applications using LLVM for JIT compilation

Hello all,

I've already bothered people on IRC with this question and it was
recommended to ask it here.

First of all, some context. In Rubinius (http://rubini.us/,
http://github.com/rubinius/rubinius) we use LLVM for our JIT. We create LLVM IR
using the C++ API and turn that into machine code using
ExecutionEngine::runJITOnFunction. The resulting native code is then installed
as the executor for a method.

Right now we use a single LLVMContext (the global context), since we reuse a lot
of structures, such as the types that map onto the types in the virtual machine.
This does have a downside, though: constant expressions are stored in the
LLVMContext (for example, when we call ConstantInt::get). We control some of
these allocations, but others happen inside LLVM's built-in optimization
passes.

This means that over time the LLVMContext keeps growing as we JIT more and more
code. This is especially a problem with applications that don't have a stable
type profile at run time and keep triggering methods to be jitted. In a very
dynamic language such as Ruby this is not uncommon, and there are some
libraries out there that behave badly in this regard.

So what we're seeing when running these applications with the JIT enabled is
that memory consumption grows slowly over time. We know it isn't technically a
memory leak, since all that data is still reachable and will be cleaned up on
shutdown, but we'd like a way to control this memory while the app is running.

We've explored a few ideas and each of them has significant downsides. The
initial idea was to set up an LLVMContext per compilation request for a
method. The problem with this approach is that we'd have to keep this
context alive for the lifetime of the jitted method. This would significantly
increase memory usage, since each context would carry around all kinds of
additional data, such as the type information for our internal VM types.

A second idea was to copy out the generated native code, but that causes all
kinds of problems because of CALL semantics etc., since the code can call into
other jitted code.

The last idea we had is more of a hack, since I think it uses LLVM in an
invalid way (that perhaps might work, perhaps not). This approach was to compile
each request in a new context, but keep only the resulting llvm::Function alive
outside this context. The only reason we need this llvm::Function is to be able
to clean up the native code (with ExecutionEngine::freeMachineCodeForFunction).

So the question is what would be a recommended way to handle this problem? Is
there a way to clean up / free native code like
ExecutionEngine::freeMachineCodeForFunction without needing the llvm::Function?
Is it safe to use the llvm::Function outside the LLVMContext in the way
described here? Is there a way to clean up the constants allocated in the
LLVMContext manually?

Or would it be possible to provide a custom allocator for the memory space
used for the native code? With this last option we would be responsible for
the cleanup ourselves and would just provide memory space to LLVM where it can
store the results.

We're open to different approaches, but we would like to know the
recommendations from the LLVM community here.

-- 
Regards,

Dirkjan Bussink

Reid Kleckner

2013-Jan-14 14:48 UTC

On Mon, Jan 14, 2013 at 4:56 AM, Dirkjan Bussink <d.bussink at gmail.com>
wrote:
> So the question is what would be a recommended way to handle this problem?
> Is there a way to clean up / free native code like
> ExecutionEngine::freeMachineCodeForFunction without needing the
> llvm::Function? Is it safe to use the llvm::Function outside the
> LLVMContext in the way described here? Is there a way to clean up the
> constants allocated in the LLVMContext manually?
>
I believe it is safe to use llvm::Function::deleteBody() to free the IR
from the function.  IIRC we used deleteBody() for Unladen Swallow, which
was a few years ago now.

You'd still have to implement some kind of type cleanup in/on the context,
and the function type will still reference the types of the parameters.

> Or maybe would it be possible to have a custom allocator for memory space
> for the native code that we could provide? With this last option we would
> be responsible for the clean up ourselves and just provide memory space to
> LLVM where it can store the results.
>
Yes, you should be able to inherit from llvm::JITMemoryManager and do
something like this.
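
As an illustration of the idea behind this suggestion, here is a minimal,
self-contained sketch of VM-owned storage for native code, where a block is
released by a method id rather than through an llvm::Function. Everything in it
(CodePool, MethodId, allocate, release) is hypothetical and is not LLVM API. The
real llvm::JITMemoryManager interface is larger and is not reproduced here, and
plain malloc stands in for the executable pages a real JIT would need.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <unordered_map>

// Hypothetical VM-side pool for native code blocks. Not LLVM API.
class CodePool {
public:
    using MethodId = std::uint64_t;

    CodePool() = default;
    CodePool(const CodePool &) = delete;            // the pool owns raw blocks
    CodePool &operator=(const CodePool &) = delete;

    ~CodePool() {
        for (auto &entry : blocks_) std::free(entry.second.base);
    }

    // Reserve `size` bytes for the code of `method`, replacing any block
    // previously recorded for that method. Assumption: malloc is used only
    // to keep the sketch portable; it does not return executable memory.
    void *allocate(MethodId method, std::size_t size) {
        assert(size > 0);
        release(method);
        void *base = std::malloc(size);
        if (base == nullptr) return nullptr;
        blocks_[method] = Block{base, size};
        bytesInUse_ += size;
        return base;
    }

    // Free the block for `method`. Returns false if nothing was recorded.
    bool release(MethodId method) {
        auto it = blocks_.find(method);
        if (it == blocks_.end()) return false;
        bytesInUse_ -= it->second.size;
        std::free(it->second.base);
        blocks_.erase(it);
        return true;
    }

    std::size_t bytesInUse() const { return bytesInUse_; }
    std::size_t liveMethods() const { return blocks_.size(); }

private:
    struct Block {
        void *base;
        std::size_t size;
    };
    std::unordered_map<MethodId, Block> blocks_;
    std::size_t bytesInUse_ = 0;
};
```

With a pool like this, the VM can track how much code memory is live and drop
the code of a method it no longer needs by calling release with that method's
id. How such a pool would be wired into a JITMemoryManager subclass is left
open here.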


Manny Ko

2013-Jan-17 02:17 UTC

"This approach was to compile each request in a new context, but only keep
the llvm::Function we have as the result alive outside this context."

I'm not sure the above will work. The JITed machine code (MC) is owned by the
ExecutionEngine, which has taken ownership of the Module. As soon as you delete
the ExecutionEngine, the Module and its MC will be freed.

Cheers.

Dirkjan Bussink

2013-Jan-20 14:54 UTC

On 14 Jan 2013, at 15:48, Reid Kleckner <rnk at google.com> wrote:
> > Or maybe would it be possible to have a custom allocator for memory space
> > for the native code that we could provide? With this last option we would
> > be responsible for the clean up ourselves and just provide memory space to
> > LLVM where it can store the results.
>
> Yes, you should be able to inherit from llvm::JITMemoryManager and do
> something like this.

I've been trying to work with this solution, but it does pose a problem. We
use an ExecutionEngine and set a memory manager with setJITMemoryManager on
EngineBuilder. This means that when the ExecutionEngine is deallocated, it ends
up deallocating the memory manager as well.

I can understand doing this when the code sets up its own memory manager,
but with an external memory manager, I'd expect LLVM not to deallocate that
object for me. Is there a way to prevent this from happening? I can't keep
the ExecutionEngine around here either, since EngineBuilder needs a Module,
which in turn needs an LLVMContext, which I'm trying to create for
each new request.

Does anyone have additional ideas for how to handle this, or know of
another approach that could work here?

-- 
Dirkjan
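
One general C++ pattern for the ownership problem described in this message is
to hand the engine a thin forwarding object that it is free to delete, while
the long-lived allocator is kept alive separately through shared ownership. The
sketch below models that pattern with stand-in types only. MemoryManager,
Engine, VmCodeArena and ForwardingManager are all hypothetical names, none of
them is LLVM API, and whether this maps cleanly onto llvm::JITMemoryManager and
ExecutionEngine is an assumption that the thread does not confirm.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Stand-in for the interface an engine allocates code through (hypothetical).
struct MemoryManager {
    virtual ~MemoryManager() = default;
    virtual void *allocateCode(std::size_t size) = 0;
};

// Long-lived allocator owned by the VM; it outlives any single engine.
class VmCodeArena {
public:
    void *allocate(std::size_t size) {
        chunks_.push_back(
            std::unique_ptr<unsigned char[]>(new unsigned char[size]));
        return chunks_.back().get();
    }
    std::size_t chunkCount() const { return chunks_.size(); }

private:
    std::vector<std::unique_ptr<unsigned char[]>> chunks_;
};

// Thin wrapper that the engine may delete; it only forwards to the arena.
class ForwardingManager : public MemoryManager {
public:
    explicit ForwardingManager(std::shared_ptr<VmCodeArena> arena)
        : arena_(std::move(arena)) {
        assert(arena_ != nullptr);
    }
    void *allocateCode(std::size_t size) override {
        return arena_->allocate(size);
    }

private:
    std::shared_ptr<VmCodeArena> arena_;
};

// Stand-in for an engine that takes ownership of, and later deletes,
// the memory manager it was given.
class Engine {
public:
    explicit Engine(MemoryManager *manager) : manager_(manager) {}
    void *compile(std::size_t codeSize) {
        return manager_->allocateCode(codeSize);
    }

private:
    std::unique_ptr<MemoryManager> manager_;
};
```

In this model, destroying an Engine destroys only its ForwardingManager. The
VmCodeArena, and the code bytes allocated from it, stay valid for as long as
the VM holds its own shared_ptr to the arena.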
