thr3ads.net - llvm dev - [llvm-dev] LLVMContext: Threads and Ownership. [Sep 2018]

Lang Hames via llvm-dev

2018-Sep-15 23:14 UTC

[llvm-dev] LLVMContext: Threads and Ownership.

Hi All,

ORC's new concurrent compilation model generates some interesting lifetime
and thread safety questions around LLVMContext: We need multiple
LLVMContexts (one per module in the simplest case, but at least one per
thread), and the lifetime of each context depends on the execution path of
the JIT'd code. We would like to deallocate contexts once all modules
associated with them have been compiled, but there is no safe or easy way
to check that condition at the moment as LLVMContext does not expose how
many modules are associated with it.

One way to fix this would be to add a mutex to LLVMContext, and expose this
and the module count. Then in the IR-compiling layer of the JIT we could
have something like:

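// Compile finished, time to deallocate the module.
// Explicit deletes used for clarity, we would use unique_ptrs in practice.
// getMutex() and getNumModules() are the proposed additions, not existing API.
auto &Ctx = Mod->getContext();
delete Mod;
bool LastModule;
{
  // Release the lock before deleting the context that owns the mutex.
  std::lock_guard<std::mutex> Lock(Ctx.getMutex());
  LastModule = Ctx.getNumModules() == 0;
}
if (LastModule)
  delete &Ctx;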

Another option would be to invert the ownership model and say that each
Module shares ownership of its LLVMContext. That way LLVMContexts would be
automatically deallocated when the last module using them is destructed
(providing no other shared_ptrs to the context are held elsewhere).
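
The shared-ownership option can be sketched with toy types. This is only a minimal illustration: ToyContext, ToyModule and contextDiesWithLastModule are hypothetical stand-ins invented for the example, not LLVM's classes (llvm::Module holds a plain LLVMContext reference today).

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Toy stand-ins for illustration only; these are NOT llvm::LLVMContext or
// llvm::Module.
struct ToyContext {
  explicit ToyContext(bool *DestroyedFlag) : Destroyed(DestroyedFlag) {}
  ~ToyContext() { *Destroyed = true; }
  bool *Destroyed; // Set when the context is torn down, so callers can observe it.
};

class ToyModule {
public:
  ToyModule(std::string Name, std::shared_ptr<ToyContext> Ctx)
      : Name(std::move(Name)), Ctx(std::move(Ctx)) {
    assert(this->Ctx && "a module needs a context");
  }
  ToyContext &getContext() const { return *Ctx; }
  const std::string &getName() const { return Name; }

private:
  std::string Name;
  std::shared_ptr<ToyContext> Ctx; // Each module co-owns its context.
};

// Returns true if the context survived the first module's destruction and was
// destroyed once the last module went away.
inline bool contextDiesWithLastModule() {
  bool Destroyed = false;
  auto Ctx = std::make_shared<ToyContext>(&Destroyed);
  auto A = std::make_unique<ToyModule>("a", Ctx);
  auto B = std::make_unique<ToyModule>("b", Ctx);
  Ctx.reset(); // Drop the creator's reference; the modules keep the context alive.

  A.reset();
  bool AliveAfterFirst = !Destroyed;
  B.reset();
  return AliveAfterFirst && Destroyed;
}
```

The reference count inside std::shared_ptr is updated atomically, so modules on different threads can be destroyed without an extra mutex for the count itself; access to the context's contents would still need its own synchronization.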

There are other possible approaches (e.g. side tables for the mutex and
module count) but before I spend too much time on it I wanted to see
whether anyone else has encountered these issues or has opinions on
solutions.

Cheers,
Lang.

Lang Hames via llvm-dev

2018-Sep-16 04:30 UTC

Actually, looking at the destructors for LLVMContext and Module I do not
think the current ownership scheme makes sense, so this might be a good
opportunity to re-think it.

Right now an LLVMContext owns a list of modules (see
LLVMContextImpl::OwnedModules) that it destroys when its destructor is
called.  Modules remove themselves from this list if they are destructed
before the context:

Module::~Module() {
  Context.removeModule(this);
  ...

LLVMContextImpl::~LLVMContextImpl() {
  // NOTE: We need to delete the contents of OwnedModules, but Module's dtor
  // will call LLVMContextImpl::removeModule, thus invalidating iterators
  // into the container. Avoid iterators during this operation:
  while (!OwnedModules.empty())
    delete *OwnedModules.begin();
  ...

This makes it unsafe to hold a unique_ptr to a Module: If any Module is
still alive when its context goes out of scope it will be double freed,
first by the LLVMContextImpl destructor and then again by the unique_ptr.
Idiomatic scoping means that we tend not to see this in practice (Module
takes an LLVMContext reference, meaning we always declare the context
first, so it goes out of scope last), but it also makes the context's
ownership redundant: the modules are always freed first via their
unique_ptrs.
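
The scheme described above can be modelled with toy classes. This is a sketch under stated assumptions: ToyContext, ToyModule and the two helper functions are hypothetical names for illustration and only mimic the OwnedModules bookkeeping, not LLVM's real implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <set>

class ToyModule;

// Toy model of a context that owns a list of modules; not LLVM's classes.
class ToyContext {
public:
  inline ~ToyContext(); // Deletes whatever modules are still registered.
  std::size_t numModules() const { return OwnedModules.size(); }

private:
  friend class ToyModule;
  std::set<ToyModule *> OwnedModules;
};

class ToyModule {
public:
  explicit ToyModule(ToyContext &C, int *LiveCount = nullptr)
      : Context(C), Live(LiveCount) {
    bool Inserted = Context.OwnedModules.insert(this).second;
    assert(Inserted && "module registered twice");
    (void)Inserted;
    if (Live)
      ++*Live;
  }
  ~ToyModule() {
    Context.OwnedModules.erase(this); // Mirrors Context.removeModule(this).
    if (Live)
      --*Live;
  }

private:
  ToyContext &Context;
  int *Live; // Optional counter of live modules, for observation only.
};

ToyContext::~ToyContext() {
  // Each delete erases the module from the set, so re-read begin() each time.
  while (!OwnedModules.empty())
    delete *OwnedModules.begin();
}

// Context declared first, module held by unique_ptr: the module is destroyed
// first and unregisters itself, so the context has nothing left to delete.
inline std::size_t modulesLeftAfterUniquePtrReset() {
  ToyContext C;
  auto M = std::make_unique<ToyModule>(C);
  M.reset();
  return C.numModules();
}

// Raw `new` with no other owner: the context's destructor cleans up.
inline int liveModulesAfterContextScope() {
  int Live = 0;
  {
    ToyContext C;
    new ToyModule(C, &Live);
    new ToyModule(C, &Live);
  }
  return Live;
}
```

The hazardous case is deliberately not exercised: a unique_ptr to a ToyModule that outlives its ToyContext would delete the module a second time.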

I don't think it makes sense for LLVMContext to own Modules. I think that
Modules should share ownership of their LLVMContext via a shared_ptr.

Thoughts?

Cheers,
Lang.


David Blaikie via llvm-dev

2018-Sep-16 17:22 UTC

Agreed, the existing ownership seems sub-optimal. I wouldn't say broken,
but subtle at least - looks like you get the choice to either manage the
ownership of the Module object yourself, or let the context handle it (eg:
currently it'd be valid to just do "{ LLVMContext C; new Module(C); new
Module(C); }" - Modules end up owned by the context and cleaned up there).

Might be hard to migrate existing users away from this without silently
introducing memory leaks... maybe with some significant API breakage - move
Module construction to a factory/helper that returns a
std::unique_ptr<Module> - requiring every Module construction to be
revisited, and users relying on LLVMContext based ownership/cleanup to
redesign their code.

As to the original question - gut reaction: this doesn't seem like
something that's general-purpose enough to be implemented in the
LLVMContext/Module itself. I think a reasonable ownership model for
LLVMContext/Module is that the user is required to ensure the LLVMContext
outlives all Modules created within it (same way a user of std::vector is
required to ensure that the vector is not reallocated so long as they're
keeping pointers/references to elements in it). I'd think/suggest
ref-counted LLVMContext ownership would be done by wrapping/external
tracking in this use case.
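
One way to read the "wrapping/external tracking" suggestion is a small wrapper that pairs a module with a shared reference to its context, leaving the module and context classes themselves unchanged. The sketch below assumes hypothetical toy types (ToyContext, ToyModule, ContextAndModule, makeModuleIn); none of these names come from LLVM.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Toy stand-ins; the module keeps a plain reference to its context, as in the
// ownership model described above.
struct ToyContext {
  explicit ToyContext(int *Count) : DtorCount(Count) {}
  ~ToyContext() { ++*DtorCount; }
  int *DtorCount; // Incremented on destruction, for observation only.
};

struct ToyModule {
  explicit ToyModule(ToyContext &C) : Context(C) {}
  ToyContext &Context;
};

// External wrapper: reference counting lives here, not in the context/module.
class ContextAndModule {
public:
  ContextAndModule(std::shared_ptr<ToyContext> C, std::unique_ptr<ToyModule> M)
      : Ctx(std::move(C)), Mod(std::move(M)) {
    assert(Ctx && Mod && &Mod->Context == Ctx.get() &&
           "module must belong to the wrapped context");
  }
  ToyModule &getModule() { return *Mod; }

private:
  // Members are destroyed in reverse declaration order, so Mod is destroyed
  // before this wrapper drops its reference to the context.
  std::shared_ptr<ToyContext> Ctx;
  std::unique_ptr<ToyModule> Mod;
};

inline ContextAndModule makeModuleIn(const std::shared_ptr<ToyContext> &C) {
  return ContextAndModule(C, std::make_unique<ToyModule>(*C));
}

// Returns how many times the context was destroyed after two wrapped modules
// sharing it went out of scope, or -1 if it died too early.
inline int contextDtorCountAfterTwoModules() {
  int Count = 0;
  {
    auto C = std::make_shared<ToyContext>(&Count);
    ContextAndModule A = makeModuleIn(C);
    ContextAndModule B = makeModuleIn(C);
    C.reset(); // Only the wrappers reference the context now.
    if (Count != 0)
      return -1;
  }
  return Count;
}
```

The design choice here is that the user-visible rule stays "the context must outlive its modules", and the wrapper enforces it for the JIT's use case only.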

- Dave


Alex Denisov via llvm-dev

2018-Sep-16 18:06 UTC

Hi Lang,
> We would like to deallocate contexts

I did not look at the Orc APIs as of LLVM-6+, but I'm curious where this
requirement comes from?
Does Orc take ownership of the modules that are being JIT'ed? I.e., is it
the same 'ownership model' as MCJIT has?

I think the JIT users should take care of memory allocation/deallocation. Also,
I believe that you have strong reasons to implement things this way :)
Could you please tell us more about the topic?

Alex Denisov via llvm-dev

2018-Sep-17 07:33 UTC

Hi Lang,

Thank you for the clarification.

> Do you have a use-case for manually managing LLVM module lifetimes?

Our[1] initial idea was to feed all the modules to a JIT engine, run some code,
then do further analysis and modification of the modules, feed the new versions
to a JIT engine and run some code again.

It didn't work well with MCJIT because we did not want to give up the ownership
of the modules.
However, this initial approach did not work properly because, as you mentioned,
compilation does modify modules, which does not suit our needs.

Eventually, we decided to compile things on our own and communicate with the
Orc JIT using only object files. Before compilation we make a clone of a module
(simply by loading it from disk again), using the original module only for
analysis.
This way, we actually have two copies of a module in memory, with the second
copy disposed of immediately after compilation.
It is also worth mentioning that our use case is AOT compilation, not really
lazy JIT.
As for the object files, we also cannot give up ownership because we have
to reuse them many times. At least this is the current state, though I'm
working on another improvement: to load all object files once and then
re-execute the program many times.

Also, we wanted to support several versions of LLVM. Because of drastic
changes in the Orc APIs we decided to hand-craft another JIT engine based on
your work. It is less powerful, but fits our needs very well:

    https://github.com/mull-project/mull/blob/master/include/Toolchain/JITEngine.h
    https://github.com/mull-project/mull/blob/master/lib/Toolchain/JITEngine.cpp

We also parallelize lots of things, but this could also work using Orc given
that we only use the ObjectLinkingLayer.

P.S. Sorry for giving such a chaotic explanation, but I hope it shows our use
case :)

[1] https://github.com/mull-project/mull
> On 17. Sep 2018, at 02:05, Lang Hames <lhames at gmail.com> wrote:
>
> Hi Alex,
>
> > > We would like to deallocate contexts
> >
> > I did not look at the Orc APIs as of LLVM-6+, but I'm curious where
> > this requirement comes from?
>
> The older generation of ORC APIs were single-threaded so in common use
> cases the client would create one LLVMContext for all IR created within a
> JIT session. The new generation supports concurrent compilation, so each
> module needs a different LLVMContext (including modules created inside the
> JIT itself, for example when extracting functions in the
> CompileOnDemandLayer). This means creating and managing context lifetimes
> alongside module lifetimes.
>
> > Does Orc takes ownership of the modules that are being JIT'ed? I.e., it
> > is the same 'ownership model' as MCJIT has, am I right?
>
> The latest version of ORC takes ownership of modules until they are
> compiled, at which point it passes the Module's unique_ptr to a
> 'NotifyCompiled' callback. By default this just throws away the pointer,
> deallocating the module. It can be used by clients to retrieve ownership of
> modules if they prefer to manage the Module's lifetime themselves.
>
> > I think the JIT users should take care of memory allocation/deallocation.
> > Also, I believe that you have strong reasons to implement things this
> > way :)
> > Could you please tell us more about the topic?
>
> Do you have a use-case for manually managing LLVM module lifetimes? I would
> like to hear more about that so I can factor it in to the design.
>
> There were two motivations for having the JIT take ownership by default
> (with the NotifyCompiled escape hatch for those who wanted to hold on to the
> module after it is compiled): First, it makes efficient memory management
> the default: Unless you have a reason to hold on to the module it is thrown
> away as soon as possible, freeing up memory. Second, since compilation is a
> mutating operation, it seemed more natural to have the "right-to-mutate"
> flow alongside ownership of the underlying memory: whoever owns the Module
> is allowed to mutate it, rather than clients having to be aware of internal
> JIT state to know when they could or could not operate on a module.
>
> Cheers,
> Lang.
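
The ownership hand-off described in the quoted reply (the JIT owns the module until it is compiled, then passes the unique_ptr to a callback that discards it by default) can be sketched roughly as follows. ToyModule, ToyCompileLayer, setNotifyCompiled and clientCanReclaimModule are hypothetical names for illustration; this is not ORC's actual interface.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <utility>

// Toy stand-in for a module; "compilation" just flips a flag.
struct ToyModule {
  bool Compiled = false;
};

class ToyCompileLayer {
public:
  using NotifyCompiledFn = std::function<void(std::unique_ptr<ToyModule>)>;

  // Clients that want the module back after compilation install a callback.
  void setNotifyCompiled(NotifyCompiledFn F) { NotifyCompiled = std::move(F); }

  // The layer takes ownership; whoever owns the module may mutate it.
  void add(std::unique_ptr<ToyModule> M) {
    assert(M && "cannot add a null module");
    M->Compiled = true; // Stand-in for the mutating compile step.
    if (NotifyCompiled)
      NotifyCompiled(std::move(M)); // Ownership moves to the client.
    // Otherwise M is destroyed here, freeing the module as early as possible.
  }

private:
  NotifyCompiledFn NotifyCompiled; // Empty by default: the module is discarded.
};

// Returns true if a client callback received the compiled module.
inline bool clientCanReclaimModule() {
  std::unique_ptr<ToyModule> Kept;
  ToyCompileLayer Layer;
  Layer.setNotifyCompiled(
      [&Kept](std::unique_ptr<ToyModule> M) { Kept = std::move(M); });
  Layer.add(std::make_unique<ToyModule>());
  return Kept && Kept->Compiled;
}
```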