thr3ads.net - llvm dev - [llvm-dev] Deleting function IR after codegen [Mar 2016]

Larry Gritz via llvm-dev

2016-Mar-08 19:03 UTC

[llvm-dev] Deleting function IR after codegen

YES. My use of LLVM involves an app that JITs program after program and will
quickly swamp memory if everything is retained. It is crucial to aggressively
throw everything away but the functions we still need to execute.

I've been faking it with the old JIT (LLVM 3.4/3.5) by using a custom subclass
of JITMemoryManager that squirrels away the jitted binary code, so that when I
free the Modules, ExecutionEngine, and MemoryManager, the memory that was
actually allocated for the binary code is hidden and does not get deallocated.
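
A minimal sketch of the ownership idea behind that trick, written as plain C++ with no LLVM dependency. Every name here (CodeArena, RetainingMemoryManager, Engine, compileAndDiscard) is a hypothetical stand-in, not an LLVM class or the code actually used in the app described above. The only point it illustrates is that the code bytes live in an arena whose lifetime is independent of the manager and engine that allocated them.

```cpp
#include <cstddef>
#include <cstdint>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-in for the raw memory that holds JITed machine code.
struct CodeArena {
    std::vector<std::vector<std::uint8_t>> sections;

    // Each section is its own buffer, so earlier pointers stay valid when
    // more sections are added later.
    std::uint8_t *allocate(std::size_t size) {
        sections.emplace_back(size);
        return sections.back().data();
    }
};

// Hypothetical stand-in for a memory-manager subclass. It shares the arena
// instead of owning it, so destroying the manager does not free the code.
class RetainingMemoryManager {
public:
    explicit RetainingMemoryManager(std::shared_ptr<CodeArena> arena)
        : arena_(std::move(arena)) {}

    std::uint8_t *allocateCodeSection(std::size_t size) {
        return arena_->allocate(size);
    }

private:
    std::shared_ptr<CodeArena> arena_;
};

// Hypothetical stand-in for the engine together with its modules.
struct Engine {
    std::unique_ptr<RetainingMemoryManager> mm;
    std::vector<int> ir;  // placeholder for IR that is not needed afterwards
};

// "Compiles" into the arena, then lets the engine, IR, and manager die.
std::uint8_t *compileAndDiscard(std::shared_ptr<CodeArena> arena) {
    Engine engine;
    engine.mm = std::make_unique<RetainingMemoryManager>(arena);
    engine.ir.assign(1024, 0);
    std::uint8_t *code = engine.mm->allocateCodeSection(4);
    code[0] = 0xC3;  // stored as data only; nothing is executed in this sketch
    return code;     // engine (IR + manager) is destroyed on return
}
```

After compileAndDiscard returns, the caller's shared_ptr is the only remaining owner of the arena, and the returned pointer stays valid for as long as the caller keeps the arena alive. A real implementation would also have to deal with executable page permissions and relocation, which this sketch leaves out.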

Currently I'm struggling with bringing my code base up to MCJIT without
losing this ability, because the memory consumption is killing me.

Sometimes I think that clang as the canonical user of LLVM does not reflect the
diversity of JIT-oriented LLVM use cases. An "offline" compiler like
clang gets to exit after compiling a module, but other apps using LLVM may JIT
module after module after module indefinitely.

For that kind of use case, it would be great to have as a first-class feature
the ability to free the IR of a compiled module, and even better, to throw away
the Module and EE entirely but keep the ability to call into the JITed binary
function. Many apps would benefit from a stable API for doing this.

	-- lg

> On Mar 7, 2016, at 4:55 PM, Pete Cooper via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>
> After codegen for a given function, the IR should no longer be needed. In
> the AsmPrinter we convert from MI->MCInstr, and then we never go back and
> look at the IR during the MC layer.
>
> I’ve prototyped a simple pass which can be (optionally) scheduled to do
> just this. It is added at the end of addPassesToEmitFile. It is optional so
> that clang can continue to leak the IR with --disable-free, but I would propose
> that all other tools, and especially LTO, would enable it. The savings are 20%
> of peak memory in LTO of clang itself.
>
> I could attach a patch, but first I’d really like to know if anyone is
> fundamentally opposed to this.
>
> I should note, a couple of issues have come up in the prototype.
> - llvm::getDISubprogram was walking the function body to find the
>   subprogram. This is trivial to fix as functions now have !dbg on them.
> - The AsmPrinter is calling canBeOmittedFromSymbolTable on GlobalValues,
>   which then walks all their uses. I think this should be done earlier in
>   codegen as an analysis whose results are available to the AsmPrinter.
> - BBs whose addresses are taken, i.e. jump tables, can’t be deleted.
>   Those functions will just keep their IR around so no changes there.
>
> With the above issues fixed, I can run make check and pass all the tests.
>
> Comments very welcome.
>
> Cheers,
> Pete
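
The rule in the quoted proposal (free a function's IR once it has been emitted, unless one of its blocks has its address taken) can be sketched with a toy model. The Block and Function structs and dropBodyAfterCodegen below are hypothetical simplifications, not LLVM's classes and not the prototype pass, whose source is not shown in this thread. In LLVM itself, Function::deleteBody() and BasicBlock::hasAddressTaken() are the closest existing calls; whether the prototype used them is not stated.

```cpp
#include <string>
#include <vector>

// Hypothetical, simplified stand-ins for IR objects; not LLVM's classes.
struct Block {
    std::vector<std::string> instructions;
    bool addressTaken = false;  // true if something refers to this block's address
};

struct Function {
    std::string name;
    std::vector<Block> blocks;
};

// Frees the body of a function whose machine code has already been emitted.
// Functions with an address-taken block keep their IR, as in the proposal.
// Returns true if the body was freed.
bool dropBodyAfterCodegen(Function &f) {
    for (const Block &b : f.blocks) {
        if (b.addressTaken)
            return false;  // leave this function untouched
    }
    f.blocks.clear();
    f.blocks.shrink_to_fit();  // request that the storage be released too
    return true;
}
```

The function's name is left intact after the body is dropped, so the function can still be referred to as a declaration.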
--
Larry Gritz
lg at larrygritz.com

Andy Ayers via llvm-dev

2016-Mar-08 19:24 UTC

[llvm-dev] Deleting function IR after codegen

FWIW, LLILC (https://github.com/dotnet/llilc) uses MCJIT with a custom memory
manager to hold onto the binary bits and discard the rest.

As far as I know it doesn't leak, though we don't blow away the context,
so that grows a bit over time.

Caldarale, Charles R via llvm-dev

2016-Mar-08 19:40 UTC

[llvm-dev] Deleting function IR after codegen

> From: llvm-dev [mailto:llvm-dev-bounces at lists.llvm.org]
> On Behalf Of Larry Gritz via llvm-dev
> Subject: Re: [llvm-dev] Deleting function IR after codegen
>
> I've been faking it with old JIT (llvm 3.4/3.5) by using a custom subclass of
> JITMemoryManager that squirrels away the jitted binary code so that when I free
> the Modules, ExecutionEngine, and MemoryManager, the real memory that got allocated
> for the binary code is hidden and does not get deallocated.
>
> Currently I'm struggling with bringing my code base up to MCJIT without losing this
> ability, because the memory consumption is killing me.
>
> For that kind of use case, it would be great to have as a first-class feature
> the ability to free the IR of a compiled module, and even better, to throw away
> the Module and EE entirely but keep the ability to call into the JITed binary

This is precisely what we do for our environment.  Updating for MCJIT took a bit
of work, but not too much, using SectionMemoryManager as the base class.

When OrcJIT arrived, we realized we didn't actually need the ExecutionEngine
or an LLVM-provided JIT at all.  This simplified our code and removed some
undesirable dependencies.  What we have now essentially just implements the
logic of MCJIT's finalizeObject(), finalizeLoadedModules(), emitObject(),
and generateCodeForModule() methods coupled with the aforementioned extension of
SectionMemoryManager (effectively our own ExecutionEngine).

 - Chuck

Larry Gritz via llvm-dev

2016-Mar-08 19:45 UTC

[llvm-dev] Deleting function IR after codegen

Thanks for the pointer, it's always helpful to be able to see how another
project solved similar problems.

> On Mar 8, 2016, at 11:24 AM, Andy Ayers <andya at microsoft.com> wrote:
>
> FWIW, LLILC (https://github.com/dotnet/llilc) uses MCJIT with a custom
> memory manager to hold onto the binary bits and discard the rest.
>
> As far as I know it doesn't leak, though we don't blow away the
> context, so that grows a bit over time.

--
Larry Gritz
lg at larrygritz.com
