Larry Gritz via llvm-dev
2016-Mar-08 19:03 UTC
[llvm-dev] Deleting function IR after codegen
YES. My use of LLVM involves an app that JITs program after program and will quickly swamp memory if everything is retained. It is crucial to aggressively throw everything away but the functions we still need to execute.

I've been faking it with old JIT (llvm 3.4/3.5) by using a custom subclass of JITMemoryManager that squirrels away the jitted binary code so that when I free the Modules, ExecutionEngine, and MemoryManager, the real memory that got allocated for the binary code is hidden and does not get deallocated.

Currently I'm struggling with bringing my code base up to MCJIT without losing this ability, because the memory consumption is killing me.

Sometimes I think that clang as the canonical user of LLVM does not reflect the diversity of JIT-oriented LLVM use cases. An "offline" compiler like clang gets to exit after compiling a module, but other apps using LLVM may JIT module after module after module indefinitely.

For that kind of use case, it would be great to have as a first-class feature the ability to free the IR of a compiled module, and even better, to throw away the Module and EE entirely but keep the ability to call into the JITed binary function. Many apps would benefit from a stable API for doing this.

-- lg

> On Mar 7, 2016, at 4:55 PM, Pete Cooper via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>
> After codegen for a given function, the IR should no longer be needed. In the AsmPrinter we convert from MI->MCInstr, and then we never go back and look at the IR during the MC layer.
>
> I’ve prototyped a simple pass which can be (optionally) scheduled to do just this. It is added at the end of addPassesToEmitFile. It is optional so that clang can continue to leak the IR with --disable-free, but I would propose that all other tools, and especially LTO, would enable it. The savings are 20% of peak memory in LTO of clang itself.
>
> I could attach a patch, but first I’d really like to know if anyone is fundamentally opposed to this.
>
> I should note, a couple of issues have come up in the prototype.
> - llvm::getDISubprogram was walking the function body to find the subprogram. This is trivial to fix as functions now have !dbg on them.
> - The AsmPrinter is calling canBeOmittedFromSymbolTable on GlobalValues which then walks all their uses. I think this should be done earlier in codegen as an analysis whose results are available to the AsmPrinter.
> - BBs whose addresses are taken, i.e. jump tables, can’t be deleted. Those functions will just keep their IR around so no changes there.
>
> With the above issues fixed, I can run make check and pass all the tests.
>
> Comments very welcome.
>
> Cheers,
> Pete

--
Larry Gritz
lg at larrygritz.com
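For readers who want the quoted proposal in concrete terms, the policy can be sketched in a few lines of standalone C++: once a function has been emitted, drop its body, unless one of its basic blocks has its address taken. This is only a toy model. `ToyBlock`, `ToyFunction`, and `freeBodiesAfterCodegen` are hypothetical names invented for illustration; they are not LLVM classes and this is not Pete's patch. In LLVM itself the closest existing calls are probably `BasicBlock::hasAddressTaken()` and `Function::deleteBody()`, but how the prototype pass uses them is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy stand-ins for IR objects. These are NOT LLVM classes.
struct ToyBlock {
    std::vector<std::string> instructions;
    bool addressTaken = false;  // true if something outside the body refers to this block
};

struct ToyFunction {
    std::string name;
    std::vector<ToyBlock> blocks;
    bool emitted = false;  // set once machine code has been produced

    bool hasBody() const { return !blocks.empty(); }
};

// True if any block's address escapes, in which case the body must be kept.
inline bool anyBlockAddressTaken(const ToyFunction &f) {
    for (const ToyBlock &b : f.blocks)
        if (b.addressTaken)
            return true;
    return false;
}

// Frees the bodies of functions that have already been emitted.
// Returns the number of bodies that were freed.
inline std::size_t freeBodiesAfterCodegen(std::vector<ToyFunction> &module) {
    std::size_t freed = 0;
    for (ToyFunction &f : module) {
        if (!f.emitted || !f.hasBody())
            continue;  // not compiled yet, or nothing left to free
        if (anyBlockAddressTaken(f))
            continue;  // keep the IR around, as in the proposal
        f.blocks.clear();
        f.blocks.shrink_to_fit();  // actually release the storage
        assert(!f.hasBody());
        ++freed;
    }
    return freed;
}
```

Calling `freeBodiesAfterCodegen` a second time on the same module frees nothing further, so a caller could run it after every compile without tracking which functions were already handled.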
Andy Ayers via llvm-dev
2016-Mar-08 19:24 UTC
[llvm-dev] Deleting function IR after codegen
FWIW, LLILC (https://github.com/dotnet/llilc) uses MCJIT with a custom memory manager to hold onto the binary bits and discard the rest.

As far as I know it doesn't leak, though we don't blow away the context, so that grows a bit over time.
Caldarale, Charles R via llvm-dev
2016-Mar-08 19:40 UTC
[llvm-dev] Deleting function IR after codegen
> From: llvm-dev [mailto:llvm-dev-bounces at lists.llvm.org]
> On Behalf Of Larry Gritz via llvm-dev
> Subject: Re: [llvm-dev] Deleting function IR after codegen

> I've been faking it with old JIT (llvm 3.4/3.5) by using a custom subclass of
> JITMemoryManager that squirrels away the jitted binary code so that when I free
> the Modules, ExecutionEngine, and MemoryManager, the real memory that got allocated
> for the binary code is hidden and does not get deallocated.

> Currently I'm struggling with bringing my code base up to MCJIT without losing this
> ability, because the memory consumption is killing me.

> For that kind of use case, it would be great to have as a first-class feature
> the ability to free the IR of a compiled module, and even better, to throw away
> the Module and EE entirely but keep the ability to call into the JITed binary

This is precisely what we do for our environment. Updating for MCJIT took a bit of work, but not too much, using SectionMemoryManager as the base class.

When OrcJIT arrived, we realized we didn't actually need the ExecutionEngine or an LLVM-provided JIT at all. This simplified our code and removed some undesirable dependencies. What we have now essentially just implements the logic of MCJIT's finalizeObject(), finalizeLoadedModules(), emitObject(), and generateCodeForModule() methods coupled with the aforementioned extension of SectionMemoryManager (effectively our own ExecutionEngine).

 - Chuck
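The ownership idea behind a code-retaining memory manager can be sketched without LLVM at all. In the toy below, a per-compile manager hands its finished sections to a long-lived arena, so the manager (and everything else used for the compile) can be destroyed while the emitted bytes stay valid. All names here (`CodeArena`, `RetainingMemoryManager`, `compileAndDiscard`) are hypothetical and only loosely echo the LLVM classes mentioned above; this is not Chuck's code or the SectionMemoryManager interface. The sketch also uses plain heap buffers and never executes them, whereas a real JIT needs executable pages.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <utility>
#include <vector>

// Long-lived store that owns finished code buffers.
class CodeArena {
public:
    // Takes ownership of a buffer and returns its (unchanged) start address.
    const std::uint8_t *adopt(std::vector<std::uint8_t> bytes) {
        blobs_.push_back(std::move(bytes));
        return blobs_.back().data();
    }
    std::size_t blobCount() const { return blobs_.size(); }

private:
    std::vector<std::vector<std::uint8_t>> blobs_;
};

// Hypothetical per-compile memory manager: short-lived, but its sections are not.
class RetainingMemoryManager {
public:
    explicit RetainingMemoryManager(std::shared_ptr<CodeArena> arena)
        : arena_(std::move(arena)) {
        assert(arena_ && "an arena is required to retain code");
    }

    // Reserves a zero-filled section for the code generator to write into.
    std::uint8_t *allocateCodeSection(std::size_t size) {
        pending_.emplace_back(size);
        return pending_.back().data();
    }

    // Moves every section into the arena; moving a vector keeps its buffer,
    // so addresses handed out earlier remain valid.
    std::vector<const std::uint8_t *> finalize() {
        std::vector<const std::uint8_t *> addrs;
        for (auto &section : pending_)
            addrs.push_back(arena_->adopt(std::move(section)));
        pending_.clear();
        return addrs;
    }

private:
    std::shared_ptr<CodeArena> arena_;
    std::vector<std::vector<std::uint8_t>> pending_;
};

// "Compiles" by copying bytes into a section, then discards the manager.
inline const std::uint8_t *compileAndDiscard(const std::shared_ptr<CodeArena> &arena,
                                             const std::vector<std::uint8_t> &machineCode) {
    const std::uint8_t *entry = nullptr;
    {
        RetainingMemoryManager mm(arena);  // stands in for all per-compile state
        std::uint8_t *dst = mm.allocateCodeSection(machineCode.size());
        std::copy(machineCode.begin(), machineCode.end(), dst);
        entry = mm.finalize().front();
    }  // mm is destroyed here; the arena still owns the bytes
    return entry;
}
```

The design choice worth noting is that the arena, not the manager, decides when code is released, so an application that JITs indefinitely can free each compile's bookkeeping immediately and drop individual code blobs on its own schedule.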
Larry Gritz via llvm-dev
2016-Mar-08 19:45 UTC
[llvm-dev] Deleting function IR after codegen
Thanks for the pointer, it's always helpful to be able to see how another project solved similar problems.

--
Larry Gritz
lg at larrygritz.com