thr3ads.net - llvm dev - [LLVMdev] Proposal: release MDNodes for source modules (LTO+debug info) [Nov 2013]

Manman Ren

2013-Nov-13 02:07 UTC

[LLVMdev] Proposal: release MDNodes for source modules (LTO+debug info)

On Tue, Nov 12, 2013 at 4:59 PM, Chandler Carruth <chandlerc at google.com> wrote:
> On Tue, Nov 12, 2013 at 4:46 PM, Manman Ren <manman.ren at gmail.com> wrote:
>
>> On Tue, Nov 12, 2013 at 4:38 PM, Chandler Carruth <chandlerc at google.com> wrote:
>>
>>> On Tue, Nov 12, 2013 at 4:29 PM, Manman Ren <manman.ren at gmail.com> wrote:
>>>
>>>> Hi All,
>>>>
>>>> In LTO, we load in the source modules and link the source modules into
>>>> a destination module.
>>>> Lots of MDNodes are only used by the source modules, for example Xalan
>>>> used 649MB for MDNodes after loading and linking, but the actual
>>>> destination module only has 393MB of MDNodes. There are 649-393MB (40% of
>>>> 649MB) not used.
>>>>
>>>> MDNodes belong to the Context, deleting modules will not release the
>>>> MDNodes.
>>>>
>>>> One possible solution is:
>>>>
>>>> In LLVMContext, add "removeUnusedMDNodes" function
>>>>   It goes through OwnedModules and check if a MDNode is used by any of
>>>> the modules, if not remove it.
>>>>   One implementation is to mark a visited MDNode used when traversing
>>>> the module. After done traversing all modules, we can delete MDNodes in
>>>> MDNodeSet that are not marked.
>>>>
>>>> In LTOCodeGenerator, add a vector of source modules that are added
>>>> (these source modules will be linked with DestroySource mode).
>>>> In LTOCodeGenerator::compile_to_file, delete all source modules that
>>>> are linked in, then call LLVMContext::removeUnusedMDNodes
>>>> -> I can’t find a better place to call the function. When we
>>>> call compile_to_file, we should have done linking in all source modules.
>>>> Another possibility is to add a lto API so the linker can delete the
>>>> source modules and call the API to release MDNodes.
>>>>
>>>> Other options are:
>>>> 1> Using a different LLVMContext for the destination module, but it
>>>> didn’t work out since Linker was not designed to work with different
>>>> LLVMContexts for source vs destination.
>>>> 2> removeUnusedMDNodes checks if a MDNode is used in a different way
>>>>  (i.e use_empty() && !hasValueHandler()), but it does not remove MDNodes
>>>> that form cycles.
>>>>
>>>
>>> 3) Make the MDNode be owned by the module that uses it?
>>>
>>
>> MDNode is shared among modules so multiple modules can use it, if we
>> specify an owner for a MDNode, that will prevent sharing.
>>
>
> From your stats (40% stuck in the old module) it doesn't sound like this
> is buying us anything...
>
Hi Chandler,

I don't quite get why you think sharing is not buying us anything...
It reduces the memory footprint of the source modules (there is sharing
among the source modules) and the number of MDNodes created for the
destination module (we do not need to re-create the MDNodes that can be
shared).

The amount of sharing may not be that much but it still exists.

In earlier experiments building clang with "-flto -g", I found that if we
disallow sharing between the source modules and the destination module, the
memory footprint for MDNodes increases by 15%.
If we also disallow sharing among the source modules, the memory footprint for
MDNodes will be even larger.
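
To make the sharing argument concrete, here is a minimal sketch of content-based uniquing. It is a toy model, not LLVM code: ToyContext, Module, nodesWithSharing, nodesWithoutSharing and nodesSavedBySharing are hypothetical names, and a node is reduced to a string describing its content. It only illustrates why identical nodes requested by several modules are allocated once when they share a context.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Toy model: a module is just the list of node contents it asks for.
using Module = std::vector<std::string>;

// Hypothetical stand-in for a context that uniques nodes by content.
class ToyContext {
public:
    // Allocates a node only if no identical node exists yet.
    void getNode(const std::string& content) { unique_.insert(content); }
    std::size_t nodeCount() const { return unique_.size(); }

private:
    std::set<std::string> unique_;
};

// Nodes allocated when all modules request nodes from one shared context.
std::size_t nodesWithSharing(const std::vector<Module>& modules) {
    ToyContext ctx;
    for (const Module& m : modules)
        for (const std::string& node : m) ctx.getNode(node);
    return ctx.nodeCount();
}

// Nodes allocated when each module uniques only within itself.
std::size_t nodesWithoutSharing(const std::vector<Module>& modules) {
    std::size_t total = 0;
    for (const Module& m : modules) {
        ToyContext ctx;  // private context per module
        for (const std::string& node : m) ctx.getNode(node);
        total += ctx.nodeCount();
    }
    return total;
}

// Number of allocations avoided by sharing.
std::size_t nodesSavedBySharing(const std::vector<Module>& modules) {
    std::size_t shared = nodesWithSharing(modules);
    std::size_t separate = nodesWithoutSharing(modules);
    assert(shared <= separate);  // sharing can never allocate more
    return separate - shared;
}
```

With two modules that both describe "int" but each have one private struct type, the shared context holds three nodes where separate contexts would hold four. The same set that saves these allocations is also what keeps a node alive after the module that requested it is deleted.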

Thanks,
Manman


Chandler Carruth

2013-Nov-13 02:11 UTC

[LLVMdev] Proposal: release MDNodes for source modules (LTO+debug info)

On Tue, Nov 12, 2013 at 6:07 PM, Manman Ren <manman.ren at gmail.com> wrote:
> Hi Chandler,
>
> I don't quite get why you think sharing is not buying us anything...
> It reduces the memory footprint of the source modules (there is sharing
> among the source modules) and the number of MDNodes created for the
> destination module (we do not need to re-create the MDNodes that can be
> shared).
>
> The amount of sharing may not be that much but it still exists.
>
> I had some experiments earlier on building clang with "-flto -g", if we
> dis-allow sharing between source modules and destination module, the memory
> footprint for MDNodes will increase by 15%.
>
So, in my naive view, we do something like the following:

0) load a source module
1) load another source module
2) merge the second module into the first
3) delete the second module
4) while there are more source modules, goto 1

This would mean that, without sharing, an individual source module would use
15% more memory, but based on the numbers in your original post the final
linked memory usage should still be 40% smaller. That seems like an easy win
with very low complexity?
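
The loop above, combined with each module owning its metadata, can be modelled roughly as follows. This is a sketch under stated assumptions, not the LLVM linker: ToyNode, ToyModule, LinkStats, linkIncrementally and nodesRetainedInSharedContext are hypothetical names, the destination starts empty and every input (including the first) is merged into it, and which nodes the linked output still needs is given as a flag rather than computed.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Hypothetical toy types; none of these are LLVM classes.
struct ToyNode {
    std::string key;       // stands in for the node's content
    bool neededAfterLink;  // true if the linked output still refers to it
};

struct ToyModule {
    std::vector<ToyNode> nodes;  // metadata owned by this module alone
};

struct LinkStats {
    std::size_t finalNodes;  // nodes alive once every source is deleted
    std::size_t peakNodes;   // most nodes alive at any one time
};

// Load, merge, and delete one source module at a time.
LinkStats linkIncrementally(const std::vector<ToyModule>& inputs) {
    std::set<std::string> dest;  // nodes owned by the destination module
    std::size_t peak = 0;
    for (const ToyModule& src : inputs) {
        // Merge: copy only the nodes the linked output needs.
        for (const ToyNode& n : src.nodes)
            if (n.neededAfterLink) dest.insert(n.key);
        // Source and destination are both alive until the source is deleted.
        peak = std::max(peak, dest.size() + src.nodes.size());
        // Deleting the source frees all of its nodes, needed or not.
    }
    assert(peak >= dest.size());
    return LinkStats{dest.size(), peak};
}

// For comparison: with one shared uniquing context, every distinct node
// created while loading stays alive until the context is destroyed.
std::size_t nodesRetainedInSharedContext(const std::vector<ToyModule>& inputs) {
    std::set<std::string> context;
    for (const ToyModule& src : inputs)
        for (const ToyNode& n : src.nodes) context.insert(n.key);
    return context.size();
}
```

In this model the cost of not sharing shows up only in peakNodes, because a source module and the destination each hold their own copy of a common node while both are alive. The benefit shows up in finalNodes, which counts only what the destination needs.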

Perhaps I just am being naive about how the LTO step works or there are
other complications. I just wanted to make sure we considered the easy path
of the module owning the metadata before introducing something to walk all
metadata and delete unreachable bits.
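
For reference, the alternative mentioned here (walking all metadata and deleting what is unreachable) is the mark-and-sweep idea behind the proposed removeUnusedMDNodes. Below is a minimal sketch of it, next to the use-count check from option 2 of the proposal. It is a toy graph model, not the LLVM data structures: ToyGraph, unreachableNodes and nodesWithoutUsers are hypothetical names, nodes are integer ids, and "roots" stands for whatever the remaining modules reference directly.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy metadata graph: operands[i] lists the node ids that node i refers to.
struct ToyGraph {
    std::vector<std::vector<std::size_t>> operands;
};

// Mark and sweep: mark everything reachable from the roots, then report
// every unmarked node as removable. Ids are returned in ascending order.
std::vector<std::size_t> unreachableNodes(const ToyGraph& g,
                                          const std::vector<std::size_t>& roots) {
    std::vector<bool> marked(g.operands.size(), false);
    std::vector<std::size_t> worklist(roots.begin(), roots.end());
    while (!worklist.empty()) {
        std::size_t n = worklist.back();
        worklist.pop_back();
        assert(n < g.operands.size());
        if (marked[n]) continue;  // already visited; this also breaks cycles
        marked[n] = true;
        for (std::size_t op : g.operands[n])
            if (!marked[op]) worklist.push_back(op);
    }
    std::vector<std::size_t> dead;
    for (std::size_t i = 0; i < marked.size(); ++i)
        if (!marked[i]) dead.push_back(i);
    return dead;
}

// Analogue of the use_empty() check: report only nodes that nothing refers
// to, neither another node nor a root.
std::vector<std::size_t> nodesWithoutUsers(const ToyGraph& g,
                                           const std::vector<std::size_t>& roots) {
    std::vector<std::size_t> uses(g.operands.size(), 0);
    for (const auto& ops : g.operands)
        for (std::size_t op : ops) ++uses[op];
    for (std::size_t r : roots) ++uses[r];
    std::vector<std::size_t> unused;
    for (std::size_t i = 0; i < uses.size(); ++i)
        if (uses[i] == 0) unused.push_back(i);
    return unused;
}
```

If nodes 2 and 3 refer to each other and nothing else refers to either, each has one user, so the use-count check never reports them, while the mark phase never reaches them and the sweep does. That is the cycle limitation noted for option 2.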

Nick Kledzik

2013-Nov-13 21:55 UTC

[LLVMdev] Proposal: release MDNodes for source modules (LTO+debug info)

On Nov 12, 2013, at 6:11 PM, Chandler Carruth <chandlerc at google.com> wrote:
> On Tue, Nov 12, 2013 at 6:07 PM, Manman Ren <manman.ren at gmail.com> wrote:
>> Hi Chandler,
>>
>> I don't quite get why you think sharing is not buying us anything...
>> It reduces the memory footprint of the source modules (there is sharing
>> among the source modules) and the number of MDNodes created for the
>> destination module (we do not need to re-create the MDNodes that can be
>> shared).
>>
>> The amount of sharing may not be that much but it still exists.
>>
>> I had some experiments earlier on building clang with "-flto -g", if we
>> dis-allow sharing between source modules and destination module, the
>> memory footprint for MDNodes will increase by 15%.
>
> So, in my naive view, we do something like the following:
>
> 0) load a source module
> 1) load another source module
> 2) merge the second module into the first
> 3) delete the second module
> 4) while there are more source modules, goto 1

I'll describe how the darwin linker uses the LTO interface.  It may be
amenable to earlier module deletion.

1) The darwin linker mmap()s each input file.  If it is a bitcode file, it
   calls lto_module_create_from_memory(), then lto_module_get_num_symbols()
   and lto_module_get_symbol_*() to discover what the module provides and
   needs.

2) After all object files are loaded (which means no undefined symbols are
   left), the linker then calls lto_codegen_create() and then, in a for-loop,
   calls lto_codegen_add_module() on each module previously loaded.

3) After lto_codegen_compile() has returned, the linker does clean up and
   deletes each module with lto_module_dispose().

It sounds like the linker could call lto_module_dispose() right after
lto_codegen_add_module() to help reduce the memory footprint.  That would be a
simple linker change.  A slightly larger linker change would be to immediately
call lto_codegen_add_module() right after lto_module_create_from_memory(), then
lto_module_dispose().  That is, never have any unmerged modules lying around.
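
The three call orders discussed here can be compared with a small counting model. This is only a sketch: it does not call libLTO, Order, OrderStats and simulate are hypothetical names, and it counts live module objects rather than bytes. It also assumes that disposing a module right after it has been added to the code generator is safe, which would need to be confirmed against the libLTO implementation.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// The three orderings: current behaviour, dispose right after add_module,
// and load + add + dispose one input at a time.
enum class Order { DisposeAfterCompile, DisposeAfterAdd, MergeOnLoad };

struct OrderStats {
    std::size_t peakLive;           // most unmerged modules alive at once
    std::size_t liveDuringCompile;  // unmerged modules alive while compiling
};

OrderStats simulate(std::size_t inputCount, Order order) {
    std::size_t live = 0, peak = 0;
    // load models lto_module_create_from_memory() plus the symbol queries.
    auto load = [&] { ++live; peak = std::max(peak, live); };
    // dispose models lto_module_dispose().
    auto dispose = [&] { assert(live > 0); --live; };

    if (order == Order::MergeOnLoad) {
        for (std::size_t i = 0; i < inputCount; ++i) {
            load();
            // the lto_codegen_add_module() call would go here
            dispose();
        }
    } else {
        for (std::size_t i = 0; i < inputCount; ++i) load();  // step 1
        for (std::size_t i = 0; i < inputCount; ++i) {        // step 2
            // the lto_codegen_add_module() call would go here
            if (order == Order::DisposeAfterAdd) dispose();
        }
    }
    std::size_t duringCompile = live;  // lto_codegen_compile() runs here
    if (order == Order::DisposeAfterCompile)
        for (std::size_t i = 0; i < inputCount; ++i) dispose();  // step 3
    return OrderStats{peak, duringCompile};
}
```

In this model the simple change removes the unmerged modules before compilation but still has all of them loaded at the peak. Only the larger change brings the peak down to one. Whether either change actually releases MDNodes depends on the ownership question raised earlier in the thread.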

I have no idea if these sorts of changes would work for the gold plugin.

-Nick
