thr3ads.net - llvm dev - [LLVMdev] [RFC] Less memory and greater maintainability for debug info IR [Oct 2014]

Duncan P. N. Exon Smith

2014-Oct-13 22:02 UTC

[LLVMdev] [RFC] Less memory and greater maintainability for debug info IR

In r219010, I merged integer and string fields into a single header
field.  By reducing the number of metadata operands used in debug info,
this saved 2.2GB on an `llvm-lto` bootstrap.  I've done some profiling
of DW_TAGs to see what parts of PR17891 and PR17892 to tackle next, and
I've concluded that they will be insufficient.

Instead, I'd like to implement a more aggressive plan, which as a
side-effect cleans up the much "loved" debug info IR assembly syntax.

At a high-level, the idea is to create distinct subclasses of `Value`
for each debug info concept, starting with line table entries and moving
on to the DIDescriptor hierarchy.  By leveraging the use-list
infrastructure for metadata operands -- i.e., only using value handles
for non-metadata operands -- we'll improve memory usage and increase
RAUW speed.

My rough plan follows.  I quote some numbers for memory savings below
based on an -flto -g bootstrap of `llvm-lto` (i.e., running `llvm-lto`
on `llvm-lto.lto.bc`, an already-linked bitcode file dumped by ld64's
-save-temps option) that currently peaks at 15.3GB.

 1. Introduce `MDUser`, which inherits from `User`, and whose `Use`s
    must all be metadata.  The cost per operand is 1 pointer, vs. 4
    pointers in an `MDNode`.

 2. Create `MDLineTable` as the first subclass of `MDUser`.  Use normal
    fields (not `Value`s) for the line and column, and use `Use`
    operands for the metadata operands.

    On x86-64, this will save 104B / line table entry.  Linking
    `llvm-lto` uses ~7M line-table entries, so this on its own saves
    ~700MB.

    Sketch of class definition:

        class MDLineTable : public MDUser {
          unsigned Line;
          unsigned Column;
        public:
          static MDLineTable *get(unsigned Line, unsigned Column,
                                  MDNode *Scope);
          static MDLineTable *getInlined(MDLineTable *Base, MDNode *Scope);
          static MDLineTable *getBase(MDLineTable *Inlined);

          unsigned getLine() const { return Line; }
          unsigned getColumn() const { return Column; }
          bool isInlined() const { return getNumOperands() == 2; }
          MDNode *getScope() const { return getOperand(0); }
          MDNode *getInlinedAt() const { return getOperand(1); }
        };

    Proposed assembly syntax:

        ; Not inlined.
        !7 = metadata !MDLineTable(line: 45, column: 7, scope: metadata !9)

        ; Inlined.
        !7 = metadata !MDLineTable(line: 45, column: 7, scope: metadata !9,
                                   inlinedAt: metadata !10)

        ; Column defaulted to 0.
        !7 = metadata !MDLineTable(line: 45, scope: metadata !9)

    (What colour should that bike shed be?)

 3. (Optional) Rewrite `DebugLoc` lookup tables.  My profiling shows
    that we have 3.5M entries in the `DebugLoc` side-vectors for 7M line
    table entries.  The cost of these is ~180B each, for another
    ~600MB.

    If we integrate a side-table of `MDLineTable`s into its uniquing,
    the overhead is only ~12B / line table entry, or ~80MB.  This saves
    520MB.

    This is somewhat perpendicular to redesigning the metadata format,
    but IMO it's worth doing as soon as it's possible.

 4. Create `GenericDebugMDNode`, a transitional subclass of `MDUser`
    through an intermediate class `DebugMDNode` with an
    allocation-time-optional `CallbackVH` available for referencing
    non-metadata.  Change `DIDescriptor` to wrap a `DebugMDNode` instead
    of an `MDNode`.

    This saves another ~960MB, for a running total of ~2GB.

    Proposed assembly syntax:

        !7 = metadata !GenericDebugMDNode(tag: DW_TAG_compile_unit,
                                          fields: "0\00clang 3.6\00...",
                                          operands: { metadata !8, ... })

        !7 = metadata !GenericDebugMDNode(tag: DW_TAG_variable,
                                          fields: "global_var\00...",
                                          operands: { metadata !8, ... },
                                          handle: i32* @global_var)

    This syntax pulls the tag out of the current header-string, calls
    the rest of the header "fields", and includes the metadata operands
    in "operands".

 5. Incrementally create subclasses of `DebugMDNode`, such as
    `MDCompileUnit` and `MDSubprogram`.  Sub-classed nodes replace the
    "fields" and "operands" catch-alls with explicit names for each
    operand.

    Proposed assembly syntax:

        !7 = metadata !MDSubprogram(line: 45, name: "foo", displayName: "foo",
                                    linkageName: "_Z3foov", file: metadata !8,
                                    function: i32 (i32)* @foo)

 6. Remove the dead code for `GenericDebugMDNode`.

 7. (Optional) Refactor `DebugMDNode` sub-classes to minimize RAUW
    traffic during bitcode serialization.  Now that metadata types are
    known, we can write debug info out in an order that makes it cheap
    to read back in.

    Note that using `MDUser` will make RAUW much cheaper, since we're
    using the use-list infrastructure for most of them.  If RAUW isn't
    showing up in a profile, I may skip this.
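
As a rough cross-check of the savings quoted in steps 2 to 4, the arithmetic can be sketched as below. The inputs are the approximate figures from the plan; the constant and function names are made up for this illustration. Whether the "running total" includes the optional step 3 is not stated, so this sketch assumes it does.

```cpp
#include <cstdint>

// Approximate figures quoted in the plan (llvm-lto bootstrap, x86-64).
constexpr std::uint64_t LineTableEntries = 7000000;  // ~7M line-table entries
constexpr std::uint64_t BytesSavedPerEntry = 104;    // step 2: saved per entry
constexpr std::uint64_t DebugLocEntries = 3500000;   // step 3: side-vector entries
constexpr std::uint64_t DebugLocEntryBytes = 180;    // step 3: ~cost of each today
constexpr std::uint64_t SideTableBytes = 12;         // step 3: ~overhead per entry
constexpr std::uint64_t Step4SavingsMB = 960;        // step 4: quoted directly

// Decimal megabytes, rounded down. The plan rounds more coarsely.
constexpr std::uint64_t toMB(std::uint64_t Bytes) { return Bytes / 1000000; }

// Step 2: bytes saved per entry times number of entries.
constexpr std::uint64_t step2SavingsMB() {
  return toMB(LineTableEntries * BytesSavedPerEntry);
}

// Step 3: current side-vector cost versus the proposed side-table cost.
constexpr std::uint64_t step3CurrentMB() {
  return toMB(DebugLocEntries * DebugLocEntryBytes);
}
constexpr std::uint64_t step3ProposedMB() {
  return toMB(LineTableEntries * SideTableBytes);
}
constexpr std::uint64_t step3SavingsMB() {
  return step3CurrentMB() - step3ProposedMB();
}

// Assumption: the running total adds steps 2, 3 and 4 together.
constexpr std::uint64_t runningTotalMB() {
  return step2SavingsMB() + step3SavingsMB() + Step4SavingsMB;
}
```

With unrounded inputs this gives 728MB for step 2 and 546MB for step 3, which line up with the rounded ~700MB and 520MB above. All of these are savings against the 15.3GB peak.
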

Does this direction seem reasonable?  Any major problems I've missed?

David Blaikie

2014-Oct-13 22:23 UTC

On Mon, Oct 13, 2014 at 3:02 PM, Duncan P. N. Exon Smith <
dexonsmith at apple.com> wrote:
> In r219010, I merged integer and string fields into a single header
> field.  By reducing the number of metadata operands used in debug info,
> this saved 2.2GB on an `llvm-lto` bootstrap.  I've done some profiling
> of DW_TAGs to see what parts of PR17891 and PR17892 to tackle next, and
> I've concluded that they will be insufficient.
>
Could you explain what your end-goal here looked like and what data you
used to evaluate its insufficiency?

Just to be clear, what I was picturing was that, starting with your initial
improvement, we'd string-ify more data in the records but eventually we'd
start stringifying across records (eg: rolling a DW_TAG_structure_type's
members into the structure type itself, one big string). In the end we'd
just pull out the non-metadata references (like the llvm::Function* in the
DW_TAG_subroutine_type metadata) into a table kept separately from a
handful of big strings of debug info (I say a handful, as we'd keep the
types separate so they could be easily deduplicated).

> Instead, I'd like to implement a more aggressive plan, which as a
> side-effect cleans up the much "loved" debug info IR assembly syntax.
>
> At a high-level, the idea is to create distinct subclasses of `Value`
> for each debug info concept,

My concern with this is baking parts of our current debug info
representation into IR constructs seems rather heavyweight. If we need to
add first class IR constructs to cope with debug info I'd hope to find,
ideally, one, general purpose extension we can use for this (& possibly for
other things). But maybe the bar for adding first class IR constructs is
lower than I've imagined it to be.

> starting with line table entries and moving
> on to the DIDescriptor hierarchy.  By leveraging the use-list
> infrastructure for metadata operands -- i.e., only using value handles
> for non-metadata operands -- we'll improve memory usage and increase
> RAUW speed.
>
> My rough plan follows.  I quote some numbers for memory savings below
> based on an -flto -g bootstrap of `llvm-lto` (i.e., running `llvm-lto`
> on `llvm-lto.lto.bc`, an already-linked bitcode file dumped by ld64's
> -save-temps option) that currently peaks at 15.3GB.
>
>  1. Introduce `MDUser`, which inherits from `User`, and whose `Use`s
>     must all be metadata.  The cost per operand is 1 pointer, vs. 4
>     pointers in an `MDNode`.
>
Perhaps a generic MD-only-node might be a sufficiently generically valuable
IR construct.

A similar alternative: A schematized metadata node. Much like DWARF, being
able to say "this node is of some type T, defined elsewhere in the module -
string, int, string, string, etc... ". Heck, this could even be just a
generic improvement to llvm IR, maybe? (the textual representation might
not need to change at all - IR Generation would just do much like DWARF
generation in LLVM does - create abbreviation/type descriptions on the fly
and share them rather than having every metadata node include its own
self-description)
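
One way to picture the schematized-node idea is the sketch below: a field layout stored once and shared by every node of that type. All type and variable names here are hypothetical, not existing LLVM classes. The "string, int, string, string" layout follows the example in the paragraph above, and the sample values are borrowed from the `MDSubprogram` example in the original proposal.

```cpp
#include <cstddef>

// Hypothetical field kinds that a schema can describe.
enum class FieldKind { String, Int };

// A schema is stored once and shared by every node of that type.
struct Schema {
  const char *Name;
  const FieldKind *Fields;
  std::size_t NumFields;
};

// One payload slot; which member is meaningful is decided by the schema.
struct Slot {
  const char *Str;
  long Int;
};

// A node carries a pointer to its shared schema plus its payload,
// instead of describing its own layout.
struct Node {
  const Schema *Type;
  const Slot *Slots;
};

// "string, int, string, string", as in the description above.
constexpr FieldKind SubprogramFields[] = {FieldKind::String, FieldKind::Int,
                                          FieldKind::String, FieldKind::String};
constexpr Schema SubprogramSchema = {"subprogram", SubprogramFields, 4};

// Two nodes that share one schema.
constexpr Slot FooSlots[] = {{"foo", 0}, {nullptr, 45}, {"foo", 0}, {"_Z3foov", 0}};
constexpr Slot BarSlots[] = {{"bar", 0}, {nullptr, 50}, {"bar", 0}, {"_Z3barv", 0}};
constexpr Node Foo = {&SubprogramSchema, FooSlots};
constexpr Node Bar = {&SubprogramSchema, BarSlots};

// Look the field kind up through the shared schema.
constexpr bool isIntField(const Node &N, std::size_t I) {
  return I < N.Type->NumFields && N.Type->Fields[I] == FieldKind::Int;
}
```
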

>
>  2. Create `MDLineTable` as the first subclass of `MDUser`.  Use normal
>     fields (not `Value`s) for the line and column, and use `Use`
>     operands for the metadata operands.
>
>     On x86-64, this will save 104B / line table entry.  Linking
>     `llvm-lto` uses ~7M line-table entries, so this on its own saves
>     ~700MB.
>
>     Sketch of class definition:
>
>         class MDLineTable : public MDUser {
>           unsigned Line;
>           unsigned Column;
>         public:
>           static MDLineTable *get(unsigned Line, unsigned Column,
>                                   MDNode *Scope);
>           static MDLineTable *getInlined(MDLineTable *Base, MDNode *Scope);
>           static MDLineTable *getBase(MDLineTable *Inlined);
>
>           unsigned getLine() const { return Line; }
>           unsigned getColumn() const { return Column; }
>           bool isInlined() const { return getNumOperands() == 2; }
>           MDNode *getScope() const { return getOperand(0); }
>           MDNode *getInlinedAt() const { return getOperand(1); }
>         };
>
>     Proposed assembly syntax:
>
>         ; Not inlined.
>         !7 = metadata !MDLineTable(line: 45, column: 7, scope: metadata !9)
>
>         ; Inlined.
>         !7 = metadata !MDLineTable(line: 45, column: 7, scope: metadata !9,
>                                    inlinedAt: metadata !10)
>
>         ; Column defaulted to 0.
>         !7 = metadata !MDLineTable(line: 45, scope: metadata !9)
>
>     (What colour should that bike shed be?)
>
>  3. (Optional) Rewrite `DebugLoc` lookup tables.  My profiling shows
>     that we have 3.5M entries in the `DebugLoc` side-vectors for 7M line
>     table entries.  The cost of these is ~180B each, for another
>     ~600MB.
>
>     If we integrate a side-table of `MDLineTable`s into its uniquing,
>     the overhead is only ~12B / line table entry, or ~80MB.  This saves
>     520MB.
>     This is somewhat perpendicular to redesigning the metadata format,
>     but IMO it's worth doing as soon as it's possible.
>
>  4. Create `GenericDebugMDNode`, a transitional subclass of `MDUser`
>     through an intermediate class `DebugMDNode` with an
>     allocation-time-optional `CallbackVH` available for referencing
>     non-metadata.  Change `DIDescriptor` to wrap a `DebugMDNode` instead
>     of an `MDNode`.
>
>     This saves another ~960MB,

960 from what?

> for a running total of ~2GB.
>
~2GB is the total of what? (you mention a lot of numbers in this post, but
it's not always clear what they're relative to/out of/subtracted from)

>
>     Proposed assembly syntax:
>
>         !7 = metadata !GenericDebugMDNode(tag: DW_TAG_compile_unit,
>                                           fields: "0\00clang 3.6\00...",
>                                           operands: { metadata !8, ... })
>
>         !7 = metadata !GenericDebugMDNode(tag: DW_TAG_variable,
>                                           fields: "global_var\00...",
>                                           operands: { metadata !8, ... },
>                                           handle: i32* @global_var)
>
>     This syntax pulls the tag out of the current header-string, calls
>     the rest of the header "fields", and includes the metadata operands
>     in "operands".
>
>  5. Incrementally create subclasses of `DebugMDNode`, such as
>     `MDCompileUnit` and `MDSubprogram`.  Sub-classed nodes replace the
>     "fields" and "operands" catch-alls with explicit names for each
>     operand.
>
I wouldn't mind seeing how expensive it would be if these schema
descriptions were within the module itself - so we didn't have to bake them
into the IR spec, but could still share them between every usage within a
module.

>
>     Proposed assembly syntax:
>
>         !7 = metadata !MDSubprogram(line: 45, name: "foo", displayName: "foo",
>                                     linkageName: "_Z3foov", file: metadata !8,
>                                     function: i32 (i32)* @foo)
>
>  6. Remove the dead code for `GenericDebugMDNode`.
>
>  7. (Optional) Refactor `DebugMDNode` sub-classes to minimize RAUW
>     traffic during bitcode serialization.  Now that metadata types are
>     known, we can write debug info out in an order that makes it cheap
>     to read back in.
>
>     Note that using `MDUser` will make RAUW much cheaper, since we're
>     using the use-list infrastructure for most of them.  If RAUW isn't
>     showing up in a profile, I may skip this.
>
> Does this direction seem reasonable?  Any major problems I've missed?
>

Reid Kleckner

2014-Oct-13 22:37 UTC

I think making debug info more of a first-class IR citizen is probably the
way to go. Right now debug info is completely unreadable and is downright
opposed to the design goals of the IR as I understand them.

Our backwards compatibility policy should give you the flexibility you need
to update the debug info representation as you go along:
http://llvm.org/docs/DeveloperPolicy.html#id18


David Blaikie

2014-Oct-13 22:47 UTC

On Mon, Oct 13, 2014 at 3:37 PM, Reid Kleckner <rnk at google.com> wrote:
> I think making debug info more of a first-class IR citizen is probably the
> way to go. Right now debug info is completely unreadable and is downright
> opposed to the design goals of the IR as I understand them.
>
I'm still not sure this would produce particularly more legible, let alone
writeable, debug info IR. It's possible, certainly, if the schema was baked
into IR reading and writing, that we could pretty print it with annotated
field names and allow writing the debug info with omitted fields (because
the parser would know that this was, say, a subprogram record, and be able
to reorder fields to the required schema or add default values for omitted
fields), but I'm not sure we'd get that far nor whether it would really tip
debug info to the point of writeability - it's still necessarily a format
that describes code, which tends towards being more ungainly than the code
itself. ("this thing is on line 42" rather than "thing" written on line 42)

I'd have to see examples & promises of where this would go/what value it
would add, but I'd still be fairly concerned about the ongoing costs.

> Our backwards compatibility policy should give you the flexibility you
> need to update the debug info representation as you go along:
> http://llvm.org/docs/DeveloperPolicy.html#id18
>
It's a rather heavy burden to carry. Currently we have a much lighter cost
to changing the debug info schema (rev the version number - any debug info
with an older version number is dropped on sight).
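
The "drop on sight" policy described above can be sketched roughly as follows. The version constant and the `ModuleSummary` type are hypothetical stand-ins invented for this illustration, not LLVM's actual data structures.

```cpp
// Hypothetical stand-in for the debug info version the current tools write.
constexpr unsigned CurrentDebugInfoVersion = 2;

// Hypothetical summary of a module being loaded; not an LLVM type.
struct ModuleSummary {
  unsigned DebugInfoVersion; // version recorded in the module
  bool HasDebugInfo;         // whether debug info is still attached
};

// Drop debug info written with an older schema version and leave everything
// else untouched. No attempt is made to upgrade old records.
constexpr ModuleSummary dropStaleDebugInfo(ModuleSummary M) {
  return (M.HasDebugInfo && M.DebugInfoVersion < CurrentDebugInfoVersion)
             ? ModuleSummary{M.DebugInfoVersion, false}
             : M;
}
```

The point of the sketch is the cost model: a schema change needs only a version bump, with no reader for the old format.
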


Duncan P. N. Exon Smith

2014-Oct-13 23:30 UTC

> On Oct 13, 2014, at 3:23 PM, David Blaikie <dblaikie at gmail.com> wrote:
>
> On Mon, Oct 13, 2014 at 3:02 PM, Duncan P. N. Exon Smith <dexonsmith at apple.com> wrote:
>> In r219010, I merged integer and string fields into a single header
>> field.  By reducing the number of metadata operands used in debug info,
>> this saved 2.2GB on an `llvm-lto` bootstrap.  I've done some profiling
>> of DW_TAGs to see what parts of PR17891 and PR17892 to tackle next, and
>> I've concluded that they will be insufficient.
>>
> Could you explain what your end-goal here looked like and what data you
> used to evaluate its insufficiency?

In the links of C++ programs I've looked at, most `Value`s are line
tables and local variables.  E.g., for the `llvm-lto.lto.bc` case
I've used for memory numbers:

  - 23967800 Value
      - 16837368 MDNode
          - 7611669 DIDescriptor
              - 4373879 DW_TAG_arg_variable
              - 1341021 DW_TAG_subprogram
              -  554992 DW_TAG_auto_variable
              -  360390 DW_TAG_lexical_block
              -  354166 DW_TAG_subroutine_type
          - 7500000 line table entries
      -  5850877 User
      -   693869 MDString

IIUC, line tables and local variables need to be referenced directly
from the rest of the IR, so they can't be sunk into other nodes.

Relevant to your question, I didn't see a way to sufficiently decrease
the numbers of these (or the number of their operands).
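
The proportions behind the counts above can be worked out with a short sketch. The counts are copied from the list; the groupings are assumptions made for this illustration: "local variables" is taken to mean `DW_TAG_arg_variable` plus `DW_TAG_auto_variable`, and "metadata" to mean `MDNode` plus `MDString`.

```cpp
#include <cstdint>

// Counts copied from the profile above (llvm-lto.lto.bc).
constexpr std::uint64_t TotalValues = 23967800;
constexpr std::uint64_t MDNodes = 16837368;
constexpr std::uint64_t MDStrings = 693869;
constexpr std::uint64_t LineTableEntries = 7500000;
constexpr std::uint64_t ArgVariables = 4373879;  // DW_TAG_arg_variable
constexpr std::uint64_t AutoVariables = 554992;  // DW_TAG_auto_variable

// Integer percentage of all Values, rounded down.
constexpr std::uint64_t percentOfValues(std::uint64_t Count) {
  return Count * 100 / TotalValues;
}

// Assumption: "local variables" means argument plus automatic variables.
constexpr std::uint64_t lineTablesAndLocalsPercent() {
  return percentOfValues(LineTableEntries + ArgVariables + AutoVariables);
}

// Assumption: "metadata" means MDNode plus MDString.
constexpr std::uint64_t metadataPercent() {
  return percentOfValues(MDNodes + MDStrings);
}
```

Under these assumptions, line tables and local variables alone make up about 51% of all `Value`s, and metadata as a whole about 73%.
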
> Just to be clear, what I was picturing was that, starting with your initial
> improvement, we'd string-ify more data in the records but eventually we'd
> start stringifying across records (eg: rolling a DW_TAG_structure_type's
> members into the structure type itself, one big string). In the end we'd
> just pull out the non-metadata references (like the llvm::Function* in the
> DW_TAG_subroutine_type metadata) into a table kept separately from a
> handful of big strings of debug info (I say a handful, as we'd keep the
> types separate so they could be easily deduplicated).

I was thinking along the same lines.  Unfortunately, there aren't
enough types left for that to make a big impact.

Unless you envisioned a completely different way of dealing with
`@llvm.dbg.value` and `!dbg` references?

>> Instead, I'd like to implement a more aggressive plan, which as a
>> side-effect cleans up the much "loved" debug info IR assembly syntax.
>> 
>> At a high-level, the idea is to create distinct subclasses of `Value`
>> for each debug info concept, 
> 
> My concern with this is baking parts of our current debug info
> representation into IR constructs seems rather heavyweight. If we need to
> add first class IR constructs to cope with debug info I'd hope to find,
> ideally, one, general purpose extension we can use for this (& possibly for
> other things). But maybe the bar for adding first class IR constructs is
> lower than I've imagined it to be.

Since 75% of all `Value`s are debug info, representing them well
seems worthwhile to me.

>> starting with line table entries and moving
>> on to the DIDescriptor hierarchy.  By leveraging the use-list
>> infrastructure for metadata operands -- i.e., only using value handles
>> for non-metadata operands -- we'll improve memory usage and increase
>> RAUW speed.
>>
>> My rough plan follows.

(Note the following sentence, which I think you missed.)

>> I quote some numbers for memory savings below
>> based on an -flto -g bootstrap of `llvm-lto` (i.e., running `llvm-lto`
>> on `llvm-lto.lto.bc`, an already-linked bitcode file dumped by ld64's
>> -save-temps option) that currently peaks at 15.3GB.
>> 
>>  1. Introduce `MDUser`, which inherits from `User`, and whose `Use`s
>>     must all be metadata.  The cost per operand is 1 pointer, vs. 4
>>     pointers in an `MDNode`.
> 
> Perhaps a generic MD-only-node might be a sufficiently generically valuable
> IR construct.
>
> A similar alternative: A schematized metadata node. Much like DWARF, being
> able to say "this node is of some type T, defined elsewhere in the module -
> string, int, string, string, etc... ". Heck, this could even be just a
> generic improvement to llvm IR, maybe? (the textual representation might
> not need to change at all - IR Generation would just do much like DWARF
> generation in LLVM does - create abbreviation/type descriptions on the fly
> and share them rather than having every metadata node include its own
> self-description)
>
"Being generic" seems like a defect to me, not a feature.  If you need
to add support for every IR construct to the backend to emit DIEs, etc.,
then what's the benefit in being able to express arbitrary other things?

>>  2. Create `MDLineTable` as the first subclass of `MDUser`.  Use normal
>>     fields (not `Value`s) for the line and column, and use `Use`
>>     operands for the metadata operands.
>> 
>>     On x86-64, this will save 104B / line table entry.  Linking
>>     `llvm-lto` uses ~7M line-table entries, so this on its own saves
>>     ~700MB.
>> 
>>     Sketch of class definition:
>> 
>>         class MDLineTable : public MDUser {
>>           unsigned Line;
>>           unsigned Column;
>>         public:
>>           static MDLineTable *get(unsigned Line, unsigned Column,
>>                                   MDNode *Scope);
>>           static MDLineTable *getInlined(MDLineTable *Base,
>>                                          MDNode *Scope);
>>           static MDLineTable *getBase(MDLineTable *Inlined);
>> 
>>           unsigned getLine() const { return Line; }
>>           unsigned getColumn() const { return Column; }
>>           bool isInlined() const { return getNumOperands() == 2; }
>>           MDNode *getScope() const { return getOperand(0); }
>>           MDNode *getInlinedAt() const { return getOperand(1); }
>>         };
>> 
>>     Proposed assembly syntax:
>> 
>>         ; Not inlined.
>>         !7 = metadata !MDLineTable(line: 45, column: 7, scope: metadata !9)
>> 
>>         ; Inlined.
>>         !7 = metadata !MDLineTable(line: 45, column: 7, scope: metadata !9,
>>                                    inlinedAt: metadata !10)
>> 
>>         ; Column defaulted to 0.
>>         !7 = metadata !MDLineTable(line: 45, scope: metadata !9)
>> 
>>     (What colour should that bike shed be?)
>> 
>>  3. (Optional) Rewrite `DebugLoc` lookup tables.  My profiling shows
>>     that we have 3.5M entries in the `DebugLoc` side-vectors for 7M line
>>     table entries.  The cost of these is ~180B each, for another
>>     ~600MB.
>> 
>>     If we integrate a side-table of `MDLineTable`s into its uniquing,
>>     the overhead is only ~12B / line table entry, or ~80MB.  This saves
>>     520MB. 
>> 
>>     This is somewhat perpendicular to redesigning the metadata format,
>>     but IMO it's worth doing as soon as it's possible.
>> 
>>  4. Create `GenericDebugMDNode`, a transitional subclass of `MDUser`
>>     through an intermediate class `DebugMDNode` with an
>>     allocation-time-optional `CallbackVH` available for referencing
>>     non-metadata.  Change `DIDescriptor` to wrap a `DebugMDNode` instead
>>     of an `MDNode`.
>> 
>>     This saves another ~960MB, 
> 
> 960 from what?
This number references the sentence noted above.
>  
>> for a running total of ~2GB.
> 
> ~2GB is the total of what? (you mention a lot of numbers in this post, but
> it's not always clear what they're relative to/out of/subtracted from)
This number references the sentence noted above.
>>  
>>     Proposed assembly syntax:
>> 
>>         !7 = metadata !GenericDebugMDNode(tag: DW_TAG_compile_unit,
>>                                           fields: "0\00clang 3.6\00...",
>>                                           operands: { metadata !8, ... })
>> 
>>         !7 = metadata !GenericDebugMDNode(tag: DW_TAG_variable,
>>                                           fields: "global_var\00...",
>>                                           operands: { metadata !8, ... },
>>                                           handle: i32* @global_var)
>> 
>>     This syntax pulls the tag out of the current header-string, calls
>>     the rest of the header "fields", and includes the metadata operands
>>     in "operands".
>> 
>>  5. Incrementally create subclasses of `DebugMDNode`, such as
>>     `MDCompileUnit` and `MDSubprogram`.  Sub-classed nodes replace the
>>     "fields" and "operands" catch-alls with explicit names for each
>>     operand.
> 
> I wouldn't mind seeing how expensive it would be if these schema
> descriptions were within the module itself - so we didn't have to bake them
> into the IR spec, but could still share them between every usage within a
> module.
It's already baked into the IR spec, since the backend needs to
understand debug info to emit it.  We might as well understand what
exactly we're representing by formalizing it.
>  
>> 
>>     Proposed assembly syntax:
>> 
>>         !7 = metadata !MDSubprogram(line: 45, name: "foo", displayName: "foo",
>>                                     linkageName: "_Z3foov", file: metadata !8,
>>                                     function: i32 (i32)* @foo)
>> 
>>  6. Remove the dead code for `GenericDebugMDNode`.
>> 
>>  7. (Optional) Refactor `DebugMDNode` sub-classes to minimize RAUW
>>     traffic during bitcode serialization.  Now that metadata types are
>>     known, we can write debug info out in an order that makes it cheap
>>     to read back in.
>> 
>>     Note that using `MDUser` will make RAUW much cheaper, since we're
>>     using the use-list infrastructure for most of them.  If RAUW isn't
>>     showing up in a profile, I may skip this.
>> 
>> Does this direction seem reasonable?  Any major problems I've missed?

Sean Silva

2014-Oct-14 01:59 UTC


[LLVMdev] [RFC] Less memory and greater maintainability for debug info IR

For those interested, I've attached some pie charts based on Duncan's data
in one of the other posts; successive slides break down the usage
increasingly finely. To my understanding, they represent the number of
`Value`s (and subclasses) allocated.

On Mon, Oct 13, 2014 at 3:02 PM, Duncan P. N. Exon Smith <
dexonsmith at apple.com> wrote:
> In r219010, I merged integer and string fields into a single header
> field.  By reducing the number of metadata operands used in debug info,
> this saved 2.2GB on an `llvm-lto` bootstrap.  I've done some profiling
> of DW_TAGs to see what parts of PR17891 and PR17892 to tackle next, and
> I've concluded that they will be insufficient.
>
> Instead, I'd like to implement a more aggressive plan, which as a
> side-effect cleans up the much "loved" debug info IR assembly syntax.
>
> At a high-level, the idea is to create distinct subclasses of `Value`
> for each debug info concept, starting with line table entries and moving
> on to the DIDescriptor hierarchy.  By leveraging the use-list
> infrastructure for metadata operands -- i.e., only using value handles
> for non-metadata operands -- we'll improve memory usage and increase
> RAUW speed.
>
> My rough plan follows.  I quote some numbers for memory savings below
> based on an -flto -g bootstrap of `llvm-lto` (i.e., running `llvm-lto`
> on `llvm-lto.lto.bc`, an already-linked bitcode file dumped by ld64's
> -save-temps option) that currently peaks at 15.3GB.
>
Stupid question, but when I was working on LTO last Summer the primary
culprit for excessive memory use was due to us not being smart when linking
the IR together (Espindola would know more details). Do we still have that
problem? For starters, how does the memory usage of just llvm-link compare
to the memory usage of the actual LTO run? If the issue I was seeing last
Summer is still there, you should see that the invocation of llvm-link is
actually the most memory-intensive part of the LTO step, by far.


Also, you seem to really like saying "peak" here. Is there a definite peak?
When does it occur?


>
>  1. Introduce `MDUser`, which inherits from `User`, and whose `Use`s
>     must all be metadata.  The cost per operand is 1 pointer, vs. 4
>     pointers in an `MDNode`.
>
>  2. Create `MDLineTable` as the first subclass of `MDUser`.  Use normal
>     fields (not `Value`s) for the line and column, and use `Use`
>     operands for the metadata operands.
>
>     On x86-64, this will save 104B / line table entry.  Linking
>     `llvm-lto` uses ~7M line-table entries, so this on its own saves
>     ~700MB.
>     Sketch of class definition:
>
>         class MDLineTable : public MDUser {
>           unsigned Line;
>           unsigned Column;
>         public:
>           static MDLineTable *get(unsigned Line, unsigned Column,
>                                   MDNode *Scope);
>           static MDLineTable *getInlined(MDLineTable *Base, MDNode *Scope);
>           static MDLineTable *getBase(MDLineTable *Inlined);
>
>           unsigned getLine() const { return Line; }
>           unsigned getColumn() const { return Column; }
>           bool isInlined() const { return getNumOperands() == 2; }
>           MDNode *getScope() const { return getOperand(0); }
>           MDNode *getInlinedAt() const { return getOperand(1); }
>         };
>
>     Proposed assembly syntax:
>
>         ; Not inlined.
>         !7 = metadata !MDLineTable(line: 45, column: 7, scope: metadata !9)
>
>         ; Inlined.
>         !7 = metadata !MDLineTable(line: 45, column: 7, scope: metadata !9,
>                                    inlinedAt: metadata !10)
>
>         ; Column defaulted to 0.
>         !7 = metadata !MDLineTable(line: 45, scope: metadata !9)
>
>     (What colour should that bike shed be?)
>
>  3. (Optional) Rewrite `DebugLoc` lookup tables.  My profiling shows
>     that we have 3.5M entries in the `DebugLoc` side-vectors for 7M line
>     table entries.  The cost of these is ~180B each, for another
>     ~600MB.
>
>     If we integrate a side-table of `MDLineTable`s into its uniquing,
>     the overhead is only ~12B / line table entry, or ~80MB.  This saves
>     520MB.
>
>     This is somewhat perpendicular to redesigning the metadata format,
>     but IMO it's worth doing as soon as it's possible.
>
>  4. Create `GenericDebugMDNode`, a transitional subclass of `MDUser`
>     through an intermediate class `DebugMDNode` with an
>     allocation-time-optional `CallbackVH` available for referencing
>     non-metadata.  Change `DIDescriptor` to wrap a `DebugMDNode` instead
>     of an `MDNode`.
>
>     This saves another ~960MB, for a running total of ~2GB.
>
2GB (out of 15.3GB i.e. ~13%) seems pretty pathetic savings when we have a
single pie slice near 40% of the # of Value's allocated and another at 21%.
Especially this being "step 4".

As a rough back of the envelope calculation, dividing 15.3GB by ~24 million
Values gives about 600 bytes per Value. That seems sort of excessive (but
is it realistic?). All of the data types that you are proposing to shrink
fall far short of this "average size", meaning that if you are trying to
reduce memory usage, you might be looking in the wrong place. Something
smells fishy. At the very least, this would indicate that the real memory
usage is elsewhere.
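
The back-of-the-envelope division above can be written out as a small sketch. It assumes "15.3GB" means 15.3e9 bytes and "~24 million" means exactly 24e6; the constant and function names are made up for illustration and do not come from LLVM.

```cpp
#include <cassert>
#include <cstdint>

// Peak footprint quoted in the thread, taken here as 15.3e9 bytes
// (an assumption; the thread does not say whether GB or GiB is meant).
constexpr double PeakBytes = 15.3e9;

// Rough count of live Values, rounded as in the message above.
constexpr double NumValues = 24e6;

// Average bytes per Value if the whole peak were attributed to Values.
constexpr double averageBytesPerValue(double TotalBytes, double Count) {
  return TotalBytes / Count;
}

// Fraction of the peak that a given saving represents.
constexpr double fractionOfPeak(double SavedBytes, double TotalBytes) {
  return SavedBytes / TotalBytes;
}
```

With these inputs, `averageBytesPerValue(PeakBytes, NumValues)` comes to a little over 600 bytes, and `fractionOfPeak(2e9, PeakBytes)` to about 13%, matching the figures quoted above.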

A pie chart breaking down the total memory usage seems essential to have
here.

>
>     Proposed assembly syntax:
>
>         !7 = metadata !GenericDebugMDNode(tag: DW_TAG_compile_unit,
>                                           fields: "0\00clang 3.6\00...",
>                                           operands: { metadata !8, ... })
>
>         !7 = metadata !GenericDebugMDNode(tag: DW_TAG_variable,
>                                           fields: "global_var\00...",
>                                           operands: { metadata !8, ... },
>                                           handle: i32* @global_var)
>
>     This syntax pulls the tag out of the current header-string, calls
>     the rest of the header "fields", and includes the metadata operands
>     in "operands".
>
>  5. Incrementally create subclasses of `DebugMDNode`, such as
>     `MDCompileUnit` and `MDSubprogram`.  Sub-classed nodes replace the
>     "fields" and "operands" catch-alls with explicit names for each
>     operand.
>
>     Proposed assembly syntax:
>
>         !7 = metadata !MDSubprogram(line: 45, name: "foo", displayName: "foo",
>                                     linkageName: "_Z3foov", file: metadata !8,
>                                     function: i32 (i32)* @foo)
>
>  6. Remove the dead code for `GenericDebugMDNode`.
>
>  7. (Optional) Refactor `DebugMDNode` sub-classes to minimize RAUW
>     traffic during bitcode serialization.  Now that metadata types are
>     known, we can write debug info out in an order that makes it cheap
>     to read back in.
>
>     Note that using `MDUser` will make RAUW much cheaper, since we're
>     using the use-list infrastructure for most of them.  If RAUW isn't
>     showing up in a profile, I may skip this.
>
> Does this direction seem reasonable?  Any major problems I've missed?
>
You need more data. Right now you have essentially one data point, and it's
not even clear what you measured really. If your goal is saving memory, I
would expect at least a pie chart that breaks down LLVM's memory usage (not
just # of allocations of different sorts; an approximation is fine, as long
as you explain how you arrived at it and in what sense it approximates the
true number).

Do the numbers change significantly for different projects? (e.g. Chromium
or Firefox or a kernel or a large app you have handy to compile with LTO?).
If you have specific data you want (and a suggestion for how to gather it),
I can also get your numbers for one of our internal games as well.

Once you have some more data, then as a first step, I would like to see an
analysis of how much we can "ideally" expect to gain (back of the envelope
calculations == win).

-- Sean Silva

>
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
-------------- next part --------------
A non-text attachment was scrubbed...
Name: DebugInfoSize.pdf
Type: application/pdf
Size: 108040 bytes
Desc: not available
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20141013/b1da4b87/attachment.pdf>

Eric Christopher

2014-Oct-14 02:01 UTC


[LLVMdev] [RFC] Less memory and greater maintainability for debug info IR

On Mon, Oct 13, 2014 at 6:59 PM, Sean Silva <chisophugis at gmail.com> wrote:
> For those interested, I've attached some pie charts based on Duncan's data
> in one of the other posts; successive slides break down the usage
> increasingly finely. To my understanding, they represent the number of
> Value's (and subclasses) allocated.
>
> On Mon, Oct 13, 2014 at 3:02 PM, Duncan P. N. Exon Smith
> <dexonsmith at apple.com> wrote:
>>
>> In r219010, I merged integer and string fields into a single header
>> field.  By reducing the number of metadata operands used in debug info,
>> this saved 2.2GB on an `llvm-lto` bootstrap.  I've done some profiling
>> of DW_TAGs to see what parts of PR17891 and PR17892 to tackle next, and
>> I've concluded that they will be insufficient.
>>
>> Instead, I'd like to implement a more aggressive plan, which as a
>> side-effect cleans up the much "loved" debug info IR assembly syntax.
>>
>> At a high-level, the idea is to create distinct subclasses of `Value`
>> for each debug info concept, starting with line table entries and moving
>> on to the DIDescriptor hierarchy.  By leveraging the use-list
>> infrastructure for metadata operands -- i.e., only using value handles
>> for non-metadata operands -- we'll improve memory usage and increase
>> RAUW speed.
>>
>> My rough plan follows.  I quote some numbers for memory savings below
>> based on an -flto -g bootstrap of `llvm-lto` (i.e., running `llvm-lto`
>> on `llvm-lto.lto.bc`, an already-linked bitcode file dumped by ld64's
>> -save-temps option) that currently peaks at 15.3GB.
>
>
> Stupid question, but when I was working on LTO last Summer the primary
> culprit for excessive memory use was due to us not being smart when linking
> the IR together (Espindola would know more details). Do we still have that
> problem? For starters, how does the memory usage of just llvm-link compare
> to the memory usage of the actual LTO run? If the issue I was seeing last
> Summer is still there, you should see that the invocation of llvm-link is
> actually the most memory-intensive part of the LTO step, by far.
>
This is vague. Could you be more specific on where you saw all of the memory?

-eric
>
> Also, you seem to really like saying "peak" here. Is there a definite peak?
> When does it occur?
>
>
>>
>>
>>  1. Introduce `MDUser`, which inherits from `User`, and whose `Use`s
>>     must all be metadata.  The cost per operand is 1 pointer, vs. 4
>>     pointers in an `MDNode`.
>>
>>  2. Create `MDLineTable` as the first subclass of `MDUser`.  Use normal
>>     fields (not `Value`s) for the line and column, and use `Use`
>>     operands for the metadata operands.
>>
>>     On x86-64, this will save 104B / line table entry.  Linking
>>     `llvm-lto` uses ~7M line-table entries, so this on its own saves
>>     ~700MB.
>>
>>
>>     Sketch of class definition:
>>
>>         class MDLineTable : public MDUser {
>>           unsigned Line;
>>           unsigned Column;
>>         public:
>>           static MDLineTable *get(unsigned Line, unsigned Column,
>>                                   MDNode *Scope);
>>           static MDLineTable *getInlined(MDLineTable *Base,
>>                                          MDNode *Scope);
>>           static MDLineTable *getBase(MDLineTable *Inlined);
>>
>>           unsigned getLine() const { return Line; }
>>           unsigned getColumn() const { return Column; }
>>           bool isInlined() const { return getNumOperands() == 2; }
>>           MDNode *getScope() const { return getOperand(0); }
>>           MDNode *getInlinedAt() const { return getOperand(1); }
>>         };
>>
>>     Proposed assembly syntax:
>>
>>         ; Not inlined.
>>         !7 = metadata !MDLineTable(line: 45, column: 7, scope: metadata !9)
>>
>>         ; Inlined.
>>         !7 = metadata !MDLineTable(line: 45, column: 7, scope: metadata !9,
>>                                    inlinedAt: metadata !10)
>>
>>         ; Column defaulted to 0.
>>         !7 = metadata !MDLineTable(line: 45, scope: metadata !9)
>>
>>     (What colour should that bike shed be?)
>>
>>  3. (Optional) Rewrite `DebugLoc` lookup tables.  My profiling shows
>>     that we have 3.5M entries in the `DebugLoc` side-vectors for 7M line
>>     table entries.  The cost of these is ~180B each, for another
>>     ~600MB.
>>
>>     If we integrate a side-table of `MDLineTable`s into its uniquing,
>>     the overhead is only ~12B / line table entry, or ~80MB.  This saves
>>     520MB.
>>
>>     This is somewhat perpendicular to redesigning the metadata format,
>>     but IMO it's worth doing as soon as it's possible.
>>
>>  4. Create `GenericDebugMDNode`, a transitional subclass of `MDUser`
>>     through an intermediate class `DebugMDNode` with an
>>     allocation-time-optional `CallbackVH` available for referencing
>>     non-metadata.  Change `DIDescriptor` to wrap a `DebugMDNode` instead
>>     of an `MDNode`.
>>
>>     This saves another ~960MB, for a running total of ~2GB.
>
>
> 2GB (out of 15.3GB i.e. ~13%) seems pretty pathetic savings when we have a
> single pie slice near 40% of the # of Value's allocated and another at 21%.
> Especially this being "step 4".
>
> As a rough back of the envelope calculation, dividing 15.3GB by ~24 million
> Values gives about 600 bytes per Value. That seems sort of excessive (but is
> it realistic?). All of the data types that you are proposing to shrink fall
> far short of this "average size", meaning that if you are trying to reduce
> memory usage, you might be looking in the wrong place. Something smells
> fishy. At the very least, this would indicate that the real memory usage is
> elsewhere.
>
> A pie chart breaking down the total memory usage seems essential to have
> here.
>
>>
>>
>>     Proposed assembly syntax:
>>
>>         !7 = metadata !GenericDebugMDNode(tag: DW_TAG_compile_unit,
>>                                           fields: "0\00clang 3.6\00...",
>>                                           operands: { metadata !8, ... })
>>
>>         !7 = metadata !GenericDebugMDNode(tag: DW_TAG_variable,
>>                                           fields: "global_var\00...",
>>                                           operands: { metadata !8, ... },
>>                                           handle: i32* @global_var)
>>
>>     This syntax pulls the tag out of the current header-string, calls
>>     the rest of the header "fields", and includes the metadata operands
>>     in "operands".
>>
>>  5. Incrementally create subclasses of `DebugMDNode`, such as
>>     `MDCompileUnit` and `MDSubprogram`.  Sub-classed nodes replace the
>>     "fields" and "operands" catch-alls with explicit names for each
>>     operand.
>>
>>     Proposed assembly syntax:
>>
>>         !7 = metadata !MDSubprogram(line: 45, name: "foo", displayName: "foo",
>>                                     linkageName: "_Z3foov", file: metadata !8,
>>                                     function: i32 (i32)* @foo)
>>
>>  6. Remove the dead code for `GenericDebugMDNode`.
>>
>>  7. (Optional) Refactor `DebugMDNode` sub-classes to minimize RAUW
>>     traffic during bitcode serialization.  Now that metadata types are
>>     known, we can write debug info out in an order that makes it cheap
>>     to read back in.
>>
>>     Note that using `MDUser` will make RAUW much cheaper, since we're
>>     using the use-list infrastructure for most of them.  If RAUW isn't
>>     showing up in a profile, I may skip this.
>>
>> Does this direction seem reasonable?  Any major problems I've missed?
>
>
> You need more data. Right now you have essentially one data point, and it's
> not even clear what you measured really. If your goal is saving memory, I
> would expect at least a pie chart that breaks down LLVM's memory usage (not
> just # of allocations of different sorts; an approximation is fine, as long
> as you explain how you arrived at it and in what sense it approximates the
> true number).
>
> Do the numbers change significantly for different projects? (e.g. Chromium
> or Firefox or a kernel or a large app you have handy to compile with LTO?).
> If you have specific data you want (and a suggestion for how to gather it),
> I can also get your numbers for one of our internal games as well.
>
> Once you have some more data, then as a first step, I would like to see an
> analysis of how much we can "ideally" expect to gain (back of the envelope
> calculations == win).
>
> -- Sean Silva
>
>>
>>
>> _______________________________________________
>> LLVM Developers mailing list
>> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>
>

Duncan P. N. Exon Smith

2014-Oct-14 18:40 UTC


[LLVMdev] [RFC] Less memory and greater maintainability for debug info IR

> On Oct 13, 2014, at 6:59 PM, Sean Silva <chisophugis at gmail.com> wrote:
> 
> Stupid question, but when I was working on LTO last Summer the primary
> culprit for excessive memory use was due to us not being smart when linking
> the IR together (Espindola would know more details). Do we still have that
> problem? For starters, how does the memory usage of just llvm-link compare
> to the memory usage of the actual LTO run? If the issue I was seeing last
> Summer is still there, you should see that the invocation of llvm-link is
> actually the most memory-intensive part of the LTO step, by far.

To be clear, I'm running the command-line:

    $ llvm-lto -exported-symbol _main llvm-lto.lto.bc

Since this is a pre-linked bitcode file, we shouldn't be wasting much
memory from the linking stage.

Running ld64 directly gives a peak memory footprint of ~30GB for the
full link, so there's something else going on there that I'll be
digging into later.

> 2GB (out of 15.3GB i.e. ~13%) seems pretty pathetic savings when we have a
> single pie slice near 40% of the # of Value's allocated and another at 21%.
> Especially this being "step 4".

15.3GB is the peak memory of `llvm-lto`.  This comes late in the
process, after DIEs have been created.  I haven't looked in detail past
debug info metadata, but here's a sketch of what I imagine is in memory
at this point.

  - The IR, including uniquing side-tables.
  - Optimization and backend passes.
  - Parts of SelectionDAG that haven't been freed.
  - `MachineFunction`s and everything inside them.
  - Whatever state the `AsmPrinter`, etc., need.

I expect to look at a couple of other debug-info-related memory usage
areas once I've shrunk the metadata:

  - What's the total footprint of DIEs?  This run has 4M of them, whose
    allocated footprint is ~1GB.  I'm hoping that a deeper look will
    reveal an even larger attack surface.

  - How much do debug info intrinsics cost?  They show up in at least
    three forms -- IR-level, SDNodes, and MachineInstrs -- and there
    can be a lot of them.  How many?  What's their footprint?

For now, I'm focusing on the problem I've already identified.
> You need more data. Right now you have essentially one data point,
I looked at a number of internal C and C++ programs with -flto -g, and
dug deeply into llvm-lto.lto.bc because it's small enough that it's easy
to analyze (and its runtime profile was representative of the other C++
programs I was looking at).

I didn't look deeply at a broad spectrum, but memory usage and runtime
for building clang with -flto -g is something we care a fair bit about.

> and it's not even clear what you measured really. If your goal is saving
> memory, I would expect at least a pie chart that breaks down LLVM's memory
> usage (not just # of allocations of different sorts; an approximation is
> fine, as long as you explain how you arrived at it and in what sense it
> approximates the true number).

I'm not sure there's value in diving deeply into everything at once.
I've identified one of the bottlenecks, so I'd like to improve it before
digging into the others.

Here's some visibility into where my numbers come from.

I got the 15.3GB from a profile of memory usage vs. time.  Peak usage
comes late in the process, around when DIEs are being dealt with.

Metadata node counts stabilize much earlier in the process.  The rest of
the numbers are based on counting `MDNodes` and their respective
`MDNodeOperands`, and multiplying by the cost of their operands.  Here's
a dump from around the peak metadata node count:

    LineTables = 7500000[30000000], InlinedLineTables = 6756182, Directives = 7611669[42389128], Arrays = 570609[577447], Others = 1176556[5133065]
    Tag =   256, Count =   554992, Ops =   2531428, Name = DW_TAG_auto_variable
    Tag = 16647, Count =      988, Ops =      4940, Name = DW_TAG_GNU_template_parameter_pack
    Tag =    52, Count =     9933, Ops =     59598, Name = DW_TAG_variable
    Tag =    33, Count =      190, Ops =       190, Name = DW_TAG_subrange_type
    Tag =    59, Count =        1, Ops =         3, Name = DW_TAG_unspecified_type
    Tag =    40, Count =    24731, Ops =     24731, Name = DW_TAG_enumerator
    Tag =    21, Count =   354166, Ops =   2833328, Name = DW_TAG_subroutine_type
    Tag =     2, Count =    77999, Ops =    623992, Name = DW_TAG_class_type
    Tag =    47, Count =    27122, Ops =    108488, Name = DW_TAG_template_type_parameter
    Tag =    28, Count =     8491, Ops =     33964, Name = DW_TAG_inheritance
    Tag =    66, Count =    10930, Ops =     43720, Name = DW_TAG_rvalue_reference_type
    Tag =    16, Count =    54680, Ops =    218720, Name = DW_TAG_reference_type
    Tag =    23, Count =      624, Ops =      4992, Name = DW_TAG_union_type
    Tag =     4, Count =     5344, Ops =     42752, Name = DW_TAG_enumeration_type
    Tag =    11, Count =   360390, Ops =   1081170, Name = DW_TAG_lexical_block
    Tag =   258, Count =        1, Ops =         1, Name = DW_TAG_expression
    Tag =    13, Count =    73880, Ops =    299110, Name = DW_TAG_member
    Tag =    58, Count =     1387, Ops =      4161, Name = DW_TAG_imported_module
    Tag =     1, Count =     2747, Ops =     21976, Name = DW_TAG_array_type
    Tag =    46, Count =  1341021, Ops =  12069189, Name = DW_TAG_subprogram
    Tag =   257, Count =  4373879, Ops =  20785065, Name = DW_TAG_arg_variable
    Tag =     8, Count =     2246, Ops =      6738, Name = DW_TAG_imported_declaration
    Tag =    53, Count =       57, Ops =       228, Name = DW_TAG_volatile_type
    Tag =    15, Count =    55163, Ops =    220652, Name = DW_TAG_pointer_type
    Tag =    41, Count =     3382, Ops =      6764, Name = DW_TAG_file_type
    Tag =    22, Count =   158479, Ops =    633916, Name = DW_TAG_typedef
    Tag =    48, Count =      486, Ops =      2430, Name = DW_TAG_template_value_parameter
    Tag =    36, Count =       15, Ops =        45, Name = DW_TAG_base_type
    Tag =    17, Count =     1164, Ops =      8148, Name = DW_TAG_compile_unit
    Tag =    31, Count =       19, Ops =        95, Name = DW_TAG_ptr_to_member_type
    Tag =    57, Count =     2034, Ops =      6102, Name = DW_TAG_namespace
    Tag =    38, Count =    32133, Ops =    128532, Name = DW_TAG_const_type
    Tag =    19, Count =    72995, Ops =    583960, Name = DW_TAG_structure_type

(Note: the InlinedLineTables stat is included in LineTables stat.)

You can determine the rough memory footprint of each type of node by
multiplying the "Count" by `sizeof(MDNode)` (x86-64: 56B) and the "Ops"
by `sizeof(MDNodeOperand)` (x86-64: 32B).

Overall, there are 7.5M linetables with 30M operands, so by this method
their footprint is ~1.3GB.  There are 7.6M descriptors with 42.4M
operands, so their footprint is ~1.7GB.
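
As a rough illustration of that method, the sketch below applies the two x86-64 sizes quoted above to the line-table and descriptor counts from the dump. The helper and constant names are hypothetical, chosen only for this example.

```cpp
#include <cassert>
#include <cstdint>

// Sizes quoted above for x86-64.
constexpr std::uint64_t SizeofMDNode = 56;
constexpr std::uint64_t SizeofMDNodeOperand = 32;

// Estimated footprint: one MDNode per node plus one MDNodeOperand per operand.
constexpr std::uint64_t footprintBytes(std::uint64_t Count, std::uint64_t Ops) {
  return Count * SizeofMDNode + Ops * SizeofMDNodeOperand;
}

// Convert bytes to GiB (2^30 bytes) for comparison with the rounded totals.
constexpr double toGiB(std::uint64_t Bytes) {
  return static_cast<double>(Bytes) / (1024.0 * 1024.0 * 1024.0);
}

// Counts taken from the first line of the dump above.
constexpr std::uint64_t LineTableBytes = footprintBytes(7500000, 30000000);
constexpr std::uint64_t DescriptorBytes = footprintBytes(7611669, 42389128);
```

`LineTableBytes` is 1,380,000,000 and `DescriptorBytes` is 1,782,705,560. Read as GiB these come to roughly 1.29 and 1.66, which lines up with the ~1.3GB and ~1.7GB figures if the totals are binary gigabytes (an assumption on my part).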

I dumped another stat periodically to tell me the peak size of the
side-tables for line table entries, which are split into "Scopes" (for
non-inlined) and "Inlined" (these counts are disjoint, unlike the
previous stats):

    Scopes = 203166 [203166], Inlined = 3500000 [3500000]

I assumed that both `DenseMap` and `std::vector` over-allocate by 50%
to estimate the current (and planned) costs for the side-tables.
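
For reference, here is a sketch of the side-table arithmetic from step 3 of the plan. It takes the per-entry costs quoted there (~180B now, ~12B planned) as given, over-allocation already included, and does not try to re-derive them. All names are illustrative, and reading the rounded totals as MiB is an assumption.

```cpp
#include <cassert>
#include <cstdint>

// Figures quoted in step 3 of the plan.
constexpr std::uint64_t DebugLocEntries = 3500000;   // side-vector entries today
constexpr std::uint64_t BytesPerDebugLocEntry = 180; // approximate current cost
constexpr std::uint64_t LineTableEntries = 7000000;  // line-table entries
constexpr std::uint64_t BytesPerUniquedEntry = 12;   // approximate planned cost

constexpr std::uint64_t CurrentBytes = DebugLocEntries * BytesPerDebugLocEntry;
constexpr std::uint64_t PlannedBytes = LineTableEntries * BytesPerUniquedEntry;

// Convert bytes to MiB (2^20 bytes).
constexpr double toMiB(std::uint64_t Bytes) {
  return static_cast<double>(Bytes) / (1024.0 * 1024.0);
}
```

`toMiB(CurrentBytes)` is about 601 and `toMiB(PlannedBytes)` about 80, so the difference is about 520, consistent with the ~600MB, ~80MB and 520MB figures in the plan.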

Another stat I dumped periodically was the breakdown between V(alues),
U(sers), C(onstants), M(etadata nodes), and (metadata) S(trings).
Here's a sample from nearby:

    V = 23967800 (40200000 - 16232200)
    U =  5850877 ( 7365503 -  1514626)
    C =   205491 (  279134 -    73643)
    M = 16837368 (31009291 - 14171923)
    S =   693869 (  693869 -        0)

Lastly, I dumped a breakdown of the types of MDNodeOperands.  This is
also a sample from nearby:

    MDOps = 77644750 (100%)
    Const = 14947077 ( 19%)
    Node  = 41749475 ( 53%)
    Str   =  9553581 ( 12%)
    Null  = 10976693 ( 14%)
    Other =   417924 (  0%)

While I didn't use this breakdown for my memory estimates, it was
interesting nevertheless.  Note the following:

  - The number of constants is just under 15M.  This dump came less than
    a second before the dump above, where we have 7.5M line table
    entries.  Line table entries have 2 operands of `ConstantInt`.  This
    lines up nicely.
    
    Note: this checked `isa<Constant>(Op) && !isa<GlobalValue>(Op)`.

  - There are a lot of null operands.  By making subclasses for the
    various types of debug info IR, we can probably shed some of these
    altogether.

  - There are few "Other" operands.  These are likely all `GlobalValue`
    references, and are the only operands that need to be referenced
    using value handles.
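
The operand breakdown above can be sanity-checked with a few lines of arithmetic. This sketch assumes the percentages in the dump are truncated rather than rounded, which is what the 53% for "Node" suggests. The struct and function names are invented for the example.

```cpp
#include <cassert>
#include <cstdint>

// Operand counts copied from the MDOps dump above.
struct OperandBreakdown {
  std::uint64_t Const, Node, Str, Null, Other;
};

constexpr OperandBreakdown Sample = {14947077, 41749475, 9553581, 10976693,
                                     417924};

// Sum of all categories; this should equal the MDOps total.
constexpr std::uint64_t total(const OperandBreakdown &B) {
  return B.Const + B.Node + B.Str + B.Null + B.Other;
}

// Integer percentage, truncated towards zero.
constexpr unsigned percentTruncated(std::uint64_t Part, std::uint64_t Whole) {
  return static_cast<unsigned>(Part * 100 / Whole);
}

// Expected ConstantInt operands: two (line and column) per line table entry.
constexpr std::uint64_t expectedLineTableConstants(std::uint64_t LineTables) {
  return 2 * LineTables;
}
```

The categories sum to exactly 77,644,750, and `expectedLineTableConstants(7500000)` gives 15M, just above the 14,947,077 constants counted slightly earlier in the run.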

Alex Rosenberg

2014-Oct-16 03:53 UTC


[LLVMdev] [RFC] Less memory and greater maintainability for debug info IR

As all of these transforms are 1-to-1, can we still support the older metadata
and convert it on the fly?

Alex

Eric Christopher

2014-Oct-16 06:30 UTC

[LLVMdev] [RFC] Less memory and greater maintainability for debug info IR

On Wed, Oct 15, 2014 at 8:53 PM, Alex Rosenberg <alexr at leftfield.org> wrote:
> As all of these transforms are 1-to-1, can we still support the older
> metadata and convert it on the fly?
>
I'd prefer not to keep all of that code around to interpret both
versions without a very good reason.

-eric
