David Blaikie via llvm-dev
2016-Dec-24 18:18 UTC
[llvm-dev] distinct DISubprograms hindering sharing inlined subprogram descriptions
On Fri, Dec 23, 2016 at 7:25 PM Duncan Exon Smith <dexonsmith at apple.com> wrote:> > > On Dec 23, 2016, at 18:36, David Blaikie <dblaikie at gmail.com> wrote: > > > > On Fri, Dec 23, 2016 at 11:47 AM Duncan P. N. Exon Smith < > dexonsmith at apple.com> wrote: > > A few disjoint thoughts; sorry they're so delayed (I skimmed the responses > below, and I think these are still relevant/not covered elsewhere). > > Firstly, why *should* DISubprogram definitions be distinct? There were > two reasons this was valuable (this was from before there was a cu: link). > > - It helped to fix long-standing bugs in the IRLinker, where > uniqued-DISubprograms in different compile units would coalesce. > > > Any idea why that ^ was a problem/bugs? > > > Now that the Function <-> DISubprogram connection has been reversed, I'm > not entirely sure it's still a necessary precaution. > > - It broke a several uniquing cycles involving DISubprogram. The > existence of the uniquing cycles meant that affected DISubprograms weren't > really uniqued, anyway (since we don't bother to solve graph isomorphism); > and breaking the cycles sped up the IRLinker and reduced memory waste. > > > Ah, fair point - my trivial example didn't have any local variables, so > didn't hit this problem. They do indeed form cycles and I haven't tested, > but seems totally reasonable/likely that that would break my simple example > because cycles break uniquing and all that. > > > Making DISubprogram uniqued again would either have no real effect > (because distinct nodes they reference aren't uniqued anymore either) or > bring back uniquing cycles (which would have effects... but not the desired > one). > > But there ought to be a way to get coalescing in the right places. We > just need to ensure that the uniqued/coalesced parts of the graph: > - are acyclic, and > - reference only coalesced nodes (note that leaves can happily include the > strangely-coalesced distinct DICompositeType beast I created). > > > Perhaps before I dive into your suggestion (which, it seems, is to > separate out the definition but let the declaration/locals form cycles > > > No (new) cycles; in fact this breaks some. > - the Function itself references the defn and the locals; > - the locals and the defn reference (transitively) the decl; and > - the decl doesn't reference any of those. > > The only cycle is the decl <-> composite type one that already exists (and > that odr magic (mostly) fixes). Maybe that one could be removed somehow > too, but it isn't directly relevant to the problem you're looking at. >Oh, yeah - I've an aside there too, maybe for another thread. (punchy summary: anonymous enums in headers used for constants are technically ODR violations (well, if they're ever used in an ODR entity - like an inline function) - they're not public types, so we don't put the mangled name on them & don't deduplicate them in DWARF type units - and probably similarly don't deduplicate them in the IR... - not sure what the right answer is there)> (Note that this is somewhat inspired by Reid's point in his recent debug > info talk, that CodeView avoids type cycles by breaking types into two > records.) >Ah, thanks, that helps with me picture it!> > - that would at least be dropped because the definition would only be kept > due to it being referenced from the llvm::Function), what about something > simpler: > > What if scope chains didn't go all the way to the top (the subprogram) but > stopped short of that? > > Now, granted - this was part of the great assertion I added several years > back that caught all kinds of silliness - mostly around inlined call sites > not having debugloc and leading to scope chains that lead to distinct > roots. But the alternative to having and maintaining an invariant - is just > to make things true by construction. All debug info scope chains within a > function with an associated DISubprogram have an implicit last element of > that DISubprogram? > > > That seems reasonable, I think, now that the function points directly at > its subprogram. > > I'm not sure it's simpler to get right though. Dropping the scope runs > the risk of coalescing things incorrectly. I imagine the backend relies on > pointer distinctness of two local variables called "a" that are in (or are > parameters of) unrelated subprograms. I think in some situations they > could be coalesced incorrectly if they had no scope. >Just no top level scope - so it'd only happen if you had two variables with the same name in the same scope. It might break in debug-having inline functions inlined into non-debug-having functions (which then might in turn be inlined into a debug-having function) - would have to check. Also, not sure if inlined situations might introduce circularity. Yep. The scope for inlinedAt DILocations point to the DISubprogram - so that'd create a cycle. Ah well.> > That would mean it'd be harder to catch the bugs I found > > > A shame if so, but if we can find another way to catch it or design it > away, as you mention, it doesn't much matter. > > - but I think most of them came from that specific inlining situation (I > can go back and do some archaeology and see if there's other exposure where > we could sure up some defenses as well) which can be caught more directly > (I forget if the inliner now catches it itself, or if we enhanced the debug > info verifier to catch it after the inliner runs - if it's the verifier, > then that wouldn't suffice if we made the change I'm suggesting - and we'd > have to hook it up in the inliner itself) > > Thoughts? > > > I've been thinking idly whether we can/should break DISubprogram apart > into uniqued (coalesce-able) and distinct parts. Still being someone > ignorant of DWARF itself (and CodeView, for that matter), I'm not entirely > sure. But... > > Here's my understanding of the current uses of DISubprogram (where I'm > using DISubprogramDecl for the usually uniqued declarations, and > DISubprogramDefn for the always distinct definitions): > > // !dbg > Function -> DISubprogram > > // Declaration (optional, only for member functions). > DISubprogramDefn -> DISubprogramDecl > > // Variables: > DISubprogramDefn -> MDNode -> DILocalVariable > > // Terminator for scope chains: > DILocalVariable [-> DILocalScope [-> ...]] -> DISubprogramDefn > DILocation [-> DILocalScope [-> ...]] -> DISubprogramDefn > DICompositeType -> DISubprogramDefn // function-local type > > // Elements (member functions): > DICompositeType -> MDNode -> DISubprogramDecl > DISubprogramDecl -> DICompositeType > DISubprogramDefn -> DICompositeType > > > And here are the changes I would consider: > > 1. Split DISubprogramDecl and DISubprogramDefn into different types. > > 2. Remove definition-specific fields from DISubprogramDecl. > - variables, a "dangerous" source of uniquing cycles. > - declaration, which it will never have. > - scopeLine, which it will never have. > - unit, which the Defn will have when it's relevant. > > 3. Make a DISubprogramDecl mandatory for DISubprogramDefn. > > 4. Remove redundant fields from DISubprogramDefn. > - name and linkageName are on the declaration. > - type is redundant. > - flags is likely redundant? > > Nothing so far has made any real changes, I think. DISubprogramDecl > should coalesce nicely between modules though. > > 5. Migrate links to point at DISubprogramDecl instead of DISubprogramDefn. > - DILocalVariable scope chain. > - DILocation scope chain. > - DICompositeType scope. > > I don't think this will create any uniquing cycles. DISubprogramDecl > won't point at any of these. Also, DILocalVariable should start coalescing > with each other. > > Since DISubprogramDefn is only referenced from the Function !dbg > attachment, only one should survive function-coalescing. > > One open question, since I don't know the backend code, is: can you emit > the "correct" DWARF after step (5)? I'm hoping you only need the Defn of > the Function that has instructions referencing the > DILocalVariable/DILocation. In which case, we suddenly get a lot of > coalescing. > > On 2016-Dec-15, at 10:54, David Blaikie <dblaikie at gmail.com> wrote: > > Branching off from a discussion of improvements to DIGlobalVariable > representations that Adrian's working on - got me thinking about related > changes that have already been made to DISubprogram. > > To reduce duplicate debug info when things like linkonce_odr functions > were deduplicated in LTO linking, the relationship between a CU and > DISubprogram was inverted (instead of a CU maintaining a list > of subprograms, subprograms specify which CU they come from - and the > llvm::Function references the DISubprogram, so if the llvm::Function goes > away, so does the associated DISubprogram) > > I'm not sure if this caused a regression, but at least seems to miss a > possible improvement: > > During IR linking (for LTO, ThinLTO, etc) these distinct DISubprogram > definitions (& their CU link, even if they weren't marked 'distinct', the > CU link would cause them to effectively be so) remain separate - this means > that inlined versions in one CU don't refer to an existing subprogram > definition in another CU. > > To demonstrate: > inl.h: > void f1(); > inline __attribute__((always_inline)) void f2() { > f1(); > } > inl1.cpp: > #include "inl.h" > void c1() { > f2(); > } > inl2.cpp: > #include "inl.h" > void c2() { > f2(); > } > > Compile to IR, llvm-link the result. The DWARF you get is basically the > same as the DWARF you'd get without linking: > > DW_TAG_compile_unit > DW_AT_name "inl1.cpp" > DW_TAG_subprogram #0 > DW_AT_name "f2" > DW_TAG_subprogram > DW_AT_name "c1" > DW_TAG_inlined_subroutine > DW_TAG_abstract_origin #0 "f2" > DW_TAG_compile_unit > DW_AT_name "inl2.cpp" > DW_TAG_subprogram #1 > DW_AT_name "f2" > DW_TAG_subprogram > DW_AT_name "c2" > DW_TAG_inlined_subroutine > DW_TAG_abstract_origin #1 "f2" > > Instead of something more like this: > > DW_TAG_compile_unit > DW_AT_name "inl1.cpp" > DW_TAG_subprogram #0 > DW_AT_name "f2" > DW_TAG_subprogram > DW_AT_name "c1" > DW_TAG_inlined_subroutine > DW_TAG_abstract_origin #0 "f2" > DW_TAG_compile_unit > DW_AT_name "inl2.cpp" > DW_TAG_subprogram > DW_AT_name "c2" > DW_TAG_inlined_subroutine > DW_TAG_abstract_origin #0 "f2" > > (note that only one abstract definition of f2 is produced here) > > Any thoughts? I imagine this is probably worth a reasonable amount of > savings in an optimized build. Not huge, but not nothing. (probably not the > top of anyone's list though, I realize) > > Should we remove the CU link from a non-internal linkage subprogram (& > this may have an effect on the GV representation issue originally being > discussed) and just emit it into whichever CU happens to need it first? > > This might be slightly sub-optimal, due to, say, the namespace being > foreign to that CU. But it's how we do types currently, I think? So at > least it'd be consistent and probably cheap enough/better. > > >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20161224/3bc74f1b/attachment.html>
David Blaikie via llvm-dev
2021-Oct-15 19:06 UTC
[llvm-dev] distinct DISubprograms hindering sharing inlined subprogram descriptions
(came across this while looking for other things) This is still an issue: $ cat a.cpp #include "a.h" void f3() { f2(); } $ cat b.cpp #include "a.h" void f3(); void f1() { } int main() { f3(); f2(); } $ cat a.h void f1(); struct t1 { int x; }; inline __attribute__((always_inline)) void f2(t1 = t1()) { f1(); } $ $ clang++-tot -fuse-ld=lld -flto a.cpp b.cpp -g && llvm-dwarfdump-tot a.out -debug-info | grep "DW_AT_name\|DW_TAG_\|DW_AT_type" | sed -e "s/^............//" DW_TAG_compile_unit DW_AT_name ("a.cpp") DW_TAG_subprogram DW_AT_name ("f2") DW_TAG_formal_parameter DW_AT_type (0x0000003e "t1") DW_TAG_structure_type DW_AT_name ("t1") DW_TAG_member DW_AT_name ("x") DW_AT_type (0x00000054 "int") ... DW_TAG_compile_unit DW_AT_name ("b.cpp") DW_TAG_subprogram DW_AT_name ("f2") DW_TAG_formal_parameter DW_AT_type (0x000000000000003e "t1") ... The type (t1) gets deduplicated/shared across CUs (the DW_AT_type in b.cpp uses FORM_ref_addr) but the f2 subprogram is duplicated - though b.cpp's use of it could use ref_addr, and would if the function hadn't been inlined away. eg: if we move 'f2' to be defined in a.cpp but attributed always_inline, so it does get cross-unit inlined due to LTO: DW_TAG_compile_unit DW_AT_name ("a.cpp") DW_TAG_subprogram DW_AT_name ("f2") DW_TAG_formal_parameter DW_AT_type (0x0000003e "t1") DW_TAG_structure_type DW_AT_name ("t1") DW_TAG_member DW_AT_name ("x") DW_AT_type (0x00000054 "int") DW_TAG_base_type DW_AT_name ("int") DW_TAG_subprogram DW_AT_name ("f3") DW_TAG_inlined_subroutine DW_AT_abstract_origin (0x0000002a "_Z2f22t1") DW_TAG_formal_parameter DW_AT_abstract_origin (0x00000036) DW_TAG_compile_unit DW_AT_name ("b.cpp") DW_TAG_subprogram DW_AT_name ("f1") DW_TAG_subprogram DW_AT_name ("main") DW_AT_type (0x0000000000000054 "int") DW_TAG_inlined_subroutine DW_AT_abstract_origin (0x000000000000002a "_Z2f22t1") So now because the cross-CU inlining happened during LTO after module merging - the merging happens when the function's not inlined, module merge picks one llvm::Function, dropping the other one and dropping the other DISubprogram - and then the inlining kicks in, referencing just that single DIsubprogram. Not sure if it's of interest to anyone/a priority. (side note: Anyone interested in making ThinLTO home type definitions? That'd be great to reduce type duplication - would mean ThinLTO could turn off type units and/or that .o/.dwo/etc files would just generally be smaller anyway) On Sat, Dec 24, 2016 at 10:18 AM David Blaikie <dblaikie at gmail.com> wrote:> > > On Fri, Dec 23, 2016 at 7:25 PM Duncan Exon Smith <dexonsmith at apple.com> > wrote: > >> >> >> On Dec 23, 2016, at 18:36, David Blaikie <dblaikie at gmail.com> wrote: >> >> >> >> On Fri, Dec 23, 2016 at 11:47 AM Duncan P. N. Exon Smith < >> dexonsmith at apple.com> wrote: >> >>> A few disjoint thoughts; sorry they're so delayed (I skimmed the >>> responses below, and I think these are still relevant/not covered >>> elsewhere). >>> >>> Firstly, why *should* DISubprogram definitions be distinct? There were >>> two reasons this was valuable (this was from before there was a cu: link). >>> >>> - It helped to fix long-standing bugs in the IRLinker, where >>> uniqued-DISubprograms in different compile units would coalesce. >>> >> >> Any idea why that ^ was a problem/bugs? >> >> >>> Now that the Function <-> DISubprogram connection has been reversed, >>> I'm not entirely sure it's still a necessary precaution. >>> >>> - It broke a several uniquing cycles involving DISubprogram. The >>> existence of the uniquing cycles meant that affected DISubprograms weren't >>> really uniqued, anyway (since we don't bother to solve graph isomorphism); >>> and breaking the cycles sped up the IRLinker and reduced memory waste. >>> >> >> Ah, fair point - my trivial example didn't have any local variables, so >> didn't hit this problem. They do indeed form cycles and I haven't tested, >> but seems totally reasonable/likely that that would break my simple example >> because cycles break uniquing and all that. >> >> >>> Making DISubprogram uniqued again would either have no real effect >>> (because distinct nodes they reference aren't uniqued anymore either) or >>> bring back uniquing cycles (which would have effects... but not the desired >>> one). >>> >>> But there ought to be a way to get coalescing in the right places. We >>> just need to ensure that the uniqued/coalesced parts of the graph: >>> - are acyclic, and >>> - reference only coalesced nodes (note that leaves can happily include >>> the strangely-coalesced distinct DICompositeType beast I created). >>> >> >> Perhaps before I dive into your suggestion (which, it seems, is to >> separate out the definition but let the declaration/locals form cycles >> >> >> No (new) cycles; in fact this breaks some. >> - the Function itself references the defn and the locals; >> - the locals and the defn reference (transitively) the decl; and >> - the decl doesn't reference any of those. >> >> The only cycle is the decl <-> composite type one that already exists >> (and that odr magic (mostly) fixes). Maybe that one could be removed >> somehow too, but it isn't directly relevant to the problem you're looking >> at. >> > > Oh, yeah - I've an aside there too, maybe for another thread. (punchy > summary: anonymous enums in headers used for constants are technically ODR > violations (well, if they're ever used in an ODR entity - like an inline > function) - they're not public types, so we don't put the mangled name on > them & don't deduplicate them in DWARF type units - and probably similarly > don't deduplicate them in the IR... - not sure what the right answer is > there) > > >> (Note that this is somewhat inspired by Reid's point in his recent debug >> info talk, that CodeView avoids type cycles by breaking types into two >> records.) >> > > Ah, thanks, that helps with me picture it! > > >> >> - that would at least be dropped because the definition would only be >> kept due to it being referenced from the llvm::Function), what about >> something simpler: >> >> What if scope chains didn't go all the way to the top (the subprogram) >> but stopped short of that? >> >> Now, granted - this was part of the great assertion I added several years >> back that caught all kinds of silliness - mostly around inlined call sites >> not having debugloc and leading to scope chains that lead to distinct >> roots. But the alternative to having and maintaining an invariant - is just >> to make things true by construction. All debug info scope chains within a >> function with an associated DISubprogram have an implicit last element of >> that DISubprogram? >> >> >> That seems reasonable, I think, now that the function points directly at >> its subprogram. >> >> I'm not sure it's simpler to get right though. Dropping the scope runs >> the risk of coalescing things incorrectly. I imagine the backend relies on >> pointer distinctness of two local variables called "a" that are in (or are >> parameters of) unrelated subprograms. I think in some situations they >> could be coalesced incorrectly if they had no scope. >> > > Just no top level scope - so it'd only happen if you had two variables > with the same name in the same scope. It might break in debug-having inline > functions inlined into non-debug-having functions (which then might in turn > be inlined into a debug-having function) - would have to check. > > Also, not sure if inlined situations might introduce circularity. Yep. The > scope for inlinedAt DILocations point to the DISubprogram - so that'd > create a cycle. Ah well. > > >> >> That would mean it'd be harder to catch the bugs I found >> >> >> A shame if so, but if we can find another way to catch it or design it >> away, as you mention, it doesn't much matter. >> >> - but I think most of them came from that specific inlining situation (I >> can go back and do some archaeology and see if there's other exposure where >> we could sure up some defenses as well) which can be caught more directly >> (I forget if the inliner now catches it itself, or if we enhanced the debug >> info verifier to catch it after the inliner runs - if it's the verifier, >> then that wouldn't suffice if we made the change I'm suggesting - and we'd >> have to hook it up in the inliner itself) >> >> Thoughts? >> >> >> I've been thinking idly whether we can/should break DISubprogram apart >>> into uniqued (coalesce-able) and distinct parts. Still being someone >>> ignorant of DWARF itself (and CodeView, for that matter), I'm not entirely >>> sure. But... >>> >>> Here's my understanding of the current uses of DISubprogram (where I'm >>> using DISubprogramDecl for the usually uniqued declarations, and >>> DISubprogramDefn for the always distinct definitions): >>> >>> // !dbg >>> Function -> DISubprogram >>> >>> // Declaration (optional, only for member functions). >>> DISubprogramDefn -> DISubprogramDecl >>> >>> // Variables: >>> DISubprogramDefn -> MDNode -> DILocalVariable >>> >>> // Terminator for scope chains: >>> DILocalVariable [-> DILocalScope [-> ...]] -> DISubprogramDefn >>> DILocation [-> DILocalScope [-> ...]] -> DISubprogramDefn >>> DICompositeType -> DISubprogramDefn // function-local type >>> >>> // Elements (member functions): >>> DICompositeType -> MDNode -> DISubprogramDecl >>> DISubprogramDecl -> DICompositeType >>> DISubprogramDefn -> DICompositeType >>> >>> >>> And here are the changes I would consider: >>> >>> 1. Split DISubprogramDecl and DISubprogramDefn into different types. >>> >>> 2. Remove definition-specific fields from DISubprogramDecl. >>> - variables, a "dangerous" source of uniquing cycles. >>> - declaration, which it will never have. >>> - scopeLine, which it will never have. >>> - unit, which the Defn will have when it's relevant. >>> >>> 3. Make a DISubprogramDecl mandatory for DISubprogramDefn. >>> >>> 4. Remove redundant fields from DISubprogramDefn. >>> - name and linkageName are on the declaration. >>> - type is redundant. >>> - flags is likely redundant? >>> >>> Nothing so far has made any real changes, I think. DISubprogramDecl >>> should coalesce nicely between modules though. >>> >>> 5. Migrate links to point at DISubprogramDecl instead of >>> DISubprogramDefn. >>> - DILocalVariable scope chain. >>> - DILocation scope chain. >>> - DICompositeType scope. >>> >>> I don't think this will create any uniquing cycles. DISubprogramDecl >>> won't point at any of these. Also, DILocalVariable should start coalescing >>> with each other. >>> >>> Since DISubprogramDefn is only referenced from the Function !dbg >>> attachment, only one should survive function-coalescing. >>> >>> One open question, since I don't know the backend code, is: can you emit >>> the "correct" DWARF after step (5)? I'm hoping you only need the Defn of >>> the Function that has instructions referencing the >>> DILocalVariable/DILocation. In which case, we suddenly get a lot of >>> coalescing. >>> >>> On 2016-Dec-15, at 10:54, David Blaikie <dblaikie at gmail.com> wrote: >>> >>> Branching off from a discussion of improvements to DIGlobalVariable >>> representations that Adrian's working on - got me thinking about related >>> changes that have already been made to DISubprogram. >>> >>> To reduce duplicate debug info when things like linkonce_odr functions >>> were deduplicated in LTO linking, the relationship between a CU and >>> DISubprogram was inverted (instead of a CU maintaining a list >>> of subprograms, subprograms specify which CU they come from - and the >>> llvm::Function references the DISubprogram, so if the llvm::Function goes >>> away, so does the associated DISubprogram) >>> >>> I'm not sure if this caused a regression, but at least seems to miss a >>> possible improvement: >>> >>> During IR linking (for LTO, ThinLTO, etc) these distinct DISubprogram >>> definitions (& their CU link, even if they weren't marked 'distinct', the >>> CU link would cause them to effectively be so) remain separate - this means >>> that inlined versions in one CU don't refer to an existing subprogram >>> definition in another CU. >>> >>> To demonstrate: >>> inl.h: >>> void f1(); >>> inline __attribute__((always_inline)) void f2() { >>> f1(); >>> } >>> inl1.cpp: >>> #include "inl.h" >>> void c1() { >>> f2(); >>> } >>> inl2.cpp: >>> #include "inl.h" >>> void c2() { >>> f2(); >>> } >>> >>> Compile to IR, llvm-link the result. The DWARF you get is basically the >>> same as the DWARF you'd get without linking: >>> >>> DW_TAG_compile_unit >>> DW_AT_name "inl1.cpp" >>> DW_TAG_subprogram #0 >>> DW_AT_name "f2" >>> DW_TAG_subprogram >>> DW_AT_name "c1" >>> DW_TAG_inlined_subroutine >>> DW_TAG_abstract_origin #0 "f2" >>> DW_TAG_compile_unit >>> DW_AT_name "inl2.cpp" >>> DW_TAG_subprogram #1 >>> DW_AT_name "f2" >>> DW_TAG_subprogram >>> DW_AT_name "c2" >>> DW_TAG_inlined_subroutine >>> DW_TAG_abstract_origin #1 "f2" >>> >>> Instead of something more like this: >>> >>> DW_TAG_compile_unit >>> DW_AT_name "inl1.cpp" >>> DW_TAG_subprogram #0 >>> DW_AT_name "f2" >>> DW_TAG_subprogram >>> DW_AT_name "c1" >>> DW_TAG_inlined_subroutine >>> DW_TAG_abstract_origin #0 "f2" >>> DW_TAG_compile_unit >>> DW_AT_name "inl2.cpp" >>> DW_TAG_subprogram >>> DW_AT_name "c2" >>> DW_TAG_inlined_subroutine >>> DW_TAG_abstract_origin #0 "f2" >>> >>> (note that only one abstract definition of f2 is produced here) >>> >>> Any thoughts? I imagine this is probably worth a reasonable amount of >>> savings in an optimized build. Not huge, but not nothing. (probably not the >>> top of anyone's list though, I realize) >>> >>> Should we remove the CU link from a non-internal linkage subprogram (& >>> this may have an effect on the GV representation issue originally being >>> discussed) and just emit it into whichever CU happens to need it first? >>> >>> This might be slightly sub-optimal, due to, say, the namespace being >>> foreign to that CU. But it's how we do types currently, I think? So at >>> least it'd be consistent and probably cheap enough/better. >>> >>> >>>-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20211015/97d747cc/attachment.html>