Greg Clayton via llvm-dev
2018-Jun-15 14:42 UTC
[llvm-dev] [lldb-dev] Adding DWARF5 accelerator table support to llvm
To elaborate a bit more on the issue that is detailed in reviews.llvm.org/rL260308:

There are many clang AST contexts that are used in LLDB:
- one for each lldb_private::Module that contains type definitions as we know them in the module and its symbol vendor
- one for each expression
- one for results of expressions in the lldb_private::Target

As we run expressions we end up copying classes around between the Module ASTs and the expression and Target ASTs. If a class has templated functions, they will only be in the DWARF if a specialization was created and used. If you have a class that looks like:

class A {
  A();
  <template T> void Foo(T t);
};

And then you have main.cpp that has a "double" and "int" specialization, the class definition in DWARF looks like:

class A {
  A();
  <int> void Foo(int t);
  <double> void Foo(double t);
};

In another source file, say foo.cpp, if its use of class A doesn't specialize Foo, we have a class definition in DWARF that looks like:

class A {
  A();
};

With the C++ ODR rules, we can pick any of the "class A" definitions whose qualified name matches ("::A") and that has the same decl file + decl line. So when parsing "class A", the DWARF parser will use the accelerator tables to find all instances of "class A", and it will pick one and use it, and this will become the one true definition for "class A". Because DWARF is only emitted for template functions when there is a specialization, any definition of "class A" might or might not include any definition for "<template T> A::Foo(T t);". When we copy types between ASTs, everything is fine if the classes match (no copy needs to be made), but things go wrong if they don't match, and this causes expression errors.
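For readers who want to reproduce the situation, here is a minimal sketch of "class A" in valid C++. It is a single-file stand-in for the two source files in the message: useLikeMainCpp and useLikeFooCpp are hypothetical helper names, and the call counter exists only so the behavior can be observed. It shows which specializations of Foo come into existence in each usage pattern; it does not inspect DWARF itself.

```cpp
#include <cstddef>

// Valid C++ spelling of the "class A" sketched in the message.
class A {
public:
    A() = default;

    // Member function template. A specialization such as Foo<int> only
    // comes into existence in a translation unit that actually uses it.
    template <typename T>
    void Foo(T t) {
        (void)t;   // the argument itself is irrelevant to the illustration
        ++calls_;
    }

    std::size_t calls() const { return calls_; }

private:
    std::size_t calls_ = 0;
};

// Stand-in for main.cpp: uses both specializations, so both are instantiated.
inline std::size_t useLikeMainCpp() {
    A a;
    a.Foo(1);    // implicitly instantiates A::Foo<int>
    a.Foo(2.0);  // implicitly instantiates A::Foo<double>
    return a.calls();
}

// Stand-in for foo.cpp: uses A but never calls Foo, so this usage alone
// would not cause any specialization of Foo to be instantiated.
inline std::size_t useLikeFooCpp() {
    A a;
    return a.calls();
}
```

If the two helpers lived in separate source files, the first file would instantiate Foo<int> and Foo<double> and the second would instantiate neither, which is the mismatch between class descriptions that the message describes.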
Some ways to fix this:
1 - anytime we need _any_ C++ class, we must dig up all definitions and check _all_ DW_TAG_subprogram DIEs within the class to check if any functions have templates and use the class with the most specializations
2 - have DWARF actually emit the template function info all the time as a type T, not a specialization, so we always have the full definition
3 - have some accelerator table that explicitly points us to all specializations of class methods given a class name

Solution #1 would cause us to dig through all definitions of all C++ classes all the time when parsing DWARF to check if definitions of the classes had template methods. And we would need to find the class that has the most template methods. This would cause us to parse much more of the debug info all of the time and cause increased memory consumption and performance regressions.

Solution #2: I'm not sure if DWARF even supports generic template definitions where the template isn't specialized. And how would we be able to tell DWARF that emits only specialized templates from DWARF that has generic definitions?

Solution #3 will require compiler changes.

So this is another vote to support the ability for a given class to be able to locate all of its functions, kind of like we need for Objective C, where the class definition doesn't contain all of its methods, so we have the .apple_objc section that provides this mapping for us. We would need something similar for C++.

So maybe a possible solution is some sort of section that can specify all of the DIEs related to a class that are not contained in the class hierarchy itself. This would work for Objective C and for C++.

Thoughts?

Greg

> On Jun 15, 2018, at 3:34 AM, Pavel Labath <labath at google.com> wrote:
>
> I wasn't using type units (those don't work at all right now).
>
> I've done a bit of digging, and I found this patch
> <reviews.llvm.org/rL260308> which explicitly disables template
> member function parsing (though it seems it didn't really work before
> either). The patch contains a quite long explanation of why this is
> not working. I can't say I understand all of it (this is getting a bit
> out of my league), but the core of the issue seems to be that when we
> start to mix classes from two CUs which have different sets of
> instantiations in a single expression, things quickly go south because
> the recycled clang ASTs from the two dwarf versions do not match.
>
> For better or worse, it seems gdb is having similar issues as well, as
> I couldn't get it to grok my member template expressions either.
>
> On Thu, 14 Jun 2018 at 19:47, David Blaikie <dblaikie at gmail.com> wrote:
>>
>> oh, awesome.
>>
>> Were you using type units? (I imagine that'd make the situation worse - since the way clang emits DWARF for a type with a member function template implicit specialization is to emit the type unit without any mention of this, and to emit the implicit specialization declaration into the stub type in the CU (that references the type unit)) Without type units I'd be pretty surprised if you couldn't call the implicit specialization at least from the CU in which it was instantiated.
>>
>> On Thu, Jun 14, 2018 at 11:41 AM Pavel Labath <labath at google.com> wrote:
>>>
>>> On Thu, 14 Jun 2018 at 19:29, Pavel Labath <labath at google.com> wrote:
>>>>
>>>> On Thu, 14 Jun 2018 at 19:26, David Blaikie <dblaikie at gmail.com> wrote:
>>>>>
>>>>> On Thu, Jun 14, 2018 at 11:24 AM Pavel Labath <labath at google.com> wrote:
>>>>>>
>>>>>> On Thu, 14 Jun 2018 at 17:58, Greg Clayton <clayborg at gmail.com> wrote:
>>>>>>>
>>>>>>> On Jun 14, 2018, at 9:36 AM, Adrian Prantl <aprantl at apple.com> wrote:
>>>>>>>
>>>>>>> On Jun 14, 2018, at 7:01 AM, Pavel Labath via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>>>>>>>
>>>>>>> Thank you all. I am going to try to reply to all comments in a single email.
>>>>>>>
>>>>>>> Regarding the .apple_objc idea, I am afraid the situation is not as
>>>>>>> simple as just flipping a switch.
>>>>>>>
>>>>>>> Jonas is currently working on adding the support for DWARF5-style Objective-C accelerator tables to LLVM/LLDB/dsymutil. Based on the assumption that DWARF 4 and earlier are unaffected by any of this, I don't think it's necessary to spend any effort on making the transition smooth. I'm fine with having Objective-C on DWARF 5 broken on trunk for two weeks until Jonas is done adding Objective-C support to the DWARF 5 implementation.
>>>>>>
>>>>>> Ideally, I would like to enable the accelerator tables (possibly with
>>>>>> a different version number or something) on DWARF 4 too (on non-apple
>>>>>> targets only). The reason for this is that their absence is causing
>>>>>> large slowdowns when debugging on non-apple platforms, and I wouldn't
>>>>>> want to wait for dwarf 5 for that to go away (I mean no disrespect to
>>>>>> Paul and DWARF 5 effort in general, but even if all of DWARF 5 in llvm
>>>>>> was done tomorrow, there would still be lldb, which hasn't even begun
>>>>>> to look at this version).
>>>>>>
>>>>>> That said, if you are working on the Objective C support right now,
>>>>>> then I am happy to wait two weeks or so, so that we have a full
>>>>>> implementation from the get-go.
>>>>>>
>>>>>>> But, other options may be possible as well. What's not clear to me is
>>>>>>> whether these tables couldn't be replaced by extra information in the
>>>>>>> .debug_info section. It seems to me that these tables are trying to
>>>>>>> work around the issue that there is no straight way to go from a
>>>>>>> DW_TAG_structure type DIE describing an ObjC class to its methods. If
>>>>>>> these methods (their forward declarations) were present as children
>>>>>>> of the type DIE (as they are for c++ classes), then these tables may
>>>>>>> not be necessary. But maybe (probably) that has already been
>>>>>>> considered and deemed infeasible for some reason. In any case this
>>>>>>> seemed like a thing best left for people who actually work on ObjC
>>>>>>> support to figure out.
>>>>>>>
>>>>>>> That's really a question for Greg or Jim - I don't know why the current representation has the Objective-C methods outside of the structs. One reason might be that an interface's implementation can define more methods than are visible in its public interface in the header file, but we already seem to be aware of this and mark the implementation with DW_AT_APPLE_objc_complete_type. I also am not sure that this is the *only* reason for the objc accelerator table. But I'd like to learn.
>>>>>>
>>>>>> My observation was based on studying lldb code. The only place where
>>>>>> the objc table is used is in the AppleDWARFIndex::GetObjCMethods
>>>>>> function, which is called from
>>>>>> SymbolFileDWARF::GetObjCMethodDIEOffsets, whose only caller is
>>>>>> DWARFASTParserClang::CompleteTypeFromDWARF, which seems to have a
>>>>>> class DIE as an argument. However, if not all declarations of a
>>>>>> class/interface have access to the full list of methods then this
>>>>>> might be a problem for the approach I suggested.
>>>>>
>>>>> Maybe, but the same is actually true for C++ classes too (see my comments in another reply about implicit specializations of class member templates (and there are a couple of other examples)) - so might be worth considering how those are handled/could be improved, and maybe in fixing those we could improve/normalize the ObjC representation and avoid the need for ObjC tables... maybe.
>>>>>
>>>>
>>>> That's a good point! I need to check out how we handle that right now.
>>>
>>> Apparently we handle that very poorly. :/ I wasn't even able to call
>>> the instantiation which was present in the CU I was stopped in. I
>>> didn't even get to the part about trying an instantiation from a
>>> different CU.
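The mapping that the .apple_objc section provides, and that the third option in the message proposes for C++, amounts to a lookup from a class name to the DIEs of methods that live outside the class DIE. A minimal sketch of that idea follows. ClassMethodIndex and DieOffset are hypothetical names used only for illustration; this models neither LLDB's actual data structures nor the on-disk table format.

```cpp
#include <cstdint>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical: the offset of a DIE within the debug info.
using DieOffset = std::uint64_t;

// Hypothetical in-memory model of a "class name -> method DIEs" table.
class ClassMethodIndex {
public:
    // Record that `methodDie` describes a method belonging to `className`.
    void add(const std::string& className, DieOffset methodDie) {
        table_[className].push_back(methodDie);
    }

    // Return every recorded method DIE for `className`, or an empty
    // vector when the class has no entries.
    std::vector<DieOffset> lookup(const std::string& className) const {
        auto it = table_.find(className);
        if (it == table_.end())
            return {};
        return it->second;
    }

private:
    std::unordered_map<std::string, std::vector<DieOffset>> table_;
};
```

With such an index, completing a class type needs one lookup by name instead of a scan over every description of the class.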
via llvm-dev
2018-Jun-15 16:23 UTC
[llvm-dev] [lldb-dev] Adding DWARF5 accelerator table support to llvm
> From: Greg Clayton [mailto:clayborg at gmail.com]
>
> ...
> If a class has templated functions, they will only be in the DWARF if a
> specialization was created and used. If you have a class that looks like:
>
> class A {
>   A();
>   <template T> void Foo(T t);
> };
>
> And then you have main.cpp that has a "double" and "int" specialization,
> the class definition in DWARF looks like:
>
> class A {
>   A();
>   <int> void Foo(int t);
>   <double> void Foo(double t);
> };
>
> In another source file, say foo.cpp, if its use of class A doesn't
> specialize Foo, we have a class definition in DWARF that looks like:
>
> class A {
>   A();
> };
>

I think it would be more instructive to think about a case where main.cpp had Foo<int> and foo.cpp had Foo<double>.

> With the C++ ODR rules, we can pick any of the "class A" definitions
> whose qualified name matches ("::A") and has the same decl file +
> decl line. So when parsing "class A", the DWARF parser will use the
> accelerator tables to find all instances of "class A", and it will
> pick one and use it and this will become the one true definition for
> "class A".

FTR, the Sony debugger finds them all and then merges them into the one true definition, because there's no promise that any one description is a superset of the rest.

> This is because DWARF is only emitted for template functions when
> there is a specialization, that means any definition of "class A" might
> or might not include any definition for "<template T> A::Foo(T t);".

It's not just template functions, you know; the implicit ctors/dtors might or might not be in any given description. Those are also only instantiated and described in CUs that require them.

> When we copy types between ASTs, everything is fine if the classes
> match (no copy needs to be made), but things go wrong if things
> don't match and this causes expression errors.
>
> Some ways to fix this:
> 1 - anytime we need _any_ C++ class, we must dig up all definitions
> and check _all_ DW_TAG_subprogram DIEs within the class to check if
> any functions have templates and use the class with the most
> specializations

This still fails in my "more instructive" case. The accelerator table does make it fast to find all the classes with the same name, and you can then merge them.

> 2 - have DWARF actually emit the template function info all the time
> as a type T, not a specialization, so we always have the full definition

Hm. You mean a subroutine_type with template_parameter children that don't have actual values? That would give you a pattern, but not tell you what/where definitions exist. I don't see how that can help?

> 3 - have some accelerator table that explicitly points us to all
> specializations of class methods given a class name

So you want an accelerator to tell you what bits you need to pick up from the various descriptions in order to construct the overall superset definition, which would be a little cheaper than parsing each entire class description (which you can already find through the existing accelerator tables). I can see that.

> Solution #1 would cause us to dig through all definitions of all C++
> classes all the time when parsing DWARF to check if definitions of
> the classes had template methods. And we would need to find the class
> that has the most template methods. This would cause us to parse much
> more of the debug info all of the time and cause increased memory
> consumption and performance regressions.

It would be cheap to put a flag on the class DIE that tells you there are template methods to go look for. Then you incur the cost only when necessary. And the accelerator table makes it fast to find the other class descriptions.

> Solution #2: not sure if DWARF even supports generic template
> definitions where the template isn't specialized. And, how would we
> be able to tell DWARF that emits only specialized templates vs one
> that has generic definitions...

Right, DWARF today doesn't do that, although as I mentioned earlier conjuring up a subroutine_type with template parameters would not appear to be any more helpful than a simple flag on the class.

>
> Solution #3 will require compiler changes.

Well, so does #2.

> So this is another vote to support the ability for a given class
> to be able to locate all of its functions, kind of like we need
> for Objective C where the class definition doesn't contain all of
> methods, so we have the .apple_objc section that provides this
> mapping for us. We would need something similar for C++.
>
> So maybe a possible solution is some sort of section that can
> specify all of the DIEs related to a class that are not contained
> in the class hierarchy itself. This would work for Objective C
> and for C++.
>
> Thoughts?
>
> Greg

So, would it be helpful to have a flag on the class that tells you whether there are method instantiations to go look for? My impression is that templated class methods are unusual so it would save the performance cost a lot of the time right there.

I don't know enough about Obj-C to say whether it can know up front there are potentially other methods elsewhere.
--paulr
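The merge-plus-flag idea from this message can be sketched as follows. Everything here is a hypothetical model: ClassDescription stands for one CU's view of a class, hasTemplateMethods is the proposed (not existing) flag on the class DIE, and methods are plain strings rather than DIEs. It is not how the Sony debugger or LLDB is actually implemented.

```cpp
#include <set>
#include <string>
#include <vector>

// Hypothetical model of one compile unit's description of a class.
struct ClassDescription {
    std::string qualifiedName;      // e.g. "::A"
    bool hasTemplateMethods;        // proposed flag on the class DIE (assumption)
    std::set<std::string> methods;  // member functions listed in this CU
};

// Union of the methods from every description of `name`. No single
// description is assumed to be a superset of the others.
inline std::set<std::string> mergeAll(
        const std::vector<ClassDescription>& descriptions,
        const std::string& name) {
    std::set<std::string> merged;
    for (const ClassDescription& d : descriptions) {
        if (d.qualifiedName == name)
            merged.insert(d.methods.begin(), d.methods.end());
    }
    return merged;
}

// Pay for the merge only when the flag says there may be method
// instantiations in other descriptions; otherwise use the one we picked.
inline std::set<std::string> methodsFor(
        const ClassDescription& picked,
        const std::vector<ClassDescription>& descriptions) {
    if (!picked.hasTemplateMethods)
        return picked.methods;
    return mergeAll(descriptions, picked.qualifiedName);
}
```

In the "more instructive" case, where one CU has Foo<int> and the other has Foo<double>, picking either description alone loses a specialization, while the merged set contains both.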
Greg Clayton via llvm-dev
2018-Jun-15 16:45 UTC
[llvm-dev] [lldb-dev] Adding DWARF5 accelerator table support to llvm
> On Jun 15, 2018, at 9:23 AM, <paul.robinson at sony.com> wrote:
>
>> From: Greg Clayton [mailto:clayborg at gmail.com]
>>
>> ...
>> If a class has templated functions, they will only be in the DWARF if a
>> specialization was created and used. If you have a class that looks like:
>>
>> class A {
>>   A();
>>   <template T> void Foo(T t);
>> };
>>
>> And then you have main.cpp that has a "double" and "int" specialization,
>> the class definition in DWARF looks like:
>>
>> class A {
>>   A();
>>   <int> void Foo(int t);
>>   <double> void Foo(double t);
>> };
>>
>> In another source file, say foo.cpp, if its use of class A doesn't
>> specialize Foo, we have a class definition in DWARF that looks like:
>>
>> class A {
>>   A();
>> };
>>
>
> I think it would be more instructive to think about a case where
> main.cpp had Foo<int> and foo.cpp had Foo<double>.

Any difference is the problem here for clang ASTs, so it just matters that they are different. We make a CXXRecordDecl in the clang AST and if we see even one specialization, then we add the generic version to the CXXRecordDecl and we are good to go. So the case where main.cpp had Foo<int> and foo.cpp had Foo<double> is actually fine. If I make a CXXRecordDecl from either of these then the two definitions match, since the clang AST CXXRecordDecl just needs to have the templated function declaration.

>
>> With the C++ ODR rules, we can pick any of the "class A" definitions
>> whose qualified name matches ("::A") and has the same decl file +
>> decl line. So when parsing "class A", the DWARF parser will use the
>> accelerator tables to find all instances of "class A", and it will
>> pick one and use it and this will become the one true definition for
>> "class A".
>
> FTR, the Sony debugger finds them all and then merges them into the one
> true definition, because there's no promise that any one description is
> a superset of the rest.

At the expense of parsing every definition for a class within each file.

>
>> This is because DWARF is only emitted for template functions when
>> there is a specialization, that means any definition of "class A" might
>> or might not include any definition for "<template T> A::Foo(T t);".
>
> It's not just template functions, you know; the implicit ctors/dtors
> might or might not be in any given description. Those are also only
> instantiated and described in CUs that require them.

We don't care about those, as they are marked as DW_AT_artificial and we leave them out of the clang AST context CXXRecordDecl. They are implicit, and the compiler can add them back in if needed since it knows whether a class can have its constructors implicitly created.

>
>> When we copy types between ASTs, everything is fine if the classes
>> match (no copy needs to be made), but things go wrong if things
>> don't match and this causes expression errors.
>>
>> Some ways to fix this:
>> 1 - anytime we need _any_ C++ class, we must dig up all definitions
>> and check _all_ DW_TAG_subprogram DIEs within the class to check if
>> any functions have templates and use the class with the most
>> specializations
>
> This still fails in my "more instructive" case. The accelerator table
> does make it fast to find all the classes with the same name, and you
> can then merge them.

That is a lot of DWARF parsing and logic to try to figure out what the full set of DW_TAG_subprogram DIEs is.

>
>> 2 - have DWARF actually emit the template function info all the time
>> as a type T, not a specialization, so we always have the full definition
>
> Hm. You mean a subroutine_type with template_parameter children that
> don't have actual values? That would give you a pattern, but not tell
> you what/where definitions exist. I don't see how that can help?

Right now DWARF does only specializations, so DWARF would need to be extended to be able to specify the template details without requiring a specialization. Not easy for sure.

>
>> 3 - have some accelerator table that explicitly points us to all
>> specializations of class methods given a class name
>
> So you want an accelerator to tell you what bits you need to pick up
> from the various descriptions in order to construct the overall
> superset definition, which would be a little cheaper than parsing
> each entire class description (which you can already find through
> the existing accelerator tables). I can see that.

Yes. This is the most appealing to me as well.

>
>> Solution #1 would cause us to dig through all definitions of all C++
>> classes all the time when parsing DWARF to check if definitions of
>> the classes had template methods. And we would need to find the class
>> that has the most template methods. This would cause us to parse much
>> more of the debug info all of the time and cause increased memory
>> consumption and performance regressions.
>
> It would be cheap to put a flag on the class DIE that tells you there
> are template methods to go look for. Then you incur the cost only
> when necessary. And the accelerator table makes it fast to find the
> other class descriptions.

That is a fine solution. But we still run into the problem where we don't know if the producer of the DWARF knows about that flag. If we do add a flag, it would be nice if it were mandatory on all classes, so that its presence indicates support for the flag. But this would be a fine solution and not hard to implement.

>
>> Solution #2: not sure if DWARF even supports generic template
>> definitions where the template isn't specialized. And, how would we
>> be able to tell DWARF that emits only specialized templates vs one
>> that has generic definitions...
>
> Right, DWARF today doesn't do that, although as I mentioned earlier
> conjuring up a subroutine_type with template parameters would not
> appear to be any more helpful than a simple flag on the class.

Yeah, there would have to be new DWARF and a lot of DW_AT_specification or DW_AT_abstract_origin references involved...

>
>>
>> Solution #3 will require compiler changes.
>
> Well, so does #2.

Indeed.

>
>> So this is another vote to support the ability for a given class
>> to be able to locate all of its functions, kind of like we need
>> for Objective C where the class definition doesn't contain all of
>> methods, so we have the .apple_objc section that provides this
>> mapping for us. We would need something similar for C++.
>>
>> So maybe a possible solution is some sort of section that can
>> specify all of the DIEs related to a class that are not contained
>> in the class hierarchy itself. This would work for Objective C
>> and for C++.
>>
>> Thoughts?
>>
>> Greg
>
> So, would it be helpful to have a flag on the class that tells you
> whether there are method instantiations to go look for?

That would be a great start to solution #1 if that is the way we choose to go.

> My
> impression is that templated class methods are unusual so it would
> save the performance cost a lot of the time right there.
>
> I don't know enough about Obj-C to say whether it can know up front
> there are potentially other methods elsewhere.

Another way to do the Objective C solution would be to have an attribute on a DW_TAG_class_type that could specify a list of DIEs that are not contained in the class definition but are required for the class. In Objective C, the DW_TAG_subprogram DIEs for methods are outside of the class itself. This would involve DWARF changes, but it could be an attribute that specifies a section offset into a new section that contains a DIE offset list. So the Objective C case is still different enough from the C++ case, because in the C++ case all functions are still contained within the DW_TAG_class_type DIE, IIRC.

Greg
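The last proposal, an attribute on the class DIE holding a section offset into a new section that contains a DIE offset list, could be read roughly as in the sketch below. The layout is purely an assumption made for illustration, since the message does not define one: the section is modeled as an array of 32-bit words, and each list is a count followed by that many DIE offsets. The function name readDieOffsetList is hypothetical as well.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Read one DIE offset list from a hypothetical side section.
// Assumed layout (not defined by DWARF or by the message): at word index
// `listOffset` there is a count N, followed by N 32-bit DIE offsets.
inline std::vector<std::uint32_t> readDieOffsetList(
        const std::vector<std::uint32_t>& section,
        std::size_t listOffset) {
    std::vector<std::uint32_t> dies;
    if (listOffset >= section.size())
        return dies;  // attribute points past the end of the section

    const std::size_t count = section[listOffset];
    if (count > section.size() - listOffset - 1)
        return dies;  // truncated list: refuse to read out of bounds

    const auto first =
        section.begin() + static_cast<std::ptrdiff_t>(listOffset + 1);
    dies.assign(first, first + static_cast<std::ptrdiff_t>(count));
    return dies;
}
```

A consumer would take the offset from the class DIE's attribute, call the reader once, and then parse only the DIEs it names, which is what makes this cheaper than scanning every description of the class.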