David Blaikie via llvm-dev
2016-Mar-03 18:26 UTC
[llvm-dev] [cfe-dev] RFC: CodeView debug info emission in Clang/LLVM
I think it'd be reasonable to at least figure out a good way to do type references consistently across the two schemes, but I'm OK with the idea of having a blob of opaque type information for different debug info formats, created by frontends.

I don't mind if the library for building that blob lives in LLVM or Clang for now. The DWARF one at least would probably live in LLVM, because type info and other DWARF are described by similar/the same constructs (DIEs, abbrevs, etc). It seems like that's not the case for PDB, so there might not be any code to share between LLVM's CodeView needs and the type info construction. Then it's just a matter of whether pushing that library down into LLVM for other frontends to use would be good, which it probably will be at some point, so if it goes into Clang I'd at least try to keep it pretty well separated.

Potentially that consistency could be created by going the other way: replace DITypeRef with an int, then have the retained types list be the int->type mapping, skipping the mangled names (and skipping the retained types list for CV/PDB).

- Dave

On Wed, Mar 2, 2016 at 5:19 PM, Reid Kleckner via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> Circling back around 4 months later...
>
> I now believe that we should just let the frontend generate CV type info. It's really not worth the hassle to try to have a common representation. Enough C++ ABI-specific information leaks into the format that it's really better to avoid trying to create a union of DWARF and CV type info in LLVM DI metadata. We were able to reuse all the other non-type DI metadata, such as location info and scope info, to emit inline line tables and variable locations, so I think we did OK on reusing the existing infrastructure. Compromising at not reusing the type representation seems OK.
>
> I haven't come up with any ideas better than the design that Dave Bartolomeo outlined below, so I think we should go ahead with that. One thing I considered was extending DITypeRef to be a union between MDString*, DIType*, and a type index, but I think that's too invasive. I also don't want to make a whole DIType heap allocation just to wrap a 32-bit type index, so I'm in favor of putting the indices into DISubprogram and DIVariable.
>
> Any thoughts on this plan?
>
> On Thu, Oct 29, 2015 at 10:11 AM, Dave Bartolomeo via cfe-dev <cfe-dev at lists.llvm.org> wrote:
>>
>> *Proposed Design*
>>
>> *How Debug Info is Generated*
>>
>> The CodeView type records for a compilation unit will be generated by the front-end for the source language (Clang, in the case of C and C++). The front-end has access to the full type system and AST of the language, which is necessary to generate accurate debug type info. The type records will be represented as metadata in the LLVM IR, similar to how DWARF debug info is represented. I’ll cover the actual representation in a bit more detail below.
>>
>> The LLVM back-end will be responsible for emitting the CodeView type records from the IR into the output .obj file. Since the type records will already be in the correct format, this is essentially just a copy. No inspection of the type records is necessary within LLVM. The back-end will also be responsible for generating CodeView symbol records, line numbers, and source file info for any functions and data defined in the compilation unit. The back-end is the logical place to do this because only the back-end knows the code addresses, data addresses, and stack frame layouts.
>>
>> *Representation of CodeView in LLVM IR*
>>
>> DICompileUnit
>>   + *existing fields*
>>   + CodeViewTypes : DICodeViewTypes
>>
>> DICodeViewTypes
>>   + TypeRecords : MDString[]
>>   + UDTSymbols : DICodeViewUDT[]
>>
>> DICodeViewUDT
>>   + Name : MDString
>>   + TypeIndex : uint32_t
>>
>> DIVariable
>>   + *existing fields*
>>   + TypeIndex : uint32_t
>>
>> DISubprogram
>>   + *existing fields*
>>   + TypeIndex : uint32_t
>>
>> The existing DICompileUnit node will have a new operand named CodeViewTypes, which points to the new DICodeViewTypes node that describes the CodeView type information for the compilation unit.
>>
>> The DICodeViewTypes node contains two operands:
>>
>> - TypeRecords, an array of MDStrings containing the actual CodeView type records for the compilation unit, sorted in ascending order of type index.
>>
>> - UDTSymbols, an array of DICodeViewUDT nodes describing the user-defined types (class/struct/union/enum) for which CodeView symbol records will need to be emitted by the back-end.
>>
>> The DICodeViewUDT node contains two operands:
>>
>> - Name, an MDString with the name of the symbol as it should appear in the CodeView symbol record.
>>
>> - TypeIndex, a uint32_t holding the CodeView type index of the type record for the user-defined type’s definition.
>>
>> The DICodeViewUDT nodes are necessary because they are generally the only references to the definition of the user-defined type. Other uses of that type refer to the forward declaration record for the type, and without a reference to the definition of the type, the linker will discard the definition record when it merges the type information into the PDB.
>>
>> To specify the CodeView type for a variable or function, the DIVariable and DISubprogram nodes will have an additional TypeIndex operand containing the type index of the type record for that variable or function’s type. This operand will be set to zero when CodeView debug info is not enabled.
>>
>> The above representation essentially extends the existing DWARF-focused debug metadata to also include CodeView info. This was the least invasive way I found to add CodeView support, but it may not be the right architectural decision. It would also be possible to have the CodeView metadata entirely separate from the DWARF metadata. This would reduce the size of the IR when only one form of debug information was being emitted, which is presumably the common case. However, I expect it would complicate the scenario where both DWARF and CodeView are being emitted; for example, would having two dbg.declare intrinsics for a single local variable confuse existing consumers of LLVM IR? I’m hoping someone more familiar with the existing debug info architecture can provide some guidance here if there’s a better way of doing this.
Aboud, Amjad via llvm-dev
2016-Mar-08 12:39 UTC
[llvm-dev] [cfe-dev] RFC: CodeView debug info emission in Clang/LLVM
Hi,

I said it before and I am saying it again: I do not think that this proposal is needed to support CodeView.

1. Why can't Codegen use the current DIType metadata to represent the CodeView types?

2. Why can't “DW_TAG_typedef” be used to generate the “DICodeViewUDT” symbol?

3. Why do we need the TypeIndex?
   - Couldn't DISubprogram and DIVariable simply point to the DIType metadata, instead of holding an index into an array where these DITypes are stored?

4. Why are the “TypeRecords” of type MDString? Are they the source names of the types?

I believe that the current debug info metadata contains all the information needed to create the CodeView information in codegen. Thus, I do not see a need to modify either Clang or the LLVM IR.

Please, if you have a concrete case where you think we have lost information needed for CodeView between Clang and Codegen, tell us about it and I will be happy to help you figure out how to retrieve this information from the current DI metadata.

Thanks,
Amjad
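For readers weighing questions 3 and 4, here is a rough sketch of the layout the quoted proposal describes. It is not code from the thread or from LLVM: the structs are hypothetical plain-C++ stand-ins for the proposed metadata nodes, and the starting index 0x1000 is an assumption about where CodeView's non-simple type indices begin. The point it tries to show is that each TypeRecords entry holds opaque, already-serialized record bytes (not a source name), and that a TypeIndex is just a position in that ordered list.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-ins for the proposed nodes; NOT LLVM classes.
struct CodeViewUDT {
  std::string Name;   // name as it should appear in the symbol record
  uint32_t TypeIndex; // type index of the UDT's definition record
};

struct CodeViewTypes {
  std::vector<std::string> TypeRecords; // opaque record bytes, ascending type index
  std::vector<CodeViewUDT> UDTSymbols;
};

// Assumption: the first entry of TypeRecords has type index 0x1000.
constexpr uint32_t FirstRecordIndex = 0x1000;

// Returns the record for TypeIndex, or nullptr when the index does not name
// a record in this list (including 0, the proposal's "not enabled" value).
const std::string *lookupRecord(const CodeViewTypes &T, uint32_t TypeIndex) {
  if (TypeIndex < FirstRecordIndex)
    return nullptr;
  uint32_t Slot = TypeIndex - FirstRecordIndex;
  return Slot < T.TypeRecords.size() ? &T.TypeRecords[Slot] : nullptr;
}

// "Essentially just a copy": concatenate the records without inspecting them.
std::string emitTypeSection(const CodeViewTypes &T) {
  std::string Out;
  for (const std::string &Record : T.TypeRecords)
    Out += Record;
  assert(Out.size() >= T.TypeRecords.size() || T.TypeRecords.empty());
  return Out;
}
```

Under these assumptions the back-end never has to decode a record; it only needs the index to connect a variable or function to its type.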
Eric Christopher via llvm-dev
2016-Mar-09 05:34 UTC
[llvm-dev] [cfe-dev] RFC: CodeView debug info emission in Clang/LLVM
In general, I agree here. I'm still unconvinced that this needs to happen this way.

-eric
Reid Kleckner via llvm-dev
2016-Mar-10 17:24 UTC
[llvm-dev] [cfe-dev] RFC: CodeView debug info emission in Clang/LLVM
On Thu, Mar 3, 2016 at 10:26 AM, David Blaikie <dblaikie at gmail.com> wrote:
> Potentially that consistency could be created by going the other way - replace DITypeRef with an int, then have the retained types list be the int->type mapping. Skipping the mangled names. (& skip the retained types list for CV/PDB)

DITypeRef wraps a Metadata*, though, not an int. Given that there are zero users of DITypeRef in Transforms/ and Analysis/, I don't see why we should try to forcibly create sharing where there is none. The only consumers of type information are essentially the separate debug info backends.
David Blaikie via llvm-dev
2016-Mar-10 17:49 UTC
[llvm-dev] [cfe-dev] RFC: CodeView debug info emission in Clang/LLVM
On Thu, Mar 10, 2016 at 9:24 AM, Reid Kleckner <rnk at google.com> wrote:
> DITypeRef wraps a Metadata*, though, not an int. Given that there are zero users of DITypeRef in Transforms/ and Analysis/, I don't see why we should try to forcibly create sharing where there is none. The only consumers of type information are essentially the separate debug info backends.

I haven't looked in detail at the patch, but it sounded like the proposal was to add an int field next to every DITypeRef field? That seems verbose/intrusive to the schema compared to making the type reference machinery able to be one or the other.

Or is the proposal to have DITypeRef fields be a union of int or DITypeRef, where the DITypeRef itself is a union of metadata reference or string? If we already have a union of metadata or string, it seems like the better thing to do would be to make it metadata, string, or int, rather than having two different layers for referring to types.
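To make the "metadata, string, or int" alternative concrete, here is a minimal sketch of a single-layer reference type. It is purely illustrative: TypeNode and TypeRef are hypothetical names, not the DITypeRef that exists in LLVM, and nothing here reflects how the patch under discussion is written.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <utility>

// Hypothetical stand-in for a type metadata node; NOT the LLVM DIType class.
struct TypeNode {
  std::string Name;
};

// One reference type that is exactly one of: a direct node pointer, a string
// identifier (e.g. a mangled name), or a numeric type index.
class TypeRef {
public:
  enum class Kind { Node, Identifier, Index };

  static TypeRef fromNode(const TypeNode *Node) {
    TypeRef R(Kind::Node);
    R.N = Node;
    return R;
  }
  static TypeRef fromIdentifier(std::string Identifier) {
    TypeRef R(Kind::Identifier);
    R.Id = std::move(Identifier);
    return R;
  }
  static TypeRef fromIndex(uint32_t Index) {
    TypeRef R(Kind::Index);
    R.Idx = Index;
    return R;
  }

  Kind kind() const { return K; }

  // Each accessor asserts that the reference holds the requested alternative.
  const TypeNode *asNode() const {
    assert(K == Kind::Node);
    return N;
  }
  const std::string &asIdentifier() const {
    assert(K == Kind::Identifier);
    return Id;
  }
  uint32_t asIndex() const {
    assert(K == Kind::Index);
    return Idx;
  }

private:
  explicit TypeRef(Kind K) : K(K) {}

  Kind K;
  const TypeNode *N = nullptr;
  std::string Id;
  uint32_t Idx = 0;
};
```

With a shape like this, a field that refers to a type stays a single field regardless of debug info format, which is the trade-off being weighed against adding a separate int next to each existing reference.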
Reid Kleckner via llvm-dev
2016-Mar-10 18:51 UTC
[llvm-dev] [cfe-dev] RFC: CodeView debug info emission in Clang/LLVM
It is certainly *possible* to use the existing DIType hierarchy to generate CodeView, but I don't believe it is useful. We would have to make the DI metadata into the union of DWARF and CodeView, and it would be horrible. Here is an incomplete list of things that would be awkward:

- Member pointer inheritance models. Not all pointers to members are the same size.
- Describing locations of virtual bases in vbtables. I'm not sure how to get from DW_TAG_inheritance data to "offset of vbptr from vfptr of complete class".
- Describing 'this' adjustments performed in virtual method prologues.
- New virtuality types to indicate "introducing" virtual methods.
- New flags on everything, see CodeView.h for more info.

If you need more visibility into what's different, consider this C++ source:

    struct A {
      virtual void f() {}
      int a;
    };
    struct B : virtual A {
      virtual void f() {}
      virtual void g() {}
      int b;
    };
    struct C : virtual A {
      virtual void f() {}
      virtual void h() {}
      int c;
    };
    struct D : B, C {
      virtual void f() {}
      virtual void g() {}
      virtual void h() {}
      int d;
    };
    D d;
    auto mp = &D::f;

Compare the metadata that clang generates with the dump of the codeview that MSVC generates, and decide for yourself if the representations are a good match:

    $ clang -cc1 -std=c++11 -emit-llvm -debug-info-kind=limited t.cpp -o - -triple x86_64-linux -o t.ll
    LLVM IR: https://ghostbin.com/paste/dpqo8

    $ cl -c t.cpp -Z7 && llvm-readobj -codeview t.obj
    Dump of MSVC CodeView: https://ghostbin.com/paste/92ya3

Sure, yes, it is *possible* to write a converter from one to the other, but why is it necessary? What use case does it enable? You might think it would allow non-Clang frontends to avoid having separate type info emitters, but in practice it won't, because these frontends will need to be augmented to pass down all kinds of CV-specific junk.
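The first bullet (member pointer inheritance models) can be observed directly with a small probe. This is only a sketch built on a trimmed copy of the hierarchy above; the struct and function names other than A, B, C and D are made up for illustration. The values it reports depend on the C++ ABI in use: under the Microsoft ABI the member pointer representation depends on the class's inheritance model, while under the Itanium ABI pointers to member functions have one fixed layout, so no particular numbers are claimed here.

```cpp
#include <cassert>
#include <cstddef>

// Trimmed copy of the hierarchy from the message above.
struct A { virtual void f() {} int a; };
struct B : virtual A { virtual void f() {} int b; };
struct C : virtual A { virtual void f() {} int c; };
struct D : B, C { virtual void f() {} int d; };

// Hypothetical helper type for reporting the two sizes side by side.
struct MemberPointerSizes {
  std::size_t SingleInheritance;  // pointer to member function of A
  std::size_t VirtualInheritance; // pointer to member function of D
};

MemberPointerSizes memberPointerSizes() {
  MemberPointerSizes S = {sizeof(void (A::*)()), sizeof(void (D::*)())};
  // Holds on the common ABIs; not a guarantee made by the C++ standard.
  assert(S.SingleInheritance >= sizeof(void *));
  return S;
}
```

Building this once with an MSVC-compatible target and once with an Itanium-ABI target and comparing the two fields is a quick way to see why a format-neutral description of member pointers is hard to pin down.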
On Tue, Mar 8, 2016 at 4:39 AM, Aboud, Amjad <amjad.aboud at intel.com> wrote:> Hi, > > I said it before and I am saying it again, I do not think that this > proposal is needed to support Codeview. > > > > 1. Why cannot Codegen make use of current DIType metadata to > represent the codeview types? > > 2. Why cannot “DW_TAG_typedef” be used to generate the > “DICodeViewUDT” symbol? > > 3. Why do we need the TypeIndex? > > · DISubprogram and DIVariable simply point to the DIType > metadata, instead of having an index into an array where these DIType are > stored?! > > 4. Why the “TypeRecords” are of type MDString? Are they the source > name of the type? > > > > I believe that current Debug Info metadata contains all information needed > to create the codeview information in codegen. > > Thus, I do not see a need to either modify Clang or even modify the LLVM > IR. > > > > Please, if you have a concrete case where you think we have lost > information needed for codeview between Clang and Codegen, tell us about it > and I will be happy to help you figure out how to retrieve this information > from current DI metadata. 
> > > > Thanks, > > Amjad > > > > *From:* llvm-dev [mailto:llvm-dev-bounces at lists.llvm.org] *On Behalf Of *David > Blaikie via llvm-dev > *Sent:* Thursday, March 03, 2016 20:26 > *To:* Reid Kleckner <rnk at google.com> > *Cc:* llvm-dev at lists.llvm.org; cfe-dev at lists.llvm.org > *Subject:* Re: [llvm-dev] [cfe-dev] RFC: CodeView debug info emission in > Clang/LLVM > > > > I think it'd be reasonable to at least figure out a good way to do type > references consistently across the two schemes, but I'm OK with the idea of > having a blob of opaque type information for different debug info formats, > created by frontends (& don't mind if the library for building that blob > live in LLVM or Clang for now - the DWARF one at least would probably live > in LLVM because type info and other DWARF are described by similar/the same > constructs (DIEs, abbrevs, etc) - but it seems like that's not the case for > PDB, so there might not be any code to share between LLVM's CodeView needs > and the type info construction - then it's just a matter of whether pushing > that library down into LLVM for other frontends to use would be good, which > it probably will be at some point, so if it goes into Clang I'd at least > try to keep it pretty well separated) > > Potentially that consistency could be created by going the other way - > replace DITypeRef with an int, then have the retained types list be the > int->type mapping. Skipping the mangled names. (& skip the retained types > list for CV/PDB) > > - Dave > > > > On Wed, Mar 2, 2016 at 5:19 PM, Reid Kleckner via llvm-dev < > llvm-dev at lists.llvm.org> wrote: > > Circling back around 4 months later... > > > > I now believe that we should just let the frontend generate CV type info. > It's really not worth the hassle to try to have a common representation. > Enough C++ ABI-specific information leaks into the format that it's really > better to avoid trying to create a union of DWARF and CV type info in LLVM > DI metadata. 
> We were able to reuse all the other non-type DI metadata, such
> as location info and scope info, to emit inline line tables and variable
> locations, so I think we did OK on reusing the existing infrastructure.
> Compromising at not reusing the type representation seems OK.
>
> I haven't come up with any ideas better than the design that Dave
> Bartolomeo outlined below, so I think we should go ahead with that. One
> thing I considered was extending DITypeRef to be a union between MDString*,
> DIType*, and a type index, but I think that's too invasive. I also don't
> want to make a whole DIType heap allocation just to wrap a 32-bit type
> index, so I'm in favor of putting the indices into DISubprogram and
> DIVariable.
>
> Any thoughts on this plan?
>
> On Thu, Oct 29, 2015 at 10:11 AM, Dave Bartolomeo via cfe-dev <
> cfe-dev at lists.llvm.org> wrote:
>
> *Proposed Design*
>
> *How Debug Info is Generated*
>
> The CodeView type records for a compilation unit will be generated by the
> front-end for the source language (Clang, in the case of C and C++). The
> front-end has access to the full type system and AST of the language, which
> is necessary to generate accurate debug type info. The type records will be
> represented as metadata in the LLVM IR, similar to how DWARF debug info is
> represented. I’ll cover the actual representation in a bit more detail
> below.
>
> The LLVM back-end will be responsible for emitting the CodeView type
> records from the IR into the output .obj file. Since the type records will
> already be in the correct format, this is essentially just a copy. No
> inspection of the type records is necessary within LLVM. The back-end will
> also be responsible for generating CodeView symbol records, line numbers,
> and source file info for any functions and data defined in the compilation
> unit.
> The back-end is the logical place to do this because only the
> back-end knows the code addresses, data addresses, and stack frame layouts.
>
> *Representation of CodeView in LLVM IR*
>
> DICompileUnit
>   + *existing fields*
>   + CodeViewTypes : DICodeViewTypes
>
> DICodeViewTypes
>   + TypeRecords : MDString[]
>   + UDTSymbols : DICodeViewUDT[]
>
> DICodeViewUDT
>   + Name : MDString
>   + TypeIndex : uint32_t
>
> DIVariable
>   + *existing fields*
>   + TypeIndex : uint32_t
>
> DISubprogram
>   + *existing fields*
>   + TypeIndex : uint32_t
>
> The existing DICompileUnit node will have a new operand named
> CodeViewTypes, which points to the new DICodeViewTypes node that describes
> the CodeView type information for the compilation unit.
>
> The DICodeViewTypes node contains two operands:
>
> - TypeRecords, an array of MDStrings containing the actual
>   CodeView type records for the compilation unit, sorted in ascending order
>   of type index.
> - UDTSymbols, an array of DICodeViewUDT nodes describing the
>   user-defined types (class/struct/union/enum) for which CodeView symbol
>   records will need to be emitted by the back-end.
>
> The DICodeViewUDT node contains two operands:
>
> - Name, an MDString with the name of the symbol as it should
>   appear in the CodeView symbol record.
> - TypeIndex, a uint32_t holding the CodeView type index of the
>   type record for the user-defined type’s definition.
>
> The DICodeViewUDT nodes are necessary because they are generally the only
> references to the definition of the user-defined type. Other uses of that
> type refer to the forward declaration record for the type, and without a
> reference to the definition of the type, the linker will discard the
> definition record when it merges the type information into the PDB.
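
To make the quoted node layout concrete, here is a minimal C++ sketch that mirrors the proposed nodes as plain structs. The names SketchCodeViewUDT, SketchCodeViewTypes, SketchCompileUnit and copyTypeRecords are hypothetical and are not LLVM classes or APIs; in the proposal these would be metadata nodes in the IR. The sketch only illustrates the claim that the back-end could emit the type records as a straight copy, without inspecting them.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical plain-struct mirror of the proposed nodes. These are not
// LLVM classes; in the proposal they would be metadata nodes in the IR.
struct SketchCodeViewUDT {
  std::string Name;        // symbol name as it should appear in the record
  std::uint32_t TypeIndex; // type index of the UDT's definition record
};

struct SketchCodeViewTypes {
  // Opaque, already-encoded type records in ascending type-index order.
  std::vector<std::string> TypeRecords;
  // UDTs for which the back-end would have to emit symbol records.
  std::vector<SketchCodeViewUDT> UDTSymbols;
};

struct SketchCompileUnit {
  SketchCodeViewTypes CodeViewTypes; // the new operand from the proposal
};

// Back-end side: concatenate the records verbatim, never parsing them.
std::string copyTypeRecords(const SketchCompileUnit &CU) {
  std::string Out;
  for (const std::string &Record : CU.CodeViewTypes.TypeRecords)
    Out += Record;
  return Out;
}

// Usage sketch: two opaque records (2 bytes and 1 byte) and one UDT entry.
void copyTypeRecordsExample() {
  SketchCompileUnit CU;
  CU.CodeViewTypes.TypeRecords = {std::string("\x01\x02", 2),
                                  std::string("\x03", 1)};
  CU.CodeViewTypes.UDTSymbols.push_back({"Foo", 0x1001});
  assert(copyTypeRecords(CU).size() == 3);
}
```

The record bytes and the 0x1001 index in the usage function are made-up placeholders; a real front-end would fill them with encoded CodeView records.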
>
> To specify the CodeView type for a variable or function, the DIVariable
> and DISubprogram nodes will have an additional TypeIndex operand containing
> the type index of the type record for that variable or function’s type.
> This operand will be set to zero when CodeView debug info is not enabled.
>
> The above representation essentially extends the existing DWARF-focused
> debug metadata to also include CodeView info. This was the least invasive
> way I found to add CodeView support, but it may not be the right
> architectural decision. It would also be possible to have the CodeView
> metadata entirely separate from the DWARF metadata. This would reduce the
> size of the IR when only one form of debug information was being emitted,
> which is presumably the common case. However, I expect it would complicate
> the scenario where both DWARF and CodeView are being emitted; for example,
> would having two dbg.declare intrinsics for a single local variable confuse
> existing consumers of LLVM IR? I’m hoping someone more familiar with the
> existing debug info architecture can provide some guidance here if there’s
> a better way of doing this.
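
One way to read the TypeIndex operand is as a key into the TypeRecords array. The sketch below assumes that the first record in the array has type index 0x1000 and that smaller values denote built-in types with no record of their own; that base value is an assumption made here, not something the proposal states. The helper name findTypeRecord is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Assumption: record-backed type indices start at 0x1000; anything below
// (including 0, which the proposal uses for "CodeView not enabled") has
// no entry in TypeRecords.
constexpr std::uint32_t FirstRecordTypeIndex = 0x1000;

// Hypothetical helper: map a TypeIndex to its opaque record, or nullptr
// when the index has no record in this compilation unit.
const std::string *findTypeRecord(const std::vector<std::string> &TypeRecords,
                                  std::uint32_t TypeIndex) {
  if (TypeIndex < FirstRecordTypeIndex)
    return nullptr;
  std::size_t Pos = TypeIndex - FirstRecordTypeIndex;
  return Pos < TypeRecords.size() ? &TypeRecords[Pos] : nullptr;
}

// Usage sketch with placeholder record contents.
void findTypeRecordExample() {
  std::vector<std::string> Records = {"first", "second"};
  assert(findTypeRecord(Records, 0) == nullptr);
  assert(*findTypeRecord(Records, 0x1001) == "second");
}
```

Because the array is sorted in ascending order of type index, the lookup in this sketch is just a subtraction and a bounds check.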