Sourabh Singh Tomar via llvm-dev
2020-Jan-02 14:03 UTC
[llvm-dev] Query/Suggestions on upgrading macro infrastructure.
Hello Everyone, I would like to have your thoughts on this. Overview of the problem ================== While implementing support for the DWARFv5 debug_macro section support in LLVM. I came across these holes: - The macros infrastructure in CLANG/LLVM is inherently tied to a particular version(v4 macinfo). For instance, consider this snippet from CLANG: Gen->getCGDebugInfo()->CreateMacro(getCurrentScope(), llvm::dwarf::DW_MACINFO_define,> location, Name.str(), Value.str()).- So, let's say you want to add support for a new form /DW_MACRO_define_strx/, with existing infrastructure, you can't do this without polluting code with lot of if-else. This is just to highlight the problem, there are numerous areas where we will have to add if-else. - DWARF forms[DW_MACINFO*] are unnecessarily hard-coded in CLANG, resulting in a false dependence of CLANG/LLVM on entire generation phase of debug info for macros, even when they don't have to. - To worsen the problem even more, we have some DWARF forms sharing same encoding. Currently, we have 4-forms that share same encoding with previous DWARFv4. High Level Summary of Solution ======================= To deal with this and to facilitate addition of new future DWARF forms and other stuff, I propose following changes to be done: · Segregate CLANG entirely from macro debug info part. This has been done by introducing a new Generic Macro Types. Since macro information can be broadly classified into 4 categories: 1. Definition 2. Un-definition 3. File inclusion 4. File termination So I propose to introduce 4 new Generic Macro Types to be used as types, with no dependence to various DWARFv5/v4 forms. 1. MACRO_definition 2. MACRO_undef 3. MACRO_start_file 4. MACRO_end_file - Previous infrastructure uses DWARF forms as types in metadata --> !DIMacro(type: DW_MACINFO_define, line: 9, name: "Name", value: "Value")And the production part will query this and emits the form present[hardcoded]. Upgraded infrastructure will be using something like --> !DIMacro(type: MACRO_definition, line: 9, name: "Name", value: "Value")Now production part will query the Generic Macro Type present in metadata and based on the version of DWARF specified, it will emit forms accordingly. - Now this should cater to all our needs of LLVM and debug-info production-consumption WRT to macro. This approach solves the problem highlighted above and simplifies the code, increases code/infrastructure reuse for building DWARFv5 macro and future additons. I have a patch[implementing this] ready, currently supporting DWARFv4 macro only. Planning to build rest of DWARFv5 on top of this. Could you guys let me know your thoughts and how to proceed forward with this. Should we need an RFC for it have a better discussion? Thanks in anticipation, Sourabh. -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20200102/b69b2ee2/attachment.html>
David Blaikie via llvm-dev
2020-Jan-02 22:31 UTC
[llvm-dev] Query/Suggestions on upgrading macro infrastructure.
On Thu, Jan 2, 2020 at 6:03 AM Sourabh Singh Tomar <sourav0311 at gmail.com> wrote:> Hello Everyone, > > > > I would like to have your thoughts on this. > > > > Overview of the problem > > ==================> > While implementing support for the DWARFv5 debug_macro section support in > LLVM. I came across these holes: > > - The macros infrastructure in CLANG/LLVM is inherently tied to a > particular version(v4 macinfo). For instance, consider this snippet from > CLANG: > > Gen->getCGDebugInfo()->CreateMacro(getCurrentScope(), llvm::dwarf::DW_MACINFO_define, >> location, Name.str(), Value.str()). > > > - So, let's say you want to add support for a new form > /DW_MACRO_define_strx/, with existing infrastructure, you can't do this > without polluting code with lot of if-else. This is just to highlight the > problem, there are numerous areas where we will have to add if-else. > - DWARF forms[DW_MACINFO*] are unnecessarily hard-coded in CLANG, > resulting in a false dependence of CLANG/LLVM on entire generation phase of > debug info for macros, even when they don't have to. > - To worsen the problem even more, we have some DWARF forms sharing > same encoding. Currently, we have 4-forms that share same encoding with > previous DWARFv4. > >4 forms that share the same encoding in DWARFv4 - you mean the define, define_strp, define_strx, and define_sup? I think that the solution here is probably not to change the IR representation - IR doesn't have the semantics that necessitate strip/strx/sup encodings (there's no indexed string section at the IR level - string sharing is a lower level implementation detail at the IR Level that's not apparent at the IR Level). These are choices the backend would make about how to efficiently encode the strings into the final output. All this is to say that I agree with you about the semantics of your proposal, but I think the /syntax/ of the IR can remain the same, just with the semantics you're describing. (ie: that DW_MACINFO_define in the IR doesn't necessarily mean you'll get a DW_MACINFO_define in the output (you might get a DW_MACINFO_define, maybe a DW_MACINFO_define_strx, etc, etc... )> > > High Level Summary of Solution > > =======================> > To deal with this and to facilitate addition of new future DWARF forms > and other stuff, I propose following changes to be done: > > · Segregate CLANG entirely from macro debug info part. This has been > done by introducing a new Generic Macro Types. Since macro information can > be broadly classified into 4 categories: > > 1. Definition > > 2. Un-definition > > 3. File inclusion > > 4. File termination > > So I propose to introduce 4 new Generic Macro Types to be used as types, > with no dependence to various DWARFv5/v4 forms. > > > 1. MACRO_definition > 2. MACRO_undef > 3. MACRO_start_file > 4. MACRO_end_file > > > - Previous infrastructure uses DWARF forms as types in metadata -- > >> !DIMacro(type: DW_MACINFO_define, line: 9, name: "Name", value: "Value") > > And the production part will query this and emits the form > present[hardcoded]. > > > > Upgraded infrastructure will be using something like -- > >> !DIMacro(type: MACRO_definition, line: 9, name: "Name", value: "Value") > > Now production part will query the Generic Macro Type present in metadata > and based on the version of DWARF specified, it will emit forms accordingly. > > > > - Now this should cater to all our needs of LLVM and debug-info > production-consumption WRT to macro. > > > > This approach solves the problem highlighted above and simplifies the > code, increases code/infrastructure reuse for building DWARFv5 macro and > future additons. > > > I have a patch[implementing this] ready, currently supporting DWARFv4 > macro only. Planning to build rest of DWARFv5 on top of this. Could you > guys let me know your thoughts and how to proceed forward with this. Should > we need an RFC for it have a better discussion? > > > > Thanks in anticipation, > > Sourabh. > > >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20200102/47a7077f/attachment.html>
Sourabh Singh Tomar via llvm-dev
2020-Jan-03 14:47 UTC
[llvm-dev] Query/Suggestions on upgrading macro infrastructure.
Hi David, Thanks for sharing your thoughts/suggestions on this. 4 forms that share the same encoding in DWARFv4 - you mean the define,> define_strp, define_strx, and define_sup?These are: DW_MACINFO_define = DW_MACRO_define DW_MACINFO_undef = DW_MACRO_undef DW_MACINFO_start_file = DW_MACRO_start_file DW_MACINFO_end_file = DW_MACRO_end_file I think that the solution here is probably not to change the IR> representation - IR doesn't have the semantics that necessitate > strip/strx/sup encodings (there's no indexed string section at the IR level > - string sharing is a lower level implementation detail at the IR Level > that's not apparent at the IR Level). These are choices the backend would > make about how to efficiently encode the strings into the final output.Agreed. IR doesn't have the semantics that necessitate strp/strx/sup encodings, and don't have too. However considering the most productive macro-string representation form /DW_FORM_define_strx/ and taking the existing infrastructure, emission psuedo code will be : if (Version == 4 && getMacinfoType() == DW_MACINFO_define)> emit DW_MACINFO_define > if (Version == 5 && getMacinfoType() == DW_MACINFO_define) > emit DW_MACRO_define_strxNow here the presence of "DW_MACINFO_define" here seems unfair/odd, may confuse the user/developer. With new types, we can have much cleaner code:> if (getGenericMacroType() == MACRO_definition) /* IR types doesn't > show/share any false dependence on how back-end DWARF or other will consume > this*/ > if (Version == 5) emit "DW_MACRO_define_strx" > else emit "DW_MACRO_define"Even at the IR level, user will have confusion. When in .ll file we have "DW_MACINFO_define" and subsequent binary contains "DW_MACRO_define_strx". This again, is due to the fact that "DW_MACINFO_define" is defined in the DWARF spec, resulting in a false dependence/representation. At least visible to user familiar with DWARF. Yet another motivating example: Exactly same thing happens with "DW_MACINFO_start_file". We'll have "DW_MACINFO_start_file" in .ll file and subsequent binary will contain "DW_MACRO_start_file". Consider this snippet from llvm:> auto *MF = DIMacroFile::get(VMContext, > dwarf::DW_MACINFO_start_file, TMF->getLine(), TMF->getFile(), > getOrCreateMacroArray(I.second.getArrayRef())); > replaceTemporary(llvm::TempDIMacroNode(TMF), MF);Now based on this, emission psuedo code will be:> if (Version == 4 && getMacinfoType() == DW_MACINFO_start_file) > emit "DW_MACINFO_start_file" > if (Version == 5 && getMacinfoType() == DW_MACINFO_start_file) > emit "DW_MACRO_start_file"All this is to say that I agree with you about the semantics of your> proposal, but I think the /syntax/ of the IR can remain the same, just with > the semantics you're describing. (ie: that DW_MACINFO_define in the IR > doesn't necessarily mean you'll get a DW_MACINFO_define in the output (you > might get a DW_MACINFO_define, maybe a DW_MACINFO_define_strx, etc, etc... ) >To summarize: Adding new Generic macro type to metadata IR, and removing old ones seems a clean and promising solution. This also enforces/depicts that IR metadata is not dependent on the target consumer DWARF or other. Thanks, Sourabh. On Fri, Jan 3, 2020 at 4:01 AM David Blaikie <dblaikie at gmail.com> wrote:> > > On Thu, Jan 2, 2020 at 6:03 AM Sourabh Singh Tomar <sourav0311 at gmail.com> > wrote: > >> Hello Everyone, >> >> >> >> I would like to have your thoughts on this. >> >> >> >> Overview of the problem >> >> ==================>> >> While implementing support for the DWARFv5 debug_macro section support in >> LLVM. I came across these holes: >> >> - The macros infrastructure in CLANG/LLVM is inherently tied to a >> particular version(v4 macinfo). For instance, consider this snippet from >> CLANG: >> >> Gen->getCGDebugInfo()->CreateMacro(getCurrentScope(), llvm::dwarf::DW_MACINFO_define, >>> location, Name.str(), Value.str()). >> >> >> - So, let's say you want to add support for a new form >> /DW_MACRO_define_strx/, with existing infrastructure, you can't do this >> without polluting code with lot of if-else. This is just to highlight the >> problem, there are numerous areas where we will have to add if-else. >> - DWARF forms[DW_MACINFO*] are unnecessarily hard-coded in >> CLANG, resulting in a false dependence of CLANG/LLVM on entire generation >> phase of debug info for macros, even when they don't have to. >> - To worsen the problem even more, we have some DWARF forms sharing >> same encoding. Currently, we have 4-forms that share same encoding with >> previous DWARFv4. >> >> > 4 forms that share the same encoding in DWARFv4 - you mean the define, > define_strp, define_strx, and define_sup? > > I think that the solution here is probably not to change the IR > representation - IR doesn't have the semantics that necessitate > strip/strx/sup encodings (there's no indexed string section at the IR level > - string sharing is a lower level implementation detail at the IR Level > that's not apparent at the IR Level). These are choices the backend would > make about how to efficiently encode the strings into the final output. > > All this is to say that I agree with you about the semantics of your > proposal, but I think the /syntax/ of the IR can remain the same, just with > the semantics you're describing. (ie: that DW_MACINFO_define in the IR > doesn't necessarily mean you'll get a DW_MACINFO_define in the output (you > might get a DW_MACINFO_define, maybe a DW_MACINFO_define_strx, etc, etc... ) > > >> >> >> High Level Summary of Solution >> >> =======================>> >> To deal with this and to facilitate addition of new future DWARF forms >> and other stuff, I propose following changes to be done: >> >> · Segregate CLANG entirely from macro debug info part. This has been >> done by introducing a new Generic Macro Types. Since macro information can >> be broadly classified into 4 categories: >> >> 1. Definition >> >> 2. Un-definition >> >> 3. File inclusion >> >> 4. File termination >> >> So I propose to introduce 4 new Generic Macro Types to be used as types, >> with no dependence to various DWARFv5/v4 forms. >> >> >> 1. MACRO_definition >> 2. MACRO_undef >> 3. MACRO_start_file >> 4. MACRO_end_file >> >> >> - Previous infrastructure uses DWARF forms as types in metadata -- >> >>> !DIMacro(type: DW_MACINFO_define, line: 9, name: "Name", value: "Value") >> >> And the production part will query this and emits the form >> present[hardcoded]. >> >> >> >> Upgraded infrastructure will be using something like -- >> >>> !DIMacro(type: MACRO_definition, line: 9, name: "Name", value: "Value") >> >> Now production part will query the Generic Macro Type present in metadata >> and based on the version of DWARF specified, it will emit forms accordingly. >> >> >> >> - Now this should cater to all our needs of LLVM and debug-info >> production-consumption WRT to macro. >> >> >> >> This approach solves the problem highlighted above and simplifies the >> code, increases code/infrastructure reuse for building DWARFv5 macro and >> future additons. >> >> >> I have a patch[implementing this] ready, currently supporting DWARFv4 >> macro only. Planning to build rest of DWARFv5 on top of this. Could you >> guys let me know your thoughts and how to proceed forward with this. Should >> we need an RFC for it have a better discussion? >> >> >> >> Thanks in anticipation, >> >> Sourabh. >> >> >> >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20200103/d6a7c3c0/attachment.html>