thr3ads.net - llvm dev - [llvm-dev] Query/Suggestions on upgrading macro infrastructure. [Jan 2020]

If this information is useful, please help other people find it:
Share via:

Sourabh Singh Tomar via llvm-dev

2020-Jan-02 14:03 UTC

[llvm-dev] Query/Suggestions on upgrading macro infrastructure.

Hello Everyone,



I would like to have your thoughts on this.



Overview of the problem

==================
While implementing support for the DWARFv5 debug_macro section support in
LLVM. I came across these holes:

   -  The macros infrastructure in CLANG/LLVM is inherently tied to a
   particular version(v4 macinfo). For instance, consider this snippet from
   CLANG:

Gen->getCGDebugInfo()->CreateMacro(getCurrentScope(),
llvm::dwarf::DW_MACINFO_define,> location, Name.str(), Value.str()).

   - So, let's say you want to add support for a new form
      /DW_MACRO_define_strx/, with existing infrastructure, you can't do
this
      without polluting code with lot of if-else. This is just to highlight the
      problem, there are numerous areas where we will have to add if-else.
      -  DWARF  forms[DW_MACINFO*] are unnecessarily hard-coded in CLANG,
   resulting in a false dependence of CLANG/LLVM on entire generation phase of
   debug info for macros, even when they don't have to.
   -  To worsen the problem even more, we have some DWARF forms sharing
   same encoding. Currently, we have 4-forms that share same encoding with
   previous DWARFv4.





High Level Summary of Solution

=======================
To deal with this and to facilitate addition of new future DWARF forms
and other stuff, I propose following changes to be done:

·  Segregate CLANG entirely from macro debug info part. This has been done
by introducing a new Generic Macro Types. Since macro information can be
broadly classified into 4 categories:

1.     Definition

2.     Un-definition

3.     File inclusion

4.     File termination

So I propose to introduce 4 new Generic Macro Types  to be used as types,
with no dependence to various DWARFv5/v4 forms.


   1. MACRO_definition
   2. MACRO_undef
   3. MACRO_start_file
   4. MACRO_end_file


- Previous infrastructure uses DWARF forms as types in metadata --
> !DIMacro(type: DW_MACINFO_define, line: 9, name: "Name", value:
"Value")
And the production part will query this and emits the form
present[hardcoded].



Upgraded infrastructure will be using something like --
> !DIMacro(type: MACRO_definition, line: 9, name: "Name", value:
"Value")
Now production part will query the Generic Macro Type present in metadata
and based on the version of DWARF specified, it will emit forms accordingly.



- Now this should cater to all our needs of LLVM and debug-info
production-consumption WRT to macro.



This approach solves the problem highlighted above and simplifies the code,
increases code/infrastructure reuse for building DWARFv5 macro and future
additons.


I have a patch[implementing this] ready, currently supporting DWARFv4 macro
only. Planning to build rest of DWARFv5 on top of this. Could you guys let
me know your thoughts and how to proceed forward with this. Should we need
an RFC for it have a better discussion?



Thanks in anticipation,

Sourabh.
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20200102/b69b2ee2/attachment.html>

David Blaikie via llvm-dev

2020-Jan-02 22:31 UTC

head link

[llvm-dev] Query/Suggestions on upgrading macro infrastructure.

On Thu, Jan 2, 2020 at 6:03 AM Sourabh Singh Tomar <sourav0311 at
gmail.com>
wrote:
> Hello Everyone,
>
>
>
> I would like to have your thoughts on this.
>
>
>
> Overview of the problem
>
> ==================>
> While implementing support for the DWARFv5 debug_macro section support in
> LLVM. I came across these holes:
>
>    -  The macros infrastructure in CLANG/LLVM is inherently tied to a
>    particular version(v4 macinfo). For instance, consider this snippet from
>    CLANG:
>
> Gen->getCGDebugInfo()->CreateMacro(getCurrentScope(),
llvm::dwarf::DW_MACINFO_define,
>> location, Name.str(), Value.str()).
>
>
>    - So, let's say you want to add support for a new form
>       /DW_MACRO_define_strx/, with existing infrastructure, you can't
do this
>       without polluting code with lot of if-else. This is just to highlight
the
>       problem, there are numerous areas where we will have to add if-else.
>       -  DWARF  forms[DW_MACINFO*] are unnecessarily hard-coded in CLANG,
>    resulting in a false dependence of CLANG/LLVM on entire generation phase
of
>    debug info for macros, even when they don't have to.
>    -  To worsen the problem even more, we have some DWARF forms sharing
>    same encoding. Currently, we have 4-forms that share same encoding with
>    previous DWARFv4.
>
>4 forms that share the same encoding in DWARFv4 - you mean the define,
define_strp, define_strx, and define_sup?

I think that the solution here is probably not to change the IR
representation - IR doesn't have the semantics that necessitate
strip/strx/sup encodings (there's no indexed string section at the IR level
- string sharing is a lower level implementation detail at the IR Level
that's not apparent at the IR Level). These are choices the backend would
make about how to efficiently encode the strings into the final output.

All this is to say that I agree with you about the semantics of your
proposal, but I think the /syntax/ of the IR can remain the same, just with
the semantics you're describing. (ie: that DW_MACINFO_define in the IR
doesn't necessarily mean you'll get a DW_MACINFO_define in the output
(you
might get a DW_MACINFO_define, maybe a DW_MACINFO_define_strx, etc, etc... )

>
>
>  High Level Summary of Solution
>
> =======================>
> To deal with this and to facilitate addition of new future DWARF forms
> and other stuff, I propose following changes to be done:
>
> ·  Segregate CLANG entirely from macro debug info part. This has been
> done by introducing a new Generic Macro Types. Since macro information can
> be  broadly classified into 4 categories:
>
> 1.     Definition
>
> 2.     Un-definition
>
> 3.     File inclusion
>
> 4.     File termination
>
> So I propose to introduce 4 new Generic Macro Types  to be used as types,
> with no dependence to various DWARFv5/v4 forms.
>
>
>    1. MACRO_definition
>    2. MACRO_undef
>    3. MACRO_start_file
>    4. MACRO_end_file
>
>
> - Previous infrastructure uses DWARF forms as types in metadata --
>
>> !DIMacro(type: DW_MACINFO_define, line: 9, name: "Name",
value: "Value")
>
> And the production part will query this and emits the form
> present[hardcoded].
>
>
>
> Upgraded infrastructure will be using something like --
>
>> !DIMacro(type: MACRO_definition, line: 9, name: "Name",
value: "Value")
>
> Now production part will query the Generic Macro Type present in metadata
> and based on the version of DWARF specified, it will emit forms
accordingly.
>
>
>
> - Now this should cater to all our needs of LLVM and debug-info
> production-consumption WRT to macro.
>
>
>
> This approach solves the problem highlighted above and simplifies the
> code, increases code/infrastructure reuse for building DWARFv5 macro and
> future additons.
>
>
> I have a patch[implementing this] ready, currently supporting DWARFv4
> macro only. Planning to build rest of DWARFv5 on top of this. Could you
> guys let me know your thoughts and how to proceed forward with this. Should
> we need an RFC for it have a better discussion?
>
>
>
> Thanks in anticipation,
>
> Sourabh.
>
>
>-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20200102/47a7077f/attachment.html>

Sourabh Singh Tomar via llvm-dev

2020-Jan-03 14:47 UTC

head link

[llvm-dev] Query/Suggestions on upgrading macro infrastructure.

Hi David,

Thanks for sharing your thoughts/suggestions on this.

4 forms that share the same encoding in DWARFv4 - you mean the
define,> define_strp, define_strx, and define_sup?
These are:
 DW_MACINFO_define      = DW_MACRO_define
 DW_MACINFO_undef       = DW_MACRO_undef
 DW_MACINFO_start_file  = DW_MACRO_start_file
 DW_MACINFO_end_file   = DW_MACRO_end_file


I think that the solution here is probably not to change the
IR> representation - IR doesn't have the semantics that necessitate
> strip/strx/sup encodings (there's no indexed string section at the IR
level
> - string sharing is a lower level implementation detail at the IR Level
> that's not apparent at the IR Level). These are choices the backend
would
> make about how to efficiently encode the strings into the final output.
Agreed. IR doesn't have the semantics that necessitate strp/strx/sup
encodings, and don't have too. However considering the most productive
macro-string representation form /DW_FORM_define_strx/ and taking the
existing infrastructure, emission psuedo code will be :

 if (Version == 4 && getMacinfoType() == 
DW_MACINFO_define)> emit DW_MACINFO_define
>    if (Version == 5 && getMacinfoType() ==  DW_MACINFO_define)
> emit DW_MACRO_define_strx
 Now here the presence of "DW_MACINFO_define" here seems unfair/odd,
may
confuse the user/developer.

With new types, we can have much cleaner code:
> if (getGenericMacroType() == MACRO_definition)  /* IR types doesn't
> show/share any false dependence on how back-end DWARF or other will consume
> this*/
> if (Version == 5) emit "DW_MACRO_define_strx"
> else emit "DW_MACRO_define"

Even at the IR level, user will have confusion. When in .ll file we have
"DW_MACINFO_define" and subsequent binary contains
"DW_MACRO_define_strx".
This again, is due to the fact that "DW_MACINFO_define" is defined in
the
DWARF spec, resulting in a false dependence/representation. At least
visible to user familiar with DWARF.

Yet another motivating example: Exactly same thing happens with
"DW_MACINFO_start_file". We'll have
"DW_MACINFO_start_file" in .ll file and
subsequent binary will contain "DW_MACRO_start_file". Consider this
snippet
from llvm:
>     auto *MF = DIMacroFile::get(VMContext,
> dwarf::DW_MACINFO_start_file, TMF->getLine(), TMF->getFile(),
> getOrCreateMacroArray(I.second.getArrayRef()));
>      replaceTemporary(llvm::TempDIMacroNode(TMF), MF);

Now based on this, emission psuedo code will be:
>    if (Version == 4 && getMacinfoType() ==  DW_MACINFO_start_file)
> emit "DW_MACINFO_start_file"
>    if (Version == 5 && getMacinfoType() ==  DW_MACINFO_start_file)
> emit "DW_MACRO_start_file"

All this is to say that I agree with you about the semantics of
your> proposal, but I think the /syntax/ of the IR can remain the same, just with
> the semantics you're describing. (ie: that DW_MACINFO_define in the IR
> doesn't necessarily mean you'll get a DW_MACINFO_define in the
output (you
> might get a DW_MACINFO_define, maybe a DW_MACINFO_define_strx, etc, etc...
)
>

To summarize: Adding new Generic macro type to metadata IR, and removing
old ones seems a clean and promising solution. This also enforces/depicts
that IR metadata is not dependent on the target consumer DWARF or other.

Thanks,
Sourabh.


On Fri, Jan 3, 2020 at 4:01 AM David Blaikie <dblaikie at gmail.com>
wrote:
>
>
> On Thu, Jan 2, 2020 at 6:03 AM Sourabh Singh Tomar <sourav0311 at
gmail.com>
> wrote:
>
>> Hello Everyone,
>>
>>
>>
>> I would like to have your thoughts on this.
>>
>>
>>
>> Overview of the problem
>>
>> ==================>>
>> While implementing support for the DWARFv5 debug_macro section support
in
>> LLVM. I came across these holes:
>>
>>    -  The macros infrastructure in CLANG/LLVM is inherently tied to a
>>    particular version(v4 macinfo). For instance, consider this snippet
from
>>    CLANG:
>>
>> Gen->getCGDebugInfo()->CreateMacro(getCurrentScope(),
llvm::dwarf::DW_MACINFO_define,
>>> location, Name.str(), Value.str()).
>>
>>
>>    - So, let's say you want to add support for a new form
>>       /DW_MACRO_define_strx/, with existing infrastructure, you
can't do this
>>       without polluting code with lot of if-else. This is just to
highlight the
>>       problem, there are numerous areas where we will have to add
if-else.
>>       -  DWARF  forms[DW_MACINFO*] are unnecessarily hard-coded in
>>    CLANG, resulting in a false dependence of CLANG/LLVM on entire
generation
>>    phase of debug info for macros, even when they don't have to.
>>    -  To worsen the problem even more, we have some DWARF forms sharing
>>    same encoding. Currently, we have 4-forms that share same encoding
with
>>    previous DWARFv4.
>>
>>
> 4 forms that share the same encoding in DWARFv4 - you mean the define,
> define_strp, define_strx, and define_sup?
>
> I think that the solution here is probably not to change the IR
> representation - IR doesn't have the semantics that necessitate
> strip/strx/sup encodings (there's no indexed string section at the IR
level
> - string sharing is a lower level implementation detail at the IR Level
> that's not apparent at the IR Level). These are choices the backend
would
> make about how to efficiently encode the strings into the final output.
>
> All this is to say that I agree with you about the semantics of your
> proposal, but I think the /syntax/ of the IR can remain the same, just with
> the semantics you're describing. (ie: that DW_MACINFO_define in the IR
> doesn't necessarily mean you'll get a DW_MACINFO_define in the
output (you
> might get a DW_MACINFO_define, maybe a DW_MACINFO_define_strx, etc, etc...
)
>
>
>>
>>
>>  High Level Summary of Solution
>>
>> =======================>>
>> To deal with this and to facilitate addition of new future DWARF forms
>> and other stuff, I propose following changes to be done:
>>
>> ·  Segregate CLANG entirely from macro debug info part. This has been
>> done by introducing a new Generic Macro Types. Since macro information
can
>> be  broadly classified into 4 categories:
>>
>> 1.     Definition
>>
>> 2.     Un-definition
>>
>> 3.     File inclusion
>>
>> 4.     File termination
>>
>> So I propose to introduce 4 new Generic Macro Types  to be used as
types,
>> with no dependence to various DWARFv5/v4 forms.
>>
>>
>>    1. MACRO_definition
>>    2. MACRO_undef
>>    3. MACRO_start_file
>>    4. MACRO_end_file
>>
>>
>> - Previous infrastructure uses DWARF forms as types in metadata --
>>
>>> !DIMacro(type: DW_MACINFO_define, line: 9, name: "Name",
value: "Value")
>>
>> And the production part will query this and emits the form
>> present[hardcoded].
>>
>>
>>
>> Upgraded infrastructure will be using something like --
>>
>>> !DIMacro(type: MACRO_definition, line: 9, name: "Name",
value: "Value")
>>
>> Now production part will query the Generic Macro Type present in
metadata
>> and based on the version of DWARF specified, it will emit forms
accordingly.
>>
>>
>>
>> - Now this should cater to all our needs of LLVM and debug-info
>> production-consumption WRT to macro.
>>
>>
>>
>> This approach solves the problem highlighted above and simplifies the
>> code, increases code/infrastructure reuse for building DWARFv5 macro
and
>> future additons.
>>
>>
>> I have a patch[implementing this] ready, currently supporting DWARFv4
>> macro only. Planning to build rest of DWARFv5 on top of this. Could you
>> guys let me know your thoughts and how to proceed forward with this.
Should
>> we need an RFC for it have a better discussion?
>>
>>
>>
>> Thanks in anticipation,
>>
>> Sourabh.
>>
>>
>>
>-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20200103/d6a7c3c0/attachment.html>

Seemingly Similar Threads

Search for more seemingly similar threads

llvm dev - Jan 2020 - Query/Suggestions on upgrading macro infrastructure.

[llvm-dev] Query/Suggestions on upgrading macro infrastructure.

[llvm-dev] Query/Suggestions on upgrading macro infrastructure.

[llvm-dev] Query/Suggestions on upgrading macro infrastructure.

Seemingly Similar Threads