thr3ads.net - llvm dev - [LLVMdev] RFC: Improving our DWARF (and ELF) emission testing capabilities [Jan 2013]

Eli Bendersky

2013-Jan-18 21:00 UTC

[LLVMdev] RFC: Improving our DWARF (and ELF) emission testing capabilities

Hi All,

While working on some recent patches for x32 support, I ran into an
unpleasant limitation the LLVM eco-system has with testing DWARF
emission. We currently have several approaches, none of which is
great:

1. llvm-dwarfdump: the best approach when it works. But unfortunately
lib/DebugInfo supports only a (small) subset of DWARF. Tricky sections
like debug_frame aren't supported.
2. Relying on assembly directive emissions (i.e. .cfi_*), which is
cumbersome and misses a lot of things like actual DWARF encoding.
3. Using elf-dump and examining the raw binary dumps. This makes tests
nearly unmaintainable.

The latter is also why IMHO our ELF emission in general isn't well
tested. elf-dump is just too rudimentary and relies on simple (=dumb)
binary contents dumps.

The long-term solution for DWARF would be to enhance lib/DebugInfo to
the point where it can handle all interesting DWARF sections. But this
is a lofty goal, since DWARF parsing is notoriously hard and this
would require a large investment of time and effort. And in the
meantime, we just don't write good enough tests (and enough of them)
for this very important feature.

Therefore, as an interim stage, I propose to adopt some external tool
that parses DWARF and emits decoded textual dumps which makes tests
easy to write.

Concretely, I have a pure Python library named pyelftools
(https://bitbucket.org/eliben/pyelftools) which provides comprehensive
ELF and DWARF parsing capabilities and has a dumper that's fully
compatible with the readelf command. Using pyelftools would allow us
to immediately improve the quality of our tests, and as lib/DebugInfo
matures llvm-dwarfdump can gradually replace the dumper without
changing the actual tests.

pyelftools is relatively widely used, so it's well tested; all it
requires is Python 2.6 or higher, and its code is in the public
domain. So it can live in tools/ or test/Scripts or wherever and be
distributed with LLVM. I actively maintain it and hacking it to LLVM's
purposes should be relatively easy. As a bonus, it has a much smarter
ELF parser & dumper that can replace the ad-hoc elf-dump. It has also
been successfully adapted in the past to read DWARF from MachO files,
if that's required.
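To make the contrast with a raw binary dump concrete, here is a minimal Python 3 sketch of what a decoded ELF header dump could look like. It is not code from pyelftools or elf-dump; the names decode_elf_header, format_header and SAMPLE are made up for illustration, and only the standard ELF32/ELF64 header layouts are assumed. The sample is an x32-style header (ELFCLASS32 combined with machine EM_X86_64).

```python
import struct

# Fields that follow the 16-byte e_ident, keyed by EI_CLASS (1 = 32-bit, 2 = 64-bit).
_LAYOUTS = {1: "HHIIIIIHHHHHH", 2: "HHIQQQIHHHHHH"}
_FIELDS = ("e_type", "e_machine", "e_version", "e_entry", "e_phoff",
           "e_shoff", "e_flags", "e_ehsize", "e_phentsize", "e_phnum",
           "e_shentsize", "e_shnum", "e_shstrndx")
_TYPES = {0: "ET_NONE", 1: "ET_REL", 2: "ET_EXEC", 3: "ET_DYN", 4: "ET_CORE"}
_MACHINES = {3: "EM_386", 62: "EM_X86_64"}  # deliberately tiny table


def decode_elf_header(data):
    """Return the ELF header fields of `data` (bytes) as a dict."""
    if data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    ei_class, ei_data = data[4], data[5]
    if ei_class not in _LAYOUTS or ei_data not in (1, 2):
        raise ValueError("unsupported EI_CLASS or EI_DATA")
    endian = "<" if ei_data == 1 else ">"
    values = struct.unpack_from(endian + _LAYOUTS[ei_class], data, 16)
    header = dict(zip(_FIELDS, values))
    header["class"] = "ELF32" if ei_class == 1 else "ELF64"
    header["data"] = "little-endian" if ei_data == 1 else "big-endian"
    return header


def format_header(header):
    """Render a few header fields as text lines a test could match."""
    return [
        "Class:   %s" % header["class"],
        "Data:    %s" % header["data"],
        "Type:    %s" % _TYPES.get(header["e_type"], hex(header["e_type"])),
        "Machine: %s" % _MACHINES.get(header["e_machine"],
                                      hex(header["e_machine"])),
        "Section headers: %d at offset 0x%x" % (header["e_shnum"],
                                                header["e_shoff"]),
    ]


# Hand-built sample: a relocatable ELF32 little-endian object for EM_X86_64.
SAMPLE = (b"\x7fELF" + bytes([1, 1, 1, 0]) + bytes(8) +
          struct.pack("<HHIIIIIHHHHHH",
                      1, 62, 1, 0, 0, 0x100, 0, 52, 0, 0, 40, 5, 4))

if __name__ == "__main__":
    print("\n".join(format_header(decode_elf_header(SAMPLE))))
```

A test that matches lines such as "Machine: EM_X86_64" stays readable when the object layout changes, which is the property the raw dumps lack.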

Eli

David Blaikie

2013-Jan-18 21:29 UTC

+ other debug info people (Eric & Paul)

On Fri, Jan 18, 2013 at 1:00 PM, Eli Bendersky <eliben at google.com> wrote:
> Hi All,
>
> While working on some recent patches for x32 support, I ran into an
> unpleasant limitation the LLVM eco-system has with testing DWARF
> emission. We currently have several approaches, neither of which is
> great:
>
> 1. llvm-dwarfdump: the best approach when it works. But unfortunately
> lib/DebugInfo supports only a (small) subset of DWARF. Tricky sections
> like debug_frame aren't supported.
Ideally I'd like to see support added whenever a code change is made
to a feature. So long as we hold ourselves to a "test new changes"
standard, that can gate/encourage the necessary feature support in
llvm-dwarfdump.

Since no one's likely to go back & write a bunch of regression tests
for all the existing code it seems premature to add new features to
llvm-dwarfdump before there's a use-case. It does sometimes mean bug
fixes appear to be costly because they include adding the missing test
infrastructure support, but that's essentially where the cost is
anyway.
> 2. Relying of assembly directive emissions (i.e. .cfi_*), which is
> cumbersome and misses a lot of things like actual DWARF encoding.
I'm not sure what you mean by "actual DWARF encoding" here.
(disclaimer: I've only recently started dabbling with debug info, so I
may be missing obvious things)
> 3. Using elf-dump and examining the raw binary dumps. This makes tests
> nearly unmaintainable.
>
> The latter is also why IMHO our ELF emission in general isn't well
> tested. elf-dump is just too rudimentary and relies on simple (=dumb)
> binary contents dumps.
>
> The long-term solution for DWARF would be to enhance lib/DebugInfo to
> the point where it can handle all interesting DWARF sections. But this
> is a lofty goal, since DWARF parsing is notoriously hard and this
> would require a large investment of time and effort. And in the
> meantime, we just don't write good enough tests (and enough of them)
> for this very important feature.
Are there particular recent commits you've been concerned about the
test quality of? I've been trying to keep an eye on this but, again,
don't necessarily fully understand the ramifications of some changes.
> Therefore, as an interim stage, I propose to adopt some external tool
> that parses DWARF and emits decoded textual dumps which makes tests
> easy to write.
>
> Concretely, I have a pure Python library named pyelftools
> (https://bitbucket.org/eliben/pyelftools) which provides comprehensive
> ELF and DWARF parsing capabilities and has a dumper that's fully
> compatible with the readelf command. Using pyelftools would allow us
> to immediately improve the quality of our tests, and as lib/DebugInfo
> matures llvm-dwarfdump can gradually replace the dumper without
> changing the actual tests.
I would be a little hesitant about test execution performance if it
involved invoking new python processes for each debug info test. But
numbers could convince me. Beyond that I can't rationally claim any
particular need to support llvm-dwarfdump as the tool of choice over
any 3rd party tool.
> pyelftools is relatively widely used so it's well tested, all it
> requires is Python 2.6 and higher, and its code is in the public
> domain. So it can live in tools/ or test/Scripts or wherever and be
> distributed with LLVM. I actively maintain it and hacking it to LLVM's
> purposes should be relatively easy. As a bonus, it has a much smarter
> ELF parser & dumper that can replace the ad-hoc elf-dump. It has also
> been successfully adapted in the past to read DWARF from MachO files,
> if that's required.
>
> Eli

Eli Bendersky

2013-Jan-18 21:50 UTC

>> 1. llvm-dwarfdump: the best approach when it works. But unfortunately
>> lib/DebugInfo supports only a (small) subset of DWARF. Tricky sections
>> like debug_frame aren't supported.
>
> Ideally I'd like to see support added whenever a code change is made
> to a feature - so long as we hold ourselves to a "test new changes"
> that can gate/encourage the necessary feature support in
> llvm-dwarfdump.
>
> Since no one's likely to go back & write a bunch of regression tests
> for all the existing code it seems premature to add new features to
> llvm-dwarfdump before there's a use-case. It does sometimes mean bug
> fixes appear to be costly because they include adding the missing test
> infrastructure support, but that's essentially where the cost is
> anyway.
See test/MC/ELF/cfi-register.s for a test I consider unmaintainable
since it just matches an elf-dump and requires manual decoding of the
data for every change and addition. When tests are too hard to write,
fewer tests get written.
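To show the sort of manual decoding such a test demands, here is a minimal Python 3 sketch that turns raw call frame instruction bytes into text. It covers only a handful of DW_CFA opcodes, is not code from elf-dump, pyelftools or llvm-dwarfdump, and the name decode_cfa and the default data alignment factor of -8 (typical for x86-64) are assumptions made for the example.

```python
def read_uleb128(data, pos):
    """Decode an unsigned LEB128 value; return (value, next_pos)."""
    result = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def decode_cfa(data, data_align=-8):
    """Decode a subset of DWARF call frame instructions into text lines."""
    lines = []
    pos = 0
    while pos < len(data):
        op = data[pos]
        pos += 1
        high, low = op & 0xC0, op & 0x3F
        if high == 0x40:
            # Delta is in code-alignment units; printed unscaled here.
            lines.append("DW_CFA_advance_loc: %d" % low)
        elif high == 0x80:
            factored, pos = read_uleb128(data, pos)
            lines.append("DW_CFA_offset: r%d at cfa%+d"
                         % (low, factored * data_align))
        elif high == 0xC0:
            lines.append("DW_CFA_restore: r%d" % low)
        elif op == 0x00:
            lines.append("DW_CFA_nop")
        elif op == 0x0C:
            reg, pos = read_uleb128(data, pos)
            offset, pos = read_uleb128(data, pos)
            lines.append("DW_CFA_def_cfa: r%d ofs %d" % (reg, offset))
        elif op == 0x0D:
            reg, pos = read_uleb128(data, pos)
            lines.append("DW_CFA_def_cfa_register: r%d" % reg)
        elif op == 0x0E:
            offset, pos = read_uleb128(data, pos)
            lines.append("DW_CFA_def_cfa_offset: %d" % offset)
        else:
            # Anything else is outside this sketch; stop rather than guess.
            lines.append("unhandled opcode 0x%02x" % op)
            break
    return lines


if __name__ == "__main__":
    # Bytes of the kind a test would otherwise match as an opaque hex string.
    for line in decode_cfa(bytes.fromhex("0c07089001410e108602")):
        print(line)
```

With output in this form, a test can match "DW_CFA_def_cfa_offset: 16" directly instead of the byte pair 0e 10.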
>> 2. Relying of assembly directive emissions (i.e. .cfi_*), which is
>> cumbersome and misses a lot of things like actual DWARF encoding.
>
> I'm not sure what you mean by "actual DWARF encoding" here.
> (disclaimer: I've only recently started dabbling with debug info, so I
> may be missing obvious things)
I mean that it doesn't test the whole way, and there's quite a bit of
DWARF-related functionality in MC. So when a test relies on matching
directives in ASM output, there's quite a bit of code in MC it doesn't
exercise.
>
>> 3. Using elf-dump and examining the raw binary dumps. This makes tests
>> nearly unmaintainable.
>>
>> The latter is also why IMHO our ELF emission in general isn't well
>> tested. elf-dump is just too rudimentary and relies on simple (=dumb)
>> binary contents dumps.
>>
>> The long-term solution for DWARF would be to enhance lib/DebugInfo to
>> the point where it can handle all interesting DWARF sections. But this
>> is a lofty goal, since DWARF parsing is notoriously hard and this
>> would require a large investment of time and effort. And in the
>> meantime, we just don't write good enough tests (and enough of them)
>> for this very important feature.
>
> Are there particular recent commits you've been concerned about the
> test quality of? I've been trying to keep an eye on this but, again,
> don't necessarily fully understand the ramifications of some changes.
See basically every test employing elf-dump for non-trivial things.
>
>> Therefore, as an interim stage, I propose to adopt some external tool
>> that parses DWARF and emits decoded textual dumps which makes tests
>> easy to write.
>>
>> Concretely, I have a pure Python library named pyelftools
>> (https://bitbucket.org/eliben/pyelftools) which provides comprehensive
>> ELF and DWARF parsing capabilities and has a dumper that's fully
>> compatible with the readelf command. Using pyelftools would allow us
>> to immediately improve the quality of our tests, and as lib/DebugInfo
>> matures llvm-dwarfdump can gradually replace the dumper without
>> changing the actual tests.
>
> I would be a little hesitant about test execution performance if
> involved invoking new python processes for each debug info test. But
> numbers could convince me. Beyond that I can't rationally claim any
> particular need to support llvm-dwarfdump as the tool of choice over
> any 3rd party tool.
This is already done with elf-dump (a Python script) which is used for
a lot of tests for lack of better options.


Eli

Michael Spencer

2013-Jan-18 22:17 UTC

On Fri, Jan 18, 2013 at 1:00 PM, Eli Bendersky <eliben at google.com> wrote:
> Hi All,
>
> While working on some recent patches for x32 support, I ran into an
> unpleasant limitation the LLVM eco-system has with testing DWARF
> emission. We currently have several approaches, neither of which is
> great:
>
> 1. llvm-dwarfdump: the best approach when it works. But unfortunately
> lib/DebugInfo supports only a (small) subset of DWARF. Tricky sections
> like debug_frame aren't supported.
> 2. Relying of assembly directive emissions (i.e. .cfi_*), which is
> cumbersome and misses a lot of things like actual DWARF encoding.
> 3. Using elf-dump and examining the raw binary dumps. This makes tests
> nearly unmaintainable.
>
> The latter is also why IMHO our ELF emission in general isn't well
> tested. elf-dump is just too rudimentary and relies on simple (=dumb)
> binary contents dumps.
>
> The long-term solution for DWARF would be to enhance lib/DebugInfo to
> the point where it can handle all interesting DWARF sections. But this
> is a lofty goal, since DWARF parsing is notoriously hard and this
> would require a large investment of time and effort. And in the
> meantime, we just don't write good enough tests (and enough of them)
> for this very important feature.
>
> Therefore, as an interim stage, I propose to adopt some external tool
> that parses DWARF and emits decoded textual dumps which makes tests
> easy to write.
>
> Concretely, I have a pure Python library named pyelftools
> (https://bitbucket.org/eliben/pyelftools) which provides comprehensive
> ELF and DWARF parsing capabilities and has a dumper that's fully
> compatible with the readelf command. Using pyelftools would allow us
> to immediately improve the quality of our tests, and as lib/DebugInfo
> matures llvm-dwarfdump can gradually replace the dumper without
> changing the actual tests.
>
> pyelftools is relatively widely used so it's well tested, all it
> requires is Python 2.6 and higher, and its code is in the public
> domain. So it can live in tools/ or test/Scripts or wherever and be
> distributed with LLVM. I actively maintain it and hacking it to LLVM's
> purposes should be relatively easy. As a bonus, it has a much smarter
> ELF parser & dumper that can replace the ad-hoc elf-dump. It has also
> been successfully adapted in the past to read DWARF from MachO files,
> if that's required.
>
> Eli
I'm fine with this as long as llvm-dwarfdump gets maintained.

The only problem is that LLVM does not require Python 2.6; I think the
min version is still 2.4. Although I would love to move to 2.6 :P

- Michael Spencer

Eli Bendersky

2013-Jan-18 22:37 UTC

>> 1. llvm-dwarfdump: the best approach when it works. But unfortunately
>> lib/DebugInfo supports only a (small) subset of DWARF. Tricky sections
>> like debug_frame aren't supported.
>> 2. Relying of assembly directive emissions (i.e. .cfi_*), which is
>> cumbersome and misses a lot of things like actual DWARF encoding.
>> 3. Using elf-dump and examining the raw binary dumps. This makes tests
>> nearly unmaintainable.
>>
>> The latter is also why IMHO our ELF emission in general isn't well
>> tested. elf-dump is just too rudimentary and relies on simple (=dumb)
>> binary contents dumps.
>>
>> The long-term solution for DWARF would be to enhance lib/DebugInfo to
>> the point where it can handle all interesting DWARF sections. But this
>> is a lofty goal, since DWARF parsing is notoriously hard and this
>> would require a large investment of time and effort. And in the
>> meantime, we just don't write good enough tests (and enough of them)
>> for this very important feature.
>>
>> Therefore, as an interim stage, I propose to adopt some external tool
>> that parses DWARF and emits decoded textual dumps which makes tests
>> easy to write.
>>
>> Concretely, I have a pure Python library named pyelftools
>> (https://bitbucket.org/eliben/pyelftools) which provides comprehensive
>> ELF and DWARF parsing capabilities and has a dumper that's fully
>> compatible with the readelf command. Using pyelftools would allow us
>> to immediately improve the quality of our tests, and as lib/DebugInfo
>> matures llvm-dwarfdump can gradually replace the dumper without
>> changing the actual tests.
>>
>> pyelftools is relatively widely used so it's well tested, all it
>> requires is Python 2.6 and higher, and its code is in the public
>> domain. So it can live in tools/ or test/Scripts or wherever and be
>> distributed with LLVM. I actively maintain it and hacking it to LLVM's
>> purposes should be relatively easy. As a bonus, it has a much smarter
>> ELF parser & dumper that can replace the ad-hoc elf-dump. It has also
>> been successfully adapted in the past to read DWARF from MachO files,
>> if that's required.
>>
>> Eli
>
> I'm fine with this as long as llvm-dwarfdump gets maintained.
>
I agree, and as I said in the original email, in the long term I
believe llvm-dwarfdump is the correct solution.
> The only problem is that LLVM does not require Python 2.6, I think the
> min version is still 2.4. Although I would love to move to 2.6 :P
Was this not covered by a previous discussion? I had the feeling it
was decided that 2.6 was OK to require, since it's simple to install
on platforms that don't ship it by default.

Eli

Daniel Berlin

2013-Jan-22 23:35 UTC

On Fri, Jan 18, 2013 at 4:00 PM, Eli Bendersky <eliben at google.com> wrote:
> Hi All,
>
> While working on some recent patches for x32 support, I ran into an
> unpleasant limitation the LLVM eco-system has with testing DWARF
> emission. We currently have several approaches, neither of which is
> great:
>
> 1. llvm-dwarfdump: the best approach when it works. But unfortunately
> lib/DebugInfo supports only a (small) subset of DWARF. Tricky sections
> like debug_frame aren't supported.
Could you point out what you mean?
In particular, what parts you think it does not support (since you say
it supports a small subset).
What do you want out of debug_frame, past simple parsing?
Anything else requires real evaluation.

I ask because I wrote a DWARF reader that google uses internally, and
then was open sourced and contributed to google breakpad.
(see
http://code.google.com/p/google-breakpad/source/browse/trunk/src/common/dwarf/,
in particular dwarf2reader.cc).
> 2. Relying of assembly directive emissions (i.e. .cfi_*), which is
> cumbersome and misses a lot of things like actual DWARF encoding.
Err, .cfi_* are used because the encoding is tricky to get right, and
assemblers are better at optimizing it.
However, i'll point out that breakpad also has a CFI assembler
(http://code.google.com/p/google-breakpad/source/browse/trunk/src/common/dwarf/cfi_assembler.cc)
>
> The long-term solution for DWARF would be to enhance lib/DebugInfo to
> the point where it can handle all interesting DWARF sections. But this
> is a lofty goal, since DWARF parsing is notoriously hard and this
> would require a large investment of time and effort.
?????
Having written about 6 DWARF parsers, I strongly disagree it is either
notoriously hard or a large investment of time and effort.  People
have written DWARF parsers on the weekend.  One of the reasons DWARF
is popular is because it is relatively simple to *parse*, even though
semantic extraction is more difficult.
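As an illustration of the "simple to parse" side of this point, here is a minimal Python 3 sketch that reads the fixed fields of a CIE from .debug_frame. It is not code from breakpad or lib/DebugInfo; the function names and the hand-built SAMPLE_CIE are hypothetical, and the sketch assumes 32-bit DWARF with an empty augmentation string. It stops at the raw initial instructions and makes no attempt to evaluate them.

```python
import struct


def read_uleb128(data, pos):
    """Decode an unsigned LEB128 value; return (value, next_pos)."""
    result = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def read_sleb128(data, pos):
    """Decode a signed LEB128 value; return (value, next_pos)."""
    result = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        shift += 7
        if not byte & 0x80:
            if byte & 0x40:  # sign bit of the final byte
                result -= 1 << shift
            return result, pos


def parse_cie(data, little_endian=True):
    """Parse one .debug_frame CIE located at the start of `data`."""
    endian = "<" if little_endian else ">"
    length, cie_id = struct.unpack_from(endian + "II", data, 0)
    if length == 0xFFFFFFFF:
        raise ValueError("64-bit DWARF is not handled in this sketch")
    if cie_id != 0xFFFFFFFF:
        raise ValueError("entry is an FDE, not a CIE")
    end = 4 + length  # length excludes the length field itself
    version = data[8]
    nul = data.index(b"\x00", 9)
    if nul != 9:
        raise ValueError("augmentation data is not handled in this sketch")
    pos = nul + 1
    if version >= 4:
        pos += 2  # skip address_size and segment_size
    code_align, pos = read_uleb128(data, pos)
    data_align, pos = read_sleb128(data, pos)
    if version == 1:
        ra_reg = data[pos]  # a single byte in CIE version 1
        pos += 1
    else:
        ra_reg, pos = read_uleb128(data, pos)
    return {
        "version": version,
        "code_alignment_factor": code_align,
        "data_alignment_factor": data_align,
        "return_address_register": ra_reg,
        "initial_instructions": data[pos:end],
    }


# Hand-built version 1 CIE: code align 1, data align -8, return register 16.
_BODY = bytes.fromhex("ffffffff" "01" "00" "01" "78" "10" "0c07089001" "0000")
SAMPLE_CIE = struct.pack("<I", len(_BODY)) + _BODY

if __name__ == "__main__":
    print(parse_cie(SAMPLE_CIE))
```

Turning initial_instructions into per-address unwind rules is the evaluation step, which is where the extra work lies.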

In any case, I mention the above project (google-breakpad) because i'd
be more than happy to get that DWARF related code relicensed to the
LLVM license if someone wanted it.

Eli Bendersky

2013-Jan-22 23:45 UTC

On Tue, Jan 22, 2013 at 3:35 PM, Daniel Berlin <dberlin at dberlin.org> wrote:
> On Fri, Jan 18, 2013 at 4:00 PM, Eli Bendersky <eliben at google.com> wrote:
>> Hi All,
>>
>> While working on some recent patches for x32 support, I ran into an
>> unpleasant limitation the LLVM eco-system has with testing DWARF
>> emission. We currently have several approaches, neither of which is
>> great:
>>
>> 1. llvm-dwarfdump: the best approach when it works. But unfortunately
>> lib/DebugInfo supports only a (small) subset of DWARF. Tricky sections
>> like debug_frame aren't supported.
>
> Could you point out what you mean?
> In particular, what parts you think it does not support (since you say
> it supports a small subset).
> What do you want out of debug_frame, past simple parsing?
> Anything else requires real evaluation.
>
As I said, it doesn't support debug_frame, as one relevant example.
> I ask because I wrote a DWARF reader that google uses internally, and
> then was open sourced and contributed to google breakpad.
> (see
> http://code.google.com/p/google-breakpad/source/browse/trunk/src/common/dwarf/,
> in particular dwarf2reader.cc).
>
>> 2. Relying of assembly directive emissions (i.e. .cfi_*), which is
>> cumbersome and misses a lot of things like actual DWARF encoding.
>
> Err, .cfi_* are used because the encoding is tricky to get right, and
> assemblers are better at optimizing it.
> However, i'll point out that breakpad also has a CFI assembler
> (http://code.google.com/p/google-breakpad/source/browse/trunk/src/common/dwarf/cfi_assembler.cc)
>
>>
>> The long-term solution for DWARF would be to enhance lib/DebugInfo to
>> the point where it can handle all interesting DWARF sections. But this
>> is a lofty goal, since DWARF parsing is notoriously hard and this
>> would require a large investment of time and effort.
> ?????
> Having written about 6 DWARF parsers, I strongly disagree it is either
> notoriously hard or a large investment of time and effort.  People
> have written DWARF parsers on the weekend.  One of the reasons DWARF
> is popular is because it is relatively simple to *parse*, even though
> semantic extraction is more difficult.
I do mean semantic extraction which provides a representation that's
meaningful to a user and hence can be effectively compared in a test.
But really, I gave up arguing on this topic a few messages (and heated
IRC discussions) ago. RFC retracted.
>
> In any case, I mention the above project (google-breakpad) because i'd
> be more than happy to get that DWARF related code relicensed to the
> LLVM license if someone wanted it.
>>
This is utterly impossible because your code does not start variables
with a capital! Seriously though, I would expect this to be
challenging since lib/DebugInfo already has quite a bit of parsing and
infrastructure already in place, and I'm not sure how easy it would be
to merge with a completely different parser.

Eli

Evan Cheng

2013-Jan-23 04:18 UTC

On Jan 18, 2013, at 1:00 PM, Eli Bendersky <eliben at google.com> wrote:
> Hi All,
> 
> While working on some recent patches for x32 support, I ran into an
> unpleasant limitation the LLVM eco-system has with testing DWARF
> emission. We currently have several approaches, neither of which is
> great:
> 
> 1. llvm-dwarfdump: the best approach when it works. But unfortunately
> lib/DebugInfo supports only a (small) subset of DWARF. Tricky sections
> like debug_frame aren't supported.
> 2. Relying of assembly directive emissions (i.e. .cfi_*), which is
> cumbersome and misses a lot of things like actual DWARF encoding.
> 3. Using elf-dump and examining the raw binary dumps. This makes tests
> nearly unmaintainable.
> 
> The latter is also why IMHO our ELF emission in general isn't well
> tested. elf-dump is just too rudimentary and relies on simple (=dumb)
> binary contents dumps.
> 
> The long-term solution for DWARF would be to enhance lib/DebugInfo to
> the point where it can handle all interesting DWARF sections. But this
> is a lofty goal, since DWARF parsing is notoriously hard and this
> would require a large investment of time and effort. And in the
> meantime, we just don't write good enough tests (and enough of them)
> for this very important feature.
I'm pretty sure I got Benjamin K. started on lib/DebugInfo. :-) There were two
primary motivations for lib/DebugInfo: 1) to add source debug info capability to
the llvm disassembler, and 2) to migrate LLDB's DWARF parsing to LLVM (to ease
sharing). I suspect that migration wasn't quite complete (and / or
LLDB's DWARF parsing has since improved). Anyway, IMO the best step forward is
to continue migrating LLDB's DWARF parsing library over and make it fully
featured.

Evan
> 
> Therefore, as an interim stage, I propose to adopt some external tool
> that parses DWARF and emits decoded textual dumps which makes tests
> easy to write.
> 
> Concretely, I have a pure Python library named pyelftools
> (https://bitbucket.org/eliben/pyelftools) which provides comprehensive
> ELF and DWARF parsing capabilities and has a dumper that's fully
> compatible with the readelf command. Using pyelftools would allow us
> to immediately improve the quality of our tests, and as lib/DebugInfo
> matures llvm-dwarfdump can gradually replace the dumper without
> changing the actual tests.
> 
> pyelftools is relatively widely used so it's well tested, all it
> requires is Python 2.6 and higher, and its code is in the public
> domain. So it can live in tools/ or test/Scripts or wherever and be
> distributed with LLVM. I actively maintain it and hacking it to LLVM's
> purposes should be relatively easy. As a bonus, it has a much smarter
> ELF parser & dumper that can replace the ad-hoc elf-dump. It has also
> been successfully adapted in the past to read DWARF from MachO files,
> if that's required.
> 
> Eli
