Eli Bendersky
2013-Jan-18 21:00 UTC
[LLVMdev] RFC: Improving our DWARF (and ELF) emission testing capabilities
Hi All,

While working on some recent patches for x32 support, I ran into an unpleasant limitation the LLVM eco-system has with testing DWARF emission. We currently have several approaches, none of which is great:

1. llvm-dwarfdump: the best approach when it works. But unfortunately lib/DebugInfo supports only a (small) subset of DWARF. Tricky sections like debug_frame aren't supported.
2. Relying on assembly directive emissions (i.e. .cfi_*), which is cumbersome and misses a lot of things like actual DWARF encoding.
3. Using elf-dump and examining the raw binary dumps. This makes tests nearly unmaintainable.

The latter is also why IMHO our ELF emission in general isn't well tested. elf-dump is just too rudimentary and relies on simple (=dumb) binary contents dumps.

The long-term solution for DWARF would be to enhance lib/DebugInfo to the point where it can handle all interesting DWARF sections. But this is a lofty goal, since DWARF parsing is notoriously hard and this would require a large investment of time and effort. And in the meantime, we just don't write good enough tests (and enough of them) for this very important feature.

Therefore, as an interim stage, I propose to adopt some external tool that parses DWARF and emits decoded textual dumps which makes tests easy to write.

Concretely, I have a pure Python library named pyelftools (https://bitbucket.org/eliben/pyelftools) which provides comprehensive ELF and DWARF parsing capabilities and has a dumper that's fully compatible with the readelf command. Using pyelftools would allow us to immediately improve the quality of our tests, and as lib/DebugInfo matures llvm-dwarfdump can gradually replace the dumper without changing the actual tests.

pyelftools is relatively widely used so it's well tested, all it requires is Python 2.6 and higher, and its code is in the public domain. So it can live in tools/ or test/Scripts or wherever and be distributed with LLVM. I actively maintain it and hacking it to LLVM's purposes should be relatively easy. As a bonus, it has a much smarter ELF parser & dumper that can replace the ad-hoc elf-dump. It has also been successfully adapted in the past to read DWARF from MachO files, if that's required.

Eli
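To make the difference between a raw byte dump and a decoded textual dump concrete, here is a minimal sketch that uses only the Python standard library. It is not pyelftools or elf-dump code: the helper name `describe_elf_header`, the lookup tables and the output format are all hypothetical, and only a few ELF header fields are decoded. The sample bytes model an x32 relocatable object, which pairs the 32-bit ELF class with the x86-64 machine type.

```python
import struct

# Only a few well-known values from the ELF specification are listed here.
ELF_CLASS = {1: "ELF32", 2: "ELF64"}
ELF_DATA = {1: "little endian", 2: "big endian"}
ELF_TYPE = {0: "NONE", 1: "REL", 2: "EXEC", 3: "DYN", 4: "CORE"}
ELF_MACHINE = {3: "Intel 80386", 62: "x86-64"}


def describe_elf_header(data):
    """Return 'Field: value' lines for the first fields of an ELF header.

    Hypothetical helper for illustration; it decodes only the class and
    data encoding from e_ident, plus e_type and e_machine.
    """
    if len(data) < 20 or data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    # e_ident[4] is the file class, e_ident[5] the data encoding.
    ei_class, ei_data = struct.unpack_from("BB", data, 4)
    endian = "<" if ei_data == 1 else ">"
    # e_type and e_machine are 16-bit fields right after the 16-byte e_ident.
    e_type, e_machine = struct.unpack_from(endian + "HH", data, 16)
    return [
        "Class: " + ELF_CLASS.get(ei_class, "unknown (%d)" % ei_class),
        "Data: " + ELF_DATA.get(ei_data, "unknown (%d)" % ei_data),
        "Type: " + ELF_TYPE.get(e_type, "unknown (%d)" % e_type),
        "Machine: " + ELF_MACHINE.get(e_machine, "unknown (%d)" % e_machine),
    ]


if __name__ == "__main__":
    # Hand-built header prefix: ELF32, little endian, ET_REL, EM_X86_64.
    sample = b"\x7fELF\x01\x01\x01\x00" + b"\x00" * 8 + b"\x01\x00\x3e\x00"
    for line in describe_elf_header(sample):
        print(line)
```

A test that matches lines such as `Machine: x86-64` stays readable when the emitted bytes change, whereas a test that matches the hex dump has to be re-decoded by hand.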
David Blaikie
2013-Jan-18 21:29 UTC
[LLVMdev] RFC: Improving our DWARF (and ELF) emission testing capabilities
+ other debug info people (Eric & Paul)

On Fri, Jan 18, 2013 at 1:00 PM, Eli Bendersky <eliben at google.com> wrote:
> Hi All,
>
> While working on some recent patches for x32 support, I ran into an
> unpleasant limitation the LLVM eco-system has with testing DWARF
> emission. We currently have several approaches, none of which is
> great:
>
> 1. llvm-dwarfdump: the best approach when it works. But unfortunately
> lib/DebugInfo supports only a (small) subset of DWARF. Tricky sections
> like debug_frame aren't supported.

Ideally I'd like to see support added whenever a code change is made to a feature - so long as we hold ourselves to a "test new changes" policy that can gate/encourage the necessary feature support in llvm-dwarfdump.

Since no one's likely to go back & write a bunch of regression tests for all the existing code it seems premature to add new features to llvm-dwarfdump before there's a use-case. It does sometimes mean bug fixes appear to be costly because they include adding the missing test infrastructure support, but that's essentially where the cost is anyway.

> 2. Relying on assembly directive emissions (i.e. .cfi_*), which is
> cumbersome and misses a lot of things like actual DWARF encoding.

I'm not sure what you mean by "actual DWARF encoding" here. (disclaimer: I've only recently started dabbling with debug info, so I may be missing obvious things)

> 3. Using elf-dump and examining the raw binary dumps. This makes tests
> nearly unmaintainable.
>
> The latter is also why IMHO our ELF emission in general isn't well
> tested. elf-dump is just too rudimentary and relies on simple (=dumb)
> binary contents dumps.
>
> The long-term solution for DWARF would be to enhance lib/DebugInfo to
> the point where it can handle all interesting DWARF sections. But this
> is a lofty goal, since DWARF parsing is notoriously hard and this
> would require a large investment of time and effort. And in the
> meantime, we just don't write good enough tests (and enough of them)
> for this very important feature.

Are there particular recent commits you've been concerned about the test quality of? I've been trying to keep an eye on this but, again, don't necessarily fully understand the ramifications of some changes.

> Therefore, as an interim stage, I propose to adopt some external tool
> that parses DWARF and emits decoded textual dumps which makes tests
> easy to write.
>
> Concretely, I have a pure Python library named pyelftools
> (https://bitbucket.org/eliben/pyelftools) which provides comprehensive
> ELF and DWARF parsing capabilities and has a dumper that's fully
> compatible with the readelf command. Using pyelftools would allow us
> to immediately improve the quality of our tests, and as lib/DebugInfo
> matures llvm-dwarfdump can gradually replace the dumper without
> changing the actual tests.

I would be a little hesitant about test execution performance if it involved invoking new python processes for each debug info test. But numbers could convince me. Beyond that I can't rationally claim any particular need to support llvm-dwarfdump as the tool of choice over any 3rd party tool.

> pyelftools is relatively widely used so it's well tested, all it
> requires is Python 2.6 and higher, and its code is in the public
> domain. So it can live in tools/ or test/Scripts or wherever and be
> distributed with LLVM. I actively maintain it and hacking it to LLVM's
> purposes should be relatively easy. As a bonus, it has a much smarter
> ELF parser & dumper that can replace the ad-hoc elf-dump. It has also
> been successfully adapted in the past to read DWARF from MachO files,
> if that's required.
>
> Eli
Eli Bendersky
2013-Jan-18 21:50 UTC
[LLVMdev] RFC: Improving our DWARF (and ELF) emission testing capabilities
>> 1. llvm-dwarfdump: the best approach when it works. But unfortunately
>> lib/DebugInfo supports only a (small) subset of DWARF. Tricky sections
>> like debug_frame aren't supported.
>
> Ideally I'd like to see support added whenever a code change is made
> to a feature - so long as we hold ourselves to a "test new changes"
> policy that can gate/encourage the necessary feature support in
> llvm-dwarfdump.
>
> Since no one's likely to go back & write a bunch of regression tests
> for all the existing code it seems premature to add new features to
> llvm-dwarfdump before there's a use-case. It does sometimes mean bug
> fixes appear to be costly because they include adding the missing test
> infrastructure support, but that's essentially where the cost is
> anyway.

See test/MC/ELF/cfi-register.s for a test I consider unmaintainable since it just matches an elf-dump and requires manual decoding of the data for every change and addition. When tests are too hard to write, fewer tests get written.

>> 2. Relying on assembly directive emissions (i.e. .cfi_*), which is
>> cumbersome and misses a lot of things like actual DWARF encoding.
>
> I'm not sure what you mean by "actual DWARF encoding" here.
> (disclaimer: I've only recently started dabbling with debug info, so I
> may be missing obvious things)

I mean that it doesn't test the whole way, and there's quite a bit of DWARF-related functionality in MC. So when a test relies on matching directives in ASM output, there's quite a bit of code in MC it doesn't exercise.

>
>> 3. Using elf-dump and examining the raw binary dumps. This makes tests
>> nearly unmaintainable.
>>
>> The latter is also why IMHO our ELF emission in general isn't well
>> tested. elf-dump is just too rudimentary and relies on simple (=dumb)
>> binary contents dumps.
>>
>> The long-term solution for DWARF would be to enhance lib/DebugInfo to
>> the point where it can handle all interesting DWARF sections. But this
>> is a lofty goal, since DWARF parsing is notoriously hard and this
>> would require a large investment of time and effort. And in the
>> meantime, we just don't write good enough tests (and enough of them)
>> for this very important feature.
>
> Are there particular recent commits you've been concerned about the
> test quality of? I've been trying to keep an eye on this but, again,
> don't necessarily fully understand the ramifications of some changes.

See basically every test employing elf-dump for non-trivial things.

>
>> Therefore, as an interim stage, I propose to adopt some external tool
>> that parses DWARF and emits decoded textual dumps which makes tests
>> easy to write.
>>
>> Concretely, I have a pure Python library named pyelftools
>> (https://bitbucket.org/eliben/pyelftools) which provides comprehensive
>> ELF and DWARF parsing capabilities and has a dumper that's fully
>> compatible with the readelf command. Using pyelftools would allow us
>> to immediately improve the quality of our tests, and as lib/DebugInfo
>> matures llvm-dwarfdump can gradually replace the dumper without
>> changing the actual tests.
>
> I would be a little hesitant about test execution performance if it
> involved invoking new python processes for each debug info test. But
> numbers could convince me. Beyond that I can't rationally claim any
> particular need to support llvm-dwarfdump as the tool of choice over
> any 3rd party tool.

This is already done with elf-dump (a Python script), which is used for a lot of tests for lack of better options.

Eli
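As a rough illustration of the "manual decoding" that raw-dump tests push onto the reader, the sketch below turns a few call frame instruction bytes into text. It is an assumption-laden toy, not code from elf-dump, pyelftools or LLVM: `decode_cfa` and its output wording are hypothetical, only a handful of DW_CFA_* opcodes are handled, and the default alignment factors (1 and -8) are merely typical x86-64 values.

```python
def read_uleb128(data, pos):
    """Decode an unsigned LEB128 value; return (value, next position)."""
    result = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def decode_cfa(data, code_align=1, data_align=-8):
    """Decode a small subset of DW_CFA_* instructions into text lines."""
    data = bytearray(data)  # integer indexing regardless of input type
    lines = []
    pos = 0
    while pos < len(data):
        opcode = data[pos]
        pos += 1
        # The top two bits select the "primary" opcodes, which carry an
        # operand in the low six bits.
        high, low = opcode & 0xC0, opcode & 0x3F
        if high == 0x40:
            lines.append("DW_CFA_advance_loc: %d" % (low * code_align))
        elif high == 0x80:
            offset, pos = read_uleb128(data, pos)
            lines.append("DW_CFA_offset: r%d at cfa%+d"
                         % (low, offset * data_align))
        elif high == 0xC0:
            lines.append("DW_CFA_restore: r%d" % low)
        elif opcode == 0x00:
            lines.append("DW_CFA_nop")
        elif opcode == 0x09:
            reg, pos = read_uleb128(data, pos)
            other, pos = read_uleb128(data, pos)
            lines.append("DW_CFA_register: r%d in r%d" % (reg, other))
        elif opcode == 0x0C:
            reg, pos = read_uleb128(data, pos)
            offset, pos = read_uleb128(data, pos)
            lines.append("DW_CFA_def_cfa: r%d ofs %d" % (reg, offset))
        elif opcode == 0x0D:
            reg, pos = read_uleb128(data, pos)
            lines.append("DW_CFA_def_cfa_register: r%d" % reg)
        elif opcode == 0x0E:
            offset, pos = read_uleb128(data, pos)
            lines.append("DW_CFA_def_cfa_offset: %d" % offset)
        else:
            raise ValueError("unhandled opcode 0x%02x" % opcode)
    return lines


if __name__ == "__main__":
    # Bytes a raw-dump test would have to match literally.
    for line in decode_cfa(b"\x0c\x07\x08\x90\x01\x00\x00"):
        print(line)
```

With the sample bytes, a test could match `DW_CFA_def_cfa: r7 ofs 8` rather than the hex string `0c0708`, which is the kind of readability the decoded-dump approach is after.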
Michael Spencer
2013-Jan-18 22:17 UTC
[LLVMdev] RFC: Improving our DWARF (and ELF) emission testing capabilities
On Fri, Jan 18, 2013 at 1:00 PM, Eli Bendersky <eliben at google.com> wrote:
> Hi All,
>
> While working on some recent patches for x32 support, I ran into an
> unpleasant limitation the LLVM eco-system has with testing DWARF
> emission. We currently have several approaches, none of which is
> great:
>
> 1. llvm-dwarfdump: the best approach when it works. But unfortunately
> lib/DebugInfo supports only a (small) subset of DWARF. Tricky sections
> like debug_frame aren't supported.
> 2. Relying on assembly directive emissions (i.e. .cfi_*), which is
> cumbersome and misses a lot of things like actual DWARF encoding.
> 3. Using elf-dump and examining the raw binary dumps. This makes tests
> nearly unmaintainable.
>
> The latter is also why IMHO our ELF emission in general isn't well
> tested. elf-dump is just too rudimentary and relies on simple (=dumb)
> binary contents dumps.
>
> The long-term solution for DWARF would be to enhance lib/DebugInfo to
> the point where it can handle all interesting DWARF sections. But this
> is a lofty goal, since DWARF parsing is notoriously hard and this
> would require a large investment of time and effort. And in the
> meantime, we just don't write good enough tests (and enough of them)
> for this very important feature.
>
> Therefore, as an interim stage, I propose to adopt some external tool
> that parses DWARF and emits decoded textual dumps which makes tests
> easy to write.
>
> Concretely, I have a pure Python library named pyelftools
> (https://bitbucket.org/eliben/pyelftools) which provides comprehensive
> ELF and DWARF parsing capabilities and has a dumper that's fully
> compatible with the readelf command. Using pyelftools would allow us
> to immediately improve the quality of our tests, and as lib/DebugInfo
> matures llvm-dwarfdump can gradually replace the dumper without
> changing the actual tests.
>
> pyelftools is relatively widely used so it's well tested, all it
> requires is Python 2.6 and higher, and its code is in the public
> domain. So it can live in tools/ or test/Scripts or wherever and be
> distributed with LLVM. I actively maintain it and hacking it to LLVM's
> purposes should be relatively easy. As a bonus, it has a much smarter
> ELF parser & dumper that can replace the ad-hoc elf-dump. It has also
> been successfully adapted in the past to read DWARF from MachO files,
> if that's required.
>
> Eli

I'm fine with this as long as llvm-dwarfdump gets maintained.

The only problem is that LLVM does not require Python 2.6, I think the min version is still 2.4. Although I would love to move to 2.6 :P

- Michael Spencer
Eli Bendersky
2013-Jan-18 22:37 UTC
[LLVMdev] RFC: Improving our DWARF (and ELF) emission testing capabilities
>> 1. llvm-dwarfdump: the best approach when it works. But unfortunately
>> lib/DebugInfo supports only a (small) subset of DWARF. Tricky sections
>> like debug_frame aren't supported.
>> 2. Relying on assembly directive emissions (i.e. .cfi_*), which is
>> cumbersome and misses a lot of things like actual DWARF encoding.
>> 3. Using elf-dump and examining the raw binary dumps. This makes tests
>> nearly unmaintainable.
>>
>> The latter is also why IMHO our ELF emission in general isn't well
>> tested. elf-dump is just too rudimentary and relies on simple (=dumb)
>> binary contents dumps.
>>
>> The long-term solution for DWARF would be to enhance lib/DebugInfo to
>> the point where it can handle all interesting DWARF sections. But this
>> is a lofty goal, since DWARF parsing is notoriously hard and this
>> would require a large investment of time and effort. And in the
>> meantime, we just don't write good enough tests (and enough of them)
>> for this very important feature.
>>
>> Therefore, as an interim stage, I propose to adopt some external tool
>> that parses DWARF and emits decoded textual dumps which makes tests
>> easy to write.
>>
>> Concretely, I have a pure Python library named pyelftools
>> (https://bitbucket.org/eliben/pyelftools) which provides comprehensive
>> ELF and DWARF parsing capabilities and has a dumper that's fully
>> compatible with the readelf command. Using pyelftools would allow us
>> to immediately improve the quality of our tests, and as lib/DebugInfo
>> matures llvm-dwarfdump can gradually replace the dumper without
>> changing the actual tests.
>>
>> pyelftools is relatively widely used so it's well tested, all it
>> requires is Python 2.6 and higher, and its code is in the public
>> domain. So it can live in tools/ or test/Scripts or wherever and be
>> distributed with LLVM. I actively maintain it and hacking it to LLVM's
>> purposes should be relatively easy. As a bonus, it has a much smarter
>> ELF parser & dumper that can replace the ad-hoc elf-dump. It has also
>> been successfully adapted in the past to read DWARF from MachO files,
>> if that's required.
>>
>> Eli
>
> I'm fine with this as long as llvm-dwarfdump gets maintained.
>

I agree, and as I said in the original email, in the long term I believe llvm-dwarfdump is the correct solution.

> The only problem is that LLVM does not require Python 2.6, I think the
> min version is still 2.4. Although I would love to move to 2.6 :P

Was this not covered by a previous discussion? I had the feeling it was decided that 2.6 was OK to require, since it's simple to install on platforms that don't ship it by default.

Eli
Daniel Berlin
2013-Jan-22 23:35 UTC
[LLVMdev] RFC: Improving our DWARF (and ELF) emission testing capabilities
On Fri, Jan 18, 2013 at 4:00 PM, Eli Bendersky <eliben at google.com> wrote:
> Hi All,
>
> While working on some recent patches for x32 support, I ran into an
> unpleasant limitation the LLVM eco-system has with testing DWARF
> emission. We currently have several approaches, none of which is
> great:
>
> 1. llvm-dwarfdump: the best approach when it works. But unfortunately
> lib/DebugInfo supports only a (small) subset of DWARF. Tricky sections
> like debug_frame aren't supported.

Could you point out what you mean?
In particular, what parts you think it does not support (since you say it supports a small subset).
What do you want out of debug_frame, past simple parsing?
Anything else requires real evaluation.

I ask because I wrote a DWARF reader that Google uses internally, which was then open sourced and contributed to Google Breakpad.
(see http://code.google.com/p/google-breakpad/source/browse/trunk/src/common/dwarf/, in particular dwarf2reader.cc).

> 2. Relying on assembly directive emissions (i.e. .cfi_*), which is
> cumbersome and misses a lot of things like actual DWARF encoding.

Err, .cfi_* directives are used because the encoding is tricky to get right, and assemblers are better at optimizing it.
However, I'll point out that breakpad also has a CFI assembler (http://code.google.com/p/google-breakpad/source/browse/trunk/src/common/dwarf/cfi_assembler.cc)

>
> The long-term solution for DWARF would be to enhance lib/DebugInfo to
> the point where it can handle all interesting DWARF sections. But this
> is a lofty goal, since DWARF parsing is notoriously hard and this
> would require a large investment of time and effort.

?????
Having written about 6 DWARF parsers, I strongly disagree it is either notoriously hard or a large investment of time and effort. People have written DWARF parsers on the weekend. One of the reasons DWARF is popular is because it is relatively simple to *parse*, even though semantic extraction is more difficult.

In any case, I mention the above project (google-breakpad) because I'd be more than happy to get that DWARF related code relicensed to the LLVM license if someone wanted it.
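To give a sense of scale for the "simple to parse" side of this argument, here is a minimal sketch of reading one CIE header from a .debug_frame section. It is neither breakpad nor lib/DebugInfo code; `parse_cie` and the dictionary it returns are hypothetical. It assumes the 32-bit DWARF format, CIE version 1 or 3 and an empty augmentation string, so it would reject .eh_frame data. Interpreting the instructions that follow the header is the semantic part and is left out.

```python
import struct


def read_leb128(data, pos, signed=False):
    """Decode a LEB128 value; return (value, next position)."""
    result = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        shift += 7
        if not byte & 0x80:
            break
    # For signed values, bit 6 of the last byte is the sign bit.
    if signed and byte & 0x40:
        result -= 1 << shift
    return result, pos


def parse_cie(data, little_endian=True):
    """Parse the header of a single .debug_frame CIE (hypothetical helper)."""
    data = bytearray(data)
    endian = "<" if little_endian else ">"
    length, cie_id = struct.unpack_from(endian + "II", data, 0)
    if cie_id != 0xFFFFFFFF:
        raise ValueError("not a 32-bit .debug_frame CIE")
    version = data[8]
    if version not in (1, 3):
        raise ValueError("unsupported CIE version %d" % version)
    end = data.index(0, 9)  # the augmentation string is NUL-terminated
    augmentation = data[9:end].decode("ascii")
    if augmentation:
        raise ValueError("augmentation data is not handled in this sketch")
    pos = end + 1
    code_align, pos = read_leb128(data, pos)
    data_align, pos = read_leb128(data, pos, signed=True)
    if version == 1:
        # Version 1 stores the return address register as a single byte.
        ra_reg = data[pos]
        pos += 1
    else:
        ra_reg, pos = read_leb128(data, pos)
    return {
        "length": length,
        "version": version,
        "augmentation": augmentation,
        "code_alignment_factor": code_align,
        "data_alignment_factor": data_align,
        "return_address_register": ra_reg,
        # The length field does not count its own four bytes.
        "initial_instructions": bytes(data[pos:4 + length]),
    }


if __name__ == "__main__":
    sample = (b"\x14\x00\x00\x00" b"\xff\xff\xff\xff" b"\x01" b"\x00"
              b"\x01" b"\x78" b"\x10"
              b"\x0c\x07\x08\x90\x01" + b"\x00" * 6)
    cie = parse_cie(sample)
    for key in ("version", "code_alignment_factor",
                "data_alignment_factor", "return_address_register"):
        print("%s: %s" % (key, cie[key]))
```

The field extraction fits in a few dozen lines; presenting the result in a form that a test can compare meaningfully is a separate piece of work.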
Eli Bendersky
2013-Jan-22 23:45 UTC
[LLVMdev] RFC: Improving our DWARF (and ELF) emission testing capabilities
On Tue, Jan 22, 2013 at 3:35 PM, Daniel Berlin <dberlin at dberlin.org> wrote:
> On Fri, Jan 18, 2013 at 4:00 PM, Eli Bendersky <eliben at google.com> wrote:
>> Hi All,
>>
>> While working on some recent patches for x32 support, I ran into an
>> unpleasant limitation the LLVM eco-system has with testing DWARF
>> emission. We currently have several approaches, none of which is
>> great:
>>
>> 1. llvm-dwarfdump: the best approach when it works. But unfortunately
>> lib/DebugInfo supports only a (small) subset of DWARF. Tricky sections
>> like debug_frame aren't supported.
>
> Could you point out what you mean?
> In particular, what parts you think it does not support (since you say
> it supports a small subset).
> What do you want out of debug_frame, past simple parsing?
> Anything else requires real evaluation.
>

As I said, it doesn't support debug_frame, as one relevant example.

> I ask because I wrote a DWARF reader that Google uses internally, which
> was then open sourced and contributed to Google Breakpad.
> (see http://code.google.com/p/google-breakpad/source/browse/trunk/src/common/dwarf/,
> in particular dwarf2reader.cc).
>
>> 2. Relying on assembly directive emissions (i.e. .cfi_*), which is
>> cumbersome and misses a lot of things like actual DWARF encoding.
>
> Err, .cfi_* directives are used because the encoding is tricky to get
> right, and assemblers are better at optimizing it.
> However, I'll point out that breakpad also has a CFI assembler
> (http://code.google.com/p/google-breakpad/source/browse/trunk/src/common/dwarf/cfi_assembler.cc)
>
>>
>> The long-term solution for DWARF would be to enhance lib/DebugInfo to
>> the point where it can handle all interesting DWARF sections. But this
>> is a lofty goal, since DWARF parsing is notoriously hard and this
>> would require a large investment of time and effort.
> ?????
> Having written about 6 DWARF parsers, I strongly disagree it is either
> notoriously hard or a large investment of time and effort. People
> have written DWARF parsers on the weekend. One of the reasons DWARF
> is popular is because it is relatively simple to *parse*, even though
> semantic extraction is more difficult.

I do mean semantic extraction which provides a representation that's meaningful to a user and hence can be effectively compared in a test. But really, I gave up arguing on this topic a few messages (and heated IRC discussions) ago. RFC retracted.

>
> In any case, I mention the above project (google-breakpad) because I'd
> be more than happy to get that DWARF related code relicensed to the
> LLVM license if someone wanted it.
>

This is utterly impossible because your code does not start variables with a capital!

Seriously though, I would expect this to be challenging since lib/DebugInfo already has quite a bit of parsing and infrastructure in place, and I'm not sure how easy it would be to merge with a completely different parser.

Eli
Evan Cheng
2013-Jan-23 04:18 UTC
[LLVMdev] RFC: Improving our DWARF (and ELF) emission testing capabilities
On Jan 18, 2013, at 1:00 PM, Eli Bendersky <eliben at google.com> wrote:
> Hi All,
>
> While working on some recent patches for x32 support, I ran into an
> unpleasant limitation the LLVM eco-system has with testing DWARF
> emission. We currently have several approaches, none of which is
> great:
>
> 1. llvm-dwarfdump: the best approach when it works. But unfortunately
> lib/DebugInfo supports only a (small) subset of DWARF. Tricky sections
> like debug_frame aren't supported.
> 2. Relying on assembly directive emissions (i.e. .cfi_*), which is
> cumbersome and misses a lot of things like actual DWARF encoding.
> 3. Using elf-dump and examining the raw binary dumps. This makes tests
> nearly unmaintainable.
>
> The latter is also why IMHO our ELF emission in general isn't well
> tested. elf-dump is just too rudimentary and relies on simple (=dumb)
> binary contents dumps.
>
> The long-term solution for DWARF would be to enhance lib/DebugInfo to
> the point where it can handle all interesting DWARF sections. But this
> is a lofty goal, since DWARF parsing is notoriously hard and this
> would require a large investment of time and effort. And in the
> meantime, we just don't write good enough tests (and enough of them)
> for this very important feature.

I'm pretty sure I got Benjamin K. started on lib/DebugInfo. :-) There were two primary motivations for lib/DebugInfo: 1) to add source debug info capability to the LLVM disassembler, and 2) to migrate LLDB's DWARF parsing to LLVM (to ease sharing). I suspect that migration wasn't quite complete (and/or LLDB's DWARF parsing has since improved).

Anyway, IMO the best step forward is to continue to migrate LLDB's DWARF parsing library over and make it fully featured.

Evan

>
> Therefore, as an interim stage, I propose to adopt some external tool
> that parses DWARF and emits decoded textual dumps which makes tests
> easy to write.
>
> Concretely, I have a pure Python library named pyelftools
> (https://bitbucket.org/eliben/pyelftools) which provides comprehensive
> ELF and DWARF parsing capabilities and has a dumper that's fully
> compatible with the readelf command. Using pyelftools would allow us
> to immediately improve the quality of our tests, and as lib/DebugInfo
> matures llvm-dwarfdump can gradually replace the dumper without
> changing the actual tests.
>
> pyelftools is relatively widely used so it's well tested, all it
> requires is Python 2.6 and higher, and its code is in the public
> domain. So it can live in tools/ or test/Scripts or wherever and be
> distributed with LLVM. I actively maintain it and hacking it to LLVM's
> purposes should be relatively easy. As a bonus, it has a much smarter
> ELF parser & dumper that can replace the ad-hoc elf-dump. It has also
> been successfully adapted in the past to read DWARF from MachO files,
> if that's required.
>
> Eli