Jeremy Morse via llvm-dev
2019-Oct-09 15:33 UTC
[llvm-dev] [RFC] Adopt Dexter and use it to run debuginfo-tests
Hi llvm-dev@,

This is a proposal for LLVM to adopt Sony's Dexter tool [0], import it into
the debuginfo-tests repo, and use it to run integration tests between
debuggers and clang/LLVM's debuginfo. (Sony has approved donating Dexter to
LLVM.)

Background
----------

The debuginfo-tests repo contains an integration test suite for debug data,
which builds each test case from its source code using clang and runs the
output in gdb/lldb and, for some tests, in the Windows-based cdb. Each test
case contains a list of debugger commands which are executed in order; the
output of the commands is used to determine whether the test passes. However,
using debugger commands directly makes the tests hard to port: the driver
commands are debugger-specific, so each test case will only run with a
specific debugger. gdb and lldb are sufficiently similar that their commands
can be translated from one to the other, but it's much more difficult to
support Visual Studio's debugger or cdb.

Dexter (Debug Experience Tester) [0] was developed for the purpose of testing
the quality of debug information for combinations of compilers and debuggers.
It was first introduced to the community last year in a lightning talk about
measuring the user debug experience
[http://llvm.org/devmtg/2018-04/talks.html#Lightning_11]. Tests are written
as comments in the source code, with each comment containing a single test
command. Each test command specifies an expectation about the debug
experience, such as the value of a variable on a given line, which Dexter
compares against what the debugger actually observes when the test is run.
Dexter gives a score based on how similar the expected and actual experiences
were; for the purposes of regression testing, anything less than a perfect
score is considered a failure. The test commands are debugger-agnostic,
allowing the same test to be run with any supported debugger. The compiler
and debugger to use in the test are specified on the command line.

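
To make the scoring idea concrete, here is a minimal sketch in Python. It is
not Dexter's actual heuristic: the `Expectation` class, the `score` function
and the "fraction of expectations met" metric are all assumptions made up for
illustration. It only shows the shape of "compare expected against observed,
and treat anything below a perfect score as a regression failure".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Expectation:
    """Hypothetical record: one expected watch value on a given source line."""
    expression: str
    expected: str
    line: int


def score(expectations, observed):
    """Return the fraction of expectations the debugger output satisfied.

    `observed` maps (line, expression) to the value the debugger reported.
    A missing entry means the debugger never reported that expression there.
    """
    if not expectations:
        return 1.0
    hits = sum(
        1 for e in expectations
        if observed.get((e.line, e.expression)) == e.expected
    )
    return hits / len(expectations)


def passes_regression(expectations, observed):
    """For regression testing, only a perfect score counts as a pass."""
    return score(expectations, observed) == 1.0
```

A real scoring scheme would presumably weight different kinds of mismatch
differently (a wrong value versus a variable that is missing entirely); the
flat fraction here is only the simplest thing that could work.
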
In a previous discussion [https://reviews.llvm.org/D54187], the idea of using
Dexter to drive the debuginfo-tests was raised as a way of adding cdb support
without needing to maintain an entirely separate suite of tests (as is
currently the case). It can also replace the existing llgdb layer, which
provides a frontend for driving tests with GDB and LLDB; the upshot is that
each test can be written once and then run with any desired debugger. The
Dexter test commands are also more concise than the equivalent set of
debugger commands: where the existing tests require three separate statements
to break on a line, print the value of an expression, and then check the
output, a single Dexter command encapsulates all of these.

Existing Test:
```
struct S { int a[8]; };

int f(struct S s, unsigned i) {
  // DEBUGGER: break 14
  return s.a[i];
}

int main(int argc, const char **argv) {
  struct S s = {{0, 1, 2, 3, 4, 5, 6, 7}};
  if (f(s, 4) == 4)
    return f(s, 0);
  return 0;
}

// DEBUGGER: r
// DEBUGGER: p s
// CHECK: a
// DEBUGGER: p s.a[0]
// CHECK: = 0
// DEBUGGER: p s.a[1]
// CHECK: = 1
// DEBUGGER: p s.a[7]
```

Dexter:
```
struct S { int a[8]; };

int f(struct S s, unsigned i) {
  return s.a[i]; // DexLabel('f_ret')
}

int main(int argc, const char **argv) {
  struct S s = {{0, 1, 2, 3, 4, 5, 6, 7}};
  if (f(s, 4) == 4)
    return f(s, 0);
  return 0;
}

// DexExpectWatchValue('s.a[0]', '0', on_line='f_ret')
// DexExpectWatchValue('s.a[1]', '1', on_line='f_ret')
// DexExpectWatchValue('s.a[7]', '7', on_line='f_ret')
```

Currently Dexter has commands which can be used to cover most of the existing
debuginfo-test cases (see "Rough edges" below). Documentation on what
expectations can be described is here [1]. We're also working on some
sequencing commands ("this state is seen before that state") to better test
optimized debuginfo.

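
For readers wondering how commands embedded in comments can be turned into
checkable expectations, the following Python sketch scans a source file for
the two commands used in the example above (DexLabel and DexExpectWatchValue)
and resolves each label to its line number. This is an illustration only and
is not how Dexter itself parses commands; the regular expression and the
`parse_call` and `collect_expectations` helpers are hypothetical names
invented for this sketch.

```python
import ast
import re

# Matches a comment such as:
#   // DexExpectWatchValue('s.a[0]', '0', on_line='f_ret')
COMMAND_RE = re.compile(r"//\s*(Dex\w+)\((.*)\)\s*$")


def parse_call(args_text):
    """Parse a command's argument list into (positional, keyword) values."""
    call = ast.parse("f(" + args_text + ")", mode="eval").body
    positional = [ast.literal_eval(a) for a in call.args]
    keywords = {k.arg: ast.literal_eval(k.value) for k in call.keywords}
    return positional, keywords


def collect_expectations(source):
    """Return a list of (line_number, expression, expected_value) tuples."""
    labels = {}
    watches = []
    for number, text in enumerate(source.splitlines(), start=1):
        match = COMMAND_RE.search(text)
        if not match:
            continue
        name, args_text = match.groups()
        positional, keywords = parse_call(args_text)
        if name == "DexLabel":
            # A label names the line that its comment sits on.
            labels[positional[0]] = number
        elif name == "DexExpectWatchValue":
            watches.append((positional[0], positional[1], keywords["on_line"]))
    # Resolve labels only after the whole file has been scanned, since a
    # command may refer to a label defined anywhere in the file.
    return [(labels.get(where, where), expr, value)
            for expr, value, where in watches]
```

Because the arguments are evaluated with `ast.literal_eval`, only literal
values are accepted, so a malformed or malicious comment cannot execute code.
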
Proposal
--------

Our patch is up here: https://reviews.llvm.org/D68708
You can browse the tree on github here:
https://github.com/jmorse/llvm-project/tree/dexter-rfc

It imports Dexter into the debuginfo-tests repository, and converts many
existing debuginfo-tests to run with Dexter. We'd suggest that future
debuginfo integration tests use Dexter by default, so that we can build a
testsuite ensuring clang/LLVM's debuginfo is correct and interpreted
correctly everywhere.

Rough edges
-----------

Unifying debuginfo tests is a good goal, but there are limits -- for example,
there's always going to be a place for testing clang-cl specific flags. It
doesn't help that the cdb / windows-debug-engine expression parser is very
different from other debuggers. We've stopped short of making all debuginfo
tests completely generic, as there'll be different opinions on where to draw
the line. Other sources of difference include different platforms having
different types for STL constructs, and different debuggers pretty-printing
differently.

There's also a moderately sized question mark over GDB -- its Python API is
great, but has to be accessed from within a gdb process. This example
implementation [2] uses a Python RPC library to do that, which is an annoying
extra dependency. Plus, GDB is GPL3, so touching GDB's API from Dexter might
raise licensing questions if LLVM adopted this.

A small number of existing debuginfo tests (e.g. nested-struct.cpp) don't
actually start the built program, and instead just inspect type information.
We haven't tried to support this, partially because that use case doesn't
seem suitable for Dexter, but mostly because debugger APIs don't seem to
support looking up an arbitrary type on an arbitrary line number.

We've also focused entirely on Python 3, given that Python 2 reaches
end-of-life at the end of this year. The Dexter tests mark themselves as
unsupported if the python process running lit (usually PYTHON_EXECUTABLE)
isn't >= 3.0.
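
As a rough sketch of what such a version gate could look like (this is not
copied from the patch), a lit configuration file can set `config.unsupported`
when the interpreter is too old. The `FakeLitConfig` class and the
`mark_unsupported_if_old_python` helper below are hypothetical stand-ins so
that the snippet runs on its own; in a real lit.local.cfg the `config` object
is provided by lit.

```python
import sys


class FakeLitConfig:
    """Stand-in for the `config` object that lit passes to its cfg files."""
    unsupported = False


def mark_unsupported_if_old_python(config, version_info=sys.version_info):
    """Flag the test directory as unsupported when lit runs under Python 2."""
    if tuple(version_info[:2]) < (3, 0):
        config.unsupported = True
    return config
```

Taking the version as a parameter keeps the check testable without needing
an old interpreter to be installed.
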
We've been targeting Python 3.6; I'm not aware of any LLVM minimum Python
version requirements.

Roadmap
-------

Non-goals: As stated, Dexter is an integration test framework, which means
that it is not intended to be used to test the output of specific
optimization passes, or any transformation other than a complete build and
link.

More can be done to test the debugging experience, and we would like Dexter
to be able to express more behaviours in the future, for example the order in
which information appears or the absence of certain information.

Beyond straightforward regression tests, Dexter's original function was to
quantify the quality of the debug information in a program. While we're only
proposing using Dexter as a replacement debuginfo-tests driver, we think
there's a future in being able to measure "how badly" both debuginfo and
codegen changes affect the debugging experience. More about that in a future
RFC though!

[0] https://github.com/snsystems/dexter
[1] https://github.com/SNSystems/dexter/blob/master/Commands.md
[2] https://github.com/SNSystems/dexter/pull/70

--
Thanks,
Jeremy
David Blaikie via llvm-dev
2019-Oct-09 15:44 UTC
[llvm-dev] [RFC] Adopt Dexter and use it to run debuginfo-tests
+some of the usual debug info folks

I haven't looked at the code/complexity of dexter to get a sense of what the
cost of maintenance would be - but the general idea seems reasonable to me.
Adrian Prantl via llvm-dev
2019-Oct-09 15:59 UTC
[llvm-dev] [RFC] Adopt Dexter and use it to run debuginfo-tests
Jeremy (and everyone who worked on dexter),

this is great news! I'm excited about having a single debugger driver in
debuginfo-tests. I'm very much in favor of this proposal.

Can you share experiences as to how hard it is to get tests to work under
multiple platforms and toolchains in practice?

I see that many tests in the patch hardcode things like specific
debuggers/linux/cflags (which presumably look different on windows). Does
dexter have an #ifdef feature to allow for slightly different results or to
disable one sub-test on a specific platform?

-- adrian
Jeremy Morse via llvm-dev
2019-Oct-09 16:44 UTC
[llvm-dev] [RFC] Adopt Dexter and use it to run debuginfo-tests
Hi Adrian,

On Wed, Oct 9, 2019 at 4:59 PM Adrian Prantl <aprantl at apple.com> wrote:
> this is great news! I'm excited about having a single debugger driver in
> debuginfo-tests. I'm very much in favor of this proposal.

Much appreciated!

> Can you share experiences as to how hard it is to get tests to work under
> multiple platforms and toolchains in practice?

Sometimes it's easy, other times not. Almost all of the large blob of Dexter
tests I wrote about a year ago [0] to stimulate different passes worked under
Windows / cdb immediately. The difficulties I've encountered have been:
 * Tests that rely on (for example) -fno-inline / -fno-unroll-loops not
   playing well with clang-cl,
 * Different STL types on Windows -- gdb and lldb quite naturally
   pretty-print things like vectors, but it's much more difficult to do that
   in an implementation-independent way,
 * I don't think cdb's expression parser is as expressive as gdb/lldb's, or
   at least I couldn't easily convince it to apply a subscript operator to a
   std::deque object.

Difficulties examining state are probably the largest sticking point: one can
easily write a test that makes use of language and debugger idioms, but they
are all implemented differently on different platforms. Finding the sweet
spot between forcing tests to be more explicit and baking too much complexity
into the tests is an open question.

> I see that many tests in the patch hardcode things like specific
> debuggers/linux/cflags (which presumably look different on windows). Does
> dexter have an #ifdef feature to allow for slightly different results or
> to disable one sub-test on a specific platform?

This is mostly because we wanted to freeze things in a condition where they
worked for the RFC. We're hoping that later we can have llvm-lit autoselect a
working builder/debugger and put that in the %dexter expansion, then filter
out any known bad combinations with XFAIL / UNSUPPORTED in the tests.
llvm-lit can almost certainly handle per-test configuration -- we haven't yet
approached the idea of making debug-experience expectations conditional.

[0] https://github.com/SNSystems/dexter/tree/master/tests/nostdlib/llvm_passes

--
Thanks,
Jeremy