thr3ads.net - llvm dev - [llvm-dev] Status of debuginfo-tests [Sep 2017]

Zachary Turner via llvm-dev

2017-Sep-08 16:23 UTC

[llvm-dev] Status of debuginfo-tests

On Fri, Sep 8, 2017 at 9:00 AM Adrian Prantl <aprantl at apple.com> wrote:
>
> > Eventually, some tests will inevitably need to be Windows or Posix
> specific, so you're going to have to have all this extra stuff (the new
> substitutions, the different command lines, the custom output formats,
> etc.).  So I think something like this provides maximal encouragement of
> sharing whenever possible (since you can almost always share source code),
> while still allowing each format to test real input and real output.
>
> I understand the desire to allow for Windows-specific tests and I think it
> would be good to add them to the repository in a windows subdirectory.
>
> Looking at the example you posted, the two variants are so structurally
> similar that I believe it would be better to come up with a common
> abstraction from a readability / maintenance effort perspective. Basically,
> the only thing that the RUN lines do is compile and link executables from
> source code using the default target and run the test_debuginfo command. I
> think it would be better to define a new command substitution
> %clang-compile-link(?) in LIT that has different implementations on windows
> and posix. The set of debugger commands used by the tests is so tiny that
> it should not be a lot of work to implement a wrapper for the windows
> debugger (it took me about a day to write the python wrapper for LLDB
> including learning how to use the Python API) and it should also be
> possible to either do a sed-style massaging of the output or relax the
> CHECKs to work with both formats.
>
> I really want to avoid duplicating the debugger commands and checks, and I
> also want to maintain the ability to put the commands and CHECKs into the
> source code, since this makes the tests much easier to understand. Using a
> common abstraction will save us a lot of time in the long run, make
> maintenance and adding new tests cheaper, and won't prevent you from also
> having windows-specific tests that may use an expanded vocabulary.
>
> What do you think?
> -- adrian
>
I understand the desire to keep them as similar as possible, but I'm still
not really sold that massaging fickle text output into a different text
format is going to make things more scalable.  I'd like there to be as few
layers of text processing as possible.  If someone files a bug report that
includes a WinDbg command log, I'd like to be able to paste those
statements into a test.

I also expect that on Windows we will end up having far more debug info
tests than other platforms, specifically because we don't have the ability
to write tests against the debugger (as it's proprietary / closed source).
So the language used in the current set of debug info tests is very simple,
because GDB and LLDB already have test suites that test more complicated
things.  But the problem is, we don't have those other test suites to fall
back on, so we will need much more.  For example, we may end up wanting a
test that exercises custom debug visualizers, or a test that a certain
proprietary debugger feature works, or a test that builtin debug
visualizers of STL types work.  To write a check for the latter, you need
to know the layout of the type, which depends on the standard library
implementation, so it's already going to be different.

If all we're doing is printing an integer, then I agree we can write a
common test.  But I don't think this is going to be the case outside of 1
or 2 trivial tests.

There's also the issue that we may want to test entirely different things.
For example, in the hypothetical example I posted earlier, we compile once
and link twice with 2 different linkers.  But we might even want to compile
twice and link twice (compile same program with cl and clang-cl, then link
both with lld).  The fundamental difference here is that we have two
different things that can emit debug info - the compiler and linker - and
we need the flexibility to test both independently of each other.

On a posixy platform, you only care about the compiler and don't care what the
linker is.  On the other hand, it has its own set of unique aspects.  You
might decide to compile and link many times, so that you can test
-gsplit-dwarf, -gdebug-info-kind=limited, -gdebug-info-kind=full,
-gdb-index, etc. against a single program.
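
As a rough illustration of the combinations described above, a test driver
could enumerate them explicitly. This is only a sketch: the linker name
"link", the helper names, and the overall structure are assumptions made for
the example. The compiler names and the POSIX flags are the ones listed in
this message, copied as written.

```python
import itertools

# Compilers named in the message; "link" as the second linker is an assumption.
WINDOWS_COMPILERS = ["cl", "clang-cl"]
WINDOWS_LINKERS = ["link", "lld"]

# Flags exactly as listed in the message, each built against the same program.
POSIX_FLAG_VARIANTS = [
    "-gsplit-dwarf",
    "-gdebug-info-kind=limited",
    "-gdebug-info-kind=full",
    "-gdb-index",
]


def windows_matrix(compilers, linkers):
    """Pair every compiler with every linker (hypothetical helper)."""
    return [
        {"compiler": c, "linker": l}
        for c, l in itertools.product(compilers, linkers)
    ]


def posix_matrix(flag_variants):
    """One build per flag variant; the linker is not varied."""
    return [{"flags": [flag]} for flag in flag_variants]


if __name__ == "__main__":
    for combo in windows_matrix(WINDOWS_COMPILERS, WINDOWS_LINKERS):
        print(combo["compiler"], "+", combo["linker"])
    for combo in posix_matrix(POSIX_FLAG_VARIANTS):
        print(" ".join(combo["flags"]))
```

The two matrices have different shapes (compiler x linker on Windows, flag
variants on POSIX), which is the asymmetry the paragraphs above describe.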

I don't see a useful abstraction that glosses over these differences that
isn't a ton of work for minimal gain, given the frequency with which we'd
need to fall back to a custom test anyway.

Robinson, Paul via llvm-dev

2017-Sep-08 17:45 UTC


Let me say up front that I sympathize deeply with the problem; debug
info is an interface, and it is frequently unclear whether the goal of
some bit of work is to test the producer or test the consumer of the
interface.  In fact we end up using the producer to test the consumer,
and (in the case at hand) using the consumer to test the producer.
There are distinct analogies to testing compilers by seeing what the
linker thinks, and testing linkers by seeing whether they can handle
what the compiler produces.
> I understand the desire to keep them as similar as possible, but I'm
> still not really sold that massaging fickle text output into a different
> text format is going to make things more scalable.  I'd like there to be
> as few layers of text processing as possible.
If the text output is fickle, then I'd think hiding the fickleness behind
a wrapper we control would be preferable to updating dozens of tests when 
something changes.  Or if a debugger changes its presentation in version 
N+1, but people are still running tests with version N, persuading the 
wrapper to handle both would be less overall work than making every test 
accommodate both formats.
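
A minimal sketch of the kind of wrapper meant here, assuming two made-up
presentations of the same printed value ("$1 = 42" for version N and
"(int) $0 = 42" for version N+1). The patterns and the function name are
illustrative assumptions, not taken from any particular debugger release.

```python
import re

# Two illustrative presentations of one printed value (assumed shapes).
OLD_STYLE = re.compile(r"^\$\d+ = (?P<value>.+)$")
NEW_STYLE = re.compile(r"^\((?P<type>[^)]+)\) \$\d+ = (?P<value>.+)$")


def normalize_print(line):
    """Return the printed value from either presentation, or None."""
    line = line.strip()
    for pattern in (NEW_STYLE, OLD_STYLE):
        match = pattern.match(line)
        if match:
            return match.group("value")
    return None


if __name__ == "__main__":
    print(normalize_print("$1 = 42"))
    print(normalize_print("(int) $0 = 42"))
```

With something like this in one place, a presentation change means adding one
pattern to the wrapper instead of touching every test.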
>  If someone files a bug
> report that includes a WinDbg command log, I'd like to be able to paste
> those statements into a test.
That sounds like your goal is to turn the bug report into a Windows-only 
test, and not a common test.  Is that actually what you want?
If you still want it to be a common test, then you still need to do the
work to write the non-Windows side, and make sure it's actually still
exercising the same thing; not clear you are saving anything by being
able to copy-paste a WinDbg report.
> I also expect that on Windows we will end up having far more debug info
> tests than other platforms, specifically because we don't have the
> ability to write tests against the debugger (as it's proprietary /
> closed source).
I don't see how that follows. Sony runs the GDB test suite using clang as
the compiler, and while that is certainly perverting a debugger test suite
into being a compiler test suite, it has value in being a body of tests
that exercise a variety of debug-info features.  GDB being open source is
completely irrelevant to this use-case.  We treat it as closed.  We have
local changes to the test suite, but not to GDB.  The expected results
from the suite are based on GDB+GCC, which we treat as an oracle; then we
don't bother with tests of debugger features that clearly don't depend on
debug info, such as thread handling.

Whatever CodeView/PDB tests you want to write, you can use MS tools as
your oracle.  Maybe you can't leverage an existing test suite, but it
doesn't mean you can't write tests.
> So the language used in the current set of debug info tests is very
> simple, because GDB and LLDB already have test suites that test more
> complicated things.  But the problem is, we don't have those other test
> suites to fall back on, so we will need much more.
This is an argument in favor of a completely separate WinDbg-based
executable test suite, rather than pumping up debuginfo-tests.
> There's also the issue that we may want to test entirely different
> things.  For example, in the hypothetical example I posted earlier, we
> compile once and link twice with 2 different linkers.  But we might even
> want to compile twice and link twice (compile same program with cl and
> clang-cl, then link both with lld).  The fundamental difference here is
> that we have two different things that can emit debug info - the compiler
> and linker - and we need the flexibility to test both independently of
> each other.
Iterating over many combinations is a distinct testing problem.  It helps
to have the test suite designed to handle this up front.  My experience
is that different combinations will have slightly different pass/fail
results and you need to be ready for that as well.
> I don't see a useful abstraction that glosses over these differences
> that isn't a ton of work for minimal gain, given the frequency with
> which we'd need to fall back to a custom test anyway.
As I mentioned above, you seem to be heading in the direction of a 
completely separate project, rather than being able to usefully
leverage anything from debuginfo-tests other than the basic idea.
--paulr

Zachary Turner via llvm-dev

2017-Sep-08 18:37 UTC


On Fri, Sep 8, 2017 at 10:46 AM Robinson, Paul <paul.robinson at sony.com>
wrote:
> Let me say up front that I sympathize deeply with the problem; debug
> info is an interface, and it is frequently unclear whether the goal of
> some bit of work is to test the producer or test the consumer of the
> interface.  In fact we end up using the producer to test the consumer,
> and (in the case at hand) using the consumer to test the producer.
> There are distinct analogies to testing compilers by seeing what the
> linker thinks, and testing linkers by seeing whether they can handle
> what the compiler produces.
>
> > I understand the desire to keep them as similar as possible, but I'm
> > still not really sold that massaging fickle text output into a different
> > text format is going to make things more scalable.  I'd like there to be
> > as few layers of text processing as possible.
>
> If the text output is fickle, then I'd think hiding the fickleness behind
> a wrapper we control would be preferable to updating dozens of tests when
> something changes.  Or if a debugger changes its presentation in version
> N+1, but people are still running tests with version N, persuading the
> wrapper to handle both would be less overall work than making every test
> accommodate both formats.
>
But that's just my point.  There are clearly going to be tests where both
formats don't even make sense because it's testing something specific to
one debugger.  What if I want to test that we output correct exception
information, so I send a .exr command to the debugger and get back this:

0:000> .exr -1
ExceptionAddress: 77a6db8b (ntdll!LdrpDoDebuggerBreak+0x0000002b)
   ExceptionCode: 80000003 (Break instruction exception)
  ExceptionFlags: 00000000
NumberParameters: 1
   Parameter[0]: 00000000

What if I want to test that the debugger can print a valid stack
trace, so I send a kv command and get back this?

 # ChildEBP RetAddr  Args to Child
00 0198fa4c 77a2f5ca 55fe0b87 00000000 00000000 ntdll!LdrpDoDebuggerBreak+0x2b (FPO: [Non-Fpo])
01 0198fc8c 77a18a42 55fe0bef 00000000 00000000 ntdll!LdrpInitializeProcess+0x1967 (FPO: [Non-Fpo])
02 0198fce4 77a1886c 00000000 bad81aba 00000000 ntdll!_LdrpInitialize+0x180 (FPO: [Non-Fpo])
03 0198fcf4 00000000 0198fd08 779c0000 00000000 ntdll!LdrInitializeThunk+0x1c (FPO: [Non-Fpo])

Whereas GDB would print something like

#0  m4_traceon (obs=0x24eb0, argc=1, argv=0x2b8c8)
    at builtin.c:993
#1  0x6e38 in expand_macro (sym=0x2b600) at macro.c:242
#2  0x6840 in expand_token (obs=0x0, t=177664, td=0xf7fffb08)
    at macro.c:71
(More stack frames follow...)

I really don't want to get in the business of trying to convert the first
format into the second format.  Not only is it a recipe for disaster, but
it leads to worse diagnostics.  When my CHECK statement fails, I can't even
see the original stack trace anymore, only a generic error message like
"could not parse stack trace".
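
To make concrete what such a conversion involves, here is a small sketch that
pulls function names out of both layouts. The regular expressions are written
against the usual one-frame-per-line shape of kv and GDB backtrace output and
are assumptions for illustration; real debugger output has more variants than
these cover. The error keeps the raw text, since losing it is the diagnostics
concern raised above.

```python
import re

# One frame per line: frame number, five 8-digit hex columns, then symbol+offset.
KV_FRAME = re.compile(
    r"^[0-9a-f]{2} (?:[0-9a-f]{8} ){5}(?P<func>\S+?)\+0x[0-9a-f]+"
)
# "#N  [0xADDR in ]function (args...)"
GDB_FRAME = re.compile(r"^#\d+\s+(?:0x[0-9a-f]+ in )?(?P<func>\w+) \(")


class TraceParseError(ValueError):
    """Raised when no frame line is recognized (hypothetical error type)."""


def frame_functions(raw):
    """Return the function name of every recognized frame line."""
    funcs = []
    for line in raw.splitlines():
        match = KV_FRAME.match(line) or GDB_FRAME.match(line)
        if match:
            funcs.append(match.group("func"))
    if not funcs:
        # Keep the original text in the message so it is not lost.
        raise TraceParseError(
            "could not parse stack trace; raw output was:\n" + raw
        )
    return funcs
```

Even this small case needs one pattern per debugger, and anything beyond
function names (arguments, source locations, FPO data) has no common form.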

> > I also expect that on Windows we will end up having far more debug info
> > tests than other platforms, specifically because we don't have the
> > ability to write tests against the debugger (as it's proprietary /
> > closed source).
>
> I don't see how that follows. Sony runs the GDB test suite using clang as
> the compiler, and while that is certainly perverting a debugger test suite
> into being a compiler test suite, it has value in being a body of tests
> that exercise a variety of debug-info features.  GDB being open source is
> completely irrelevant to this use-case.  We treat it as closed.  We have
> local changes to the test suite, but not to GDB.  The expected results
> from the suite are based on GDB+GCC, which we treat as an oracle; then we
> don't bother with tests of debugger features that clearly don't depend on
> debug info, such as thread handling.
>
GDB being open source is very relevant to this case, because it means you
*have* GDB's test suite.  We don't have WinDbg or Visual Studio debugger's
test suite.

>
> Whatever CodeView/PDB tests you want to write, you can use MS tools as
> your oracle.  Maybe you can't leverage an existing test suite, but it
> doesn't mean you can't write tests.
>
Right, it just means we will end up writing plenty of tests that test
specific features of the debugger, something that would normally be handled
in a debugger test suite, which we don't have.

>
> > I don't see a useful abstraction that glosses over these differences
> > that isn't a ton of work for minimal gain, given the frequency with
> > which we'd need to fall back to a custom test anyway.
>
> As I mentioned above, you seem to be heading in the direction of a
> completely separate project, rather than being able to usefully
> leverage anything from debuginfo-tests other than the basic idea.
> --paulr
>
I don't entirely disagree with this assessment.  On the other hand, I don't
see any reason to call it something other than "debuginfo-tests" or to put
it somewhere else, since conceptually both things are the same.

Even in this case though, reusing the source code of the tests seems like a
clear win since the high level ideas behind a test case often transcend
consumer boundaries, even when the implementation doesn't.

Plus, there is more to be gained from sharing than just the tests
themselves.  For example, I'm trying to get debuginfo-tests working
properly with CMake in more idiomatic LLVM style.  If I go off and fork the
tests into an entirely separate project, we wouldn't have that shared
benefit.
