thr3ads.net - llvm dev - [llvm-dev] [cfe-dev] Testing Best Practices/Goals (in the context of compiler-rt) [Feb 2016]

If this information is useful, please help other people find it:
Share via:

David Blaikie via llvm-dev

2016-Feb-26 22:15 UTC

[llvm-dev] [cfe-dev] Testing Best Practices/Goals (in the context of compiler-rt)

On Fri, Feb 26, 2016 at 2:10 PM, Alexey Samsonov <vonosmas at gmail.com>
wrote:
>
> On Fri, Feb 26, 2016 at 1:34 PM, David Blaikie <dblaikie at
gmail.com> wrote:
>
>>
>>
>> On Fri, Feb 26, 2016 at 1:31 PM, Sean Silva <chisophugis at
gmail.com>
>> wrote:
>>
>>>
>>>
>>> On Fri, Feb 26, 2016 at 1:11 PM, David Blaikie <dblaikie at
gmail.com>
>>> wrote:
>>>
>>>>
>>>>
>>>> On Fri, Feb 26, 2016 at 1:07 PM, Sean Silva <chisophugis at
gmail.com>
>>>> wrote:
>>>>
>>>>>
>>>>>
>>>>> On Wed, Feb 17, 2016 at 8:45 AM, David Blaikie via cfe-dev
<
>>>>> cfe-dev at lists.llvm.org> wrote:
>>>>>
>>>>>>
>>>>>>
>>>>>> On Fri, Feb 12, 2016 at 5:43 PM, Alexey Samsonov
<vonosmas at gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Thu, Feb 11, 2016 at 1:50 PM, David Blaikie
<dblaikie at gmail.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Wed, Feb 10, 2016 at 3:55 PM, Alexey
Samsonov via cfe-dev <
>>>>>>>> cfe-dev at lists.llvm.org> wrote:
>>>>>>>>
>>>>>>>>> I mostly agree with what Richard and Justin
said. Adding a few
>>>>>>>>> notes about the general strategy we use:
>>>>>>>>>
>>>>>>>>> (1) lit tests which look
"end-to-end" proved to be way more
>>>>>>>>> convenient for testing runtime libraries
than unit tests.
>>>>>>>>>
>>>>>>>> We do have
>>>>>>>>> the latter, and use them to provide test
coverage for utility
>>>>>>>>> functions, but we quite often accompany fix
to the runtime library with
>>>>>>>>> "end-to-end" small reproducer
extracted from the real-world code
>>>>>>>>> that exposed the issue.
>>>>>>>>> Incidentally, this tests a whole lot of
other functionality: Clang
>>>>>>>>> driver, frontend, LLVM passes, etc, but
it's not the intent of the test.
>>>>>>>>>
>>>>>>>>
>>>>>>>> Indeed - this is analogous to the tests for,
say, LLD that use
>>>>>>>> llvm-mc to produce the inputs rather than
checking in object files. That
>>>>>>>> area is open to some discussion as to just how
many tools we should rope
>>>>>>>> in/how isolated we should make tests (eg: maybe
building the json object
>>>>>>>> file format was going too far towards
isolation? Not clear - opinions
>>>>>>>> differ). But the point of the test is to test
the compiler-rt functionality
>>>>>>>> that was added/removed/modified.
>>>>>>>>
>>>>>>>> I think most people are in agreement with that,
while acknowledging
>>>>>>>> the fuzzy line about how isolated we might be.
>>>>>>>>
>>>>>>>
>>>>>>> Yes.
>>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>> These tests are sometimes platform-specific
and poorly portable,
>>>>>>>>> but they are more reliable (we make the
same steps as the
>>>>>>>>> user of the compiler), and serve the
purpose of documentation.
>>>>>>>>>
>>>>>>>>> (2) If we change LLVM instrumentation - we
add a test to LLVM. If
>>>>>>>>> we change Clang code generation or driver
behavior - we add
>>>>>>>>> a test to Clang. No excuses here.
>>>>>>>>>
>>>>>>>>> (3) Sometimes we still add a compiler-rt
test for the change in
>>>>>>>>> LLVM or Clang: e.g. if we enhance Clang
frontend to teach UBSan
>>>>>>>>> to detecting yet another kind of overflow,
it makes sense to add a
>>>>>>>>> test to UBSan test-suite that demonstrates
it, in addition to
>>>>>>>>> Clang test verifying that we emit a call to
UBSan runtime. Also,
>>>>>>>>> compiler-rt test would allow us to verify
that the actual error report
>>>>>>>>> we present to the user is sane.
>>>>>>>>>
>>>>>>>>
>>>>>>>> This bit ^ is a bit unclear to me. If there was
no change to the
>>>>>>>> UBSan runtime, and the code generated by Clang
is equivalent/similar to an
>>>>>>>> existing use of the UBSan runtime - what is it
that the new compiler-rt
>>>>>>>> test is providing? (perhaps you could give a
concrete example you had in
>>>>>>>> mind to look at?)
>>>>>>>>
>>>>>>>
>>>>>>> See r235568 (change to Clang) followed by r235569
(change to
>>>>>>> compiler-rt test). Now, it's a cheat because
I'm fixing test, not adding
>>>>>>> it. However, I would have definitely added it, if
it was missing.
>>>>>>>
>>>>>>
>>>>>> Right, I think the difference here is "if it was
missing" - the test
>>>>>> case itself seems like it could be a reasonable one
(are there other tests
>>>>>> of the same compiler-rt functionality? (I assume the
compiler-rt
>>>>>> functionality is the implementation of sadd/ssub?))
>>>>>>
>>>>>>
>>>>>>> In this case, a change to Clang
>>>>>>> instrumentation (arguments passed to UBSan runtime
callbacks)
>>>>>>> improved the user-facing part of the tool, and
compiler-rt test suite is a
>>>>>>> good place to verify that.
>>>>>>>
>>>>>>
>>>>>> This seems like the problematic part - changes to LLVM
improve the
>>>>>> user-facing part of Clang, but we don't add
end-to-end tests of that, as a
>>>>>> general rule. I'm trying to understand why the
difference between that and
>>>>>> compiler-rt
>>>>>>
>>>>>
>>>>> In what way do changes in LLVM change the user-facing part
of Clang?
>>>>> It obviously depends on how broadly one defines
user-facing. Is a 1%
>>>>> performance improvement from a particular optimization
user-facing? Is
>>>>> better debug info accuracy user-facing? I'm not sure.
But it seems clear
>>>>> that "the user sees a diagnostic or not"
definitely is.
>>>>>
>>>>
>>>> There's more than just performance in LLVM - ABI features,
and yes, I'd
>>>> argue some pieces of debug info are pretty user facing (as are
some
>>>> optimizations). We also have the remarks system in place now.
(also the
>>>> compiler crashing (or not) is pretty user facing).
>>>>
>>>
>>> I'd argue that we probably should have some sort of integration
tests
>>> for ABI features. I think at the moment we're getting by thanks
to
>>> self-hosting and regularly building lots of real-world programs
with
>>> ToT-ish compilers.
>>>
>>
>> Perhaps so, but I'd argue that they shouldn't be run as part of
"make
>> check" & should be in a separate test grouping (probably
mostly run by
>> buildbots) for the purpose of integration testing.
>>
>
> If you have llvm/clang/compiler-rt/libc++/libc++abi checkout, they are not
> run as a part of "make check", only "make check-all",
which kind of makes
> sense (run *all* the tests!). You're free to run "make
check-clang", "make
> check-asan" etc.
> if you're sure your changes are limited in scope. Just to be clear - do
> you suggest that compiler-rt tests are too heavy for this configuration,
> and want to introduce extra level - i.e. extract "make
check-compiler-rt"
> out of "make check-all", and introduce "make
check-absolutely-everything",
> that would encompass them?
>
Fair point, check-all would be/is check all the features, but, yes,
potentially pulling out a separate subgroup for integration testing of any
kind. Yes, I realize this has the "-Wall" problem. But I suppose what
I'm
getting at is "check" in the LLVM project parlance is, with the
exception
of compiler-rt, the "lightweight, immediate verification tests, not
integration testing" & I would rather like that to be the case across
the
board.

Perhaps using a different verb for cross-project testing (I don't object to
a catch-all target for running all the different kinds of testing we have).
Naming is hard.

>
>
>> We've made a pretty conscious, deliberate, and consistent effort to
not
>> do integration testing across the LLVM projects in "make
check"-like
>> testing, and to fix it where we do find it. It seems to me that
compiler-rt
>> diverged from that approach and I'm not really in favor of that
divergence.
>>
>
> I don't see why consistency by itself is a good thing.
>
Makes it easier to work between projects. Sets expectations for developers
when they work in one area or another.

> As a sanitizer developer, current situation is convenient for me, but if
> it harms / slows down / complicates workflow for other developers or LLVM
> as a whole - sure, let's fix it.
>
One of the things I'm trying to understand is what the benefit of this
extra testing is - to take the add/sub example again, is adding extra
coverage for printing different text through the same mechanism in
compiler-rt valuable? What sort of regressions/bugs is that catching that
makes it compelling to author, maintain, and incur the ongoing execution
cost of?

>
>
>>
>> - David
>>
>>
>>>
>>> -- Sean Silva
>>>
>>>
>>>>
>>>>
>>>>> Also, I think part of this is that in compiler-rt there are
usually
>>>>> more moving parts we don't control. E.g. it isn't
just the interface
>>>>> between LLVM and clang. The information needs to pass
through archivers,
>>>>> linkers, runtime loaders, etc. that all may have issues
that affect whether
>>>>> the user sees the final result. In general the interface
between LLVM and
>>>>> clang has no middlemen so there really isn't anything
to check.
>>>>>
>>>>
>>>> Correctness/behavior of the compiler depends on those things
too
>>>> (linkers, loaders, etc) to produce the final working product
the user
>>>> requested. If we emitted symbols with the wrong linkage we
could produce
>>>> linker errors, drop important entities, etc. But we don't
generally test
>>>> that the output of LLVM/Clang produces the right binary when
linked, we
>>>> test that it produces the right linkages on the resulting
entities.
>>>>
>>>> - David
>>>>
>>>>
>>>>>
>>>>> -- Sean Silva
>>>>>
>>>>>
>>>>>>
>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>>> You may argue that Clang test would have been
enough (I disagree
>>>>>>> with that), or that it qualifies as "adding
coverage" (maybe).
>>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>> (4) True, we're intimidated by
test-suite :) I feel that current
>>>>>>>>> use of compiler-rt test suite to check
compiler-rt libs better follows
>>>>>>>>> the doctrine described by David.
>>>>>>>>>
>>>>>>>>
>>>>>>>> Which David? ;) (I guess David Li, not me)
>>>>>>>>
>>>>>>>
>>>>>>> Nope, paragraph 2 from your original email.
>>>>>>>
>>>>>>>
>>>>>>>> I think maybe what could be worth doing would
be separating out the
>>>>>>>> broader/intentionally "end to end"
sort of tests from the ones intended to
>>>>>>>> test compiler-rt in relative isolation.
>>>>>>>>
>>>>>>>
>>>>>>> It's really hard to draw the line here, even
some of compiler-rt
>>>>>>> unit tests require instrumentation, therefore
depend on new features of
>>>>>>> Clang/LLVM. Unlike builtins, which are
>>>>>>> trivial to test in isolation, testing sanitizer
runtimes in
>>>>>>> isolation (w/o compiler) is often hard to implement
(we tried to do so for
>>>>>>> TSan, but found unit tests extremely hard to
write),
>>>>>>> and is barely useful - compiler-rt runtimes
don't consist of modules
>>>>>>> (like LLVMCodeGen and LLVMMC for instance), and are
never used w/o the
>>>>>>> compiler anyway.
>>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>> Most importantly, I'd expect only the
latter to run in a "make
>>>>>>>> check-all" run, as we do for Clang/LLVM,
etc.
>>>>>>>>
>>>>>>>
>>>>>>> And now we're getting to the goals :) Why would
such a change be
>>>>>>> good? Do you worry about the time it takes to
execute the test suite?
>>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>> Also, there's significant complexity in
compiler-rt test suite
>>>>>>>>> that narrows the tests executed
>>>>>>>>> to those supported by current host.
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Wed, Feb 10, 2016 at 2:33 PM, Xinliang
David Li via cfe-dev <
>>>>>>>>> cfe-dev at lists.llvm.org> wrote:
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Wed, Feb 10, 2016 at 2:11 PM, Justin
Bogner via llvm-dev <
>>>>>>>>>> llvm-dev at lists.llvm.org> wrote:
>>>>>>>>>>
>>>>>>>>>>> David Blaikie via cfe-dev
<cfe-dev at lists.llvm.org> writes:
>>>>>>>>>>> > Recently had a bit of a
digression in a review thread related
>>>>>>>>>>> to some tests
>>>>>>>>>>> > going in to compiler-rt (
>>>>>>>>>>> >
>>>>>>>>>>>
http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20160208/330759.html
>>>>>>>>>>> > ) and there seems to be some
disconnect at least between my
>>>>>>>>>>> expectations
>>>>>>>>>>> > and reality. So I figured
I'd have a bit of a discussion out
>>>>>>>>>>> here on the
>>>>>>>>>>> > dev lists where there's a
bit more visibility.
>>>>>>>>>>> >
>>>>>>>>>>> > My basic expectation is that
the lit tests in any LLVM project
>>>>>>>>>>> except the
>>>>>>>>>>> > test-suite are targeted tests
intended to test only the
>>>>>>>>>>> functionality in
>>>>>>>>>>> > the project. This seems like a
pretty well accepted doctrine
>>>>>>>>>>> across most
>>>>>>>>>>> > LLVM projects - most visibly
in Clang, where we make a
>>>>>>>>>>> concerted effort not
>>>>>>>>>>> > to have tests that execute
LLVM optimizations, etc.
>>>>>>>>>>> >
>>>>>>>>>>> > There are exceptions/middle
ground to this - DIBuilder is in
>>>>>>>>>>> LLVM, but
>>>>>>>>>>> > essentially tested in Clang
rather than writing LLVM unit
>>>>>>>>>>> tests. It's
>>>>>>>>>>> > somewhat unavoidable that any
of the IR building code
>>>>>>>>>>> (IRBuilder,
>>>>>>>>>>> > DIBuilder, IR asm printing,
etc) is 'tested' incidentally in
>>>>>>>>>>> Clang in
>>>>>>>>>>> > process of testing Clang's
IR generation. But these are seen
>>>>>>>>>>> as incidental,
>>>>>>>>>>> > not intentionally trying to
cover LLVM with Clang tests (we
>>>>>>>>>>> don't add a
>>>>>>>>>>> > Clang test if we add a new
feature to IRBuilder just to test
>>>>>>>>>>> the IRBuilder).
>>>>>>>>>>> >
>>>>>>>>>>> > Another case with some middle
ground are things like linker
>>>>>>>>>>> tests and
>>>>>>>>>>> > objdump, dwarfdump, etc - in
theory to isolate the test we
>>>>>>>>>>> would checkin
>>>>>>>>>>> > binaries (or the textual
object representation lld had for a
>>>>>>>>>>> while, etc) to
>>>>>>>>>>> > test those tools. Some tests
instead checkin assembly and
>>>>>>>>>>> assemble it with
>>>>>>>>>>> > llvm-mc. Again, not to cover
llvm-mc, but on the assumption
>>>>>>>>>>> that llvm-mc is
>>>>>>>>>>> > tested, and just using it as a
tool to make tests easier to
>>>>>>>>>>> maintain.
>>>>>>>>>>> >
>>>>>>>>>>> > So I was surprised to find
that the compiler-rt lit tests seem
>>>>>>>>>>> to diverge
>>>>>>>>>>> > from this philosophy &
contain more intentional end-to-end
>>>>>>>>>>> tests (eg:
>>>>>>>>>>> > adding a test there when
making a fix to Clang to add a
>>>>>>>>>>> counter to a
>>>>>>>>>>> > function that was otherwise
missing a counter - I'd expect
>>>>>>>>>>> that to be
>>>>>>>>>>> > tested in Clang and that there
would already be coverage in
>>>>>>>>>>> compiler-rt for
>>>>>>>>>>> > "if a function has a
counter, does compiler-rt do the right
>>>>>>>>>>> thing with that
>>>>>>>>>>> > counter" (testing
whatever code in compiler-rt needs to be
>>>>>>>>>>> tested)).
>>>>>>>>>>> >
>>>>>>>>>>> > Am I off base here? Are
compiler-rt's tests fundamentally
>>>>>>>>>>> different to the
>>>>>>>>>>> > rest of the LLVM project? Why?
Should they continue to be?
>>>>>>>>>>>
>>>>>>>>>>> I think there's a bit of grey
area in terms testing the runtime -
>>>>>>>>>>> generally it's pretty hard to
use the runtime without a fairly
>>>>>>>>>>> end-to-end test, so tests of the
runtime often end up looking
>>>>>>>>>>> pretty
>>>>>>>>>>> close to an end-to-end test.
>>>>>>>>>>>
>>>>>>>>>>> That said, I don't think that
should be used as an excuse to
>>>>>>>>>>> sneak
>>>>>>>>>>> arbitrary end-to-end tests into
compiler-rt. We should
>>>>>>>>>>> absolutely write
>>>>>>>>>>> tests in clang and llvm that
we're inputting what we expect to
>>>>>>>>>>> the
>>>>>>>>>>> runtime and try to keep the tests
in compiler-rt as focused on
>>>>>>>>>>> just
>>>>>>>>>>> exercising the runtime code as
possible.
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Yes, we should not use compiler-rt
tests as an excuse of not
>>>>>>>>>> adding clang/LLVM test. The latter
should always be added if possible --
>>>>>>>>>> they are platform independent and is
the first level of defense.  runtime
>>>>>>>>>> test's focus is also more on the
runtime lib itself and interaction between
>>>>>>>>>>  runtime, compiler, binutils and other
tools.
>>>>>>>>>>
>>>>>>>>>> David
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> IIUC, the correct place for
integration tests in general is
>>>>>>>>>>> somewhere
>>>>>>>>>>> like test-suite, but I think
it's a bit intimidating to some
>>>>>>>>>>> people to
>>>>>>>>>>> add new tests there (Are there docs
on this?). I suspect some of
>>>>>>>>>>> the
>>>>>>>>>>> profiling related tests in
compiler-rt are doing a bit much and
>>>>>>>>>>> should
>>>>>>>>>>> graduate to a spot in the
test-suite (but I don't have time to
>>>>>>>>>>> volunteer
>>>>>>>>>>> to do the work, unfortunately).
>>>>>>>>>>>
_______________________________________________
>>>>>>>>>>> LLVM Developers mailing list
>>>>>>>>>>> llvm-dev at lists.llvm.org
>>>>>>>>>>>
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
_______________________________________________
>>>>>>>>>> cfe-dev mailing list
>>>>>>>>>> cfe-dev at lists.llvm.org
>>>>>>>>>>
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> Alexey Samsonov
>>>>>>>>> vonosmas at gmail.com
>>>>>>>>>
>>>>>>>>>
_______________________________________________
>>>>>>>>> cfe-dev mailing list
>>>>>>>>> cfe-dev at lists.llvm.org
>>>>>>>>>
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> Alexey Samsonov
>>>>>>> vonosmas at gmail.com
>>>>>>>
>>>>>>
>>>>>>
>>>>>> _______________________________________________
>>>>>> cfe-dev mailing list
>>>>>> cfe-dev at lists.llvm.org
>>>>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>
>
> --
> Alexey Samsonov
> vonosmas at gmail.com
>-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20160226/a37df560/attachment-0001.html>

Xinliang David Li via llvm-dev

2016-Feb-26 23:17 UTC

head link

[llvm-dev] [cfe-dev] Testing Best Practices/Goals (in the context of compiler-rt)

Sean and Alexey have said a lot on this topic.  Here is my version of
explanation that LLVM testing is not suitable to replace end to end testing.

 - The most important reason being that LLVM tests tend to test
'implementation details'. This has many drawbacks:
  a) by only writing test cases like this, it is hard to expose bugs
inherited in the implementation itself;
  b) the connection between the implementation and end-user expectation is
not obvious
  c) often in times, we will have to throw out the existing implementation
for a whole new more efficient implementation -- not only do we need to pay
the cost of tedious test case update (from one impl to another) -- but that
all the previous implementation specific code coverage become useless --
how are we sure the new implementation does not break the end-end user
expectation?

End-end tests on the other hand have no such drawbacks -- its specification
is well defined:
  1) it is easy to write test cases based on well defined specifications
and cover all corner cases (e.g, new language constructs etc);
  2) it is easy to extract test cases from real failures from very large
apps
  3) once the test case is written, it almost never needs to be updated as
long as the public interfaces (options) are used.

End-end testing also cover areas not in compiler proper -- like runtime,
linker, other tools, and their subtle interactions.  For instance latest
darwin i386 profile-rt bug exposed by the improved profile runtime test
coverage will be hard to expose via LLVM only testing ..

In short, I think runtime tests do not overlap with LLVM testing -- even
though they look like testing the same thing. For reference, we currently
have < 100 profile rt end-end testing which is alarmingly small.  We should
push increasing its coverage.

thanks,

David

On Fri, Feb 26, 2016 at 2:15 PM, David Blaikie <dblaikie at gmail.com>
wrote:
>
>
> On Fri, Feb 26, 2016 at 2:10 PM, Alexey Samsonov <vonosmas at
gmail.com>
> wrote:
>
>>
>> On Fri, Feb 26, 2016 at 1:34 PM, David Blaikie <dblaikie at
gmail.com>
>> wrote:
>>
>>>
>>>
>>> On Fri, Feb 26, 2016 at 1:31 PM, Sean Silva <chisophugis at
gmail.com>
>>> wrote:
>>>
>>>>
>>>>
>>>> On Fri, Feb 26, 2016 at 1:11 PM, David Blaikie <dblaikie at
gmail.com>
>>>> wrote:
>>>>
>>>>>
>>>>>
>>>>> On Fri, Feb 26, 2016 at 1:07 PM, Sean Silva <chisophugis
at gmail.com>
>>>>> wrote:
>>>>>
>>>>>>
>>>>>>
>>>>>> On Wed, Feb 17, 2016 at 8:45 AM, David Blaikie via
cfe-dev <
>>>>>> cfe-dev at lists.llvm.org> wrote:
>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Fri, Feb 12, 2016 at 5:43 PM, Alexey Samsonov
<vonosmas at gmail.com
>>>>>>> > wrote:
>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Thu, Feb 11, 2016 at 1:50 PM, David Blaikie
<dblaikie at gmail.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Wed, Feb 10, 2016 at 3:55 PM, Alexey
Samsonov via cfe-dev <
>>>>>>>>> cfe-dev at lists.llvm.org> wrote:
>>>>>>>>>
>>>>>>>>>> I mostly agree with what Richard and
Justin said. Adding a few
>>>>>>>>>> notes about the general strategy we
use:
>>>>>>>>>>
>>>>>>>>>> (1) lit tests which look
"end-to-end" proved to be way more
>>>>>>>>>> convenient for testing runtime
libraries than unit tests.
>>>>>>>>>>
>>>>>>>>> We do have
>>>>>>>>>> the latter, and use them to provide
test coverage for utility
>>>>>>>>>> functions, but we quite often accompany
fix to the runtime library with
>>>>>>>>>> "end-to-end" small reproducer
extracted from the real-world code
>>>>>>>>>> that exposed the issue.
>>>>>>>>>> Incidentally, this tests a whole lot of
other functionality:
>>>>>>>>>> Clang driver, frontend, LLVM passes,
etc, but it's not the intent of the
>>>>>>>>>> test.
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Indeed - this is analogous to the tests
for, say, LLD that use
>>>>>>>>> llvm-mc to produce the inputs rather than
checking in object files. That
>>>>>>>>> area is open to some discussion as to just
how many tools we should rope
>>>>>>>>> in/how isolated we should make tests (eg:
maybe building the json object
>>>>>>>>> file format was going too far towards
isolation? Not clear - opinions
>>>>>>>>> differ). But the point of the test is to
test the compiler-rt functionality
>>>>>>>>> that was added/removed/modified.
>>>>>>>>>
>>>>>>>>> I think most people are in agreement with
that, while
>>>>>>>>> acknowledging the fuzzy line about how
isolated we might be.
>>>>>>>>>
>>>>>>>>
>>>>>>>> Yes.
>>>>>>>>
>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>> These tests are sometimes
platform-specific and poorly portable,
>>>>>>>>>> but they are more reliable (we make the
same steps as the
>>>>>>>>>> user of the compiler), and serve the
purpose of documentation.
>>>>>>>>>>
>>>>>>>>>> (2) If we change LLVM instrumentation -
we add a test to LLVM. If
>>>>>>>>>> we change Clang code generation or
driver behavior - we add
>>>>>>>>>> a test to Clang. No excuses here.
>>>>>>>>>>
>>>>>>>>>> (3) Sometimes we still add a
compiler-rt test for the change in
>>>>>>>>>> LLVM or Clang: e.g. if we enhance Clang
frontend to teach UBSan
>>>>>>>>>> to detecting yet another kind of
overflow, it makes sense to add
>>>>>>>>>> a test to UBSan test-suite that
demonstrates it, in addition to
>>>>>>>>>> Clang test verifying that we emit a
call to UBSan runtime. Also,
>>>>>>>>>> compiler-rt test would allow us to
verify that the actual error report
>>>>>>>>>> we present to the user is sane.
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> This bit ^ is a bit unclear to me. If there
was no change to the
>>>>>>>>> UBSan runtime, and the code generated by
Clang is equivalent/similar to an
>>>>>>>>> existing use of the UBSan runtime - what is
it that the new compiler-rt
>>>>>>>>> test is providing? (perhaps you could give
a concrete example you had in
>>>>>>>>> mind to look at?)
>>>>>>>>>
>>>>>>>>
>>>>>>>> See r235568 (change to Clang) followed by
r235569 (change to
>>>>>>>> compiler-rt test). Now, it's a cheat
because I'm fixing test, not adding
>>>>>>>> it. However, I would have definitely added it,
if it was missing.
>>>>>>>>
>>>>>>>
>>>>>>> Right, I think the difference here is "if it
was missing" - the test
>>>>>>> case itself seems like it could be a reasonable one
(are there other tests
>>>>>>> of the same compiler-rt functionality? (I assume
the compiler-rt
>>>>>>> functionality is the implementation of sadd/ssub?))
>>>>>>>
>>>>>>>
>>>>>>>> In this case, a change to Clang
>>>>>>>> instrumentation (arguments passed to UBSan
runtime callbacks)
>>>>>>>> improved the user-facing part of the tool, and
compiler-rt test suite is a
>>>>>>>> good place to verify that.
>>>>>>>>
>>>>>>>
>>>>>>> This seems like the problematic part - changes to
LLVM improve the
>>>>>>> user-facing part of Clang, but we don't add
end-to-end tests of that, as a
>>>>>>> general rule. I'm trying to understand why the
difference between that and
>>>>>>> compiler-rt
>>>>>>>
>>>>>>
>>>>>> In what way do changes in LLVM change the user-facing
part of Clang?
>>>>>> It obviously depends on how broadly one defines
user-facing. Is a 1%
>>>>>> performance improvement from a particular optimization
user-facing? Is
>>>>>> better debug info accuracy user-facing? I'm not
sure. But it seems clear
>>>>>> that "the user sees a diagnostic or not"
definitely is.
>>>>>>
>>>>>
>>>>> There's more than just performance in LLVM - ABI
features, and yes,
>>>>> I'd argue some pieces of debug info are pretty user
facing (as are some
>>>>> optimizations). We also have the remarks system in place
now. (also the
>>>>> compiler crashing (or not) is pretty user facing).
>>>>>
>>>>
>>>> I'd argue that we probably should have some sort of
integration tests
>>>> for ABI features. I think at the moment we're getting by
thanks to
>>>> self-hosting and regularly building lots of real-world programs
with
>>>> ToT-ish compilers.
>>>>
>>>
>>> Perhaps so, but I'd argue that they shouldn't be run as
part of "make
>>> check" & should be in a separate test grouping (probably
mostly run by
>>> buildbots) for the purpose of integration testing.
>>>
>>
>> If you have llvm/clang/compiler-rt/libc++/libc++abi checkout, they are
>> not run as a part of "make check", only "make
check-all", which kind of
>> makes sense (run *all* the tests!). You're free to run "make
check-clang",
>> "make check-asan" etc.
>> if you're sure your changes are limited in scope. Just to be clear
- do
>> you suggest that compiler-rt tests are too heavy for this
configuration,
>> and want to introduce extra level - i.e. extract "make
check-compiler-rt"
>> out of "make check-all", and introduce "make
check-absolutely-everything",
>> that would encompass them?
>>
>
> Fair point, check-all would be/is check all the features, but, yes,
> potentially pulling out a separate subgroup for integration testing of any
> kind. Yes, I realize this has the "-Wall" problem. But I suppose
what I'm
> getting at is "check" in the LLVM project parlance is, with the
exception
> of compiler-rt, the "lightweight, immediate verification tests, not
> integration testing" & I would rather like that to be the case
across the
> board.
>
> Perhaps using a different verb for cross-project testing (I don't
object
> to a catch-all target for running all the different kinds of testing we
> have). Naming is hard.
>
>
>>
>>
>>> We've made a pretty conscious, deliberate, and consistent
effort to not
>>> do integration testing across the LLVM projects in "make
check"-like
>>> testing, and to fix it where we do find it. It seems to me that
compiler-rt
>>> diverged from that approach and I'm not really in favor of that
divergence.
>>>
>>
>> I don't see why consistency by itself is a good thing.
>>
>
> Makes it easier to work between projects. Sets expectations for developers
> when they work in one area or another.
>
>
>> As a sanitizer developer, current situation is convenient for me, but
if
>> it harms / slows down / complicates workflow for other developers or
LLVM
>> as a whole - sure, let's fix it.
>>
>
> One of the things I'm trying to understand is what the benefit of this
> extra testing is - to take the add/sub example again, is adding extra
> coverage for printing different text through the same mechanism in
> compiler-rt valuable? What sort of regressions/bugs is that catching that
> makes it compelling to author, maintain, and incur the ongoing execution
> cost of?
>

>
>
>>
>>
>>>
>>> - David
>>>
>>>
>>>>
>>>> -- Sean Silva
>>>>
>>>>
>>>>>
>>>>>
>>>>>> Also, I think part of this is that in compiler-rt there
are usually
>>>>>> more moving parts we don't control. E.g. it
isn't just the interface
>>>>>> between LLVM and clang. The information needs to pass
through archivers,
>>>>>> linkers, runtime loaders, etc. that all may have issues
that affect whether
>>>>>> the user sees the final result. In general the
interface between LLVM and
>>>>>> clang has no middlemen so there really isn't
anything to check.
>>>>>>
>>>>>
>>>>> Correctness/behavior of the compiler depends on those
things too
>>>>> (linkers, loaders, etc) to produce the final working
product the user
>>>>> requested. If we emitted symbols with the wrong linkage we
could produce
>>>>> linker errors, drop important entities, etc. But we
don't generally test
>>>>> that the output of LLVM/Clang produces the right binary
when linked, we
>>>>> test that it produces the right linkages on the resulting
entities.
>>>>>
>>>>> - David
>>>>>
>>>>>
>>>>>>
>>>>>> -- Sean Silva
>>>>>>
>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>>> You may argue that Clang test would have been
enough (I disagree
>>>>>>>> with that), or that it qualifies as
"adding coverage" (maybe).
>>>>>>>>
>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>> (4) True, we're intimidated by
test-suite :) I feel that current
>>>>>>>>>> use of compiler-rt test suite to check
compiler-rt libs better follows
>>>>>>>>>> the doctrine described by David.
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Which David? ;) (I guess David Li, not me)
>>>>>>>>>
>>>>>>>>
>>>>>>>> Nope, paragraph 2 from your original email.
>>>>>>>>
>>>>>>>>
>>>>>>>>> I think maybe what could be worth doing
would be separating out
>>>>>>>>> the broader/intentionally "end to
end" sort of tests from the ones intended
>>>>>>>>> to test compiler-rt in relative isolation.
>>>>>>>>>
>>>>>>>>
>>>>>>>> It's really hard to draw the line here,
even some of compiler-rt
>>>>>>>> unit tests require instrumentation, therefore
depend on new features of
>>>>>>>> Clang/LLVM. Unlike builtins, which are
>>>>>>>> trivial to test in isolation, testing sanitizer
runtimes in
>>>>>>>> isolation (w/o compiler) is often hard to
implement (we tried to do so for
>>>>>>>> TSan, but found unit tests extremely hard to
write),
>>>>>>>> and is barely useful - compiler-rt runtimes
don't consist of
>>>>>>>> modules (like LLVMCodeGen and LLVMMC for
instance), and are never used w/o
>>>>>>>> the compiler anyway.
>>>>>>>>
>>>>>>>>
>>>>>>>>>
>>>>>>>>> Most importantly, I'd expect only the
latter to run in a "make
>>>>>>>>> check-all" run, as we do for
Clang/LLVM, etc.
>>>>>>>>>
>>>>>>>>
>>>>>>>> And now we're getting to the goals :) Why
would such a change be
>>>>>>>> good? Do you worry about the time it takes to
execute the test suite?
>>>>>>>>
>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>> Also, there's significant
complexity in compiler-rt test suite
>>>>>>>>>> that narrows the tests executed
>>>>>>>>>> to those supported by current host.
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Wed, Feb 10, 2016 at 2:33 PM,
Xinliang David Li via cfe-dev <
>>>>>>>>>> cfe-dev at lists.llvm.org> wrote:
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Wed, Feb 10, 2016 at 2:11 PM,
Justin Bogner via llvm-dev <
>>>>>>>>>>> llvm-dev at lists.llvm.org>
wrote:
>>>>>>>>>>>
>>>>>>>>>>>> David Blaikie via cfe-dev
<cfe-dev at lists.llvm.org> writes:
>>>>>>>>>>>> > Recently had a bit of a
digression in a review thread related
>>>>>>>>>>>> to some tests
>>>>>>>>>>>> > going in to compiler-rt (
>>>>>>>>>>>> >
>>>>>>>>>>>>
http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20160208/330759.html
>>>>>>>>>>>> > ) and there seems to be
some disconnect at least between my
>>>>>>>>>>>> expectations
>>>>>>>>>>>> > and reality. So I figured
I'd have a bit of a discussion out
>>>>>>>>>>>> here on the
>>>>>>>>>>>> > dev lists where
there's a bit more visibility.
>>>>>>>>>>>> >
>>>>>>>>>>>> > My basic expectation is
that the lit tests in any LLVM
>>>>>>>>>>>> project except the
>>>>>>>>>>>> > test-suite are targeted
tests intended to test only the
>>>>>>>>>>>> functionality in
>>>>>>>>>>>> > the project. This seems
like a pretty well accepted doctrine
>>>>>>>>>>>> across most
>>>>>>>>>>>> > LLVM projects - most
visibly in Clang, where we make a
>>>>>>>>>>>> concerted effort not
>>>>>>>>>>>> > to have tests that execute
LLVM optimizations, etc.
>>>>>>>>>>>> >
>>>>>>>>>>>> > There are
exceptions/middle ground to this - DIBuilder is in
>>>>>>>>>>>> LLVM, but
>>>>>>>>>>>> > essentially tested in
Clang rather than writing LLVM unit
>>>>>>>>>>>> tests. It's
>>>>>>>>>>>> > somewhat unavoidable that
any of the IR building code
>>>>>>>>>>>> (IRBuilder,
>>>>>>>>>>>> > DIBuilder, IR asm
printing, etc) is 'tested' incidentally in
>>>>>>>>>>>> Clang in
>>>>>>>>>>>> > process of testing
Clang's IR generation. But these are seen
>>>>>>>>>>>> as incidental,
>>>>>>>>>>>> > not intentionally trying
to cover LLVM with Clang tests (we
>>>>>>>>>>>> don't add a
>>>>>>>>>>>> > Clang test if we add a new
feature to IRBuilder just to test
>>>>>>>>>>>> the IRBuilder).
>>>>>>>>>>>> >
>>>>>>>>>>>> > Another case with some
middle ground are things like linker
>>>>>>>>>>>> tests and
>>>>>>>>>>>> > objdump, dwarfdump, etc -
in theory to isolate the test we
>>>>>>>>>>>> would checkin
>>>>>>>>>>>> > binaries (or the textual
object representation lld had for a
>>>>>>>>>>>> while, etc) to
>>>>>>>>>>>> > test those tools. Some
tests instead checkin assembly and
>>>>>>>>>>>> assemble it with
>>>>>>>>>>>> > llvm-mc. Again, not to
cover llvm-mc, but on the assumption
>>>>>>>>>>>> that llvm-mc is
>>>>>>>>>>>> > tested, and just using it
as a tool to make tests easier to
>>>>>>>>>>>> maintain.
>>>>>>>>>>>> >
>>>>>>>>>>>> > So I was surprised to find
that the compiler-rt lit tests
>>>>>>>>>>>> seem to diverge
>>>>>>>>>>>> > from this philosophy &
contain more intentional end-to-end
>>>>>>>>>>>> tests (eg:
>>>>>>>>>>>> > adding a test there when
making a fix to Clang to add a
>>>>>>>>>>>> counter to a
>>>>>>>>>>>> > function that was
otherwise missing a counter - I'd expect
>>>>>>>>>>>> that to be
>>>>>>>>>>>> > tested in Clang and that
there would already be coverage in
>>>>>>>>>>>> compiler-rt for
>>>>>>>>>>>> > "if a function has a
counter, does compiler-rt do the right
>>>>>>>>>>>> thing with that
>>>>>>>>>>>> > counter" (testing
whatever code in compiler-rt needs to be
>>>>>>>>>>>> tested)).
>>>>>>>>>>>> >
>>>>>>>>>>>> > Am I off base here? Are
compiler-rt's tests fundamentally
>>>>>>>>>>>> different to the
>>>>>>>>>>>> > rest of the LLVM project?
Why? Should they continue to be?
>>>>>>>>>>>>
>>>>>>>>>>>> I think there's a bit of
grey area in terms testing the runtime
>>>>>>>>>>>> -
>>>>>>>>>>>> generally it's pretty hard
to use the runtime without a fairly
>>>>>>>>>>>> end-to-end test, so tests of
the runtime often end up looking
>>>>>>>>>>>> pretty
>>>>>>>>>>>> close to an end-to-end test.
>>>>>>>>>>>>
>>>>>>>>>>>> That said, I don't think
that should be used as an excuse to
>>>>>>>>>>>> sneak
>>>>>>>>>>>> arbitrary end-to-end tests into
compiler-rt. We should
>>>>>>>>>>>> absolutely write
>>>>>>>>>>>> tests in clang and llvm that
we're inputting what we expect to
>>>>>>>>>>>> the
>>>>>>>>>>>> runtime and try to keep the
tests in compiler-rt as focused on
>>>>>>>>>>>> just
>>>>>>>>>>>> exercising the runtime code as
possible.
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Yes, we should not use compiler-rt
tests as an excuse of not
>>>>>>>>>>> adding clang/LLVM test. The latter
should always be added if possible --
>>>>>>>>>>> they are platform independent and
is the first level of defense.  runtime
>>>>>>>>>>> test's focus is also more on
the runtime lib itself and interaction between
>>>>>>>>>>>  runtime, compiler, binutils and
other tools.
>>>>>>>>>>>
>>>>>>>>>>> David
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> IIUC, the correct place for
integration tests in general is
>>>>>>>>>>>> somewhere
>>>>>>>>>>>> like test-suite, but I think
it's a bit intimidating to some
>>>>>>>>>>>> people to
>>>>>>>>>>>> add new tests there (Are there
docs on this?). I suspect some
>>>>>>>>>>>> of the
>>>>>>>>>>>> profiling related tests in
compiler-rt are doing a bit much and
>>>>>>>>>>>> should
>>>>>>>>>>>> graduate to a spot in the
test-suite (but I don't have time to
>>>>>>>>>>>> volunteer
>>>>>>>>>>>> to do the work, unfortunately).
>>>>>>>>>>>>
_______________________________________________
>>>>>>>>>>>> LLVM Developers mailing list
>>>>>>>>>>>> llvm-dev at lists.llvm.org
>>>>>>>>>>>>
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
_______________________________________________
>>>>>>>>>>> cfe-dev mailing list
>>>>>>>>>>> cfe-dev at lists.llvm.org
>>>>>>>>>>>
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>>>> Alexey Samsonov
>>>>>>>>>> vonosmas at gmail.com
>>>>>>>>>>
>>>>>>>>>>
_______________________________________________
>>>>>>>>>> cfe-dev mailing list
>>>>>>>>>> cfe-dev at lists.llvm.org
>>>>>>>>>>
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> --
>>>>>>>> Alexey Samsonov
>>>>>>>> vonosmas at gmail.com
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> _______________________________________________
>>>>>>> cfe-dev mailing list
>>>>>>> cfe-dev at lists.llvm.org
>>>>>>>
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>>
>> --
>> Alexey Samsonov
>> vonosmas at gmail.com
>>
>
>-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20160226/5ee28448/attachment.html>

David Blaikie via llvm-dev

2016-Feb-27 00:28 UTC

head link

[llvm-dev] [cfe-dev] Testing Best Practices/Goals (in the context of compiler-rt)

On Fri, Feb 26, 2016 at 3:17 PM, Xinliang David Li <davidxl at google.com>
wrote:
> Sean and Alexey have said a lot on this topic.  Here is my version of
> explanation that LLVM testing is not suitable to replace end to end
testing.
>
>  - The most important reason being that LLVM tests tend to test
> 'implementation details'. This has many drawbacks:
>   a) by only writing test cases like this, it is hard to expose bugs
> inherited in the implementation itself;
>   b) the connection between the implementation and end-user expectation is
> not obvious
>   c) often in times, we will have to throw out the existing implementation
> for a whole new more efficient implementation -- not only do we need to pay
> the cost of tedious test case update (from one impl to another) -- but that
> all the previous implementation specific code coverage become useless --
> how are we sure the new implementation does not break the end-end user
> expectation?
>
Yep, there are certainly tradeoffs between unit/feature/integration testing.

> End-end tests on the other hand have no such drawbacks -- its
> specification is well defined:
>   1) it is easy to write test cases based on well defined specifications
> and cover all corner cases (e.g, new language constructs etc);
>
I'm not sure that integration/end-to-end testing is more likely to exercise
corner cases.

>   2) it is easy to extract test cases from real failures from very large
> apps
>
We seem to have been pretty good, as a community, at producing the narrow
feature/unit tests for breakages from very large applications/scenarios. I
wouldn't be happy about not having targeted tests because we had broad
coverage from some large test - so I don't think this removes the need to
reduce and isolate test cases.

>   3) once the test case is written, it almost never needs to be updated as
> long as the public interfaces (options) are used.
>
> End-end testing also cover areas not in compiler proper -- like runtime,
> linker, other tools, and their subtle interactions.
>
This is actually a drawback of integration testing - we don't actually want
to test those products. And we don't want to make our testing dependent on
them, generally - it makes the tests more expensive in a number of ways
(more work to setup, costlier to run, more likely to break for non-product
reasons, more work to debug because they involve more moving parts). This
is one of the tradeoffs of broader testing strategies.

> For instance latest darwin i386 profile-rt bug exposed by the improved
> profile runtime test coverage will be hard to expose via LLVM only testing
> ..
>
I'm not familiar with the bug - could you provide more detail?

> In short, I think runtime tests do not overlap with LLVM testing -- even
> though they look like testing the same thing. For reference, we currently
> have < 100 profile rt end-end testing which is alarmingly small.  We
should
> push increasing its coverage.
>
While I don't fundamentally have a problem with different kinds of testing
(Up to feature/integration testing, etc), I really don't want that testing
to become what we rely on for correctness, nor what we expect developers
(or even most buildbots) to run for correctness. They are slower and
problematic in different ways that would slow down our velocity.

That sort of coverage should still be carefully chosen for when it adds
value, not too scattershot/broad (we have the test-suite for really broad
"does this work with the real world" sort of stuff - but anything
below
that should still be fairly isolated/targeted, I think).

- David

>
> thanks,
>
> David
>
>
> On Fri, Feb 26, 2016 at 2:15 PM, David Blaikie <dblaikie at
gmail.com> wrote:
>
>>
>>
>> On Fri, Feb 26, 2016 at 2:10 PM, Alexey Samsonov <vonosmas at
gmail.com>
>> wrote:
>>
>>>
>>> On Fri, Feb 26, 2016 at 1:34 PM, David Blaikie <dblaikie at
gmail.com>
>>> wrote:
>>>
>>>>
>>>>
>>>> On Fri, Feb 26, 2016 at 1:31 PM, Sean Silva <chisophugis at
gmail.com>
>>>> wrote:
>>>>
>>>>>
>>>>>
>>>>> On Fri, Feb 26, 2016 at 1:11 PM, David Blaikie <dblaikie
at gmail.com>
>>>>> wrote:
>>>>>
>>>>>>
>>>>>>
>>>>>> On Fri, Feb 26, 2016 at 1:07 PM, Sean Silva
<chisophugis at gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Wed, Feb 17, 2016 at 8:45 AM, David Blaikie via
cfe-dev <
>>>>>>> cfe-dev at lists.llvm.org> wrote:
>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Fri, Feb 12, 2016 at 5:43 PM, Alexey
Samsonov <
>>>>>>>> vonosmas at gmail.com> wrote:
>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Thu, Feb 11, 2016 at 1:50 PM, David
Blaikie <dblaikie at gmail.com
>>>>>>>>> > wrote:
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Wed, Feb 10, 2016 at 3:55 PM, Alexey
Samsonov via cfe-dev <
>>>>>>>>>> cfe-dev at lists.llvm.org> wrote:
>>>>>>>>>>
>>>>>>>>>>> I mostly agree with what Richard
and Justin said. Adding a few
>>>>>>>>>>> notes about the general strategy we
use:
>>>>>>>>>>>
>>>>>>>>>>> (1) lit tests which look
"end-to-end" proved to be way more
>>>>>>>>>>> convenient for testing runtime
libraries than unit tests.
>>>>>>>>>>>
>>>>>>>>>> We do have
>>>>>>>>>>> the latter, and use them to provide
test coverage for utility
>>>>>>>>>>> functions, but we quite often
accompany fix to the runtime library with
>>>>>>>>>>> "end-to-end" small
reproducer extracted from the real-world code
>>>>>>>>>>> that exposed the issue.
>>>>>>>>>>> Incidentally, this tests a whole
lot of other functionality:
>>>>>>>>>>> Clang driver, frontend, LLVM
passes, etc, but it's not the intent of the
>>>>>>>>>>> test.
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Indeed - this is analogous to the tests
for, say, LLD that use
>>>>>>>>>> llvm-mc to produce the inputs rather
than checking in object files. That
>>>>>>>>>> area is open to some discussion as to
just how many tools we should rope
>>>>>>>>>> in/how isolated we should make tests
(eg: maybe building the json object
>>>>>>>>>> file format was going too far towards
isolation? Not clear - opinions
>>>>>>>>>> differ). But the point of the test is
to test the compiler-rt functionality
>>>>>>>>>> that was added/removed/modified.
>>>>>>>>>>
>>>>>>>>>> I think most people are in agreement
with that, while
>>>>>>>>>> acknowledging the fuzzy line about how
isolated we might be.
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Yes.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>> These tests are sometimes
platform-specific and poorly portable,
>>>>>>>>>>> but they are more reliable (we make
the same steps as the
>>>>>>>>>>> user of the compiler), and serve
the purpose of documentation.
>>>>>>>>>>>
>>>>>>>>>>> (2) If we change LLVM
instrumentation - we add a test to LLVM.
>>>>>>>>>>> If we change Clang code generation
or driver behavior - we add
>>>>>>>>>>> a test to Clang. No excuses here.
>>>>>>>>>>>
>>>>>>>>>>> (3) Sometimes we still add a
compiler-rt test for the change in
>>>>>>>>>>> LLVM or Clang: e.g. if we enhance
Clang frontend to teach UBSan
>>>>>>>>>>> to detecting yet another kind of
overflow, it makes sense to add
>>>>>>>>>>> a test to UBSan test-suite that
demonstrates it, in addition to
>>>>>>>>>>> Clang test verifying that we emit a
call to UBSan runtime. Also,
>>>>>>>>>>> compiler-rt test would allow us to
verify that the actual error report
>>>>>>>>>>> we present to the user is sane.
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> This bit ^ is a bit unclear to me. If
there was no change to the
>>>>>>>>>> UBSan runtime, and the code generated
by Clang is equivalent/similar to an
>>>>>>>>>> existing use of the UBSan runtime -
what is it that the new compiler-rt
>>>>>>>>>> test is providing? (perhaps you could
give a concrete example you had in
>>>>>>>>>> mind to look at?)
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> See r235568 (change to Clang) followed by
r235569 (change to
>>>>>>>>> compiler-rt test). Now, it's a cheat
because I'm fixing test, not adding
>>>>>>>>> it. However, I would have definitely added
it, if it was missing.
>>>>>>>>>
>>>>>>>>
>>>>>>>> Right, I think the difference here is "if
it was missing" - the
>>>>>>>> test case itself seems like it could be a
reasonable one (are there other
>>>>>>>> tests of the same compiler-rt functionality? (I
assume the compiler-rt
>>>>>>>> functionality is the implementation of
sadd/ssub?))
>>>>>>>>
>>>>>>>>
>>>>>>>>> In this case, a change to Clang
>>>>>>>>> instrumentation (arguments passed to UBSan
runtime callbacks)
>>>>>>>>> improved the user-facing part of the tool,
and compiler-rt test suite is a
>>>>>>>>> good place to verify that.
>>>>>>>>>
>>>>>>>>
>>>>>>>> This seems like the problematic part - changes
to LLVM improve the
>>>>>>>> user-facing part of Clang, but we don't add
end-to-end tests of that, as a
>>>>>>>> general rule. I'm trying to understand why
the difference between that and
>>>>>>>> compiler-rt
>>>>>>>>
>>>>>>>
>>>>>>> In what way do changes in LLVM change the
user-facing part of Clang?
>>>>>>> It obviously depends on how broadly one defines
user-facing. Is a 1%
>>>>>>> performance improvement from a particular
optimization user-facing? Is
>>>>>>> better debug info accuracy user-facing? I'm not
sure. But it seems clear
>>>>>>> that "the user sees a diagnostic or not"
definitely is.
>>>>>>>
>>>>>>
>>>>>> There's more than just performance in LLVM - ABI
features, and yes,
>>>>>> I'd argue some pieces of debug info are pretty user
facing (as are some
>>>>>> optimizations). We also have the remarks system in
place now. (also the
>>>>>> compiler crashing (or not) is pretty user facing).
>>>>>>
>>>>>
>>>>> I'd argue that we probably should have some sort of
integration tests
>>>>> for ABI features. I think at the moment we're getting
by thanks to
>>>>> self-hosting and regularly building lots of real-world
programs with
>>>>> ToT-ish compilers.
>>>>>
>>>>
>>>> Perhaps so, but I'd argue that they shouldn't be run as
part of "make
>>>> check" & should be in a separate test grouping
(probably mostly run by
>>>> buildbots) for the purpose of integration testing.
>>>>
>>>
>>> If you have llvm/clang/compiler-rt/libc++/libc++abi checkout, they
are
>>> not run as a part of "make check", only "make
check-all", which kind of
>>> makes sense (run *all* the tests!). You're free to run
"make check-clang",
>>> "make check-asan" etc.
>>> if you're sure your changes are limited in scope. Just to be
clear - do
>>> you suggest that compiler-rt tests are too heavy for this
configuration,
>>> and want to introduce extra level - i.e. extract "make
check-compiler-rt"
>>> out of "make check-all", and introduce "make
check-absolutely-everything",
>>> that would encompass them?
>>>
>>
>> Fair point, check-all would be/is check all the features, but, yes,
>> potentially pulling out a separate subgroup for integration testing of
any
>> kind. Yes, I realize this has the "-Wall" problem. But I
suppose what I'm
>> getting at is "check" in the LLVM project parlance is, with
the exception
>> of compiler-rt, the "lightweight, immediate verification tests,
not
>> integration testing" & I would rather like that to be the case
across the
>> board.
>>
>> Perhaps using a different verb for cross-project testing (I don't
object
>> to a catch-all target for running all the different kinds of testing we
>> have). Naming is hard.
>>
>>
>>>
>>>
>>>> We've made a pretty conscious, deliberate, and consistent
effort to not
>>>> do integration testing across the LLVM projects in "make
check"-like
>>>> testing, and to fix it where we do find it. It seems to me that
compiler-rt
>>>> diverged from that approach and I'm not really in favor of
that divergence.
>>>>
>>>
>>> I don't see why consistency by itself is a good thing.
>>>
>>
>> Makes it easier to work between projects. Sets expectations for
>> developers when they work in one area or another.
>>
>>
>>> As a sanitizer developer, current situation is convenient for me,
but if
>>> it harms / slows down / complicates workflow for other developers
or LLVM
>>> as a whole - sure, let's fix it.
>>>
>>
>> One of the things I'm trying to understand is what the benefit of
this
>> extra testing is - to take the add/sub example again, is adding extra
>> coverage for printing different text through the same mechanism in
>> compiler-rt valuable? What sort of regressions/bugs is that catching
that
>> makes it compelling to author, maintain, and incur the ongoing
execution
>> cost of?
>>
>
>
>
>>
>>
>>>
>>>
>>>>
>>>> - David
>>>>
>>>>
>>>>>
>>>>> -- Sean Silva
>>>>>
>>>>>
>>>>>>
>>>>>>
>>>>>>> Also, I think part of this is that in compiler-rt
there are usually
>>>>>>> more moving parts we don't control. E.g. it
isn't just the interface
>>>>>>> between LLVM and clang. The information needs to
pass through archivers,
>>>>>>> linkers, runtime loaders, etc. that all may have
issues that affect whether
>>>>>>> the user sees the final result. In general the
interface between LLVM and
>>>>>>> clang has no middlemen so there really isn't
anything to check.
>>>>>>>
>>>>>>
>>>>>> Correctness/behavior of the compiler depends on those
things too
>>>>>> (linkers, loaders, etc) to produce the final working
product the user
>>>>>> requested. If we emitted symbols with the wrong linkage
we could produce
>>>>>> linker errors, drop important entities, etc. But we
don't generally test
>>>>>> that the output of LLVM/Clang produces the right binary
when linked, we
>>>>>> test that it produces the right linkages on the
resulting entities.
>>>>>>
>>>>>> - David
>>>>>>
>>>>>>
>>>>>>>
>>>>>>> -- Sean Silva
>>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>>> You may argue that Clang test would have
been enough (I disagree
>>>>>>>>> with that), or that it qualifies as
"adding coverage" (maybe).
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>> (4) True, we're intimidated by
test-suite :) I feel that current
>>>>>>>>>>> use of compiler-rt test suite to
check compiler-rt libs better follows
>>>>>>>>>>> the doctrine described by David.
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Which David? ;) (I guess David Li, not
me)
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Nope, paragraph 2 from your original email.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>> I think maybe what could be worth doing
would be separating out
>>>>>>>>>> the broader/intentionally "end to
end" sort of tests from the ones intended
>>>>>>>>>> to test compiler-rt in relative
isolation.
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> It's really hard to draw the line here,
even some of compiler-rt
>>>>>>>>> unit tests require instrumentation,
therefore depend on new features of
>>>>>>>>> Clang/LLVM. Unlike builtins, which are
>>>>>>>>> trivial to test in isolation, testing
sanitizer runtimes in
>>>>>>>>> isolation (w/o compiler) is often hard to
implement (we tried to do so for
>>>>>>>>> TSan, but found unit tests extremely hard
to write),
>>>>>>>>> and is barely useful - compiler-rt runtimes
don't consist of
>>>>>>>>> modules (like LLVMCodeGen and LLVMMC for
instance), and are never used w/o
>>>>>>>>> the compiler anyway.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Most importantly, I'd expect only
the latter to run in a "make
>>>>>>>>>> check-all" run, as we do for
Clang/LLVM, etc.
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> And now we're getting to the goals :)
Why would such a change be
>>>>>>>>> good? Do you worry about the time it takes
to execute the test suite?
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>> Also, there's significant
complexity in compiler-rt test suite
>>>>>>>>>>> that narrows the tests executed
>>>>>>>>>>> to those supported by current host.
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Wed, Feb 10, 2016 at 2:33 PM,
Xinliang David Li via cfe-dev <
>>>>>>>>>>> cfe-dev at lists.llvm.org>
wrote:
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On Wed, Feb 10, 2016 at 2:11
PM, Justin Bogner via llvm-dev <
>>>>>>>>>>>> llvm-dev at lists.llvm.org>
wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> David Blaikie via cfe-dev
<cfe-dev at lists.llvm.org> writes:
>>>>>>>>>>>>> > Recently had a bit of
a digression in a review thread
>>>>>>>>>>>>> related to some tests
>>>>>>>>>>>>> > going in to
compiler-rt (
>>>>>>>>>>>>> >
>>>>>>>>>>>>>
http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20160208/330759.html
>>>>>>>>>>>>> > ) and there seems to
be some disconnect at least between my
>>>>>>>>>>>>> expectations
>>>>>>>>>>>>> > and reality. So I
figured I'd have a bit of a discussion out
>>>>>>>>>>>>> here on the
>>>>>>>>>>>>> > dev lists where
there's a bit more visibility.
>>>>>>>>>>>>> >
>>>>>>>>>>>>> > My basic expectation
is that the lit tests in any LLVM
>>>>>>>>>>>>> project except the
>>>>>>>>>>>>> > test-suite are
targeted tests intended to test only the
>>>>>>>>>>>>> functionality in
>>>>>>>>>>>>> > the project. This
seems like a pretty well accepted doctrine
>>>>>>>>>>>>> across most
>>>>>>>>>>>>> > LLVM projects - most
visibly in Clang, where we make a
>>>>>>>>>>>>> concerted effort not
>>>>>>>>>>>>> > to have tests that
execute LLVM optimizations, etc.
>>>>>>>>>>>>> >
>>>>>>>>>>>>> > There are
exceptions/middle ground to this - DIBuilder is in
>>>>>>>>>>>>> LLVM, but
>>>>>>>>>>>>> > essentially tested in
Clang rather than writing LLVM unit
>>>>>>>>>>>>> tests. It's
>>>>>>>>>>>>> > somewhat unavoidable
that any of the IR building code
>>>>>>>>>>>>> (IRBuilder,
>>>>>>>>>>>>> > DIBuilder, IR asm
printing, etc) is 'tested' incidentally in
>>>>>>>>>>>>> Clang in
>>>>>>>>>>>>> > process of testing
Clang's IR generation. But these are seen
>>>>>>>>>>>>> as incidental,
>>>>>>>>>>>>> > not intentionally
trying to cover LLVM with Clang tests (we
>>>>>>>>>>>>> don't add a
>>>>>>>>>>>>> > Clang test if we add a
new feature to IRBuilder just to test
>>>>>>>>>>>>> the IRBuilder).
>>>>>>>>>>>>> >
>>>>>>>>>>>>> > Another case with some
middle ground are things like linker
>>>>>>>>>>>>> tests and
>>>>>>>>>>>>> > objdump, dwarfdump,
etc - in theory to isolate the test we
>>>>>>>>>>>>> would checkin
>>>>>>>>>>>>> > binaries (or the
textual object representation lld had for a
>>>>>>>>>>>>> while, etc) to
>>>>>>>>>>>>> > test those tools. Some
tests instead checkin assembly and
>>>>>>>>>>>>> assemble it with
>>>>>>>>>>>>> > llvm-mc. Again, not to
cover llvm-mc, but on the assumption
>>>>>>>>>>>>> that llvm-mc is
>>>>>>>>>>>>> > tested, and just using
it as a tool to make tests easier to
>>>>>>>>>>>>> maintain.
>>>>>>>>>>>>> >
>>>>>>>>>>>>> > So I was surprised to
find that the compiler-rt lit tests
>>>>>>>>>>>>> seem to diverge
>>>>>>>>>>>>> > from this philosophy
& contain more intentional end-to-end
>>>>>>>>>>>>> tests (eg:
>>>>>>>>>>>>> > adding a test there
when making a fix to Clang to add a
>>>>>>>>>>>>> counter to a
>>>>>>>>>>>>> > function that was
otherwise missing a counter - I'd expect
>>>>>>>>>>>>> that to be
>>>>>>>>>>>>> > tested in Clang and
that there would already be coverage in
>>>>>>>>>>>>> compiler-rt for
>>>>>>>>>>>>> > "if a function
has a counter, does compiler-rt do the right
>>>>>>>>>>>>> thing with that
>>>>>>>>>>>>> > counter" (testing
whatever code in compiler-rt needs to be
>>>>>>>>>>>>> tested)).
>>>>>>>>>>>>> >
>>>>>>>>>>>>> > Am I off base here?
Are compiler-rt's tests fundamentally
>>>>>>>>>>>>> different to the
>>>>>>>>>>>>> > rest of the LLVM
project? Why? Should they continue to be?
>>>>>>>>>>>>>
>>>>>>>>>>>>> I think there's a bit
of grey area in terms testing the
>>>>>>>>>>>>> runtime -
>>>>>>>>>>>>> generally it's pretty
hard to use the runtime without a fairly
>>>>>>>>>>>>> end-to-end test, so tests
of the runtime often end up looking
>>>>>>>>>>>>> pretty
>>>>>>>>>>>>> close to an end-to-end
test.
>>>>>>>>>>>>>
>>>>>>>>>>>>> That said, I don't
think that should be used as an excuse to
>>>>>>>>>>>>> sneak
>>>>>>>>>>>>> arbitrary end-to-end tests
into compiler-rt. We should
>>>>>>>>>>>>> absolutely write
>>>>>>>>>>>>> tests in clang and llvm
that we're inputting what we expect to
>>>>>>>>>>>>> the
>>>>>>>>>>>>> runtime and try to keep the
tests in compiler-rt as focused on
>>>>>>>>>>>>> just
>>>>>>>>>>>>> exercising the runtime code
as possible.
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Yes, we should not use
compiler-rt tests as an excuse of not
>>>>>>>>>>>> adding clang/LLVM test. The
latter should always be added if possible --
>>>>>>>>>>>> they are platform independent
and is the first level of defense.  runtime
>>>>>>>>>>>> test's focus is also more
on the runtime lib itself and interaction between
>>>>>>>>>>>>  runtime, compiler, binutils
and other tools.
>>>>>>>>>>>>
>>>>>>>>>>>> David
>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> IIUC, the correct place for
integration tests in general is
>>>>>>>>>>>>> somewhere
>>>>>>>>>>>>> like test-suite, but I
think it's a bit intimidating to some
>>>>>>>>>>>>> people to
>>>>>>>>>>>>> add new tests there (Are
there docs on this?). I suspect some
>>>>>>>>>>>>> of the
>>>>>>>>>>>>> profiling related tests in
compiler-rt are doing a bit much
>>>>>>>>>>>>> and should
>>>>>>>>>>>>> graduate to a spot in the
test-suite (but I don't have time to
>>>>>>>>>>>>> volunteer
>>>>>>>>>>>>> to do the work,
unfortunately).
>>>>>>>>>>>>>
_______________________________________________
>>>>>>>>>>>>> LLVM Developers mailing
list
>>>>>>>>>>>>> llvm-dev at lists.llvm.org
>>>>>>>>>>>>>
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
_______________________________________________
>>>>>>>>>>>> cfe-dev mailing list
>>>>>>>>>>>> cfe-dev at lists.llvm.org
>>>>>>>>>>>>
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> --
>>>>>>>>>>> Alexey Samsonov
>>>>>>>>>>> vonosmas at gmail.com
>>>>>>>>>>>
>>>>>>>>>>>
_______________________________________________
>>>>>>>>>>> cfe-dev mailing list
>>>>>>>>>>> cfe-dev at lists.llvm.org
>>>>>>>>>>>
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> Alexey Samsonov
>>>>>>>>> vonosmas at gmail.com
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> _______________________________________________
>>>>>>>> cfe-dev mailing list
>>>>>>>> cfe-dev at lists.llvm.org
>>>>>>>>
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>>
>>> --
>>> Alexey Samsonov
>>> vonosmas at gmail.com
>>>
>>
>>
>-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20160226/c3e9d906/attachment-0001.html>

Sean Silva via llvm-dev

2016-Feb-27 02:39 UTC

head link

[llvm-dev] [cfe-dev] Testing Best Practices/Goals (in the context of compiler-rt)

On Fri, Feb 26, 2016 at 3:17 PM, Xinliang David Li <davidxl at google.com>
wrote:
> Sean and Alexey have said a lot on this topic.  Here is my version of
> explanation that LLVM testing is not suitable to replace end to end
testing.
>
>  - The most important reason being that LLVM tests tend to test
> 'implementation details'. This has many drawbacks:
>   a) by only writing test cases like this, it is hard to expose bugs
> inherited in the implementation itself;
>   b) the connection between the implementation and end-user expectation is
> not obvious
>   c) often in times, we will have to throw out the existing implementation
> for a whole new more efficient implementation -- not only do we need to pay
> the cost of tedious test case update (from one impl to another) -- but that
> all the previous implementation specific code coverage become useless --
> how are we sure the new implementation does not break the end-end user
> expectation?
>
a) and c) are very good points. For example, in LLD
__attribute__((init_priority(N))) was recently added. LLVM/Clang already
support this feature.

If we had created good integration tests when the features were implemented
in LLVM/Clang, LLD development of the features would have been much easier
since we would already have a good set of coverage to test a prospective
new implementation (and, probably, to reduce unit-test cases from).

In fact, there were multiple implementation attempts in LLD, and even the
second "corrected" one misunderstood the actual expected behavior
(e.g.
http://reviews.llvm.org/D17120#inline-141631). I.e. all of the tests in the
patch passed and the tests had (I believe) full coverage of all the new
code, but there were still issues. And the issues would have been trivially
smoked out by any kind of integration testing, let alone a nice complete
set of integration tests that was already developed when the features were
implemented in LLVM/Clang.

-- Sean Silva


>
> End-end tests on the other hand have no such drawbacks -- its
> specification is well defined:
>   1) it is easy to write test cases based on well defined specifications
> and cover all corner cases (e.g, new language constructs etc);
>   2) it is easy to extract test cases from real failures from very large
> apps
>   3) once the test case is written, it almost never needs to be updated as
> long as the public interfaces (options) are used.
>
> End-end testing also cover areas not in compiler proper -- like runtime,
> linker, other tools, and their subtle interactions.  For instance latest
> darwin i386 profile-rt bug exposed by the improved profile runtime test
> coverage will be hard to expose via LLVM only testing ..
>
> In short, I think runtime tests do not overlap with LLVM testing -- even
> though they look like testing the same thing. For reference, we currently
> have < 100 profile rt end-end testing which is alarmingly small.  We
should
> push increasing its coverage.
>
> thanks,
>
> David
>
>
> On Fri, Feb 26, 2016 at 2:15 PM, David Blaikie <dblaikie at
gmail.com> wrote:
>
>>
>>
>> On Fri, Feb 26, 2016 at 2:10 PM, Alexey Samsonov <vonosmas at
gmail.com>
>> wrote:
>>
>>>
>>> On Fri, Feb 26, 2016 at 1:34 PM, David Blaikie <dblaikie at
gmail.com>
>>> wrote:
>>>
>>>>
>>>>
>>>> On Fri, Feb 26, 2016 at 1:31 PM, Sean Silva <chisophugis at
gmail.com>
>>>> wrote:
>>>>
>>>>>
>>>>>
>>>>> On Fri, Feb 26, 2016 at 1:11 PM, David Blaikie <dblaikie
at gmail.com>
>>>>> wrote:
>>>>>
>>>>>>
>>>>>>
>>>>>> On Fri, Feb 26, 2016 at 1:07 PM, Sean Silva
<chisophugis at gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Wed, Feb 17, 2016 at 8:45 AM, David Blaikie via
cfe-dev <
>>>>>>> cfe-dev at lists.llvm.org> wrote:
>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Fri, Feb 12, 2016 at 5:43 PM, Alexey
Samsonov <
>>>>>>>> vonosmas at gmail.com> wrote:
>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Thu, Feb 11, 2016 at 1:50 PM, David
Blaikie <dblaikie at gmail.com
>>>>>>>>> > wrote:
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Wed, Feb 10, 2016 at 3:55 PM, Alexey
Samsonov via cfe-dev <
>>>>>>>>>> cfe-dev at lists.llvm.org> wrote:
>>>>>>>>>>
>>>>>>>>>>> I mostly agree with what Richard
and Justin said. Adding a few
>>>>>>>>>>> notes about the general strategy we
use:
>>>>>>>>>>>
>>>>>>>>>>> (1) lit tests which look
"end-to-end" proved to be way more
>>>>>>>>>>> convenient for testing runtime
libraries than unit tests.
>>>>>>>>>>>
>>>>>>>>>> We do have
>>>>>>>>>>> the latter, and use them to provide
test coverage for utility
>>>>>>>>>>> functions, but we quite often
accompany fix to the runtime library with
>>>>>>>>>>> "end-to-end" small
reproducer extracted from the real-world code
>>>>>>>>>>> that exposed the issue.
>>>>>>>>>>> Incidentally, this tests a whole
lot of other functionality:
>>>>>>>>>>> Clang driver, frontend, LLVM
passes, etc, but it's not the intent of the
>>>>>>>>>>> test.
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Indeed - this is analogous to the tests
for, say, LLD that use
>>>>>>>>>> llvm-mc to produce the inputs rather
than checking in object files. That
>>>>>>>>>> area is open to some discussion as to
just how many tools we should rope
>>>>>>>>>> in/how isolated we should make tests
(eg: maybe building the json object
>>>>>>>>>> file format was going too far towards
isolation? Not clear - opinions
>>>>>>>>>> differ). But the point of the test is
to test the compiler-rt functionality
>>>>>>>>>> that was added/removed/modified.
>>>>>>>>>>
>>>>>>>>>> I think most people are in agreement
with that, while
>>>>>>>>>> acknowledging the fuzzy line about how
isolated we might be.
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Yes.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>> These tests are sometimes
platform-specific and poorly portable,
>>>>>>>>>>> but they are more reliable (we make
the same steps as the
>>>>>>>>>>> user of the compiler), and serve
the purpose of documentation.
>>>>>>>>>>>
>>>>>>>>>>> (2) If we change LLVM
instrumentation - we add a test to LLVM.
>>>>>>>>>>> If we change Clang code generation
or driver behavior - we add
>>>>>>>>>>> a test to Clang. No excuses here.
>>>>>>>>>>>
>>>>>>>>>>> (3) Sometimes we still add a
compiler-rt test for the change in
>>>>>>>>>>> LLVM or Clang: e.g. if we enhance
Clang frontend to teach UBSan
>>>>>>>>>>> to detecting yet another kind of
overflow, it makes sense to add
>>>>>>>>>>> a test to UBSan test-suite that
demonstrates it, in addition to
>>>>>>>>>>> Clang test verifying that we emit a
call to UBSan runtime. Also,
>>>>>>>>>>> compiler-rt test would allow us to
verify that the actual error report
>>>>>>>>>>> we present to the user is sane.
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> This bit ^ is a bit unclear to me. If
there was no change to the
>>>>>>>>>> UBSan runtime, and the code generated
by Clang is equivalent/similar to an
>>>>>>>>>> existing use of the UBSan runtime -
what is it that the new compiler-rt
>>>>>>>>>> test is providing? (perhaps you could
give a concrete example you had in
>>>>>>>>>> mind to look at?)
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> See r235568 (change to Clang) followed by
r235569 (change to
>>>>>>>>> compiler-rt test). Now, it's a cheat
because I'm fixing test, not adding
>>>>>>>>> it. However, I would have definitely added
it, if it was missing.
>>>>>>>>>
>>>>>>>>
>>>>>>>> Right, I think the difference here is "if
it was missing" - the
>>>>>>>> test case itself seems like it could be a
reasonable one (are there other
>>>>>>>> tests of the same compiler-rt functionality? (I
assume the compiler-rt
>>>>>>>> functionality is the implementation of
sadd/ssub?))
>>>>>>>>
>>>>>>>>
>>>>>>>>> In this case, a change to Clang
>>>>>>>>> instrumentation (arguments passed to UBSan
runtime callbacks)
>>>>>>>>> improved the user-facing part of the tool,
and compiler-rt test suite is a
>>>>>>>>> good place to verify that.
>>>>>>>>>
>>>>>>>>
>>>>>>>> This seems like the problematic part - changes
to LLVM improve the
>>>>>>>> user-facing part of Clang, but we don't add
end-to-end tests of that, as a
>>>>>>>> general rule. I'm trying to understand why
the difference between that and
>>>>>>>> compiler-rt
>>>>>>>>
>>>>>>>
>>>>>>> In what way do changes in LLVM change the
user-facing part of Clang?
>>>>>>> It obviously depends on how broadly one defines
user-facing. Is a 1%
>>>>>>> performance improvement from a particular
optimization user-facing? Is
>>>>>>> better debug info accuracy user-facing? I'm not
sure. But it seems clear
>>>>>>> that "the user sees a diagnostic or not"
definitely is.
>>>>>>>
>>>>>>
>>>>>> There's more than just performance in LLVM - ABI
features, and yes,
>>>>>> I'd argue some pieces of debug info are pretty user
facing (as are some
>>>>>> optimizations). We also have the remarks system in
place now. (also the
>>>>>> compiler crashing (or not) is pretty user facing).
>>>>>>
>>>>>
>>>>> I'd argue that we probably should have some sort of
integration tests
>>>>> for ABI features. I think at the moment we're getting
by thanks to
>>>>> self-hosting and regularly building lots of real-world
programs with
>>>>> ToT-ish compilers.
>>>>>
>>>>
>>>> Perhaps so, but I'd argue that they shouldn't be run as
part of "make
>>>> check" & should be in a separate test grouping
(probably mostly run by
>>>> buildbots) for the purpose of integration testing.
>>>>
>>>
>>> If you have llvm/clang/compiler-rt/libc++/libc++abi checkout, they
are
>>> not run as a part of "make check", only "make
check-all", which kind of
>>> makes sense (run *all* the tests!). You're free to run
"make check-clang",
>>> "make check-asan" etc.
>>> if you're sure your changes are limited in scope. Just to be
clear - do
>>> you suggest that compiler-rt tests are too heavy for this
configuration,
>>> and want to introduce extra level - i.e. extract "make
check-compiler-rt"
>>> out of "make check-all", and introduce "make
check-absolutely-everything",
>>> that would encompass them?
>>>
>>
>> Fair point, check-all would be/is check all the features, but, yes,
>> potentially pulling out a separate subgroup for integration testing of
any
>> kind. Yes, I realize this has the "-Wall" problem. But I
suppose what I'm
>> getting at is "check" in the LLVM project parlance is, with
the exception
>> of compiler-rt, the "lightweight, immediate verification tests,
not
>> integration testing" & I would rather like that to be the case
across the
>> board.
>>
>> Perhaps using a different verb for cross-project testing (I don't
object
>> to a catch-all target for running all the different kinds of testing we
>> have). Naming is hard.
>>
>>
>>>
>>>
>>>> We've made a pretty conscious, deliberate, and consistent
effort to not
>>>> do integration testing across the LLVM projects in "make
check"-like
>>>> testing, and to fix it where we do find it. It seems to me that
compiler-rt
>>>> diverged from that approach and I'm not really in favor of
that divergence.
>>>>
>>>
>>> I don't see why consistency by itself is a good thing.
>>>
>>
>> Makes it easier to work between projects. Sets expectations for
>> developers when they work in one area or another.
>>
>>
>>> As a sanitizer developer, current situation is convenient for me,
but if
>>> it harms / slows down / complicates workflow for other developers
or LLVM
>>> as a whole - sure, let's fix it.
>>>
>>
>> One of the things I'm trying to understand is what the benefit of
this
>> extra testing is - to take the add/sub example again, is adding extra
>> coverage for printing different text through the same mechanism in
>> compiler-rt valuable? What sort of regressions/bugs is that catching
that
>> makes it compelling to author, maintain, and incur the ongoing
execution
>> cost of?
>>
>
>
>
>>
>>
>>>
>>>
>>>>
>>>> - David
>>>>
>>>>
>>>>>
>>>>> -- Sean Silva
>>>>>
>>>>>
>>>>>>
>>>>>>
>>>>>>> Also, I think part of this is that in compiler-rt
there are usually
>>>>>>> more moving parts we don't control. E.g. it
isn't just the interface
>>>>>>> between LLVM and clang. The information needs to
pass through archivers,
>>>>>>> linkers, runtime loaders, etc. that all may have
issues that affect whether
>>>>>>> the user sees the final result. In general the
interface between LLVM and
>>>>>>> clang has no middlemen so there really isn't
anything to check.
>>>>>>>
>>>>>>
>>>>>> Correctness/behavior of the compiler depends on those
things too
>>>>>> (linkers, loaders, etc) to produce the final working
product the user
>>>>>> requested. If we emitted symbols with the wrong linkage
we could produce
>>>>>> linker errors, drop important entities, etc. But we
don't generally test
>>>>>> that the output of LLVM/Clang produces the right binary
when linked, we
>>>>>> test that it produces the right linkages on the
resulting entities.
>>>>>>
>>>>>> - David
>>>>>>
>>>>>>
>>>>>>>
>>>>>>> -- Sean Silva
>>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>>> You may argue that Clang test would have
been enough (I disagree
>>>>>>>>> with that), or that it qualifies as
"adding coverage" (maybe).
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>> (4) True, we're intimidated by
test-suite :) I feel that current
>>>>>>>>>>> use of compiler-rt test suite to
check compiler-rt libs better follows
>>>>>>>>>>> the doctrine described by David.
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Which David? ;) (I guess David Li, not
me)
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Nope, paragraph 2 from your original email.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>> I think maybe what could be worth doing
would be separating out
>>>>>>>>>> the broader/intentionally "end to
end" sort of tests from the ones intended
>>>>>>>>>> to test compiler-rt in relative
isolation.
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> It's really hard to draw the line here,
even some of compiler-rt
>>>>>>>>> unit tests require instrumentation,
therefore depend on new features of
>>>>>>>>> Clang/LLVM. Unlike builtins, which are
>>>>>>>>> trivial to test in isolation, testing
sanitizer runtimes in
>>>>>>>>> isolation (w/o compiler) is often hard to
implement (we tried to do so for
>>>>>>>>> TSan, but found unit tests extremely hard
to write),
>>>>>>>>> and is barely useful - compiler-rt runtimes
don't consist of
>>>>>>>>> modules (like LLVMCodeGen and LLVMMC for
instance), and are never used w/o
>>>>>>>>> the compiler anyway.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Most importantly, I'd expect only
the latter to run in a "make
>>>>>>>>>> check-all" run, as we do for
Clang/LLVM, etc.
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> And now we're getting to the goals :)
Why would such a change be
>>>>>>>>> good? Do you worry about the time it takes
to execute the test suite?
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>> Also, there's significant
complexity in compiler-rt test suite
>>>>>>>>>>> that narrows the tests executed
>>>>>>>>>>> to those supported by current host.
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Wed, Feb 10, 2016 at 2:33 PM,
Xinliang David Li via cfe-dev <
>>>>>>>>>>> cfe-dev at lists.llvm.org>
wrote:
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On Wed, Feb 10, 2016 at 2:11
PM, Justin Bogner via llvm-dev <
>>>>>>>>>>>> llvm-dev at lists.llvm.org>
wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> David Blaikie via cfe-dev
<cfe-dev at lists.llvm.org> writes:
>>>>>>>>>>>>> > Recently had a bit of
a digression in a review thread
>>>>>>>>>>>>> related to some tests
>>>>>>>>>>>>> > going in to
compiler-rt (
>>>>>>>>>>>>> >
>>>>>>>>>>>>>
http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20160208/330759.html
>>>>>>>>>>>>> > ) and there seems to
be some disconnect at least between my
>>>>>>>>>>>>> expectations
>>>>>>>>>>>>> > and reality. So I
figured I'd have a bit of a discussion out
>>>>>>>>>>>>> here on the
>>>>>>>>>>>>> > dev lists where
there's a bit more visibility.
>>>>>>>>>>>>> >
>>>>>>>>>>>>> > My basic expectation
is that the lit tests in any LLVM
>>>>>>>>>>>>> project except the
>>>>>>>>>>>>> > test-suite are
targeted tests intended to test only the
>>>>>>>>>>>>> functionality in
>>>>>>>>>>>>> > the project. This
seems like a pretty well accepted doctrine
>>>>>>>>>>>>> across most
>>>>>>>>>>>>> > LLVM projects - most
visibly in Clang, where we make a
>>>>>>>>>>>>> concerted effort not
>>>>>>>>>>>>> > to have tests that
execute LLVM optimizations, etc.
>>>>>>>>>>>>> >
>>>>>>>>>>>>> > There are
exceptions/middle ground to this - DIBuilder is in
>>>>>>>>>>>>> LLVM, but
>>>>>>>>>>>>> > essentially tested in
Clang rather than writing LLVM unit
>>>>>>>>>>>>> tests. It's
>>>>>>>>>>>>> > somewhat unavoidable
that any of the IR building code
>>>>>>>>>>>>> (IRBuilder,
>>>>>>>>>>>>> > DIBuilder, IR asm
printing, etc) is 'tested' incidentally in
>>>>>>>>>>>>> Clang in
>>>>>>>>>>>>> > process of testing
Clang's IR generation. But these are seen
>>>>>>>>>>>>> as incidental,
>>>>>>>>>>>>> > not intentionally
trying to cover LLVM with Clang tests (we
>>>>>>>>>>>>> don't add a
>>>>>>>>>>>>> > Clang test if we add a
new feature to IRBuilder just to test
>>>>>>>>>>>>> the IRBuilder).
>>>>>>>>>>>>> >
>>>>>>>>>>>>> > Another case with some
middle ground are things like linker
>>>>>>>>>>>>> tests and
>>>>>>>>>>>>> > objdump, dwarfdump,
etc - in theory to isolate the test we
>>>>>>>>>>>>> would checkin
>>>>>>>>>>>>> > binaries (or the
textual object representation lld had for a
>>>>>>>>>>>>> while, etc) to
>>>>>>>>>>>>> > test those tools. Some
tests instead checkin assembly and
>>>>>>>>>>>>> assemble it with
>>>>>>>>>>>>> > llvm-mc. Again, not to
cover llvm-mc, but on the assumption
>>>>>>>>>>>>> that llvm-mc is
>>>>>>>>>>>>> > tested, and just using
it as a tool to make tests easier to
>>>>>>>>>>>>> maintain.
>>>>>>>>>>>>> >
>>>>>>>>>>>>> > So I was surprised to
find that the compiler-rt lit tests
>>>>>>>>>>>>> seem to diverge
>>>>>>>>>>>>> > from this philosophy
& contain more intentional end-to-end
>>>>>>>>>>>>> tests (eg:
>>>>>>>>>>>>> > adding a test there
when making a fix to Clang to add a
>>>>>>>>>>>>> counter to a
>>>>>>>>>>>>> > function that was
otherwise missing a counter - I'd expect
>>>>>>>>>>>>> that to be
>>>>>>>>>>>>> > tested in Clang and
that there would already be coverage in
>>>>>>>>>>>>> compiler-rt for
>>>>>>>>>>>>> > "if a function
has a counter, does compiler-rt do the right
>>>>>>>>>>>>> thing with that
>>>>>>>>>>>>> > counter" (testing
whatever code in compiler-rt needs to be
>>>>>>>>>>>>> tested)).
>>>>>>>>>>>>> >
>>>>>>>>>>>>> > Am I off base here?
Are compiler-rt's tests fundamentally
>>>>>>>>>>>>> different to the
>>>>>>>>>>>>> > rest of the LLVM
project? Why? Should they continue to be?
>>>>>>>>>>>>>
>>>>>>>>>>>>> I think there's a bit
of grey area in terms testing the
>>>>>>>>>>>>> runtime -
>>>>>>>>>>>>> generally it's pretty
hard to use the runtime without a fairly
>>>>>>>>>>>>> end-to-end test, so tests
of the runtime often end up looking
>>>>>>>>>>>>> pretty
>>>>>>>>>>>>> close to an end-to-end
test.
>>>>>>>>>>>>>
>>>>>>>>>>>>> That said, I don't
think that should be used as an excuse to
>>>>>>>>>>>>> sneak
>>>>>>>>>>>>> arbitrary end-to-end tests
into compiler-rt. We should
>>>>>>>>>>>>> absolutely write
>>>>>>>>>>>>> tests in clang and llvm
that we're inputting what we expect to
>>>>>>>>>>>>> the
>>>>>>>>>>>>> runtime and try to keep the
tests in compiler-rt as focused on
>>>>>>>>>>>>> just
>>>>>>>>>>>>> exercising the runtime code
as possible.
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Yes, we should not use
compiler-rt tests as an excuse of not
>>>>>>>>>>>> adding clang/LLVM test. The
latter should always be added if possible --
>>>>>>>>>>>> they are platform independent
and is the first level of defense.  runtime
>>>>>>>>>>>> test's focus is also more
on the runtime lib itself and interaction between
>>>>>>>>>>>>  runtime, compiler, binutils
and other tools.
>>>>>>>>>>>>
>>>>>>>>>>>> David
>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> IIUC, the correct place for
integration tests in general is
>>>>>>>>>>>>> somewhere
>>>>>>>>>>>>> like test-suite, but I
think it's a bit intimidating to some
>>>>>>>>>>>>> people to
>>>>>>>>>>>>> add new tests there (Are
there docs on this?). I suspect some
>>>>>>>>>>>>> of the
>>>>>>>>>>>>> profiling related tests in
compiler-rt are doing a bit much
>>>>>>>>>>>>> and should
>>>>>>>>>>>>> graduate to a spot in the
test-suite (but I don't have time to
>>>>>>>>>>>>> volunteer
>>>>>>>>>>>>> to do the work,
unfortunately).
>>>>>>>>>>>>>
_______________________________________________
>>>>>>>>>>>>> LLVM Developers mailing
list
>>>>>>>>>>>>> llvm-dev at lists.llvm.org
>>>>>>>>>>>>>
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
_______________________________________________
>>>>>>>>>>>> cfe-dev mailing list
>>>>>>>>>>>> cfe-dev at lists.llvm.org
>>>>>>>>>>>>
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> --
>>>>>>>>>>> Alexey Samsonov
>>>>>>>>>>> vonosmas at gmail.com
>>>>>>>>>>>
>>>>>>>>>>>
_______________________________________________
>>>>>>>>>>> cfe-dev mailing list
>>>>>>>>>>> cfe-dev at lists.llvm.org
>>>>>>>>>>>
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> Alexey Samsonov
>>>>>>>>> vonosmas at gmail.com
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> _______________________________________________
>>>>>>>> cfe-dev mailing list
>>>>>>>> cfe-dev at lists.llvm.org
>>>>>>>>
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>>
>>> --
>>> Alexey Samsonov
>>> vonosmas at gmail.com
>>>
>>
>>
>-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20160226/37e20476/attachment.html>

Apparently Analagous Threads

Search for more maybe matching threads

llvm dev - Feb 2016 - [cfe-dev] Testing Best Practices/Goals (in the context of compiler-rt)

[llvm-dev] [cfe-dev] Testing Best Practices/Goals (in the context of compiler-rt)

[llvm-dev] [cfe-dev] Testing Best Practices/Goals (in the context of compiler-rt)

[llvm-dev] [cfe-dev] Testing Best Practices/Goals (in the context of compiler-rt)

[llvm-dev] [cfe-dev] Testing Best Practices/Goals (in the context of compiler-rt)

Apparently Analagous Threads