David Greene via llvm-dev
2019-Oct-16 20:00 UTC
[llvm-dev] [Openmp-dev] [cfe-dev] RFC: End-to-end testing
Renato Golin via Openmp-dev <openmp-dev at lists.llvm.org> writes:

> We already have tests in clang that check for diagnostics, IR and
> other things. Expanding those can handle 99.9% of what Clang could
> possibly do without descending into assembly.

I agree that for a great many things this is sufficient.

> Assembly errors are more complicated than just "not generating VADD",
> and that's easier done in the TS than LIT.

Can you elaborate? I'm talking about very small tests targeted to generate a specific instruction or small number of instructions. Vectorization isn't the best example. Something like verifying FMA generation is a better example.

-David
Roman Lebedev via llvm-dev
2019-Oct-16 20:09 UTC
[llvm-dev] [cfe-dev] [Openmp-dev] RFC: End-to-end testing
FWIW I'm personally cautiously non-optimistic about this, but maybe I'm just not seeing the whole picture of the proposal.

Both checking final asm and checking more than one layer of abstraction feel overreaching, too restrictive, and very prone to breakage. Even minimal changes to the scheduling for a particular CPU can cause many instructions to reorder. I'm not sure what effect that will have on middle-end pass development, either.

A change affects these end-to-end tests, what then? Just blindly regenerate every affected test? This will be further complicated once clang isn't the only upstream front-end.

Roman.
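To make the reordering concern concrete, here is a minimal Python sketch (not FileCheck itself) of how an order-sensitive check breaks when two instructions swap places, while an order-insensitive or narrowly targeted check keeps passing. The two listings are invented for illustration and are not real compiler output. `matches_in_order` and `matches_any_order` are hypothetical helpers that only loosely mirror FileCheck's `CHECK:` and `CHECK-DAG:` directives; for example, the ordered helper requires each match to be on a later line, which is stricter than FileCheck.

```python
# Invented instruction listings (assumption: not real compiler output).
# AFTER differs from BEFORE only in the order of the two loads.
BEFORE = [
    "vmovsd (%rdi), %xmm0",
    "vmovsd (%rsi), %xmm1",
    "vfmadd213sd (%rdx), %xmm1, %xmm0",
    "retq",
]
AFTER = [BEFORE[1], BEFORE[0], BEFORE[2], BEFORE[3]]


def matches_in_order(patterns, lines):
    """Each pattern must appear on a line after the previous match."""
    start = 0
    for pattern in patterns:
        for index in range(start, len(lines)):
            if pattern in lines[index]:
                start = index + 1
                break
        else:
            return False  # pattern not found after the previous match
    return True


def matches_any_order(patterns, lines):
    """Each pattern must appear somewhere, regardless of position."""
    return all(any(p in line for line in lines) for p in patterns)


# A check that pins down the load order, and one that only names the FMA.
STRICT = ["vmovsd (%rdi)", "vmovsd (%rsi)", "vfmadd213sd"]
NARROW = ["vfmadd213sd"]

if __name__ == "__main__":
    for name, listing in (("before", BEFORE), ("after", AFTER)):
        print(
            name,
            matches_in_order(STRICT, listing),
            matches_any_order(STRICT, listing),
            matches_in_order(NARROW, listing),
        )
```

In this sketch only the strict, ordered check is sensitive to the swap; the check that names just the instruction of interest is unaffected, which is the kind of small, targeted test discussed in this thread.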
David Blaikie via llvm-dev
2019-Oct-16 21:03 UTC
[llvm-dev] [cfe-dev] [Openmp-dev] RFC: End-to-end testing
On Wed, Oct 16, 2019 at 1:09 PM Roman Lebedev via cfe-dev <cfe-dev at lists.llvm.org> wrote:

> FWIW I'm personally cautiously non-optimistic about this,
> but maybe I'm just not seeing the whole picture of the proposal.
>
> Both checking final asm and checking more than one layer of abstraction
> feel overreaching, too restrictive, and very prone to breakage.
> Even minimal changes to the scheduling for a particular CPU can cause many
> instructions to reorder.
> I'm not sure what effect that will have on middle-end pass development,
> either.
>
> A change affects these end-to-end tests, what then?
> Just blindly regenerate every affected test?
> This will be further complicated once clang isn't the only upstream
> front-end.

Agreed that the broader a test is, the more careful one has to be about making it reliable in spite of other changes. Sometimes that's really difficult (if you're trying to get a particular instruction selection or register allocation), but in other cases it can be fairly reliable if done carefully to sufficiently restrict optimizations: for instance, having calls to external functions act as sinks/sources for values, and picking places where the output is already "optimal" and trivially/obviously so (for whatever set of constraints you've provided, not heroic optimizations), to ensure that it's fairly stable.

- Dave
Renato Golin via llvm-dev
2019-Oct-17 10:00 UTC
[llvm-dev] [Openmp-dev] [cfe-dev] RFC: End-to-end testing
On Wed, 16 Oct 2019 at 21:00, David Greene <greened at obbligato.org> wrote:

> Can you elaborate? I'm talking about very small tests targeted to
> generate a specific instruction or small number of instructions.
> Vectorization isn't the best example. Something like verifying FMA
> generation is a better example.

To check that instructions are generated from source, a two-step test is the best approach:
 - Verify that Clang emits different IR for different options, or the right IR for a new functionality
 - Verify that the affected targets (or at least two of the main ones) can take that IR and generate the right asm

Clang can emit LLVM IR for any target, but you don't necessarily need to build the back-ends.

If you want to do the test in Clang all the way to asm, you need to make sure the back-end is built. Clang is not always built with all back-ends, possibly even none.

To do that in the back-end, you'd have to rely on Clang being built, which is not always true.

Hacking our test infrastructure to test different things when a combination of components is built, especially after they start to merge after being in a monorepo, will complicate tests and increase the likelihood that some tests will never be run by CI and bit rot.

On the test-suite, you can guarantee that the whole toolchain is available: front and back end of the compilers, assemblers (if necessary), linkers, libraries, etc.

Writing a small source file per test, as you would in Clang/LLVM, running LIT and FileCheck, and *always* running it in the TS would be trivial.

--renato
Robinson, Paul via llvm-dev
2019-Oct-17 15:28 UTC
[llvm-dev] [Openmp-dev] [cfe-dev] RFC: End-to-end testing
Renato wrote:

> If you want to do the test in Clang all the way to asm, you need to
> make sure the back-end is built. Clang is not always built with all
> back-ends, possibly even none.

This is no different than today. Many tests in Clang require a specific target to exist. Grep clang/test for "registered-target" for example; I get 577 hits. Integration tests (here called "end-to-end" tests) clearly need to specify their REQUIRES conditions correctly.

> To do that in the back-end, you'd have to rely on Clang being built,
> which is not always true.

A frontend-based test in the backend would be a layering violation. Nobody is suggesting that.

> Hacking our test infrastructure to test different things when a
> combination of components is built, especially after they start to
> merge after being in a monorepo, will complicate tests and increase
> the likelihood that some tests will never be run by CI and bit rot.

Monorepo isn't the relevant thing. It's all about the build config. Any test introduced by any patch today is expected to be run by CI. This expectation would not be any different for these integration tests.

> On the test-suite, you can guarantee that the whole toolchain is
> available: Front and back end of the compilers, assemblers (if
> necessary), linkers, libraries, etc.
>
> Writing a small source file per test, as you would in Clang/LLVM,
> running LIT and FileCheck, and *always* running it in the TS would be
> trivial.

I have to say, it's highly unusual for me to make a commit that does *not* produce blame mail from some bot running lit tests. Thankfully it's rare to get one that is actually my fault. I can't remember *ever* getting blame mail related to test-suite. Do they actually run? Do they ever catch anything? Do they ever send blame mail? I have to wonder about that.

Mehdi wrote:

> David Greene wrote:
>> Personally, I still find source-to-asm tests to be highly valuable and I
>> don't think we need test-suite for that. Such tests don't (usually)
>> depend on system libraries (headers may occasionally be an issue but I
>> would argue that the test is too fragile in that case).
>>
>> So maybe we separate concerns. Use test-suite to do the kind of
>> system-level testing you've discussed but still allow some tests in a
>> monorepo top-level directory that test across components but don't
>> depend on system configurations.
>>
>> If people really object to a top-level monorepo test directory I guess
>> they could go into test-suite but that makes it much more cumbersome to
>> run what really should be very simple tests.
>
> The main thing I see that will justify push-back on such test is the
> maintenance: you need to convince everyone that every component in LLVM
> must also maintain (update, fix, etc.) the tests that are in other
> components (clang, flang, other future subproject, etc.). Changing the
> vectorizer in the middle-end may require now to understand the kind of
> update a test written in Fortran (or Haskell?) is checking with some
> Hexagon assembly. This is a non-trivial burden when you compute the
> full matrix of possible frontend and backends.

So how is this different from today? If I put in a patch that breaks Hexagon, or compiler-rt, or LLDB, none of which I really understand... or omg Chrome, which isn't even an LLVM project... it's still my job to fix whatever is broken. If it's some component where I am genuinely clueless, I'm expected to ask for help. Integration tests would not be any different.

Flaky or fragile tests that constantly break for no good reason would need to be replaced or made more robust. Again this is no different from any other flaky or fragile test.

I can understand people being worried that because an integration test depends on more components, it has a wider "surface area" of potential breakage points. This, I claim, is exactly the *value* of such tests. And I see them breaking primarily under two conditions.

1) Something is broken that causes other component-level failures. Fixing that component-level problem will likely fix the integration test as well; or, the integration test must be fixed the same way as the component-level tests.

2) Something is broken that does *not* cause other component-level failures. That's exactly what integration tests are for! They verify *interactions* that are hard or maybe impossible to test in a component-level way.

The worry I'm hearing is about a third category:

3) Integration tests fail due to fragility or overly-specific checks.

...which should be addressed in exactly the same way as our overly fragile or overly specific component-level tests. Is there some reason they wouldn't be?

--paulr
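The REQUIRES mechanism mentioned above can be sketched as follows. This is a simplified Python model, not lit's actual implementation: `required_features` and `should_run` are hypothetical helpers, and real lit accepts boolean expressions in `REQUIRES:` lines while this sketch only handles comma-separated feature names. The embedded C file shows what a small source-to-asm FMA test might look like; the exact flags and the `vfmadd` check are assumptions that have not been verified against a real build.

```python
import re

# Hypothetical end-to-end test file, kept as text so the sketch is
# self-contained. Flags and CHECK lines are illustrative assumptions.
FMA_TEST = """\
// REQUIRES: x86-registered-target
// RUN: %clang --target=x86_64-unknown-linux-gnu -O2 -mfma \\
// RUN:   -ffp-contract=fast -S -o - %s | FileCheck %s
// CHECK-LABEL: fma_candidate:
// CHECK: vfmadd
double fma_candidate(double a, double b, double c) { return a * b + c; }
"""

REQUIRES_RE = re.compile(r"REQUIRES:\s*(.*)")


def required_features(test_source):
    """Collect feature names from every REQUIRES: line in the file."""
    features = []
    for line in test_source.splitlines():
        match = REQUIRES_RE.search(line)
        if match:
            names = match.group(1).split(",")
            features.extend(n.strip() for n in names if n.strip())
    return features


def should_run(test_source, available_features):
    """Run the test only if the build provides every required feature."""
    return all(f in available_features
               for f in required_features(test_source))


if __name__ == "__main__":
    # A build with the X86 back-end runs the test; an AArch64-only
    # build reports it as unsupported instead of failing.
    print(should_run(FMA_TEST, {"x86-registered-target"}))
    print(should_run(FMA_TEST, {"aarch64-registered-target"}))
```

Under this model a test that spans front end and back end is skipped rather than failed when the needed target is not part of the build configuration, which is the behaviour the 577 existing "registered-target" tests already rely on.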
David Greene via llvm-dev
2019-Oct-17 17:09 UTC
[llvm-dev] [Openmp-dev] [cfe-dev] RFC: End-to-end testing
Renato Golin <rengolin at gmail.com> writes:

> On Wed, 16 Oct 2019 at 21:00, David Greene <greened at obbligato.org> wrote:
>> Can you elaborate? I'm talking about very small tests targeted to
>> generate a specific instruction or small number of instructions.
>> Vectorization isn't the best example. Something like verifying FMA
>> generation is a better example.
>
> To check that instructions are generated from source, a two-step test
> is the best approach:
> - Verify that Clang emits different IR for different options, or the
> right IR for a new functionality
> - Verify that the affected targets (or at least two of the main ones)
> can take that IR and generate the right asm

Yes, of course we have tests like that. We have found they are not always sufficient.

> If you want to do the test in Clang all the way to asm, you need to
> make sure the back-end is built. Clang is not always built with all
> back-ends, possibly even none.

Right, which is why we have things like REQUIRES: x86-registered-target.

> To do that in the back-end, you'd have to rely on Clang being built,
> which is not always true.

Sure.

> Hacking our test infrastructure to test different things when a
> combination of components is built, especially after they start to
> merge after being in a monorepo, will complicate tests and increase
> the likelihood that some tests will never be run by CI and bit rot.

From other discussion, it sounds like at least some people are open to asm tests under clang. I think that should be fine. But there are probably other kinds of end-to-end tests that should not live under clang.

> On the test-suite, you can guarantee that the whole toolchain is
> available: Front and back end of the compilers, assemblers (if
> necessary), linkers, libraries, etc.
>
> Writing a small source file per test, as you would in Clang/LLVM,
> running LIT and FileCheck, and *always* running it in the TS would be
> trivial.

How often would such tests be run as part of test-suite? Honestly, it's not really clear to me exactly which bots cover what, how often they run and so on. Is there a document somewhere describing the setup?

-David