Mehdi AMINI via llvm-dev
2019-Oct-09 02:14 UTC
[llvm-dev] [cfe-dev] RFC: End-to-end testing
> I have a bit of concern about this sort of thing - worrying it'll lead to
> people being less cautious about writing the more isolated tests.

I have the same concern. I really believe we need to be careful about
testing at the right granularity to keep things both modular and the
testing maintainable (for instance, checking vectorized ASM from a C++
source through clang has always been considered bad FileCheck practice).
(Not saying that there is no space for better integration testing in
some areas.)

> That said, clearly there's value in end-to-end testing for all the reasons
> you've mentioned (& we do see these problems in practice - recently DWARF
> indexing broke when support for more nuanced language codes was added to
> Clang).
>
> Dunno if they need a new place or should just be more stuff in test-suite,
> though.
>
> On Tue, Oct 8, 2019 at 9:50 AM David Greene via cfe-dev <
> cfe-dev at lists.llvm.org> wrote:
>
>> [ I am initially copying only a few lists since they seem like
>> the most impacted projects and I didn't want to spam all the mailing
>> lists. Please let me know if other lists should be included. ]
>>
>> I submitted D68230 for review but this is not about that patch per se.
>> The patch allows update_cc_test_checks.py to process tests that should
>> check target asm rather than LLVM IR. We use this facility downstream
>> for our end-to-end tests. It strikes me that it might be useful for
>> upstream to do similar end-to-end testing.
>>
>> Now that the monorepo is about to become the canonical source of truth,
>> we have an opportunity for convenient end-to-end testing that we didn't
>> easily have before with svn (yes, it could be done, but in an ugly way).
>> AFAIK the only upstream end-to-end testing we have is in test-suite, and
>> many of those codes are very large and/or unfocused tests.
>>
>> With the monorepo we have a place to put lit-style tests that exercise
>> multiple subprojects, for example tests that ensure the entire clang
>> compilation pipeline executes correctly.

I don't think I agree with the relationship to the monorepo: there was
nothing preventing tests inside the clang project from exercising the
full pipeline already. I don't believe the SVN repo structure was really
a factor in the way the testing was set up; instead it was a deliberate
choice in the way we structure our testing. For instance, I remember
asking about implementing tests that check whether some loops written in
a C source file were properly vectorized by the -O2 / -O3 pipeline, and
it was deemed the kind of test that we don't want to maintain: instead I
was pointed at the test-suite to add better benchmarks there for the
end-to-end story. What is interesting is that the test-suite is not
gonna be part of the monorepo!

To be clear: I'm not saying here that we can't change our way of
testing. I just don't think the monorepo has anything to do with it, and
any change should be carefully motivated and scoped into what does and
doesn't belong in integration tests.

>> We could, for example, create
>> a top-level "test" directory and put end-to-end tests there. Some of
>> the things that could be tested include:
>>
>> - Pipeline execution (debug-pass=Executions)
>> - Optimization warnings/messages
>> - Specific asm code sequences out of clang (e.g. ensure certain loops
>>   are vectorized)
>> - Pragma effects (e.g. ensure loop optimizations are honored)
>> - Complete end-to-end PGO (generate a profile and re-compile)
>> - GPU/accelerator offloading
>> - Debuggability of clang-generated code
>>
>> Each of these things is tested to some degree within their own
>> subprojects, but AFAIK there are currently no dedicated tests ensuring
>> such things work through the entire clang pipeline flow and with other
>> tools that make use of the results (debuggers, etc.). It is relatively
>> easy to break the pipeline while the individual subproject tests
>> continue to pass.

I'm not sure I really see much in your list that isn't purely about
testing clang itself here? Actually, the first one seems more of a pure
LLVM test.

>> I realize that some folks prefer to work on only a portion of the
>> monorepo (for example, they just hack on LLVM). I am not sure how to
>> address those developers WRT end-to-end testing. On the one hand,
>> requiring them to run end-to-end testing means they will have to at
>> least check out and build the monorepo. On the other hand, it seems
>> less than ideal to have people developing core infrastructure and not
>> running tests.

I think we already expect LLVM developers to update clang APIs, and we
revert LLVM patches when clang testing is broken. So the expectation to
maintain the other in-tree projects isn't really new. It is true that
the monorepo will make it easier for everyone to reproduce most failures
locally and to find all the uses of an API across projects (which was
given as a motivation for moving to a monorepo model:
https://llvm.org/docs/Proposals/GitHubMove.html#monorepo ).

--
Mehdi
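For concreteness, the kind of asm check being debated here could look
roughly like the following lit-style test. This is a minimal sketch, not
an existing upstream test: the file name, function, and CHECK pattern
are illustrative, and the exact instruction to check depends on the
target and on the vectorizer's cost model.

    // e2e-vectorize.c: check that a simple loop is vectorized at -O2.
    // RUN: %clang -O2 -target x86_64-unknown-linux-gnu -S -o - %s \
    // RUN:   | FileCheck %s

    // CHECK-LABEL: saxpy:
    // CHECK: mulps
    void saxpy(float *restrict y, const float *restrict x, float a, int n) {
      for (int i = 0; i < n; ++i)
        y[i] += a * x[i]; // expected to become packed SSE mul/add at -O2
    }

This is the kind of test that update_cc_test_checks.py with asm support
(D68230, mentioned above) is meant to help generate and maintain.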
David Greene via llvm-dev
2019-Oct-09 15:12 UTC
[llvm-dev] [cfe-dev] RFC: End-to-end testing
Mehdi AMINI via cfe-dev <cfe-dev at lists.llvm.org> writes:

>> I have a bit of concern about this sort of thing - worrying it'll lead to
>> people being less cautious about writing the more isolated tests.
>
> I have the same concern. I really believe we need to be careful about
> testing at the right granularity to keep things both modular and the
> testing maintainable (for instance checking vectorized ASM from a C++
> source through clang has always been considered a bad FileCheck
> practice). (Not saying that there is no space for better integration
> testing in some areas).

I absolutely disagree about vectorization tests. We have seen
vectorization losses in clang even though the related LLVM lit tests
pass, because something else in the clang pipeline changed and caused
the vectorizer to not do its job. We need both kinds of tests. There
are many asm tests of value beyond vectorization, and they should
include component as well as end-to-end tests.

> For instance I remember asking about implementing tests based on
> checking whether some loops written in a C source file were properly
> vectorized by the -O2 / -O3 pipeline, and it was deemed the kind of
> test that we don't want to maintain: instead I was pointed at the
> test-suite to add better benchmarks there for the end-to-end story.
> What is interesting is that the test-suite is not gonna be part of the
> monorepo!

And it shouldn't be. It's much too big. But there is a place for small
end-to-end tests that live alongside the code.

>>> We could, for example, create
>>> a top-level "test" directory and put end-to-end tests there. Some of
>>> the things that could be tested include:
>>>
>>> - Pipeline execution (debug-pass=Executions)
>>> - Optimization warnings/messages
>>> - Specific asm code sequences out of clang (e.g. ensure certain loops
>>>   are vectorized)
>>> - Pragma effects (e.g. ensure loop optimizations are honored)
>>> - Complete end-to-end PGO (generate a profile and re-compile)
>>> - GPU/accelerator offloading
>>> - Debuggability of clang-generated code
>>>
>>> Each of these things is tested to some degree within their own
>>> subprojects, but AFAIK there are currently no dedicated tests ensuring
>>> such things work through the entire clang pipeline flow and with other
>>> tools that make use of the results (debuggers, etc.). It is relatively
>>> easy to break the pipeline while the individual subproject tests
>>> continue to pass.
>
> I'm not sure I really see much in your list that isn't purely about
> testing clang itself here?

Debugging and PGO involve other components, no? If we want to put clang
end-to-end tests in the clang subdirectory, that's fine with me. But we
need a place for tests that cut across components.

I could also imagine llvm-mca end-to-end tests through clang.

> Actually the first one seems more of a pure LLVM test.

Definitely not. It would test the pipeline as constructed by clang,
which is very different from the default pipeline constructed by
opt/llc. The old and new pass managers also construct different
pipelines. As we have seen from various mailing list messages, this is
surprising to users. Best to document it and check it with testing.

-David
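The PGO case is a good example of a flow that cuts across components
(clang, compiler-rt's profile runtime, llvm-profdata). A minimal sketch
of what such an end-to-end test might look like, assuming a lit setup
that is allowed to execute binaries; the file name and contents are
illustrative, while the flags are clang's standard frontend
instrumentation PGO flags:

    // pgo-e2e.c: exercise the full PGO cycle end to end:
    // instrument, run, merge the profile, and re-compile with it.
    // RUN: %clang -O2 -fprofile-instr-generate %s -o %t
    // RUN: env LLVM_PROFILE_FILE=%t.profraw %t
    // RUN: llvm-profdata merge -output=%t.profdata %t.profraw
    // RUN: %clang -O2 -fprofile-instr-use=%t.profdata -S %s -o /dev/null

    int count;
    int main(void) {
      for (int i = 0; i < 1000; ++i)
        count += i;
      return 0;
    }

As written this only verifies that the whole cycle succeeds; a real test
would additionally FileCheck the asm or remarks of the final compile.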
Mehdi AMINI via llvm-dev
2019-Oct-09 19:38 UTC
[llvm-dev] [cfe-dev] RFC: End-to-end testing
On Wed, Oct 9, 2019 at 8:12 AM David Greene <dag at cray.com> wrote:

> Mehdi AMINI via cfe-dev <cfe-dev at lists.llvm.org> writes:
>
>>> I have a bit of concern about this sort of thing - worrying it'll
>>> lead to people being less cautious about writing the more isolated
>>> tests.
>>
>> I have the same concern. I really believe we need to be careful about
>> testing at the right granularity to keep things both modular and the
>> testing maintainable (for instance checking vectorized ASM from a C++
>> source through clang has always been considered a bad FileCheck
>> practice). (Not saying that there is no space for better integration
>> testing in some areas).
>
> I absolutely disagree about vectorization tests. We have seen
> vectorization losses in clang even though the related LLVM lit tests
> pass, because something else in the clang pipeline changed and caused
> the vectorizer to not do its job.

Of course, and as I mentioned, I tried to add these tests (probably 4
or 5 years ago), but someone (I think Chandler?) asked me at the time:
does it affect a benchmark's performance? If so, why isn't it tracked
there? And if not, does it matter? The benchmark was presented as the
actual way to check this invariant (because you're only vectorizing to
get performance, not for the sake of it). So I never pursued it, even
if I'm a bit puzzled that we don't have such tests.

> We need both kinds of tests. There are many asm tests of value beyond
> vectorization, and they should include component as well as end-to-end
> tests.
>
>> For instance I remember asking about implementing tests based on
>> checking whether some loops written in a C source file were properly
>> vectorized by the -O2 / -O3 pipeline, and it was deemed the kind of
>> test that we don't want to maintain: instead I was pointed at the
>> test-suite to add better benchmarks there for the end-to-end story.
>> What is interesting is that the test-suite is not gonna be part of the
>> monorepo!
>
> And it shouldn't be. It's much too big. But there is a place for small
> end-to-end tests that live alongside the code.
>
>>>> We could, for example, create
>>>> a top-level "test" directory and put end-to-end tests there. Some of
>>>> the things that could be tested include:
>>>>
>>>> - Pipeline execution (debug-pass=Executions)
>>>> - Optimization warnings/messages
>>>> - Specific asm code sequences out of clang (e.g. ensure certain
>>>>   loops are vectorized)
>>>> - Pragma effects (e.g. ensure loop optimizations are honored)
>>>> - Complete end-to-end PGO (generate a profile and re-compile)
>>>> - GPU/accelerator offloading
>>>> - Debuggability of clang-generated code
>>>>
>>>> Each of these things is tested to some degree within their own
>>>> subprojects, but AFAIK there are currently no dedicated tests
>>>> ensuring such things work through the entire clang pipeline flow and
>>>> with other tools that make use of the results (debuggers, etc.). It
>>>> is relatively easy to break the pipeline while the individual
>>>> subproject tests continue to pass.
>>
>> I'm not sure I really see much in your list that isn't purely about
>> testing clang itself here?
>
> Debugging and PGO involve other components, no?

I don't think you need anything other than LLVM core (which is a
dependency of clang) itself?
Things like PGO (unless you're using frontend instrumentation) don't
even have anything to do with clang, so we may get into the situation
David mentioned, where we would rely on clang to test LLVM features,
which I find undesirable.

> If we want to put clang end-to-end tests in the clang subdirectory,
> that's fine with me. But we need a place for tests that cut across
> components.
>
> I could also imagine llvm-mca end-to-end tests through clang.
>
>> Actually the first one seems more of a pure LLVM test.
>
> Definitely not. It would test the pipeline as constructed by clang,
> which is very different from the default pipeline constructed by
> opt/llc.

I am not convinced it is "very" different (they are both using the
PassManagerBuilder AFAIK); I am only aware of very subtle differences.
But more fundamentally: *should* they be different? I would want `opt
-O3` to be able to reproduce the clang pipeline exactly. Isn't it the
role of LLVM's PassManagerBuilder to expose what the "-O3" pipeline is?
If we see the PassManagerBuilder as the abstraction for the pipeline,
then I don't see what testing belongs to clang here; this seems like a
layering violation (and when maintaining the PassManagerBuilder in
LLVM, I wouldn't want to have to update the tests of all the
subprojects using it, because they retest the same feature).

> The old and new pass managers also construct different pipelines. As
> we have seen from various mailing list messages, this is surprising to
> users. Best to document and check it with testing.

Yes: both the old and new pass managers are LLVM components, so
hopefully they are documented and tested in LLVM :)

--
Mehdi
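For what it's worth, the two pipelines under discussion can be dumped
and diffed directly with the legacy pass manager's debug-pass options.
A sketch, with illustrative file names:

    # The -O3 pipeline as clang builds it (legacy pass manager):
    clang -O3 -mllvm -debug-pass=Structure -c test.c -o /dev/null

    # The -O3 pipeline as opt builds it:
    opt -O3 -debug-pass=Structure -disable-output test.ll

(debug-pass=Executions, mentioned in the original list, prints each
pass as it actually runs rather than the static pipeline structure.)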
Florian Hahn via llvm-dev
2019-Oct-10 09:55 UTC
[llvm-dev] [cfe-dev] RFC: End-to-end testing
> On Oct 9, 2019, at 16:12, David Greene via llvm-dev
> <llvm-dev at lists.llvm.org> wrote:
>
> Mehdi AMINI via cfe-dev <cfe-dev at lists.llvm.org> writes:
>
>>> I have a bit of concern about this sort of thing - worrying it'll
>>> lead to people being less cautious about writing the more isolated
>>> tests.
>>
>> I have the same concern. I really believe we need to be careful about
>> testing at the right granularity to keep things both modular and the
>> testing maintainable (for instance checking vectorized ASM from a C++
>> source through clang has always been considered a bad FileCheck
>> practice). (Not saying that there is no space for better integration
>> testing in some areas).
>
> I absolutely disagree about vectorization tests. We have seen
> vectorization losses in clang even though the related LLVM lit tests
> pass, because something else in the clang pipeline changed and caused
> the vectorizer to not do its job. We need both kinds of tests. There
> are many asm tests of value beyond vectorization, and they should
> include component as well as end-to-end tests.

Have you considered alternatives to checking the assembly for ensuring
vectorization or other transformations? For example, instead of
checking the assembly, we could check LLVM's statistics or optimization
remarks. If you want to ensure a loop got vectorized, you could check
the loop-vectorize remarks, which give you the position of the loop in
the source and the vectorization/interleave factors used. There are a
few other things that could go wrong later on and prevent vector
instruction selection, but I think this should be sufficient to guard
against most cases where we lose vectorization, and it should be much
more robust to unrelated changes. If there are additional properties
you want to ensure, they could potentially be added to the remarks as
well.

This idea of leveraging statistics and optimization remarks to track
the impact of changes on overall optimization results is nothing new,
and I think several people have already discussed it in various forms.
For regular benchmark runs, in addition to tracking the existing
benchmarks, we could also track selected optimization remarks (e.g.
loop-vectorize, but not necessarily noisy ones like gvn) and
statistics. Comparing those run-to-run could potentially highlight new
end-to-end issues on a much larger scale, across all the benchmarks
integrated in test-suite. We might be able to detect losses in
vectorization proactively, instead of requiring someone to file a bug
report and then adding an isolated test after the fact. But building
something like this would be much more work, of course.

Cheers,
Florian
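A remark-based check along the lines Florian sketches could look
roughly like this. It is a hypothetical test: -Rpass=loop-vectorize is
clang's existing flag for emitting these remarks, but the exact remark
wording can vary between versions.

    // remark-vectorize.c: require the loop-vectorize remark instead of asm.
    // RUN: %clang -O2 -Rpass=loop-vectorize -c %s -o /dev/null 2>&1 \
    // RUN:   | FileCheck %s

    // CHECK: remark: vectorized loop (vectorization width: {{[0-9]+}}, interleaved count: {{[0-9]+}})
    void f(float *restrict y, const float *restrict x, float a, int n) {
      for (int i = 0; i < n; ++i)
        y[i] += a * x[i];
    }

Such a test pins down that the vectorizer fired, and with which
factors, without depending on instruction selection, which is exactly
the robustness trade-off described above.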