Michael Kruse via llvm-dev
2020-Jun-24  01:33 UTC
[llvm-dev] [RFC] Compiled regression tests.
Hello LLVM community,
For testing IR passes, LLVM currently has two kinds of tests:
 1. regression tests (in llvm/test); .ll files invoking opt, and
matching its text output using FileCheck.
 2. unittests (in llvm/unittests); Google tests containing the IR as a
string, constructing a pass pipeline, and inspecting the output using
code.
I propose to add an additional kind of test, which I call "compiled
regression test", combining the advantages of the two. A test is a
single .cxx file of the general structure below that can be dumped
into the llvm/test directory. I am not proposing to replace FileCheck,
but in a lot of cases, domain-specific verifiers can be more powerful
(e.g. verify-uselistorder or `clang -verify`).
    #ifdef IR
      define void @func() {
      entry:
        ret void
      }
    #else /* IR */
      #include "compiledtestboilerplate.h"
      TEST(TestSuiteName, TestName) {
        unique_ptr<Module> Output =
            run_opt(__FILE__, "IR", "-passes=loop-vectorize");
        /* Check Output */
      }
    #endif /* IR */
That is, input IR and check code are in the same file. The run_opt
command is a replica of main() from the opt tool, so any command line
arguments (passes with legacy or new passmanager, cl::opt options,
etc.) can be passed. It also makes converting existing tests simpler.
The top-level structure is C++ (i.e. the LLVM-IR is removed by the
preprocessor) and compiled with cmake. This allows a
compile_commands.json to be created such that refactoring tools,
clang-tidy, and clang-format can easily be applied on the code. The
second argument to run_opt is the preprocessor directive for the IR
such that multiple IR modules can be embedded into the file.
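
One way a helper could locate the embedded IR from the file name and the directive name is sketched here. This is only an illustration: `extractGuardedSection` is a hypothetical name that does not come from D82426, and the sketch assumes the guard is written exactly as `#ifdef NAME`, that the section ends at the first `#else` or `#endif`, and that guards do not nest.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical helper (not from the patch): returns the text between
// "#ifdef <Guard>" and the next "#else" or "#endif" line.
// Assumes the guarded section contains no nested conditionals.
std::string extractGuardedSection(const std::string &Source,
                                  const std::string &Guard) {
  assert(!Guard.empty() && "guard name must not be empty");
  std::istringstream In(Source);
  std::string Line, Result;
  bool Inside = false;
  while (std::getline(In, Line)) {
    // Ignore leading whitespace when looking for directives.
    std::size_t First = Line.find_first_not_of(" \t");
    std::string Trimmed =
        First == std::string::npos ? std::string() : Line.substr(First);
    if (!Inside) {
      if (Trimmed == "#ifdef " + Guard)
        Inside = true;
      continue;
    }
    // The first #else or #endif closes the section.
    if (Trimmed.rfind("#else", 0) == 0 || Trimmed.rfind("#endif", 0) == 0)
      break;
    Result += Line;
    Result += '\n';
  }
  return Result;
}
```

Calling it with a different guard name, e.g. a hypothetical `IR2`, would select a second module embedded in the same file.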
Such tests can be compiled in two modes: Either within the LLVM
project, or as an external subproject using llvm_ExternalProject_Add.
The former has the disadvantage that new .cxx files dumped into the
test folder are not recognized until the next cmake run, unless the
CONFIGURE_DEPENDS option is used. I found that this adds seconds to
each invocation of ninja, which I consider a dealbreaker. The external
project searches for tests every time, but is only invoked in a
check-llvm run, no different than llvm-lit. It uses CMake's
find_package to build against the main project's results (which we
currently do not have tests for) and could also be compiled in
debug mode while LLVM itself is compiled in release mode.
The checks themselves can be any of gtest's ASSERT/EXPECT macros, but
for common test idioms I suggest adding custom macros, such as
    ASSERT_ALL_OF(InstList, !isa<VectorType>(I->getType()));
which on failure prints the offending instruction, i.e. the one that
does return a vector. Try that with FileCheck. PatternMatch.h from
InstCombine can be used as
well. Structural comparison with a reference output could also be
possible (like clang-diff,
[llvm-canon](http://llvm.org/devmtg/2019-10/talk-abstracts.html#tech12),
https://reviews.llvm.org/D80916).
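
The core of an "all of" check that can name the offending element might look like the sketch below. It is a minimal illustration under stated assumptions, not the macro from the patch: `FakeInst`, `firstViolation` and `checkNoVectors` are hypothetical names, `FakeInst` merely stands in for an LLVM instruction, and plain `assert` is used instead of gtest so the example stays self-contained.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for llvm::Instruction, just enough for this sketch.
struct FakeInst {
  std::string Text; // printed form of the instruction
  bool IsVector;    // whether the result type is a vector
};

// Returns a pointer to the first element for which P is false,
// or nullptr if every element satisfies P.
template <typename Range, typename Pred>
const typename Range::value_type *firstViolation(const Range &R, Pred P) {
  for (const auto &Elt : R)
    if (!P(Elt))
      return &Elt;
  return nullptr;
}

// Usage: no instruction may produce a vector.
inline void checkNoVectors(const std::vector<FakeInst> &Insts) {
  const FakeInst *Bad =
      firstViolation(Insts, [](const FakeInst &I) { return !I.IsVector; });
  assert(Bad == nullptr && "an instruction returns a vector");
}
```

A test macro built on top of this could put the returned element into its failure message, which is the information a purely textual match does not readily provide.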
Some additional tooling could be helpful:
 * A test file creator, taking some IR, wrapping it into the above
structure, and writing it into the test directory.
 * A tool for extracting and updating (running opt) the IR inside the
#ifdef, or even adding this functionality to opt itself. This is the
main reason not to just put the IR inside a string.
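
The first of these tools could be as small as the following sketch. It is a hypothetical illustration, not part of D82426: the function name `wrapIRInCompiledTest` and its parameters are assumptions, and the generated text simply mirrors the file layout shown earlier in this message.

```cpp
#include <cassert>
#include <string>

// Hypothetical generator: wraps IR text into the
// "#ifdef IR ... #else ... #endif" layout of a compiled regression test.
// OptArgs is pasted verbatim into a string literal, so this sketch
// assumes it contains no quotes or backslashes.
std::string wrapIRInCompiledTest(const std::string &IR,
                                 const std::string &Suite,
                                 const std::string &Name,
                                 const std::string &OptArgs) {
  assert(!Suite.empty() && !Name.empty() && "TEST() needs two identifiers");
  std::string Out;
  Out += "#ifdef IR\n";
  Out += IR;
  if (!IR.empty() && IR.back() != '\n')
    Out += '\n'; // keep the directive that follows on its own line
  Out += "#else /* IR */\n";
  Out += "#include \"compiledtestboilerplate.h\"\n";
  Out += "TEST(" + Suite + ", " + Name + ") {\n";
  Out += "  unique_ptr<Module> Output = run_opt(__FILE__, \"IR\", \"" +
         OptArgs + "\");\n";
  Out += "  /* Check Output */\n";
  Out += "}\n";
  Out += "#endif /* IR */\n";
  return Out;
}
```

The caller would then write the returned string to a new .cxx file in the test directory.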
A Work-in-Progress differential and what it improves over FileCheck
and unittests is available here: https://reviews.llvm.org/D82426
Any kind of feedback welcome.
David Blaikie via llvm-dev
2020-Jun-24  05:36 UTC
[llvm-dev] [RFC] Compiled regression tests.
I'm pretty change averse - and am in this case (but doesn't mean other folks won't be in favor and doesn't mean it isn't worth trying, etc - but if it were up to me, at the moment, I'd decline)

To take one point from the phab review (to try to keep the discussion here rather than there - might be worth a note on the phab review to discourage commenting there so the discussion doesn't get spread through different threads):

> Because of all of the above, maintenance of all the regression tests is a nightmare. I expect it to be a serious issue for introducing opaque pointers. My prediction is that we will have a typed-pointer command line flag to not have to require updating all the write-only regression tests.

Actually I already did a lot of work with the initial patches years ago for opaque pointers (that added explicit types to gep/store/etc) and used (& provided in the commit messages) python scripts to migrate all the tests, both the IR itself and the CHECK text. This is probably more readily automatable than a more free-form C++ based checking proposed here.

That said, it sounds like the proposal is a lot closer to the GUnit tests, and if this testing strategy is valuable, it seems like it could be mostly achieved by adding some API utilities (like the one you proposed in the patch) to make it more convenient to run optimization passes in a GUnit test. It doesn't seem to me like an #ifdef based approach to embedding IR in C++ would result in particularly more manageable/formattable code than a raw string. Perhaps the proposed improvements could be used to reduce/remove the cost of adding new GUnit tests/the need to touch CMake files/etc.

(though I'd worry about the divergence of where optimization tests are written - as unit tests or as lit/FileCheck tests - that doesn't mean experimentation isn't worthwhile, but I think it'd require a pretty compelling justification to propose a replacement to the FileCheck approach (& perhaps a timeline for an experiment before either removing it, or deciding that it is the intended future state) - and if it's not a replacement, then I think we'd need to discuss what sort of situations this new thing is suitable and what FileCheck should be used for going forward)

- Dave
Roman Lebedev via llvm-dev
2020-Jun-24  10:58 UTC
[llvm-dev] [RFC] Compiled regression tests.
On Wed, Jun 24, 2020 at 12:35 PM David Blaikie via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> > Because of all of the above, maintenance of all the regression tests is a nightmare. I expect it to be a serious issue for introducing opaque pointers. My prediction is that we will have a typed-pointer command line flag to not have to require updating all the write-only regression tests.
>
> Actually I already did a lot of work with the initial patches years ago for opaque pointers (that added explicit types to gep/store/etc) and used (& provided in the commit messages) python scripts to migrate all the tests, both the IR itself and the CHECK text. This is probably more readily automatable than a more free-form C++ based checking proposed here.

+1, I'm not sure what kind of scalability issues there are with updating existing tests.

Roman
Michael Kruse via llvm-dev
2020-Jun-24  15:03 UTC
[llvm-dev] [RFC] Compiled regression tests.
On Wed, Jun 24, 2020 at 00:37, David Blaikie <dblaikie at gmail.com> wrote:
> I'm pretty change averse - and am in this case (but doesn't mean other folks won't be in favor and doesn't mean it isn't worth trying, etc - but if it were up to me, at the moment, I'd decline)

That's understandable. New features also come with a cost that they need to recoup with their usefulness.

> To take one point from the phab review (to try to keep the discussion here rather than there - might be worth a note on the phab review to discourage commenting there so the discussion doesn't get spread through different threads):

I added a remark to the Phabricator thread. (I kept these arguments there to keep the RFC itself short.)

> > Because of all of the above, maintenance of all the regression tests is a nightmare. I expect it to be a serious issue for introducing opaque pointers. My prediction is that we will have a typed-pointer command line flag to not have to require updating all the write-only regression tests.
>
> Actually I already did a lot of work with the initial patches years ago for opaque pointers (that added explicit types to gep/store/etc) and used (& provided in the commit messages) python scripts to migrate all the tests, both the IR itself and the CHECK text. This is probably more readily automatable than a more free-form C++ based checking proposed here.
>
> That said, it sounds like the proposal is a lot closer to the GUnit tests, and if this testing strategy is valuable, it seems like it could be mostly achieved by adding some API utilities (like the one you proposed in the patch) to make it more convenient to run optimization passes in a GUnit test. It doesn't seem to me like an #ifdef based approach to embedding IR in C++ would result in particularly more manageable/formattable code than a raw string. Perhaps the proposed improvements could be used to reduce/remove the cost of adding new GUnit tests/the need to touch CMake files/etc.
>
> (though I'd worry about the divergence of where optimization tests are written - as unit tests or as lit/FileCheck tests - that doesn't mean experimentation isn't worthwhile, but I think it'd require a pretty compelling justification to propose a replacement to the FileCheck approach (& perhaps a timeline for an experiment before either removing it, or deciding that it is the intended future state) - and if it's not a replacement, then I think we'd need to discuss what sort of situations this new thing is suitable and what FileCheck should be used for going forward)

As mentioned in the Differential, generating the tests automatically will lose information about what actually is intended to be tested, making it harder to understand the test. If we had a method to just update all tests (Polly has `ninja polly-update-format` to automatically fix `ninja polly-check-format` failures), updating would be easier, but because of the lack of understanding, in practice most changes would just be glossed over.

We already have a split between unittests (e.g. llvm/Transforms/Scalar/LICMTests.cpp) and regression tests (llvm/test/Transforms/LICM/*.ll). New tests in D82426 are located next to the .ll files. The unittests could be moved there too.

Michael
Mehdi AMINI via llvm-dev
2020-Jun-24  16:18 UTC
[llvm-dev] [RFC] Compiled regression tests.
Hi,

On Tue, Jun 23, 2020 at 6:34 PM Michael Kruse via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> Hello LLVM community,
>
> For testing IR passes, LLVM currently has two kinds of tests:
>  1. regression tests (in llvm/test); .ll files invoking opt, and
> matching its text output using FileCheck.
>  2. unittests (in llvm/unittests); Google tests containing the IR as a
> string, constructing a pass pipeline, and inspecting the output using
> code.
>
> I propose to add an additional kind of test, which I call "compiled
> regression test", combining the advantages of the two.

You expand below on the mechanism you'd like to implement, but I am a bit puzzled about the motivation right now?

I'm failing to see what kind of IR-level test (unittests are relevant for data-structures and non-IR internals IMO) we would implement this way that we can't just implement with lit/FileCheck?

Thanks,

--
Mehdi
Michael Kruse via llvm-dev
2020-Jun-24  16:50 UTC
[llvm-dev] [RFC] Compiled regression tests.
On Wed, Jun 24, 2020 at 11:19, Mehdi AMINI <joker.eph at gmail.com> wrote:
> You expand below on the mechanism you'd like to implement, but I am a bit puzzled about the motivation right now?

See https://reviews.llvm.org/D82426 and http://lists.llvm.org/pipermail/llvm-dev/2020-June/142706.html for more motivation.

> I'm failing to see what kind of IR-level test (unittests are relevant for data-structures and non-IR internals IMO) we would implement this way that we can't just implement with lit/FileCheck?

My argument is not that those tests cannot be written using FileCheck, but that they tend (not all of them) to check more than is relevant, may be truisms, and are difficult to reverse-engineer and to update when something changes.

Michael
Joel E. Denny via llvm-dev
2020-Jun-28  18:42 UTC
[llvm-dev] [RFC] Compiled regression tests.
Hi Michael,
You propose using the preprocessor for mixing C++ test code with its input
code, such as LLVM IR.  Of course, LIT, FileCheck, `clang -verify`, and
unit tests all enable mixing their various forms of test code with input
code.  To my eyes, a difference between unit tests vs. LIT, FileCheck, and
`clang -verify` tests is that the latter tend to make the input code more
prominent (I prefer that) and make it easier to clarify which test code is
associated with which input code.  So far, I think the preprocessor
approach is better in this regard than unit tests, but what about a more
familiar syntax, like the following?
```
// RUN-CXX: #include "compiledtestboilerplate.h"
// RUN-CXX: unique_ptr<Module> Output = run_opt("%s", "-passes=loop-vectorize");
// RUN-CXX: /* Check func1 Output */
define void @func1() {
entry:
  ret void
}
// RUN-CXX: /* Check func2 Output */
define void @func2() {
entry:
  ret void
}
```
It seems it should be feasible to automate extraction of the `RUN-CXX:`
code for compilation and for analysis by clang-format and clang-tidy.
Perhaps there would be a new script that extracts at build time for all
such uses.  But there are other possibilities that could be considered: a
LIT extension for `RUN-CXX:`, C++ JIT compilation, a clang-format-diff.py
extension that greps modified files for `RUN-CXX:`, etc.
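
To give a rough idea of what such an extraction script would do, here is a minimal sketch in C++. It rests on assumptions: `extractRunCxx` is a hypothetical name, `// RUN-CXX:` is only the directive proposed in this message and not an existing LIT feature, and each directive is assumed to sit on a single line.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical extractor for the proposed "// RUN-CXX:" directive:
// collects the C++ text after each marker, one output line per directive.
// Lines without the marker (the IR itself) are skipped.
std::string extractRunCxx(const std::string &TestFile) {
  static const std::string Marker = "// RUN-CXX:";
  std::istringstream In(TestFile);
  std::string Line, Code;
  while (std::getline(In, Line)) {
    std::size_t Pos = Line.find(Marker);
    if (Pos == std::string::npos)
      continue;
    assert(Pos + Marker.size() <= Line.size() && "marker lies within the line");
    std::string Rest = Line.substr(Pos + Marker.size());
    // Drop the single space that conventionally follows the marker.
    if (!Rest.empty() && Rest[0] == ' ')
      Rest.erase(0, 1);
    Code += Rest;
    Code += '\n';
  }
  return Code;
}
```

The extracted text could then be written to a temporary .cpp file for compilation, or handed to clang-format and clang-tidy.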
LIT, FileCheck, and `RUN-CXX:` directives should then be able to co-exist
in a single test file.  Thus, you might incrementally add `RUN-CXX:`
directives to test files that already contain LIT and FileCheck directives
to handle cases where FileCheck directives are difficult to use.  You could
keep the FileCheck directives when they are more reasonable, or you could
eventually replace them.  You might run `opt` once with a `RUN:` directive
and then check its output `.ll` file with both `RUN-CXX:` and FileCheck
directives (maybe all `RUN:` directives execute before all `RUN-CXX:`
directives, or maybe C++ JIT compilation would permit them to execute in
the order specified).
It's not clear to me whether the above idea is worth the trouble, but I
think I'd at least prefer the syntax to the preprocessor approach.
I also have a vague feeling that something like this has been discussed
before.  If so, please just point me to the discussion.
In any case, thanks for working to improve LLVM's testing infrastructure.
Joel
Michael Kruse via llvm-dev
2020-Jun-29  01:20 UTC
[llvm-dev] [RFC] Compiled regression tests.
On Sun, Jun 28, 2020 at 13:42, Joel E. Denny <jdenny.ornl at gmail.com> wrote:
> You propose using the preprocessor for mixing C++ test code with its input code, such as LLVM IR. Of course, LIT, FileCheck, `clang -verify`, and unit tests all enable mixing their various forms of test code with input code. To my eyes, a difference between unit tests vs. LIT, FileCheck, and `clang -verify` tests is that the latter tend to make the input code more prominent (I prefer that) and make it easier to clarify which test code is associated with which input code. So far, I think the preprocessor approach is better in this regard than unit tests, but what about a more familiar syntax, like the following?
>
> ```
> // RUN-CXX: #include "compiledtestboilerplate.h"
> // RUN-CXX: unique_ptr<Module> Output = run_opt("%s", "-passes=loop-vectorize");
>
> // RUN-CXX: /* Check func1 Output */
> define void @func1() {
> entry:
>   ret void
> }
>
> // RUN-CXX: /* Check func2 Output */
> define void @func2() {
> entry:
>   ret void
> }
> ```
>
> It seems it should be feasible to automate extraction of the `RUN-CXX:` code for compilation and for analysis by clang-format and clang-tidy. Perhaps there would be a new script that extracts at build time for all such uses. But there are other possibilities that could be considered: a LIT extension for `RUN-CXX:`, C++ JIT compilation, a clang-format-diff.py extension that greps modified files for `RUN-CXX:`, etc.

I think there is a significant gain in having the C++ code at the top level, rather than the IR. There are more tools for C++, some of which, such as IDEs, include-what-you-use, youcompleteme, etc., are external, whereas LLVM-IR is project-specific.

If func1 is unrelated to func2, I really think these should not be in the same module. Consider this:

    static const char *Func1IR = R"IR(
      define void @func1() {
      entry:
        ret void
      }
    )IR";
    TEST(MyTests, TestFunc1) {
      unique_ptr<Module> Output = run_opt(Func1IR, "-passes=loop-vectorize");
      /* Check func1 Output */
    }

    static const char *Func2IR = R"IR(
      define void @func2() {
      entry:
        ret void
      }
    )IR";
    TEST(MyTests, TestFunc2) {
      unique_ptr<Module> Output = run_opt(Func2IR, "-passes=loop-vectorize");
      /* Check func2 Output */
    }

Part of the reason is that not all of what makes a function is syntactically contained within that function definition: forward declarations, types, function attributes, metadata, etc. When using update_test_checks.py, these are per-function anyways.

However, for cases where it is indeed beneficial, there could be some preprocessing of the embedded string taking place. Maybe:

    static const char *Func1IR = R"IR(
      define i32 @func1() {
      entry:
        %two = add i32 1, 1 ; ASSERT_TRUE(isa<AddInst>(F["two"]))
        ret i32 %two
      }
    )IR";

    // All collected ASSERT_TRUE emitted into another file
    // #line preprocessor hints could point to the original line.
    #include "inlineasserts.inc"

> I also have a vague feeling that something like this has been discussed before. If so, please just point me to the discussion.

I am not aware of any such previous discussion.

Michael
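
The preprocessing step that collects inline assertions from an embedded IR string could be sketched roughly as follows. Everything here is an assumption made for illustration: the function name `collectInlineAsserts`, the convention that an assertion starts at `; ASSERT_` inside an IR comment, and the idea that the caller knows the source line on which the IR string begins.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical preprocessing step: scans IR text for "; ASSERT_..."
// comments and emits each one as a C++ statement, preceded by a #line
// directive that points back to the line in the original test file.
// FirstLine is the source line number of the first line of IR.
std::string collectInlineAsserts(const std::string &IR,
                                 const std::string &FileName,
                                 unsigned FirstLine) {
  assert(!FileName.empty() && "#line needs a file name");
  std::istringstream In(IR);
  std::ostringstream Out;
  std::string Line;
  for (unsigned LineNo = FirstLine; std::getline(In, Line); ++LineNo) {
    std::size_t Pos = Line.find("; ASSERT_");
    if (Pos == std::string::npos)
      continue;
    Out << "#line " << LineNo << " \"" << FileName << "\"\n";
    // Skip the "; " that introduces the IR comment.
    Out << Line.substr(Pos + 2) << ";\n";
  }
  return Out.str();
}
```

The output would be the content of a generated include file; with the `#line` directives in place, a failing assertion is reported against the IR line it annotates rather than against the generated file.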