Joel E. Denny via llvm-dev
2018-May-04  15:40 UTC
[llvm-dev] RFC: [FileCheck] CHECK-DAG for multiple occurrences of string
Hi,
Using FileCheck, I have not found a way to make a group of CHECK-DAG
directives match multiple occurrences of a string.  For example, I naively
thought the following would match successfully:
```
$ cat checks.txt
// CHECK: start
// CHECK-DAG: foo
// CHECK-DAG: foo
// CHECK-DAG: bar
// CHECK-NEXT: end
$ cat input.txt
start
foo
bar
foo
end
$ FileCheck --input-file=input.txt checks.txt
checks.txt:5:16: error: CHECK-NEXT: is not on the line after the previous
match
// CHECK-NEXT: end
               ^
input.txt:5:1: note: 'next' match was here
end
^
input.txt:3:4: note: previous match ended here
bar
   ^
input.txt:4:1: note: non-matching line after previous match is here
foo
^
```
The trouble is that both "CHECK-DAG: foo" directives match the first
"foo".
I'd like this ability for testing a parallel program that outputs a series
of non-unique strings in non-deterministic order.  Am I trying to push
FileCheck beyond its intended domain?  Is there some existing feature for
this purpose that I've overlooked?  If not, I see two potential solutions:
1. In a CHECK-DAG group, don't let the matches for patterns overlap.
2. Add a new CHECK-DAG-N directive, where N is some integer, to express
that a pattern must have N non-overlapping matches.
An advantage of #1 that the intuitive way (at least in my mind) of
expressing multiple occurrences of a string, as in the example above, would
work.  An advantage of #2 is that existing CHECK-DAG functionality would
not change, and so there should be no chance of impacting existing well
formed tests.
To understand the issue better, I've prototyped #2.  It still needs test
cases and documentation, so it's not ready for a formal patch review.  If
people like the idea, I'll polish it up.
Thanks.
Joel
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20180504/d488038a/attachment.html>
Jessica Paquette via llvm-dev
2018-May-04  16:45 UTC
[llvm-dev] RFC: [FileCheck] CHECK-DAG for multiple occurrences of string
I would personally like a feature like that in FileCheck because it would make it a lot easier to write MachineOutliner tests, and would make the tests significantly smaller and easier to understand. - Jessica> On May 4, 2018, at 8:40 AM, Joel E. Denny via llvm-dev <llvm-dev at lists.llvm.org> wrote: > > Hi, > > Using FileCheck, I have not found a way to make a group of CHECK-DAG directives match multiple occurrences of a string. For example, I naively thought the following would match successfully: > > ``` > $ cat checks.txt > // CHECK: start > // CHECK-DAG: foo > // CHECK-DAG: foo > // CHECK-DAG: bar > // CHECK-NEXT: end > > $ cat input.txt > start > foo > bar > foo > end > > $ FileCheck --input-file=input.txt checks.txt > checks.txt:5:16: error: CHECK-NEXT: is not on the line after the previous match > // CHECK-NEXT: end > ^ > input.txt:5:1: note: 'next' match was here > end > ^ > input.txt:3:4: note: previous match ended here > bar > ^ > input.txt:4:1: note: non-matching line after previous match is here > foo > ^ > ``` > > The trouble is that both "CHECK-DAG: foo" directives match the first "foo". > > I'd like this ability for testing a parallel program that outputs a series of non-unique strings in non-deterministic order. Am I trying to push FileCheck beyond its intended domain? Is there some existing feature for this purpose that I've overlooked? If not, I see two potential solutions: > > 1. In a CHECK-DAG group, don't let the matches for patterns overlap. > > 2. Add a new CHECK-DAG-N directive, where N is some integer, to express that a pattern must have N non-overlapping matches. > > An advantage of #1 that the intuitive way (at least in my mind) of expressing multiple occurrences of a string, as in the example above, would work. An advantage of #2 is that existing CHECK-DAG functionality would not change, and so there should be no chance of impacting existing well formed tests. > > To understand the issue better, I've prototyped #2. It still needs test cases and documentation, so it's not ready for a formal patch review. If people like the idea, I'll polish it up. > > Thanks. > > Joel > _______________________________________________ > LLVM Developers mailing list > llvm-dev at lists.llvm.org > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
Joel E. Denny via llvm-dev
2018-May-04  17:05 UTC
[llvm-dev] RFC: [FileCheck] CHECK-DAG for multiple occurrences of string
Hi Jessica, This time I'm replying all.... On Fri, May 4, 2018 at 12:45 PM, Jessica Paquette <jpaquette at apple.com> wrote:> I would personally like a feature like that in FileCheck because it would > make it a lot easier to write MachineOutliner tests, and would make the > tests significantly smaller and easier to understand. >How do MachineOutliner tests accomplish this now? Can you point me to an example? Thanks. Joel> > - Jessica > > > On May 4, 2018, at 8:40 AM, Joel E. Denny via llvm-dev < > llvm-dev at lists.llvm.org> wrote: > > > > Hi, > > > > Using FileCheck, I have not found a way to make a group of CHECK-DAG > directives match multiple occurrences of a string. For example, I naively > thought the following would match successfully: > > > > ``` > > $ cat checks.txt > > // CHECK: start > > // CHECK-DAG: foo > > // CHECK-DAG: foo > > // CHECK-DAG: bar > > // CHECK-NEXT: end > > > > $ cat input.txt > > start > > foo > > bar > > foo > > end > > > > $ FileCheck --input-file=input.txt checks.txt > > checks.txt:5:16: error: CHECK-NEXT: is not on the line after the > previous match > > // CHECK-NEXT: end > > ^ > > input.txt:5:1: note: 'next' match was here > > end > > ^ > > input.txt:3:4: note: previous match ended here > > bar > > ^ > > input.txt:4:1: note: non-matching line after previous match is here > > foo > > ^ > > ``` > > > > The trouble is that both "CHECK-DAG: foo" directives match the first > "foo". > > > > I'd like this ability for testing a parallel program that outputs a > series of non-unique strings in non-deterministic order. Am I trying to > push FileCheck beyond its intended domain? Is there some existing feature > for this purpose that I've overlooked? If not, I see two potential > solutions: > > > > 1. In a CHECK-DAG group, don't let the matches for patterns overlap. > > > > 2. Add a new CHECK-DAG-N directive, where N is some integer, to express > that a pattern must have N non-overlapping matches. > > > > An advantage of #1 that the intuitive way (at least in my mind) of > expressing multiple occurrences of a string, as in the example above, would > work. An advantage of #2 is that existing CHECK-DAG functionality would > not change, and so there should be no chance of impacting existing well > formed tests. > > > > To understand the issue better, I've prototyped #2. It still needs test > cases and documentation, so it's not ready for a formal patch review. If > people like the idea, I'll polish it up. > > > > Thanks. > > > > Joel > > _______________________________________________ > > LLVM Developers mailing list > > llvm-dev at lists.llvm.org > > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev > >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20180504/ebf6bd06/attachment.html>
via llvm-dev
2018-May-07  16:56 UTC
[llvm-dev] RFC: [FileCheck] CHECK-DAG for multiple occurrences of string
> 1. In a CHECK-DAG group, don't let the matches for patterns overlap. > 2. Add a new CHECK-DAG-N directive, where N is some integer, to express > that a pattern must have N non-overlapping matches.I think #1 is much more intuitive and easy to describe/document than #2. Changing the meaning of DAG in that way is highly unlikely to affect any existing test, IMO. And if it does, my first expectation is that the test wasn't doing what the author intended anyway. --paulr
Joel E. Denny via llvm-dev
2018-May-07  17:06 UTC
[llvm-dev] RFC: [FileCheck] CHECK-DAG for multiple occurrences of string
Hi Paul, On Mon, May 7, 2018 at 12:56 PM, <paul.robinson at sony.com> wrote:> > 1. In a CHECK-DAG group, don't let the matches for patterns overlap. > > 2. Add a new CHECK-DAG-N directive, where N is some integer, to express > > that a pattern must have N non-overlapping matches. > > I think #1 is much more intuitive and easy to describe/document than #2. >Agreed.> Changing the meaning of DAG in that way is highly unlikely to affect any > existing test, IMO. And if it does, my first expectation is that the test > wasn't doing what the author intended anyway. >I'm glad to hear that. I'll look into #1. Thanks. Joel> --paulr > >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20180507/dad25970/attachment.html>
Joel E. Denny via llvm-dev
2018-May-11  22:55 UTC
[llvm-dev] RFC: [FileCheck] CHECK-DAG for multiple occurrences of string
On Mon, May 7, 2018 at 12:56 PM, <paul.robinson at sony.com> wrote:> > 1. In a CHECK-DAG group, don't let the matches for patterns overlap. > > 2. Add a new CHECK-DAG-N directive, where N is some integer, to express > > that a pattern must have N non-overlapping matches. > > I think #1 is much more intuitive and easy to describe/document than #2. > Changing the meaning of DAG in that way is highly unlikely to affect any > existing test, IMO. And if it does, my first expectation is that the test > wasn't doing what the author intended anyway. >I implemented #1, and I now have 105 new unexpected tests failures from llvm core (58), clang (45), openmp runtime (1), and compiler-rt (1) test suites. I've only looked at a few failures so far. Of those, the problem is the expected one: in a CHECK-DAG group, multiple patterns match the same line, so not permitting overlapping matches causes some patterns never to match successfully. However, all the overlapping matches so far seem unintentional. That is, this new implementation seems to be catching test errors. Here are some scenarios I found: S1. Patterns sometimes identical -------------------------------- For example, tools/clang/test/OpenMP/for_codegen.cpp has: ``` // TERM_DEBUG-DAG: [[DBG_LOC_START]] = !DILocation(line: [[@LINE-15]], // TERM_DEBUG-DAG: [[DBG_LOC_END]] = !DILocation(line: [[@LINE-16]], // TERM_DEBUG-DAG: [[DBG_LOC_CANCEL]] = !DILocation(line: [[@LINE-17]], ``` Patterns are the same except for the names of regex variables referenced. The variables happen to have the same value, so the patterns match the same line when overlapping is permitted. Perhaps the test was designed to handle the possibility that the variables might have different values one day, and not permitting overlapping matches would break that case. My guess is that this was actually unintentional, so my solution is to rename all the variables to DBG_LOC and remove two of the above patterns. S2. Superset before subset -------------------------- For example, tools/clang/test/OpenMP/target_parallel_codegen.cpp has this: ``` // CHECK-DAG: store i[[SZ]] {{4|8}}, i[[SZ]]* {{%[^,]+}} ... // CHECK-DAG: store i[[SZ]] 4, i[[SZ]]* {{%[^,]+}} ``` Some pattern X matches a superset of some pattern Y. X appears before Y, but Y's intended match appears before X's intended match in the output. The result is that X matches Y's intended match. If overlapping is not permitted, Y has nothing to match, and FileCheck complains. If overlapping is permitted, Y also matches Y's intended match, and X's intended match is quietly ignored. This case is clearly unintentional. One fix is to move X after Y. A cleaner fix is to somehow make X distinct. Another example of this scenario: * test/Bitcode/thinlto-function-summary-originalnames.ll S3. Wrong number of identical patterns -------------------------------------- This scenario is just an incorrect attempt at the original use case I wanted to address. For example, projects/openmp/runtime/test/ompt/parallel/not_enough_threads.c repeats the following three times: ``` // CHECK-DAG: {{^}}[[THREAD_ID:[0-9]+]]: ompt_event_implicit_task_begin: parallel_id=[[PARALLEL_ID]], task_id=[[IMPLICIT_TASK_ID:[0-9]+]] // CHECK-DAG: {{^}}[[THREAD_ID]]: ompt_event_barrier_begin: parallel_id=[[PARALLEL_ID]], task_id=[[IMPLICIT_TASK_ID]] ``` The author was obviously expecting CHECK-DAG not to permit overlapping matches and wanted to check that the above lines appeared three times. Specifically, he's expecting 3 threads after the master thread, but the run line sets OMP_THREAD_LIMIT=2. If overlapping matches are permitted, this will permit any non-zero number of threads after the master thread. If overlapping matches are not permitted, we must increase OMP_THREAD_LIMIT to at least 4 or reduce the number of pattern repetitions. S4. CHECK-DAG unneeded ---------------------- For example, test/CodeGen/AMDGPU/store-v3i64.ll has: ``` ; GCN-DAG: buffer_store_byte v ; GCN-DAG: buffer_store_byte v ; GCN-DAG: buffer_store_byte v ``` The intention appears to be the same as S3: match a pattern N times. However, this example doesn't actually need CHECK-DAG at all. First, all patterns in the group are the same, so they are not unordered, so CHECK would have been sufficient. Second, there's only one occurrence in the actual output (which I'm assuming is correct), so there should be only one directive, so it really doesn't matter which of these directives we use. S5. Incorrect pattern with same match ------------------------------------- For example, projects/compiler-rt/test/profile/instrprof-visibility.cpp has: ``` // COV-DAG: {{.*}}|{{ *}}1|#ifndef NO_WEAK ... // COV-DAG: {{.*}}|{{ *}}1|#ifndef NO_WEAK ``` The "1" does not appear in first occurrence in the actual output. Let's assume the actual output is correct and the first pattern is incorrect. If overlapping is permitted, the first pattern matches the intended line for the second pattern, which quietly matches that line too. If overlapping is not permitted, FileCheck complains that the second pattern has nothing left to match. An example where the second pattern is incorrect: * test/Bitcode/thinlto-summary-local-5.0.ll An example where the incorrect pattern is identical but it's not as obvious because a regex variable by a different name (ATOMIC_REDUCE_BARRIER_LOC instead of REDUCTION_LOC) is captured within it: * tools/clang/test/OpenMP/for_reduction_codegen_UDR.cpp has: Path Forward ------------------ These examples seem like strong evidence that permitting overlapping matches is a significant source of test errors. Moreover, we have my original goal that some test authors were apparently assuming was already supported: to be able to handle sets of strings that are both unordered and non-unique. Nevertheless, it's possible that there are use cases, such as the one suggested by S1, that we lose. Here are some potential paths forward: P1. Give up on approach #1 (non-overlapping CHECK-DAG) and go back to #2 (CHECK-DAG-N). I don't prefer this, but it would be less work. P2. Make CHECK-DAG non-overlapping but fix all test cases before committing that change. Patch reviews will surely be slow. P3. Make non-overlapping CHECK-DAG a special mode that's off by default until all test suites have been fixed. By the way, I think it would be generally useful for FileCheck to have a debugging mode that prints every match. It would be particularly useful while migrating to a non-overlapping CHECK-DAG. Thoughts? Joel> --paulr > >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20180511/479976df/attachment.html>
Reasonably Related Threads
- RFC: [FileCheck] CHECK-DAG for multiple occurrences of string
- RFC: [FileCheck] CHECK-DAG for multiple occurrences of string
- RFC: [FileCheck] CHECK-DAG for multiple occurrences of string
- RFC: [FileCheck] CHECK-DAG for multiple occurrences of string
- RFC: [FileCheck] CHECK-DAG for multiple occurrences of string