On 7/1/20 12:40 AM, Michael Kruse via llvm-dev wrote:> To illustrate some difficulties with FileCheck, let's make a > non-semantic change in LLVM: > > --- a/llvm/lib/Analysis/VectorUtils.cpp > +++ b/llvm/lib/Analysis/VectorUtils.cpp > @@ -642,8 +642,8 @@ MDNode *llvm::uniteAccessGroups(MDNode > *AccGroups1, MDNode *AccGroups2) { > return AccGroups1; > > SmallSetVector<Metadata *, 4> Union; > - addToAccessGroupList(Union, AccGroups1); > addToAccessGroupList(Union, AccGroups2); > + addToAccessGroupList(Union, AccGroups1); > > if (Union.size() == 0) > return nullptr; > > This changes the order of access groups when merging them. > Some background: Access groups are used to mark that a side-effect > (e.g. memory access) does not limit the parallelization of a loop, > added by e.g. #pragma clang loop vectorize(assume_safety). Therefore a > loop can be assumed to be parallelizable without further dependency > analysis if all its side-effect instructions are marked this way (and > it has no loop-carried scalar dependencies). > An access group represents the instructions marked for a specific > loop. A single instruction can be parallel with regards to multiple > loops, i.e. belong to multiple access groups. The order of the access > groups is insignificant, i.e. whichever of `AccGroups1` or > `AccGroups2` comes first, the instruction is marked parallel for both > loops. 
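The order sensitivity described above can be reproduced in a few lines of standalone C++. This is a hypothetical stand-in for `SmallSetVector` and `uniteAccessGroups` (the names and the two-element example are illustrative, not LLVM's actual code); it shows that swapping the two `addToAccessGroupList` calls leaves the resulting *set* identical but reverses the element order:

```cpp
#include <string>
#include <vector>

// Minimal stand-in for LLVM's SmallSetVector: set semantics, but with a
// deterministic insertion order (a simplification, not the real class).
static void addUnique(std::vector<std::string> &Set, const std::string &S) {
  for (const auto &I : Set)
    if (I == S)
      return; // already present: the *first* insertion fixes the position
  Set.push_back(S);
}

// Unite two access-group lists, mirroring the structure of
// uniteAccessGroups: the result is the same set either way, but the
// element order depends on which list is added first.
std::vector<std::string> unite(bool Groups1First) {
  std::vector<std::string> Union;
  if (Groups1First) {
    addUnique(Union, "AccGroup1");
    addUnique(Union, "AccGroup2");
  } else {
    addUnique(Union, "AccGroup2");
    addUnique(Union, "AccGroup1");
  }
  return Union;
}
```

Because the union is later printed in element order, a test that matches the printed order is really testing an implementation detail, not the set semantics.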
> > Even though the change should have no effect on correctness, `ninja > check-llvm` fails: > > test/Transforms/Inline/parallel-loop-md-merge.ll > > string not found in input > ; CHECK: ![[ACCESSES_INNER]] = !{!"llvm.loop.parallel_accesses", > ![[ACCESS_GROUP_INNER]]} > ^ > <stdin>:45:1: note: scanning from here > !7 = !{!"llvm.loop.parallel_accesses", !5} > ^ > <stdin>:45:1: note: with "ACCESSES_INNER" equal to "7" > !7 = !{!"llvm.loop.parallel_accesses", !5} > ^ > <stdin>:45:1: note: with "ACCESS_GROUP_INNER" equal to "4" > !7 = !{!"llvm.loop.parallel_accesses", !5} > ^ > > The line FileCheck points to has nothing to do with the changed order > of access groups. > (What's the point of annotating the `!7 = ...` line repeatedly? Why > not point to the lines where ACCESSES_INNER/ACCESS_GROUP_INNER got > their values from?) > > The CHECK lines, annotated with their line numbers from > parallel-loop-md-merge.ll, are: > > 68 ; CHECK: store double 4.200000e+01, {{.*}} !llvm.access.group > ![[ACCESS_GROUP_LIST_3:[0-9]+]] > 69 ; CHECK: br label %for.cond.i, !llvm.loop ![[LOOP_INNER:[0-9]+]] > 70 ; CHECK: br label %for.cond, !llvm.loop ![[LOOP_OUTER:[0-9]+]] > 71 > 72 ; CHECK: ![[ACCESS_GROUP_LIST_3]] = > !{![[ACCESS_GROUP_INNER:[0-9]+]], ![[ACCESS_GROUP_OUTER:[0-9]+]]} > 73 ; CHECK: ![[ACCESS_GROUP_INNER]] = distinct !{} > 74 ; CHECK: ![[ACCESS_GROUP_OUTER]] = distinct !{} > 75 ; CHECK: ![[LOOP_INNER]] = distinct !{![[LOOP_INNER]], > ![[ACCESSES_INNER:[0-9]+]]} > 76 ; CHECK: ![[ACCESSES_INNER]] = > !{!"llvm.loop.parallel_accesses", ![[ACCESS_GROUP_INNER]]} > 77 ; CHECK: ![[LOOP_OUTER]] = distinct !{![[LOOP_OUTER]], > ![[ACCESSES_OUTER:[0-9]+]]} > 78 ; CHECK: ![[ACCESSES_OUTER]] = > !{!"llvm.loop.parallel_accesses", ![[ACCESS_GROUP_OUTER]]} > > And the output of `FileCheck --dump-input -v` is: > > 38: !0 = !{!1} > 39: !1 = distinct !{!1, !2, !"callee: %A"} > 40: !2 = distinct !{!2, !"callee"} > 41: !3 = !{!4, !5} > check:72 ^~~~~~~~~~~~~~ > 42: !4 = distinct !{} > check:73 
^~~~~~~~~~~~~~~~~ > 43: !5 = distinct !{} > check:74 ^~~~~~~~~~~~~~~~~ > 44: !6 = distinct !{!6, !7} > check:75 ^~~~~~~~~~~~~~~~~~~~~~~ > 45: !7 = !{!"llvm.loop.parallel_accesses", !5} > check:76 X~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ error: no > match found > 46: !8 = distinct !{!8, !9} > check:76 ~~~~~~~~~~~~~~~~~~~~~~~ > 47: !9 = !{!"llvm.loop.parallel_accesses", !4} > check:76 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > > To fix this regression test, one needs to answer the following questions: > > * Is `!0 = !{!1}` supposed to match with anything? > * Which of the CHECK lines are the ones that are important for the > regression test? > * The `![[SOMETHING]] = distinct !{}` line appears twice. Which one > is ACCESS_GROUP_INNER/ACCESS_GROUP_OUTER? > * There are two lines with `llvm.loop.parallel_accesses` as well. > The author of this regression test was kind enough to give the > placeholders descriptive names, but do they match with what > FileCheck matched? > * LOOP_INNER and LOOP_OUTER are only distinguished by position and > `.i`. Why does the inner loop come first? > * Before the patch that modifies the order, the output of > --dump-input=always was: > > 38: !0 = !{!1} > 39: !1 = distinct !{!1, !2, !"callee: %A"} > 40: !2 = distinct !{!2, !"callee"} > 41: !3 = !{!4, !5} > 42: !4 = distinct !{} > 43: !5 = distinct !{} > 44: !6 = distinct !{!6, !7} > 45: !7 = !{!"llvm.loop.parallel_accesses", !4} > 46: !8 = distinct !{!8, !9} > 47: !9 = !{!"llvm.loop.parallel_accesses", !5} > > This seems to suggest that our change caused !4 and !5 in lines 45 and > 47 to swap. This is not what the patch did, which changed the order of > two MDNode arguments. The only MDNode that has two other MDNodes as > operands is line 41 (!6 and !8 represent loops, as they reference > themselves as first arguments; noalias scopes do this as > well, but there is no noalias metadata in this example). 
> > A hasty fix is to change CHECK lines 76 and 78 to > > 76 ; CHECK: ![[ACCESSES_INNER]] = > !{!"llvm.loop.parallel_accesses", ![[ACCESS_GROUP_OUTER]]} > 78 ; CHECK: ![[ACCESSES_OUTER]] = > !{!"llvm.loop.parallel_accesses", ![[ACCESS_GROUP_INNER]]} > > However, then ACCESS_GROUP_OUTER would belong to ACCESSES_INNER and > ACCESS_GROUP_INNER to ACCESSES_OUTER? This cannot be true. > > > The reason is that MDNodes are emitted in topological order of a > depth-first search. The patch changed the order of ACCESS_GROUP_INNER > and ACCESS_GROUP_OUTER in line 41, but because in the emitted IR file > the first argument of !3 is emitted before the second argument, > ACCESS_GROUP_OUTER is now on line 42 and ACCESS_GROUP_INNER on line 43 > instead. > > Even though this regression test uses proper regex-ing of metadata > node numbers, it is still fragile, as it fails on irrelevant changes. > Once it fails, it requires time to fix properly because it is > non-obvious which lines are intended to match which output lines. The > hasty fix would have mixed up ACCESS_GROUP_OUTER/ACCESS_GROUP_INNER, > and I wouldn't expect someone who is working on the inliner to take > the time to understand all the loop metadata. Set-semantics of > metadata occurs often, such as for noalias metadata, loop properties, > tbaa metadata, etc. > > The difficulty gets amplified when there are multiple functions in the > regression test; the metadata of all functions are appended to the end > and MDNodes with the same content are uniqued, making it impossible to > associate particular MDNodes with specific functions. > > > Ideally the regression test would be robust and understandable, > achievable with two asserts in a unittest: > > Loop &OuterLoop = *LI->begin(); > ASSERT_TRUE(OuterLoop.isAnnotatedParallel()); > Loop &InnerLoop = *OuterLoop.begin(); > ASSERT_TRUE(InnerLoop.isAnnotatedParallel());

I definitely agree that we should not be trying to do this kind of checking using textual metadata-node matching in FileCheck. 
The alternative already available is to add an analysis pass with some kind of verifier output. This output, not the raw metadata itself, can be checked by FileCheck. We also need to check the verification code, but at least that's something we can keep just in one place. For parallel annotations, we already have such a thing (we can run opt -loops -analyze; e.g., in test/Analysis/LoopInfo/annotated-parallel-complex.ll). We also do this kind of thing for the cost model (by running with -cost-model -analyze). To what extent would making more-extensive use of this technique address the use cases you're trying to address? Maybe we should, also, just make more-extensive use of unit tests? Thanks again, Hal

-- Hal Finkel Lead, Compiler Technology and Programming Languages Leadership Computing Facility Argonne National Laboratory
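The depth-first emission order that explains the confusing diff can be modeled in standalone C++. This is a toy model, not LLVM's actual AsmWriter, and the names (`MDNodeModel`, `idsFor`) are hypothetical: nodes are numbered in the order a DFS first reaches them, so swapping the operands of the list node swaps which access group receives the smaller `!N`.

```cpp
#include <map>
#include <utility>
#include <vector>

// Toy model of a metadata node: just a list of operands.
struct MDNodeModel {
  std::vector<const MDNodeModel *> Ops;
};

// Assign !N ids in depth-first, first-visit order, as the printer does.
static void numberDFS(const MDNodeModel *N,
                      std::map<const MDNodeModel *, int> &Ids) {
  for (const MDNodeModel *Op : N->Ops)
    if (!Ids.count(Op)) {
      Ids.emplace(Op, (int)Ids.size()); // the first visit fixes the id
      numberDFS(Op, Ids);
    }
}

// Build !List = !{inner, outer} (or swapped) and return the ids that the
// inner and outer access-group nodes end up with.
std::pair<int, int> idsFor(bool InnerFirst) {
  MDNodeModel Inner, Outer, List;
  List.Ops = InnerFirst
                 ? std::vector<const MDNodeModel *>{&Inner, &Outer}
                 : std::vector<const MDNodeModel *>{&Outer, &Inner};
  std::map<const MDNodeModel *, int> Ids;
  Ids.emplace(&List, 0); // the list node itself gets the first id
  numberDFS(&List, Ids);
  return {Ids[&Inner], Ids[&Outer]};
}
```

Swapping the operand order leaves the set of nodes unchanged but renumbers them, which is why the unrelated-looking dump lines 42/43 changed even though the patch never touched those nodes.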
Michael Kruse via llvm-dev
2020-Jul-01 16:13 UTC
[llvm-dev] [RFC] Compiled regression tests.
On Wed, Jul 1, 2020 at 09:33 Hal Finkel <hfinkel at anl.gov> wrote:> I definitely agree that we should not be trying to do this kind of > checking using textual metadata-node matching in FileCheck. The alternative > already available is to add an analysis pass with some kind of verifier > output. This output, not the raw metadata itself, can be checked by > FileCheck. We also need to check the verification code, but at least that's > something we can keep just in one place. For parallel annotations, we > already have such a thing (we can run opt -loops -analyze; e.g., in > test/Analysis/LoopInfo/annotated-parallel-complex.ll). We also do this kind > of thing for the cost model (by running with -cost-model -analyze). To what > extent would making more-extensive use of this technique address the use > cases you're trying to address? >

The CHECK lines in annotated-parallel-complex.ll are:

; CHECK: Parallel Loop at depth 1
; CHECK-NOT: Parallel
; CHECK: Loop at depth 2
; CHECK: Parallel Loop
; CHECK: Parallel Loop

When adding this test, I had to change LoopInfo to emit the "Parallel" in front of "Loop". For readability, I would have preferred the parallel info as a "tag", such as `Loop (parallel) at depth 1`, but this would break other tests that check "Loop at depth 1". Later I noticed that there are regression tests that check "LoopFullUnrollPass on Loop at depth 3 containing: %l0.0.0<header>", but it seems I got lucky in that none of these loops have parallel annotations. "CHECK-NOT" is inherently fragile. It is too easy to make a change in LLVM that changes the text output and overlook that this test no longer checks what it was supposed to test. For a FileCheck-friendlier output, it could emit "NonParallel" and match this. However, this clutters the output for humans, would actually break the "LoopFullUnrollPass on Loop at depth 3 ..." test, and "CHECK: Parallel" would match it as well since FileCheck ignores word boundaries. The CHECK lines test more than necessary. 
The first and third CHECK lines also check the "at depth" to make it match the correct loop (and not, e.g., match the next inner loop), although we are not interested in the loop depths themselves. Ironically, this is the reason why there cannot be tags between "Loop" and "at depth".

Not all of the loop metadata have passes that print them. For instance, there are loop properties such as llvm.loop.isvectorized. Reading those is usually done using utility functions such as getBooleanLoopAttribute(L, "llvm.loop.isvectorized"). A solution using FileCheck would be to add another pass that prints loop metadata. That pass would only be used for testing, making the release LLVM binaries larger than necessary, and it would still have the other problems.

Processing the IR through a tool can make the output more FileCheck-friendly, but it doesn't make its problems disappear. IMHO it adds to the maintenance burden since it adds more textual interfaces.

Michael
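The word-boundary pitfall mentioned above is easy to demonstrate in standalone C++ (a sketch approximating the difference between a literal substring match and a word-anchored one; the helper names and the "NonParallel" marker are hypothetical, not FileCheck itself):

```cpp
#include <regex>
#include <string>

// Plain substring search, approximating how a literal CHECK pattern
// matches a line: no word boundaries are respected.
bool substrMatch(const std::string &Line, const std::string &Pat) {
  return Line.find(Pat) != std::string::npos;
}

// The same pattern anchored at word boundaries, as an explicit regex
// would express it.
bool wordMatch(const std::string &Line, const std::string &Pat) {
  return std::regex_search(Line, std::regex("\\b" + Pat + "\\b"));
}
```

A plain search for "Parallel" also hits a hypothetical "NonParallel" marker, while the word-anchored version does not, so FileCheck-friendly output needs markers that are not substrings of one another.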
On 7/1/20 11:13 AM, Michael Kruse wrote:> Am Mi., 1. Juli 2020 um 09:33 Uhr schrieb Hal Finkel <hfinkel at anl.gov > <mailto:hfinkel at anl.gov>>: > > I definitely agree that we should not be trying to do this kind of > checking using textual metadata-node matching in FileCheck. The > alternative already available is to add an analysis pass with some > kind of verifier output. This output, not the raw metadata itself, > can be checked by FileCheck. We also need to check the > verification code, but at least that's something we can keep just > in one place. For parallel annotations, we already have such a > thing (we can run opt -loops -analyze; e.g., in > test/Analysis/LoopInfo/annotated-parallel-complex.ll). We also do > this kind of thing for the cost model (by running with -cost-model > -analyze). To what extent would making more-extensive use of this > technique address the use cases you're trying to address? > > The CHECK lines in annotated-parallel-complex.ll are: > > ; CHECK: Parallel Loop at depth 1 > ; CHECK-NOT: Parallel > ; CHECK: Loop at depth 2 > ; CHECK: Parallel Loop > ; CHECK: Parallel Loop > > When adding this test, I had to change LoopInfo to emit the "Parallel" > in front of "Loop". For readability, I would have preferred the > parallel info as a "tag", such as `Loop (parallel) as depth 1`, but > this would break other tests that check "Loop at depth 1". Later I > noticed that there are regression tests that check "LoopFullUnrollPass > on Loop at depth 3 containing: %l0.0.0<header>", but it seems I got > lucky in that none of these loops have parallel annotations. > > "CHECK-NOT" is inherently fragile. It is too easy to make a change in > LLVM that changes the text output and oversee that this test does not > check what it was supposed to test. For a FileCheck-friendlier output, > it could emit "NonParallel" and match this. However, this clutters the > output for humans, will actually break the "LoopFullUnrollPass on Loop > at depth 3 ..." 
and "CHECK: Parallel" matches this as well since > FileCheck ignores word boundaries. > > The CHECK lines test more than necessary. The first and third CHECK > lines also check the "at depth" to make it match the correct loop (and > not, e.g. match the next inner loop), although we are not interested > in the loop depths themselves. Ironically, this is the reason why there cannot be > tags between "Loop" and "at depth".

We can have different printing modes. There can be a more-human-friendly mode and a more-FileCheck-friendly mode. Or modes customized for different kinds of tests. I agree, however, that this does not solve the fragility problems with CHECK-NOT.> > Not all of the loop metadata have passes that print them. For > instance, there are loop properties such as llvm.loop.isvectorized. > Reading those is usually done using utility functions such as > getBooleanLoopAttribute(L, "llvm.loop.isvectorized"). A solution using > FileCheck would be to add another pass that prints loop metadata. That > pass would only be used for testing, making the release LLVM binaries > larger than necessary and still have the other problems. > > Processing the IR through a tool can make the output more > FileCheck-friendly, but it doesn't make its problems disappear. IMHO > it adds to the maintenance burden since it adds more textual interfaces.

That's the interesting question... it does add to the maintenance burden. However, having textual outputs is also quite convenient when debugging things (because I can change the input and see the output quickly, without needing to create and compile another program). Obviously, at some point, this becomes ridiculous. How much is too much? I don't know. But it's also not clear to me that we're at that point yet. We could add more textual analysis outputs and still have that be a net benefit in many places. In cases where the requisite output would just be too specific, we do have unit tests. Should we just add more? 
Maybe we're too happy to add lit tests instead of unit tests for some of these cases.

-- Hal Finkel Lead, Compiler Technology and Programming Languages Leadership Computing Facility Argonne National Laboratory
David Greene via llvm-dev
2020-Jul-01 18:43 UTC
[llvm-dev] [RFC] Compiled regression tests.
Hal Finkel via llvm-dev <llvm-dev at lists.llvm.org> writes:> I definitely agree that we should not be trying to do this kind of > checking using textual metadata-node matching in FileCheck. The > alternative already available is to add an analysis pass with some kind > of verifier output. This output, not the raw metadata itself, can be > checked by FileCheck. We also need to check the verification code, but > at least that's something we can keep just in one place. For parallel > annotations, we already have such a thing (we can run opt -loops > -analyze; e.g., in > test/Analysis/LoopInfo/annotated-parallel-complex.ll). We also do this > kind of thing for the cost model (by running with -cost-model -analyze). > To what extent would making more-extensive use of this technique address > the use cases you're trying to address?

Analysis tests are definitely useful but I don't think they're sufficient. By their nature, analysis tests run in a "sanitized" environment. The test specifies exactly what is analyzed and no other passes are run.

From my quick glance at Michael's example test, it's checking that the inlining pass does something specific with metadata:

; Check that the access groups (llvm.access.group) are correctly merged.

I suppose one could create an "analysis" that just dumps out metadata information. Is that what you have in mind? That seems like a lot of overhead just to write a test to check metadata. -David
On 7/1/20 1:43 PM, David Greene wrote:> Hal Finkel via llvm-dev <llvm-dev at lists.llvm.org> writes: > >> I definitely agree that we should not be trying to do this kind of >> checking using textual metadata-node matching in FileCheck. The >> alternative already available is to add an analysis pass with some kind >> of verifier output. This output, not the raw metadata itself, can be >> checked by FileCheck. We also need to check the verification code, but >> at least that's something we can keep just in one place. For parallel >> annotations, we already have such a thing (we can run opt -loops >> -analyze; e.g., in >> test/Analysis/LoopInfo/annotated-parallel-complex.ll). We also do this >> kind of thing for the cost model (by running with -cost-model -analyze). >> To what extent would making more-extensive use of this technique address >> the use cases you're trying to address? > Analysis tests are definitely useful but I don't think they're > sufficient. By their nature, analysis tests run in a "sanitized" > environment. The test specifies exactly what is analyzed and no other > passes are run.

Yes, but we can also combine them with a transformation pipeline.

> > From my quick glance at Michael's example test, it's checking that the > inlining pass does something specific with metadata: > > ; Check that the access groups (llvm.access.group) are correctly merged. > > I suppose one could create an "analysis" that just dumps out metadata > information. Is that what you have in mind? That seems like a lot of > overhead just to write a test to check metadata.

Yes. From my perspective, it depends on how much reuse you get out of that work. Can the "dump out metadata information" part be reused for many tests? And is there a reason not to use a traditional unit test? -Hal

-- Hal Finkel Lead, Compiler Technology and Programming Languages Leadership Computing Facility Argonne National Laboratory
Chris Lattner via llvm-dev
2020-Jul-01 20:01 UTC
[llvm-dev] [RFC] Compiled regression tests.
> On Jul 1, 2020, at 7:33 AM, Hal Finkel via llvm-dev <llvm-dev at lists.llvm.org> wrote: > >> >> Ideally the regression test would be robust and understandable, achievable with two asserts in a unittest: >> >> Loop &OuterLoop = *LI->begin(); >> ASSERT_TRUE(OuterLoop.isAnnotatedParallel()); >> Loop &InnerLoop = *OuterLoop.begin(); >> ASSERT_TRUE(InnerLoop.isAnnotatedParallel()); > > I definitely agree that we should not be trying to do this kind of checking using textual metadata-node matching in FileCheck. The alternative already available is to add an analysis pass with some kind of verifier output. This output, not the raw metadata itself, can be checked by FileCheck. We also need to check the verification code, but at least that's something we can keep just in one place. For parallel annotations, we already have such a thing (we can run opt -loops -analyze; e.g., in test/Analysis/LoopInfo/annotated-parallel-complex.ll). We also do this kind of thing for the cost model (by running with -cost-model -analyze). To what extent would making more-extensive use of this technique address the use cases you're trying to address? > > Maybe we should, also, just make more-extensive use of unit tests? >

Agreed, it is a very reasonable thing to write an LLVM pass (analysis or otherwise) that does things with the IR and prints things in a structured output format that is easy to FileCheck. This achieves the goals I was striving for, including the design and curation of high quality test logic. On the MLIR side of things, there are good examples in testing dependence analysis and other things like that. -Chris
Mehdi AMINI via llvm-dev
2020-Jul-01 20:22 UTC
[llvm-dev] [RFC] Compiled regression tests.
On Wed, Jul 1, 2020 at 7:34 AM Hal Finkel via llvm-dev < llvm-dev at lists.llvm.org> wrote:

> [...]
>
> I definitely agree that we should not be trying to do this kind of > checking using textual metadata-node matching in FileCheck. 
The alternative > already available is to add an analysis pass with some kind of verifier > output. This output, not the raw metadata itself, can be checked by > FileCheck. We also need to check the verification code, but at least that's > something we can keep just in one place. For parallel annotations, we > already have such a thing (we can run opt -loops -analyze; e.g., in > test/Analysis/LoopInfo/annotated-parallel-complex.ll). We also do this kind > of thing for the cost model (by running with -cost-model -analyze). To what > extent would making more-extensive use of this technique address the use > cases you're trying to address? >

In MLIR we create passes specifically for the purpose of testing this kind of thing: the pass would traverse a structure and dump it in a textual form. This is useful as a tool for developers, but mostly makes it really nice to write FileCheck tests. For example our dominance test is: https://github.com/llvm/llvm-project/blob/master/mlir/test/Analysis/test-dominance.mlir The test pass is here: https://github.com/llvm/llvm-project/blob/master/mlir/test/lib/Transforms/TestDominance.cpp

-- Mehdi