On 7/2/20 12:44 AM, Michael Kruse wrote:
> On Wed., Jul 1, 2020 at 14:36, Hal Finkel <hfinkel at anl.gov> wrote:
>> When I teach my compilers class, I tell my students to liberally add the
>> ability to serialize all of their internal data structures to
>> interpretable text. It will seem like extra work at first, but when
>> they're trying to debug things later, it will be really helpful. I think
>> this is a key lesson that I, at least, have learned from LLVM. It makes
>> us all more productive in the end (in part because we often spend much
>> more time debugging our code than writing it in the first place). Firing
>> up an actual debugger is slow and (despite our best efforts) fragile;
>> changing a textual input and running it through something that produces
>> textual output is fast.
> One of the first things I write for my data structures is indeed a dump
> function. However, the output is not stable, since I regularly
> change/remove/add the information that is dumped depending on whether it
> is relevant, adds too much noise, or I have found a better textual
> representation of the same thing.

I think that, to a large extent, we're on the same page on this aspect.
It's a question of reuse and stability. If there's a principled way to
design an output that will be reused across many tests and can reasonably
be believed to remain relatively stable, then we should do that. If not,
then unit tests are better. The question is: do we have so many such unit
tests that we want a special way to construct them from IR files (instead
of, I suppose, just having the IR in a string in the code)? I don't know.

 -Hal

> Michael

-- 
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory
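For concreteness, here is a minimal, untested sketch of the "IR in a string in
the code" style of unit test mentioned above. It assumes LLVM's
parseAssemblyString and googletest; the test name, the embedded IR, and the
checked property are invented for illustration.

    #include "llvm/AsmParser/Parser.h"
    #include "llvm/IR/LLVMContext.h"
    #include "llvm/IR/Module.h"
    #include "llvm/Support/SourceMgr.h"
    #include "gtest/gtest.h"

    using namespace llvm;

    TEST(ExampleTest, ParsesIRFromString) {
      LLVMContext Ctx;
      SMDiagnostic Err;
      // Parse a module from IR embedded directly in the test source.
      std::unique_ptr<Module> M = parseAssemblyString(
          "define i32 @f(i32 %x) {\n"
          "entry:\n"
          "  %y = add i32 %x, 1\n"
          "  ret i32 %y\n"
          "}\n",
          Err, Ctx);
      ASSERT_TRUE(M) << Err.getMessage().str();

      // Query a property through the C++ API instead of matching dumped text.
      Function *F = M->getFunction("f");
      ASSERT_NE(F, nullptr);
      EXPECT_EQ(F->arg_size(), 1u);
    }

The assertions query the in-memory IR directly, so the test does not depend on
the stability of any printed format.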
Michael Kruse via llvm-dev
2020-Jul-02 18:07 UTC
[llvm-dev] [RFC] Compiled regression tests.
On Thu., Jul 2, 2020 at 12:41, Hal Finkel <hfinkel at anl.gov> wrote:
> I think that, to a large extent, we're on the same page on this aspect.
> It's a question of reuse and stability. If there's a principled way to
> design an output that will be reused across many tests and can reasonably
> be believed to remain relatively stable, then we should do that. If not,
> then unit tests are better. The question is: do we have so many such unit
> tests that we want a special way to construct them from IR files (instead
> of, I suppose, just having the IR in a string in the code)? I don't know.

Implementing another print mechanism, and justifying its presence,
significantly adds to the burden of writing robust and understandable
tests that often could be a simple call to an already existing API
function in a unittest. Inserting a layer of text representation and
matching it with FileCheck feels like "unittesting with extra steps",
adding the previously mentioned shortcomings of FileCheck-based testing
to the system.

Another example from Polly:

    Arrays {
        i32 MemRef0[*]; // Element size 4
    }
    Arrays (Bounds as pw_affs) {
        i32 MemRef0[*]; // Element size 4
    }

This emits the same information twice. The latter is the more high-level
representation that should be more useful to users, but the former is
still printed as well because existing regression tests expect that
format. For humans trying to understand the output, this just makes it
more verbose.

Michael
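For comparison, a sketch of the text-matching layer described above: a
FileCheck-based regression test that matches the dumped Arrays output line by
line. This is not an actual Polly test; the RUN-line substitution and flags
are assumptions, and the IR under test is omitted.

    ; RUN: opt %loadPolly -polly-scops -analyze < %s | FileCheck %s
    ;
    ; CHECK:      Arrays {
    ; CHECK-NEXT:   i32 MemRef0[*]; // Element size 4
    ; CHECK-NEXT: }
    ; CHECK:      Arrays (Bounds as pw_affs) {
    ; CHECK-NEXT:   i32 MemRef0[*]; // Element size 4
    ; CHECK-NEXT: }
    ;
    ; (LLVM IR for the tested function would follow here.)

Any change to the printed format, even one that preserves the underlying
information, forces tests written in this style to be updated.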
On 7/2/20 1:07 PM, Michael Kruse wrote:
> On Thu., Jul 2, 2020 at 12:41, Hal Finkel <hfinkel at anl.gov> wrote:
>> I think that, to a large extent, we're on the same page on this aspect.
>> It's a question of reuse and stability. If there's a principled way to
>> design an output that will be reused across many tests and can
>> reasonably be believed to remain relatively stable, then we should do
>> that. If not, then unit tests are better. The question is: do we have so
>> many such unit tests that we want a special way to construct them from
>> IR files (instead of, I suppose, just having the IR in a string in the
>> code)? I don't know.
> Implementing another print mechanism, and justifying its presence,
> significantly adds to the burden of writing robust and understandable
> tests that often could be a simple call to an already existing API
> function in a unittest. Inserting a layer of text representation and
> matching it with FileCheck feels like "unittesting with extra steps",
> adding the previously mentioned shortcomings of FileCheck-based testing
> to the system.
>
> Another example from Polly:
>
>     Arrays {
>         i32 MemRef0[*]; // Element size 4
>     }
>     Arrays (Bounds as pw_affs) {
>         i32 MemRef0[*]; // Element size 4
>     }
>
> This emits the same information twice. The latter is the more high-level
> representation that should be more useful to users, but the former is
> still printed as well because existing regression tests expect that
> format. For humans trying to understand the output, this just makes it
> more verbose.

Someone should have updated the tests instead? The interesting question
is: what about the format made it difficult to update the tests?

Aside from that, as you point out, the textual output is also useful for
humans. My hypothesis is that this is the general case and that, all
things considered, this utility pays for the price of the extra steps in
the "unit testing with extra steps."

 -Hal

> Michael

-- 
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory