Hi all,

quite a few tests use the pattern "2>&1 | FileCheck %s". AFAIK, how stdout and stderr are merged into a single character stream is undefined and depends, e.g., on whether stdout is buffered. I think we are often saved by the fact that standard output is written only at the end of the program and stderr is unbuffered, i.e. always written before stdout.

A lot of tests disable stdout using either "-o /dev/null" or "-disable-output", but not all. For instance, test/Transforms/SLPVectorizer/X86/reduction_unrolled.ll does not. It checks output from both stdout and stderr using the same FileCheck. The stderr it is checking even comes from -debug, which has an additional buffering layer (circular_raw_ostream).

The testing guide [1] does not mention how to test stderr.

My questions:

1. Are these tests, e.g. reduction_unrolled.ll, fragile? Maybe I am missing something that says that interleaving stdout and stderr (and llvm::dbgs()) is well-defined in llvm-lit.

2. Can -debug (or -debug-only) be used in regression tests at all? I understood them as debugging aids only. I would not like adding/changing DEBUG(dbgs() << ...); lines to cause regression tests to fail.

3. What are the canonical ways to test...
3a) opt -stats output (e.g. "2>&1 | FileCheck\n; REQUIRES: asserts")
3b) A statistic from -stats being zero
3c) stderr only (and be sure that no lines from stdout will be interleaved with it)
3d) stdout and stderr at the same time, but independently.
3e) the output of DEBUG(dbgs() << ...) lines, if allowed to do so.
3f) If not, how to replace it? E.g. how to test whether a source code line has been executed.

Thanks in advance,
Michael

[1] http://llvm.org/docs/TestingGuide.html
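The buffering concern in this message can be reproduced outside of LLVM. The following is only a minimal sketch in Python, not anything taken from lit or FileCheck: the child program is a hypothetical stand-in for a tool under test, and run_merged is an illustrative helper name. It merges stderr into stdout the way "2>&1 |" does and shows that the order of lines in the merged stream depends on whether the child's stdout is buffered.

```python
import os
import subprocess
import sys

# Hypothetical stand-in for a tool under test: it writes to stdout FIRST and
# to stderr SECOND. stderr is flushed explicitly so it behaves as unbuffered.
CHILD = (
    "import sys\n"
    "sys.stdout.write('from stdout\\n')\n"
    "sys.stderr.write('from stderr\\n')\n"
    "sys.stderr.flush()\n"
)

def run_merged(extra_args=()):
    """Run the child with stderr redirected into stdout, like '2>&1 |'."""
    # Drop PYTHONUNBUFFERED so the caller's environment cannot change buffering.
    env = {k: v for k, v in os.environ.items() if k != "PYTHONUNBUFFERED"}
    proc = subprocess.run(
        [sys.executable, *extra_args, "-c", CHILD],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # both streams share one pipe
        env=env,
        universal_newlines=True,
        check=True,
    )
    return proc.stdout.splitlines()

# stdout is a pipe here, so the child block-buffers it until exit.
buffered = run_merged()
# -u asks the child interpreter for unbuffered stdout/stderr.
unbuffered = run_merged(["-u"])

print("buffered stdout:  ", buffered)
print("unbuffered stdout:", unbuffered)
```

With a buffered stdout the stderr line should come out first even though it was written second; with -u the lines should appear in program order. The same program therefore yields two different merged streams, which is why a single FileCheck over "2>&1" implicitly depends on the buffering behavior of the tool.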
On Thu, Feb 23, 2017 at 6:54 AM Michael Kruse via llvm-dev <llvm-dev at lists.llvm.org> wrote:

> 3. What are the canonical ways to test...
> 3a) opt -stats output (e.g. "2>&1 | FileCheck\n; REQUIRES: asserts")
> 3b) A statistic from -stats being zero
> 3c) stderr only (and be sure that no lines from stdout will be interleaved with it)
> 3d) stdout and stderr at the same time, but independently.
> 3e) the output of DEBUG(dbgs() << ...) lines, if allowed to do so.
> 3f) If not, how to replace it? E.g. how to test whether a source code line has been executed.

Rough guesses, based on no broad review of test cases: all of this seems OK except for the interleaved case(s) you mentioned.

3f: generally it's probably best not to test whether a source code line has executed. That would make the test fragile; the observable behavior should be tested instead. Though I would imagine it comes up sometimes as the best thing to do in a bad situation.
> -----Original Message-----
> From: llvm-dev [mailto:llvm-dev-bounces at lists.llvm.org] On Behalf Of Michael Kruse via llvm-dev
> Sent: Thursday, February 23, 2017 6:53 AM
> To: llvm-dev
> Subject: [llvm-dev] llvm-lit: 2>&1 and FileCheck
>
> 1. Are these tests, e.g. reduction_unrolled.ll, fragile? Maybe I am missing something that says that interleaving stdout and stderr (and llvm::dbgs()) is well-defined in llvm-lit.

I'd consider them fragile, but obviously their behavior has been consistent across a variety of bots for some time. So the fragility is a bit pedantic/theoretical. "The behavior is undefined but I know what I'm doing!"

There are times running a test when I've seen interleaved stdout/stderr text, but not the text that a CHECK was looking for; so I think people are getting lucky in at least some cases.

> 2. Can -debug (or -debug-only) be used in regression tests at all? I understood them as debugging aids only. I would not like adding/changing DEBUG(dbgs() << ...); lines to cause regression tests to fail.

The line between "debugging aid" and "event logging" is not clear, but I have written tests relying on logging-style output; I think that's ok. As always, you want your CHECKs to be specific enough to avoid false matches but not so specific that they become too fragile.
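One way to avoid relying on luck is to never merge the two streams at all. The sketch below is in Python and rests on assumptions: the child program is a hypothetical stand-in for a tool such as opt, and check_in_order is a deliberately simplified, made-up helper that only mimics the idea of ordered CHECK lines; it is not FileCheck. In a lit test the analogous approach would presumably be to redirect stderr to a temporary file and run a second FileCheck over that file.

```python
import subprocess
import sys

# Hypothetical stand-in for a tool under test: the "result" goes to stdout,
# a diagnostic goes to stderr.
CHILD = (
    "import sys\n"
    "print('result: 42')\n"
    "print('remark: vectorized', file=sys.stderr)\n"
)

def run_separately(code):
    """Run the child and capture stdout and stderr through separate pipes."""
    proc = subprocess.run(
        [sys.executable, "-c", code],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,  # never merged, so no interleaving is possible
        universal_newlines=True,
        check=True,
    )
    return proc.stdout, proc.stderr

def check_in_order(text, patterns):
    """Return True if every pattern occurs in text, each after the previous one."""
    pos = 0
    for pattern in patterns:
        index = text.find(pattern, pos)
        if index < 0:
            return False
        pos = index + len(pattern)
    return True

out, err = run_separately(CHILD)

# Each stream is checked on its own, so buffering cannot affect the result.
print("stdout ok:", check_in_order(out, ["result:", "42"]))
print("stderr ok:", check_in_order(err, ["remark:", "vectorized"]))
```

Because each stream has its own pipe, the checks stay valid regardless of how the tool buffers its output. The cost is that nothing can be asserted about the relative order of stdout and stderr lines, which is exactly the property that was undefined to begin with.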
> On Feb 23, 2017, at 10:48 AM, Robinson, Paul via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>
>> 2. Can -debug (or -debug-only) be used in regression tests at all? I understood them as debugging aids only. I would not like adding/changing DEBUG(dbgs() << ...); lines to cause regression tests to fail.
>
> The line between "debugging aid" and "event logging" is not clear, but I have written tests relying on logging-style output; I think that's ok. As always, you want your CHECKs to be specific enough to avoid false matches but not so specific that they become too fragile.

- In general you should try hard not to use dbgs() output for unit testing; it is certainly an anti-pattern.
- We do indeed not have any other logging mechanism (and I am not convinced that we need one).
- There are areas where we should improve: for example, someone should implement the equivalent of 'opt -analyze' for llc so we can use Pass::print() in codegen tests.
- If all else fails, I still consider using dbgs() for testing okay. It's easy enough to run the tests and see if you broke something when you changed a DEBUG() line.

- Matthias
On Thu, Feb 23, 2017 at 10:48 AM, Robinson, Paul via llvm-dev <llvm-dev at lists.llvm.org> wrote:

> > 2. Can -debug (or -debug-only) be used in regression tests at all? I understood them as debugging aids only. I would not like adding/changing DEBUG(dbgs() << ...); lines to cause regression tests to fail.
>
> The line between "debugging aid" and "event logging" is not clear, but I have written tests relying on logging-style output; I think that's ok. As always, you want your CHECKs to be specific enough to avoid false matches but not so specific that they become too fragile.

Right. Where we can, we prefer to have separate printing passes, but that only really works well for analysis and preparation transformations.