Robinson, Paul via llvm-dev
2019-Oct-15 14:55 UTC
[llvm-dev] [cfe-dev] RFC: End-to-end testing
> -----Original Message-----
> From: cfe-dev <cfe-dev-bounces at lists.llvm.org> On Behalf Of Renato Golin via cfe-dev
> Sent: Friday, October 11, 2019 11:24 AM
> To: David Greene <dag at cray.com>
> Cc: llvm-dev at lists.llvm.org; cfe-dev at lists.llvm.org; Gerolf Hoflehner <ghoflehner at apple.com>; openmp-dev at lists.llvm.org; lldb-dev at lists.llvm.org
> Subject: Re: [cfe-dev] [llvm-dev] RFC: End-to-end testing
>
> Hi David,
>
> You answer some of your own questions down below, so I'll try to
> collate the responses and shorten my reply.
>
> On Fri, 11 Oct 2019 at 15:20, David Greene <dag at cray.com> wrote:
> > How are regressions reported? On llvm-dev?
>
> They're buildbots, exactly like any other. Direct email, llvm-commits,
> irc, bugzilla. There is no distinction; broken bots need to be fixed.
>
> llvm-dev is not the place to report bugs.
>
> > I'm confused. On the one hand you say you don't want to put e2e tests
> > in a dark corner, but here you speculate they could be easily removed.
> > Presumably a test was added because there was some failure that other
> > tests did not catch. It's true that once a test is fixed it's very
> > likely it will never break again. Is that a reason to remove tests?
>
> Sorry, my point is about the dynamics between number of tests, their
> coverage, time to run, frequency of *unrelated* breakage, etc.
>
> There are no set rules, but there is a back-pressure as developers and
> bot owners tend to breakages.
>
> > What do you mean by "annoy?" Taking too long to run?
>
> Tests that break more often are looked at more often, and if their
> breakages overlap with other, simpler tests, then developers will begin
> to question their importance. Tests that take too long to run will be
> looked into, and if they don't add much, their removal can be
> requested. That pressure is higher on the LIT side than in the
> test-suite.
>
> I'm trying to find a place where we can put the tests that will run at
> the appropriate frequency and have the lowest probability of upsetting
> CI and developers, so we can evolve them into what they *need* to be,
> not cap it from the start and end up with something sub-par.
>
> > Would it be possible to keep them in the monorepo but have bots that
> > exercise those tests at the test-suite frequency? I suspect that if e2e
> > tests live in test-suite very few people will ever run them before
> > merging to master.
>
> Bots are pretty dumb: either they run something or they don't.
>
> But more importantly, if we split the e2e tests in LIT, then people
> won't run them before merging to master anyway.

Depends on whether they are part of check-all.

> Truth is, we don't *need* to. That's the whole point of having a fast
> CI, and the community really respects that.
>
> As long as we have the tests running every few dozen commits, and bot
> owners and developers work to fix them in time, we're good.
>
> Furthermore, the test-suite already has e2e tests in there, so it is
> the right place to add more. We can have more control over which tools
> and libraries to use, how to check for quality, etc.

My understanding is that test-suite had large-ish executable tests.
David is talking about small compile-only e2e tests. These would hardly
take any more time than any other existing lit test.

> > I still think the kinds of e2e tests I'm thinking of are much closer to
> > the existing LIT tests in the monorepo than things in test-suite. I
> > expect them to be quite small.
>
> Adding tests to LIT means all fast bots will be slower. Adding them to
> the test-suite means all test-suite bots will still take about the
> same time.
>
> If the tests only need to be run once every few dozen commits, then
> having them in LIT is clearly not the right place.

The lit-versus-test-suite distinction is not the right one.
Bots don't run "lit tests" as one big lump; they run the tests for a
configured set of projects. If the e2e tests are in with all the other
clang tests, then they get run by the clang bots. If they are in a
different project (test-suite or their own), then they get run by the
bots that run that project. This is decided by the bot owner.

> > They wouldn't necessarily need to run as
> > part of check-all (and indeed, I've been told that no one runs check-all
> > anyway because it's too fragile).
>
> check-all doesn't need to check everything that is in the repo, but
> everything that is built.
>
> So if you build llvm+clang, then you should *definitely* check both.
> "make check" doesn't do that.
>
> With the monorepo this may change slightly, but we still need a way to
> test everything that our patches touch, including clang, rt, and
> others.
>
> I always ran check-all before every patch, FWIW.

Yep. Although I run check-all before *starting* on a patch, to make sure
the starting point is clean. It usually is, but I've been caught enough
times to be slightly wary.
--paulr

> _______________________________________________
> cfe-dev mailing list
> cfe-dev at lists.llvm.org
> https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
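[For concreteness, a "small compile-only e2e test" of the kind being discussed could be a single source file run through the whole clang driver and checked with FileCheck. The sketch below is illustrative only: the RUN line, file placement, and CHECK patterns are assumptions, not an existing in-tree test, and the exact IR matched is target- and version-dependent.]

```c
// Hypothetical compile-only e2e test: exercises the full pipeline
// (clang driver -> optimizer -> IR emission) from one small source
// file, in the style of existing lit/FileCheck tests. No execution
// is needed, so it costs about as much as any other lit test.
//
// RUN: %clang -O2 -emit-llvm -S -o - %s | FileCheck %s

int sum(int *a, int n) {
  int s = 0;
  for (int i = 0; i < n; ++i)
    s += a[i];
  return s;
}

// CHECK-LABEL: @sum
// CHECK: add
```

Placed with the other clang tests, a file like this would be picked up by whichever bots are configured to run the clang test project, per the point above.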
Renato Golin via llvm-dev
2019-Oct-15 15:52 UTC
[llvm-dev] [cfe-dev] RFC: End-to-end testing
Hi Paul,

I'm not strongly opposed to anything, and I don't want to be the noisy
one in the corner. :)

The only point I have is that it's easier to control the environment in
the TS, and e2e tests are supposed to catch higher-level problems that
cannot be handled in Clang.

We already have tests in clang that check for diagnostics, IR and other
things. Expanding those can handle 99.9% of what Clang could possibly
do without descending into assembly.

Assembly errors are more complicated than just "not generating VADD",
and that's easier done in the TS than LIT.

On Tue, 15 Oct 2019 at 15:55, Robinson, Paul <paul.robinson at sony.com> wrote:
> Yep. Although I run check-all before *starting* on a patch, to make sure
> the starting point is clean. It usually is, but I've been caught enough
> times to be slightly wary.

Ah! Yes! Been there, too. :)
David Greene via llvm-dev
2019-Oct-16 19:57 UTC
[llvm-dev] [Openmp-dev] [cfe-dev] RFC: End-to-end testing
"Robinson, Paul via Openmp-dev" <openmp-dev at lists.llvm.org> writes:>> I always ran check-all before every patch, FWIW. > > Yep. Although I run check-all before *starting* on a patch, to make sure > the starting point is clean. It usually is, but I've been caught enough > times to be slightly wary.This is interesting. I literally have never seen a clean check-all. I suspect that is because we have more components built than (most?) others along with multiple targets. -David
David Greene via llvm-dev
2019-Oct-16 20:00 UTC
[llvm-dev] [Openmp-dev] [cfe-dev] RFC: End-to-end testing
Renato Golin via Openmp-dev <openmp-dev at lists.llvm.org> writes:

> We already have tests in clang that check for diagnostics, IR and
> other things. Expanding those can handle 99.9% of what Clang could
> possibly do without descending into assembly.

I agree that for a great many things this is sufficient.

> Assembly errors are more complicated than just "not generating VADD",
> and that's easier done in the TS than LIT.

Can you elaborate? I'm talking about very small tests targeted to
generate a specific instruction or small number of instructions.
Vectorization isn't the best example. Something like verifying FMA
generation is a better example.

-David
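[The FMA case David mentions might look like the sketch below: a tiny source file compiled through the full driver down to assembly, with FileCheck verifying the fused instruction. The RUN line, target triple, contract flag, and `fmadd` mnemonic are assumptions for illustration (an x86 target would match a `vfmadd` variant instead); this is not an existing test.]

```c
// Hypothetical targeted e2e test: checks that a * b + c contracts to
// a single fused multiply-add in the final assembly. Compile-only,
// so it is as cheap as an ordinary lit codegen test.
//
// RUN: %clang -O2 -ffp-contract=fast --target=aarch64-linux-gnu -S -o - %s \
// RUN:   | FileCheck %s
//
// CHECK-LABEL: mul_add:
// CHECK: fmadd

double mul_add(double a, double b, double c) {
  return a * b + c;  // expected to contract to one fused multiply-add
}
```

Unlike a test-suite benchmark, a test like this never runs the generated code; it only inspects the assembly, which is why it fits the lit model David is describing.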