thr3ads.net - llvm dev - [llvm-dev] Clarification on expectations of buildbot email notifications [Feb 2019]

Sjoerd Meijer via llvm-dev

2019-Feb-20 09:32 UTC

[llvm-dev] Clarification on expectations of buildbot email notifications

I think we could/should be a little bit more precise here:
> ... any regressions whether they affect buildbots or not, the
> patch author should be responsible for fixing the issue.
especially if we say that the bar for a revert is low. That is, the "any
regression" needs a bit more clarification. Assuming we are talking about
performance regressions (not language conformance or correctness):

1) We sometimes see regressions where code generation is (almost) the same, but
the code layout is different. Some micro-architectures are more sensitive to
this than others, causing significant regressions. We always thought it was
unfair to ask for a revert for these kinds of regressions, and thus never ask
for that.

2) We also sometimes see that patches that cause regressions actually do the
right thing, but have all sorts of knock-on effects, e.g. causing different
codegen and regressions. Sometimes this is just unlucky (e.g. regalloc making
different decisions), but sometimes other passes can't handle the new IR well
or the machine code is less efficient, and something actually needs to be
fixed. But we also very rarely ask for a revert in these cases.

3) The obvious and straightforward case is when a patch is not doing the right
thing or e.g. forgets certain cases. Usually what we do is leave a comment on
the Phab ticket, and when the author responds fast and works on a fix we can
live with the regression for a few days (but it looks like we could be a bit
more aggressive with reverts if we wanted to).

The straightforward cases are 1) and 3), where the former is not worth a revert
(but it would be good to be explicit about this), and 3) is definitely worth a
revert.

2) is the tricky one, because it has a lot of grey areas. I guess the reason why
we are not very aggressive with reverts is that we don't want to stop others
from making progress, and also thought that in some cases it was just our
problem and not the author's. In the example of knock-on effects and some
heuristic making a different/wrong decision, I thought it was unfair to the
author to ask for a revert. A more aggressive revert policy here could easily
lead to people making no progress, or making it much more slowly.

Cheers,
Sjoerd.

________________________________
From: llvm-dev <llvm-dev-bounces at lists.llvm.org> on behalf of Reid
Kleckner via llvm-dev <llvm-dev at lists.llvm.org>
Sent: 19 February 2019 23:29
To: Zachary Turner
Cc: llvm-dev
Subject: Re: [llvm-dev] Clarification on expectations of buildbot email
notifications

I don't think whether a buildbot sends email should have anything to do with
whether we revert to green or not. Very often, developers commit patches that
cause regressions not caught by our buildbots. If the regression is severe
enough, then I think community members have the right, and perhaps
responsibility, to revert the change that caused it. Our team maintains bots
that build chrome with trunk versions of clang, and we identify many regressions
this way and end up doing many reverts as a result. I think it's important
to continue this practice so that we don't let multiple regressions pile up.

I think what's important, and what we should put in the developer policy once
this discussion concludes, is that the person doing the revert has the
responsibility to do their best to help the patch author reproduce the
problem or at least understand the bug.

This can take many forms. They can link directly to an LLVM buildbot, which
should be self-explanatory as far as reproduction goes. It can be an unreduced
crash report. If they're nice, they can use CReduce to make it smaller. But,
a reverter can't just say "Revert rNNN, breaks $RANDOM_PROJECT on
x86_64-linux-gnu". If they add "reduction forthcoming" and they
deliver on that promise, I think we should support that.
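
As an illustration only, the expectation described above could be sketched as a small check on what accompanies a revert. The names `RevertNote`, `has_reproduction_help`, and `format_revert_message` are hypothetical and not part of any LLVM tooling; the three accepted forms of evidence are simply the ones listed in the paragraph above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RevertNote:
    """Hypothetical record of what a reverter supplies with a revert."""
    reverted_commit: str                 # e.g. "rNNN"
    summary: str                         # what broke, and where
    buildbot_url: Optional[str] = None   # link to a failing bot, if any
    crash_report: Optional[str] = None   # reduced or unreduced reproducer
    reduction_promised: bool = False     # "reduction forthcoming"


def has_reproduction_help(note: RevertNote) -> bool:
    """Return True if the note gives the patch author something to act on."""
    return bool(note.buildbot_url or note.crash_report or note.reduction_promised)


def format_revert_message(note: RevertNote) -> str:
    """Build a commit message, rejecting notes with no reproduction help."""
    if not has_reproduction_help(note):
        raise ValueError(
            "revert needs a bot link, a crash report, or a promised reduction"
        )
    lines = [f"Revert {note.reverted_commit}: {note.summary}"]
    if note.buildbot_url:
        lines.append(f"Failing bot: {note.buildbot_url}")
    if note.crash_report:
        lines.append(f"Reproducer: {note.crash_report}")
    if note.reduction_promised:
        lines.append("Reduction forthcoming.")
    return "\n".join(lines)
```

A bare "Revert rNNN, breaks $RANDOM_PROJECT" note is rejected by this sketch, while the same note with `reduction_promised=True` is accepted.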

In other words, the bar to revert should be low, so we can do it fast and save
downstream consumers time and effort. If someone isn't making a good faith
effort to follow up after a revert, then authors have a right to push back.

I agree with Paul that we should remove the text about checking nightly
builders. That suggestion seems a bit dated.

On Tue, Feb 19, 2019 at 11:22 AM Zachary Turner via llvm-dev
<llvm-dev at lists.llvm.org> wrote:
Hi all,

Over the past year or so, all of us have broken the buildbots on many occasions.
Usually we get notified on IRC, or via a buildbot email notification sent to
everyone on the blamelist.
If I happen to be on IRC I'll see the notification, but if not, the next
best thing is an email that was automatically sent to me (along with everyone
else on the blamelist) from the buildbot with information about the failure.
And then finally, I'll occasionally get a response to my commit message
telling me that it's broken, and the patch may be reverted with information
in the commit message explaining which bot was broken and providing a link to
it.

However, we have some buildbots on the public waterfall which are specifically
configured not to send emails to people.  In some cases it's because the
bots are experimental, but there are a handful where the reasoning I've been
given is that it "wastes people's time and contributes to build
blindness", but we are still expected to keep them green (usually by people
manually reaching out to us when they fail, or patches getting reverted and us
getting notified of the revert).

It is this last case that I'm concerned about, as it appears to be in direct
conflict with our own developer policy
[https://llvm.org/docs/DeveloperPolicy.html#id14], which states this
-----
We prefer for this to be handled before submission but understand that it isn’t
possible to test all of this for every submission. Our build bots and nightly
testing infrastructure normally finds these problems. A good rule of thumb is to
check the nightly testers for regressions the day after your change. Build bots
will directly email you if a group of commits that included yours caused a
failure. You are expected to check the build bot messages to see if they are
your fault and, if so, fix the breakage.

Commits that violate these quality standards (e.g. are very broken) may be
reverted. This is necessary when the change blocks other developers from making
progress. The developer is welcome to re-commit the change after the problem has
been fixed.

-----

I'm sending this email to get a sense of the community's views on this
matter.  If I'm correctly reading between the lines in the above passage,
buildbots which do not send emails should not be subject to the revert-to-green
policy.  To be honest, it's actually not even clear from reading the above
passage where the burden of fixing a "broken" patch on a silent
buildbot lies at all - with the patch author or with the bot maintainer.

Would anyone care to weigh in with an unbiased opinion here?

_______________________________________________
LLVM Developers mailing list
llvm-dev at lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev

Michael Kruse via llvm-dev

2019-Feb-20 18:28 UTC

Hi,

maybe another interesting case is when there is disagreement over
whether there is a regression.

For instance, one of my patches made clang emit an additional warning
when compiling a popular project. This was not intended by my
patch, but was due to an inconsistent implementation of the warning in
clang. However, the warning was legitimate. I reverted the patch (it
had another problem), but before I recommitted it, I put up a patch
for review that fixed the inconsistent implementation such that the
warning is always emitted.

My question here: Should the patch be reverted even if it did not have
the other problem?

Michael




Reid Kleckner via llvm-dev

2019-Feb-20 22:16 UTC

On Wed, Feb 20, 2019 at 10:29 AM Michael Kruse <llvmdev at meinersbur.de>
wrote:
> maybe another interesting case is when there is disagreement over
> whether there is a regression.
>
> For instance, one of my patches made clang emit an additional warning
> when compiling a popular project. This was not intentional by my
> patch, but due to an inconsistent implementation of the warning in
> clang. However, the warning was legitimate. I reverted the patch (it
> had another problem), but before I recommitted it, I put up a patch
> for review that fixed the inconsistent implementation such that the
> warning is always emitted.
>
> My question here: Should the patch be reverted even if it did not have
> the other problem?
>
I would say not necessarily. If a new warning is added that fires on a
popular project and the warning is working as intended, that may be a
signal that the warning shouldn't be on by default (or shouldn't be part
of -Wall). We obviously need to allow ourselves the freedom to add new
warnings over time. Just because a project uses "-Werror -Wall" doesn't
mean that their code will compile cleanly with new compilers. However, if
the warning really is low value, then we may want to remove it. If someone
wants to revert a new warning because it is too noisy or has false
positives, they need to actively engage the patch author to support their
position.

Elaborating on your question of what constitutes a regression in general, I
would say that everything is case-by-case, but we have some clear cut ones,
like miscompiles or new compiler crashes. People often misdiagnose UB as a
miscompile, so to revert for a miscompile, you need some evidence that the
code doesn't exercise UB. Reverting for a performance regression would need
to come with a strong rationale to motivate why the slowdown outweighs the
supposed benefits of the patch in question.