Sjoerd Meijer via llvm-dev
2019-Feb-20 09:32 UTC
[llvm-dev] Clarification on expectations of buildbot email notifications
I think we could/should be a little bit more precise here:

> ... any regressions whether they affect buildbots or not, the
> patch author should be responsible for fixing the issue.

especially if we say that the bar for a revert is low. That is, the "any regression" needs a bit more clarification. Assuming we are talking about performance regressions (not language conformance or correctness):

1) We sometimes see regressions where code generation is (almost) the same, but the code layout is different. Some micro-architectures are more sensitive to this than others, causing significant regressions. We always thought it was unfair to ask for a revert for these kinds of regressions, and thus never ask for that.

2) We also sometimes see that patches that cause regressions actually do the right thing, but have all sorts of knock-on effects, e.g. causing different codegen and regressions. Sometimes this is just unlucky (e.g. regalloc making different decisions), but sometimes other passes can't handle the resulting IR or machine code as efficiently and something actually needs to be fixed. But we also very rarely ask for a revert in these cases.

3) The obvious and straightforward case is when a patch is not doing the right thing or e.g. forgets certain cases. Usually what we do is leave a comment on the Phab ticket, and when the author responds fast and works on a fix we can live with the regression for a few days (but it looks like we could be a bit more aggressive with reverts if we wanted to).

The straightforward cases are 1) and 3), where the former is not worth a revert (but it would be good to be explicit about this), and the latter is definitely worth a revert.

2) is the tricky one, because it has a lot of grey areas. I guess the reason why we are not very aggressive with reverts is that we don't want to stop others from making progress, and we also thought that in some cases it was just our problem and not the author's.
In the example of knock-on effects and some heuristic making a different/wrong decision, I thought it was unfair to the author to ask for a revert. A more aggressive revert policy here could easily lead to people making no progress, or progressing a lot more slowly.

Cheers,
Sjoerd.

________________________________
From: llvm-dev <llvm-dev-bounces at lists.llvm.org> on behalf of Reid Kleckner via llvm-dev <llvm-dev at lists.llvm.org>
Sent: 19 February 2019 23:29
To: Zachary Turner
Cc: llvm-dev
Subject: Re: [llvm-dev] Clarification on expectations of buildbot email notifications

I don't think whether a buildbot sends email should have anything to do with whether we revert to green or not. Very often, developers commit patches that cause regressions not caught by our buildbots. If the regression is severe enough, then I think community members have the right, and perhaps responsibility, to revert the change that caused it. Our team maintains bots that build Chrome with trunk versions of clang, and we identify many regressions this way and end up doing many reverts as a result. I think it's important to continue this practice so that we don't let multiple regressions pile up.

I think what's important, and what we should, after this discussion concludes, put in the developer policy, is that the person doing the revert has the responsibility to do their best to help the patch author reproduce the problem or at least understand the bug.

This can take many forms. They can link directly to an LLVM buildbot, which should be self-explanatory as far as reproduction goes. It can be an unreduced crash report. If they're nice, they can use CReduce to make it smaller. But, a reverter can't just say "Revert rNNN, breaks $RANDOM_PROJECT on x86_64-linux-gu". If they add, "reduction forthcoming" and they deliver on that promise, I think we should support that.

In other words, the bar to revert should be low, so we can do it fast and save downstream consumers time and effort.
If someone isn't making a good faith effort to follow up after a revert, then authors have a right to push back.

I agree with Paul that we should remove the text about checking nightly builders. That suggestion seems a bit dated.

On Tue, Feb 19, 2019 at 11:22 AM Zachary Turner via llvm-dev <llvm-dev at lists.llvm.org> wrote:

Hi all,

Over the past year or so, all of us have broken the buildbots on many occasions. Usually we get notified on IRC, or via a buildbot email notification sent to everyone on the blamelist. If I happen to be on IRC I'll see the notification, but if not, the next best thing is an email that was automatically sent to me (along with everyone else on the blamelist) from the buildbot with information about the failure. And then finally, I'll occasionally get a response to my commit message telling me that it's broken, and the patch may be reverted with information in the commit message explaining which bot was broken and providing a link to it.

However, we have some buildbots on the public waterfall which are specifically configured not to send emails to people. In some cases it's because the bots are experimental, but there are a handful where the reasoning I've been given is that it "wastes peoples time and contributes to build blindness", but we are still expected to keep them green (usually by people manually reaching out to us when they fail, or patches getting reverted and us getting notified of the revert).

It is this last case that I'm concerned about, as it appears to be in direct conflict with our own developer policy [https://llvm.org/docs/DeveloperPolicy.html#id14], which states this

-----
We prefer for this to be handled before submission but understand that it isn’t possible to test all of this for every submission. Our build bots and nightly testing infrastructure normally finds these problems. A good rule of thumb is to check the nightly testers for regressions the day after your change.
Build bots will directly email you if a group of commits that included yours caused a failure. You are expected to check the build bot messages to see if they are your fault and, if so, fix the breakage.

Commits that violate these quality standards (e.g. are very broken) may be reverted. This is necessary when the change blocks other developers from making progress. The developer is welcome to re-commit the change after the problem has been fixed.
-----

I'm sending this email to get a sense of the community's views on this matter. If I'm correctly reading between the lines in the above passage, buildbots which do not send emails should not be subject to the revert-to-green policy. To be honest, it's actually not even clear from reading the above passage where the burden of fixing a "broken" patch on a silent buildbot lies at all - with the patch author or with the bot maintainer.

Would anyone care to weigh in with an unbiased opinion here?
Michael Kruse via llvm-dev
2019-Feb-20 18:28 UTC
[llvm-dev] Clarification on expectations of buildbot email notifications
Hi,

maybe another interesting case is when there is disagreement over whether there is a regression.

For instance, one of my patches made clang emit an additional warning when compiling a popular project. This was not intentional by my patch, but due to an inconsistent implementation of the warning in clang. However, the warning was legitimate. I reverted the patch (it had another problem), but before I recommitted it, I put up a patch for review that fixed the inconsistent implementation such that the warning is always emitted.

My question here: Should the patch be reverted even if it did not have the other problem?

Michael
Reid Kleckner via llvm-dev
2019-Feb-20 22:16 UTC
[llvm-dev] Clarification on expectations of buildbot email notifications
On Wed, Feb 20, 2019 at 10:29 AM Michael Kruse <llvmdev at meinersbur.de> wrote:

> maybe another interesting case is when there is disagreement over
> whether there is a regression.
>
> For instance, one of my patches made clang emit an additional warning
> when compiling a popular project. This was not intentional by my
> patch, but due to an inconsistent implementation of the warning in
> clang. However, the warning was legitimate. I reverted the patch (it
> had another problem), but before I recommitted it, I put up a patch
> for review that fixed the inconsistent implementation such that the
> warning is always emitted.
>
> My question here: Should the patch be reverted even if it did not have
> the other problem?

I would say not necessarily. If a new warning is added that fires on a popular project and the warning is working as intended, that may be a signal that the warning shouldn't be on by default (or shouldn't be part of -Wall).

We obviously need to allow ourselves the freedom to add new warnings over time. Just because a project uses "-Werror -Wall" doesn't mean that their code will compile cleanly with new compilers. However, if the warning really is low value, then we may want to remove it. If someone wants to revert a new warning because it is too noisy or has false positives, they need to actively engage the patch author to support their position.

Elaborating on your question of what constitutes a regression in general, I would say that everything is case-by-case, but we have some clear cut ones, like miscompiles or new compiler crashes. People often misdiagnose UB as a miscompile, so to revert for a miscompile, you need some evidence that the code doesn't exercise UB. Reverting for a performance regression would need to come with a strong rationale to motivate why the slowdown outweighs the supposed benefits of the patch in question.
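
As an illustration of the point about undefined behavior being misdiagnosed as a miscompile, here is a minimal sketch in C. It is not taken from the thread, and the function names are hypothetical. The first function depends on signed integer overflow, which is undefined in C, so a result that changes after a compiler update would not by itself be evidence of a compiler bug. The second function expresses the same intent without overflowing.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical example: relies on signed overflow when x == INT_MAX,
   which is undefined behavior in C. A compiler is allowed to assume the
   overflow never happens, so the result for INT_MAX may differ between
   optimization levels or compiler versions without any compiler bug. */
int increment_is_larger(int x) {
    return x + 1 > x;
}

/* Well-defined rewrite: test against INT_MAX instead of overflowing. */
int can_increment(int x) {
    return x < INT_MAX;
}

/* Exercises only inputs whose behavior is defined. */
void check_defined_cases(void) {
    assert(increment_is_larger(41) == 1);
    assert(can_increment(41) == 1);
    assert(can_increment(INT_MAX) == 0);
}
```

One way to gather the kind of evidence mentioned above is to build the failing code with clang's -fsanitize=undefined and run it. If the sanitizer reports nothing on the failing input, the case for a genuine miscompile is stronger, although a clean run is not proof.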