Galina Kistanova via llvm-dev
2021-Feb-24 23:55 UTC
[llvm-dev] Buildbots building one revision at a time
Hi Nemanja,

I have cancelled the queue for both ppc64le-mlir-rhel-clang and ppc64le-flang-rhel-clang for you. Hope this helps.

> the short version - can I make a bot skip a bunch of revisions even if it
> is set up to build every revision?

In cases like this, you can manually cancel the queued build requests from the Web UI. Under the hood there could be multiple queues, so it might take more than one click on the "Cancel whole queue" button.

Every worker owner has permission to control the worker itself and its queue of build requests, barring unknown issues, provided that the owner is logged in with a GitHub account whose e-mail address matches the one in the worker's info. Yours must be powerllvm at ca.ibm.com. Was that the case?

I have also noticed that you have a worker with the wrong name, "ppc64le-flang+mlir-rhel-test", trying to connect over and over again from the same IP address that the correctly named "ppc64le-flang-mlir-rhel-test" connected from. Could you locate and stop the wrong worker, please?

Stay safe and warm!

Galina

On Wed, Feb 24, 2021 at 10:40 AM David Blaikie <dblaikie at gmail.com> wrote:

> Not sure, but Galina might have some ideas.
>
> On Wed, Feb 24, 2021 at 4:37 AM Nemanja Ivanovic via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>
>> Hi everyone,
>> the short version - can I make a bot skip a bunch of revisions even if it
>> is set up to build every revision?
>>
>> TL; DR;
>> Since many of the PPC build bots were down for 8 days due to the winter
>> storm in Austin, TX, a large number of build requests have queued up. For
>> most of our bots, this isn't really an issue since they just pick up the
>> latest revision and make the jump.
>>
>> However, our flang and mlir bots appear to be building a single revision
>> at a time. Even though each build takes under 10 minutes, it will take
>> quite some time for them to catch up on the few hundred requests. This
>> appears to be because they have 'collapseRequests': False in their
>> configuration.
>> I would like to keep that behaviour, but hopefully there is an override
>> for special circumstances such as this. None of my attempts (Cancel whole
>> queue and Force Build on https://lab.llvm.org/buildbot/#/builders/88)
>> have done anything. Does someone know of a way to make these bots skip all
>> these build requests?
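
A note on the 'collapseRequests': False setting mentioned above: as far as I know, Buildbot also accepts a callable for collapseRequests, called with (master, builder, req1, req2) and returning whether the two requests may be merged. Below is a minimal sketch of a policy that keeps one build per revision in normal operation and only merges requests that have sat in the queue for a long time. The function name, the 12-hour threshold, the extra `now` parameter, and the assumption that each request is a dict with a timezone-aware 'submitted_at' datetime are all hypothetical and would need checking against the Buildbot version in use.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical threshold: requests older than this count as backlog.
BACKLOG_AGE = timedelta(hours=12)


def collapse_only_backlog(master, builder, req1, req2, now=None):
    """Allow merging only when both requests are older than BACKLOG_AGE.

    Assumption: req1 and req2 are dicts carrying a timezone-aware
    'submitted_at' datetime. `now` exists only to make the function
    easy to test; a scheduler would call it with four arguments.
    """
    now = now or datetime.now(timezone.utc)
    return all(now - req["submitted_at"] > BACKLOG_AGE for req in (req1, req2))
```

With a policy like this, fresh requests are still built one revision at a time, while a queue left over from a multi-day outage could be collapsed without touching the Web UI.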
Michael Kruse via llvm-dev
2021-Feb-25 20:14 UTC
[llvm-dev] Buildbots building one revision at a time
On Wed, Feb 24, 2021 at 5:55 PM Galina Kistanova via llvm-dev <llvm-dev at lists.llvm.org> wrote:

> Every worker owner has permissions to control the worker itself and the queue of the build requests, modulo to unknown issues. If the worker owners logged in and the github accounts, they logged in with, have e-mail addresses matching those in the workers info. Yours must be powerllvm at ca.ibm.com.

That's a good thing to know. I tried this myself and added pollybot at meinersbur.de to GitHub's list of my email addresses. Unfortunately, after logging in with GitHub, it does not work. Any action I try is denied with an "unable to pause worker polly-x86_64-fdcserver:you need to have role 'LLVM Lab team'" message.

The problem that I currently have is that one of my buildbots (https://lab.llvm.org/staging/#/workers/19) live-locks after a few hours. The master thinks the worker is still building, but the worker is doing nothing. Its twisted.log says:

    2021-02-24 04:27:04-0600 [-] WorkerForBuilder.commandComplete <buildbot_worker.commands.shell.WorkerShellCommand object at 0x7f59b68def10>
    2021-02-24 04:33:24-0600 [-] sending app-level keepalive
    2021-02-24 04:33:24-0600 [Broker,client] Master replied to keepalive, everything's fine
    2021-02-24 04:43:24-0600 [-] sending app-level keepalive
    2021-02-24 04:43:24-0600 [Broker,client] Master replied to keepalive, everything's fine
    [... lots of keepalive entries ...]

I have to restart the buildbot-worker to get it to start on the next job, only for it to live-lock again a few hours later. For this reason I put this worker on staging instead of production. Is this a known problem?

Michael
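
A stall like the one in the log excerpt above can be spotted mechanically: the last entry that is not a keepalive is followed by nothing but keepalives. The following is a rough sketch, assuming the timestamp layout shown in the excerpt (date, time, and a numeric UTC offset at the start of each line). The function name and the choice to treat any line containing "keepalive" as idle traffic are illustrative assumptions, not part of buildbot-worker.

```python
from datetime import datetime

# Length of a leading timestamp such as "2021-02-24 04:27:04-0600".
STAMP_LEN = len("2021-02-24 04:27:04-0600")


def idle_since_last_command(lines):
    """Return the time between the last non-keepalive entry and the last entry.

    Lines without a parseable leading timestamp (elisions, tracebacks)
    are skipped. Returns None if no non-keepalive entry was seen.
    """
    last_activity = None
    last_seen = None
    for line in lines:
        try:
            stamp = datetime.strptime(line[:STAMP_LEN], "%Y-%m-%d %H:%M:%S%z")
        except ValueError:
            continue
        last_seen = stamp
        if "keepalive" not in line:
            last_activity = stamp
    if last_activity is None:
        return None
    return last_seen - last_activity
```

A cron job could run this over twisted.log and restart the worker once the idle time exceeds the longest expected build step, which would at least automate the manual restart described above.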