Michael Kruse via llvm-dev
2021-Feb-25 20:14 UTC
[llvm-dev] Buildbots building one revision at a time
On Wed, Feb 24, 2021 at 17:55, Galina Kistanova via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> Every worker owner has permission to control the worker itself and the queue of build requests, modulo unknown issues, if the worker owners are logged in and the GitHub accounts they logged in with have e-mail addresses matching those in the workers' info. Yours must be powerllvm at ca.ibm.com.

That's a good thing to know. I tried this myself and added pollybot at meinersbur.de to GitHub's list of my email addresses. Unfortunately, after logging in with GitHub it does not work. Any action I try is denied with an "unable to pause worker polly-x86_64-fdcserver:you need to have role 'LLVM Lab team'" message.

The problem I currently have is that one of my buildbots (https://lab.llvm.org/staging/#/workers/19) live-locks after a few hours. The master thinks the worker is still building, but the worker is doing nothing. Its twisted.log says:

2021-02-24 04:27:04-0600 [-] WorkerForBuilder.commandComplete <buildbot_worker.commands.shell.WorkerShellCommand object at 0x7f59b68def10>
2021-02-24 04:33:24-0600 [-] sending app-level keepalive
2021-02-24 04:33:24-0600 [Broker,client] Master replied to keepalive, everything's fine
2021-02-24 04:43:24-0600 [-] sending app-level keepalive
2021-02-24 04:43:24-0600 [Broker,client] Master replied to keepalive, everything's fine
[... lots of keepalive entries ...]

I have to restart the buildbot-worker for it to start working on the next job, only for it to live-lock again after a few hours. For this reason I put this worker on staging instead of production. Is this a known problem?

Michael
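The symptom described above (a commandComplete entry followed by nothing but keepalive traffic) could be spotted automatically by scanning the worker's twisted.log. The following is only a rough sketch, not part of Buildbot: the function names, the 30-minute threshold, and the assumption that every log entry starts with a timestamp in the format shown in the excerpt are all hypothetical choices made for illustration.

```python
import re
from datetime import datetime, timedelta

# Assumed entry layout, taken from the excerpt in this message:
#   2021-02-24 04:27:04-0600 [-] some message
LINE_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}[+-]\d{4}) \[[^\]]*\] (.*)$"
)


def parse_line(line):
    """Return (timestamp, message), or None for lines without a timestamp."""
    match = LINE_RE.match(line)
    if match is None:
        return None
    timestamp = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S%z")
    return timestamp, match.group(2)


def find_stall(lines, threshold=timedelta(minutes=30)):
    """Heuristic (hypothetical helper): report a stall when the last
    non-keepalive entry is a commandComplete and only keepalive entries
    follow it for at least `threshold`.

    Returns (time_of_last_activity, idle_duration) or None.
    """
    last_activity = None  # (timestamp, message) of last non-keepalive entry
    last_seen = None      # timestamp of the last parsed entry of any kind
    for line in lines:
        parsed = parse_line(line)
        if parsed is None:
            continue  # skip continuation lines and blank lines
        timestamp, message = parsed
        last_seen = timestamp
        if "keepalive" not in message:
            last_activity = (timestamp, message)
    if last_activity is None:
        return None
    activity_time, activity_message = last_activity
    if "commandComplete" not in activity_message:
        return None  # a command may legitimately still be running
    idle = last_seen - activity_time
    if idle < threshold:
        return None
    return activity_time, idle


if __name__ == "__main__":
    import sys

    # Usage: python find_stall.py path/to/twisted.log
    with open(sys.argv[1], encoding="utf-8", errors="replace") as log:
        result = find_stall(log)
    if result is None:
        print("no stall detected")
    else:
        print(f"idle for {result[1]} since {result[0]}")
```

Such a check only reports what the log already shows; it cannot tell whether the master or the worker is at fault, and a worker that is simply waiting for its next job would look the same, so it would need to be combined with the master's view that a build is still in progress.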
Galina Kistanova via llvm-dev
2021-Mar-01 04:57 UTC
[llvm-dev] Buildbots building one revision at a time
Only the primary e-mail is used for authorization purposes. If that was already the case, we can set up a time when we are both available and troubleshoot, if needed.

From the buildbot server's perspective, the test step never ended, and the worker seemed online and responsive. I have restarted the staging server to make sure all the connections are reset, and now your bot seems to build fine.

Thanks

Galina
Michael Kruse via llvm-dev
2021-Mar-04 23:55 UTC
[llvm-dev] Buildbots building one revision at a time
Unfortunately, just restarting the server did not help. It is stuck again.

Michael