Renato Golin via llvm-dev
2016-Sep-05 21:48 UTC
[llvm-dev] Buildbot General Failure - Production Stop?
Folks,

As Nico and Diana investigated earlier [1], there was a change in Zorg which made buildbots update one source directory (llvm.src) but build from another (llvm), which made *all* builds run from the same revision, no matter the update. Essentially, the bots were all lying when they said this or that commit "passed", since they were still testing the same old commit.

All our bots were affected, and it seems many others were too: Windows, PowerPC, s390, Atom, etc.

I have worked around the problem now by making "llvm" a symbolic link to "llvm.src", so we build what we update, and *many* of the bots are coming back with a myriad of failures, which are most likely from different commits in the last 4 days. This will take a while to clean up... for all of us.

My question is: what do we do now?

The safest option would be to stop production, i.e. block commits, until the bots are reverted and then green.

In a way, with all those bots not testing anything, whatever we commit is *not* going to be tested at all in a large part of our infrastructure, so I don't really think there is a point in assuming we can continue committing at will...

I don't remember this ever happening in LLVM, that's why I'm reluctant to propose it more strongly, but I see no better alternative.

So, what now?

cheers,
--renato

[1] http://lists.llvm.org/pipermail/cfe-dev/2016-September/050651.html
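For bot owners who want to apply the same workaround, here is a minimal sketch of the symlink step in Python. It is not taken from Zorg: the function name and the `workdir` parameter are hypothetical, and it only assumes the two directory names mentioned above ("llvm.src" is what gets updated, "llvm" is what gets built). It refuses to replace an existing real directory, so a stale "llvm" checkout would still have to be moved aside by hand.

```python
import os


def link_build_dir_to_source(workdir, updated="llvm.src", built="llvm"):
    """Make `built` a symlink to `updated` inside `workdir` (hypothetical helper).

    Returns True if the link was created, False if it already pointed there.
    Raises instead of touching a real directory or a link to somewhere else.
    """
    src = os.path.join(workdir, updated)
    dst = os.path.join(workdir, built)

    if not os.path.isdir(src):
        raise FileNotFoundError(f"updated checkout not found: {src}")

    if os.path.lexists(dst):
        # Already a link that resolves to the updated checkout: nothing to do.
        if os.path.islink(dst) and os.path.realpath(dst) == os.path.realpath(src):
            return False
        raise FileExistsError(f"{dst} exists and is not a link to {src}")

    # Relative target, resolved against workdir, so the builder directory
    # can be moved as a whole without breaking the link.
    os.symlink(updated, dst, target_is_directory=True)
    return True
```

Calling `link_build_dir_to_source("/path/to/builder")` once per builder directory should be enough; the path here is a placeholder, and a second call is a no-op.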
Krzysztof Parzyszek via llvm-dev
2016-Sep-05 22:04 UTC
[llvm-dev] Buildbot General Failure - Production Stop?
On 9/5/2016 4:48 PM, Renato Golin via llvm-dev wrote:
> The safest option would be to stop production, ie. block commits,
> until the bots are reverted and then green.

Let's first see how bad it is once bots are fixed to build the latest revision. It's only been a few days and that includes a weekend.

-Krzysztof
Renato Golin via llvm-dev
2016-Sep-05 22:24 UTC
[llvm-dev] Buildbot General Failure - Production Stop?
On 5 September 2016 at 23:04, Krzysztof Parzyszek <kparzysz at codeaurora.org> wrote:
> Let's first see how bad it is once bots are fixed to build the latest
> revision. It's only been a few days and that includes a weekend.

Of course. I'll wait until all our bots return from the first round to know the size of the damage.

I just wanted to warn all other buildbot owners and the general public that we do have a tough situation going, and if their patches are not essential, maybe voluntarily holding off for a while would be a good idea.

(Though, now that I read my email, it does sound alarmist. I'm in panic mode right now, so I apologise :).)

But all other affected bots won't come back until Galina reverts and restarts the master (or until their owners work around it like I did). There's no way of knowing how bad it is if the bots continue chugging bogus green status to every commit...

cheers,
--renato