Simon Moll via llvm-dev
2021-Nov-10 14:08 UTC
[llvm-dev] Proposal: Make the VE target official
On Mon, 2021-11-08 at 13:43 -0800, Philip Reames wrote:
> +1 to Renato's points.
> One extra point on the build bot is that your cycle time appears to
> be about 30 minutes. That's not unreasonable, but faster cycles are
> always better (i.e. shorter blame lists). Any chance you can reduce
> that time via e.g. more hardware or build config tweaks (such as
> ccache)? I don't mean to suggest this as a blocking item, simple as
> an area where improvement is possible.

We should be able to bring that down. clang-ve-ninja currently builds
everything from scratch (and it's all static - I'd love to have working
shared component libraries for faster/incremental builds).

We are also considering a second, faster builder that only builds and
tests LLVM+Clang. That would be the canary for any issues with the VE
backend.

clang-ve-ninja would be the slow but thorough builder that includes all
supported runtimes and runs compiled code on the VE.

> Philip
> On 11/8/21 7:52 AM, Renato Golin via llvm-dev wrote:
>
> > On Mon, 8 Nov 2021 at 14:56, Simon Moll <Simon.Moll at emea.nec.com>
> > wrote:
> >
> > > If you look at the build logs of clang-ve-ninja, you will see
> > > that the "check all" tests for LLVM+Clang have been passing for
> > > a while. What's failing is compiler-rt and we have a patch for
> > > that.
> >
> > Right, what we mean by "green bots" is that there should be no
> > conditions for the bot to be considered a success.
> >
> > Buildbots *must* not only test all known functionality that is
> > expected to work, but they also must not be "red".
> >
> > This is something that perhaps isn't clear on the new target
> > section of the documents but it's the modus operandi for a long
> > time.
> >
> > If the bot is red, or turns red easily, it can't be relied upon to
> > convey success in target testing, because you can't expect non-NEC
> > developers to know what's good and what's not, or what should pass
> > and what shouldn't.
> >
> > It's the responsibility of the bot owner (and ultimately, the
> > target's community), to make sure the bots accurately reflect the
> > quality of the target.
> >
> > Therefore, a (perhaps undocumented) item on the checklist before
> > moving out of experimental is: the bots must test the target and
> > they must be green and stable (weeks without crashing for spurious
> > reasons).
> >
> > In VE's case, looking at the earlier builds and seeing that "clang
> > check" passes them all, should be enough to assert history, but
> > before the target is built by everyone else, the bot must be green.

Thanks for shedding some light on the more implicit items on the
checklist.

Once D113093 is in, clang-ve-ninja is expected to be green.
We can call that the stable state - everything that's tested is
supposed to work and any red-ness implies breakage.

> > > Yes, the compiler-rt tests are failing for well understood
> > > reasons (documented in the patch - check-all on LLVM+Clang is
> > > green). The patch will make compiler-rt pass on VE by accounting
> > > for those (no denorm support, syscall differences).
> > > We explicitly include compiler-rt testing (even though it is
> > > failing) to have LLVM-compiled code running on the VE in CI..
> > > this is not something we'd do for slick optics.
> >
> > Right, I've done the same thing when turning on the Arm back-end. I
> > built enough buildbots that shown that the target was working on
> > the basic level, then disabled the compiler-rt and test-suite that
> > were not passing with specific bugzilla items for each one, and
> > then with time, I fixed all of them and then all Arm bots were
> > green.
> >
> > In your case, no other bot (should? will?) build compiler-rt for
> > VE, so this shouldn't hit other bots, which will start building VE
> > once it builds by default.
> >
> > But your buildbot will still be the *only* bot that build VE proper
> > and uses hardware, so it will be the representative of the VE
> > target.
> >
> > If it continues red, and it later on problems start to appear in
> > the LIT tests, then other developers will look at your bot, red for
> > ages, and will likely infer that no one cares, and disable the
> > broken test.

This may be a good moment to mention that the compiler-rt patch
disables tests that will never work on VE - there is no fp denormal
support, for example.

> > Overall, it's much easier if the main bot is green and all the
> > disabled tests have bugzilla entries showing that you are working
> > on it.

Using bugzilla for this makes sense - evidently for bugs but also to
track/document features that aren't ready yet. Besides that, I'd like
to have a CI approach for turning on features (in particular runtimes,
which tend to be less incremental than the backend work). I am thinking
of the following:

With the compiler-rt patch, clang-ve-ninja will be green. The coverage
of that bot defines what's officially supported for VE at any given
moment.

We add a new staging buildbot that builds everything clang-ve-ninja
does plus the yet-unsupported features that we are currently working
on.
Initially, that bot will be 'red' while the official one has to be kept
'green'. Once we are confident about the feature/runtime - both bots
are 'green' - we will make the official bot test that feature, thereby
declaring the feature official. The staging bot will then turn to new
experimental features.

Coming back to Philip's point about slow turnarounds, one advantage of
the "ambitious twin" of clang-ve-ninja is that we could experiment with
speeding up the build without affecting the official bot.

Btw, we are planning to port more or less all LLVM runtimes to VE. The
delta between clang-ve-ninja and the staging twin will mostly be
runtimes.

> > > The github repo is for reference only. If you look at our
> > > upstream patch history, you will see that we submit small patches
> > > with tests and follow the review protocol.
> >
> > I know, that's not what I meant.
> >
> > My point is that it's really hard to use that branch for reference
> > because of all of the other non-VE stuff that is there too, bundled
> > in a single merge commit.
> >
> > cheers,
> > --renato

Thanks
- Simon

> > _______________________________________________
> > LLVM Developers mailing list
> > llvm-dev at lists.llvm.org
> > https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
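
The promotion rule proposed above (a feature becomes official only once both the official and the staging bot are green) can be sketched in a few lines of Python. This is only an illustration of the policy, not the actual buildbot configuration: the `Bot` class, the `promote` helper, the "staging-twin" name and the "libcxx" feature are hypothetical and chosen for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Bot:
    """Hypothetical model of a builder: name, tested features, last result."""
    name: str
    features: set[str] = field(default_factory=set)
    green: bool = False


def promote(official: Bot, staging: Bot, feature: str) -> bool:
    """Add `feature` to the official bot's coverage if both bots are green.

    The official bot's feature set is treated as the definition of what is
    officially supported. Returns True when the promotion happened.
    """
    if feature not in staging.features:
        raise ValueError(f"{feature!r} is not being staged")
    if not (official.green and staging.green):
        return False  # not ready: the feature stays on the staging bot only
    official.features.add(feature)
    return True


# Assumed starting point: the official bot is green, the staging twin builds
# the same things plus one runtime that is still being ported.
official = Bot("clang-ve-ninja", {"llvm", "clang", "compiler-rt"}, green=True)
staging = Bot("staging-twin", official.features | {"libcxx"}, green=False)

print(promote(official, staging, "libcxx"))  # staging is still red
staging.green = True
print(promote(official, staging, "libcxx"))  # both bots are green now
print(sorted(official.features))
```

The point of the design is that the official bot never tests anything that has not already been green on the staging twin, so any red-ness on the official bot means breakage.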
Philip Reames via llvm-dev
2021-Nov-10 17:00 UTC
[llvm-dev] Proposal: Make the VE target official
On 11/10/21 6:08 AM, Simon Moll wrote:
> On Mon, 2021-11-08 at 13:43 -0800, Philip Reames wrote:
>> +1 to Renato's points.
>> One extra point on the build bot is that your cycle time appears to
>> be about 30 minutes. That's not unreasonable, but faster cycles are
>> always better (i.e. shorter blame lists). Any chance you can reduce
>> that time via e.g. more hardware or build config tweaks (such as
>> ccache)? I don't mean to suggest this as a blocking item, simple as
>> an area where improvement is possible.
> We should be able to bring that down. clang-ve-ninja currently builds
> everything from scratch (and it's all static - i'd love to have working
> shared component libraries for faster/incremental builds).
>
> We are also considering a second, faster builder that only builds and
> tests LLVM+Clang. That would be the canary for any issues with the VE
> backend.
>
> clang-ve-ninja would be the slow but thorough builder that includes all
> supported runtimes and runs compiled code on the VE.

This all sounds entirely reasonable, with the one caveat that
incremental builds can be risky. I've been told that ccache works
almost as well with less risk of build weirdness. (I don't personally
maintain a bot, so this is hearsay.)

>
>> Philip
>> On 11/8/21 7:52 AM, Renato Golin via llvm-dev wrote:
>>
>>> On Mon, 8 Nov 2021 at 14:56, Simon Moll <Simon.Moll at emea.nec.com>
>>> wrote:
>>>
>>>> If you look at the build logs of clang-ve-ninja, you will see
>>>> that the "check all" tests for LLVM+Clang have been passing for
>>>> a while. What's failing is compiler-rt and we have a patch for
>>>> that.
>>>>
>>> Right, what we mean by "green bots" is that there should be no
>>> conditions for the bot to be considered a success.
>>>
>>> Buildbots *must* not only test all known functionality that is
>>> expected to work, but they also must not be "red".
>>>
>>> This is something that perhaps isn't clear on the new target
>>> section of the documents but it's the modus operandi for a long
>>> time.
>>>
>>> If the bot is red, or turns red easily, it can't be relied upon to
>>> convey success in target testing, because you can't expect non-NEC
>>> developers to know what's good and what's not, or what should pass
>>> and what shouldn't.
>>>
>>> It's the responsibility of the bot owner (and ultimately, the
>>> target's community), to make sure the bots accurately reflect the
>>> quality of the target.
>>>
>>> Therefore, a (perhaps undocumented) item on the checklist before
>>> moving out of experimental is: the bots must test the target and
>>> they must be green and stable (weeks without crashing for spurious
>>> reasons).
>>>
>>> In VE's case, looking at the earlier builds and seeing that "clang
>>> check" passes them all, should be enough to assert history, but
>>> before the target is built by everyone else, the bot must be green.
> Thanks for shedding some light on the more implicit items on the
> checklist.
>
> Once D113093 is in, clang-ve-ninja is expected to be green.
> We can call that the stable state - everything that's tested is
> supposed to work and any red-ness implies breakage.
>
>>>
>>>> Yes, the compiler-rt tests are failing for well understood
>>>> reasons (documented in the patch - check-all on LLVM+Clang is
>>>> green). The patch will make compiler-rt pass on VE by accounting
>>>> for those (no denorm support, syscall differences).
>>>> We explicitly include compiler-rt testing (even though it is
>>>> failing) to have LLVM-compiled code running on the VE in CI..
>>>> this is not something we'd do for slick optics.
>>>>
>>> Right, I've done the same thing when turning on the Arm back-end. I
>>> built enough buildbots that shown that the target was working on
>>> the basic level, then disabled the compiler-rt and test-suite that
>>> were not passing with specific bugzilla items for each one, and
>>> then with time, I fixed all of them and then all Arm bots were
>>> green.
>>>
>>> In your case, no other bot (should? will?) build compiler-rt for
>>> VE, so this shouldn't hit other bots, which will start building VE
>>> once it builds by default.
>>>
>>> But your buildbot will still be the *only* bot that build VE proper
>>> and uses hardware, so it will be the representative of the VE
>>> target.
>>>
>>> If it continues red, and it later on problems start to appear in
>>> the LIT tests, then other developers will look at your bot, red for
>>> ages, and will likely infer that no one cares, and disable the
>>> broken test.
> This may be a good moment to mention that the compiler-rt patch
> disables tests that will never work on VE - there is no fp denormal
> support, for example.
>
>>> Overall, it's much easier if the main bot is green and all the
>>> disabled tests have bugzilla entries showing that you are working
>>> on it.
> Using bugzilla for this makes sense - evidently for bugs but also to
> track/document features that aren't ready yet. Besides that, i'd like
> to have a CI-approach for turning on features (in particular runtimes,
> which tend to be less incremental than the backend work). I am thinking
> the following:
>
> With the compiler-rt patch clang-ve-ninja will be green. The coverage
> of that bot defines what's officially supported for VE at any given
> moment.
>
> We add a new staging buildbot that builds everything clang-ve-ninja
> does plus the yet-unsupported features that we are currently working
> on.
> Initially, that bot will be 'red' while the official one has to be kept
> 'green'. Once we are confident about the feature/runtime - both bots
> are 'green' - we will make the official bot test that feature, thereby
> declaring the feature official. The staging bot will turn to new
> experimental features.
>
> Coming back to Philip's point about slow turn arounds, one advantage of
> the "ambitious twin" of clang-ve-ninja is that we could experiment with
> speeding up the build without affecting the official bot.
>
> Btw, we are planning to port all LLVM runtimes to VE more or less. The
> delta between clang-ve-ninja and the staging twin will mostly be
> runtimes.
>
>>>
>>>> The github repo is for reference only. If you look at our
>>>> upstream patch history, you will see that we submit small patches
>>>> with tests and follow the review protocol.
>>>>
>>> I know, that's not what I meant.
>>>
>>> My point is that it's really hard to use that branch for reference
>>> because of all of the other non-VE stuff that is there too, bundled
>>> in a single merge commit.
>>>
>>> cheers,
>>> --renato
> Thanks
> - Simon
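
For reference, the ccache suggestion maps onto an existing LLVM CMake option. The sketch below assembles a configure command line for a from-scratch LLVM+Clang build with the VE backend enabled. It is an assumption about how such a builder could be set up, not the actual clang-ve-ninja configuration. `LLVM_CCACHE_BUILD`, `LLVM_ENABLE_PROJECTS`, `LLVM_TARGETS_TO_BUILD` and `LLVM_EXPERIMENTAL_TARGETS_TO_BUILD` are existing LLVM CMake variables; the helper name and the directory arguments are hypothetical.

```python
import shlex


def configure_command(source_dir: str, build_dir: str,
                      use_ccache: bool = True) -> list[str]:
    """Return a CMake configure command for an LLVM+Clang build with VE."""
    cmd = [
        "cmake", "-G", "Ninja",
        "-S", f"{source_dir}/llvm",
        "-B", build_dir,
        "-DCMAKE_BUILD_TYPE=Release",
        "-DLLVM_ENABLE_PROJECTS=clang",
        "-DLLVM_TARGETS_TO_BUILD=X86",
        # VE was still an experimental target at the time of this thread.
        "-DLLVM_EXPERIMENTAL_TARGETS_TO_BUILD=VE",
    ]
    if use_ccache:
        # A clean rebuild can then reuse cached object files for unchanged
        # translation units instead of recompiling them.
        cmd.append("-DLLVM_CCACHE_BUILD=ON")
    return cmd


# Print the command in a form that can be pasted into a shell.
print(shlex.join(configure_command("llvm-project", "build")))
```

With this approach the builder can keep doing clean builds, which avoids the incremental-build risk mentioned above, while the compiler cache absorbs most of the repeated compile time.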
Renato Golin via llvm-dev
2021-Nov-10 17:52 UTC
[llvm-dev] Proposal: Make the VE target official
On Wed, 10 Nov 2021 at 14:10, Simon Moll <Simon.Moll at emea.nec.com> wrote:

> We should be able to bring that down. clang-ve-ninja currently builds
> everything from scratch (and it's all static - i'd love to have working
> shared component libraries for faster/incremental builds).
>
> We are also considering a second, faster builder that only builds and
> tests LLVM+Clang. That would be the canary for any issues with the VE
> backend.
>

IIUC, your builder is an x86_64 machine cross testing on a VE target.
If the canary builder doesn't run anything on VE hardware, then after
VE is official, it won't be different than any other x86_64 builder
building/testing VE.

> Once D113093 is in, clang-ve-ninja is expected to be green.
> We can call that the stable state - everything that's tested is
> supposed to work and any red-ness implies breakage.
>

Excellent! Once it is green, make sure the bot is moved to the
production server. Check with Galina to make sure you're in the right
place.

> This may be a good moment to mention that the compiler-rt patch
> disables tests that will never work on VE - there is no fp denormal
> support, for example.
>

Ah, right, so then permanently disabled is the right thing to do, no
need for bugzilla entries.

> We add a new staging buildbot that builds everything clang-ve-ninja
> does plus the yet-unsupported features that we are currently working
> on.
>

If you do that, make sure you add it to the staging server (or a
private one). This bot must not notify people of breakages (email,
IRC, nothing).

> Initially, that bot will be 'red' while the official one has to be kept
> 'green'. Once we are confident about the feature/runtime - both bots
> are 'green' - we will make the official bot test that feature, thereby
> declaring the feature official. The staging bot will turn to new
> experimental features.
>

Once it's official, and green, move the bot to the production (noisy)
server and create a new one for new features on the staging one.

Once the target is out of experimental, every buildbot that doesn't
restrict the targets it builds (most bots) will build VE and run its
LIT tests. Because you're the VE code owner, if your target breaks
other people's bots, developers will (hopefully) notify you, too.

So your strategy would have at the very least three (classes of) bots:

1. Fast "canary" bot, building only Clang and LLVM and running LIT
   tests (maybe you'll want to add some cross-VE testing here, too)
2. Slow complete bot, building everything that is supposed to be
   supported. This can be multiple bots with different configurations
   or one huge build.
3. A staging (silent) build with new stuff that the team is working
   on. This can also be different bots, totally up to the team, and
   doesn't even need to be public.

I did a similar division for Arm and we've been doing it since for Arm
32 and 64. Even if we use the same server for multiple bots, it's
easier to debug breakages when we build less stuff.

cheers,
--renato
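
The three classes of bots and their notification behaviour can be written down as a small table in code. The Python sketch below is illustrative only: the `BotClass` type, the `Server` enum and the field values are hypothetical, and whether the canary runs on the production server or touches VE hardware is an assumption made for the example, not something stated in the thread.

```python
from dataclasses import dataclass
from enum import Enum


class Server(Enum):
    PRODUCTION = "production"  # "noisy": breakages notify developers
    STAGING = "staging"        # silent: nobody is notified


@dataclass(frozen=True)
class BotClass:
    """Hypothetical description of one class of VE builder."""
    name: str
    builds: tuple[str, ...]
    runs_on_ve_hardware: bool
    server: Server

    @property
    def notifies(self) -> bool:
        # Only bots on the production server may send breakage notices.
        return self.server is Server.PRODUCTION


BOTS = [
    # Assumption: the canary only builds and runs LIT tests on the host.
    BotClass("canary", ("llvm", "clang"), False, Server.PRODUCTION),
    BotClass("complete", ("llvm", "clang", "supported runtimes"),
             True, Server.PRODUCTION),
    BotClass("staging", ("llvm", "clang", "supported runtimes",
                         "work in progress"), True, Server.STAGING),
]

for bot in BOTS:
    print(f"{bot.name:9} server={bot.server.value:10} notifies={bot.notifies}")
```

Keeping the notification rule tied to the server rather than to the individual bot mirrors the advice above: moving a bot from staging to production is the single step that makes it noisy.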