thr3ads.net - llvm dev - [llvm-dev] Proposal: Make the VE target official [Nov 2021]

Simon Moll via llvm-dev

2021-Nov-10 14:08 UTC

[llvm-dev] Proposal: Make the VE target official

On Mon, 2021-11-08 at 13:43 -0800, Philip Reames wrote:
> +1 to Renato's points.
> One extra point on the build bot is that your cycle time appears to
> be about 30 minutes.  That's not unreasonable, but faster cycles are
> always better (i.e. shorter blame lists).  Any chance you can reduce
> that time via e.g. more hardware or build config tweaks (such as
> ccache)?  I don't mean to suggest this as a blocking item, simply as
> an area where improvement is possible.

We should be able to bring that down. clang-ve-ninja currently builds
everything from scratch (and it's all static - I'd love to have working
shared component libraries for faster/incremental builds).

We are also considering a second, faster builder that only builds and
tests LLVM+Clang. That would be the canary for any issues with the VE
backend.

clang-ve-ninja would be the slow but thorough builder that includes all
supported runtimes and runs compiled code on the VE.
> Philip
> On 11/8/21 7:52 AM, Renato Golin via llvm-dev wrote:
>  
> > On Mon, 8 Nov 2021 at 14:56, Simon Moll <Simon.Moll at emea.nec.com> wrote:
> >  
> > > If you look at the build logs of clang-ve-ninja, you will see that
> > > the "check all" tests for LLVM+Clang have been passing for a while.
> > > What's failing is compiler-rt and we have a patch for that.
> > > 
> > 
> > Right, what we mean by "green bots" is that there should be no
> > conditions for the bot to be considered a success.
> > 
> > Buildbots *must* not only test all known functionality that is
> > expected to work, but they also must not be "red".
> > 
> > This is something that perhaps isn't clear in the new target
> > section of the documentation, but it has been the modus operandi
> > for a long time.
> > 
> > If the bot is red, or turns red easily, it can't be relied upon to
> > convey success in target testing, because you can't expect non-NEC
> > developers to know what's good and what's not, or what should pass
> > and what shouldn't.
> > 
> > It's the responsibility of the bot owner (and ultimately, the
> > target's community), to make sure the bots accurately reflect the
> > quality of the target.
> > 
> > Therefore, a (perhaps undocumented) item on the checklist before
> > moving out of experimental is: the bots must test the target and
> > they must be green and stable (weeks without crashing for spurious
> > reasons).
> > 
> > In VE's case, looking at the earlier builds and seeing that "clang
> > check" passes them all, should be enough to assert history, but
> > before the target is built by everyone else, the bot must be green.

Thanks for shedding some light on the more implicit items on the
checklist.

Once D113093 is in, clang-ve-ninja is expected to be green.
We can call that the stable state - everything that's tested is
supposed to work and any red-ness implies breakage.
> > 
> > 
> >  
> > > Yes, the compiler-rt tests are failing for well understood reasons
> > > (documented in the patch - check-all on LLVM+Clang is green). The
> > > patch will make compiler-rt pass on VE by accounting for those (no
> > > denorm support, syscall differences).
> > > We explicitly include compiler-rt testing (even though it is
> > > failing) to have LLVM-compiled code running on the VE in CI.. this
> > > is not something we'd do for slick optics.
> > > 
> > 
> > Right, I've done the same thing when turning on the Arm back-end. I
> > built enough buildbots that showed that the target was working on
> > the basic level, then disabled the compiler-rt and test-suite tests
> > that were not passing, with specific bugzilla items for each one, and
> > then with time, I fixed all of them and then all Arm bots were
> > green.
> > 
> > In your case, no other bot (should? will?) build compiler-rt for
> > VE, so this shouldn't hit other bots, which will start building VE
> > once it builds by default.
> > 
> > But your buildbot will still be the *only* bot that builds VE proper
> > and uses hardware, so it will be the representative of the VE
> > target.
> > 
> > If it stays red, and later on problems start to appear in
> > the LIT tests, then other developers will look at your bot, red for
> > ages, and will likely infer that no one cares, and disable the
> > broken test.

This may be a good moment to mention that the compiler-rt patch
disables tests that will never work on VE - there is no fp denormal
support, for example.
> > 
> > Overall, it's much easier if the main bot is green and all the
> > disabled tests have bugzilla entries showing that you are working
> > on it.

Using bugzilla for this makes sense - evidently for bugs but also to
track/document features that aren't ready yet. Besides that, I'd like
to have a CI-approach for turning on features (in particular runtimes,
which tend to be less incremental than the backend work). I am thinking
of the following:

With the compiler-rt patch clang-ve-ninja will be green. The coverage
of that bot defines what's officially supported for VE at any given
moment.

We add a new staging buildbot that builds everything clang-ve-ninja
does plus the yet-unsupported features that we are currently working
on.
Initially, that bot will be 'red' while the official one has to be kept
'green'. Once we are confident about the feature/runtime - both bots
are 'green' - we will make the official bot test that feature, thereby
declaring the feature official. The staging bot will turn to new
experimental features.
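The promotion scheme described above could be sketched roughly as follows. This is only an illustration under stated assumptions: the `Bot` class, the `promote` helper, the "staging-twin" builder name and the "libcxx" feature are hypothetical and are not taken from any actual buildbot configuration.

```python
from dataclasses import dataclass, field


@dataclass
class Bot:
    """Hypothetical model of a builder: a name plus the features it tests."""
    name: str
    features: set = field(default_factory=set)
    failing: set = field(default_factory=set)  # features whose tests currently fail

    def is_green(self) -> bool:
        # A bot is green only if nothing it covers is failing.
        return not (self.features & self.failing)


def promote(official: Bot, staging: Bot, feature: str) -> bool:
    """Add `feature` to the official bot's coverage if both bots are green.

    Returns True if the feature was promoted.
    """
    if feature not in staging.features or feature in official.features:
        return False
    if not (official.is_green() and staging.is_green()):
        return False
    official.features.add(feature)
    return True


# Illustrative names only: "staging-twin" and "libcxx" are assumptions.
official = Bot("clang-ve-ninja", {"llvm", "clang", "compiler-rt"})
staging = Bot("staging-twin", {"llvm", "clang", "compiler-rt", "libcxx"},
              failing={"libcxx"})

print(promote(official, staging, "libcxx"))  # staging is red: not promoted
staging.failing.clear()                      # the runtime now passes on staging
print(promote(official, staging, "libcxx"))  # both green: promoted
```

The point of the sketch is that the official bot's coverage only ever grows by features that were already green on the staging bot, so the official bot never has a reason to turn red because of work in progress.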

Coming back to Philip's point about slow turnarounds, one advantage of
the "ambitious twin" of clang-ve-ninja is that we could experiment with
speeding up the build without affecting the official bot.

Btw, we are planning to port all LLVM runtimes to VE more or less. The
delta between clang-ve-ninja and the staging twin will mostly be
runtimes.
> > 
> >  
> > > The github repo is for reference only. If you look at our upstream
> > > patch history, you will see that we submit small patches with
> > > tests and follow the review protocol.
> > > 
> > 
> > I know, that's not what I meant.
> > 
> > My point is that it's really hard to use that branch for reference
> > because of all of the other non-VE stuff that is there too, bundled
> > in a single merge commit. 
> > 
> > cheers,
> > --renato
Thanks
- Simon
> >  
> >  
> > _______________________________________________
> > LLVM Developers mailing list
> > llvm-dev at lists.llvm.org
> > https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>  
>

Philip Reames via llvm-dev

2021-Nov-10 17:00 UTC

[llvm-dev] Proposal: Make the VE target official

On 11/10/21 6:08 AM, Simon Moll wrote:
> On Mon, 2021-11-08 at 13:43 -0800, Philip Reames wrote:
>> +1 to Renato's points.
>> One extra point on the build bot is that your cycle time appears to
>> be about 30 minutes.  That's not unreasonable, but faster cycles are
>> always better (i.e. shorter blame lists).  Any chance you can reduce
>> that time via e.g. more hardware or build config tweaks (such as
>> ccache)?  I don't mean to suggest this as a blocking item, simply as
>> an area where improvement is possible.
> We should be able to bring that down. clang-ve-ninja currently builds
> everything from scratch (and it's all static - I'd love to have working
> shared component libraries for faster/incremental builds).
>
> We are also considering a second, faster builder that only builds and
> tests LLVM+Clang. That would be the canary for any issues with the VE
> backend.
>
> clang-ve-ninja would be the slow but thorough builder that includes all
> supported runtimes and runs compiled code on the VE.

This all sounds entirely reasonable, with the one caveat that
incremental builds can be risky.  I've been told that ccache works almost
as well with less risk of build weirdness.  (I don't personally maintain a
bot, so this is hearsay.)

>

Renato Golin via llvm-dev

2021-Nov-10 17:52 UTC

[llvm-dev] Proposal: Make the VE target official

On Wed, 10 Nov 2021 at 14:10, Simon Moll <Simon.Moll at emea.nec.com> wrote:
> We should be able to bring that down. clang-ve-ninja currently builds
> everything from scratch (and it's all static - I'd love to have working
> shared component libraries for faster/incremental builds).
>
> We are also considering a second, faster builder that only builds and
> tests LLVM+Clang. That would be the canary for any issues with the VE
> backend.
>
IIUC, your builder is an x86_64 machine cross testing on a VE target.

If the canary builder doesn't run anything on VE hardware, then after VE is
official, it won't be different than any other x86_64 builder
building/testing VE.

> Once D113093 is in, clang-ve-ninja is expected to be green.
> We can call that the stable state - everything that's tested is
> supposed to work and any red-ness implies breakage.
>
Excellent! Once it is green, make sure the bot is moved to the production
server. Check with Galina to make sure you're in the right place.

> This may be a good moment to mention that the compiler-rt patch
> disables tests that will never work on VE - there is no fp denormal
> support, for example.
>
Ah, right, so then permanently disabling them is the right thing to do; no
need for bugzilla entries.

> We add a new staging buildbot that builds everything clang-ve-ninja
> does plus the yet-unsupported features that we are currently working
> on.
>
If you do that, make sure you add it to the staging server (or a private
one). This bot cannot notify people of breakages (email, IRC, nothing).

> Initially, that bot will be 'red' while the official one has to be kept
> 'green'. Once we are confident about the feature/runtime - both bots
> are 'green' - we will make the official bot test that feature, thereby
> declaring the feature official. The staging bot will turn to new
> experimental features.
>
Once it's official and green, move the bot to the production (noisy)
server and create a new bot for new features on the staging server.

Once the target is out of experimental, every buildbot that doesn't
restrict the targets it builds (most bots) will build VE and run its LIT
tests.

Because you're the VE code owner, if your target breaks other people's
bots, developers will (hopefully) notify you, too.

So your strategy would have at the very least three (classes of) bots:

1. Fast "canary" bot, building only Clang and LLVM and running LIT tests
(maybe you'll want to add some cross-VE testing here, too)

2. Slow complete bot, building everything that is supposed to be supported.
This can be multiple bots with different configurations or one huge build.

3. A staging (silent) build with new stuff that the team is working on.
This can also be different bots, totally up to the team, and doesn't even
need to be public.
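
As a rough illustration, the three classes above and the rule that staging bots stay silent could be written down like this. Everything here is a sketch under assumptions: the `BotClass` type, the field names and the `policy_violations` helper are hypothetical, and placing the canary bot on the production server is my reading of the list, not something stated in it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BotClass:
    """Hypothetical description of one class of builder."""
    name: str
    server: str     # "production" (noisy) or "staging" (silent)
    notifies: bool  # whether breakages are reported to developers
    scope: str


# Assumed mapping of the three classes onto servers.
BOT_CLASSES = [
    BotClass("canary", "production", True, "Clang + LLVM, LIT tests"),
    BotClass("complete", "production", True, "everything officially supported"),
    BotClass("staging", "staging", False, "features still in development"),
]


def policy_violations(bots):
    """Return the names of bots whose notification setting does not match
    their server: production bots notify, staging bots must not."""
    return [b.name for b in bots if b.notifies != (b.server == "production")]


print(policy_violations(BOT_CLASSES))
```

A check like this would flag a staging bot that was accidentally configured to send email or IRC notifications.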

I did a similar division for Arm and we've been doing it ever since for
Arm 32 and 64. Even if we use the same server for multiple bots, it's
easier to debug breakages when we build less stuff.

cheers,
--renato