thr3ads.net - llvm dev - [llvm-dev] RFC: Building GlobalISel by default [Jan 2017]

If this information is useful, please help other people find it:
Share via:

Mehdi Amini via llvm-dev

2017-Jan-14 18:53 UTC

[llvm-dev] RFC: Building GlobalISel by default

> On Jan 14, 2017, at 4:57 AM, Renato Golin via llvm-dev <llvm-dev at
lists.llvm.org> wrote:
> 
> On 14 January 2017 at 01:54, Quentin Colombet via llvm-dev
> <llvm-dev at lists.llvm.org> wrote:
>> Now, four backends (if I am counting right: X86, ARM, AArch64, AMDGPU)
are working on bringing-up GlobalISel, I’d like to switch the default of the
LLVM_BUILD_GLOBAL_ISEL variable in CMake, such that the framework gets built by
default.
> 
> Hi Quentin,
> 
> I'm not extremely confident in a full switch right now, mainly due to
> Michael's concerns. The ARM port is in its initial stage and there
> could still be many refactorings and destructive changes. I'm not
> foreseeing a cross between the isel tests, but every failure makes the
> bots red, and if Global ISel incurs in more failures due to its stage
> in the development cycle, we'll start seeing the bots more red than
> we'd like. Right now, it's already very likely that you'll
receive bot
> emails on commits, I don't want to increase that.
I don’t see why having global ISel is supposed to increase the rate. `make
check` will runs on the committer machine with GlobalISel enabled.

The good question from Michael is IMO "if my (non-GlobalISel) change breaks
GlobalISel, how likely is it to be a bug in my code vs. a bug in GlobalISel?”

But that’s totally unrelated to the bots, since it’ll fails on the committer
machine as well.
> 
> One alternative is to create a buildbot that turns it on by default,
> build all back-ends that use GlobalISel and then let it running for a
> while. I don't think we're self-hosting on any architecture right
now,
> nor we're passing the test-suites, so the bot will probably just be a
> simple "make check-all", which is more than enough for the time
being.
This is in place for months:
http://lab.llvm.org:8080/green/job/clang-stage1-cmake-RA-globalisel/

> 
> As soon as we can self-host and pass the test-suite, I think we can
> enable it by default.
I don’t understand why it is an interesting property to be able to pass the
test-suite to have `make check` testing it.
> 
> Some comments inline...
> 
> 
>> * Upsides *
>> 
>> For people developing on GlobalISel, it will:
>> - Simplify the CMake command to type :)
>> - Build/Test GlobalISel on all the LLVM bots
> 
> This is already covered by the people developing Global ISel on the
> different architectures. As an experimental feature, I'm ok with the
> round time of "ARM developers picking up x86 code breaking their
> stuff".
> 
> 
>> For people not developing on GlobalISel, it will:
>> - Test that GlobalISel still works with your changes (make check will
test that for you)
> 
> A separate buildbot will do that for you.
> 
> 
>> - Allow you to play with it!
> 
> The cost of adding one CMake option is really small.
> 
> 
>> * Downsides *
>> 
>> For people developing on GlobalISel, it will:
>> - Leave the status as it is now, meaning that mainly only people
working on GlobalISel look at the failures of GlobalISel specific bots
> 
> If we already have those, I don't see why we need more. We're not
> self-hosting not passing the test-suite. the "check-all" tests
should
> not matter which platform they run.
The `check-all` are *not* testing GlobalISel right now because GlobalISel is not
*built* on the bots. This is *exactly* the point of the proposal: have `make`
building the global isel code on every platform, and have `make check` testing
the lit-tests for GlobalISel on every platform.
This is *not* about enabling GlobalISel  *by default* on any target AFAIK.


— 
Mehdi

> 
> 
>> For people not developing for GlobalISel, it will:
>> - Increase the compile time since the GlobalISel framework and the
related target specific parts will have to be built
>> - Increase the size of the binary (depending on what backend you pull)
>> - Require the setting of an additional CMake variable to disable it
(-DLLVM_BUILD_GLOBAL_ISEL=OFF)
> 
> This has a higher impact on slow bots, such as ARM and MIPS, and it's
> not a trivial amount of time and space.
> 
> On previous occasions (for example integrated-asm, lld, vecorizer), we
> have used the "self-host + test-suite passing" model and it
worked
> well. Meaning, before it passes those two, it should have special
> buildbots, after, it can be turned on by default.
> 
> To move to the new technology we need an additional step, which is to
> fix most remaining bugs, which we have done for the vectorizer and
> integrated-asm, but not lld, and that's why the two first are not only
> built by default, but enabled by default, while the latter isn't.
> 
> I want Global ISel to be a success as much as you do, but I'd rather
> go through a smooth path, even if it takes a bit longer.
> 
> cheers,
> --renato
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev

Renato Golin via llvm-dev

2017-Jan-14 19:32 UTC

head link

[llvm-dev] RFC: Building GlobalISel by default

On 14 January 2017 at 18:53, Mehdi Amini <mehdi.amini at apple.com>
wrote:> I don’t see why having global ISel is supposed to increase the rate. `make
check` will runs on the committer machine with GlobalISel enabled.
You assume the environments are the same, so the same bugs will be
caught on an x86 host or on an ARM host. If that was true, than most
of the breakages we have would just not happen.

To make matters worse, if a breakage is unknown to the committer (ie,
due to some GISel idiosyncrasy), the committer will not worry about it
and will leave the bot red, which increases the load on bot
maintainers.

> But that’s totally unrelated to the bots, since it’ll fails on the
committer machine as well.
If that's unrelated, than why is it a plus that we have all other
buildbots building it instead of just the one Apple has it already?

> This is *not* about enabling GlobalISel  *by default* on any target AFAIK.
I got that. That is a second step, for later. My point is that, if a
feature is not good enough to be used anywhere, then it can remain as
experimental, since the cost is just adding a flag to enable the
back-end.

I can easily do that on one of my bots, just for that purpose, but I
wouldn't feel comfortable doing that to all my bots at once (or having
to use special flags to remove experimental features).

--renato

Mehdi Amini via llvm-dev

2017-Jan-14 20:03 UTC

head link

[llvm-dev] RFC: Building GlobalISel by default

> On Jan 14, 2017, at 11:32 AM, Renato Golin <renato.golin at
linaro.org> wrote:
> 
> On 14 January 2017 at 18:53, Mehdi Amini <mehdi.amini at apple.com>
wrote:
>> I don’t see why having global ISel is supposed to increase the rate.
`make check` will runs on the committer machine with GlobalISel enabled.
> 
> You assume the environments are the same, so the same bugs will be
> caught on an x86 host or on an ARM host. If that was true, than most
> of the breakages we have would just not happen.
> 
> To make matters worse, if a breakage is unknown to the committer (ie,
> due to some GISel idiosyncrasy), the committer will not worry about it
> and will leave the bot red, which increases the load on bot
> maintainers.
> 
> 
>> But that’s totally unrelated to the bots, since it’ll fails on the
committer machine as well.
> 
> If that's unrelated, than why is it a plus that we have all other
> buildbots building it instead of just the one Apple has it already?
Because everyone on their local machine will get the failure before pushing
their patch.
And it’s not just the people working on GlobalISel that will have to cleanup
behind.


> 
> 
>> This is *not* about enabling GlobalISel  *by default* on any target
AFAIK.
> 
> I got that. That is a second step, for later. My point is that, if a
> feature is not good enough to be used anywhere, then it can remain as
> experimental, since the cost is just adding a flag to enable the
> back-end.
This is a very similar situation to any experimental backend: we turn them on by
default when they are stable enough to not disturb the other features in LLVM.
There has never be a constraint about “being able to bootstrap” or “run the
test-suite”, I don’t see any reason to apply such criteria to GlobalISel, that
seems totally arbitrary to me.



> 
> I can easily do that on one of my bots, just for that purpose, but I
> wouldn't feel comfortable doing that to all my bots at once (or having
> to use special flags to remove experimental features).
With this state of mind, Green dragon would build X86 and ARM and disable all
the other targets, because... why should we care about these? They only make our
build less “stable” and “take time”.
 

— 
Mehdi

Michael Kuperstein via llvm-dev

2017-Jan-15 04:24 UTC

head link

[llvm-dev] RFC: Building GlobalISel by default

On 14 January 2017 at 10:53, Mehdi Amini via llvm-dev <
llvm-dev at lists.llvm.org> wrote:
>
> > On Jan 14, 2017, at 4:57 AM, Renato Golin via llvm-dev <
> llvm-dev at lists.llvm.org> wrote:
> >
> > On 14 January 2017 at 01:54, Quentin Colombet via llvm-dev
> > <llvm-dev at lists.llvm.org> wrote:
> >> Now, four backends (if I am counting right: X86, ARM, AArch64,
AMDGPU)
> are working on bringing-up GlobalISel, I’d like to switch the default of
> the LLVM_BUILD_GLOBAL_ISEL variable in CMake, such that the framework gets
> built by default.
> >
> > Hi Quentin,
> >
> > I'm not extremely confident in a full switch right now, mainly due
to
> > Michael's concerns. The ARM port is in its initial stage and there
> > could still be many refactorings and destructive changes. I'm not
> > foreseeing a cross between the isel tests, but every failure makes the
> > bots red, and if Global ISel incurs in more failures due to its stage
> > in the development cycle, we'll start seeing the bots more red
than
> > we'd like. Right now, it's already very likely that you'll
receive bot
> > emails on commits, I don't want to increase that.
>
> I don’t see why having global ISel is supposed to increase the rate. `make
> check` will runs on the committer machine with GlobalISel enabled.
>
> The good question from Michael is IMO "if my (non-GlobalISel) change
> breaks GlobalISel, how likely is it to be a bug in my code vs. a bug in
> GlobalISel?”
>
> But that’s totally unrelated to the bots, since it’ll fails on the
> committer machine as well.
>
>To emphasize this - I don't think the bots should be a blocker, the issue
that bothers me is local developer workflow.

Regarding "we shouldn't enable it because it will make the bots
slower" -
well, yes, but that's just postponing the inevitable. We will enable
GlobalISel eventually, and there will probably be a very long time-frame
during which both are enabled concurrently. If we expect to have some magic
solution to make overall testing much faster in the near future, it may
make sense to delay until it's implemented. But I'm not aware of
anything
like that.

As to it making the bots "redder" because of failures in the newer
GlobalISel ports - if devs don't see a lot of local breakage, then the
effect on the bots shouldn't be big either. And if they do, I think it's
a
non-starter anyway.
Still, as a strawman proposal - would it make sense to add
LLVM_BUILD_GLOBAL_ISEL_EXPERIMENTAL,
or something of that sort? The more mature ports will build under
LLVM_BUILD_GLOBAL_ISEL (which will be enabled by default), the less mature
ones only under LLVM_BUILD_GLOBAL_ISEL_EXPERIMENTAL (which won't be), and
moving a port from GLOBAL_ISEL_EXPERIMENTAL to GLOBAL_ISEL would be
equivalent to marking a regular backend non-experimental. We can start with
just AArch64 in the non-experimental category, and move the others as they
become more stable.
Or is this too much complexity for what we gain?

Michael
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20170114/da8f2558/attachment.html>

Renato Golin via llvm-dev

2017-Jan-15 13:14 UTC

head link

[llvm-dev] RFC: Building GlobalISel by default

On 15 January 2017 at 04:24, Michael Kuperstein
<michael.kuperstein at gmail.com> wrote:> Regarding "we shouldn't enable it because it will make the bots
slower" -
> well, yes, but that's just postponing the inevitable. We will enable
> GlobalISel eventually, and there will probably be a very long time-frame
> during which both are enabled concurrently.
No magic solutions expected. I just don't want to crank all bots up to
11 on day one. We start with the fast bots, when they're happy, we
move on until we're confident that the whole thing is stable, then we
make it build by default.

There are two problems here:

1. Environmental issues.

We do find more of those than we'd hope. Different compilers, linkers,
libraries that produce not only crashes, but unstable builds and
unpredictable or irreproducible test failures. The environment on our
bots are wildly different (on purpose) and dealing with different
unpredictable issues all at once is something that I'm not willing to
take on lightly. Due to the nature of buildbots and some of our
hardware and the fact that virtually all ARM/AArch64 bots are ours,
this will be something that Linaro will pay the price in full.

2. Experimental nature.

If the past serves as a guide, experimental features at the core of
LLVM will be largely independent until they start to merge, when they
can become a huge issue and be in merging state for months, if not
years, or worse, never be merged. Right now, we're not at that state,
we're still largely independent, so your worries regarding the
development tests (which also hit buildbots at a different pace
because of the environment issues above), will be lower for now, but
will increase massively soon enough.

There is the argument that having the bots building means we'll have
*less* teething issues and the people building GlobalISel (which also
includes Linaro) will have a lot less work. But there's a cost to be
paid, and that cost is a lot higher on the ARM/AArch64 side than x86,
just because most developers use an x86 as their main machine.

> Still, as a strawman proposal - would it make sense to add
> LLVM_BUILD_GLOBAL_ISEL_EXPERIMENTAL, or something of that sort? The more
> mature ports will build under LLVM_BUILD_GLOBAL_ISEL (which will be enabled
> by default), the less mature ones only under
> LLVM_BUILD_GLOBAL_ISEL_EXPERIMENTAL (which won't be), and moving a port
from
> GLOBAL_ISEL_EXPERIMENTAL to GLOBAL_ISEL would be equivalent to marking a
> regular backend non-experimental. We can start with just AArch64 in the
> non-experimental category, and move the others as they become more stable.
> Or is this too much complexity for what we gain?
It is.

We already separate that by not building any other back-end than ARM
and AArch64. So I think we can do that without the added complexity.

I'm not trying to block the move, I'm just trying to make it smother
and raise awareness that the cost we pay for any large move is not
well distributed.

For the past few months I have reverted patches that only broke the
ARM bots and it was clear on most of those instances which patch was
to blame, but people didn't bother and after 8, 12 sometimes 24 hours,
I reverted. This is not a nice place to be in, and given that
resurgence lately, I am honestly afraid adding more things will make
it worse.

A red bot can't catch new errors. A slow red bot can go on red for
weeks (and we have seen it numerous times) because once you fix bug
#1, #2 has made #3 appear and not be warned.

This is not everyone, and I do appreciate all the people that helped
us and worried about the breakages, but it's a trend that is
increasing, not decreasing. Our team is too small to cope with further
increases, so if the community is willing to do its part, I think we
can make it work.

But it has to be slow and steady.

cheers,
--renato

Seemingly Similar Threads

Search for more apparently analagous threads

llvm dev - Jan 2017 - RFC: Building GlobalISel by default

[llvm-dev] RFC: Building GlobalISel by default

[llvm-dev] RFC: Building GlobalISel by default

[llvm-dev] RFC: Building GlobalISel by default

[llvm-dev] RFC: Building GlobalISel by default

[llvm-dev] RFC: Building GlobalISel by default

Seemingly Similar Threads