thr3ads.net - llvm dev - [llvm-dev] [RFC][AArch64] Make -mcpu=generic schedule for an in-order core [Oct 2021]


David Green via llvm-dev

2021-Oct-04 07:40 UTC

[llvm-dev] [RFC][AArch64] Make -mcpu=generic schedule for an in-order core

Hello folks,

We would like to start pushing -mcpu=generic for AArch64 towards enabling a set
of features that is believed to be beneficial in general - that improve
performance for some CPUs without hurting it on any others. A blend of the
performance options hopefully beneficial to all CPUs.

The largest part of that is enabling in-order scheduling using the Cortex-A55
schedule model. This is similar to the Arm backend change from eecb353d0e25ba
which made -mcpu=generic perform in-order scheduling using the Cortex-A8
scheduling model.

The idea is that in-order CPUs require the most help in instruction
scheduling, whereas out-of-order CPUs can for the most part out-of-order
schedule around different codegen. Our benchmarking suggests that hypothesis
holds, with in-order performance benefiting from the scheduling by between 1%
and 4% geomean. Out-of-order performance was quite noisy and the results were
within the noise margins, tending towards a slight improvement in general.
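
As an aside, the hypothesis above can be sketched with a toy model. The Python
below is only an illustration, not anything from the patch or from LLVM: the
`Instr` type, the two cost functions, and the latencies (3 cycles for a load,
1 for an add) are all assumptions made up for this example and do not come
from the Cortex-A55 scheduling model. It models a single-issue in-order core,
which stalls whenever the next instruction's operands are not ready, and an
idealised single-issue out-of-order core, which issues the oldest ready
instruction each cycle.

```python
from typing import NamedTuple, Tuple


class Instr(NamedTuple):
    """Hypothetical instruction record used only for this illustration."""
    name: str                    # name of the value this instruction produces
    latency: int                 # cycles until the result is available (made up)
    deps: Tuple[str, ...] = ()   # names of the values this instruction reads


def in_order_cycles(program):
    """Cycles for a single-issue in-order core: issue strictly in program order."""
    ready_at = {}    # value name -> cycle at which the value becomes available
    next_issue = 0   # earliest cycle at which the next instruction may issue
    finish = 0
    for ins in program:
        # Stall until every operand is ready; later instructions wait behind us.
        issue = max([next_issue] + [ready_at[d] for d in ins.deps])
        ready_at[ins.name] = issue + ins.latency
        next_issue = issue + 1
        finish = max(finish, ready_at[ins.name])
    return finish


def out_of_order_cycles(program):
    """Cycles for an idealised single-issue out-of-order core with an
    unbounded window: each cycle, issue the oldest instruction whose
    operands are ready."""
    ready_at = {}
    pending = list(program)
    cycle = 0
    finish = 0
    while pending:
        for ins in pending:
            if all(d in ready_at and ready_at[d] <= cycle for d in ins.deps):
                ready_at[ins.name] = cycle + ins.latency
                finish = max(finish, ready_at[ins.name])
                pending.remove(ins)
                break  # single issue: at most one instruction per cycle
        cycle += 1
    return finish


# Same four instructions in two orders. Latencies are invented for the example.
naive = [
    Instr("a", 3),           # load a
    Instr("b", 1, ("a",)),   # b = a + 1, placed right after its load
    Instr("c", 3),           # load c
    Instr("d", 1, ("c",)),   # d = c + 1
]
scheduled = [naive[0], naive[2], naive[1], naive[3]]  # both loads hoisted first

for label, prog in (("naive", naive), ("scheduled", scheduled)):
    print(label, in_order_cycles(prog), out_of_order_cycles(prog))
```

With these made-up latencies the in-order model takes 8 cycles for the naive
order and 5 for the scheduled one, while the out-of-order model takes 5 cycles
for both. Real cores are far more complex, so this only shows the shape of the
effect, not its size.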

When specifying an Apple target, clang will set "-target-cpu apple-a7"
on the command line, so Apple targets should not be affected by this change
when running from clang. This also doesn't enable more runtime unrolling like
-mcpu=cortex-a55 does; it only changes the schedule used.

There is a patch to make the change in https://reviews.llvm.org/D110830, with
extra details about performance changes and all the tests that are updated.

Let us know if you have comments.

Thanks
Dave

Renato Golin via llvm-dev

2021-Oct-04 09:08 UTC

[llvm-dev] [RFC][AArch64] Make -mcpu=generic schedule for an in-order core

On Mon, 4 Oct 2021 at 08:43, David Green via llvm-dev <
llvm-dev at lists.llvm.org> wrote:
> Hello folks,
>
> We would like to start pushing -mcpu=generic for AArch64 towards enabling
> a set of features that is believed to be beneficial in general - that
> improve performance for some CPUs without hurting it on any others. A
> blend of the performance options hopefully beneficial to all CPUs.
>
Hi David,

This is the usual LLVM definition of "generic", so working on that goal is
always good.

> The largest part of that is enabling in-order scheduling using the
> Cortex-A55 schedule model. This is similar to the Arm backend change from
> eecb353d0e25ba which made -mcpu=generic perform in-order scheduling using
> the Cortex-A8 scheduling model.
>
I think this makes sense because the A55 scheduling model is more likely to
benefit the chips produced nowadays than the A8's.

> When specifying an Apple target, clang will set "-target-cpu apple-a7" on
> the command line, so should not be affected by this change when running
> from clang. This also doesn't enable more runtime unrolling like
> -mcpu=cortex-a55 does, only changing the schedule used.

Thinking out loud, what do people think of creating an additional "ooo"
target? So, "generic" is the same as "in-order", but the "ooo" (or
"unordered", whatever) would pick a base OOO target, like A57, A72, etc.

A few years ago, when I was doing benchmarks for OpenBLAS changes on Arm, I
realised that picking a base OOO target like that was beneficial to most
targets, often only beaten by specifying the correct target.

cheers,
--renato
