thr3ads.net - llvm dev - [llvm-dev] New x86-64 micro-architecture levels [Jul 2020]

Florian Weimer via llvm-dev

2020-Jul-21 18:04 UTC

[llvm-dev] New x86-64 micro-architecture levels

* Premachandra Mallappa:
> [AMD Public Use]
>
> Hi Florian,
>
>> I'm including a proposal for the levels below.  I use single letters
>> for them, but I expect that the concrete implementation of this
>> proposal will use names like “x86-100”, “x86-101”, like in the glibc
>> patch referenced above.  (But we can discuss other approaches.)
>
> Personally I am not a big fan of this, for 2 reasons 
> 1. uses just x86 in name on x86_64 as well
That's deliberate, so that we can use the same x86-* names for 32-bit
library selection (once we define matching micro-architecture levels
there).

GCC has -m32 -march=x86-64 for K8 without 3DNow! (essentially the shared
x86-64/EM64T baseline), but I find this a bit confusing.
> 2. 100/101 not very intuitive
Any suggestions?  The advantage is that these numbers show a strong
preference ordering.  They also do not make false suggestions about
feature sets: if we named Level C "x86-avx2", it would still be wrong
for glibc to load libraries found in that directory just because a
system has AVX2 support, because the libraries might also need FMA,
based on the Level C definition.  On the GCC side, it avoids confusion
between -mavx2 and -march=x86-avx2.

If numbers are out, what should we use instead?
x86-sse4, x86-avx2, x86-avx512?  Would that work?
>> * Level A
> ...
>> * Level B
>> This step is so small that it probably can be dropped, unless the
benefits from using VEX encoding are truly significant.
>
> Yes, Agree, the delta is too small, can be clubbed into A or C.
Let's merge Level B into level C then?
>> * Level C
>> * Level D
>
> Others are inline with the what we expect as logical grouping.
Thanks.
> Also we would also like to have dynamic loader support for "zen" /
> "zen2" as a version of "Level D" and takes preference over Level D,
> which may have super-optimized libraries from AMD or other vendors.
*That* shouldn't be too hard to implement if we can nail down the
selection criteria.  Let's call this Zen-specific Level C x86-zen-avx2
for the sake of exposition.

What's going to be difficult is the choice for a hypothetical Zen
successor that's compatible feature-flag-wise with Level D.

Basically, there are two choices here:

  * Level D wins because it's the more powerful ISA.
  * x86-zen-avx2 wins because it has the Zen architecture optimizations.

There's also a related issue with Level C vs x86-zen-avx2 depending on
how we implement the Zen detection for AMD family numbers in the glibc
dynamic linker.  What do I mean by this?  Suppose glibc detects that this
is a Level C capable Zen-type CPU, but it's not one of the family/model
numbers that were hard-coded into the glibc sources.  What should we do
then?  Should we still prefer the x86-zen-avx2 library over the Level C
library?
> These libraries are expected to be optimized according to
> micro-architectural details, not just ISA.
If it's supposed to be generally useful, we really need to document the
selection criteria for the subdirectory and make sure that it matches
what these libraries actually require at run time in terms of ISA.

I want to avoid two things here specifically: A hardware upgrade results
in crashes because we incorrectly load an incompatible library.  And, if
possible: A hardware upgrade (or kernel/hypervisor upgrade that exposes
more of the actual hardware) causes us to drop optimizations, so that
users experience a performance regression.

With the levels I proposed, these aspects are covered.  But if we start
to create vendor-specific forks in the feature progression, things get
complicated.

Do you think we need to figure this out in this iteration?  If yes, then
I really need a semi-formal description of the selection criteria for
this x86-zen-avx2 directory, so that I can pass it along with my psABI
proposal.

Thanks,
Florian

Dongsheng Song via llvm-dev

2020-Jul-22 01:31 UTC

[llvm-dev] New x86-64 micro-architecture levels

I fully agree these names (100/101, A/B/C/D) are not very intuitive, I
recommend using isa tags by year (e.g. x64_2010, x64_2014) like the
python's platform tags (e.g. manylinux2010, manylinux2014).

Jan Beulich via llvm-dev

2020-Jul-22 07:48 UTC

[llvm-dev] New x86-64 micro-architecture levels

On 21.07.2020 20:04, Florian Weimer wrote:
> * Premachandra Mallappa:
> 
>> [AMD Public Use]
>>
>> Hi Florian,
>>
>>> I'm including a proposal for the levels below.  I use single letters
>>> for them, but I expect that the concrete implementation of this
>>> proposal will use names like “x86-100”, “x86-101”, like in the glibc
>>> patch referenced above.  (But we can discuss other approaches.)
>>
>> Personally I am not a big fan of this, for 2 reasons 
>> 1. uses just x86 in name on x86_64 as well
> 
> That's deliberate, so that we can use the same x86-* names for 32-bit
> library selection (once we define matching micro-architecture levels
> there).
While indeed I did understand it to be deliberate, in the light of
64-bit only ISA extensions (like AMX, and I suspect we're going to
see more) I nevertheless think Premachandra has a point here.

Jan

Florian Weimer via llvm-dev

2020-Jul-22 08:44 UTC

[llvm-dev] New x86-64 micro-architecture levels

* Dongsheng Song:
> I fully agree these names (100/101, A/B/C/D) are not very intuitive, I
> recommend using isa tags by year (e.g. x64_2010, x64_2014) like the
> python's platform tags (e.g. manylinux2010, manylinux2014).
I started out with a year number, but that was before there was Level A.
Too many new CPUs only fall under level A unfortunately because they do
not even have AVX.  This even applies to some new server CPU designs
released this year.

I'm concerned that putting a year into the level name suggests that
everything main-stream released after that year supports that level, and
that's not true.  I think for manylinux, it's different, and it actually
works out there.  No one is building a new GNU/Linux distribution that
is based on glibc 2.12 today, for example.  But not so much for x86
CPUs.

If you think my worry is unfounded, then a year-based approach sounds
compelling.

Thanks,
Florian

Florian Weimer via llvm-dev

2020-Jul-22 10:34 UTC

[llvm-dev] New x86-64 micro-architecture levels

* Jan Beulich:
> On 21.07.2020 20:04, Florian Weimer wrote:
>> * Premachandra Mallappa:
>> 
>>> [AMD Public Use]
>>>
>>> Hi Florian,
>>>
>>>> I'm including a proposal for the levels below.  I use single letters
>>>> for them, but I expect that the concrete implementation of this
>>>> proposal will use names like “x86-100”, “x86-101”, like in the glibc
>>>> patch referenced above.  (But we can discuss other approaches.)
>>>
>>> Personally I am not a big fan of this, for 2 reasons 
>>> 1. uses just x86 in name on x86_64 as well
>> 
>> That's deliberate, so that we can use the same x86-* names for 32-bit
>> library selection (once we define matching micro-architecture levels
>> there).
>
> While indeed I did understand it to be deliberate, in the light of
> 64-bit only ISA extensions (like AMX, and I suspect we're going to
> see more) I nevertheless think Premachandra has a point here.
Let me explain how I ended up there.  Maybe I'm wrong.

Previously, I observed that it is difficult to set LD_PRELOAD and
LD_LIBRARY_PATH on combined x86-64/i386 systems, so that the right
libraries are loaded for both variants, and users aren't confused by
dynamic linker warning messages.  On some systems, it is possible to use
dynamic string tokens ($LIB), but not all.

Eventually, it will be possible to add and restrict glibc-hwcaps
subdirectories by setting an environment variable.  The original patch
series only contains ld.so command line options because I wanted to
avoid a discussion about the precise mechanism for setting the
environment variable (current glibc has two approaches).  But the desire
to provide this functionality is there: for adding additional
glibc-hwcaps subdirectories to be searched first, and for restricting
selection to a subset of the built-in (automatically-selected)
subdirectories.

I was worried that we would run into the same problem as with
LD_PRELOAD, where x86-64 and i386 binaries may have different
requirements.  I wanted to minimize the conflict by sharing the names
(eventually, once we have 32-bit variants).

But thinking about this again, I'm not sure if my worry is warranted.
The main selection criterion is still the library load path, and that is
already provided by some different means (e.g. $LIB).  Within the
library path, there is the glibc-hwcaps subdirectory, but since it is
nested under a specific library path subdirectory (determined by the
architecture), adding subdirectories to be searched which do not exist
on the file system, or suppressing directories which would not be
searched in the first place, is not a problem.  The situation is
completely benign and would not warrant any error message from the
dynamic loader.

If this analysis is correct, there is no reason to share the
subdirectory names between x86-64 and i386 binaries, and we can put “64”
somewhere in the x86-64 strings.
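
The argument above can be illustrated with a small sketch. It is not glibc code: the function `hwcaps_search_dirs` and the subdirectory names `x86-64-c` and `i386-c` are hypothetical, and the only assumption carried over from the thread is that glibc-hwcaps subdirectories are nested under an architecture-specific library directory. The same list of requested names is applied to a 64-bit and a 32-bit library directory, and names that do not exist under a given directory simply drop out.

```python
import os
import tempfile


def hwcaps_search_dirs(lib_dir, wanted):
    """Return the existing glibc-hwcaps subdirectories of lib_dir, in order.

    Hypothetical helper for illustration only; it does not mirror ld.so.
    """
    candidates = [os.path.join(lib_dir, "glibc-hwcaps", name) for name in wanted]
    # A requested name with no directory on disk is skipped silently.
    return [d for d in candidates if os.path.isdir(d)]


with tempfile.TemporaryDirectory() as root:
    lib64 = os.path.join(root, "lib64")  # stands in for the x86-64 library path
    lib32 = os.path.join(root, "lib")    # stands in for the i386 library path
    os.makedirs(os.path.join(lib64, "glibc-hwcaps", "x86-64-c"))
    os.makedirs(os.path.join(lib32, "glibc-hwcaps", "i386-c"))

    # One shared setting, seen by both 64-bit and 32-bit processes.
    wanted = ["x86-64-c", "i386-c"]

    found64 = [os.path.relpath(d, root) for d in hwcaps_search_dirs(lib64, wanted)]
    found32 = [os.path.relpath(d, root) for d in hwcaps_search_dirs(lib32, wanted)]

print(found64)
print(found32)
```

Each library directory ends up with only its own subdirectory, so distinct names for the two architectures do not conflict in this model.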

The remaining issue is the - vs _ issue.  I think GCC currently uses
“x86-64” in places that are not part of identifiers or target triplets.
Richard mentioned “x86_64-” as a potential choice.  Would it be too
awkward to have “-march=x86_64-…”?

Thanks,
Florian

Mallappa, Premachandra via llvm-dev

2020-Jul-22 16:45 UTC

[llvm-dev] New x86-64 micro-architecture levels

[AMD Public Use]

> That's deliberate, so that we can use the same x86-* names for 32-bit
> library selection (once we define matching micro-architecture levels there).
Understood.
> If numbers are out, what should we use instead?
> x86-sse4, x86-avx2, x86-avx512?  Would that work?
Yes please, I think we have to choose somewhere, above would be more descriptive
> Let's merge Level B into level C then?
I would vote for this.
>> Also we would also like to have dynamic loader support for "zen" /
>> "zen2" as a version of "Level D" and takes preference over Level D,
>> which may have super-optimized libraries from AMD or other vendors.
> *That* shouldn't be too hard to implement if we can nail down the
> selection criteria.  Let's call this Zen-specific Level C x86-zen-avx2 for
> the sake of exposition.
Some way of specifying a superset of "level C", that "C" will capture fully.

Zen/zen2 takes precedence over Level C, but not Level D, but falls back to
"Level C" or "x86-avx2" but not "x86-avx".

I think it is better to run a x86-zen on a x86-avx2 or x86-avx compared to
running on a base x86_64 config.
> With the levels I proposed, these aspects are covered.  But if we start to
> create vendor-specific forks in the feature progression, things get complicated.
I am not strictly proposing OS vendors should create/maintain this (it would be
nice if they did), but support for cached loading via a system-wide config. This
directory may/will contain a subset of system libs.
> Do you think we need to figure this out in this iteration?  If yes, then I
> really need a semi-formal description of the selection criteria for this
> x86-zen-avx2 directory, so that I can pass it along with my psABI proposal.
Preference level (decreasing order) (I can only speak for AMD, others please
pitch in)
- system wide config to override (in this case x86-zen)
- x86-avx2
- x86-sse4 (or avx, based on how we name and merge Level B)
- default x86_64
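
The preference list above could be sketched as follows. This is only an illustration: `search_order` and `GENERIC_ORDER` are hypothetical names, the level names are the placeholder names used in this thread, and the sketch deliberately does not check whether the override directory is ISA-compatible with the running CPU, which is the open question raised earlier in the thread.

```python
# Generic levels, best first. Placeholder names from the thread.
GENERIC_ORDER = ["x86-avx2", "x86-sse4"]


def search_order(supported, override=None):
    """Build the directory preference list in decreasing order.

    supported: set of generic level names the CPU qualifies for.
    override:  optional name from a system-wide config (e.g. "x86-zen").
    """
    order = []
    if override is not None:
        order.append(override)  # system-wide config wins
    order.extend(name for name in GENERIC_ORDER if name in supported)
    order.append("x86_64")      # the default baseline is always last
    return order


print(search_order({"x86-avx2", "x86-sse4"}, override="x86-zen"))
print(search_order({"x86-sse4"}))
```

On a machine without the override configured, the list degrades to the generic levels followed by the baseline.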

Michael Matz via llvm-dev

2020-Jul-23 12:44 UTC

[llvm-dev] New x86-64 micro-architecture levels

Hello,

On Wed, 22 Jul 2020, Mallappa, Premachandra wrote:
> > That's deliberate, so that we can use the same x86-* names for 32-bit
> > library selection (once we define matching micro-architecture levels
> > there).
> 
> Understood.
> 
> > If numbers are out, what should we use instead?
> > x86-sse4, x86-avx2, x86-avx512?  Would that work?
> 
> Yes please, I think we have to choose somewhere, above would be more 
> descriptive
And IMHO that's exactly the problem.  These names should _not_ be 
descriptive, because any description invokes a wrong feeling of precision.  
E.g. what Florian already mentioned: sse4 - does it imply 4.1 and 4.2, or 
avx512: what of F, CD, ER, PF, VL, DQ, BW, IFMA, VBMI, 4VNNIW, 4FMAPS, 
VPOPCNTDQ, VNNI, VBMI2, BITALG, VP2INTERSECT, GFNI, VPCLMULQDQ, VAES does 
that one imply (rhetorical question, list shown just to make the silliness 
explicit).

Regarding precision: I think we should rule out any mathematically correct 
scheme, e.g. one in which every ISA subset gets an index and the directory 
name contains a hexnumber constructed by bits with the corresponding index 
being one or zero, depending on if the ISA subset is required or not: I 
think we're currently at about 40 ISA subsets, and hence would end up in 
names like x86-32001afff and x86-22001afef (the latter missing two subsets 
compared to the former).
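
For readers who want to see what such a bitmask scheme would look like, here is a minimal sketch. The helper `name_from_indices` and the assignment of indices to ISA subsets are hypothetical; only the two example names come from the message above. Comparing them shows that the second name lacks exactly two of the bits set in the first.

```python
def name_from_indices(indices):
    """Build a directory name from the indices of the required ISA subsets."""
    mask = 0
    for i in indices:
        mask |= 1 << i
    return f"x86-{mask:x}"


a = int("32001afff", 16)  # from the name x86-32001afff
b = int("22001afef", 16)  # from the name x86-22001afef

missing = a & ~b          # bits required by the former but not by the latter
missing_indices = [i for i in range(missing.bit_length()) if (missing >> i) & 1]

print(missing_indices)
print(name_from_indices(missing_indices))
```

The names are exact but tell a human reader nothing without a lookup table, which is the point being made.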

No, IMHO the non-vendor names should be non-descript, and either be 
numbers or characters, of which I would vote for characters, i.e. A, B, C.  
Obviously, as already mentioned here, the mapping of level to feature set 
needs to be described in documentation somewhere, and should be maintained 
by either glibc, glibc/gcc/llvm or psABI people.

I don't have many suggestions about vendor names, be they ISA-subset 
market names, or core names or company names.  I will just note that using 
such names has led to an explosion of number of names without very good 
separation between them.  As long as we're only talking about -march= 
cmdline flags that may have been okay, if silly, but under this proposal 
every such name is potentially a subdirectory containing many shared 
libraries, and one that potentially needs to be searched at every library 
lookup in the dynamic linker; so it's prudent to limit the size of this 
name set as well.

As for which subsets should or shouldn't be required in which level: I 
think the current suggestions all sound good, ultimately it's always going 
to be some compromise.

Ciao,
Michael.

Florian Weimer via llvm-dev

2020-Jul-28 15:54 UTC

[llvm-dev] New x86-64 micro-architecture levels

* Premachandra Mallappa:
> [AMD Public Use]
>>> Also we would also like to have dynamic loader support for "zen" /
>>> "zen2" as a version of "Level D" and takes preference over Level D,
>>> which may have super-optimized libraries from AMD or other vendors.
>
>> *That* shouldn't be too hard to implement if we can nail down the
>> selection criteria.  Let's call this Zen-specific Level C x86-zen-avx2
>> for the sake of exposition.
>
> Some way of specifying a superset of "level C", that "C" will capture
> fully.
>
> Zen/zen2 takes precedence over Level C, but not Level D, but falls
> back to "Level C" or "x86-avx2" but not "x86-avx".
>
> I think it is better to run a x86-zen on a x86-avx2 or x86-avx
> compared to running on a base x86_64 config.
We discussed this off-list for a bit and concluded that we do not want
to address this as part of this proposal.

I went ahead and created a merge request against the x86-64 psABI
supplement:

  <https://gitlab.com/x86-psABIs/x86-64-ABI/-/merge_requests/8>

I used x86-64-v2 etc. as the level names, picking up the suggestion to
use x86-64 there.  I think we don't need to share names with 32-bit (if
that ever happens), as explained here:

  <https://sourceware.org/pipermail/libc-alpha/2020-July/116536.html>

There are only three new levels (level B was merged into level C).

I tried to make precise the meaning of the levels by matching them to
CPU features, based on their CPUID detection logic.  It's somewhat
complicated, but I think it's within reason for the task at hand.
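
As a rough illustration of matching levels to CPU features, the sketch below picks the highest level whose requirements, and those of every level beneath it, are all met. The feature lists are an assumption based on the psABI levels as later published, not something stated in this thread; `highest_level` is a hypothetical helper, and the baseline is simply called "x86-64" here. It works on feature names rather than on real CPUID queries.

```python
# Per-level feature requirements (assumed, see the note above). Levels are
# cumulative: a level only counts if all lower levels are satisfied too.
LEVELS = [
    ("x86-64-v2", {"CMPXCHG16B", "LAHF-SAHF", "POPCNT", "SSE3",
                   "SSE4_1", "SSE4_2", "SSSE3"}),
    ("x86-64-v3", {"AVX", "AVX2", "BMI1", "BMI2", "F16C", "FMA",
                   "LZCNT", "MOVBE", "OSXSAVE"}),
    ("x86-64-v4", {"AVX512F", "AVX512BW", "AVX512CD", "AVX512DQ",
                   "AVX512VL"}),
]


def highest_level(features):
    """Return the highest level name fully supported by the feature set."""
    level = "x86-64"  # baseline, assumed to be present
    for name, required in LEVELS:
        if not required <= features:
            break     # a gap at this level rules out all higher levels
        level = name
    return level


full_v3 = LEVELS[0][1] | LEVELS[1][1]
avx2_without_fma = full_v3 - {"FMA"}

print(highest_level(full_v3))
print(highest_level(avx2_without_fma))
```

The second call shows the case discussed earlier in the thread: a CPU with AVX2 but without FMA does not qualify for the AVX2-bearing level, so a feature-based name such as "x86-avx2" would have been misleading.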

Thanks,
Florian
