thr3ads.net - similar to: "New x86-64 micro-architecture levels"

Displaying 20 results from an estimated 20000 matches similar to: "New x86-64 micro-architecture levels"

2020 Jul 13

New x86-64 micro-architecture levels

On Fri, Jul 10, 2020 at 11:45 PM H.J. Lu via Gcc <gcc at gcc.gnu.org> wrote: > > On Fri, Jul 10, 2020 at 10:30 AM Florian Weimer <fweimer at redhat.com> wrote: > > > > Most Linux distributions still compile against the original x86-64 > > baseline that was based on the AMD K8 (minus the 3DNow! parts, for Intel > > EM64T compatibility). > > > >

New x86-64 micro-architecture levels

2020 Jul 21

New x86-64 micro-architecture levels

* Premachandra Mallappa: > [AMD Public Use] > > Hi Floarian, > >> I'm including a proposal for the levels below. I use single letters for them, but I expect that the concrete implementation of this proposal will use >> names like “x86-100”, “x86-101”, like in the glibc patch referenced above. (But we can discuss other approaches.) > > Personally I am not a big

Generating object files more efficiently

2019 Mar 23

Generating object files more efficiently

It is my actual target architecture ________________________________ From: Doerfert, Johannes <jdoerfert at anl.gov> Sent: Saturday, March 23, 2019 1:30 PM To: J S Cc: via llvm-dev Subject: Re: [llvm-dev] Generating object files more efficiently I copied "-march=XYZ" from your original email, you have to replace it with your actual target architecture or simply drop it.

Generating object files more efficiently

2019 Mar 23

Generating object files more efficiently

Johannes, I tried the last one and it gave me this: error: unknown target CPU 'XYZ' note: valid target CPU values are: nocona, core2, penryn, bonnell, atom, silvermont, slm, goldmont, goldmont-plus, tremont, nehalem, corei7, westmere, sandybridge, corei7-avx, ivybridge, core-avx-i, haswell, core-avx2, broadwell, skylake, skylake-avx512, skx, cascadelake,

New x86-64 micro-architecture levels

2020 Jul 22

New x86-64 micro-architecture levels

* Richard Biener: > On Wed, Jul 22, 2020 at 10:58 AM Florian Weimer via Gcc <gcc at gcc.gnu.org> wrote: >> >> * Dongsheng Song: >> >> > I fully agree these names (100/101, A/B/C/D) are not very intuitive, I >> > recommend using isa tags by year (e.g. x64_2010, x64_2014) like the >> > python's platform tags (e.g. manylinux2010,

Generating object files more efficiently

2019 Mar 23

Generating object files more efficiently

-march for clang and -march for llc do different things unfortunately. -march for clang at least on x86 is the same as -mcpu in llc. Which is an artifact of gcc compatibility. ~Craig On Sat, Mar 23, 2019 at 1:40 PM Doerfert, Johannes via llvm-dev < llvm-dev at lists.llvm.org> wrote: > Oh, my bad. > > > Idk why llc seems to know that architecture but clang does not. > >

New x86-64 micro-architecture levels

2020 Jul 22

New x86-64 micro-architecture levels

* Dongsheng Song: > I fully agree these names (100/101, A/B/C/D) are not very intuitive, I > recommend using isa tags by year (e.g. x64_2010, x64_2014) like the > python's platform tags (e.g. manylinux2010, manylinux2014). I started out with a year number, but that was before the was Level A. Too many new CPUs only fall under level A unfortunately because they do not even have AVX.

LLVM 11 and trunk selecting 4 wide instead of 8 wide loop vectorization for AVX-enabled target

2020 Jul 16

LLVM 11 and trunk selecting 4 wide instead of 8 wide loop vectorization for AVX-enabled target

Tried a bunch of them there (x86-64, haswell, znver2) and they all defaulted to 4-wide - haswell additionally caused some extra loop unrolling but still with 8-wide pows. Cheers, -Neil. On Thu, Jul 16, 2020 at 2:39 PM Roman Lebedev <lebedev.ri at gmail.com> wrote: > Did you specify the target CPU the code should be optimized for? > For clang that is -march=native/znver2/... /

Generating object files more efficiently

2019 Mar 23

Generating object files more efficiently

Currently I compile my C code in 2 steps in order to generate .o files clang -emit-llvm -c foo.c -o foo.bc llc -march=XYZ foo.bc -filetype=obj Is there a way to generate either .o or .elf files in just 1 command? Thanks. -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20190323/da9b3c18/attachment.html>

New x86-64 micro-architecture levels

2020 Jul 13

New x86-64 micro-architecture levels

On 13.07.2020 09:40, Florian Weimer wrote: > * Richard Biener: >>> 2. I have a library with AVX2 and FMA, which directory should it go? >> >> Eventually GCC/gas can annotate objects with the lowest architecture >> level that is applicable? > > H.J. has patches for ELF program properties. I think > GNU_PROPERTY_X86_ISA_1_NEEDED would convey this information.

LLVM 11 and trunk selecting 4 wide instead of 8 wide loop vectorization for AVX-enabled target

2020 Jul 16

LLVM 11 and trunk selecting 4 wide instead of 8 wide loop vectorization for AVX-enabled target

So for us we use SLEEF to actually implement the libcalls (LLVM intrinsics) that LLVM by default would generate - and since SLEEF has highly optimal 8-wide pow, optimized for AVX and AVX2, we really want to use that. So we would not see 4/8 libcalls and instead see 1 call to something that lights up the ymm registers. I guess the problem then is that the default expectation is that pow would be

Re: LibVirt query CPU Model support and restore operation

2017 Jan 27

Re: LibVirt query CPU Model support and restore operation

hello , thanks for comments . I tried now with force options for CPU flag which were not supported . Now the command with non fully supported CPU model gets executed , But i am surprised to see that still Guest cpu model is not changed and still same as host cpu model(SAndy Bridge) Why don't i see the model as HAswell now , could you please comment. Command used : virt-install

[LLVMdev] [PATCH] Detect Haswell subarchitecture (i.e. using -march=native)

2013 Nov 23

[LLVMdev] [PATCH] Detect Haswell subarchitecture (i.e. using -march=native)

I agree with Tim, you need to implement a GetCpuIDAndInfoEx function in Host.cpp and pass the correct value to ecx. Also you need to verify that 7 is a valid leaf because an invalid leaf is defined to return the highest supported leaf on that processor. So if a processor supports say leaf 6 and not leaf 7, then an access leaf 7 will return the data from leaf 6 causing unrelated bits to be

[LLVMdev] [PATCH] Detect Haswell subarchitecture (i.e. using -march=native)

2013 Sep 12

[LLVMdev] [PATCH] Detect Haswell subarchitecture (i.e. using -march=native)

> That's far more worrying to me than not being able to detect Haswell. > I can't reproduce the problem here at the moment: both debug and > release builds give identical assembly for Host.cpp. OK. I know the reason you cannot reproduce it, before posting the patch I've decided to check for AVX before checking AVX2, just not to cpuid AVX2 when we don't have AVX1 anyway.

LLVM 11 and trunk selecting 4 wide instead of 8 wide loop vectorization for AVX-enabled target

2020 Jul 16

LLVM 11 and trunk selecting 4 wide instead of 8 wide loop vectorization for AVX-enabled target

Hey list, I've recently done the first test run of bumping our Burst compiler from LLVM 10 -> 11 now that the branch has been cut, and have noticed an apparent loop vectorization codegen regression for X86 with AVX or AVX2 enabled. The following IR example is vectorized to 4 wide with LLVM 11 and trunk whereas in LLVM 10 it (correctly as per what we want) vectorized it 8 wide matching the

Way to detect virtual machine cpu features

2017 Jun 29

Way to detect virtual machine cpu features

Hello everyone I want to know how can I use libvirt to detect what cpu features a virtual machine will see. I guess I could do it in following way: 1. if cpu mode is 'custom', use 'virsh cpu-baseline --features' on the cpu model to get model features. 2. if cpu mode is 'host-passthrough' or 'host-model', do a 'virsh capabilities' to list cpu features of

[LLVMdev] AVX code gen

2013 Dec 11

[LLVMdev] AVX code gen

Hello - I found this post on the llvm blog: http://blog.llvm.org/2012/12/new-loop-vectorizer.html which makes me think that clang / llvm are capable of generating AVX with packed instructions as well as utilizing the full width of the YMM registers… I have an environment where icc generates these instructions (vmulps %ymm1, %ymm3, %ymm2 for example) but I can not get clang/llvm to generate such

RFC: SIMD math-function library

2016 Jul 15

RFC: SIMD math-function library

Is it possible to see the source code of the open-sourced SVML? The diff file does not include the library. I searched the Internet but I could not find. Regards, Naoki Shibata On 2016/07/15 13:55, Tian, Xinmin wrote: > Naoki, > > Intel is planning open-source SVML library (most of them if it not 100%), 6 functions of SVML are open sourced for GCC and LLVM already. But, Intel SVML

[LLVMdev] SIMD for sdiv <2 x i64>

2015 Jul 24

[LLVMdev] SIMD for sdiv <2 x i64>

It seems that that it's hard to vectorize int64 in LLVM. For example, LLVM 3.4 generates very complicated code for the following IR. I am running on a Haswell processor. Is it because there is no alternative AVX/2 instructions for int64? The same thing also happens to zext <2 x i32> -> <2 x i64> and trunc <2 x i64> -> <2 x i32>. Any ideas to optimize these

[LLVMdev] [PATCH] Detect Haswell subarchitecture (i.e. using -march=native)

2013 Sep 12

[LLVMdev] [PATCH] Detect Haswell subarchitecture (i.e. using -march=native)

Hi Adam, > OK. I know the reason you cannot reproduce it, before posting > the patch I've decided to check for AVX before checking AVX2, > just not to cpuid AVX2 when we don't have AVX1 anyway. I suspect it was also incompetence on my part. Given the differences I'm seeing now I can't believe there'd be *no* difference in my tests if I'd done them properly.

similar to: New x86-64 micro-architecture levels