similar to: New x86-64 micro-architecture levels

Displaying 20 results from an estimated 20000 matches similar to: "New x86-64 micro-architecture levels"

2020 Jul 13
3
New x86-64 micro-architecture levels
On Fri, Jul 10, 2020 at 11:45 PM H.J. Lu via Gcc <gcc at gcc.gnu.org> wrote: > > On Fri, Jul 10, 2020 at 10:30 AM Florian Weimer <fweimer at redhat.com> wrote: > > > > Most Linux distributions still compile against the original x86-64 > > baseline that was based on the AMD K8 (minus the 3DNow! parts, for Intel > > EM64T compatibility). > > > >
2020 Jul 21
7
New x86-64 micro-architecture levels
* Premachandra Mallappa: > [AMD Public Use] > > Hi Floarian, > >> I'm including a proposal for the levels below. I use single letters for them, but I expect that the concrete implementation of this proposal will use >> names like “x86-100”, “x86-101”, like in the glibc patch referenced above. (But we can discuss other approaches.) > > Personally I am not a big
2019 Mar 23
4
Generating object files more efficiently
It is my actual target architecture ________________________________ From: Doerfert, Johannes <jdoerfert at anl.gov> Sent: Saturday, March 23, 2019 1:30 PM To: J S Cc: via llvm-dev Subject: Re: [llvm-dev] Generating object files more efficiently I copied "-march=XYZ" from your original email, you have to replace it with your actual target architecture or simply drop it.
2019 Mar 23
2
Generating object files more efficiently
Johannes, I tried the last one and it gave me this: error: unknown target CPU 'XYZ' note: valid target CPU values are: nocona, core2, penryn, bonnell, atom, silvermont, slm, goldmont, goldmont-plus, tremont, nehalem, corei7, westmere, sandybridge, corei7-avx, ivybridge, core-avx-i, haswell, core-avx2, broadwell, skylake, skylake-avx512, skx, cascadelake,
2020 Jul 22
2
New x86-64 micro-architecture levels
* Richard Biener: > On Wed, Jul 22, 2020 at 10:58 AM Florian Weimer via Gcc <gcc at gcc.gnu.org> wrote: >> >> * Dongsheng Song: >> >> > I fully agree these names (100/101, A/B/C/D) are not very intuitive, I >> > recommend using isa tags by year (e.g. x64_2010, x64_2014) like the >> > python's platform tags (e.g. manylinux2010,
2019 Mar 23
2
Generating object files more efficiently
-march for clang and -march for llc do different things unfortunately. -march for clang at least on x86 is the same as -mcpu in llc. Which is an artifact of gcc compatibility. ~Craig On Sat, Mar 23, 2019 at 1:40 PM Doerfert, Johannes via llvm-dev < llvm-dev at lists.llvm.org> wrote: > Oh, my bad. > > > Idk why llc seems to know that architecture but clang does not. > >
2020 Jul 22
3
New x86-64 micro-architecture levels
* Dongsheng Song: > I fully agree these names (100/101, A/B/C/D) are not very intuitive, I > recommend using isa tags by year (e.g. x64_2010, x64_2014) like the > python's platform tags (e.g. manylinux2010, manylinux2014). I started out with a year number, but that was before the was Level A. Too many new CPUs only fall under level A unfortunately because they do not even have AVX.
2020 Jul 16
2
LLVM 11 and trunk selecting 4 wide instead of 8 wide loop vectorization for AVX-enabled target
Tried a bunch of them there (x86-64, haswell, znver2) and they all defaulted to 4-wide - haswell additionally caused some extra loop unrolling but still with 8-wide pows. Cheers, -Neil. On Thu, Jul 16, 2020 at 2:39 PM Roman Lebedev <lebedev.ri at gmail.com> wrote: > Did you specify the target CPU the code should be optimized for? > For clang that is -march=native/znver2/... /
2019 Mar 23
2
Generating object files more efficiently
Currently I compile my C code in 2 steps in order to generate .o files clang -emit-llvm -c foo.c -o foo.bc llc -march=XYZ foo.bc -filetype=obj Is there a way to generate either .o or .elf files in just 1 command? Thanks. -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20190323/da9b3c18/attachment.html>
2020 Jul 13
2
New x86-64 micro-architecture levels
On 13.07.2020 09:40, Florian Weimer wrote: > * Richard Biener: >>> 2. I have a library with AVX2 and FMA, which directory should it go? >> >> Eventually GCC/gas can annotate objects with the lowest architecture >> level that is applicable? > > H.J. has patches for ELF program properties. I think > GNU_PROPERTY_X86_ISA_1_NEEDED would convey this information.
2020 Jul 16
4
LLVM 11 and trunk selecting 4 wide instead of 8 wide loop vectorization for AVX-enabled target
So for us we use SLEEF to actually implement the libcalls (LLVM intrinsics) that LLVM by default would generate - and since SLEEF has highly optimal 8-wide pow, optimized for AVX and AVX2, we really want to use that. So we would not see 4/8 libcalls and instead see 1 call to something that lights up the ymm registers. I guess the problem then is that the default expectation is that pow would be
2017 Jan 27
3
Re: LibVirt query CPU Model support and restore operation
hello , thanks for comments . I tried now with force options for CPU flag which were not supported . Now the command with non fully supported CPU model gets executed , But i am surprised to see that still Guest cpu model is not changed and still same as host cpu model(SAndy Bridge) Why don't i see the model as HAswell now , could you please comment. Command used : virt-install
2013 Nov 23
2
[LLVMdev] [PATCH] Detect Haswell subarchitecture (i.e. using -march=native)
I agree with Tim, you need to implement a GetCpuIDAndInfoEx function in Host.cpp and pass the correct value to ecx. Also you need to verify that 7 is a valid leaf because an invalid leaf is defined to return the highest supported leaf on that processor. So if a processor supports say leaf 6 and not leaf 7, then an access leaf 7 will return the data from leaf 6 causing unrelated bits to be
2013 Sep 12
3
[LLVMdev] [PATCH] Detect Haswell subarchitecture (i.e. using -march=native)
> That's far more worrying to me than not being able to detect Haswell. > I can't reproduce the problem here at the moment: both debug and > release builds give identical assembly for Host.cpp. OK. I know the reason you cannot reproduce it, before posting the patch I've decided to check for AVX before checking AVX2, just not to cpuid AVX2 when we don't have AVX1 anyway.
2020 Jul 16
2
LLVM 11 and trunk selecting 4 wide instead of 8 wide loop vectorization for AVX-enabled target
Hey list, I've recently done the first test run of bumping our Burst compiler from LLVM 10 -> 11 now that the branch has been cut, and have noticed an apparent loop vectorization codegen regression for X86 with AVX or AVX2 enabled. The following IR example is vectorized to 4 wide with LLVM 11 and trunk whereas in LLVM 10 it (correctly as per what we want) vectorized it 8 wide matching the
2017 Jun 29
2
Way to detect virtual machine cpu features
Hello everyone I want to know how can I use libvirt to detect what cpu features a virtual machine will see. I guess I could do it in following way: 1. if cpu mode is 'custom', use 'virsh cpu-baseline --features' on the cpu model to get model features. 2. if cpu mode is 'host-passthrough' or 'host-model', do a 'virsh capabilities' to list cpu features of
2013 Dec 11
2
[LLVMdev] AVX code gen
Hello - I found this post on the llvm blog: http://blog.llvm.org/2012/12/new-loop-vectorizer.html which makes me think that clang / llvm are capable of generating AVX with packed instructions as well as utilizing the full width of the YMM registers… I have an environment where icc generates these instructions (vmulps %ymm1, %ymm3, %ymm2 for example) but I can not get clang/llvm to generate such
2016 Jul 15
3
RFC: SIMD math-function library
Is it possible to see the source code of the open-sourced SVML? The diff file does not include the library. I searched the Internet but I could not find. Regards, Naoki Shibata On 2016/07/15 13:55, Tian, Xinmin wrote: > Naoki, > > Intel is planning open-source SVML library (most of them if it not 100%), 6 functions of SVML are open sourced for GCC and LLVM already. But, Intel SVML
2015 Jul 24
2
[LLVMdev] SIMD for sdiv <2 x i64>
It seems that that it's hard to vectorize int64 in LLVM. For example, LLVM 3.4 generates very complicated code for the following IR. I am running on a Haswell processor. Is it because there is no alternative AVX/2 instructions for int64? The same thing also happens to zext <2 x i32> -> <2 x i64> and trunc <2 x i64> -> <2 x i32>. Any ideas to optimize these
2013 Sep 12
0
[LLVMdev] [PATCH] Detect Haswell subarchitecture (i.e. using -march=native)
Hi Adam, > OK. I know the reason you cannot reproduce it, before posting > the patch I've decided to check for AVX before checking AVX2, > just not to cpuid AVX2 when we don't have AVX1 anyway. I suspect it was also incompetence on my part. Given the differences I'm seeing now I can't believe there'd be *no* difference in my tests if I'd done them properly.