thr3ads.net - similar to: "LLVM and Xeon Skylake v5"

2017 May 08

LLVM and Xeon Skylake v5

Thank you. I'm letting it auto-detect by setting the target using getProcessTarget. I disabled AVX-512 support by passing -avx512f (and the other variants) to setMAttrs on EngineBuilder. I can see references to avx512 in X86.td. It's the exact same executable running on Kaby Lake. What does "Cannot select:" specifically mean? Is there some table that doesn't have a definition for a key in

2017 May 08

LLVM and Xeon Skylake v5

getProcessTriple just determines the operating system and architecture. It doesn't deal with specific instruction set features. The CPU should be controlled by MCPU on the EngineBuilder, I think. The CPU autodetection code lives in getHostCPUName in lib/Support/Host.cpp, but I don't think the JIT calls into it. I think it's expected that the user would call it or pass a specific CPU string to the MCPU

2019 May 02

llvm is illegally vectorizing with a recurrence on skylake

Hi -- I have found a bug in an HPC code where llvm is vectorizing a loop on Skylake that has an obvious recurrence. I derived a small test case based on the original benchmark below: /*****************************************************************/ static void __attribute__ ((always_inline)) one( const int *restrict in, const int *const end, const unsigned shift, int *const restrict index,
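
For readers unfamiliar with the term, a loop-carried recurrence is a loop in which an iteration reads a value that an earlier iteration wrote. The sketch below is a hypothetical illustration only; it is not the reporter's test case (which is truncated above), and the name `prefix_sum` is an assumption chosen for the example.

```c
/* Hypothetical illustration of a loop-carried recurrence (not the
 * reporter's test case): iteration i reads out[i - 1], which iteration
 * i - 1 has just written, so the iterations cannot simply be executed
 * side by side in independent vector lanes. */
static void prefix_sum(const int *in, int *out, int n)
{
    if (n <= 0)
        return;
    out[0] = in[0];
    for (int i = 1; i < n; ++i)
        out[i] = out[i - 1] + in[i];  /* depends on the previous iteration */
}
```

Called on the input {1, 2, 3, 4, 5}, this fills `out` with 1, 3, 6, 10, 15. A vectorizer that treated the iterations as independent would produce different values, which is the kind of miscompile the thread title describes.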

2017 Nov 14

RFC: [X86] Introducing command line options to prefer narrower vector instructions even when wider instructions are available

I haven't looked into actually implementing revectorization, so we may just want to ignore that possibility for now. But I imagined that revectorization could hit the same problem that we're trying to avoid here: if the cost models say that wider vectors are legal and cheaper, but the reality is that perf will suffer when using those wider vectors, then we want to avoid using the wider

2017 Nov 13

RFC: [X86] Introducing command line options to prefer narrower vector instructions even when wider instructions are available

On Sat, Nov 11, 2017 at 8:52 PM, Hal Finkel via llvm-dev < llvm-dev at lists.llvm.org> wrote: > > On 11/11/2017 09:52 PM, UE US via llvm-dev wrote: > > If skylake is that bad at AVX2 > > > I don't think this says anything negative about AVX2, but AVX-512. > > it belongs in -mcpu / -march IMO. > > > No. We'd still want to enable the architectural

2017 Nov 13

RFC: [X86] Introducing command line options to prefer narrower vector instructions even when wider instructions are available

On 11/13/2017 05:49 PM, Eric Christopher wrote: > > > On Mon, Nov 13, 2017 at 2:15 PM Craig Topper via llvm-dev > <llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>> wrote: > > On Sat, Nov 11, 2017 at 8:52 PM, Hal Finkel via llvm-dev > <llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>> wrote: > > > On

2017 Nov 12

RFC: [X86] Introducing command line options to prefer narrower vector instructions even when wider instructions are available

If skylake is that bad at AVX2 it belongs in -mcpu / -march IMO. Most people will build for the standard x86_64-pc-linux or whatever anyway, and completely ignore the change. This will mainly affect those who build their own software and optimize for their system, and lots there have probably caught on to this already. I always thought that's what -march was made for, really. GNOMETOYS

2016 Jun 23

AVX512 instruction generated when JIT compiling for an avx2 architecture

On 06/23/2016 12:56 PM, Craig Topper wrote: > Can you check what value "getHostCPUName" returned? getHostCPUName() = skylake > > On Thu, Jun 23, 2016 at 9:53 AM, Frank Winter via llvm-dev > <llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>> wrote: > > With LLVM 3.8 the JIT compiler engine generates an AVX512 > instruction although I

2019 Oct 01

[RFC PATCH] pci: prevent putting pcie devices into lower device states on certain intel bridges

On Tue, Oct 1, 2019 at 10:47 AM Mika Westerberg <mika.westerberg at linux.intel.com> wrote: > > On Mon, Sep 30, 2019 at 06:36:12PM +0200, Karol Herbst wrote: > > On Mon, Sep 30, 2019 at 6:30 PM Mika Westerberg > > <mika.westerberg at linux.intel.com> wrote: > > > > > > On Mon, Sep 30, 2019 at 06:05:14PM +0200, Karol Herbst wrote: > > > >

2017 Nov 11

RFC: [X86] Introducing command line options to prefer narrower vector instructions even when wider instructions are available

Are you referring to the X86TargetLowering::isFsqrtCheap hook? ~Craig On Fri, Nov 10, 2017 at 7:39 AM, Sanjay Patel <spatel at rotateright.com> wrote: > We can tie a user preference / override to a CPU model. We do something > like that for square root estimates already (although it does use a > SubtargetFeature currently for x86; ideally, we'd key that off of something >

2017 Nov 01

RFC: [X86] Introducing command line options to prefer narrower vector instructions even when wider instructions are available

Hello all, I would like to propose adding the -mprefer-avx256 and -mprefer-avx128 command line flags supported by the latest GCC to clang. These flags will be used to limit the vector register size presented by TTI to the vectorizers. The backend will still be able to use wider registers for code written using the intrinsics in x86intrin.h. And the backend will still be able to use AVX512VL

2017 Nov 09

RFC: [X86] Introducing command line options to prefer narrower vector instructions even when wider instructions are available

I agree that a less x86-specific command line makes sense. I've been having internal discussions with gcc folks and they're evaluating switching to something like -mprefer-vector-width=128/256/512/none. Based on the current performance data we're seeing, we think we need to ultimately default skylake-avx512 to -mprefer-vector-width=256. If we go with a target independent

2017 Nov 07

RFC: [X86] Introducing command line options to prefer narrower vector instructions even when wider instructions are available

On Fri, Nov 3, 2017, at 05:47, Craig Topper via llvm-dev wrote: > That's a very good point about the ordering of the command line options. > gcc's current implementation treats -mprefer-avx256 as "prefer 256 over > 512" and -mprefer-avx128 as "prefer 128 over 256". Which feels weird for > other reasons, but has less of an ordering ambiguity. > >

2019 Mar 23

Generating object files more efficiently

It is my actual target architecture ________________________________ From: Doerfert, Johannes <jdoerfert at anl.gov> Sent: Saturday, March 23, 2019 1:30 PM To: J S Cc: via llvm-dev Subject: Re: [llvm-dev] Generating object files more efficiently I copied "-march=XYZ" from your original email, you have to replace it with your actual target architecture or simply drop it.

2019 Mar 23

Generating object files more efficiently

Johannes, I tried the last one and it gave me this: error: unknown target CPU 'XYZ' note: valid target CPU values are: nocona, core2, penryn, bonnell, atom, silvermont, slm, goldmont, goldmont-plus, tremont, nehalem, corei7, westmere, sandybridge, corei7-avx, ivybridge, core-avx-i, haswell, core-avx2, broadwell, skylake, skylake-avx512, skx, cascadelake,

2017 Nov 03

RFC: [X86] Introducing command line options to prefer narrower vector instructions even when wider instructions are available

On Thu, Nov 2, 2017 at 7:05 PM James Y Knight via llvm-dev < llvm-dev at lists.llvm.org> wrote: > On Wed, Nov 1, 2017 at 7:35 PM, Craig Topper via llvm-dev < > llvm-dev at lists.llvm.org> wrote: > >> Hello all, >> >> >> >> I would like to propose adding the -mprefer-avx256 and -mprefer-avx128 >> command line flags supported by latest GCC to

2020 Sep 05

Possible AVX512 codegen bug in LLVM 10.0.1?

Hey LLVMDev, Perhaps I'm missing something, but I think I've stumbled across a codegen bug in LLVM 10.0.1 related to AVX512. I've attached a small LLVM IR testcase and generated x86_64 assembly file that shows the bug. The test case is small, but not quite minimal, mostly because of driver code included in the test case so one can compile and run the program. The program does a

2019 Oct 01

[RFC PATCH] pci: prevent putting pcie devices into lower device states on certain intel bridges

On Tue, Oct 1, 2019 at 11:11 AM Mika Westerberg <mika.westerberg at linux.intel.com> wrote: > > On Tue, Oct 01, 2019 at 10:56:39AM +0200, Karol Herbst wrote: > > On Tue, Oct 1, 2019 at 10:47 AM Mika Westerberg > > <mika.westerberg at linux.intel.com> wrote: > > > > > > On Mon, Sep 30, 2019 at 06:36:12PM +0200, Karol Herbst wrote: > > > >

2017 Sep 14

Intel Skylake Server

Hello, I have a pre-production Intel Skylake server using dual 8176 processors (28 cores @ 2.1 GHz). I have loaded CentOS 7.3 and RHEL 7.4 from the DVD distribution with no problems. When I try to load CentOS 6.9, the boot process hangs just as the Anaconda graphics is started for CentOS configuration. If I use a secondary graphics card then the process completes with no problems and CentOS boots

2017 May 31

CentOS 6.9 Skylake soft error

Hello, I have an Intel NUC6i5 (Skylake i5-6260U processor) running CentOS 6.9 and there are no real problems. However, during boot an error appears in dmesg where a PCH unknown device 0x9d48 is logged. How can I get rid of this error output? Thank you, Mark
