thr3ads.net - similar to: "[LLVMdev] VEX prefixes for JIT in llvm 3.5"

Displaying 20 results from an estimated 3000 matches similar to: "[LLVMdev] VEX prefixes for JIT in llvm 3.5"

[LLVMdev] VEX prefixes for JIT in llvm 3.5

2014 Sep 17

Hi Jim, Thanks for a very quick reply! That indeed does the trick! Presumably the default has changed in 3.5 to be a "generic" CPU instead of the native one? If that's the case I wonder why: especially when JITting it really only makes sense to target the actual CPU - unless I'm missing something? :) Thanks again, Matt On Wed, Sep 17, 2014 at 2:16 PM, Jim Grosbach

[LLVMdev] VEX prefixes for JIT in llvm 3.5

2014 Sep 17

Great stuff; thanks both! I'm also looking to turn my MCJIT conversion spike into our main use case. The only thing I'm missing is the ability to get a post-linked copy of the generated assembly. In JIT I used JITEventListener's NotifyFunctionEmitted and used an MCDisassembler to disassemble the stream (with my own custom annotators), and redirected the output to the relevant place

[LLVMdev] VEX prefixes for JIT in llvm 3.5

2014 Sep 18

Hi Matt, Philip, You could get the data you want by recording the addresses returned by the allocateCodeSection and allocateDataSection methods on your RTDyldMemoryManager, then disassembling those sections after you've called resolveRelocations. That's a little unsatisfying though. For one thing, unless you very carefully maintain the association with the original object via

[LLVMdev] Set up ExecutionEngine according to actual machine capabilities

2015 May 11

I am currently setting up my Module with

    module->setTargetTriple(llvm::sys::getProcessTriple()
    #ifdef _WIN32
        + "-elf"
    #endif
    );

And my ExecutionEngine with

    llvm::EngineBuilder(std::move(module))
        .setErrorStr(&err)
        .setMCPU(llvm::sys::getHostCPUName())

[LLVMdev] Limit loop vectorizer to SSE

2013 Nov 12

On 12 November 2013 15:53, Frank Winter <fwinter at jlab.org> wrote: > .. forcing the vector size to 4 does not prevent using AVX. > Sure. That's more for tests than anything else. So, there are ways of disabling stuff in Clang, for instance "-mattr=-avx" or "-target-feature -avx", but I'm not sure how you're doing it in the JIT. I'm also not sure

[LLVMdev] Limit loop vectorizer to SSE

2013 Nov 12

On 12 November 2013 16:05, Frank Winter <fwinter at jlab.org> wrote: > engineBuilder.setMCPU(llvm::sys::getHostCPUName()); > Try: engineBuilder.setMAttrs("-avx"); --renato

[LLVMdev] Proposal to improve vzeroupper optimization strategy

2013 Sep 21

Is it realistic to worry about performance of vectorized code that does PIC calls into a non-vectorized sin() in libc? Maybe there's an example other than sin() that is more realistic? -- Sean Silva On Fri, Sep 20, 2013 at 7:11 PM, Eli Friedman <eli.friedman at gmail.com> wrote: > On Fri, Sep 20, 2013 at 2:58 PM, Gao, Yunzhong < > yunzhong_gao at playstation.sony.com>

[LLVMdev] Limit loop vectorizer to SSE

2013 Nov 12

On 12/11/13 11:01, Renato Golin wrote: > On 12 November 2013 15:53, Frank Winter <fwinter at jlab.org> wrote: > > .. forcing the vector size to 4 does not prevent using AVX. > > Sure. That's more for tests than anything else. > > So, there are ways of disabling stuff in Clang, for instance > "-mattr=-avx"

AVX512 instruction generated when JIT compiling for an avx2 architecture

2016 Jun 23

On 06/23/2016 12:56 PM, Craig Topper wrote: > Can you check what value "getHostCPUName" returned? getHostCPUName() = skylake > > On Thu, Jun 23, 2016 at 9:53 AM, Frank Winter via llvm-dev > <llvm-dev at lists.llvm.org> wrote: > > With LLVM 3.8 the JIT compiler engine generates an AVX512 > instruction although I

[LLVMdev] X86 Tablegen Description and VEX.W

2012 Nov 08

On Thu, Nov 8, 2012 at 1:34 AM, Anitha Boyapati <anitha.boyapati at gmail.com> wrote: ... > > I actually have confusion in mapping the role of vex_w during > instruction selection. For the moment, let's just consider vex_w and > not memop. > > [1]. What does " def rr : FMA4<>, VEX_W" mean? As per tablegen > description, "rr" now inherits FMA4 and

[LLVMdev] X86 Tablegen Description and VEX.W

2012 Nov 08

On 8 November 2012 11:12, Cameron McInally <cameron.mcinally at nyu.edu> wrote: > On Wed, Nov 7, 2012 at 10:52 PM, Anitha Boyapati <anitha.boyapati at gmail.com> > wrote: > ... >> >> For the multiclass "fma4s", why is "mr" not inherited from "VEX_W" and >> "MemOp4" like those of "rm" or "rr" ? >

[LLVMdev] X86 Tablegen Description and VEX.W

2012 Nov 08

On Wed, Nov 7, 2012 at 10:52 PM, Anitha Boyapati <anitha.boyapati at gmail.com> wrote: ... > For the multiclass "fma4s", why is "mr" not inherited from "VEX_W" and > "MemOp4" like those of "rm" or "rr" ? > Hey Anitha, The VEX.W bit is used to denote operand order. In other words, this bit allows for a memop to be used as
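
For readers wondering where the W bit discussed above actually sits, here is a minimal sketch of the VEX prefix byte layout as described in the x86 architecture manuals. `VexFields` and `encodeVexPrefix` are hypothetical names made up for illustration; they are not LLVM APIs and this is not how the X86 backend's code emitter is written. R, X, B and vvvv are stored inverted in the encoding, and the short two-byte form is only available when X, B and W are clear and the opcode map is 0F.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper type: VEX fields in their logical (non-inverted) form.
struct VexFields {
    bool r = false;    // extends ModRM.reg
    bool x = false;    // extends SIB.index
    bool b = false;    // extends ModRM.rm / SIB.base
    bool w = false;    // VEX.W
    uint8_t map = 1;   // opcode map: 1 = 0F, 2 = 0F38, 3 = 0F3A
    uint8_t vvvv = 0;  // extra source register number, 0..15
    bool l = false;    // vector length: false = 128-bit, true = 256-bit
    uint8_t pp = 0;    // implied prefix: 0 = none, 1 = 66, 2 = F3, 3 = F2
};

// Returns the 2-byte (C5) form when possible, otherwise the 3-byte (C4) form.
std::vector<uint8_t> encodeVexPrefix(const VexFields &f) {
    assert(f.vvvv < 16 && f.pp < 4 && f.map >= 1 && f.map <= 3);
    // R and vvvv are stored inverted in the prefix.
    const uint8_t notR = f.r ? 0 : 0x80;
    const uint8_t low = static_cast<uint8_t>(
        ((~f.vvvv & 0xF) << 3) | (f.l ? 0x04 : 0) | f.pp);
    if (!f.x && !f.b && !f.w && f.map == 1) {
        return {0xC5, static_cast<uint8_t>(notR | low)};
    }
    // X and B are stored inverted as well; W is stored as-is.
    const uint8_t byte1 = static_cast<uint8_t>(
        notR | (f.x ? 0 : 0x40) | (f.b ? 0 : 0x20) | f.map);
    const uint8_t byte2 = static_cast<uint8_t>((f.w ? 0x80 : 0) | low);
    return {0xC4, byte1, byte2};
}
```

With `vvvv = 1` and `l = true` this yields `C5 F4`, the prefix of `vaddps ymm0, ymm1, ymm2` (`C5 F4 58 C2`). Setting `w` always forces the three-byte form, since the two-byte form has no W bit.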

New x86-64 micro-architecture levels

2020 Jul 13

On Fri, Jul 10, 2020 at 11:45 PM H.J. Lu via Gcc <gcc at gcc.gnu.org> wrote: > > On Fri, Jul 10, 2020 at 10:30 AM Florian Weimer <fweimer at redhat.com> wrote: > > > > Most Linux distributions still compile against the original x86-64 > > baseline that was based on the AMD K8 (minus the 3DNow! parts, for Intel > > EM64T compatibility). > > > >

RFC: code size reduction in X86 by replacing EVEX with VEX encoding

2016 Nov 23

I would like a command line option to disable this optimization. That way tests can still verify that EVEX instructions came out of isel by using -show-mc-encoding. On Wed, Nov 23, 2016 at 5:01 AM Hal Finkel via llvm-dev < llvm-dev at lists.llvm.org> wrote: > > ------------------------------ > > *From: *"Gadi via llvm-dev Haber" <llvm-dev at lists.llvm.org> >

RFC: code size reduction in X86 by replacing EVEX with VEX encoding

2016 Nov 23

Hi All. This is an RFC for a proposed target-specific X86 optimization for reducing code size in the encoding of AVX-512 instructions when possible. When the AVX512F instruction set was introduced in X86, it included 32 registers of 512 bits each, ZMM0-ZMM31, as well as 16 additional XMM registers, XMM16-XMM31, and 16 additional YMM registers, YMM16-YMM31. In order to encode the new registers of

LLVM and Xeon Skylake v5

2017 May 08

Thank you. I'm letting it auto-detect by setting the target using getProcessTriple. I disabled avx512 support by passing -avx512f (and the other variants) to setMAttrs on EngineBuilder. I can see refs to avx512 in X86.td. It's the exact same executable running on Kabylake. What does the "Cannot select:" specifically mean? Is there some table that doesn't have a definition for a key in
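
The approach described above (host features plus a list of features forced off) can be sketched without any LLVM dependency. This is only an illustration under assumptions: `buildFeatureAttrs` is a hypothetical helper, and the `std::map` stands in for whatever `llvm::sys::getHostCPUFeatures` reports in your LLVM version (its signature has changed across releases). The resulting vector is the kind of string sequence that `EngineBuilder::setMAttrs` accepts.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical helper: turn a feature map into "+name"/"-name" strings,
// after forcing the listed features off.
std::vector<std::string> buildFeatureAttrs(
    const std::map<std::string, bool> &hostFeatures,
    const std::vector<std::string> &forceOff) {
    std::map<std::string, bool> merged = hostFeatures;
    for (const std::string &name : forceOff) {
        assert(!name.empty());
        merged[name] = false;  // overrides the host value, or adds an entry
    }
    std::vector<std::string> attrs;
    attrs.reserve(merged.size());
    for (const auto &entry : merged) {
        attrs.push_back((entry.second ? "+" : "-") + entry.first);
    }
    return attrs;  // ordered by feature name, since std::map is sorted
}
```

For example, forcing off `avx512f`, `avx512cd`, `avx512bw`, `avx512dq` and `avx512vl` produces a `-` entry for each of them regardless of what the host map said; the result would then be handed to `setMAttrs` alongside `setMCPU`.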

[LLVMdev] LLVM 3.3 JIT code speed

2013 Jul 18

Hi, Our DSL LLVM IR emitted code (optimized with -O3 kind of IR ==> IR passes) runs slower when executed with the LLVM 3.3 JIT, compared to what we had with LLVM 3.1. What could be the reason? I tried to play with TargetOptions without any success… Here is the kind of code we use to allocate the JIT: EngineBuilder builder(fResult->fModule);

LLVM and Xeon Skylake v5

2017 May 08

getProcessTriple just determines the operating system and architecture. It doesn't deal with specific instruction set features. The CPU should be controlled by MCPU on the EngineBuilder, I think. The CPU autodetection code lives in getHostCPUName in lib/Support/Host.cpp, but I don't think the JIT calls into it. I think it's expected the user would call it or pass a specific CPU string to the MCPU

RFC: code size reduction in X86 by replacing EVEX with VEX encoding

2016 Nov 28

Hal, that’s a good point. There are more manually-maintained tables in the X86 backend that should probably be tablegened: the memory-folding tables and ReplaceableInstrs, to name a couple. If you have ideas on how to get these auto-generated, please let us know. From: llvm-dev [mailto:llvm-dev-bounces at lists.llvm.org] On Behalf Of Hal Finkel via llvm-dev Sent: Wednesday, November 23, 2016

RFC: code size reduction in X86 by replacing EVEX with VEX encoding

2016 Nov 24

> I would like a command line option to disable this optimization. That way tests can still verify that EVEX instructions came out of isel by using -show-mc-encoding. I think that keeping test compatibility is not a reason for an additional “llc” flag. We check encoding in test/MC/X86 dir. Is there any option to report-out from llc in non-debug mode? It should be an option to control
