similar to: RFC: code size reduction in X86 by replacing EVEX with VEX encoding

Displaying 20 results from an estimated 1000 matches similar to: "RFC: code size reduction in X86 by replacing EVEX with VEX encoding"

2016 Nov 23
2
RFC: code size reduction in X86 by replacing EVEX with VEX encoding
I would like a command line option to disable this optimization. That way tests can still verify that EVEX instructions came out of isel by using -show-mc-encoding. On Wed, Nov 23, 2016 at 5:01 AM Hal Finkel via llvm-dev < llvm-dev at lists.llvm.org> wrote: > > ------------------------------ > > *From: *"Gadi via llvm-dev Haber" <llvm-dev at lists.llvm.org> >
2016 Nov 24
3
RFC: code size reduction in X86 by replacing EVEX with VEX encoding
> I would like a command line option to disable this optimization. That way tests can still verify that EVEX instructions came out of isel by using -show-mc-encoding. I think that keeping tests compatibility is not a reason for an additional “llc” flag. We check encoding in test/MC/X86 dir. Is there any option to report-out from llc in non-debug mode? It should be an option to control
2016 Nov 28
2
RFC: code size reduction in X86 by replacing EVEX with VEX encoding
Hal, that’s a good point. There are more manually-maintained tables in the X86 backend that should probably be tablegened: the memory-folding tables and ReplaceableInstrs, to name a couple. If you have ideas on how to get these auto-generated, please let us know. From: llvm-dev [mailto:llvm-dev-bounces at lists.llvm.org] On Behalf Of Hal Finkel via llvm-dev Sent: Wednesday, November 23, 2016
2018 Mar 28
4
x86 instruction format which takes a single 64-bit immediate
I am attempting to create an instruction which takes a single 64-bit immediate. This doesn't seem like a thing that would exist already (because who needs an instruction which just takes an immediate?) How might I implement this easily? Perhaps I could use a format which encodes a register, which is then unused? Thanks for the help. Gus -------------- next part -------------- An HTML
2017 Jul 28
3
Purpose of various register classes in X86 target
Hello Matthias, On 28 July 2017 at 04:13, Matthias Braun <mbraun at apple.com> wrote: > It's not that hard in principle: > - A register class is a set of registers. > - Virtual Registers have a register class assigned. > - If you have register constraints (like x86 8bit operations only work on > al,ah,etc.) then you have to create a new register class to express that.
2017 Sep 05
2
Issues in Vector Add Instruction Machine Code Emission
I was getting same error when i keep both EVEX/EVEX_4V and TA. So, i restored my original instructions and for that i have to include bool HasTA = TSFlags & X86II::TA; in x86MCCodeEmitter.cpp then used this condition; if(HasTA) ++SrcRegNum; in order to emit binary correctly. Is it right? On Tue, Sep 5, 2017 at 5:45 AM, Craig Topper <craig.topper at gmail.com> wrote: >
2017 Sep 05
2
Issues in Vector Add Instruction Machine Code Emission
Thank You, I changed TA to EVEX or EVEX_4V. But now i am getting following error: Invalid prefix! UNREACHABLE executed at /lib/Target/X86/MCTargetDesc/X86MCCodeEmitter.cpp:647! On Tue, Sep 5, 2017 at 4:36 AM, Craig Topper <craig.topper at gmail.com> wrote: > Not all instructions can use EVEX_4V. Move instructions in particular > cannot because they don't have 2 sources. >
2018 Mar 28
0
x86 instruction format which takes a single 64-bit immediate
Copy Ii32 in X86InstrFormats.td rename to Ii64 and change Imm32 to Imm64. Instantiate your instruction inheriting from Ii64. Pass RawFrm to the form parameter. Initial documentation for the encoding system is attached. ~Craig On Wed, Mar 28, 2018 at 4:50 PM, Gus Smith via llvm-dev < llvm-dev at lists.llvm.org> wrote: > I am attempting to create an instruction which takes a single
2013 Aug 28
3
[PATCH] x86: AVX instruction emulation fixes
- we used the C4/C5 (first prefix) byte instead of the apparent ModR/M one as the second prefix byte - early decoding normalized vex.reg, thus corrupting it for the main consumer (copy_REX_VEX()), resulting in #UD on the two-operand instructions we emulate Also add respective test cases to the testing utility plus - fix get_fpu() (the fall-through order was inverted) - add cpu_has_avx2,
2017 Sep 04
2
Issues in Vector Add Instruction Machine Code Emission
Thank You. I used EVEX_4V with all the instructions. I replaced TA and EVEX both with EVEX_4V. Now, I am getting following error: llvm-tblgen: /utils/TableGen/X86RecognizableInstr.cpp:687: void llvm::X86Disassembler::RecognizableInstr::emitInstructionSpecifier(): Assertion `numPhysicalOperands >= 2 + additionalOperands && numPhysicalOperands <= 4 + additionalOperands &&
2013 Dec 12
3
[LLVMdev] [RFC PATCH 1/2] x86: Fix ModR/M byte output in 16-bit addressing mode
This attempts to address http://llvm.org/bugs/show_bug.cgi?id=18220 It also fixes a test which was requiring the *wrong* output. I'm relatively happy with this part, and it even solves most of the hard part of feature request for .code16 in bug 8684 — which was actually why I started prodding at this. But I could do with some help with the 16-bit signed relocation handling, which I've
2020 Jul 02
2
flags to reproduce clang -O3 with opt -O3
Hello, I've been trying to figure out how to reproduce the results of a single clang -O3 compilation to a binary with a multi-step process using opt. Specifically I have: clang -O3 foo.c -o foo.exe which I want to replicate with: clang -O0 -c -emit-llvm foo.c opt -O3 foo.bc -o foo_o.bc clang foo_o.bc -o foo.exe Any hints / suggestions on what additional flags I need to produce the same
2017 Jul 07
2
Unhandled reg/opcode register encoding VR2048 Error in backend
Hello, I m working towards backend. Here i need to define vector load and stores for 64 i32 elements. so in x86instrinfo.td i wrote; def VMOV_256B_RM : I<0x6F, MRMSrcMem, (outs VR2048:$dst), (ins i32mem:$src), "vmov_256B_rm\t{$src, $dst|$dst, $src}", [(set VR2048:$dst, (v64i32 (scalar_to_vector (loadi32 addr:$src))))],
2020 Jul 03
2
flags to reproduce clang -O3 with opt -O3
Awesome, thanks! I'd like to have the last step (llc in your example) not perform additional optimization passes, such as O3, and simply use the O3 pass from opt in the previous line. Do you happen to know if I should use 'llc -O0 foo_o.bc -o foo.exe' instead to achieve this? On Thu, Jul 2, 2020 at 6:35 PM Mehdi AMINI <joker.eph at gmail.com> wrote: > > > On Thu,
2012 Jul 10
0
[LLVMdev] question on table gen TIED_TO constraint
I don't think changing to VEX_4VOp3 to VEX_4V is the right fix. I think the fix is to increment CurOp twice at the start for these instructions so that only the input operands are used for encoding. Also, I just submitted a patch to revert the operand order for these instructions in the assembler/disassembler. Destination register should appear on the right and the mask should appear on the
2020 Aug 31
2
Vectorization of math function failed?
Hi, After reading https://llvm.org/docs/Vectorizers.html#vectorization-of-function-calls I decided to write the following C++ program: #include <cmath> using v4f32 = float __attribute__((__vector_size__(16))); v4f32 fct1(v4f32 x) { v4f32 y; y[0] = std::sin(x[0]); y[1] = std::sin(x[1]); y[2] = std::sin(x[2]); y[3] = std::sin(x[3]); return y; } v4f32 fct2(v4f32 x) { v4f32 y;
2012 Jul 10
2
[LLVMdev] question on table gen TIED_TO constraint
Yes, there is an easy way to fix this. MRMSrcMem assumes register, memory, vvvv register if VEX_4VOp3 is true and assumes register, vvvv register, memory if VEX_4V is true. I just need to change the flag from VEX_4VOp3 to VEX_4V. There are a few places where we assume only the 2nd operand can be tied-to: Desc->getOperandConstraint(1, MCOI::TIED_TO) != -1 (hard-coded index 1) I will fix those
2017 Sep 04
2
Issues in Vector Add Instruction Machine Code Emission
Sorry to ask but what does it mean to put both? On Tue, Sep 5, 2017 at 4:01 AM, Craig Topper <craig.topper at gmail.com> wrote: > Leave TA. Put both. > > ~Craig > > On Mon, Sep 4, 2017 at 4:00 PM, hameeza ahmed <hahmed2305 at gmail.com> > wrote: > >> You are right. But when i defined my instruction as follows: >> def P_256B_VADD : I<0xE1,
2020 Jul 10
12
New x86-64 micro-architecture levels
Most Linux distributions still compile against the original x86-64 baseline that was based on the AMD K8 (minus the 3DNow! parts, for Intel EM64T compatibility). There has been an attempt to use the existing AT_PLATFORM-based loading mechanism in the glibc dynamic linker to enable a selection of optimized libraries. But the general selection mechanism in glibc is problematic: hwcaps
2016 May 01
2
r267690 - [Clang][BuiltIn][AVX512]Adding intrinsics for vmovntdqa vmovntpd vmovntps instruction set
Hi, For now no. But I will add this three builtins to CGBuiltin.cpp. If you want, you can be a reviewer of this change. Regards Michael Zuckerman From: Craig Topper [mailto:craig.topper at gmail.com] Sent: Thursday, April 28, 2016 04:53 To: Zuckerman, Michael <michael.zuckerman at intel.com> Subject: Re: r267690 - [Clang][BuiltIn][AVX512]Adding intrinsics for vmovntdqa vmovntpd vmovntps