similar to: VMOVNTDQ in LLVM

Displaying 20 results from an estimated 4000 matches similar to: "VMOVNTDQ in LLVM"

2017 Sep 28
0
VMOVNTDQ in LLVM
What code did you write where you expected VMOVNTDQ to be generated? The x86 MOVNT instructions are something of an attractive nuisance. There are cases where they are beneficial, but those cases are much less common and harder to characterize than you might expect. LLVM is deliberately conservative about using MOVNT (IIRC it’s currently only generated if you explicitly ask for it by using
2017 Sep 28
1
VMOVNTDQ in LLVM
This thread from last year has some additional detail regarding the hazards of aggressive MOVNT generation if you’re interested: http://lists.llvm.org/pipermail/llvm-dev/2016-May/098980.html <http://lists.llvm.org/pipermail/llvm-dev/2016-May/098980.html> – Steve > On Sep 28, 2017, at 9:31 AM, Stephen Canon via llvm-dev <llvm-dev at lists.llvm.org> wrote: > > What code did
2017 Sep 21
3
RFC: [X86] Can we begin removing AutoUpgrade support for x86 instrinsics added in early 3.X versions
I agree with Paul that we need to formalise the compatibility policy before we start removing support for old intrinsics. Do we want a deprecation warning of some kind for the use of any intrinsic used in auto-upgrade, would that actually be useful or just a nuisance? In the meantime I’m happy to help fix any missing test coverage. > On 20 Sep 2017, at 22:16, Robinson, Paul via llvm-dev
2017 Sep 22
0
RFC: [X86] Can we begin removing AutoUpgrade support for x86 instrinsics added in early 3.X versions
Hi, I believe we have a formal policy: we support every bitcode produced since LLVM 3.0: https://llvm.org/docs/DeveloperPolicy.html#ir-backwards-compatibility (until we decide to uprev the version we support). Unfortunately, the testing was only added around 3.6 or 3.7? And support is only as good as the testing we have... -- Mehdi 2017-09-21 0:23 GMT-07:00 Simon Pilgrim via llvm-dev <
2019 Apr 23
5
StringRef Iterator Variable Display
Hello, I want to display the variable names in stringref iterator. But it is not displayed using following code. for (set<StringRef>::iterator sit = L.begin(); sit != L.end(); sit++) { errs() << *sit << " "; } How to do this? Please help.. -------------- next part -------------- An HTML attachment was scrubbed... URL:
2017 Aug 07
3
VBROADCAST Implementation Issues
Thank You. Still getting errors.I have modified my instructions as you said as follows: def GATHER_256B : I<0x68, MRMSrcMem, (outs VR_2048:$dst, VK64WM:$mask_wb), (ins VR_2048:$src1, VK64WM:$mask, i2048mem:$src2), "GATHER_256B\t{$src2, {$dst} {${mask}}|${dst} {${mask}}, $src2}", [(set VR_2048:$dst, VK64WM:$mask_wb, (v64i32 (masked_gather
2017 Aug 07
2
VBROADCAST Implementation Issues
Hello, I did as you said, Please tell me whether the following correct now?? def GATHER_256B : I<0x68, MRMSrcMem, (outs VR_2048:$dst, _.KRCWM:$mask_wb), (VR_2048:$src1, _.KRCWM:$mask, ins i2048mem:$src2), "GATHER_256B\t{$src2, {$dst}{${mask}}|${dst} {${mask}}, $src2}"), [(set VR_2048:$dst, _.KRCWM:$mask_wb, (v64i32 (GatherNode
2017 Aug 06
2
VBROADCAST Implementation Issues
i want to implement gather for v64i32. i wrote following code. def GATHER_256B : I<0x68, MRMSrcMem, (outs VR_2048:$dst), (ins i2048mem:$src), "GATHER_256B\t{$src, $dst|$dst, $src}", [(set VR_2048:$dst, (v64i32 (masked_gather addr:$src)))], IIC_MOV_MEM>, TA; def: Pat<(v64f32 (masked_gather addr:$src)), (GATHER_256B
2017 Jul 01
2
Jacobi 5 Point Stencil Code not Vectorizing
I am able to vectorize it with the following code; #include <stdio.h> #define N 100351 // This function computes 2D-5 point Jacobi stencil void stencil(int a[][N], int b[][N]) { int i, j, k; for (k = 0; k < N; k++) { for (i = 1; i <= N-2; i++) for (j = 1; j <= N-2; j++) b[i][j] = 0.25 * (a[i][j] + a[i-1][j] + a[i+1][j] + a[i][j-1] + a[i][j+1]); for
2017 Jul 08
5
Error in v64i32 type in x86 backend
Thank You. I have seen the opcode is 8 bits and all the combinations are already used in llvm x86. Now what to do? On Sat, Jul 8, 2017 at 10:57 AM, Craig Topper <craig.topper at gmail.com> wrote: > Yes its an opcode conflict. You'll have to look through Intel documents > and find an unused opcode. I've only added instructions based on a real > spec so I don't know
2017 Jul 14
3
error:Ran out of lanemask bits to represent subregister
Do your 32768 registers also have sub registers? I can't tell you exactly what to change. I'm not familiar with the code. I would just be running grep or something. ~Craig On Fri, Jul 14, 2017 at 10:23 AM, hameeza ahmed <hahmed2305 at gmail.com> wrote: > Thank you so much. I think there is no issue with my definitions since i > have to use larger registers i.e 65536 bit
2017 Sep 20
2
RFC: [X86] Can we begin removing AutoUpgrade support for x86 instrinsics added in early 3.X versions
Many of the older autoupgrades have no test cases because I think when we upgraded them we just replace all the code in the tests with native IR. So for some of the code we don't even know if it works. I don't really want to watch the amount of code here continue to grow indefinitely. It's pretty poorly structured and has been up against the MSVC cascaded if/else limit a couple times.
2017 Jul 01
3
Jacobi 5 Point Stencil Code not Vectorizing
Does it happen due to loop carried dependence? if yes what is the solution to vectorize such codes? please reply. i m waiting. On Jul 1, 2017 12:30 PM, "hameeza ahmed" <hahmed2305 at gmail.com> wrote: > I even tried polly but still my llvm IR does not contain vector > instructions. i used the following command; > > clang -S -emit-llvm stencil.c -march=knl -O3
2017 Jul 12
2
Using new types v32f32, v32f64 in llvm backend not possible
I would be very grateful if you specify whether there is some way to allocate registers (different order) / from different register sets to the same instruction based on the vector width/ no of iterations. I have tried several alternatives but could not succeed. Also I have asked this question many times but no one responds. Is there something wrong with this?? Kindly guide me. Thank You On
2017 Sep 05
2
Issues in Vector Add Instruction Machine Code Emission
I was getting same error when i keep both EVEX/EVEX_4V and TA. So, i restored my original instructions and for that i have to include bool HasTA = TSFlags & X86II::TA; in x86MCCodeEmitter.cpp then used this condition; if(HasTA) ++SrcRegNum; in order to emit binary correctly. Is it right? On Tue, Sep 5, 2017 at 5:45 AM, Craig Topper <craig.topper at gmail.com> wrote: >
2017 Sep 05
2
Issues in Vector Add Instruction Machine Code Emission
Thank You, I changed TA to EVEX or EVEX_4V. But now i am getting following error: Invalid prefix! UNREACHABLE executed at /lib/Target/X86/MCTargetDesc/X86MCCodeEmitter.cpp:647! On Tue, Sep 5, 2017 at 4:36 AM, Craig Topper <craig.topper at gmail.com> wrote: > Not all instructions can use EVEX_4V. Move instructions in particular > cannot because they don't have 2 sources. >
2017 Jul 07
2
Error in v64i32 type in x86 backend
Thank You. On Fri, Jul 7, 2017 at 10:03 AM, Craig Topper <craig.topper at gmail.com> wrote: > Yes, that error is from instruction selection. I think your legalization > changes worked fine. > > ~Craig > > On Thu, Jul 6, 2017 at 8:21 PM, hameeza ahmed via llvm-dev < > llvm-dev at lists.llvm.org> wrote: > >> also i further run the following command;
2018 Jul 24
2
KNL Vectorization with larger vector width
Hello, I need help here. I am able to adjust the vector width through WidestRegister value. When number of iterations=31 and I set vector width=32 it gives <16xi32> and <8xi32> instructions. However if i replicate same behavior with number of iterations=63 and I set vector width=64, no vector instructions are emitted. it should do as previous and gives <32xi32> and
2017 Aug 17
4
unable to emit vectorized code in LLVM IR
I assume compiler knows that your only have 2 input values that you just added together 1000 times. Despite the fact that you stored to a[i] and b[i] here, nothing reads them other than the addition in the same loop iteration. So the compiler easily removed the a and b arrays. Same with 'c', it's not read outside the loop so it doesn't need to exist. So the compiler turned your
2017 Sep 04
2
Issues in Vector Add Instruction Machine Code Emission
Thank You. I used EVEX_4V with all the instructions. I replaced TA and EVEX both with EVEX_4V. Now, I am getting following error: llvm-tblgen: /utils/TableGen/X86RecognizableInstr.cpp:687: void llvm::X86Disassembler::RecognizableInstr::emitInstructionSpecifier(): Assertion `numPhysicalOperands >= 2 + additionalOperands && numPhysicalOperands <= 4 + additionalOperands &&