thr3ads.net - similar to: "StringRef Iterator Variable Display"

Displaying 20 results from an estimated 3000 matches similar to: "StringRef Iterator Variable Display"

2017 Aug 07

VBROADCAST Implementation Issues

Thank you. I am still getting errors. I have modified my instructions as you said, as follows: def GATHER_256B : I<0x68, MRMSrcMem, (outs VR_2048:$dst, VK64WM:$mask_wb), (ins VR_2048:$src1, VK64WM:$mask, i2048mem:$src2), "GATHER_256B\t{$src2, {$dst} {${mask}}|${dst} {${mask}}, $src2}", [(set VR_2048:$dst, VK64WM:$mask_wb, (v64i32 (masked_gather

VBROADCAST Implementation Issues

2017 Aug 07

Hello, I did as you said. Please tell me whether the following is correct now: def GATHER_256B : I<0x68, MRMSrcMem, (outs VR_2048:$dst, _.KRCWM:$mask_wb), (VR_2048:$src1, _.KRCWM:$mask, ins i2048mem:$src2), "GATHER_256B\t{$src2, {$dst}{${mask}}|${dst} {${mask}}, $src2}"), [(set VR_2048:$dst, _.KRCWM:$mask_wb, (v64i32 (GatherNode

VBROADCAST Implementation Issues

2017 Aug 06

I want to implement gather for v64i32. I wrote the following code: def GATHER_256B : I<0x68, MRMSrcMem, (outs VR_2048:$dst), (ins i2048mem:$src), "GATHER_256B\t{$src, $dst|$dst, $src}", [(set VR_2048:$dst, (v64i32 (masked_gather addr:$src)))], IIC_MOV_MEM>, TA; def: Pat<(v64f32 (masked_gather addr:$src)), (GATHER_256B

Jacobi 5 Point Stencil Code not Vectorizing

2017 Jul 01

I am able to vectorize it with the following code: #include <stdio.h> #define N 100351 // This function computes 2D-5 point Jacobi stencil void stencil(int a[][N], int b[][N]) { int i, j, k; for (k = 0; k < N; k++) { for (i = 1; i <= N-2; i++) for (j = 1; j <= N-2; j++) b[i][j] = 0.25 * (a[i][j] + a[i-1][j] + a[i+1][j] + a[i][j-1] + a[i][j+1]); for
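The quoted code above is cut off, so the following is only a minimal sketch of a single sweep of the same stencil, not the poster's program. It assumes float data, a flat row-major layout, and a hypothetical function name jacobi_sweep; none of these come from the thread. The restrict qualifiers tell the compiler that the input and output grids do not overlap, which removes one possible dependence between iterations. Whether a particular compiler then vectorizes the inner loop still depends on its version and flags.

```c
#include <stddef.h>

/* One sweep of the 5-point stencil over the interior of an n x n grid.
 * a is read-only input, b is output; both are row-major, n*n elements.
 * Boundary cells of b are left untouched.
 * Assumption: a and b never alias (hence restrict). */
void jacobi_sweep(size_t n, const float *restrict a, float *restrict b)
{
    for (size_t i = 1; i + 1 < n; i++) {
        for (size_t j = 1; j + 1 < n; j++) {
            /* Same expression as in the quoted snippet, on float data. */
            b[i * n + j] = 0.25f * (a[i * n + j]
                                  + a[(i - 1) * n + j]
                                  + a[(i + 1) * n + j]
                                  + a[i * n + (j - 1)]
                                  + a[i * n + (j + 1)]);
        }
    }
}
```

Each inner iteration writes one element of b and reads only from a, so no iteration depends on a value produced by another iteration of the same sweep.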

Error in v64i32 type in x86 backend

2017 Jul 08

Thank you. I have seen that the opcode is 8 bits and all the combinations are already used in LLVM x86. What should I do now? On Sat, Jul 8, 2017 at 10:57 AM, Craig Topper <craig.topper at gmail.com> wrote: > Yes, it's an opcode conflict. You'll have to look through Intel documents > and find an unused opcode. I've only added instructions based on a real > spec so I don't know

error:Ran out of lanemask bits to represent subregister

2017 Jul 14

Do your 32768 registers also have sub registers? I can't tell you exactly what to change. I'm not familiar with the code. I would just be running grep or something. ~Craig On Fri, Jul 14, 2017 at 10:23 AM, hameeza ahmed <hahmed2305 at gmail.com> wrote: > Thank you so much. I think there is no issue with my definitions, since I > have to use larger registers, i.e. 65536 bit

Jacobi 5 Point Stencil Code not Vectorizing

2017 Jul 01

Does it happen due to a loop-carried dependence? If yes, what is the solution for vectorizing such code? Please reply, I am waiting. On Jul 1, 2017 12:30 PM, "hameeza ahmed" <hahmed2305 at gmail.com> wrote: > I even tried Polly, but my LLVM IR still does not contain vector > instructions. I used the following command: > > clang -S -emit-llvm stencil.c -march=knl -O3

Issues in Vector Add Instruction Machine Code Emission

2017 Sep 05

I was getting the same error when I kept both EVEX/EVEX_4V and TA. So I restored my original instructions, and for that I have to include bool HasTA = TSFlags & X86II::TA; in X86MCCodeEmitter.cpp and then use the condition if(HasTA) ++SrcRegNum; in order to emit the binary correctly. Is that right? On Tue, Sep 5, 2017 at 5:45 AM, Craig Topper <craig.topper at gmail.com> wrote: >

Issues in Vector Add Instruction Machine Code Emission

2017 Sep 05

Thank you, I changed TA to EVEX or EVEX_4V. But now I am getting the following error: Invalid prefix! UNREACHABLE executed at /lib/Target/X86/MCTargetDesc/X86MCCodeEmitter.cpp:647! On Tue, Sep 5, 2017 at 4:36 AM, Craig Topper <craig.topper at gmail.com> wrote: > Not all instructions can use EVEX_4V. Move instructions in particular > cannot because they don't have 2 sources. >

Using new types v32f32, v32f64 in llvm backend not possible

2017 Jul 12

I would be very grateful if you could tell me whether there is some way to allocate registers in a different order, or from different register sets, to the same instruction based on the vector width or the number of iterations. I have tried several alternatives but could not succeed. I have also asked this question many times, but no one responds. Is there something wrong with it? Kindly guide me. Thank you. On

Issues in Vector Add Instruction Machine Code Emission

2017 Sep 04

Thank You. I used EVEX_4V with all the instructions. I replaced TA and EVEX both with EVEX_4V. Now, I am getting following error: llvm-tblgen: /utils/TableGen/X86RecognizableInstr.cpp:687: void llvm::X86Disassembler::RecognizableInstr::emitInstructionSpecifier(): Assertion `numPhysicalOperands >= 2 + additionalOperands && numPhysicalOperands <= 4 + additionalOperands &&

Error in v64i32 type in x86 backend

2017 Jul 07

Thank You. On Fri, Jul 7, 2017 at 10:03 AM, Craig Topper <craig.topper at gmail.com> wrote: > Yes, that error is from instruction selection. I think your legalization > changes worked fine. > > ~Craig > > On Thu, Jul 6, 2017 at 8:21 PM, hameeza ahmed via llvm-dev < > llvm-dev at lists.llvm.org> wrote: > >> also i further run the following command;

unable to emit vectorized code in LLVM IR

2017 Aug 17

I assume the compiler knows that you only have 2 input values that you just added together 1000 times. Despite the fact that you stored to a[i] and b[i] here, nothing reads them other than the addition in the same loop iteration. So the compiler easily removed the a and b arrays. Same with 'c', it's not read outside the loop so it doesn't need to exist. So the compiler turned your
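The reply above is cut off, so what the compiler actually produced is not shown here. As a hedged illustration of the reasoning only, the sketch below assumes a loop that fills a[i] and b[i] from two inputs plus the index, then sums a[i] + b[i] over 1000 iterations. The function names sum_loop and sum_closed_form are hypothetical, and the closed form is hand-derived arithmetic, not compiler output. Because the arrays are never read outside the loop, the result depends only on the two inputs, so an optimizer is free to drop the arrays altogether.

```c
/* Loop form: arrays are written and read only inside this function. */
int sum_loop(int aa, int bb)
{
    int a[1000], b[1000], c[1000];
    int g = 0;
    for (int i = 0; i < 1000; i++) {
        a[i] = aa + i;
        b[i] = bb + i;
    }
    for (int i = 0; i < 1000; i++) {
        c[i] = a[i] + b[i];
        g += c[i];
    }
    return g;
}

/* Hand-derived equivalent (valid for small inputs that do not overflow):
 * sum over i of (aa + i) + (bb + i)
 *   = 1000 * (aa + bb) + 2 * (0 + 1 + ... + 999)
 *   = 1000 * (aa + bb) + 999000 */
int sum_closed_form(int aa, int bb)
{
    return 1000 * (aa + bb) + 999000;
}
```

Both functions return the same value for the same small inputs. To keep array stores and vector instructions in the generated IR, the arrays would have to be observable, for example by being passed in from the caller or read after the loop.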

KNL Vectorization with larger vector width

2018 Jul 24

Hello, I need help here. I am able to adjust the vector width through the WidestRegister value. When the number of iterations is 31 and I set the vector width to 32, it gives <16xi32> and <8xi32> instructions. However, if I replicate the same setup with the number of iterations set to 63 and the vector width set to 64, no vector instructions are emitted. It should behave as before and give <32xi32> and

error:Ran out of lanemask bits to represent subregister

2017 Jul 19

You are right. Regarding lanes, I can comment only once the other things run fine. Here I am stuck with unsigned vs uint64_t. It looks as if I need to replace each occurrence of unsigned with uint64_t. Should I do it for the complete llvm folder or for codegen only? I am continuously getting errors that require changing unsigned to uint64_t. What should I do now? On Thu, Jul 20, 2017 at 1:03 AM,

unable to emit vectorized code in LLVM IR

2017 Aug 17

lli sum-vec03.ll 5 2 #0 0x0000000000c1f818 (lli+0xc1f818) #1 0x0000000000c1d90e (lli+0xc1d90e) #2 0x0000000000c1da5c (lli+0xc1da5c) #3 0x00007f987c2c3d10 __restore_rt (/lib/x86_64-linux-gnu/libpthread.so.0+0x10d10) #4 0x00007f987c6f0038 #5 0x0000000000989f8c (lli+0x989f8c) #6 0x00000000009383dc (lli+0x9383dc) #7 0x000000000057eedd (lli+0x57eedd) #8 0x00007f987b464a40 __libc_start_main

Polly Dependency Analysis in MyPass

2018 Jan 29

I put the following line in CMakeLists.txt: add_subdirectory(mypass). Then I ran make -j9, and then ran the following on the canonicalized IR: $ opt -load lib/LLVMmypass.so -mypass vec-sum.preopt.ll On Mon, Jan 29, 2018 at 9:39 PM, Michael Kruse <llvmdev at meinersbur.de> wrote: > 2018-01-29 10:18 GMT-06:00 hameeza ahmed <hahmed2305 at gmail.com>: > > I tried writing

Issues in Vector Add Instruction Machine Code Emission

2017 Sep 04

Sorry to ask but what does it mean to put both? On Tue, Sep 5, 2017 at 4:01 AM, Craig Topper <craig.topper at gmail.com> wrote: > Leave TA. Put both. > > ~Craig > > On Mon, Sep 4, 2017 at 4:00 PM, hameeza ahmed <hahmed2305 at gmail.com> > wrote: > >> You are right. But when i defined my instruction as follows: >> def P_256B_VADD : I<0xE1,

unable to emit vectorized code in LLVM IR

2017 Aug 17

OK. I have managed to vectorize the second loop in the following code, but the first loop is still not vectorized. Why? int main(int argc, char** argv) { int a[1000], b[1000], c[1000]; int g=0; int aa=atoi(argv[1]), bb=atoi(argv[2]); for (int i=0; i<1000; i++) { a[i]=aa+i, b[i]=bb+i;} for (int i=0; i<1000; i++) { c[i]=a[i] + b[i]; g+=c[i]; } printf("sum: %d\n", g); return 0;

Conditional Register Assignment based on the no of loop iterations

2017 Jul 10

Basically, my problem here is the vector width, since I have used v64i32 in my backend. If the vector width is 64, I want the Reg_B class registers to be assigned, and if the vector width is 2048, I want Reg_A registers to be assigned to the instruction. Should I incorporate the solution in the lowering stage? Something like: addRegisterClass(MVT::v2048i32, &X86::Reg_B);
