Displaying 20 results from an estimated 3000 matches similar to: "StringRef Iterator Variable Display"
2017 Aug 07
3
VBROADCAST Implementation Issues
Thank you. I am still getting errors. I have modified my instructions as you
suggested, as follows:
def GATHER_256B : I<0x68, MRMSrcMem, (outs VR_2048:$dst, VK64WM:$mask_wb),
(ins VR_2048:$src1, VK64WM:$mask, i2048mem:$src2),
"GATHER_256B\t{$src2, {$dst} {${mask}}|${dst}
{${mask}}, $src2}",
[(set VR_2048:$dst, VK64WM:$mask_wb, (v64i32
(masked_gather
2017 Aug 07
2
VBROADCAST Implementation Issues
Hello,
I did as you said.
Please tell me whether the following is correct now?
def GATHER_256B : I<0x68, MRMSrcMem, (outs VR_2048:$dst, _.KRCWM:$mask_wb),
(VR_2048:$src1, _.KRCWM:$mask, ins i2048mem:$src2),
"GATHER_256B\t{$src2, {$dst}{${mask}}|${dst} {${mask}},
$src2}"),
[(set VR_2048:$dst, _.KRCWM:$mask_wb, (v64i32
(GatherNode
2017 Aug 06
2
VBROADCAST Implementation Issues
I want to implement gather for v64i32. I wrote the following code.
def GATHER_256B : I<0x68, MRMSrcMem, (outs VR_2048:$dst), (ins
i2048mem:$src),
"GATHER_256B\t{$src, $dst|$dst, $src}",
[(set VR_2048:$dst, (v64i32 (masked_gather
addr:$src)))],
IIC_MOV_MEM>, TA;
def: Pat<(v64f32 (masked_gather addr:$src)), (GATHER_256B
2017 Jul 01
2
Jacobi 5 Point Stencil Code not Vectorizing
I am able to vectorize it with the following code:
#include <stdio.h>
#define N 100351
// This function computes 2D-5 point Jacobi stencil
void stencil(int a[][N], int b[][N])
{
int i, j, k;
for (k = 0; k < N; k++) {
for (i = 1; i <= N-2; i++)
for (j = 1; j <= N-2; j++)
b[i][j] = 0.25 * (a[i][j] + a[i-1][j] + a[i+1][j] + a[i][j-1] +
a[i][j+1]);
for
2017 Jul 08
5
Error in v64i32 type in x86 backend
Thank You.
I see that the opcode is 8 bits and all the combinations are already used
in the LLVM x86 backend.
What should I do now?
On Sat, Jul 8, 2017 at 10:57 AM, Craig Topper <craig.topper at gmail.com>
wrote:
> Yes, it's an opcode conflict. You'll have to look through Intel documents
> and find an unused opcode. I've only added instructions based on a real
> spec so I don't know
2017 Jul 14
3
error:Ran out of lanemask bits to represent subregister
Do your 32768 registers also have sub registers?
I can't tell you exactly what to change. I'm not familiar with the code. I
would just be running grep or something.
~Craig
On Fri, Jul 14, 2017 at 10:23 AM, hameeza ahmed <hahmed2305 at gmail.com>
wrote:
> Thank you so much. I think there is no issue with my definitions since I
> have to use larger registers, i.e. 65536 bit
2017 Jul 01
3
Jacobi 5 Point Stencil Code not Vectorizing
Does it happen because of a loop-carried dependence? If so, how can such code
be vectorized?
Please reply. I am waiting.
On Jul 1, 2017 12:30 PM, "hameeza ahmed" <hahmed2305 at gmail.com> wrote:
> I even tried Polly, but my LLVM IR still does not contain vector
> instructions. I used the following command:
>
> clang -S -emit-llvm stencil.c -march=knl -O3
2017 Sep 05
2
Issues in Vector Add Instruction Machine Code Emission
I was getting the same error when I kept both EVEX/EVEX_4V and TA. So I
restored my original instructions, and for that I had to include
bool HasTA = TSFlags & X86II::TA; in X86MCCodeEmitter.cpp
and then use this condition:
if(HasTA)
++SrcRegNum;
in order to emit the binary correctly.
Is that right?
On Tue, Sep 5, 2017 at 5:45 AM, Craig Topper <craig.topper at gmail.com> wrote:
>
2017 Sep 05
2
Issues in Vector Add Instruction Machine Code Emission
Thank you,
I changed TA to EVEX or EVEX_4V. But now I am getting the following error:
Invalid prefix!
UNREACHABLE executed at
/lib/Target/X86/MCTargetDesc/X86MCCodeEmitter.cpp:647!
On Tue, Sep 5, 2017 at 4:36 AM, Craig Topper <craig.topper at gmail.com> wrote:
> Not all instructions can use EVEX_4V. Move instructions in particular
> cannot because they don't have 2 sources.
>
2017 Jul 12
2
Using new types v32f32, v32f64 in llvm backend not possible
I would be very grateful if you could tell me whether there is some way to
allocate registers in a different order, or from different register sets, to
the same instruction based on the vector width or number of iterations.
I have tried several alternatives but could not succeed.
I have also asked this question many times, but no one has responded.
Is there something wrong with it?
Kindly guide me.
Thank you
On
2017 Sep 04
2
Issues in Vector Add Instruction Machine Code Emission
Thank you.
I used EVEX_4V with all the instructions. I replaced both TA and EVEX with
EVEX_4V. Now I am getting the following error:
llvm-tblgen: /utils/TableGen/X86RecognizableInstr.cpp:687: void
llvm::X86Disassembler::RecognizableInstr::emitInstructionSpecifier():
Assertion `numPhysicalOperands >= 2 + additionalOperands &&
numPhysicalOperands <= 4 + additionalOperands &&
2017 Jul 07
2
Error in v64i32 type in x86 backend
Thank You.
On Fri, Jul 7, 2017 at 10:03 AM, Craig Topper <craig.topper at gmail.com>
wrote:
> Yes, that error is from instruction selection. I think your legalization
> changes worked fine.
>
> ~Craig
>
> On Thu, Jul 6, 2017 at 8:21 PM, hameeza ahmed via llvm-dev <
> llvm-dev at lists.llvm.org> wrote:
>
>> I also ran the following command:
2017 Aug 17
4
unable to emit vectorized code in LLVM IR
I assume the compiler knows that you only have 2 input values that you just
added together 1000 times.
Despite the fact that you stored to a[i] and b[i] here, nothing reads them
other than the addition in the same loop iteration. So the compiler easily
removed the a and b arrays. Same with 'c', it's not read outside the loop
so it doesn't need to exist. So the compiler turned your
2018 Jul 24
2
KNL Vectorization with larger vector width
Hello,
I need help here. I am able to adjust the vector width through the
WidestRegister value. When the number of iterations is 31 and I set the vector
width to 32, it gives <16xi32> and <8xi32> instructions.
However, if I replicate the same behavior with the number of iterations at 63
and the vector width set to 64, no vector instructions are emitted. It should
behave as before and give <32xi32> and
2017 Jul 19
2
error:Ran out of lanemask bits to represent subregisterr
You are right. Regarding lanes, I can comment only once the other things run
fine.
Here I am stuck on unsigned vs. uint64_t. It looks as if I need to replace
each occurrence of unsigned with uint64_t.
Should I do it for the complete llvm folder or for codegen only?
I keep getting errors that require changing unsigned to
uint64_t.
What should I do now?
On Thu, Jul 20, 2017 at 1:03 AM,
2017 Aug 17
2
unable to emit vectorized code in LLVM IR
lli sum-vec03.ll 5 2 #0 0x0000000000c1f818 (lli+0xc1f818)
#1 0x0000000000c1d90e (lli+0xc1d90e)
#2 0x0000000000c1da5c (lli+0xc1da5c)
#3 0x00007f987c2c3d10 __restore_rt
(/lib/x86_64-linux-gnu/libpthread.so.0+0x10d10)
#4 0x00007f987c6f0038
#5 0x0000000000989f8c (lli+0x989f8c)
#6 0x00000000009383dc (lli+0x9383dc)
#7 0x000000000057eedd (lli+0x57eedd)
#8 0x00007f987b464a40 __libc_start_main
2018 Jan 29
2
Polly Dependency Analysis in MyPass
I put the following line in CMakeLists.txt:
add_subdirectory(mypass)
then ran make -j9,
then ran the following on the canonicalized IR:
$ opt -load lib/LLVMmypass.so -mypass vec-sum.preopt.ll
On Mon, Jan 29, 2018 at 9:39 PM, Michael Kruse <llvmdev at meinersbur.de>
wrote:
> 2018-01-29 10:18 GMT-06:00 hameeza ahmed <hahmed2305 at gmail.com>:
> > I tried writing
2017 Sep 04
2
Issues in Vector Add Instruction Machine Code Emission
Sorry to ask, but what does it mean to put both?
On Tue, Sep 5, 2017 at 4:01 AM, Craig Topper <craig.topper at gmail.com> wrote:
> Leave TA. Put both.
>
> ~Craig
>
> On Mon, Sep 4, 2017 at 4:00 PM, hameeza ahmed <hahmed2305 at gmail.com>
> wrote:
>
>> You are right. But when I defined my instruction as follows:
>> def P_256B_VADD : I<0xE1,
2017 Aug 17
2
unable to emit vectorized code in LLVM IR
OK. I have managed to vectorize the second loop in the following code, but
the first loop is still not vectorized. Why?
int main(int argc, char** argv) {
int a[1000], b[1000], c[1000]; int g=0;
int aa=atoi(argv[1]), bb=atoi(argv[2]);
for (int i=0; i<1000; i++) {
a[i]=aa+i, b[i]=bb+i;}
for (int i=0; i<1000; i++) {
c[i]=a[i] + b[i];
g+=c[i];
}
printf("sum: %d\n", g);
return 0;
2017 Jul 10
2
Conditional Register Assignment based on the no of loop iterations
Basically, my problem here is vector width, since I have used v64i32 in my
backend. If the vector width is 64, I want the Reg_B class registers to be
assigned, and if the vector width is 2048, I want Reg_A registers to be
assigned to the instruction.
Should I incorporate the solution in the lowering stage? Something like:
addRegisterClass(MVT::v2048i32, &X86::Reg_B);