thr3ads.net - search: "hahmed2305"

Displaying 20 results from an estimated 83 matches for "hahmed2305".

2017 Aug 07

VBROADCAST Implementation Issues

...MemFrm"' failed. On Mon, Aug 7, 2017 at 8:23 PM, Craig Topper <craig.topper at gmail.com> wrote: > masked_gather takes 3 inputs. not just an address. See the AVX512 pattern > is pasted earlier > > ~Craig > > On Mon, Aug 7, 2017 at 1:54 AM, hameeza ahmed <hahmed2305 at gmail.com> > wrote: > >> Changed it to; >> >> def GATHER_256B : I<0x68, MRMSrcMem, (outs VR_2048:$dst, VK64:$mask), >> (ins i2048mem:$src), >> "GATHER_256B\t{$src, {$dst}{${mask}}|${dst} >> {${mask}}, $src}", >>...

2017 Aug 07

VBROADCAST Implementation Issues

...et _.RC:$dst, _.KRCWM:$mask_wb, > (GatherNode (_.VT _.RC:$src1), _.KRCWM:$mask, > vectoraddr:$src2))]>, EVEX, EVEX_K, > EVEX_CD8<_.EltSize, CD8VT1>; > } > > ~Craig > > On Sun, Aug 6, 2017 at 2:21 PM, hameeza ahmed <hahmed2305 at gmail.com> > wrote: > >> i want to implement gather for v64i32. i wrote following code. >> >> def GATHER_256B : I<0x68, MRMSrcMem, (outs VR_2048:$dst), (ins >> i2048mem:$src), >> "GATHER_256B\t{$src, $dst|$dst, $src}", >>...

2017 Jul 01

Jacobi 5 Point Stencil Code not Vectorizing

...; i++) for (j = 1; j <= N-2; j++) b[i][j] = 0.25 * (a[i][j] + a[i-1][j] + a[i+1][j] + a[i][j-1] + a[i][j+1]); for (i = 1; i <= N-2; i++) for (j = 1; j <= N-2; j++) a[i][j] = b[i][j]; } } I removed restrict over here. On Sun, Jul 2, 2017 at 3:11 AM, hameeza ahmed <hahmed2305 at gmail.com> wrote: > further i modified the code to the following; > > #include <stdio.h> > #define N 100351 > > // This function computes 2D-5 point Jacobi stencil > void stencil(int a[restrict][N], int b[restrict][N]) > { > int i, j, k; > for (k = 0...

2017 Aug 06

VBROADCAST Implementation Issues

...deGenDAGPatterns.cpp:2134: llvm::TreePatternNode *llvm::TreePattern::ParseTreePattern(llvm::Init *, llvm::StringRef): Assertion `New->getNumTypes() == 1 && "FIXME: Unhandled"' failed. What is my mistake? Please help me. On Mon, Aug 7, 2017 at 12:03 AM, hameeza ahmed <hahmed2305 at gmail.com> wrote: > I am trying to implement vector shuffle for v64i32. Is the following > correct? > > > def VSHUFFLE_256B : I<0xE8, MRMDestReg, (outs VR_2048:$dst), > (ins VR_2048:$src1, VRPIM_2048:$src2),"VSHUFFLE_256B\t{$src1, $src2, > $dst|$dst, $src1, $src...

2017 Sep 05

Issues in Vector Add Instruction Machine Code Emission

...Topper <craig.topper at gmail.com> wrote: > Put the TA's back. EVEX/EVEX_4V does not replace TA. They are for > different things. An EVEX/EVEX_4V instruction must use one of T8, TA, XOP8, > XOP9, XOPA. > > ~Craig > > On Mon, Sep 4, 2017 at 5:33 PM, hameeza ahmed <hahmed2305 at gmail.com> > wrote: > >> Thank You, >> I changed TA to EVEX or EVEX_4V. But now i am getting following error: >> >> Invalid prefix! >> UNREACHABLE executed at /lib/Target/X86/MCTargetDesc/X >> 86MCCodeEmitter.cpp:647! >> >> >> On Tue...

2017 Sep 05

Issues in Vector Add Instruction Machine Code Emission

...they don't have 2 sources. > > What do you intend to do with the binary output once you have it? You > don't seem to be targeting a particular binary definition so its > effectively just random numbers. > > ~Craig > > On Mon, Sep 4, 2017 at 4:28 PM, hameeza ahmed <hahmed2305 at gmail.com> > wrote: > >> Thank You. >> >> I used EVEX_4V with all the instructions. I replaced TA and EVEX both >> with EVEX_4V. Now, I am getting following error: >> >> llvm-tblgen: /utils/TableGen/X86RecognizableInstr.cpp:687: void >> llvm::X8...

2019 Apr 23

StringRef Iterator Variable Display

Hello, I want to display the variable names in a StringRef iterator, but they are not displayed by the following code: for (set<StringRef>::iterator sit = L.begin(); sit != L.end(); sit++) { errs() << *sit << " "; } How can I do this? Please help.

2017 Sep 04

Issues in Vector Add Instruction Machine Code Emission

...that the destination and one of > the sources must be the same physical register. > > TA indicates which of the opcode maps the instruction belongs to. This > corresponds to encoding 0x3 of the VEX.mmmmm field. > > ~Craig > > On Mon, Sep 4, 2017 at 4:01 PM, hameeza ahmed <hahmed2305 at gmail.com> > wrote: > >> Sorry to ask but what does it mean to put both? >> >> On Tue, Sep 5, 2017 at 4:01 AM, Craig Topper <craig.topper at gmail.com> >> wrote: >> >>> Leave TA. Put both. >>> >>> ~Craig >>> >>...
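As a rough illustration of the VEX.mmmmm field mentioned in this snippet: in the three-byte VEX prefix, the low five bits of the byte that follows 0xC4 select the opcode map, and the value 0x3 selects the 0F 3A map. The sketch below is not taken from LLVM; the helper names `vex_opcode_map` and `vex_map_name` are hypothetical.

```c
#include <stdint.h>

/* Returns the mmmmm field (opcode map selector) from the second byte
   of a three-byte VEX prefix, i.e. the byte that follows 0xC4. */
unsigned vex_opcode_map(uint8_t vex_byte1) {
    return vex_byte1 & 0x1F;   /* keep only the low five bits */
}

/* Names the legacy escape sequence that each map value stands for. */
const char *vex_map_name(unsigned mmmmm) {
    switch (mmmmm) {
    case 0x1: return "0F";
    case 0x2: return "0F 38";
    case 0x3: return "0F 3A";  /* the map that TA refers to */
    default:  return "reserved or unknown";
    }
}
```

For example, a second prefix byte of 0xE3 has mmmmm equal to 0x3, so `vex_map_name(vex_opcode_map(0xE3))` names the 0F 3A map.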

2017 Jul 01

Jacobi 5 Point Stencil Code not Vectorizing

Does it happen due to loop carried dependence? if yes what is the solution to vectorize such codes? please reply. i m waiting. On Jul 1, 2017 12:30 PM, "hameeza ahmed" <hahmed2305 at gmail.com> wrote: > I even tried polly but still my llvm IR does not contain vector > instructions. i used the following command; > > clang -S -emit-llvm stencil.c -march=knl -O3 -mllvm -polly -mllvm > -polly-vectorizer=stripmine -o stencil_poly.ll > > Please specify wh...

2017 Jul 10

Conditional Register Assignment based on the no of loop iterations

...so that in instructioninfo.td while pattern matching both LD256 and LD256_N are treated separately. 1 will use Reg_B registers and other will use Reg_A respectively. Is it fine??? Please guide me... I need serious help, please..... Thank You On Mon, Jul 10, 2017 at 9:29 AM, hameeza ahmed <hahmed2305 at gmail.com> wrote: > or should i write a condition in registerinfo.td; to define the registers > in object Reg_A in specific order according to loop iterations. > > On Mon, Jul 10, 2017 at 9:17 AM, hameeza ahmed <hahmed2305 at gmail.com> > wrote: > >> hello, >...

2017 Jul 08

Error in v64i32 type in x86 backend

...e: > Yes its an opcode conflict. You'll have to look through Intel documents > and find an unused opcode. I've only added instructions based on a real > spec so I don't know how to make up an opcode. > > ~Craig > > On Fri, Jul 7, 2017 at 10:43 PM, hameeza ahmed <hahmed2305 at gmail.com> > wrote: > >> Thank You. >> >> Now i am getting this error repeatedly; >> >> Error: Primary decode conflict: VADD_256B would overwrite INC8r >> ModRM 192 >> Opcode 254 >> Context IC >> Error: Primary decode conflict: VA...

2017 Aug 17

unable to emit vectorized code in LLVM IR

...his got further simplified to (aa+bb)*1000. int main(int argc, char** argv) { int a[1000], b[1000], c[1000]; int g=0; int aa=atoi(argv[1]), bb=atoi(argv[2]); for (int i=0; i<1000; i++) { a[i]=aa, b[i]=bb; c[i]=a[i] + b[i]; g+=c[i]; } ~Craig On Thu, Aug 17, 2017 at 11:37 AM, hameeza ahmed <hahmed2305 at gmail.com> wrote: > why is it happening? is there any way to solve this? > > On Thu, Aug 17, 2017 at 10:09 PM, hameeza ahmed <hahmed2305 at gmail.com> > wrote: > >> even if i make my code as follows: vectorized instructions not get >> emitted. What to do? >...
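The simplification described in this snippet can be checked outside the compiler. The sketch below is plain C and not compiler output; the function names `sum_by_loop` and `sum_closed_form` are hypothetical, and `aa` and `bb` are passed as parameters instead of being read from `argv`. It runs the loop from the snippet and gives the closed form (aa+bb)*1000 beside it. If an optimizer makes the same reduction, there may be no loop left to vectorize.

```c
/* The loop from the snippet: every element of a is aa, every element of b is bb. */
int sum_by_loop(int aa, int bb) {
    int a[1000], b[1000], c[1000];
    int g = 0;
    for (int i = 0; i < 1000; i++) {
        a[i] = aa;
        b[i] = bb;
        c[i] = a[i] + b[i];
        g += c[i];
    }
    return g;
}

/* What the loop reduces to: each of the 1000 iterations adds the same value (aa + bb). */
int sum_closed_form(int aa, int bb) {
    return (aa + bb) * 1000;
}
```

For small inputs the two functions agree: `sum_by_loop(3, 4)` and `sum_closed_form(3, 4)` both give 7000.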

2017 Sep 04

Issues in Vector Add Instruction Machine Code Emission

Sorry to ask but what does it mean to put both? On Tue, Sep 5, 2017 at 4:01 AM, Craig Topper <craig.topper at gmail.com> wrote: > Leave TA. Put both. > > ~Craig > > On Mon, Sep 4, 2017 at 4:00 PM, hameeza ahmed <hahmed2305 at gmail.com> > wrote: > >> You are right. But when i defined my instruction as follows: >> def P_256B_VADD : I<0xE1, MRMDestReg, (outs VRP_2048:$dst), (ins >> VRP_2048:$src1, VRPIM_2048:$src2),"P_256B_VADD\t{$src1, $src2, >> $dst|$dst, $src1, $src2}",...

2017 Jul 14

error:Ran out of lanemask bits to represent subregister

Do your 32768 registers also have sub registers? I can't tell you exactly what to change. I'm not familiar with the code. I would just be running grep or something. ~Craig On Fri, Jul 14, 2017 at 10:23 AM, hameeza ahmed <hahmed2305 at gmail.com> wrote: > Thank you so much. I think there is no issue with my definitions since i > have to use larger registers i.e 65536 bit register made from 2 32768 > registers. > I have seen your mentioned code files. But it looks difficult what to > change. > Could you pl...

2018 Aug 03

Vectorizing remainder loop

...f you struggle in convincing LoopVectorizationLegality to think remainder is just as legal to vectorize as main vector loop, you should be able to avoid that if you take this approach. You still have two approaches to unblock yourself in the short term. Thanks, Hideki From: hameeza ahmed [mailto:hahmed2305 at gmail.com] Sent: Friday, August 03, 2018 10:58 AM To: Saito, Hideki <hideki.saito at intel.com> Cc: Craig Topper <craig.topper at gmail.com>; Hal Finkel <hfinkel at anl.gov>; Friedman, Eli <efriedma at codeaurora.org>; ashutosh.nema at amd.com; llvm-dev at lists.llvm.org...

2017 Jul 19

error:Ran out of lanemask bits to represent subregisterr

...craig.topper at gmail.com>> wrote: >> >> Did you change the hardcoded 32 right before the line that prints >> that error in CodeGenRegisters.cpp to 64? >> >> ~Craig >> >> On Wed, Jul 19, 2017 at 11:38 AM, hameeza ahmed >> <hahmed2305 at gmail.com <mailto:hahmed2305 at gmail.com>> wrote: >> >> Thank You. >> >> I have replaced all the occurrences of unsigned with uint64_t in >> Lanemask.h and in all other related files like >> codegenregisters.cpp, codeg...

2018 Jan 29

Polly Dependency Analysis in MyPass

..._subdirectory(mypass) then used make -j9 then i used following and run on canonicalize IR $ opt -load lib/LLVMmypass.so -mypass vec-sum.preopt.ll On Mon, Jan 29, 2018 at 9:39 PM, Michael Kruse <llvmdev at meinersbur.de> wrote: > 2018-01-29 10:18 GMT-06:00 hameeza ahmed <hahmed2305 at gmail.com>: > > I tried writing following code. Could you please help me on this? What to > > modify here? > > Please send a script executing the commands that you have been using, > including all required files (eg .cpp, CMakeLists.txt, .ll, patches, > ...) > >...

2017 Aug 17

unable to emit vectorized code in LLVM IR

...? On Fri, Aug 18, 2017 at 12:17 AM, Craig Topper <craig.topper at gmail.com> wrote: > What was your lli command line? Is this based on your code where you > created 2048-bit instructions in the x86 backend? > > ~Craig > > On Thu, Aug 17, 2017 at 12:12 PM, hameeza ahmed <hahmed2305 at gmail.com> > wrote: > >> Ok. I have managed to vectorize the second loop in the following code. >> But the first loop is still not vectorized? Why? >> >> int main(int argc, char** argv) { >> int a[1000], b[1000], c[1000]; int g=0; >> int aa=atoi(argv[...

2017 Jul 12

Using new types v32f32, v32f64 in llvm backend not possible

...instruction based on the vector width/ no of iterations. I have tried several alternatives but could not succeed. Also I have asked this question many times but no one responds. Is there something wrong with this?? Kindly guide me. Thank You On Wed, Jul 12, 2017 at 12:56 AM, hameeza ahmed <hahmed2305 at gmail.com> wrote: > here by the ********my backend supports v64i32 i mean my hardware. > > On Wed, Jul 12, 2017 at 12:49 AM, hameeza ahmed <hahmed2305 at gmail.com> > wrote: > >> Thank you so much. it run fine. >> Can you please resolve following issue; >...

2017 Jul 11

error: In anonymous_4820: Unrecognized node 'VRR128'!

...er <craig.topper at gmail.com> wrote: > "add" only works for integers. Floating point requires fadd. They are > different operations in hardware too so you probably need different > instructions. > > ~Craig > > On Tue, Jul 11, 2017 at 8:55 AM, hameeza ahmed <hahmed2305 at gmail.com> > wrote: > >> Thank You. >> >> How to do the same for add please see the following; it gives duplication >> error. >> >> def VADD : I<0x0E, MRMDestReg, (outs VRR128:$dst), (ins VRR128:$src1, >> VRR128:$src2),"VADD\t{$src1, $s...
