Displaying 20 results from an estimated 83 matches for "hahmed2305".

2017 Aug 07
3
VBROADCAST Implementation Issues
...MemFrm"' failed. On Mon, Aug 7, 2017 at 8:23 PM, Craig Topper <craig.topper at gmail.com> wrote: > masked_gather takes 3 inputs. not just an address. See the AVX512 pattern > is pasted earlier > > ~Craig > > On Mon, Aug 7, 2017 at 1:54 AM, hameeza ahmed <hahmed2305 at gmail.com> > wrote: > >> Changed it to; >> >> def GATHER_256B : I<0x68, MRMSrcMem, (outs VR_2048:$dst, VK64:$mask), >> (ins i2048mem:$src), >> "GATHER_256B\t{$src, {$dst}{${mask}}|${dst} >> {${mask}}, $src}", >>...
2017 Aug 07
2
VBROADCAST Implementation Issues
...et _.RC:$dst, _.KRCWM:$mask_wb, > (GatherNode (_.VT _.RC:$src1), _.KRCWM:$mask, > vectoraddr:$src2))]>, EVEX, EVEX_K, > EVEX_CD8<_.EltSize, CD8VT1>; > } > > ~Craig > > On Sun, Aug 6, 2017 at 2:21 PM, hameeza ahmed <hahmed2305 at gmail.com> > wrote: > >> i want to implement gather for v64i32. i wrote following code. >> >> def GATHER_256B : I<0x68, MRMSrcMem, (outs VR_2048:$dst), (ins >> i2048mem:$src), >> "GATHER_256B\t{$src, $dst|$dst, $src}", >>...
2017 Jul 01
2
Jacobi 5 Point Stencil Code not Vectorizing
...; i++) for (j = 1; j <= N-2; j++) b[i][j] = 0.25 * (a[i][j] + a[i-1][j] + a[i+1][j] + a[i][j-1] + a[i][j+1]); for (i = 1; i <= N-2; i++) for (j = 1; j <= N-2; j++) a[i][j] = b[i][j]; } } I removed restrict over here. On Sun, Jul 2, 2017 at 3:11 AM, hameeza ahmed <hahmed2305 at gmail.com> wrote: > further i modified the code to the following; > > #include <stdio.h> > #define N 100351 > > // This function computes 2D-5 point Jacobi stencil > void stencil(int a[restrict][N], int b[restrict][N]) > { > int i, j, k; > for (k = 0...
2017 Aug 06
2
VBROADCAST Implementation Issues
...deGenDAGPatterns.cpp:2134: llvm::TreePatternNode *llvm::TreePattern::ParseTreePattern(llvm::Init *, llvm::StringRef): Assertion `New->getNumTypes() == 1 && "FIXME: Unhandled"' failed. What is my mistake? Please help me. On Mon, Aug 7, 2017 at 12:03 AM, hameeza ahmed <hahmed2305 at gmail.com> wrote: > I am trying to implement vector shuffle for v64i32. Is the following > correct? > > > def VSHUFFLE_256B : I<0xE8, MRMDestReg, (outs VR_2048:$dst), > (ins VR_2048:$src1, VRPIM_2048:$src2),"VSHUFFLE_256B\t{$src1, $src2, > $dst|$dst, $src1, $src...
2017 Sep 05
2
Issues in Vector Add Instruction Machine Code Emission
...Topper <craig.topper at gmail.com> wrote: > Put the TA's back. EVEX/EVEX_4V does not replace TA. They are for > different things. An EVEX/EVEX_4V instruction must use one of T8, TA, XOP8, > XOP9, XOPA. > > ~Craig > > On Mon, Sep 4, 2017 at 5:33 PM, hameeza ahmed <hahmed2305 at gmail.com> > wrote: > >> Thank You, >> I changed TA to EVEX or EVEX_4V. But now i am getting following error: >> >> Invalid prefix! >> UNREACHABLE executed at /lib/Target/X86/MCTargetDesc/X >> 86MCCodeEmitter.cpp:647! >> >> >> On Tue...
2017 Sep 05
2
Issues in Vector Add Instruction Machine Code Emission
...they don't have 2 sources. > > What do you intend to do with the binary output once you have it? You > don't seem to be targeting a particular binary definition so its > effectively just random numbers. > > ~Craig > > On Mon, Sep 4, 2017 at 4:28 PM, hameeza ahmed <hahmed2305 at gmail.com> > wrote: > >> Thank You. >> >> I used EVEX_4V with all the instructions. I replaced TA and EVEX both >> with EVEX_4V. Now, I am getting following error: >> >> llvm-tblgen: /utils/TableGen/X86RecognizableInstr.cpp:687: void >> llvm::X8...
2019 Apr 23
5
StringRef Iterator Variable Display
Hello, I want to display the variable names stored in a set<StringRef> by iterating over it, but they are not displayed with the following code: for (set<StringRef>::iterator sit = L.begin(); sit != L.end(); sit++) { errs() << *sit << " "; } How can I do this? Please help.
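A minimal sketch of the loop from the question above, not the poster's actual code. It assumes the set is filled from strings that outlive it, which the post does not state. `std::string_view` and `std::ostringstream` stand in for `llvm::StringRef` and `errs()` so the example builds without LLVM, and `joinNames` and `demo` are hypothetical names. Like `StringRef`, `std::string_view` does not own its characters, so the backing strings have to stay alive while the set is in use.

```cpp
#include <set>
#include <sstream>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical helper: writes every element of the set followed by a space,
// mirroring the iterator loop from the post.
std::string joinNames(const std::set<std::string_view> &L) {
  std::ostringstream os;
  for (std::set<std::string_view>::const_iterator sit = L.begin();
       sit != L.end(); ++sit) {
    os << *sit << " ";
  }
  return os.str();
}

// Builds the set from storage that stays alive for the whole call.
// The views in L point into `storage`, so `storage` must outlive L.
std::string demo() {
  std::vector<std::string> storage = {"b", "a", "c"};
  std::set<std::string_view> L;
  for (const std::string &s : storage) {
    L.insert(s); // implicit std::string -> std::string_view conversion
  }
  return joinNames(L);
}
```

Calling `demo()` yields the names in sorted order, each followed by a space. If the loop prints nothing in a real pass, one thing worth checking (an assumption, since the post gives no further detail) is whether the set is empty at that point or whether the strings it refers to have already been destroyed.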
2017 Sep 04
2
Issues in Vector Add Instruction Machine Code Emission
...that the destination and one of > the sources must be the same physical register. > > TA indicates which of the opcode maps the instruction belongs to. This > corresponds to encoding 0x3 of the VEX.mmmmm field. > > ~Craig > > On Mon, Sep 4, 2017 at 4:01 PM, hameeza ahmed <hahmed2305 at gmail.com> > wrote: > >> Sorry to ask but what does it mean to put both? >> >> On Tue, Sep 5, 2017 at 4:01 AM, Craig Topper <craig.topper at gmail.com> >> wrote: >> >>> Leave TA. Put both. >>> >>> ~Craig >>> >>...
2017 Jul 01
3
Jacobi 5 Point Stencil Code not Vectorizing
Does this happen because of a loop-carried dependence? If so, how can such code be vectorized? Please reply; I am waiting. On Jul 1, 2017 12:30 PM, "hameeza ahmed" <hahmed2305 at gmail.com> wrote: > I even tried Polly, but my LLVM IR still does not contain vector > instructions. I used the following command: > > clang -S -emit-llvm stencil.c -march=knl -O3 -mllvm -polly -mllvm > -polly-vectorizer=stripmine -o stencil_poly.ll > > Please specify wh...
2017 Jul 10
2
Conditional Register Assignment based on the no of loop iterations
...so that in instructioninfo.td while pattern matching both LD256 and LD256_N are treated separately. 1 will use Reg_B registers and other will use Reg_A respectively. Is it fine??? Please guide me... I need serious help, please..... Thank You On Mon, Jul 10, 2017 at 9:29 AM, hameeza ahmed <hahmed2305 at gmail.com> wrote: > or should i write a condition in registerinfo.td; to define the registers > in object Reg_A in specific order according to loop iterations. > > On Mon, Jul 10, 2017 at 9:17 AM, hameeza ahmed <hahmed2305 at gmail.com> > wrote: > >> hello, >...
2017 Jul 08
5
Error in v64i32 type in x86 backend
...e: > Yes its an opcode conflict. You'll have to look through Intel documents > and find an unused opcode. I've only added instructions based on a real > spec so I don't know how to make up an opcode. > > ~Craig > > On Fri, Jul 7, 2017 at 10:43 PM, hameeza ahmed <hahmed2305 at gmail.com> > wrote: > >> Thank You. >> >> Now i am getting this error repeatedly; >> >> Error: Primary decode conflict: VADD_256B would overwrite INC8r >> ModRM 192 >> Opcode 254 >> Context IC >> Error: Primary decode conflict: VA...
2017 Aug 17
4
unable to emit vectorized code in LLVM IR
...his got further simplified to (aa+bb)*1000. int main(int argc, char** argv) { int a[1000], b[1000], c[1000]; int g=0; int aa=atoi(argv[1]), bb=atoi(argv[2]); for (int i=0; i<1000; i++) { a[i]=aa, b[i]=bb; c[i]=a[i] + b[i]; g+=c[i]; } ~Craig On Thu, Aug 17, 2017 at 11:37 AM, hameeza ahmed <hahmed2305 at gmail.com> wrote: > why is it happening? is there any way to solve this? > > On Thu, Aug 17, 2017 at 10:09 PM, hameeza ahmed <hahmed2305 at gmail.com> > wrote: > >> even if i make my code as follows: vectorized instructions not get >> emitted. What to do? >...
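The simplification mentioned in the reply above can be checked with a small sketch. Every iteration adds the same value `aa + bb` to `g`, so the 1000-iteration loop is equivalent to `(aa + bb) * 1000`; once a compiler folds the loop to that expression, there is no loop left to vectorize. The function names `loopSum` and `closedForm` are hypothetical, and the inputs are passed as parameters instead of being read from `argv`.

```cpp
// Hypothetical helper: the loop from the post, with aa and bb as parameters.
int loopSum(int aa, int bb) {
  int a[1000], b[1000], c[1000];
  int g = 0;
  for (int i = 0; i < 1000; i++) {
    a[i] = aa;
    b[i] = bb;
    c[i] = a[i] + b[i];
    g += c[i]; // adds the same value (aa + bb) on every iteration
  }
  return g;
}

// The closed form that the loop can be folded to.
int closedForm(int aa, int bb) { return (aa + bb) * 1000; }
```

For any inputs that do not overflow `int`, `loopSum(aa, bb)` and `closedForm(aa, bb)` return the same value, which is why the arrays and the loop can disappear entirely at `-O3`.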
2017 Sep 04
2
Issues in Vector Add Instruction Machine Code Emission
Sorry to ask but what does it mean to put both? On Tue, Sep 5, 2017 at 4:01 AM, Craig Topper <craig.topper at gmail.com> wrote: > Leave TA. Put both. > > ~Craig > > On Mon, Sep 4, 2017 at 4:00 PM, hameeza ahmed <hahmed2305 at gmail.com> > wrote: > >> You are right. But when i defined my instruction as follows: >> def P_256B_VADD : I<0xE1, MRMDestReg, (outs VRP_2048:$dst), (ins >> VRP_2048:$src1, VRPIM_2048:$src2),"P_256B_VADD\t{$src1, $src2, >> $dst|$dst, $src1, $src2}",...
2017 Jul 14
3
error:Ran out of lanemask bits to represent subregister
Do your 32768 registers also have sub registers? I can't tell you exactly what to change. I'm not familiar with the code. I would just be running grep or something. ~Craig On Fri, Jul 14, 2017 at 10:23 AM, hameeza ahmed <hahmed2305 at gmail.com> wrote: > Thank you so much. I think there is no issue with my definitions since i > have to use larger registers i.e 65536 bit register made from 2 32768 > registers. > I have seen your mentioned code files. But it looks difficult what to > change. > Could you pl...
2018 Aug 03
2
Vectorizing remainder loop
...f you struggle in convincing LoopVectorizationLegality to think remainder is just as legal to vectorize as main vector loop, you should be able to avoid that if you take this approach. You still have two approaches to unblock yourself in the short term. Thanks, Hideki From: hameeza ahmed [mailto:hahmed2305 at gmail.com] Sent: Friday, August 03, 2018 10:58 AM To: Saito, Hideki <hideki.saito at intel.com> Cc: Craig Topper <craig.topper at gmail.com>; Hal Finkel <hfinkel at anl.gov>; Friedman, Eli <efriedma at codeaurora.org>; ashutosh.nema at amd.com; llvm-dev at lists.llvm.org...
2017 Jul 19
2
error:Ran out of lanemask bits to represent subregisterr
...craig.topper at gmail.com>> wrote: >> >> Did you change the hardcoded 32 right before the line that prints >> that error in CodeGenRegisters.cpp to 64? >> >> ~Craig >> >> On Wed, Jul 19, 2017 at 11:38 AM, hameeza ahmed >> <hahmed2305 at gmail.com <mailto:hahmed2305 at gmail.com>> wrote: >> >> Thank You. >> >> I have replaced all the occurrences of unsigned with uint64_t in >> Lanemask.h and in all other related files like >> codegenregisters.cpp, codeg...
2018 Jan 29
2
Polly Dependency Analysis in MyPass
..._subdirectory(mypass) then used make -j9 then i used following and run on canonicalize IR $ opt -load lib/LLVMmypass.so -mypass vec-sum.preopt.ll On Mon, Jan 29, 2018 at 9:39 PM, Michael Kruse <llvmdev at meinersbur.de> wrote: > 2018-01-29 10:18 GMT-06:00 hameeza ahmed <hahmed2305 at gmail.com>: > > I tried writing following code. Could you please help me on this? What to > > modify here? > > Please send a script executing the commands that you have been using, > including all required files (eg .cpp, CMakeLists.txt, .ll, patches, > ...) > >...
2017 Aug 17
2
unable to emit vectorized code in LLVM IR
...? On Fri, Aug 18, 2017 at 12:17 AM, Craig Topper <craig.topper at gmail.com> wrote: > What was your lli command line? Is this based on your code where you > created 2048-bit instructions in the x86 backend? > > ~Craig > > On Thu, Aug 17, 2017 at 12:12 PM, hameeza ahmed <hahmed2305 at gmail.com> > wrote: > >> Ok. I have managed to vectorize the second loop in the following code. >> But the first loop is still not vectorized? Why? >> >> int main(int argc, char** argv) { >> int a[1000], b[1000], c[1000]; int g=0; >> int aa=atoi(argv[...
2017 Jul 12
2
Using new types v32f32, v32f64 in llvm backend not possible
...instruction based on the vector width / number of iterations. I have tried several alternatives without success. I have also asked this question many times, but no one has responded. Is there something wrong with it? Kindly guide me. Thank you. On Wed, Jul 12, 2017 at 12:56 AM, hameeza ahmed <hahmed2305 at gmail.com> wrote: > Here, by "my backend supports v64i32" I mean my hardware. > > On Wed, Jul 12, 2017 at 12:49 AM, hameeza ahmed <hahmed2305 at gmail.com> > wrote: > >> Thank you so much. It ran fine. >> Can you please resolve the following issue: >...
2017 Jul 11
2
error: In anonymous_4820: Unrecognized node 'VRR128'!
...er <craig.topper at gmail.com> wrote: > "add" only works for integers. Floating point requires fadd. They are > different operations in hardware too so you probably need different > instructions. > > ~Craig > > On Tue, Jul 11, 2017 at 8:55 AM, hameeza ahmed <hahmed2305 at gmail.com> > wrote: > >> Thank You. >> >> How to do the same for add please see the following; it gives duplication >> error. >> >> def VADD : I<0x0E, MRMDestReg, (outs VRR128:$dst), (ins VRR128:$src1, >> VRR128:$src2),"VADD\t{$src1, $s...