hameeza ahmed via llvm-dev
2017-Jul-08 06:03 UTC
[llvm-dev] Error in v64i32 type in x86 backend
Thank you. I have seen that the opcode is 8 bits and that all the combinations are already used in LLVM's x86 backend.

What should I do now?

On Sat, Jul 8, 2017 at 10:57 AM, Craig Topper <craig.topper at gmail.com> wrote:

> Yes, it's an opcode conflict. You'll have to look through Intel documents
> and find an unused opcode. I've only added instructions based on a real
> spec, so I don't know how to make up an opcode.
>
> ~Craig
>
> On Fri, Jul 7, 2017 at 10:43 PM, hameeza ahmed <hahmed2305 at gmail.com> wrote:
>
>> Thank you.
>>
>> Now I am getting this error repeatedly:
>>
>> Error: Primary decode conflict: VADD_256B would overwrite INC8r
>>   ModRM 192
>>   Opcode 254
>>   Context IC
>> Error: Primary decode conflict: VADD_256B would overwrite INC8r
>>   ModRM 193
>>   Opcode 254
>>   Context IC
>>
>> Is it due to an opcode conflict? Which opcode should I use then?
>>
>> On Sat, Jul 8, 2017 at 10:33 AM, Craig Topper <craig.topper at gmail.com> wrote:
>>
>>> Keep I
>>>
>>> ~Craig
>>>
>>> On Fri, Jul 7, 2017 at 10:28 PM, hameeza ahmed <hahmed2305 at gmail.com> wrote:
>>>
>>>> I am keeping this one:
>>>>
>>>> def VADD_256B : I<0xFE, MRMDestReg, (outs VR2048:$dst),
>>>>     (ins VR2048:$src1, VR2048:$src2),
>>>>     "VADD_256B\t{$dst, $src1, $src2 }",
>>>>     [(set VR2048:$dst, (add VR2048:$src1, VR2048:$src2))]>;
>>>>
>>>> On Sat, Jul 8, 2017 at 10:17 AM, hameeza ahmed <hahmed2305 at gmail.com> wrote:
>>>>
>>>>> Sorry, I didn't understand the RI/I point. Should I keep RI or I?
>>>>>
>>>>> On Sat, Jul 8, 2017 at 10:13 AM, Craig Topper <craig.topper at gmail.com> wrote:
>>>>>
>>>>>> I think so.
>>>>>>
>>>>>> ~Craig
>>>>>>
>>>>>> On Fri, Jul 7, 2017 at 10:10 PM, hameeza ahmed <hahmed2305 at gmail.com> wrote:
>>>>>>
>>>>>>> Sorry to disturb you again.
>>>>>>>
>>>>>>> def VADD_256B : RI<0xFE, MRMDestReg, (outs VR2048:$dst),
>>>>>>>     (ins VR2048:$src1, VR2048:$src2),
>>>>>>>     "VADD_256B\t{$dst, $src1, $src2 }",
>>>>>>>     [(set VR2048:$dst, (add VR2048:$src1, VR2048:$src2))],
>>>>>>>     IIC_XADD_REG>, TB;
>>>>>>>
>>>>>>> Is it fine now?
>>>>>>>
>>>>>>> On Sat, Jul 8, 2017 at 10:00 AM, Craig Topper <craig.topper at gmail.com> wrote:
>>>>>>>
>>>>>>>> Oops, that should have said "REX prefix" in the first sentence.
>>>>>>>>
>>>>>>>> ~Craig
>>>>>>>>
>>>>>>>> On Fri, Jul 7, 2017 at 9:59 PM, Craig Topper <craig.topper at gmail.com> wrote:
>>>>>>>>
>>>>>>>>> You don't want RI. That's used for instructions that need a reg
>>>>>>>>> prefix. You need to use $src1 and $src2 in the assembly string too.
>>>>>>>>> It also looks like you have two closing ] brackets.
>>>>>>>>>
>>>>>>>>> ~Craig
>>>>>>>>>
>>>>>>>>> On Fri, Jul 7, 2017 at 9:55 PM, hameeza ahmed <hahmed2305 at gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> Thank you.
>>>>>>>>>> I have changed it as follows. Is it fine now?
>>>>>>>>>>
>>>>>>>>>> def VADD_256B : I<0xFE, MRMDestReg, (outs VR2048:$dst),
>>>>>>>>>>     (ins VR2048:$src1, VR2048:$src2),
>>>>>>>>>>     "VADD_256B\t{$src, $dst|$dst, $src}",
>>>>>>>>>>     [(set VR2048:$dst, (add VR2048:$src1, VR2048:$src2))]]>;
>>>>>>>>>>
>>>>>>>>>> I have also changed the class from RI to I here. Does it make any
>>>>>>>>>> difference?
>>>>>>>>>>
>>>>>>>>>> On Sat, Jul 8, 2017 at 9:38 AM, Craig Topper <craig.topper at gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> IIC_XADD_REG is used to associate latency and other information
>>>>>>>>>>> for use by the instruction scheduling pass.
>>>>>>>>>>>
>>>>>>>>>>> You're missing a pattern in the square brackets to match an add
>>>>>>>>>>> node. You also need two VR2048 registers in the 'ins'.
>>>>>>>>>>>
>>>>>>>>>>> ~Craig
>>>>>>>>>>>
>>>>>>>>>>> On Fri, Jul 7, 2017 at 9:29 PM, hameeza ahmed <hahmed2305 at gmail.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Can you please tell me whether the following add is correct for
>>>>>>>>>>>> adding two 64xi32 values?
>>>>>>>>>>>>
>>>>>>>>>>>> def VADD_256B : RI<0xFE, MRMDestReg, (outs VR2048:$dst),
>>>>>>>>>>>>     (ins VR2048:$src),
>>>>>>>>>>>>     "VADD_256B\t{$src, $dst|$dst, $src}", [],
>>>>>>>>>>>>     IIC_XADD_REG>, TB;
>>>>>>>>>>>>
>>>>>>>>>>>> What is IIC_XADD_REG here?
>>>>>>>>>>>>
>>>>>>>>>>>> On Sat, Jul 8, 2017 at 8:48 AM, Craig Topper <craig.topper at gmail.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Change the i32 in the store pattern to v64i32.
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Fri, Jul 7, 2017 at 8:41 PM hameeza ahmed <hahmed2305 at gmail.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thank you. I understood how the AVX-512 vector instructions are
>>>>>>>>>>>>>> written in X86InstrAVX512.td. I need to define my own vector
>>>>>>>>>>>>>> instructions, so I wrote the following in X86InstrInfo.td:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> def VMOV_256B_RM : I<0x6F, MRMSrcMem, (outs VR2048:$dst),
>>>>>>>>>>>>>>     (ins i32mem:$src),
>>>>>>>>>>>>>>     "vmov_256B_rm\t{$src, $dst|$dst, $src}",
>>>>>>>>>>>>>>     [(set VR2048:$dst, (v64i32 (scalar_to_vector (loadi32 addr:$src))))],
>>>>>>>>>>>>>>     IIC_MOV_MEM>, EVEX;
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> def VMOV_256B_MR : I<0x7F, MRMDestMem, (outs),
>>>>>>>>>>>>>>     (ins i32mem:$dst, VR2048:$src),
>>>>>>>>>>>>>>     "vmov_256B_mr\t{$src, $dst|$dst, $src}",
>>>>>>>>>>>>>>     [(store (i32 (bitconvert VR2048:$src)), addr:$dst)],
>>>>>>>>>>>>>>     IIC_MOV_MEM>, EVEX;
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> When I build, I get these instructions in X86GenInstrInfo.
>>>>>>>>>>>>>> But my instruction is still not selected when I run the input
>>>>>>>>>>>>>> file in debug mode; I get the following errors:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> ===== Instruction selection begins: BB#1 'vector.body'
>>>>>>>>>>>>>> Selecting: t9: ch = store<ST256[bitcast ([65 x i32]* @c to <64 x i32>*)](align=16)(tbaa=<0x3817578>)> t8, t7, t11, undef:i64
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> ISEL: Starting pattern match on root node: t9: ch = store<ST256[bitcast ([65 x i32]* @c to <64 x i32>*)](align=16)(tbaa=<0x3817578>)> t8, t7, t11, undef:i64
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index 14, continuing at 81
>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index 82, continuing at 149
>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index 150, continuing at 217
>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index 218, continuing at 267
>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index 268, continuing at 317
>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index 318, continuing at 367
>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index 368, continuing at 394
>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index 395, continuing at 421
>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index 422, continuing at 471
>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index 472, continuing at 521
>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index 522, continuing at 571
>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index 572, continuing at 639
>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index 640, continuing at 707
>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index 708, continuing at 775
>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index 776, continuing at 804
>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index 805, continuing at 833
>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index 834, continuing at 862
>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index 863, continuing at 891
>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index 892, continuing at 920
>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index 921, continuing at 949
>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index 950, continuing at 987
>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index 988, continuing at 1025
>>>>>>>>>>>>>> Match failed at index 12
>>>>>>>>>>>>>> Continuing at 1026
>>>>>>>>>>>>>> OpcodeSwitch from 1029 to 5725
>>>>>>>>>>>>>> Match failed at index 5743
>>>>>>>>>>>>>> Continuing at 5772
>>>>>>>>>>>>>> Match failed at index 5776
>>>>>>>>>>>>>> Continuing at 5805
>>>>>>>>>>>>>> Match failed at index 5809
>>>>>>>>>>>>>> Continuing at 5838
>>>>>>>>>>>>>> Match failed at index 5842
>>>>>>>>>>>>>> Continuing at 5911
>>>>>>>>>>>>>> Match failed at index 5915
>>>>>>>>>>>>>> Continuing at 5953
>>>>>>>>>>>>>> Match failed at index 5957
>>>>>>>>>>>>>> Continuing at 5995
>>>>>>>>>>>>>> Match failed at index 5999
>>>>>>>>>>>>>> Continuing at 6037
>>>>>>>>>>>>>> Match failed at index 6041
>>>>>>>>>>>>>> Continuing at 6084
>>>>>>>>>>>>>> Match failed at index 6088
>>>>>>>>>>>>>> Continuing at 6131
>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index 6138, continuing at 6181
>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index 6182, continuing at 6228
>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index 6235, continuing at 6384
>>>>>>>>>>>>>> Match failed at index 6388
>>>>>>>>>>>>>> Continuing at 6419
>>>>>>>>>>>>>> Match failed at index 6423
>>>>>>>>>>>>>> Continuing at 6454
>>>>>>>>>>>>>> Match failed at index 6458
>>>>>>>>>>>>>> Continuing at 6489
>>>>>>>>>>>>>> Continuing at 6490
>>>>>>>>>>>>>> Continuing at 6491
>>>>>>>>>>>>>> Continuing at 6492
>>>>>>>>>>>>>> Match failed at index 6514
>>>>>>>>>>>>>> Continuing at 6545
>>>>>>>>>>>>>> Match failed at index 6562
>>>>>>>>>>>>>> Continuing at 6593
>>>>>>>>>>>>>> Match failed at index 6610
>>>>>>>>>>>>>> Continuing at 6641
>>>>>>>>>>>>>> Continuing at 6642
>>>>>>>>>>>>>> Match failed at index 6658
>>>>>>>>>>>>>> Continuing at 6772
>>>>>>>>>>>>>> Match failed at index 6788
>>>>>>>>>>>>>> Continuing at 6902
>>>>>>>>>>>>>> Continuing at 13636
>>>>>>>>>>>>>> Match failed at index 13640
>>>>>>>>>>>>>> Continuing at 14940
>>>>>>>>>>>>>> Match failed at index 14943
>>>>>>>>>>>>>> Continuing at 15415
>>>>>>>>>>>>>> Match failed at index 15417
>>>>>>>>>>>>>> Continuing at 15570
>>>>>>>>>>>>>> Match failed at index 15571
>>>>>>>>>>>>>> Continuing at 15598
>>>>>>>>>>>>>> Match failed at index 15599
>>>>>>>>>>>>>> Continuing at 15716
>>>>>>>>>>>>>> Match failed at index 15719
>>>>>>>>>>>>>> Continuing at 15837
>>>>>>>>>>>>>> Match failed at index 15840
>>>>>>>>>>>>>> Continuing at 16198
>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index 16203, continuing at 16285
>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index 16286, continuing at 16394
>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index 16395, continuing at 16464
>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index 16465, continuing at 16487
>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index 16488, continuing at 16510
>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index 16511, continuing at 16533
>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index 16534, continuing at 16556
>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index 16557, continuing at 16680
>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index 16681, continuing at 16804
>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index 16805, continuing at 16890
>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index 16891, continuing at 16976
>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index 16978, continuing at 17169
>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index 17171, continuing at 17342
>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index 17344, continuing at 17497
>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index 17499, continuing at 17632
>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index 17634, continuing at 17801
>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index 17803, continuing at 17944
>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index 17946, continuing at 18074
>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index 18075, continuing at 18178
>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index 18179, continuing at 18253
>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index 18254, continuing at 18278
>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index 18279, continuing at 18303
>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index 18304, continuing at 18328
>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index 18329, continuing at 18376
>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index 18377, continuing at 18424
>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index 18425, continuing at 18520
>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index 18521, continuing at 18636
>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index 18637, continuing at 18661
>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index 18662, continuing at 18711
>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index 18712, continuing at 18736
>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index 18737, continuing at 18770
>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index 18771, continuing at 18856
>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index 18857, continuing at 18942
>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index 18943, continuing at 19028
>>>>>>>>>>>>>> Match failed at index 16201
>>>>>>>>>>>>>> Continuing at 19029
>>>>>>>>>>>>>> LLVM ERROR: Cannot select: t9: ch = store<ST256[bitcast ([65 x i32]* @c to <64 x i32>*)](align=16)(tbaa=<0x3817578>)> t8, t7, t11, undef:i64
>>>>>>>>>>>>>>   t7: v64i32 = add t6, t4
>>>>>>>>>>>>>>     t6: v64i32,ch = load<LD256[bitcast ([65 x i32]* @c to <64 x i32>*)](align=16)(tbaa=<0x3817578>)(dereferenceable)> t0, t11, undef:i64
>>>>>>>>>>>>>>       t11: i64 = X86ISD::Wrapper TargetGlobalAddress:i64<[65 x i32]* @c> 0
>>>>>>>>>>>>>>         t10: i64 = TargetGlobalAddress<[65 x i32]* @c> 0
>>>>>>>>>>>>>>       t3: i64 = undef
>>>>>>>>>>>>>>     t4: v64i32,ch = load<LD256[bitcast ([65 x i32]* @b to <64 x i32>*)](align=16)(tbaa=<0x3817578>)(dereferenceable)> t0, t13, undef:i64
>>>>>>>>>>>>>>       t13: i64 = X86ISD::Wrapper TargetGlobalAddress:i64<[65 x i32]* @b> 0
>>>>>>>>>>>>>>         t12: i64 = TargetGlobalAddress<[65 x i32]* @b> 0
>>>>>>>>>>>>>>       t3: i64 = undef
>>>>>>>>>>>>>>   t11: i64 = X86ISD::Wrapper TargetGlobalAddress:i64<[65 x i32]* @c> 0
>>>>>>>>>>>>>>     t10: i64 = TargetGlobalAddress<[65 x i32]* @c> 0
>>>>>>>>>>>>>>   t3: i64 = undef
>>>>>>>>>>>>>> In function: foo
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> What could be the reason for this? Please correct me.
>>>>>>>>>>>>>> I am stuck at this point.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Fri, Jul 7, 2017 at 10:59 PM, Friedman, Eli <efriedma at codeaurora.org> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> The word "fold" is used all over LLVM. It generally refers
>>>>>>>>>>>>>>> to transformations which delete an instruction.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> If you're asking about
>>>>>>>>>>>>>>> http://llvm.org/docs/CodeGenerator.html#instruction-folding ,
>>>>>>>>>>>>>>> it just means an instruction which was produced by the
>>>>>>>>>>>>>>> "instruction folding" transform; there isn't anything special
>>>>>>>>>>>>>>> about the instruction itself.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> -Eli
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On 7/6/2017 10:51 PM, hameeza ahmed wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> What is meant by folded instructions in LLVM?
>>>>>>>>>>>>>>> How do they work?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Fri, Jul 7, 2017 at 10:19 AM, hameeza ahmed <hahmed2305 at gmail.com> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Thank you.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Fri, Jul 7, 2017 at 10:03 AM, Craig Topper <craig.topper at gmail.com> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Yes, that error is from instruction selection. I think
>>>>>>>>>>>>>>>>> your legalization changes worked fine.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> ~Craig
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Thu, Jul 6, 2017 at 8:21 PM, hameeza ahmed via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> I also ran the following command:
>>>>>>>>>>>>>>>>>> llc -debug filer-knl_o3.ll
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Its output is attached here. Looking at the output, can we
>>>>>>>>>>>>>>>>>> say that legalization runs fine and that the error is due to
>>>>>>>>>>>>>>>>>> instruction selection/pattern matching, which is not yet
>>>>>>>>>>>>>>>>>> implemented?
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> So do I need to worry and try to correct it at this stage,
>>>>>>>>>>>>>>>>>> or should I move forward and implement instruction
>>>>>>>>>>>>>>>>>> selection/pattern matching?
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Please guide me.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Thank you
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On Fri, Jul 7, 2017 at 8:00 AM, hameeza ahmed <hahmed2305 at gmail.com> wrote:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Thank you. I have seen these links, but they don't cover
>>>>>>>>>>>>>>>>>>> the problem that I mentioned. I am doing everything step
>>>>>>>>>>>>>>>>>>> by step.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> I haven't yet worked on the instruction selection phase or
>>>>>>>>>>>>>>>>>>> its files. Before that, I am trying to do legalization by
>>>>>>>>>>>>>>>>>>> allowing more than 16 vector elements, i.e. 64xi32. So far
>>>>>>>>>>>>>>>>>>> I have mainly worked with two files: X86RegisterInfo.td, to
>>>>>>>>>>>>>>>>>>> define the register class to be used in legalization, and,
>>>>>>>>>>>>>>>>>>> most importantly, X86ISelLowering.cpp.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Is there any relation between this and instruction
>>>>>>>>>>>>>>>>>>> selection? Since instruction selection comes after combine
>>>>>>>>>>>>>>>>>>> and legalize, I haven't worked on it yet.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Please correct me, I am stuck here.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Thank you again
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> On Fri, Jul 7, 2017 at 7:11 AM, Friedman, Eli <efriedma at codeaurora.org> wrote:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Have you read http://llvm.org/docs/WritingAnLLVMBackend.html
>>>>>>>>>>>>>>>>>>>> and http://llvm.org/docs/CodeGenerator.html ?
>>>>>>>>>>>>>>>>>>>> http://llvm.org/docs/WritingAnLLVMBackend.html#instruction-selector
>>>>>>>>>>>>>>>>>>>> describes how to define a store instruction.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> -Eli
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> On 7/6/2017 6:51 PM, hameeza ahmed via llvm-dev wrote:
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Please correct me, I am stuck at this point.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> On Jul 6, 2017 5:18 PM, "hameeza ahmed" <hahmed2305 at gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Hello,
>>>>>>>>>>>>>>>>>>>> I am experimenting with increasing the register/vector
>>>>>>>>>>>>>>>>>>>> width in the x86 backend to 64 elements of 32 bits
>>>>>>>>>>>>>>>>>>>> instead of 16.
>>>>>>>>>>>>>>>>>>>> For example, I have a loop with 65 iterations. Even if my
>>>>>>>>>>>>>>>>>>>> IR generates a v64i32 and 1 scalar, the backend still
>>>>>>>>>>>>>>>>>>>> breaks the v64i32 into 4 v16i32. I want it to retain
>>>>>>>>>>>>>>>>>>>> v64i32; likewise, if there are 128 elements in the loop,
>>>>>>>>>>>>>>>>>>>> it should break them into 2 v64i32 instructions.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> To do this, I made the necessary changes in
>>>>>>>>>>>>>>>>>>>> X86ISelLowering.cpp and rebuilt LLVM. When I use the
>>>>>>>>>>>>>>>>>>>> option -view-dag-combine2-dags, I get the required output
>>>>>>>>>>>>>>>>>>>> in the graph but the following error on the console:
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> LLVM ERROR: Cannot select: t10: ch = store<ST256[bitcast ([65 x i32]* @a to <64 x i32>*)](align=16)(tbaa=<0x30c5438>)> t9, t7, t12, undef:i64
>>>>>>>>>>>>>>>>>>>>   t7: v64i32 = add t6, t4
>>>>>>>>>>>>>>>>>>>>     t6: v64i32,ch = load<LD256[bitcast ([65 x i32]* @c to <64 x i32>*)](align=16)(tbaa=<0x30c5438>)(dereferenceable)> t0, t14, undef:i64
>>>>>>>>>>>>>>>>>>>>       t14: i64 = X86ISD::Wrapper TargetGlobalAddress:i64<[65 x i32]* @c> 0
>>>>>>>>>>>>>>>>>>>>         t13: i64 = TargetGlobalAddress<[65 x i32]* @c> 0
>>>>>>>>>>>>>>>>>>>>       t3: i64 = undef
>>>>>>>>>>>>>>>>>>>>     t4: v64i32,ch = load<LD256[bitcast ([65 x i32]* @b to <64 x i32>*)](align=16)(tbaa=<0x30c5438>)(dereferenceable)> t0, t16, undef:i64
>>>>>>>>>>>>>>>>>>>>       t16: i64 = X86ISD::Wrapper TargetGlobalAddress:i64<[65 x i32]* @b> 0
>>>>>>>>>>>>>>>>>>>>         t15: i64 = TargetGlobalAddress<[65 x i32]* @b> 0
>>>>>>>>>>>>>>>>>>>>       t3: i64 = undef
>>>>>>>>>>>>>>>>>>>>   t12: i64 = X86ISD::Wrapper TargetGlobalAddress:i64<[65 x i32]* @a> 0
>>>>>>>>>>>>>>>>>>>>     t11: i64 = TargetGlobalAddress<[65 x i32]* @a> 0
>>>>>>>>>>>>>>>>>>>>   t3: i64 = undef
>>>>>>>>>>>>>>>>>>>> In function: foo
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> The DAG after legalization is also attached here.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> The source is a vector sum of 65 elements.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Kindly correct me.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> _______________________________________________
>>>>>>>>>>>>>>>>>>>> LLVM Developers mailing list
>>>>>>>>>>>>>>>>>>>> llvm-dev at lists.llvm.org
>>>>>>>>>>>>>>>>>>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>>>>>>> Employee of Qualcomm Innovation Center, Inc.
>>>>>>>>>>>>>>>>>>>> Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project
>>>>>>>>>>>>>
>>>>>>>>>>>>> --
>>>>>>>>>>>>> ~Craig
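
A note on the "Primary decode conflict" error quoted in the message above: the tblgen output prints the opcode and ModRM byte in decimal, which hides what is colliding. The following is only a minimal sketch, not something posted in the thread, that splits those two ModRM values into their bit fields. The helper name split_modrm is made up for illustration; the field layout (2-bit mod, 3-bit reg, 3-bit rm) is the standard x86 ModRM layout.

```python
def split_modrm(byte):
    """Split an x86 ModRM byte into its (mod, reg, rm) bit fields.

    Hypothetical helper for illustration only; it is not part of LLVM.
    """
    if not 0 <= byte <= 0xFF:
        raise ValueError("ModRM must fit in one byte")
    mod = (byte >> 6) & 0b11   # top 2 bits: 0b11 means register-direct
    reg = (byte >> 3) & 0b111  # middle 3 bits: register or /digit extension
    rm = byte & 0b111          # low 3 bits: register or memory operand
    return mod, reg, rm


if __name__ == "__main__":
    opcode = 254  # "Opcode 254" in the error message, i.e. 0xFE
    for modrm in (192, 193):  # "ModRM 192" and "ModRM 193"
        mod, reg, rm = split_modrm(modrm)
        print(f"opcode=0x{opcode:02X} modrm=0x{modrm:02X} "
              f"mod={mod:02b} reg={reg} rm={rm}")
```

Both values decode to mod=11 and reg=0, which is the register form of 0xFE /0. In the Intel manuals that encoding is INC r/m8 (0xFE /1 is DEC r/m8), which matches the INC8r named in the error. An MRMDestReg instruction uses the reg field for a register operand rather than as an opcode extension, so putting a new instruction on plain 0xFE in the same opcode map would plausibly claim every reg value, including the ones already taken.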
hameeza ahmed via llvm-dev
2017-Jul-08 06:23 UTC
[llvm-dev] Error in v64i32 type in x86 backend
Thank you. The add is working fine; I am using opcode 0x0F, which is unused.

Sorry to disturb you, but the load is not matching its pattern. Is the following load correct?

def VMOV_256B_RM : I<0x6F, MRMSrcMem, (outs VR2048:$dst),
    (ins i32mem:$src),
    "vmov_256B_rm\t{$src, $dst|$dst, $src}",
    [(set VR2048:$dst, (v64i32 (scalar_to_vector (loadi32 addr:$src))))],
    IIC_MOV_MEM>, EVEX;

I am getting this error:

LLVM ERROR: Cannot select: t4: v64i32,ch = load<LD256[bitcast ([65 x i32]* @b to <64 x i32>*)](align=16)(tbaa=<0x3fb8578>)(dereferenceable)> t0, t13, undef:i64
  t13: i64 = X86ISD::Wrapper TargetGlobalAddress:i64<[65 x i32]* @b> 0
    t12: i64 = TargetGlobalAddress<[65 x i32]* @b> 0
  t3: i64 = undef
In function: foo

On Sat, Jul 8, 2017 at 11:03 AM, hameeza ahmed <hahmed2305 at gmail.com> wrote:

> Thank you.
>
> I have seen that the opcode is 8 bits and that all the combinations are
> already used in LLVM's x86 backend.
>
> What should I do now?
>
> On Sat, Jul 8, 2017 at 10:57 AM, Craig Topper <craig.topper at gmail.com> wrote:
>
>> Yes, it's an opcode conflict. You'll have to look through Intel documents
>> and find an unused opcode. I've only added instructions based on a real
>> spec, so I don't know how to make up an opcode.
>>
>> ~Craig
>>
>> On Fri, Jul 7, 2017 at 10:43 PM, hameeza ahmed <hahmed2305 at gmail.com> wrote:
>>
>>> Thank you.
>>>
>>> Now I am getting this error repeatedly:
>>>
>>> Error: Primary decode conflict: VADD_256B would overwrite INC8r
>>>   ModRM 192
>>>   Opcode 254
>>>   Context IC
>>> Error: Primary decode conflict: VADD_256B would overwrite INC8r
>>>   ModRM 193
>>>   Opcode 254
>>>   Context IC
>>>
>>> Is it due to an opcode conflict? Which opcode should I use then?
>>> >>> >>> On Sat, Jul 8, 2017 at 10:33 AM, Craig Topper <craig.topper at gmail.com> >>> wrote: >>> >>>> Keep I >>>> >>>> ~Craig >>>> >>>> On Fri, Jul 7, 2017 at 10:28 PM, hameeza ahmed <hahmed2305 at gmail.com> >>>> wrote: >>>> >>>>> I keep this one; >>>>> >>>>> def VADD_256B : I<0xFE, MRMDestReg, (outs VR2048:$dst), (ins VR2048:$src1, >>>>> VR2048:$src2), >>>>> "VADD_256B\t{$dst, $src1, $src2 }", [(set VR2048:$dst, >>>>> (add VR2048:$src1, VR2048:$src2))]>; >>>>> >>>>> On Sat, Jul 8, 2017 at 10:17 AM, hameeza ahmed <hahmed2305 at gmail.com> >>>>> wrote: >>>>> >>>>>> sorry i didnt understand RI/ I thing. should i keep RI or I? >>>>>> >>>>>> On Sat, Jul 8, 2017 at 10:13 AM, Craig Topper <craig.topper at gmail.com >>>>>> > wrote: >>>>>> >>>>>>> I think so. >>>>>>> >>>>>>> ~Craig >>>>>>> >>>>>>> On Fri, Jul 7, 2017 at 10:10 PM, hameeza ahmed <hahmed2305 at gmail.com >>>>>>> > wrote: >>>>>>> >>>>>>>> sorry to disturb again,, >>>>>>>> >>>>>>>> def VADD_256B : RI<0xFE, MRMDestReg, (outs VR2048:$dst), (ins VR >>>>>>>> 2048:$src1, VR2048:$src2), >>>>>>>> "VADD_256B\t{$dst, $src1, $src2 }", [(set VR2048:$dst, >>>>>>>> (add VR2048:$src1, VR2048:$src2))], IIC_XADD_REG>, TB; >>>>>>>> >>>>>>>> >>>>>>>> Is it fine now?? >>>>>>>> >>>>>>>> >>>>>>>> On Sat, Jul 8, 2017 at 10:00 AM, Craig Topper < >>>>>>>> craig.topper at gmail.com> wrote: >>>>>>>> >>>>>>>>> Oops that should have said "REX prefix" in the first sentence. >>>>>>>>> >>>>>>>>> ~Craig >>>>>>>>> >>>>>>>>> On Fri, Jul 7, 2017 at 9:59 PM, Craig Topper < >>>>>>>>> craig.topper at gmail.com> wrote: >>>>>>>>> >>>>>>>>>> You don't want RI. That's used for instructions that need a reg >>>>>>>>>> prefix. You need to use $src1 and $src2 in the assembly string too. It also >>>>>>>>>> looks like you have two closing ] brackets. 
>>>>>>>>>> >>>>>>>>>> ~Craig >>>>>>>>>> >>>>>>>>>> On Fri, Jul 7, 2017 at 9:55 PM, hameeza ahmed < >>>>>>>>>> hahmed2305 at gmail.com> wrote: >>>>>>>>>> >>>>>>>>>>> Thank you; >>>>>>>>>>> i have changed as follows.is it fine now? >>>>>>>>>>> >>>>>>>>>>> def VADD_256B : I<0xFE, MRMDestReg, (outs VR2048:$dst), (ins >>>>>>>>>>> VR2048:$src1, VR2048:$src2), >>>>>>>>>>> "VADD_256B\t{$src, $dst|$dst, $src}", [(set VR >>>>>>>>>>> 2048:$dst, (add VR2048:$src1, VR2048:$src2))]]>; >>>>>>>>>>> >>>>>>>>>>> Also here i have changed class RI to I. Does it make any >>>>>>>>>>> difference? >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> On Sat, Jul 8, 2017 at 9:38 AM, Craig Topper < >>>>>>>>>>> craig.topper at gmail.com> wrote: >>>>>>>>>>> >>>>>>>>>>>> IIC_XADD_REG is used to associate latency and other information >>>>>>>>>>>> for use by the instruction scheduling pass. >>>>>>>>>>>> >>>>>>>>>>>> You're missing a pattern in the square bracket to match an add >>>>>>>>>>>> node. You also need two VR2048 registers in the 'ins' >>>>>>>>>>>> >>>>>>>>>>>> ~Craig >>>>>>>>>>>> >>>>>>>>>>>> On Fri, Jul 7, 2017 at 9:29 PM, hameeza ahmed < >>>>>>>>>>>> hahmed2305 at gmail.com> wrote: >>>>>>>>>>>> >>>>>>>>>>>>> Can you please tell whether following add is correct to add 2 >>>>>>>>>>>>> 64xi32 numbers. >>>>>>>>>>>>> >>>>>>>>>>>>> def VADD_256B : RI<0xFE, MRMDestReg, (outs VR2048:$dst), >>>>>>>>>>>>> (ins VR2048:$src), >>>>>>>>>>>>> "VADD_256B\t{$src, $dst|$dst, $src}", [], >>>>>>>>>>>>> IIC_XADD_REG>, TB; >>>>>>>>>>>>> >>>>>>>>>>>>> what is llc_xadd_reg here? >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> On Sat, Jul 8, 2017 at 8:48 AM, Craig Topper < >>>>>>>>>>>>> craig.topper at gmail.com> wrote: >>>>>>>>>>>>> >>>>>>>>>>>>>> Change the i32 in the store pattern to v64i32. >>>>>>>>>>>>>> >>>>>>>>>>>>>> On Fri, Jul 7, 2017 at 8:41 PM hameeza ahmed < >>>>>>>>>>>>>> hahmed2305 at gmail.com> wrote: >>>>>>>>>>>>>> >>>>>>>>>>>>>>> Thank you. 
i understood how avx512 vector instructions are >>>>>>>>>>>>>>> written in x86instravx512. i need to define my vector instructions so i >>>>>>>>>>>>>>> wrote; >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> def VMOV_256B_RM : I<0x6F, MRMSrcMem, (outs VR2048:$dst), >>>>>>>>>>>>>>> (ins i32mem:$src), >>>>>>>>>>>>>>> "vmov_256B_rm\t{$src, $dst|$dst, $src}", >>>>>>>>>>>>>>> [(set VR2048:$dst, (v64i32 >>>>>>>>>>>>>>> (scalar_to_vector (loadi32 addr:$src))))], >>>>>>>>>>>>>>> IIC_MOV_MEM>, EVEX; >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> def VMOV_256B_MR : I<0x7F, MRMDestMem, (outs), (ins >>>>>>>>>>>>>>> i32mem:$dst, VR2048:$src), >>>>>>>>>>>>>>> "vmov_256B_mr\t{$src, $dst|$dst, $src}", >>>>>>>>>>>>>>> [(store (i32 (bitconvert VR2048:$src)), >>>>>>>>>>>>>>> addr:$dst)], IIC_MOV_MEM>, EVEX; >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> in x86instrinfo.td; >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> when i build i got these instructions in X86GenInstrInfo. >>>>>>>>>>>>>>> but still my instruction is not selected when i run input >>>>>>>>>>>>>>> file in debug mode; getting following errors; >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> ===== Instruction selection begins: BB#1 'vector.body' >>>>>>>>>>>>>>> Selecting: t9: ch = store<ST256[bitcast ([65 x i32]* @c to >>>>>>>>>>>>>>> <64 x i32>*)](align=16)(tbaa=<0x3817578>)> t8, t7, t11, >>>>>>>>>>>>>>> undef:i64 >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> ISEL: Starting pattern match on root node: t9: ch >>>>>>>>>>>>>>> store<ST256[bitcast ([65 x i32]* @c to <64 x i32>*)](align=16)(tbaa=<0x3817578>)> >>>>>>>>>>>>>>> t8, t7, t11, undef:i64 >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index 14, >>>>>>>>>>>>>>> continuing at 81 >>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index 82, >>>>>>>>>>>>>>> continuing at 149 >>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index 150, >>>>>>>>>>>>>>> continuing at 217 >>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index 218, >>>>>>>>>>>>>>> continuing at 267 
>>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index 268, continuing at 317
>>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index 318, continuing at 367
>>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index 368, continuing at 394
>>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index 395, continuing at 421
>>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index 422, continuing at 471
>>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index 472, continuing at 521
>>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index 522, continuing at 571
>>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index 572, continuing at 639
>>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index 640, continuing at 707
>>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index 708, continuing at 775
>>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index 776, continuing at 804
>>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index 805, continuing at 833
>>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index 834, continuing at 862
>>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index 863, continuing at 891
>>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index 892, continuing at 920
>>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index 921, continuing at 949
>>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index 950, continuing at 987
>>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index 988, continuing at 1025
>>>>>>>>>>>>>>> Match failed at index 12
>>>>>>>>>>>>>>> Continuing at 1026
>>>>>>>>>>>>>>> OpcodeSwitch from 1029 to 5725
>>>>>>>>>>>>>>> Match failed at index 5743
>>>>>>>>>>>>>>> Continuing at 5772
>>>>>>>>>>>>>>> Match failed at index 5776
>>>>>>>>>>>>>>> Continuing at 5805
>>>>>>>>>>>>>>> Match failed at index 5809
>>>>>>>>>>>>>>> Continuing at 5838
>>>>>>>>>>>>>>> Match failed at index 5842
>>>>>>>>>>>>>>> Continuing at 5911
>>>>>>>>>>>>>>> Match failed at index 5915
>>>>>>>>>>>>>>> Continuing at 5953
>>>>>>>>>>>>>>> Match failed at index 5957
>>>>>>>>>>>>>>> Continuing at 5995
>>>>>>>>>>>>>>> Match failed at index 5999
>>>>>>>>>>>>>>> Continuing at 6037
>>>>>>>>>>>>>>> Match failed at index 6041
>>>>>>>>>>>>>>> Continuing at 6084
>>>>>>>>>>>>>>> Match failed at index 6088
>>>>>>>>>>>>>>> Continuing at 6131
>>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index 6138, continuing at 6181
>>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index 6182, continuing at 6228
>>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index 6235, continuing at 6384
>>>>>>>>>>>>>>> Match failed at index 6388
>>>>>>>>>>>>>>> Continuing at 6419
>>>>>>>>>>>>>>> Match failed at index 6423
>>>>>>>>>>>>>>> Continuing at 6454
>>>>>>>>>>>>>>> Match failed at index 6458
>>>>>>>>>>>>>>> Continuing at 6489
>>>>>>>>>>>>>>> Continuing at 6490
>>>>>>>>>>>>>>> Continuing at 6491
>>>>>>>>>>>>>>> Continuing at 6492
>>>>>>>>>>>>>>> Match failed at index 6514
>>>>>>>>>>>>>>> Continuing at 6545
>>>>>>>>>>>>>>> Match failed at index 6562
>>>>>>>>>>>>>>> Continuing at 6593
>>>>>>>>>>>>>>> Match failed at index 6610
>>>>>>>>>>>>>>> Continuing at 6641
>>>>>>>>>>>>>>> Continuing at 6642
>>>>>>>>>>>>>>> Match failed at index 6658
>>>>>>>>>>>>>>> Continuing at 6772
>>>>>>>>>>>>>>> Match failed at index 6788
>>>>>>>>>>>>>>> Continuing at 6902
>>>>>>>>>>>>>>> Continuing at 13636
>>>>>>>>>>>>>>> Match failed at index 13640
>>>>>>>>>>>>>>> Continuing at 14940
>>>>>>>>>>>>>>> Match failed at index 14943
>>>>>>>>>>>>>>> Continuing at 15415
>>>>>>>>>>>>>>> Match failed at index 15417
>>>>>>>>>>>>>>> Continuing at 15570
>>>>>>>>>>>>>>> Match failed at index 15571
>>>>>>>>>>>>>>> Continuing at 15598
>>>>>>>>>>>>>>> Match failed at index 15599
>>>>>>>>>>>>>>> Continuing at 15716
>>>>>>>>>>>>>>> Match failed at index 15719
>>>>>>>>>>>>>>> Continuing at 15837
>>>>>>>>>>>>>>> Match failed at index 15840
>>>>>>>>>>>>>>> Continuing at 16198
>>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index 16203, continuing at 16285
>>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index 16286, continuing at 16394
>>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index 16395, continuing at 16464
>>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index 16465, continuing at 16487
>>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index 16488, continuing at 16510
>>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index 16511, continuing at 16533
>>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index 16534, continuing at 16556
>>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index 16557, continuing at 16680
>>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index 16681, continuing at 16804
>>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index 16805, continuing at 16890
>>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index 16891, continuing at 16976
>>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index 16978, continuing at 17169
>>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index 17171, continuing at 17342
>>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index 17344, continuing at 17497
>>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index 17499, continuing at 17632
>>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index 17634, continuing at 17801
>>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index 17803, continuing at 17944
>>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index 17946, continuing at 18074
>>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index 18075, continuing at 18178
>>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index 18179, continuing at 18253
>>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index 18254, continuing at 18278
>>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index 18279, continuing at 18303
>>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index 18304, continuing at 18328
>>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index 18329, continuing at 18376
>>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index 18377, continuing at 18424
>>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index 18425, continuing at 18520
>>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index 18521, continuing at 18636
>>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index 18637, continuing at 18661
>>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index 18662, continuing at 18711
>>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index 18712, continuing at 18736
>>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index 18737, continuing at 18770
>>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index 18771, continuing at 18856
>>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index 18857, continuing at 18942
>>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index 18943, continuing at 19028
>>>>>>>>>>>>>>> Match failed at index 16201
>>>>>>>>>>>>>>> Continuing at 19029
>>>>>>>>>>>>>>> LLVM ERROR: Cannot select: t9: ch = store<ST256[bitcast ([65 x i32]* @c to <64 x i32>*)](align=16)(tbaa=<0x3817578>)> t8, t7, t11, undef:i64
>>>>>>>>>>>>>>>   t7: v64i32 = add t6, t4
>>>>>>>>>>>>>>>     t6: v64i32,ch = load<LD256[bitcast ([65 x i32]* @c to <64 x i32>*)](align=16)(tbaa=<0x3817578>)(dereferenceable)> t0, t11, undef:i64
>>>>>>>>>>>>>>>       t11: i64 = X86ISD::Wrapper TargetGlobalAddress:i64<[65 x i32]* @c> 0
>>>>>>>>>>>>>>>         t10: i64 = TargetGlobalAddress<[65 x i32]* @c> 0
>>>>>>>>>>>>>>>       t3: i64 = undef
>>>>>>>>>>>>>>>     t4: v64i32,ch = load<LD256[bitcast ([65 x i32]* @b to <64 x i32>*)](align=16)(tbaa=<0x3817578>)(dereferenceable)> t0, t13, undef:i64
>>>>>>>>>>>>>>>       t13: i64 = X86ISD::Wrapper TargetGlobalAddress:i64<[65 x i32]* @b> 0
>>>>>>>>>>>>>>>         t12: i64 = TargetGlobalAddress<[65 x i32]* @b> 0
>>>>>>>>>>>>>>>       t3: i64 = undef
>>>>>>>>>>>>>>>   t11: i64 = X86ISD::Wrapper TargetGlobalAddress:i64<[65 x i32]* @c> 0
>>>>>>>>>>>>>>>     t10: i64 = TargetGlobalAddress<[65 x i32]* @c> 0
>>>>>>>>>>>>>>>   t3: i64 = undef
>>>>>>>>>>>>>>> In function: foo
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> What could be the reason of this?? Please correct me.
>>>>>>>>>>>>>>> I am stuck at this point....
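One possible reading of the log above: the node that fails to select stores the v64i32 result of the add (t7), while the store pattern in VMOV_256B_MR is written for an i32 value, so the types may never line up. The sketch below is a toy matcher in Python, not LLVM's SelectionDAG matcher. StoreNode, StorePattern and select are hypothetical names, and the real matcher checks far more than the stored value's type.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class StoreNode:
    """A store in the toy DAG; only the type of the stored value is modelled."""
    value_type: str


@dataclass(frozen=True)
class StorePattern:
    """A toy selection pattern: instruction name plus the value type it accepts."""
    instr: str
    value_type: str


def select(node: StoreNode, patterns: List[StorePattern]) -> Optional[str]:
    """Return the first instruction whose pattern type equals the node's type."""
    for pattern in patterns:
        if pattern.value_type == node.value_type:
            return pattern.instr
    return None  # plays the role of "Cannot select" in the log


node = StoreNode(value_type="v64i32")  # t9 stores t7, which is v64i32

as_written = [StorePattern("VMOV_256B_MR", "i32")]
retyped = [StorePattern("VMOV_256B_MR", "v64i32")]  # assumed change, for illustration

print(select(node, as_written))
print(select(node, retyped))
```

In this toy, the pattern typed i32 never fires for a v64i32 store, while the retyped one does. Whether that is the whole story for the real backend cannot be concluded from the sketch alone.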
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Fri, Jul 7, 2017 at 10:59 PM, Friedman, Eli <efriedma at codeaurora.org> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> The word "fold" is used all over LLVM. It generally refers to transformations which delete an instruction.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> If you're asking about http://llvm.org/docs/CodeGenerator.html#instruction-folding , it just means an instruction which was produced by the "instruction folding" transform; there isn't anything special about the instruction itself.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> -Eli
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On 7/6/2017 10:51 PM, hameeza ahmed wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> What is meant by folded instructions in LLVM?
>>>>>>>>>>>>>>>> How they work?
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Fri, Jul 7, 2017 at 10:19 AM, hameeza ahmed <hahmed2305 at gmail.com> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Thank You.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Fri, Jul 7, 2017 at 10:03 AM, Craig Topper <craig.topper at gmail.com> wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Yes, that error is from instruction selection. I think your legalization changes worked fine.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> ~Craig
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On Thu, Jul 6, 2017 at 8:21 PM, hameeza ahmed via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> also i further run the following command;
>>>>>>>>>>>>>>>>>>> llc -debug filer-knl_o3.ll
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> and its output is attached here. by looking at the output can we say that legalization runs fine and the error is due to instruction selection/ pattern matching which is not yet implemented?
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> so do i need to worry and try to correct it at this stage or should i move forward to implement instruction selection/ pattern matching?
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Please guide me.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Thank You
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> On Fri, Jul 7, 2017 at 8:00 AM, hameeza ahmed <hahmed2305 at gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Thank You. well i have seen these links. but they dont cover the problem that i have mentioned. actually i am doing all the things step by step.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> so i havent yet worked with instruction selection phase/ files. rather before that i am trying to do legalization by allowing vector elements>16 i.e 64xi32. here i have mainly worked with 2 files uptil now, i.e registerinfo.td to define register class to be called in legalization. and most importantly i am dealing with file X86ISelLowering.cpp.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Now is there any relation in this and instruction selection. since instruction selection comes after combine and legalize so i havent yet worked on it.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Please correct me, I am stuck here.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Thank You again
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> On Fri, Jul 7, 2017 at 7:11 AM, Friedman, Eli <efriedma at codeaurora.org> wrote:
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Have you read http://llvm.org/docs/WritingAnLLVMBackend.html and http://llvm.org/docs/CodeGenerator.html ? http://llvm.org/docs/WritingAnLLVMBackend.html#instruction-selector describes how to define a store instruction.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> -Eli
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> On 7/6/2017 6:51 PM, hameeza ahmed via llvm-dev wrote:
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Please correct me i m stuck at this point.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> On Jul 6, 2017 5:18 PM, "hameeza ahmed" <hahmed2305 at gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Hello,
>>>>>>>>>>>>>>>>>>>>> i am experimenting with the increase in register/ vector width to 64 elements of 32 bits instead of 16 in x86 backend.
>>>>>>>>>>>>>>>>>>>>> for eg.
>>>>>>>>>>>>>>>>>>>>> i have a loop with 65 iterations;
>>>>>>>>>>>>>>>>>>>>> if my IR generates v64i32 and 1 scalar, still the backend breaks the v64i32 into 4 v16i32. i want it to retain v64i32. like if there are 128 elements in loop then it should break it into 2 v64i32 instructions.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> in order to do this i have made necessary changes in X86ISelLowering.cpp. and rebuild llvm. then when i use the command -view-dag-combine2-dags i get the required output in graph but the following error on console:
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> LLVM ERROR: Cannot select: t10: ch = store<ST256[bitcast ([65 x i32]* @a to <64 x i32>*)](align=16)(tbaa=<0x30c5438>)> t9, t7, t12, undef:i64
>>>>>>>>>>>>>>>>>>>>>   t7: v64i32 = add t6, t4
>>>>>>>>>>>>>>>>>>>>>     t6: v64i32,ch = load<LD256[bitcast ([65 x i32]* @c to <64 x i32>*)](align=16)(tbaa=<0x30c5438>)(dereferenceable)> t0, t14, undef:i64
>>>>>>>>>>>>>>>>>>>>>       t14: i64 = X86ISD::Wrapper TargetGlobalAddress:i64<[65 x i32]* @c> 0
>>>>>>>>>>>>>>>>>>>>>         t13: i64 = TargetGlobalAddress<[65 x i32]* @c> 0
>>>>>>>>>>>>>>>>>>>>>       t3: i64 = undef
>>>>>>>>>>>>>>>>>>>>>     t4: v64i32,ch = load<LD256[bitcast ([65 x i32]* @b to <64 x i32>*)](align=16)(tbaa=<0x30c5438>)(dereferenceable)> t0, t16, undef:i64
>>>>>>>>>>>>>>>>>>>>>       t16: i64 = X86ISD::Wrapper TargetGlobalAddress:i64<[65 x i32]* @b> 0
>>>>>>>>>>>>>>>>>>>>>         t15: i64 = TargetGlobalAddress<[65 x i32]* @b> 0
>>>>>>>>>>>>>>>>>>>>>       t3: i64 = undef
>>>>>>>>>>>>>>>>>>>>>   t12: i64 = X86ISD::Wrapper TargetGlobalAddress:i64<[65 x i32]* @a> 0
>>>>>>>>>>>>>>>>>>>>>     t11: i64 = TargetGlobalAddress<[65 x i32]* @a> 0
>>>>>>>>>>>>>>>>>>>>>   t3: i64 = undef
>>>>>>>>>>>>>>>>>>>>> In function: foo
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> The dag after legalization is also attached here.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> the source is vector sum of 65 elements.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Kindly correct me.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> _______________________________________________
>>>>>>>>>>>>>>>>>>>>> LLVM Developers mailing list
>>>>>>>>>>>>>>>>>>>>> llvm-dev at lists.llvm.org
>>>>>>>>>>>>>>>>>>>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>>>>>>>> Employee of Qualcomm Innovation Center, Inc.
>>>>>>>>>>>>>>>>>>>>> Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project
>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>> ~Craig
Craig Topper via llvm-dev
2017-Jul-08 06:28 UTC
[llvm-dev] Error in v64i32 type in x86 backend
The opcode field is 8-bits, but there are multiple opcode maps specified by things like the "TB" on the end of your current instruction. There are many others like TA, VEX, XOP, etc. that you can find on other instructions.

What exactly is your end goal with making up these fake instructions?

~Craig
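To make the point about opcode maps concrete, here is a small toy model in Python. It is not LLVM's disassembler table generator: it only keys each encoding by opcode map, opcode byte and ModRM byte, and reports a collision in the style of the "Primary decode conflict" message quoted earlier in the thread. The class DecodeTable and the map labels ONE_BYTE and TWO_BYTE are assumptions made for illustration, and the table contains only INC8r, so it says nothing about which slots are actually free in the real x86 backend.

```python
from typing import Dict, List, Tuple

# Key: (opcode map label, opcode byte, ModRM byte). Map labels are illustrative.
Slot = Tuple[str, int, int]


class DecodeTable:
    """Toy decode table: each slot may hold at most one instruction name."""

    def __init__(self) -> None:
        self.slots: Dict[Slot, str] = {}

    def add(self, name: str, opmap: str, opcode: int, modrms: range) -> List[str]:
        """Register `name` for every ModRM byte in `modrms`; return conflict messages."""
        conflicts = []
        for modrm in modrms:
            key = (opmap, opcode, modrm)
            owner = self.slots.get(key)
            if owner is not None and owner != name:
                conflicts.append(
                    f"{name} would overwrite {owner} (ModRM {modrm}, Opcode {opcode})"
                )
            else:
                self.slots[key] = name
        return conflicts


table = DecodeTable()
# INC r8 is encoded as FE /0; its register forms use ModRM 0xC0..0xC7.
table.add("INC8r", "ONE_BYTE", 0xFE, range(0xC0, 0xC8))

# A register-register instruction claims every ModRM byte with mod == 0b11.
reg_reg = range(0xC0, 0x100)
same_map = table.add("VADD_256B", "ONE_BYTE", 0xFE, reg_reg)
other_map = table.add("VADD_256B", "TWO_BYTE", 0xFE, reg_reg)

print(same_map[0])     # first collision in the one-byte map
print(len(same_map))   # number of colliding ModRM bytes
print(len(other_map))  # the toy's second map is empty, so nothing collides
```

The same opcode byte is a different slot once the map differs, which is why the 8-bit opcode alone does not exhaust the encoding space. In the real backend the other maps are heavily populated too, so any map and opcode choice still has to be checked against the existing definitions.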
>>>>>> >>>>>> On Sat, Jul 8, 2017 at 10:13 AM, Craig Topper <craig.topper at gmail.com >>>>>> > wrote: >>>>>> >>>>>>> I think so. >>>>>>> >>>>>>> ~Craig >>>>>>> >>>>>>> On Fri, Jul 7, 2017 at 10:10 PM, hameeza ahmed <hahmed2305 at gmail.com >>>>>>> > wrote: >>>>>>> >>>>>>>> sorry to disturb again,, >>>>>>>> >>>>>>>> def VADD_256B : RI<0xFE, MRMDestReg, (outs VR2048:$dst), (ins VR >>>>>>>> 2048:$src1, VR2048:$src2), >>>>>>>> "VADD_256B\t{$dst, $src1, $src2 }", [(set VR2048:$dst, >>>>>>>> (add VR2048:$src1, VR2048:$src2))], IIC_XADD_REG>, TB; >>>>>>>> >>>>>>>> >>>>>>>> Is it fine now?? >>>>>>>> >>>>>>>> >>>>>>>> On Sat, Jul 8, 2017 at 10:00 AM, Craig Topper < >>>>>>>> craig.topper at gmail.com> wrote: >>>>>>>> >>>>>>>>> Oops that should have said "REX prefix" in the first sentence. >>>>>>>>> >>>>>>>>> ~Craig >>>>>>>>> >>>>>>>>> On Fri, Jul 7, 2017 at 9:59 PM, Craig Topper < >>>>>>>>> craig.topper at gmail.com> wrote: >>>>>>>>> >>>>>>>>>> You don't want RI. That's used for instructions that need a reg >>>>>>>>>> prefix. You need to use $src1 and $src2 in the assembly string too. It also >>>>>>>>>> looks like you have two closing ] brackets. >>>>>>>>>> >>>>>>>>>> ~Craig >>>>>>>>>> >>>>>>>>>> On Fri, Jul 7, 2017 at 9:55 PM, hameeza ahmed < >>>>>>>>>> hahmed2305 at gmail.com> wrote: >>>>>>>>>> >>>>>>>>>>> Thank you; >>>>>>>>>>> i have changed as follows.is it fine now? >>>>>>>>>>> >>>>>>>>>>> def VADD_256B : I<0xFE, MRMDestReg, (outs VR2048:$dst), (ins >>>>>>>>>>> VR2048:$src1, VR2048:$src2), >>>>>>>>>>> "VADD_256B\t{$src, $dst|$dst, $src}", [(set VR >>>>>>>>>>> 2048:$dst, (add VR2048:$src1, VR2048:$src2))]]>; >>>>>>>>>>> >>>>>>>>>>> Also here i have changed class RI to I. Does it make any >>>>>>>>>>> difference? 
>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> On Sat, Jul 8, 2017 at 9:38 AM, Craig Topper < >>>>>>>>>>> craig.topper at gmail.com> wrote: >>>>>>>>>>> >>>>>>>>>>>> IIC_XADD_REG is used to associate latency and other information >>>>>>>>>>>> for use by the instruction scheduling pass. >>>>>>>>>>>> >>>>>>>>>>>> You're missing a pattern in the square bracket to match an add >>>>>>>>>>>> node. You also need two VR2048 registers in the 'ins' >>>>>>>>>>>> >>>>>>>>>>>> ~Craig >>>>>>>>>>>> >>>>>>>>>>>> On Fri, Jul 7, 2017 at 9:29 PM, hameeza ahmed < >>>>>>>>>>>> hahmed2305 at gmail.com> wrote: >>>>>>>>>>>> >>>>>>>>>>>>> Can you please tell whether following add is correct to add 2 >>>>>>>>>>>>> 64xi32 numbers. >>>>>>>>>>>>> >>>>>>>>>>>>> def VADD_256B : RI<0xFE, MRMDestReg, (outs VR2048:$dst), >>>>>>>>>>>>> (ins VR2048:$src), >>>>>>>>>>>>> "VADD_256B\t{$src, $dst|$dst, $src}", [], >>>>>>>>>>>>> IIC_XADD_REG>, TB; >>>>>>>>>>>>> >>>>>>>>>>>>> what is llc_xadd_reg here? >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> On Sat, Jul 8, 2017 at 8:48 AM, Craig Topper < >>>>>>>>>>>>> craig.topper at gmail.com> wrote: >>>>>>>>>>>>> >>>>>>>>>>>>>> Change the i32 in the store pattern to v64i32. >>>>>>>>>>>>>> >>>>>>>>>>>>>> On Fri, Jul 7, 2017 at 8:41 PM hameeza ahmed < >>>>>>>>>>>>>> hahmed2305 at gmail.com> wrote: >>>>>>>>>>>>>> >>>>>>>>>>>>>>> Thank you. i understood how avx512 vector instructions are >>>>>>>>>>>>>>> written in x86instravx512. 
i need to define my vector instructions so i >>>>>>>>>>>>>>> wrote; >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> def VMOV_256B_RM : I<0x6F, MRMSrcMem, (outs VR2048:$dst), >>>>>>>>>>>>>>> (ins i32mem:$src), >>>>>>>>>>>>>>> "vmov_256B_rm\t{$src, $dst|$dst, $src}", >>>>>>>>>>>>>>> [(set VR2048:$dst, (v64i32 >>>>>>>>>>>>>>> (scalar_to_vector (loadi32 addr:$src))))], >>>>>>>>>>>>>>> IIC_MOV_MEM>, EVEX; >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> def VMOV_256B_MR : I<0x7F, MRMDestMem, (outs), (ins >>>>>>>>>>>>>>> i32mem:$dst, VR2048:$src), >>>>>>>>>>>>>>> "vmov_256B_mr\t{$src, $dst|$dst, $src}", >>>>>>>>>>>>>>> [(store (i32 (bitconvert VR2048:$src)), >>>>>>>>>>>>>>> addr:$dst)], IIC_MOV_MEM>, EVEX; >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> in x86instrinfo.td; >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> when i build i got these instructions in X86GenInstrInfo. >>>>>>>>>>>>>>> but still my instruction is not selected when i run input >>>>>>>>>>>>>>> file in debug mode; getting following errors; >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> ===== Instruction selection begins: BB#1 'vector.body' >>>>>>>>>>>>>>> Selecting: t9: ch = store<ST256[bitcast ([65 x i32]* @c to >>>>>>>>>>>>>>> <64 x i32>*)](align=16)(tbaa=<0x3817578>)> t8, t7, t11, >>>>>>>>>>>>>>> undef:i64 >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> ISEL: Starting pattern match on root node: t9: ch >>>>>>>>>>>>>>> store<ST256[bitcast ([65 x i32]* @c to <64 x i32>*)](align=16)(tbaa=<0x3817578>)> >>>>>>>>>>>>>>> t8, t7, t11, undef:i64 >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index 14, >>>>>>>>>>>>>>> continuing at 81 >>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index 82, >>>>>>>>>>>>>>> continuing at 149 >>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index 150, >>>>>>>>>>>>>>> continuing at 217 >>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index 218, >>>>>>>>>>>>>>> continuing at 267 >>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index 268, >>>>>>>>>>>>>>> 
continuing at 317 >>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index 318, >>>>>>>>>>>>>>> continuing at 367 >>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index 368, >>>>>>>>>>>>>>> continuing at 394 >>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index 395, >>>>>>>>>>>>>>> continuing at 421 >>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index 422, >>>>>>>>>>>>>>> continuing at 471 >>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index 472, >>>>>>>>>>>>>>> continuing at 521 >>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index 522, >>>>>>>>>>>>>>> continuing at 571 >>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index 572, >>>>>>>>>>>>>>> continuing at 639 >>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index 640, >>>>>>>>>>>>>>> continuing at 707 >>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index 708, >>>>>>>>>>>>>>> continuing at 775 >>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index 776, >>>>>>>>>>>>>>> continuing at 804 >>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index 805, >>>>>>>>>>>>>>> continuing at 833 >>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index 834, >>>>>>>>>>>>>>> continuing at 862 >>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index 863, >>>>>>>>>>>>>>> continuing at 891 >>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index 892, >>>>>>>>>>>>>>> continuing at 920 >>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index 921, >>>>>>>>>>>>>>> continuing at 949 >>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index 950, >>>>>>>>>>>>>>> continuing at 987 >>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index 988, >>>>>>>>>>>>>>> continuing at 1025 >>>>>>>>>>>>>>> Match failed at index 12 >>>>>>>>>>>>>>> Continuing at 1026 >>>>>>>>>>>>>>> OpcodeSwitch from 1029 to 5725 
>>>>>>>>>>>>>>> Match failed at index 5743 >>>>>>>>>>>>>>> Continuing at 5772 >>>>>>>>>>>>>>> Match failed at index 5776 >>>>>>>>>>>>>>> Continuing at 5805 >>>>>>>>>>>>>>> Match failed at index 5809 >>>>>>>>>>>>>>> Continuing at 5838 >>>>>>>>>>>>>>> Match failed at index 5842 >>>>>>>>>>>>>>> Continuing at 5911 >>>>>>>>>>>>>>> Match failed at index 5915 >>>>>>>>>>>>>>> Continuing at 5953 >>>>>>>>>>>>>>> Match failed at index 5957 >>>>>>>>>>>>>>> Continuing at 5995 >>>>>>>>>>>>>>> Match failed at index 5999 >>>>>>>>>>>>>>> Continuing at 6037 >>>>>>>>>>>>>>> Match failed at index 6041 >>>>>>>>>>>>>>> Continuing at 6084 >>>>>>>>>>>>>>> Match failed at index 6088 >>>>>>>>>>>>>>> Continuing at 6131 >>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index >>>>>>>>>>>>>>> 6138, continuing at 6181 >>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index >>>>>>>>>>>>>>> 6182, continuing at 6228 >>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index >>>>>>>>>>>>>>> 6235, continuing at 6384 >>>>>>>>>>>>>>> Match failed at index 6388 >>>>>>>>>>>>>>> Continuing at 6419 >>>>>>>>>>>>>>> Match failed at index 6423 >>>>>>>>>>>>>>> Continuing at 6454 >>>>>>>>>>>>>>> Match failed at index 6458 >>>>>>>>>>>>>>> Continuing at 6489 >>>>>>>>>>>>>>> Continuing at 6490 >>>>>>>>>>>>>>> Continuing at 6491 >>>>>>>>>>>>>>> Continuing at 6492 >>>>>>>>>>>>>>> Match failed at index 6514 >>>>>>>>>>>>>>> Continuing at 6545 >>>>>>>>>>>>>>> Match failed at index 6562 >>>>>>>>>>>>>>> Continuing at 6593 >>>>>>>>>>>>>>> Match failed at index 6610 >>>>>>>>>>>>>>> Continuing at 6641 >>>>>>>>>>>>>>> Continuing at 6642 >>>>>>>>>>>>>>> Match failed at index 6658 >>>>>>>>>>>>>>> Continuing at 6772 >>>>>>>>>>>>>>> Match failed at index 6788 >>>>>>>>>>>>>>> Continuing at 6902 >>>>>>>>>>>>>>> Continuing at 13636 >>>>>>>>>>>>>>> Match failed at index 13640 >>>>>>>>>>>>>>> Continuing at 14940 >>>>>>>>>>>>>>> Match failed at index 14943 >>>>>>>>>>>>>>> Continuing at 15415 
>>>>>>>>>>>>>>> Match failed at index 15417 >>>>>>>>>>>>>>> Continuing at 15570 >>>>>>>>>>>>>>> Match failed at index 15571 >>>>>>>>>>>>>>> Continuing at 15598 >>>>>>>>>>>>>>> Match failed at index 15599 >>>>>>>>>>>>>>> Continuing at 15716 >>>>>>>>>>>>>>> Match failed at index 15719 >>>>>>>>>>>>>>> Continuing at 15837 >>>>>>>>>>>>>>> Match failed at index 15840 >>>>>>>>>>>>>>> Continuing at 16198 >>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index >>>>>>>>>>>>>>> 16203, continuing at 16285 >>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index >>>>>>>>>>>>>>> 16286, continuing at 16394 >>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index >>>>>>>>>>>>>>> 16395, continuing at 16464 >>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index >>>>>>>>>>>>>>> 16465, continuing at 16487 >>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index >>>>>>>>>>>>>>> 16488, continuing at 16510 >>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index >>>>>>>>>>>>>>> 16511, continuing at 16533 >>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index >>>>>>>>>>>>>>> 16534, continuing at 16556 >>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index >>>>>>>>>>>>>>> 16557, continuing at 16680 >>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index >>>>>>>>>>>>>>> 16681, continuing at 16804 >>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index >>>>>>>>>>>>>>> 16805, continuing at 16890 >>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index >>>>>>>>>>>>>>> 16891, continuing at 16976 >>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index >>>>>>>>>>>>>>> 16978, continuing at 17169 >>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index >>>>>>>>>>>>>>> 17171, continuing at 17342 >>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index >>>>>>>>>>>>>>> 17344, continuing at 17497 >>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index 17499, continuing at 17632
>>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index 17634, continuing at 17801
>>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index 17803, continuing at 17944
>>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index 17946, continuing at 18074
>>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index 18075, continuing at 18178
>>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index 18179, continuing at 18253
>>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index 18254, continuing at 18278
>>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index 18279, continuing at 18303
>>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index 18304, continuing at 18328
>>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index 18329, continuing at 18376
>>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index 18377, continuing at 18424
>>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index 18425, continuing at 18520
>>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index 18521, continuing at 18636
>>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index 18637, continuing at 18661
>>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index 18662, continuing at 18711
>>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index 18712, continuing at 18736
>>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index 18737, continuing at 18770
>>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index 18771, continuing at 18856
>>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index 18857, continuing at 18942
>>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index 18943, continuing at 19028
>>>>>>>>>>>>>>> Match failed at index 16201
>>>>>>>>>>>>>>> Continuing at 19029
>>>>>>>>>>>>>>> LLVM ERROR: Cannot select: t9: ch = store<ST256[bitcast ([65 x i32]* @c to <64 x i32>*)](align=16)(tbaa=<0x3817578>)> t8, t7, t11, undef:i64
>>>>>>>>>>>>>>>   t7: v64i32 = add t6, t4
>>>>>>>>>>>>>>>     t6: v64i32,ch = load<LD256[bitcast ([65 x i32]* @c to <64 x i32>*)](align=16)(tbaa=<0x3817578>)(dereferenceable)> t0, t11, undef:i64
>>>>>>>>>>>>>>>       t11: i64 = X86ISD::Wrapper TargetGlobalAddress:i64<[65 x i32]* @c> 0
>>>>>>>>>>>>>>>         t10: i64 = TargetGlobalAddress<[65 x i32]* @c> 0
>>>>>>>>>>>>>>>       t3: i64 = undef
>>>>>>>>>>>>>>>     t4: v64i32,ch = load<LD256[bitcast ([65 x i32]* @b to <64 x i32>*)](align=16)(tbaa=<0x3817578>)(dereferenceable)> t0, t13, undef:i64
>>>>>>>>>>>>>>>       t13: i64 = X86ISD::Wrapper TargetGlobalAddress:i64<[65 x i32]* @b> 0
>>>>>>>>>>>>>>>         t12: i64 = TargetGlobalAddress<[65 x i32]* @b> 0
>>>>>>>>>>>>>>>       t3: i64 = undef
>>>>>>>>>>>>>>>   t11: i64 = X86ISD::Wrapper TargetGlobalAddress:i64<[65 x i32]* @c> 0
>>>>>>>>>>>>>>>     t10: i64 = TargetGlobalAddress<[65 x i32]* @c> 0
>>>>>>>>>>>>>>>   t3: i64 = undef
>>>>>>>>>>>>>>> In function: foo
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> What could be the reason for this? Please correct me.
>>>>>>>>>>>>>>> I am stuck at this point.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Fri, Jul 7, 2017 at 10:59 PM, Friedman, Eli <efriedma at codeaurora.org> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> The word "fold" is used all over LLVM. It generally refers to transformations which delete an instruction.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> If you're asking about http://llvm.org/docs/CodeGenerator.html#instruction-folding , it just means an instruction which was produced by the "instruction folding" transform; there isn't anything special about the instruction itself.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> -Eli
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On 7/6/2017 10:51 PM, hameeza ahmed wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> What is meant by folded instructions in LLVM?
>>>>>>>>>>>>>>>> How do they work?
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Fri, Jul 7, 2017 at 10:19 AM, hameeza ahmed <hahmed2305 at gmail.com> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Thank You.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Fri, Jul 7, 2017 at 10:03 AM, Craig Topper <craig.topper at gmail.com> wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Yes, that error is from instruction selection. I think your legalization changes worked fine.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> ~Craig
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On Thu, Jul 6, 2017 at 8:21 PM, hameeza ahmed via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> I also ran the following command:
>>>>>>>>>>>>>>>>>>> llc -debug filer-knl_o3.ll
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Its output is attached here. Looking at the output, can we say that legalization runs fine and the error is due to instruction selection/pattern matching, which is not yet implemented?
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> So do I need to worry and try to correct it at this stage, or should I move forward and implement instruction selection/pattern matching?
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Please guide me.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Thank You
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> On Fri, Jul 7, 2017 at 8:00 AM, hameeza ahmed <hahmed2305 at gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Thank you. I have seen these links, but they don't cover the problem I mentioned. I am doing everything step by step.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> So I haven't yet worked with the instruction selection phase/files. Before that, I am trying to do legalization by allowing vectors of more than 16 elements, i.e. 64xi32. So far I have mainly worked with two files: registerinfo.td, to define the register class used in legalization, and most importantly X86ISelLowering.cpp.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Is there any relation between this and instruction selection? Since instruction selection comes after combine and legalize, I haven't worked on it yet.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Please correct me, I am stuck here.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Thank You again
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> On Fri, Jul 7, 2017 at 7:11 AM, Friedman, Eli <efriedma at codeaurora.org> wrote:
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Have you read http://llvm.org/docs/WritingAnLLVMBackend.html and http://llvm.org/docs/CodeGenerator.html ? http://llvm.org/docs/WritingAnLLVMBackend.html#instruction-selector describes how to define a store instruction.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> -Eli
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> On 7/6/2017 6:51 PM, hameeza ahmed via llvm-dev wrote:
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Please correct me, I am stuck at this point.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> On Jul 6, 2017 5:18 PM, "hameeza ahmed" <hahmed2305 at gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Hello,
>>>>>>>>>>>>>>>>>>>>> I am experimenting with increasing the register/vector width to 64 elements of 32 bits instead of 16 in the x86 backend.
>>>>>>>>>>>>>>>>>>>>> For example, I have a loop with 65 iterations. If my IR generates v64i32 and 1 scalar, the backend still breaks the v64i32 into 4 v16i32. I want it to retain v64i32. Likewise, if there are 128 elements in the loop, it should break it into 2 v64i32 instructions.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> To do this I made the necessary changes in X86ISelLowering.cpp and rebuilt LLVM. When I then use the option -view-dag-combine2-dags, I get the required output in the graph but the following error on the console:
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> LLVM ERROR: Cannot select: t10: ch = store<ST256[bitcast ([65 x i32]* @a to <64 x i32>*)](align=16)(tbaa=<0x30c5438>)> t9, t7, t12, undef:i64
>>>>>>>>>>>>>>>>>>>>>   t7: v64i32 = add t6, t4
>>>>>>>>>>>>>>>>>>>>>     t6: v64i32,ch = load<LD256[bitcast ([65 x i32]* @c to <64 x i32>*)](align=16)(tbaa=<0x30c5438>)(dereferenceable)> t0, t14, undef:i64
>>>>>>>>>>>>>>>>>>>>>       t14: i64 = X86ISD::Wrapper TargetGlobalAddress:i64<[65 x i32]* @c> 0
>>>>>>>>>>>>>>>>>>>>>         t13: i64 = TargetGlobalAddress<[65 x i32]* @c> 0
>>>>>>>>>>>>>>>>>>>>>       t3: i64 = undef
>>>>>>>>>>>>>>>>>>>>>     t4: v64i32,ch = load<LD256[bitcast ([65 x i32]* @b to <64 x i32>*)](align=16)(tbaa=<0x30c5438>)(dereferenceable)> t0, t16, undef:i64
>>>>>>>>>>>>>>>>>>>>>       t16: i64 = X86ISD::Wrapper TargetGlobalAddress:i64<[65 x i32]* @b> 0
>>>>>>>>>>>>>>>>>>>>>         t15: i64 = TargetGlobalAddress<[65 x i32]* @b> 0
>>>>>>>>>>>>>>>>>>>>>       t3: i64 = undef
>>>>>>>>>>>>>>>>>>>>>   t12: i64 = X86ISD::Wrapper TargetGlobalAddress:i64<[65 x i32]* @a> 0
>>>>>>>>>>>>>>>>>>>>>     t11: i64 = TargetGlobalAddress<[65 x i32]* @a> 0
>>>>>>>>>>>>>>>>>>>>>   t3: i64 = undef
>>>>>>>>>>>>>>>>>>>>> In function: foo
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> The DAG after legalization is also attached here.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> The source is a vector sum of 65 elements.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Kindly correct me.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> _______________________________________________
>>>>>>>>>>>>>>>>>>>>> LLVM Developers mailing list
>>>>>>>>>>>>>>>>>>>>> llvm-dev at lists.llvm.org
>>>>>>>>>>>>>>>>>>>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>>>>>>>> Employee of Qualcomm Innovation Center, Inc.
>>>>>>>>>>>>>>>>>>>>> Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> ~Craig
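
For readers following the "Primary decode conflict" error quoted in this message, the numbers can be unpacked by hand: opcode 254 is 0xFE, and ModRM 192 and 193 are 0xC0 and 0xC1. The sketch below is a toy model only, not LLVM's disassembler table generator. The function names and the dictionary-based table are made up for illustration. It relies on one fact from the Intel manuals, that INC r/m8 is encoded as FE /0, and assumes that a register-register instruction on the same opcode would claim every ModRM byte with mod == 3.

```python
# Toy illustration only: this is not LLVM's disassembler-table code.
# It models a decode table keyed by (context, opcode, ModRM byte) and
# reports a conflict when two instructions claim the same slot.

def split_modrm(byte):
    """Split a ModRM byte into its (mod, reg, rm) bit fields."""
    return (byte >> 6) & 0b11, (byte >> 3) & 0b111, byte & 0b111


def claim(table, context, opcode, modrm, name):
    """Register `name` for one slot; return a conflict message or None."""
    key = (context, opcode, modrm)
    owner = table.get(key)
    if owner is not None and owner != name:
        return f"{name} would overwrite {owner} (opcode {opcode}, ModRM {modrm})"
    table[key] = name
    return None


table = {}

# INC r/m8 is encoded as FE /0: opcode 0xFE with ModRM.reg == 0.
# With mod == 3 (register operand) that covers ModRM bytes 0xC0..0xC7.
for rm in range(8):
    claim(table, "IC", 0xFE, 0xC0 | rm, "INC8r")

# Assumption: a register-register instruction placed on the same opcode
# claims every mod == 3 ModRM byte (0xC0..0xFF).
conflicts = []
for modrm in range(0xC0, 0x100):
    message = claim(table, "IC", 0xFE, modrm, "VADD_256B")
    if message:
        conflicts.append(message)

print(0xFE, split_modrm(192), split_modrm(193))
print(conflicts[0])
```

In this toy model the first reported conflict is at ModRM 192, the same slot named in the quoted error. An opcode byte that is free in one opcode map would avoid the clash, which is what Craig's suggestion to look for an unused opcode amounts to.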
Craig Topper via llvm-dev
2017-Jul-08 06:29 UTC
[llvm-dev] Error in v64i32 type in x86 backend
The correct pattern is:

  (set VR2048:$dst, (v64i32 (load addr:$src)))

~Craig

On Fri, Jul 7, 2017 at 11:23 PM, hameeza ahmed <hahmed2305 at gmail.com> wrote:

> Thank you. add is working fine; I kept opcode=0x0F, which is unused.
>
> Sorry to disturb, but the load is not matching the pattern.
> Is the following load correct?
>
> def VMOV_256B_RM : I<0x6F, MRMSrcMem, (outs VR2048:$dst), (ins i32mem:$src),
>                      "vmov_256B_rm\t{$src, $dst|$dst, $src}",
>                      [(set VR2048:$dst, (v64i32 (scalar_to_vector (loadi32 addr:$src))))],
>                      IIC_MOV_MEM>, EVEX;
>
> I am getting this error:
>
> LLVM ERROR: Cannot select: t4: v64i32,ch = load<LD256[bitcast ([65 x i32]* @b to <64 x i32>*)](align=16)(tbaa=<0x3fb8578>)(dereferenceable)> t0, t13, undef:i64
>   t13: i64 = X86ISD::Wrapper TargetGlobalAddress:i64<[65 x i32]* @b> 0
>     t12: i64 = TargetGlobalAddress<[65 x i32]* @b> 0
>   t3: i64 = undef
> In function: foo
hameeza ahmed via llvm-dev
2017-Jul-08 06:46 UTC
[llvm-dev] Error in v64i32 type in x86 backend
Thank you so much.

On Sat, Jul 8, 2017 at 11:30 AM, hameeza ahmed <hahmed2305 at gmail.com> wrote:

> I am targeting some new hardware which supports greater vector width. I am adding that in x86, not as a separate target.
>
> On Sat, Jul 8, 2017 at 11:28 AM, Craig Topper <craig.topper at gmail.com> wrote:
>
>> The opcode field is 8 bits, but there are multiple opcode maps specified by things like the "TB" on the end of your current instruction. There are many others like TA, VEX, XOP, etc. that you can find on other instructions.
>>
>> What exactly is your end goal with making up these fake instructions?
>>
>> ~Craig
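Craig's remark about opcode maps in the message above can be illustrated with a small model. The C++ sketch below is not LLVM code: `OpMap`, `DecodeTable`, and `add` are hypothetical names chosen for illustration, and the real TableGen disassembler check also looks at ModRM and context. It only assumes that a decode slot is identified by the pair (opcode map, opcode byte), which is why two instructions with the same 8-bit opcode collide only when they also share a map.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <optional>
#include <string>
#include <utility>

// Hypothetical model, not LLVM code: the enumerators only mirror the
// opcode-map suffixes mentioned in the thread (no suffix, TB, T8, TA).
enum class OpMap { OneByte, TB, T8, TA };

class DecodeTable {
public:
  // Registers an instruction under (map, opcode). Returns the name of the
  // instruction already occupying that slot, or std::nullopt if it was free.
  std::optional<std::string> add(const std::string &name, OpMap map,
                                 std::uint8_t opcode) {
    const auto key = std::make_pair(map, opcode);
    const auto it = slots_.find(key);
    if (it != slots_.end())
      return it->second;      // conflict: slot already taken
    slots_.emplace(key, name);
    return std::nullopt;      // no conflict: slot claimed
  }

private:
  std::map<std::pair<OpMap, std::uint8_t>, std::string> slots_;
};
```

In this model, adding INC8r and then VADD_256B at 0xFE in `OpMap::OneByte` reports INC8r as the occupant, which mirrors the "would overwrite INC8r" message, while adding VADD_256B at 0xFE in `OpMap::TB` claims a different slot. The model knows nothing about which slots are really occupied on x86; that still has to be checked against the Intel opcode tables, as Craig suggests.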
hameeza ahmed via llvm-dev
2017-Jul-08 07:27 UTC
[llvm-dev] Error in v64i32 type in x86 backend
As you pointed out, I used the same opcode 0xFE for the add, but I added EVEX at the end, so there is no conflict error now.

My instructions are now correctly selected, but I am getting the following errors:

Invalid prefix!
UNREACHABLE executed at /home/hameeza/Documents/PIM/lib/Target/X86/MCTargetDesc/X86MCCodeEmitter.cpp:647!
#0 0x00000000018e2c6c llvm::sys::PrintStackTrace(llvm::raw_ostream&) /home/hameeza/Documents/PIM/lib/Support/Unix/Signals.inc:402:11

What changes do I need to make in the X86MCCodeEmitter.cpp file?

On Sat, Jul 8, 2017 at 11:46 AM, hameeza ahmed <hahmed2305 at gmail.com> wrote:

> Thank you so much.
>>>>>>>>>>>>>>>>>> but still my instruction is not selected when i run input >>>>>>>>>>>>>>>>>> file in debug mode; getting following errors; >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> ===== Instruction selection begins: BB#1 'vector.body' >>>>>>>>>>>>>>>>>> Selecting: t9: ch = store<ST256[bitcast ([65 x i32]* @c >>>>>>>>>>>>>>>>>> to <64 x i32>*)](align=16)(tbaa=<0x3817578>)> t8, t7, >>>>>>>>>>>>>>>>>> t11, undef:i64 >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> ISEL: Starting pattern match on root node: t9: ch >>>>>>>>>>>>>>>>>> store<ST256[bitcast ([65 x i32]* @c to <64 x i32>*)](align=16)(tbaa=<0x3817578>)> >>>>>>>>>>>>>>>>>> t8, t7, t11, undef:i64 >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index >>>>>>>>>>>>>>>>>> 14, continuing at 81 >>>>>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index >>>>>>>>>>>>>>>>>> 82, continuing at 149 >>>>>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index >>>>>>>>>>>>>>>>>> 150, continuing at 217 >>>>>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index >>>>>>>>>>>>>>>>>> 218, continuing at 267 >>>>>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index >>>>>>>>>>>>>>>>>> 268, continuing at 317 >>>>>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index >>>>>>>>>>>>>>>>>> 318, continuing at 367 >>>>>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index >>>>>>>>>>>>>>>>>> 368, continuing at 394 >>>>>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index >>>>>>>>>>>>>>>>>> 395, continuing at 421 >>>>>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index >>>>>>>>>>>>>>>>>> 422, continuing at 471 >>>>>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index >>>>>>>>>>>>>>>>>> 472, continuing at 521 >>>>>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index >>>>>>>>>>>>>>>>>> 522, continuing at 571 >>>>>>>>>>>>>>>>>> Skipped scope entry (due to 
false predicate) at index >>>>>>>>>>>>>>>>>> 572, continuing at 639 >>>>>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index >>>>>>>>>>>>>>>>>> 640, continuing at 707 >>>>>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index >>>>>>>>>>>>>>>>>> 708, continuing at 775 >>>>>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index >>>>>>>>>>>>>>>>>> 776, continuing at 804 >>>>>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index >>>>>>>>>>>>>>>>>> 805, continuing at 833 >>>>>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index >>>>>>>>>>>>>>>>>> 834, continuing at 862 >>>>>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index >>>>>>>>>>>>>>>>>> 863, continuing at 891 >>>>>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index >>>>>>>>>>>>>>>>>> 892, continuing at 920 >>>>>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index >>>>>>>>>>>>>>>>>> 921, continuing at 949 >>>>>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index >>>>>>>>>>>>>>>>>> 950, continuing at 987 >>>>>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index >>>>>>>>>>>>>>>>>> 988, continuing at 1025 >>>>>>>>>>>>>>>>>> Match failed at index 12 >>>>>>>>>>>>>>>>>> Continuing at 1026 >>>>>>>>>>>>>>>>>> OpcodeSwitch from 1029 to 5725 >>>>>>>>>>>>>>>>>> Match failed at index 5743 >>>>>>>>>>>>>>>>>> Continuing at 5772 >>>>>>>>>>>>>>>>>> Match failed at index 5776 >>>>>>>>>>>>>>>>>> Continuing at 5805 >>>>>>>>>>>>>>>>>> Match failed at index 5809 >>>>>>>>>>>>>>>>>> Continuing at 5838 >>>>>>>>>>>>>>>>>> Match failed at index 5842 >>>>>>>>>>>>>>>>>> Continuing at 5911 >>>>>>>>>>>>>>>>>> Match failed at index 5915 >>>>>>>>>>>>>>>>>> Continuing at 5953 >>>>>>>>>>>>>>>>>> Match failed at index 5957 >>>>>>>>>>>>>>>>>> Continuing at 5995 >>>>>>>>>>>>>>>>>> Match failed at index 5999 >>>>>>>>>>>>>>>>>> Continuing at 6037 >>>>>>>>>>>>>>>>>> Match failed at index 6041 
>>>>>>>>>>>>>>>>>> Continuing at 6084 >>>>>>>>>>>>>>>>>> Match failed at index 6088 >>>>>>>>>>>>>>>>>> Continuing at 6131 >>>>>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index >>>>>>>>>>>>>>>>>> 6138, continuing at 6181 >>>>>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index >>>>>>>>>>>>>>>>>> 6182, continuing at 6228 >>>>>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index >>>>>>>>>>>>>>>>>> 6235, continuing at 6384 >>>>>>>>>>>>>>>>>> Match failed at index 6388 >>>>>>>>>>>>>>>>>> Continuing at 6419 >>>>>>>>>>>>>>>>>> Match failed at index 6423 >>>>>>>>>>>>>>>>>> Continuing at 6454 >>>>>>>>>>>>>>>>>> Match failed at index 6458 >>>>>>>>>>>>>>>>>> Continuing at 6489 >>>>>>>>>>>>>>>>>> Continuing at 6490 >>>>>>>>>>>>>>>>>> Continuing at 6491 >>>>>>>>>>>>>>>>>> Continuing at 6492 >>>>>>>>>>>>>>>>>> Match failed at index 6514 >>>>>>>>>>>>>>>>>> Continuing at 6545 >>>>>>>>>>>>>>>>>> Match failed at index 6562 >>>>>>>>>>>>>>>>>> Continuing at 6593 >>>>>>>>>>>>>>>>>> Match failed at index 6610 >>>>>>>>>>>>>>>>>> Continuing at 6641 >>>>>>>>>>>>>>>>>> Continuing at 6642 >>>>>>>>>>>>>>>>>> Match failed at index 6658 >>>>>>>>>>>>>>>>>> Continuing at 6772 >>>>>>>>>>>>>>>>>> Match failed at index 6788 >>>>>>>>>>>>>>>>>> Continuing at 6902 >>>>>>>>>>>>>>>>>> Continuing at 13636 >>>>>>>>>>>>>>>>>> Match failed at index 13640 >>>>>>>>>>>>>>>>>> Continuing at 14940 >>>>>>>>>>>>>>>>>> Match failed at index 14943 >>>>>>>>>>>>>>>>>> Continuing at 15415 >>>>>>>>>>>>>>>>>> Match failed at index 15417 >>>>>>>>>>>>>>>>>> Continuing at 15570 >>>>>>>>>>>>>>>>>> Match failed at index 15571 >>>>>>>>>>>>>>>>>> Continuing at 15598 >>>>>>>>>>>>>>>>>> Match failed at index 15599 >>>>>>>>>>>>>>>>>> Continuing at 15716 >>>>>>>>>>>>>>>>>> Match failed at index 15719 >>>>>>>>>>>>>>>>>> Continuing at 15837 >>>>>>>>>>>>>>>>>> Match failed at index 15840 >>>>>>>>>>>>>>>>>> Continuing at 16198 >>>>>>>>>>>>>>>>>> Skipped scope entry (due to false 
predicate) at index >>>>>>>>>>>>>>>>>> 16203, continuing at 16285 >>>>>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index >>>>>>>>>>>>>>>>>> 16286, continuing at 16394 >>>>>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index >>>>>>>>>>>>>>>>>> 16395, continuing at 16464 >>>>>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index >>>>>>>>>>>>>>>>>> 16465, continuing at 16487 >>>>>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index >>>>>>>>>>>>>>>>>> 16488, continuing at 16510 >>>>>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index >>>>>>>>>>>>>>>>>> 16511, continuing at 16533 >>>>>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index >>>>>>>>>>>>>>>>>> 16534, continuing at 16556 >>>>>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index >>>>>>>>>>>>>>>>>> 16557, continuing at 16680 >>>>>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index >>>>>>>>>>>>>>>>>> 16681, continuing at 16804 >>>>>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index >>>>>>>>>>>>>>>>>> 16805, continuing at 16890 >>>>>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index >>>>>>>>>>>>>>>>>> 16891, continuing at 16976 >>>>>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index >>>>>>>>>>>>>>>>>> 16978, continuing at 17169 >>>>>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index >>>>>>>>>>>>>>>>>> 17171, continuing at 17342 >>>>>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index >>>>>>>>>>>>>>>>>> 17344, continuing at 17497 >>>>>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index >>>>>>>>>>>>>>>>>> 17499, continuing at 17632 >>>>>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index >>>>>>>>>>>>>>>>>> 17634, continuing at 17801 >>>>>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index >>>>>>>>>>>>>>>>>> 17803, continuing at 17944 >>>>>>>>>>>>>>>>>> Skipped 
scope entry (due to false predicate) at index >>>>>>>>>>>>>>>>>> 17946, continuing at 18074 >>>>>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index >>>>>>>>>>>>>>>>>> 18075, continuing at 18178 >>>>>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index >>>>>>>>>>>>>>>>>> 18179, continuing at 18253 >>>>>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index >>>>>>>>>>>>>>>>>> 18254, continuing at 18278 >>>>>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index >>>>>>>>>>>>>>>>>> 18279, continuing at 18303 >>>>>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index >>>>>>>>>>>>>>>>>> 18304, continuing at 18328 >>>>>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index >>>>>>>>>>>>>>>>>> 18329, continuing at 18376 >>>>>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index >>>>>>>>>>>>>>>>>> 18377, continuing at 18424 >>>>>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index >>>>>>>>>>>>>>>>>> 18425, continuing at 18520 >>>>>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index >>>>>>>>>>>>>>>>>> 18521, continuing at 18636 >>>>>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index >>>>>>>>>>>>>>>>>> 18637, continuing at 18661 >>>>>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index >>>>>>>>>>>>>>>>>> 18662, continuing at 18711 >>>>>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index >>>>>>>>>>>>>>>>>> 18712, continuing at 18736 >>>>>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index >>>>>>>>>>>>>>>>>> 18737, continuing at 18770 >>>>>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index >>>>>>>>>>>>>>>>>> 18771, continuing at 18856 >>>>>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index >>>>>>>>>>>>>>>>>> 18857, continuing at 18942 >>>>>>>>>>>>>>>>>> Skipped scope entry (due to false predicate) at index >>>>>>>>>>>>>>>>>> 18943, continuing at 19028 
>>>>>>>>>>>>>>>>>> Match failed at index 16201 >>>>>>>>>>>>>>>>>> Continuing at 19029 >>>>>>>>>>>>>>>>>> LLVM ERROR: Cannot select: t9: ch = store<ST256[bitcast >>>>>>>>>>>>>>>>>> ([65 x i32]* @c to <64 x i32>*)](align=16)(tbaa=<0x3817578>)> >>>>>>>>>>>>>>>>>> t8, t7, t11, undef:i64 >>>>>>>>>>>>>>>>>> t7: v64i32 = add t6, t4 >>>>>>>>>>>>>>>>>> t6: v64i32,ch = load<LD256[bitcast ([65 x i32]* @c to >>>>>>>>>>>>>>>>>> <64 x i32>*)](align=16)(tbaa=<0x3817578>)(dereferenceable)> >>>>>>>>>>>>>>>>>> t0, t11, undef:i64 >>>>>>>>>>>>>>>>>> t11: i64 = X86ISD::Wrapper >>>>>>>>>>>>>>>>>> TargetGlobalAddress:i64<[65 x i32]* @c> 0 >>>>>>>>>>>>>>>>>> t10: i64 = TargetGlobalAddress<[65 x i32]* @c> 0 >>>>>>>>>>>>>>>>>> t3: i64 = undef >>>>>>>>>>>>>>>>>> t4: v64i32,ch = load<LD256[bitcast ([65 x i32]* @b to >>>>>>>>>>>>>>>>>> <64 x i32>*)](align=16)(tbaa=<0x3817578>)(dereferenceable)> >>>>>>>>>>>>>>>>>> t0, t13, undef:i64 >>>>>>>>>>>>>>>>>> t13: i64 = X86ISD::Wrapper >>>>>>>>>>>>>>>>>> TargetGlobalAddress:i64<[65 x i32]* @b> 0 >>>>>>>>>>>>>>>>>> t12: i64 = TargetGlobalAddress<[65 x i32]* @b> 0 >>>>>>>>>>>>>>>>>> t3: i64 = undef >>>>>>>>>>>>>>>>>> t11: i64 = X86ISD::Wrapper TargetGlobalAddress:i64<[65 >>>>>>>>>>>>>>>>>> x i32]* @c> 0 >>>>>>>>>>>>>>>>>> t10: i64 = TargetGlobalAddress<[65 x i32]* @c> 0 >>>>>>>>>>>>>>>>>> t3: i64 = undef >>>>>>>>>>>>>>>>>> In function: foo >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> What could be the reason of this?? Please correct me. >>>>>>>>>>>>>>>>>> I am stuck at this point.... >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> On Fri, Jul 7, 2017 at 10:59 PM, Friedman, Eli < >>>>>>>>>>>>>>>>>> efriedma at codeaurora.org> wrote: >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> The word "fold" is used all over LLVM. It generally >>>>>>>>>>>>>>>>>>> refers to transformations which delete an instruction. 
>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> If you're asking about http://llvm.org/docs/CodeGener >>>>>>>>>>>>>>>>>>> ator.html#instruction-folding , it just means an >>>>>>>>>>>>>>>>>>> instruction which was produced by the "instruction folding" transform; >>>>>>>>>>>>>>>>>>> there isn't anything special about the instruction itself. >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> -Eli >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> On 7/6/2017 10:51 PM, hameeza ahmed wrote: >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> What is meant by folded instructions in LLVM? >>>>>>>>>>>>>>>>>>> How they work? >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> On Fri, Jul 7, 2017 at 10:19 AM, hameeza ahmed < >>>>>>>>>>>>>>>>>>> hahmed2305 at gmail.com> wrote: >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> Thank You. >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> On Fri, Jul 7, 2017 at 10:03 AM, Craig Topper < >>>>>>>>>>>>>>>>>>>> craig.topper at gmail.com> wrote: >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> Yes, that error is from instruction selection. I think >>>>>>>>>>>>>>>>>>>>> your legalization changes worked fine. >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> ~Craig >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> On Thu, Jul 6, 2017 at 8:21 PM, hameeza ahmed via >>>>>>>>>>>>>>>>>>>>> llvm-dev <llvm-dev at lists.llvm.org> wrote: >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> also i further run the following command; >>>>>>>>>>>>>>>>>>>>>> llc -debug filer-knl_o3.ll >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> and its output is attached here. by looking at the >>>>>>>>>>>>>>>>>>>>>> output can we say that legalization runs fine and the error is due to >>>>>>>>>>>>>>>>>>>>>> instruction selection/ pattern matching which is not yet implemented? >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> so do i need to worry and try to correct it at this >>>>>>>>>>>>>>>>>>>>>> stage or should i move forward to implement instruction selection/ pattern >>>>>>>>>>>>>>>>>>>>>> matching? 
>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> Please guide me. >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> Thank You >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> On Fri, Jul 7, 2017 at 8:00 AM, hameeza ahmed < >>>>>>>>>>>>>>>>>>>>>> hahmed2305 at gmail.com> wrote: >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> Thank You. well i have seen these links. but they >>>>>>>>>>>>>>>>>>>>>>> dont cover the problem that i have mentioned. actually i am doing all the >>>>>>>>>>>>>>>>>>>>>>> things step by step. >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> so i havent yet worked with instruction selection >>>>>>>>>>>>>>>>>>>>>>> phase/ files. rather before that i am trying to do legalization by allowing >>>>>>>>>>>>>>>>>>>>>>> vector elements>16 i.e 64xi32. here i have mainly worked with 2 files uptil >>>>>>>>>>>>>>>>>>>>>>> now, i.e registerinfo.td to define register class >>>>>>>>>>>>>>>>>>>>>>> to be called in legalization. and most importantly i am dealing with file >>>>>>>>>>>>>>>>>>>>>>> X86ISelLowering.cpp. >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> Now is there any relation in this and instruction >>>>>>>>>>>>>>>>>>>>>>> selection. since instruction selection comes after combine and legalize so >>>>>>>>>>>>>>>>>>>>>>> i havent yet worked on it. >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> Please correct me, I am stuck here. >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> Thank You again >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> On Fri, Jul 7, 2017 at 7:11 AM, Friedman, Eli < >>>>>>>>>>>>>>>>>>>>>>> efriedma at codeaurora.org> wrote: >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> Have you read http://llvm.org/docs/WritingAn >>>>>>>>>>>>>>>>>>>>>>>> LLVMBackend.html and http://llvm.org/docs/CodeGener >>>>>>>>>>>>>>>>>>>>>>>> ator.html ? http://llvm.org/docs/WritingAn >>>>>>>>>>>>>>>>>>>>>>>> LLVMBackend.html#instruction-selector describes >>>>>>>>>>>>>>>>>>>>>>>> how to define a store instruction. 
>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> -Eli >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> On 7/6/2017 6:51 PM, hameeza ahmed via llvm-dev >>>>>>>>>>>>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> Please correct me i m stuck at this point. >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> On Jul 6, 2017 5:18 PM, "hameeza ahmed" < >>>>>>>>>>>>>>>>>>>>>>>> hahmed2305 at gmail.com> wrote: >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> Hello, >>>>>>>>>>>>>>>>>>>>>>>> i am experimenting with the increase in register/ >>>>>>>>>>>>>>>>>>>>>>>> vector width to 64 elements of 32 bits instead of 16 in x86 backend. >>>>>>>>>>>>>>>>>>>>>>>> for eg. >>>>>>>>>>>>>>>>>>>>>>>> i have a loop with 65 iterations; >>>>>>>>>>>>>>>>>>>>>>>> if my IR generates v64i32 and 1 scalar, still the >>>>>>>>>>>>>>>>>>>>>>>> backend breaks the v64i32 into 4 v16i32. i want it to retain v64i32. like >>>>>>>>>>>>>>>>>>>>>>>> if there are 128 elements in loop then it should break it into 2 v64i32 >>>>>>>>>>>>>>>>>>>>>>>> instructions. >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> in order to do this i have made necessary changes >>>>>>>>>>>>>>>>>>>>>>>> in X86ISelLowering.cpp. and rebuild llvm. 
then when i use the >>>>>>>>>>>>>>>>>>>>>>>> command -view-dag-combine2-dags i get the required >>>>>>>>>>>>>>>>>>>>>>>> output in graph but the following error on console: >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> LLVM ERROR: Cannot select: t10: ch >>>>>>>>>>>>>>>>>>>>>>>> store<ST256[bitcast ([65 x i32]* @a to <64 x i32>*)](align=16)(tbaa=<0x30c5438>)> >>>>>>>>>>>>>>>>>>>>>>>> t9, t7, t12, undef:i64 >>>>>>>>>>>>>>>>>>>>>>>> t7: v64i32 = add t6, t4 >>>>>>>>>>>>>>>>>>>>>>>> t6: v64i32,ch = load<LD256[bitcast ([65 x i32]* >>>>>>>>>>>>>>>>>>>>>>>> @c to <64 x i32>*)](align=16)(tbaa=<0x30c5438>)(dereferenceable)> >>>>>>>>>>>>>>>>>>>>>>>> t0, t14, undef:i64 >>>>>>>>>>>>>>>>>>>>>>>> t14: i64 = X86ISD::Wrapper >>>>>>>>>>>>>>>>>>>>>>>> TargetGlobalAddress:i64<[65 x i32]* @c> 0 >>>>>>>>>>>>>>>>>>>>>>>> t13: i64 = TargetGlobalAddress<[65 x i32]* >>>>>>>>>>>>>>>>>>>>>>>> @c> 0 >>>>>>>>>>>>>>>>>>>>>>>> t3: i64 = undef >>>>>>>>>>>>>>>>>>>>>>>> t4: v64i32,ch = load<LD256[bitcast ([65 x i32]* >>>>>>>>>>>>>>>>>>>>>>>> @b to <64 x i32>*)](align=16)(tbaa=<0x30c5438>)(dereferenceable)> >>>>>>>>>>>>>>>>>>>>>>>> t0, t16, undef:i64 >>>>>>>>>>>>>>>>>>>>>>>> t16: i64 = X86ISD::Wrapper >>>>>>>>>>>>>>>>>>>>>>>> TargetGlobalAddress:i64<[65 x i32]* @b> 0 >>>>>>>>>>>>>>>>>>>>>>>> t15: i64 = TargetGlobalAddress<[65 x i32]* >>>>>>>>>>>>>>>>>>>>>>>> @b> 0 >>>>>>>>>>>>>>>>>>>>>>>> t3: i64 = undef >>>>>>>>>>>>>>>>>>>>>>>> t12: i64 = X86ISD::Wrapper >>>>>>>>>>>>>>>>>>>>>>>> TargetGlobalAddress:i64<[65 x i32]* @a> 0 >>>>>>>>>>>>>>>>>>>>>>>> t11: i64 = TargetGlobalAddress<[65 x i32]* @a> 0 >>>>>>>>>>>>>>>>>>>>>>>> t3: i64 = undef >>>>>>>>>>>>>>>>>>>>>>>> In function: foo >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> The dag after legalization is also attached here. >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> the source is vector sum of 65 elements. >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> Kindly correct me. 
>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> _______________________________________________ >>>>>>>>>>>>>>>>>>>>>>>> LLVM Developers mailing listllvm-dev at lists.llvm.orghttp://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> -- >>>>>>>>>>>>>>>>>>>>>>>> Employee of Qualcomm Innovation Center, Inc. >>>>>>>>>>>>>>>>>>>>>>>> Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> _______________________________________________ >>>>>>>>>>>>>>>>>>>>>> LLVM Developers mailing list >>>>>>>>>>>>>>>>>>>>>> llvm-dev at lists.llvm.org >>>>>>>>>>>>>>>>>>>>>> http://lists.llvm.org/cgi-bin/ >>>>>>>>>>>>>>>>>>>>>> mailman/listinfo/llvm-dev >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> -- >>>>>>>>>>>>>>>>>>> Employee of Qualcomm Innovation Center, Inc. >>>>>>>>>>>>>>>>>>> Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> -- >>>>>>>>>>>>>>>>> ~Craig >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>> >>>>>>>>> >>>>>>>> >>>>>>> >>>>>> >>>>> >>>> >>> >> >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20170708/aa59fc29/attachment-0001.html>
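The opcode-map point quoted above can be illustrated with a small toy model. This is only a sketch, not LLVM code: the `DecodeTable` class, the map names `one_byte` and `other_map`, and the choice of ModRM ranges are assumptions made for illustration. It shows why the same 8-bit opcode 0xFE collides with INC8r when both instructions sit in the same map, and why the collision disappears once the new instruction is placed in a different map. In the real x86 tables the other maps are also heavily populated, so a free slot still has to be found by hand.

```python
# Toy model of a decode table keyed by (opcode map, opcode, ModRM byte).
# All names here are hypothetical; this is not the LLVM disassembler API.

class DecodeConflict(Exception):
    """Raised when a new instruction would claim an occupied slot."""


class DecodeTable:
    def __init__(self):
        # (opcode_map, opcode, modrm) -> instruction name
        self.slots = {}

    def add(self, name, opcode_map, opcode, modrm_values):
        keys = [(opcode_map, opcode, modrm) for modrm in modrm_values]
        # Check every slot first so a failed add leaves the table unchanged.
        for key in keys:
            if key in self.slots:
                raise DecodeConflict(
                    f"{name} would overwrite {self.slots[key]} "
                    f"(map {key[0]}, opcode {key[1]}, ModRM {key[2]})"
                )
        for key in keys:
            self.slots[key] = name


table = DecodeTable()

# INC8r is opcode 0xFE with ModRM reg field 0; its register forms use
# ModRM bytes 0xC0..0xC7 (192..199), matching the ModRM values in the error.
table.add("INC8r", "one_byte", 0xFE, range(0xC0, 0xC8))

# A register-register instruction with the same opcode in the same map
# claims every ModRM byte 0xC0..0xFF, so it collides at ModRM 192.
try:
    table.add("VADD_256B", "one_byte", 0xFE, range(0xC0, 0x100))
except DecodeConflict as err:
    print("conflict:", err)

# The same opcode byte in a different (here empty) map does not collide.
table.add("VADD_256B", "other_map", 0xFE, range(0xC0, 0x100))
print("registered slots:", len(table.slots))
```

The slot check runs before any write so that a rejected instruction does not leave a partially filled table behind, which keeps the second `add` call independent of the failed one.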