thr3ads.net - llvm dev - [llvm-dev] Error in v64i32 type in x86 backend [Jul 2017]

hameeza ahmed via llvm-dev

2017-Jul-08 04:55 UTC

[llvm-dev] Error in v64i32 type in x86 backend

Thank you.
I have changed it as follows. Is it fine now?

def VADD_256B : I<0xFE, MRMDestReg, (outs VR2048:$dst),
                  (ins VR2048:$src1, VR2048:$src2),
                  "VADD_256B\t{$src, $dst|$dst, $src}",
                  [(set VR2048:$dst, (add VR2048:$src1, VR2048:$src2))]]>;

Also, I have changed the class from RI to I here. Does it make any difference?



On Sat, Jul 8, 2017 at 9:38 AM, Craig Topper <craig.topper at gmail.com> wrote:
> IIC_XADD_REG is used to associate latency and other information for use by
> the instruction scheduling pass.
>
> You're missing a pattern in the square bracket to match an add node. You
> also need two VR2048 registers in the 'ins'
>
> ~Craig
>
> On Fri, Jul 7, 2017 at 9:29 PM, hameeza ahmed <hahmed2305 at gmail.com>
> wrote:
>
>> Can you please tell me whether the following add is correct for adding two
>> 64xi32 numbers?
>>
>> def VADD_256B : RI<0xFE, MRMDestReg, (outs VR2048:$dst), (ins VR2048:$src),
>>                    "VADD_256B\t{$src, $dst|$dst, $src}", [],
>>                    IIC_XADD_REG>, TB;
>>
>> What is IIC_XADD_REG here?
>>
>>
>>
>> On Sat, Jul 8, 2017 at 8:48 AM, Craig Topper <craig.topper at gmail.com>
>> wrote:
>>
>>> Change the i32 in the store pattern to v64i32.
>>>
>>> On Fri, Jul 7, 2017 at 8:41 PM hameeza ahmed <hahmed2305 at gmail.com>
>>> wrote:
>>>
>>>> Thank you. I understand how the AVX-512 vector instructions are written in
>>>> X86InstrAVX512.td. I need to define my own vector instructions, so I wrote
>>>> the following in X86InstrInfo.td:
>>>>
>>>> def VMOV_256B_RM : I<0x6F, MRMSrcMem, (outs VR2048:$dst), (ins i32mem:$src),
>>>>                     "vmov_256B_rm\t{$src, $dst|$dst, $src}",
>>>>                     [(set VR2048:$dst, (v64i32 (scalar_to_vector
>>>>                       (loadi32 addr:$src))))],
>>>>                     IIC_MOV_MEM>, EVEX;
>>>>
>>>> def VMOV_256B_MR : I<0x7F, MRMDestMem, (outs), (ins i32mem:$dst, VR2048:$src),
>>>>                     "vmov_256B_mr\t{$src, $dst|$dst, $src}",
>>>>                     [(store (i32 (bitconvert VR2048:$src)), addr:$dst)],
>>>>                     IIC_MOV_MEM>, EVEX;
>>>>
>>>> When I build, these instructions appear in X86GenInstrInfo, but my
>>>> instruction is still not selected when I run the input file in debug mode.
>>>> I get the following errors:
>>>>
>>>>
>>>> ===== Instruction selection begins: BB#1 'vector.body'
>>>> Selecting: t9: ch = store<ST256[bitcast ([65 x i32]* @c to <64 x
>>>> i32>*)](align=16)(tbaa=<0x3817578>)> t8, t7, t11, undef:i64
>>>>
>>>> ISEL: Starting pattern match on root node: t9: ch = store<ST256[bitcast
>>>> ([65 x i32]* @c to <64 x i32>*)](align=16)(tbaa=<0x3817578>)> t8, t7,
>>>> t11, undef:i64
>>>>
>>>>   Skipped scope entry (due to false predicate) at index 14,
>>>> continuing at 81
>>>>   Skipped scope entry (due to false predicate) at index 82,
>>>> continuing at 149
>>>>   Skipped scope entry (due to false predicate) at index 150,
>>>> continuing at 217
>>>>   Skipped scope entry (due to false predicate) at index 218,
>>>> continuing at 267
>>>>   Skipped scope entry (due to false predicate) at index 268,
>>>> continuing at 317
>>>>   Skipped scope entry (due to false predicate) at index 318,
>>>> continuing at 367
>>>>   Skipped scope entry (due to false predicate) at index 368,
>>>> continuing at 394
>>>>   Skipped scope entry (due to false predicate) at index 395,
>>>> continuing at 421
>>>>   Skipped scope entry (due to false predicate) at index 422,
>>>> continuing at 471
>>>>   Skipped scope entry (due to false predicate) at index 472,
>>>> continuing at 521
>>>>   Skipped scope entry (due to false predicate) at index 522,
>>>> continuing at 571
>>>>   Skipped scope entry (due to false predicate) at index 572,
>>>> continuing at 639
>>>>   Skipped scope entry (due to false predicate) at index 640,
>>>> continuing at 707
>>>>   Skipped scope entry (due to false predicate) at index 708,
>>>> continuing at 775
>>>>   Skipped scope entry (due to false predicate) at index 776,
>>>> continuing at 804
>>>>   Skipped scope entry (due to false predicate) at index 805,
>>>> continuing at 833
>>>>   Skipped scope entry (due to false predicate) at index 834,
>>>> continuing at 862
>>>>   Skipped scope entry (due to false predicate) at index 863,
>>>> continuing at 891
>>>>   Skipped scope entry (due to false predicate) at index 892,
>>>> continuing at 920
>>>>   Skipped scope entry (due to false predicate) at index 921,
>>>> continuing at 949
>>>>   Skipped scope entry (due to false predicate) at index 950,
>>>> continuing at 987
>>>>   Skipped scope entry (due to false predicate) at index 988,
>>>> continuing at 1025
>>>>   Match failed at index 12
>>>>   Continuing at 1026
>>>>   OpcodeSwitch from 1029 to 5725
>>>>   Match failed at index 5743
>>>>   Continuing at 5772
>>>>   Match failed at index 5776
>>>>   Continuing at 5805
>>>>   Match failed at index 5809
>>>>   Continuing at 5838
>>>>   Match failed at index 5842
>>>>   Continuing at 5911
>>>>   Match failed at index 5915
>>>>   Continuing at 5953
>>>>   Match failed at index 5957
>>>>   Continuing at 5995
>>>>   Match failed at index 5999
>>>>   Continuing at 6037
>>>>   Match failed at index 6041
>>>>   Continuing at 6084
>>>>   Match failed at index 6088
>>>>   Continuing at 6131
>>>>   Skipped scope entry (due to false predicate) at index 6138,
>>>> continuing at 6181
>>>>   Skipped scope entry (due to false predicate) at index 6182,
>>>> continuing at 6228
>>>>   Skipped scope entry (due to false predicate) at index 6235,
>>>> continuing at 6384
>>>>   Match failed at index 6388
>>>>   Continuing at 6419
>>>>   Match failed at index 6423
>>>>   Continuing at 6454
>>>>   Match failed at index 6458
>>>>   Continuing at 6489
>>>>   Continuing at 6490
>>>>   Continuing at 6491
>>>>   Continuing at 6492
>>>>   Match failed at index 6514
>>>>   Continuing at 6545
>>>>   Match failed at index 6562
>>>>   Continuing at 6593
>>>>   Match failed at index 6610
>>>>   Continuing at 6641
>>>>   Continuing at 6642
>>>>   Match failed at index 6658
>>>>   Continuing at 6772
>>>>   Match failed at index 6788
>>>>   Continuing at 6902
>>>>   Continuing at 13636
>>>>   Match failed at index 13640
>>>>   Continuing at 14940
>>>>   Match failed at index 14943
>>>>   Continuing at 15415
>>>>   Match failed at index 15417
>>>>   Continuing at 15570
>>>>   Match failed at index 15571
>>>>   Continuing at 15598
>>>>   Match failed at index 15599
>>>>   Continuing at 15716
>>>>   Match failed at index 15719
>>>>   Continuing at 15837
>>>>   Match failed at index 15840
>>>>   Continuing at 16198
>>>>   Skipped scope entry (due to false predicate) at index 16203,
>>>> continuing at 16285
>>>>   Skipped scope entry (due to false predicate) at index 16286,
>>>> continuing at 16394
>>>>   Skipped scope entry (due to false predicate) at index 16395,
>>>> continuing at 16464
>>>>   Skipped scope entry (due to false predicate) at index 16465,
>>>> continuing at 16487
>>>>   Skipped scope entry (due to false predicate) at index 16488,
>>>> continuing at 16510
>>>>   Skipped scope entry (due to false predicate) at index 16511,
>>>> continuing at 16533
>>>>   Skipped scope entry (due to false predicate) at index 16534,
>>>> continuing at 16556
>>>>   Skipped scope entry (due to false predicate) at index 16557,
>>>> continuing at 16680
>>>>   Skipped scope entry (due to false predicate) at index 16681,
>>>> continuing at 16804
>>>>   Skipped scope entry (due to false predicate) at index 16805,
>>>> continuing at 16890
>>>>   Skipped scope entry (due to false predicate) at index 16891,
>>>> continuing at 16976
>>>>   Skipped scope entry (due to false predicate) at index 16978,
>>>> continuing at 17169
>>>>   Skipped scope entry (due to false predicate) at index 17171,
>>>> continuing at 17342
>>>>   Skipped scope entry (due to false predicate) at index 17344,
>>>> continuing at 17497
>>>>   Skipped scope entry (due to false predicate) at index 17499,
>>>> continuing at 17632
>>>>   Skipped scope entry (due to false predicate) at index 17634,
>>>> continuing at 17801
>>>>   Skipped scope entry (due to false predicate) at index 17803,
>>>> continuing at 17944
>>>>   Skipped scope entry (due to false predicate) at index 17946,
>>>> continuing at 18074
>>>>   Skipped scope entry (due to false predicate) at index 18075,
>>>> continuing at 18178
>>>>   Skipped scope entry (due to false predicate) at index 18179,
>>>> continuing at 18253
>>>>   Skipped scope entry (due to false predicate) at index 18254,
>>>> continuing at 18278
>>>>   Skipped scope entry (due to false predicate) at index 18279,
>>>> continuing at 18303
>>>>   Skipped scope entry (due to false predicate) at index 18304,
>>>> continuing at 18328
>>>>   Skipped scope entry (due to false predicate) at index 18329,
>>>> continuing at 18376
>>>>   Skipped scope entry (due to false predicate) at index 18377,
>>>> continuing at 18424
>>>>   Skipped scope entry (due to false predicate) at index 18425,
>>>> continuing at 18520
>>>>   Skipped scope entry (due to false predicate) at index 18521,
>>>> continuing at 18636
>>>>   Skipped scope entry (due to false predicate) at index 18637,
>>>> continuing at 18661
>>>>   Skipped scope entry (due to false predicate) at index 18662,
>>>> continuing at 18711
>>>>   Skipped scope entry (due to false predicate) at index 18712,
>>>> continuing at 18736
>>>>   Skipped scope entry (due to false predicate) at index 18737,
>>>> continuing at 18770
>>>>   Skipped scope entry (due to false predicate) at index 18771,
>>>> continuing at 18856
>>>>   Skipped scope entry (due to false predicate) at index 18857,
>>>> continuing at 18942
>>>>   Skipped scope entry (due to false predicate) at index 18943,
>>>> continuing at 19028
>>>>   Match failed at index 16201
>>>>   Continuing at 19029
>>>> LLVM ERROR: Cannot select: t9: ch = store<ST256[bitcast ([65 x i32]* @c
>>>> to <64 x i32>*)](align=16)(tbaa=<0x3817578>)> t8, t7, t11, undef:i64
>>>>   t7: v64i32 = add t6, t4
>>>>     t6: v64i32,ch = load<LD256[bitcast ([65 x i32]* @c to <64 x
>>>> i32>*)](align=16)(tbaa=<0x3817578>)(dereferenceable)> t0, t11,
>>>> undef:i64
>>>>       t11: i64 = X86ISD::Wrapper TargetGlobalAddress:i64<[65 x i32]*
>>>> @c> 0
>>>>         t10: i64 = TargetGlobalAddress<[65 x i32]* @c> 0
>>>>       t3: i64 = undef
>>>>     t4: v64i32,ch = load<LD256[bitcast ([65 x i32]* @b to <64 x
>>>> i32>*)](align=16)(tbaa=<0x3817578>)(dereferenceable)> t0, t13,
>>>> undef:i64
>>>>       t13: i64 = X86ISD::Wrapper TargetGlobalAddress:i64<[65 x i32]*
>>>> @b> 0
>>>>         t12: i64 = TargetGlobalAddress<[65 x i32]* @b> 0
>>>>       t3: i64 = undef
>>>>   t11: i64 = X86ISD::Wrapper TargetGlobalAddress:i64<[65 x i32]* @c> 0
>>>>     t10: i64 = TargetGlobalAddress<[65 x i32]* @c> 0
>>>>   t3: i64 = undef
>>>> In function: foo
>>>>
>>>>
>>>>
>>>> What could be the reason for this? Please correct me.
>>>> I am stuck at this point.
>>>>
>>>>
>>>>
>>>> On Fri, Jul 7, 2017 at 10:59 PM, Friedman, Eli <efriedma at codeaurora.org>
>>>> wrote:
>>>>
>>>>> The word "fold" is used all over LLVM.  It generally refers to
>>>>> transformations which delete an instruction.
>>>>>
>>>>> If you're asking about
>>>>> http://llvm.org/docs/CodeGenerator.html#instruction-folding , it just
>>>>> means an instruction which was produced by the "instruction folding"
>>>>> transform; there isn't anything special about the instruction itself.
>>>>>
>>>>> -Eli
>>>>>
>>>>>
>>>>> On 7/6/2017 10:51 PM, hameeza ahmed wrote:
>>>>>
>>>>> What is meant by folded instructions in LLVM?
>>>>> How do they work?
>>>>>
>>>>> On Fri, Jul 7, 2017 at 10:19 AM, hameeza ahmed <hahmed2305 at gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Thank You.
>>>>>>
>>>>>> On Fri, Jul 7, 2017 at 10:03 AM, Craig Topper <craig.topper at gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Yes, that error is from instruction selection. I think your
>>>>>>> legalization changes worked fine.
>>>>>>>
>>>>>>> ~Craig
>>>>>>>
>>>>>>> On Thu, Jul 6, 2017 at 8:21 PM, hameeza ahmed via llvm-dev
>>>>>>> <llvm-dev at lists.llvm.org> wrote:
>>>>>>>
>>>>>>>> I also ran the following command:
>>>>>>>> llc -debug filer-knl_o3.ll
>>>>>>>>
>>>>>>>> Its output is attached here. Looking at the output, can we say that
>>>>>>>> legalization runs fine and that the error is due to instruction
>>>>>>>> selection / pattern matching, which is not yet implemented?
>>>>>>>>
>>>>>>>> So do I need to worry and try to correct it at this stage, or should
>>>>>>>> I move forward and implement instruction selection / pattern matching?
>>>>>>>>
>>>>>>>> Please guide me.
>>>>>>>>
>>>>>>>> Thank You
>>>>>>>>
>>>>>>>> On Fri, Jul 7, 2017 at 8:00 AM, hameeza ahmed <hahmed2305 at gmail.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Thank you. I have seen these links, but they don't cover the
>>>>>>>>> problem that I have mentioned. I am doing everything step by step.
>>>>>>>>>
>>>>>>>>> I haven't yet worked with the instruction selection phase or its
>>>>>>>>> files. Before that, I am trying to do legalization by allowing
>>>>>>>>> vectors with more than 16 elements, i.e. 64xi32. So far I have mainly
>>>>>>>>> worked with two files: registerinfo.td, to define the register class
>>>>>>>>> to be used in legalization, and, most importantly,
>>>>>>>>> X86ISelLowering.cpp.
>>>>>>>>>
>>>>>>>>> Is there any relation between this and instruction selection? Since
>>>>>>>>> instruction selection comes after combine and legalize, I haven't yet
>>>>>>>>> worked on it.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Please correct me, I am stuck here.
>>>>>>>>>
>>>>>>>>> Thank You again
>>>>>>>>>
>>>>>>>>> On Fri, Jul 7, 2017 at 7:11 AM, Friedman, Eli
>>>>>>>>> <efriedma at codeaurora.org> wrote:
>>>>>>>>>
>>>>>>>>>> Have you read http://llvm.org/docs/WritingAnLLVMBackend.html and
>>>>>>>>>> http://llvm.org/docs/CodeGenerator.html ?
>>>>>>>>>> http://llvm.org/docs/WritingAnLLVMBackend.html#instruction-selector
>>>>>>>>>> describes how to define a store instruction.
>>>>>>>>>>
>>>>>>>>>> -Eli
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 7/6/2017 6:51 PM, hameeza ahmed via llvm-dev wrote:
>>>>>>>>>>
>>>>>>>>>> Please correct me; I am stuck at this point.
>>>>>>>>>>
>>>>>>>>>> On Jul 6, 2017 5:18 PM, "hameeza ahmed" <hahmed2305 at gmail.com>
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>> Hello,
>>>>>>>>>> I am experimenting with increasing the register/vector width in
>>>>>>>>>> the x86 backend to 64 elements of 32 bits instead of 16.
>>>>>>>>>> For example, I have a loop with 65 iterations. Even if my IR
>>>>>>>>>> generates a v64i32 and 1 scalar, the backend still breaks the
>>>>>>>>>> v64i32 into 4 v16i32. I want it to retain v64i32. Likewise, if
>>>>>>>>>> there are 128 elements in the loop, it should break them into
>>>>>>>>>> 2 v64i32 instructions.
>>>>>>>>>>
>>>>>>>>>> To do this I made the necessary changes in X86ISelLowering.cpp
>>>>>>>>>> and rebuilt LLVM. When I then use the option
>>>>>>>>>> -view-dag-combine2-dags, I get the required output in the graph
>>>>>>>>>> but the following error on the console:
>>>>>>>>>>
>>>>>>>>>> LLVM ERROR: Cannot select: t10: ch = store<ST256[bitcast ([65 x
>>>>>>>>>> i32]* @a to <64 x i32>*)](align=16)(tbaa=<0x30c5438>)> t9, t7,
>>>>>>>>>> t12, undef:i64
>>>>>>>>>>   t7: v64i32 = add t6, t4
>>>>>>>>>>     t6: v64i32,ch = load<LD256[bitcast ([65 x i32]* @c to <64 x
>>>>>>>>>> i32>*)](align=16)(tbaa=<0x30c5438>)(dereferenceable)> t0, t14,
>>>>>>>>>> undef:i64
>>>>>>>>>>       t14: i64 = X86ISD::Wrapper TargetGlobalAddress:i64<[65 x
>>>>>>>>>> i32]* @c> 0
>>>>>>>>>>         t13: i64 = TargetGlobalAddress<[65 x i32]* @c> 0
>>>>>>>>>>       t3: i64 = undef
>>>>>>>>>>     t4: v64i32,ch = load<LD256[bitcast ([65 x i32]* @b to <64 x
>>>>>>>>>> i32>*)](align=16)(tbaa=<0x30c5438>)(dereferenceable)> t0, t16,
>>>>>>>>>> undef:i64
>>>>>>>>>>       t16: i64 = X86ISD::Wrapper TargetGlobalAddress:i64<[65 x
>>>>>>>>>> i32]* @b> 0
>>>>>>>>>>         t15: i64 = TargetGlobalAddress<[65 x i32]* @b> 0
>>>>>>>>>>       t3: i64 = undef
>>>>>>>>>>   t12: i64 = X86ISD::Wrapper TargetGlobalAddress:i64<[65 x i32]*
>>>>>>>>>> @a> 0
>>>>>>>>>>     t11: i64 = TargetGlobalAddress<[65 x i32]* @a> 0
>>>>>>>>>>   t3: i64 = undef
>>>>>>>>>> In function: foo
>>>>>>>>>>
>>>>>>>>>> The DAG after legalization is also attached here.
>>>>>>>>>>
>>>>>>>>>> The source is a vector sum of 65 elements.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Kindly correct me.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> _______________________________________________
>>>>>>>>>> LLVM Developers mailing list
>>>>>>>>>> llvm-dev at lists.llvm.org
>>>>>>>>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>>>> Employee of Qualcomm Innovation Center, Inc.
>>>>>>>>>> Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>> _______________________________________________
>>>>>>>> LLVM Developers mailing list
>>>>>>>> llvm-dev at lists.llvm.org
>>>>>>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>> --
>>>>> Employee of Qualcomm Innovation Center, Inc.
>>>>> Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project
>>>>>
>>>>>
>>>> --
>>> ~Craig
>>>
>>
>>

Craig Topper via llvm-dev

2017-Jul-08 04:59 UTC

[llvm-dev] Error in v64i32 type in x86 backend

You don't want RI. That's used for instructions that need a REX prefix. You
need to use $src1 and $src2 in the assembly string too. It also looks like
you have two closing ] brackets.

~Craig
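
Applying the three points from this reply to the posted definition might look like the sketch below. It is not taken from the thread. It assumes the poster's VR2048 register class is declared with v64i32 as its value type (its definition is not shown), that the itinerary argument of the I class can be left at its default, and that a three-operand assembly string is the intended syntax. The 0xFE opcode and the MRMDestReg format are kept exactly as posted, without checking that they encode a two-source instruction correctly.

```tablegen
// Sketch only: VR2048 and VADD_256B come from the thread, not from upstream LLVM.
def VADD_256B : I<0xFE, MRMDestReg,
                  (outs VR2048:$dst),
                  (ins VR2048:$src1, VR2048:$src2),
                  // Name both sources: AT&T order first, Intel order second.
                  "VADD_256B\t{$src2, $src1, $dst|$dst, $src1, $src2}",
                  // A single closing bracket; the pattern matches the add node.
                  [(set VR2048:$dst,
                        (add VR2048:$src1, VR2048:$src2))]>;
```

With a pattern like this in place, the "t7: v64i32 = add t6, t4" node from the earlier dump would have a candidate instruction to select, provided the v64i32 loads and the store are also covered by their own patterns.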

On Fri, Jul 7, 2017 at 9:55 PM, hameeza ahmed <hahmed2305 at gmail.com>
wrote:
> Thank you;
> i have changed as follows.is it fine now?
>
> def  VADD_256B  : I<0xFE, MRMDestReg, (outs VR2048:$dst), (ins
VR2048:$src1,
> VR2048:$src2),
>                    "VADD_256B\t{$src, $dst|$dst, $src}", [(set
VR2048:$dst,
> (add VR2048:$src1, VR2048:$src2))]]>;
>
> Also here i have changed class RI to I. Does it make any difference?
>
>
>
> On Sat, Jul 8, 2017 at 9:38 AM, Craig Topper <craig.topper at
gmail.com>
> wrote:
>
>> IIC_XADD_REG is used to associate latency and other information for use
>> by the instruction scheduling pass.
>>
>> You're missing a pattern in the square bracket to match an add
node. You
>> also need two VR2048 registers in the 'ins'
>>
>> ~Craig
>>
>> On Fri, Jul 7, 2017 at 9:29 PM, hameeza ahmed <hahmed2305 at
gmail.com>
>> wrote:
>>
>>> Can you please tell whether following add is correct to add 2
64xi32
>>> numbers.
>>>
>>> def VADD_256B  : RI<0xFE, MRMDestReg, (outs VR2048:$dst), (ins
VR2048
>>> :$src),
>>>                    "VADD_256B\t{$src, $dst|$dst, $src}",
[],
>>> IIC_XADD_REG>, TB;
>>>
>>> what is llc_xadd_reg here?
>>>
>>>
>>>
>>> On Sat, Jul 8, 2017 at 8:48 AM, Craig Topper <craig.topper at
gmail.com>
>>> wrote:
>>>
>>>> Change the i32 in the store pattern to v64i32.
>>>>
>>>> On Fri, Jul 7, 2017 at 8:41 PM hameeza ahmed <hahmed2305 at
gmail.com>
>>>> wrote:
>>>>
>>>>> Thank you. i understood how avx512 vector instructions are
written in
>>>>> x86instravx512. i need to define my vector instructions so
i wrote;
>>>>>
>>>>>  def VMOV_256B_RM : I<0x6F, MRMSrcMem, (outs
VR2048:$dst), (ins
>>>>> i32mem:$src),
>>>>>                     "vmov_256B_rm\t{$src, $dst|$dst,
$src}",
>>>>>                     [(set VR2048:$dst, (v64i32
(scalar_to_vector
>>>>> (loadi32 addr:$src))))],
>>>>>                     IIC_MOV_MEM>, EVEX;
>>>>>
>>>>> def VMOV_256B_MR : I<0x7F, MRMDestMem, (outs), (ins
i32mem:$dst,
>>>>> VR2048:$src),
>>>>>                     "vmov_256B_mr\t{$src, $dst|$dst,
$src}",
>>>>>                     [(store (i32 (bitconvert VR2048:$src)),
>>>>> addr:$dst)], IIC_MOV_MEM>, EVEX;
>>>>>
>>>>> in x86instrinfo.td;
>>>>>
>>>>> when i build i got these instructions in X86GenInstrInfo.
>>>>> but still my instruction is not selected when i run input
file in
>>>>> debug mode; getting following errors;
>>>>>
>>>>>
>>>>> ===== Instruction selection begins: BB#1
'vector.body'
>>>>> Selecting: t9: ch = store<ST256[bitcast ([65 x i32]* @c
to <64 x
>>>>> i32>*)](align=16)(tbaa=<0x3817578>)> t8, t7,
t11, undef:i64
>>>>>
>>>>> ISEL: Starting pattern match on root node: t9: ch
>>>>> store<ST256[bitcast ([65 x i32]* @c to <64 x
i32>*)](align=16)(tbaa=<0x3817578>)>
>>>>> t8, t7, t11, undef:i64
>>>>>
>>>>>   Skipped scope entry (due to false predicate) at index 14,
continuing
>>>>> at 81
>>>>>   Skipped scope entry (due to false predicate) at index 82,
continuing
>>>>> at 149
>>>>>   Skipped scope entry (due to false predicate) at index
150,
>>>>> continuing at 217
>>>>>   Skipped scope entry (due to false predicate) at index
218,
>>>>> continuing at 267
>>>>>   Skipped scope entry (due to false predicate) at index
268,
>>>>> continuing at 317
>>>>>   Skipped scope entry (due to false predicate) at index
318,
>>>>> continuing at 367
>>>>>   Skipped scope entry (due to false predicate) at index
368,
>>>>> continuing at 394
>>>>>   Skipped scope entry (due to false predicate) at index
395,
>>>>> continuing at 421
>>>>>   Skipped scope entry (due to false predicate) at index
422,
>>>>> continuing at 471
>>>>>   Skipped scope entry (due to false predicate) at index
472,
>>>>> continuing at 521
>>>>>   Skipped scope entry (due to false predicate) at index
522,
>>>>> continuing at 571
>>>>>   Skipped scope entry (due to false predicate) at index
572,
>>>>> continuing at 639
>>>>>   Skipped scope entry (due to false predicate) at index
640,
>>>>> continuing at 707
>>>>>   Skipped scope entry (due to false predicate) at index
708,
>>>>> continuing at 775
>>>>>   Skipped scope entry (due to false predicate) at index
776,
>>>>> continuing at 804
>>>>>   Skipped scope entry (due to false predicate) at index
805,
>>>>> continuing at 833
>>>>>   Skipped scope entry (due to false predicate) at index
834,
>>>>> continuing at 862
>>>>>   Skipped scope entry (due to false predicate) at index
863,
>>>>> continuing at 891
>>>>>   Skipped scope entry (due to false predicate) at index
892,
>>>>> continuing at 920
>>>>>   Skipped scope entry (due to false predicate) at index
921,
>>>>> continuing at 949
>>>>>   Skipped scope entry (due to false predicate) at index
950,
>>>>> continuing at 987
>>>>>   Skipped scope entry (due to false predicate) at index
988,
>>>>> continuing at 1025
>>>>>   Match failed at index 12
>>>>>   Continuing at 1026
>>>>>   OpcodeSwitch from 1029 to 5725
>>>>>   Match failed at index 5743
>>>>>   Continuing at 5772
>>>>>   Match failed at index 5776
>>>>>   Continuing at 5805
>>>>>   Match failed at index 5809
>>>>>   Continuing at 5838
>>>>>   Match failed at index 5842
>>>>>   Continuing at 5911
>>>>>   Match failed at index 5915
>>>>>   Continuing at 5953
>>>>>   Match failed at index 5957
>>>>>   Continuing at 5995
>>>>>   Match failed at index 5999
>>>>>   Continuing at 6037
>>>>>   Match failed at index 6041
>>>>>   Continuing at 6084
>>>>>   Match failed at index 6088
>>>>>   Continuing at 6131
>>>>>   Skipped scope entry (due to false predicate) at index
6138,
>>>>> continuing at 6181
>>>>>   Skipped scope entry (due to false predicate) at index
6182,
>>>>> continuing at 6228
>>>>>   Skipped scope entry (due to false predicate) at index
6235,
>>>>> continuing at 6384
>>>>>   Match failed at index 6388
>>>>>   Continuing at 6419
>>>>>   Match failed at index 6423
>>>>>   Continuing at 6454
>>>>>   Match failed at index 6458
>>>>>   Continuing at 6489
>>>>>   Continuing at 6490
>>>>>   Continuing at 6491
>>>>>   Continuing at 6492
>>>>>   Match failed at index 6514
>>>>>   Continuing at 6545
>>>>>   Match failed at index 6562
>>>>>   Continuing at 6593
>>>>>   Match failed at index 6610
>>>>>   Continuing at 6641
>>>>>   Continuing at 6642
>>>>>   Match failed at index 6658
>>>>>   Continuing at 6772
>>>>>   Match failed at index 6788
>>>>>   Continuing at 6902
>>>>>   Continuing at 13636
>>>>>   Match failed at index 13640
>>>>>   Continuing at 14940
>>>>>   Match failed at index 14943
>>>>>   Continuing at 15415
>>>>>   Match failed at index 15417
>>>>>   Continuing at 15570
>>>>>   Match failed at index 15571
>>>>>   Continuing at 15598
>>>>>   Match failed at index 15599
>>>>>   Continuing at 15716
>>>>>   Match failed at index 15719
>>>>>   Continuing at 15837
>>>>>   Match failed at index 15840
>>>>>   Continuing at 16198
>>>>>   Skipped scope entry (due to false predicate) at index
16203,
>>>>> continuing at 16285
>>>>>   Skipped scope entry (due to false predicate) at index
16286,
>>>>> continuing at 16394
>>>>>   Skipped scope entry (due to false predicate) at index
16395,
>>>>> continuing at 16464
>>>>>   Skipped scope entry (due to false predicate) at index
16465,
>>>>> continuing at 16487
>>>>>   Skipped scope entry (due to false predicate) at index
16488,
>>>>> continuing at 16510
>>>>>   Skipped scope entry (due to false predicate) at index
16511,
>>>>> continuing at 16533
>>>>>   Skipped scope entry (due to false predicate) at index
16534,
>>>>> continuing at 16556
>>>>>   Skipped scope entry (due to false predicate) at index
16557,
>>>>> continuing at 16680
>>>>>   Skipped scope entry (due to false predicate) at index
16681,
>>>>> continuing at 16804
>>>>>   Skipped scope entry (due to false predicate) at index
16805,
>>>>> continuing at 16890
>>>>>   Skipped scope entry (due to false predicate) at index
16891,
>>>>> continuing at 16976
>>>>>   Skipped scope entry (due to false predicate) at index
16978,
>>>>> continuing at 17169
>>>>>   Skipped scope entry (due to false predicate) at index
17171,
>>>>> continuing at 17342
>>>>>   Skipped scope entry (due to false predicate) at index
17344,
>>>>> continuing at 17497
>>>>>   Skipped scope entry (due to false predicate) at index
17499,
>>>>> continuing at 17632
>>>>>   Skipped scope entry (due to false predicate) at index
17634,
>>>>> continuing at 17801
>>>>>   Skipped scope entry (due to false predicate) at index
17803,
>>>>> continuing at 17944
>>>>>   Skipped scope entry (due to false predicate) at index
17946,
>>>>> continuing at 18074
>>>>>   Skipped scope entry (due to false predicate) at index
18075,
>>>>> continuing at 18178
>>>>>   Skipped scope entry (due to false predicate) at index
18179,
>>>>> continuing at 18253
>>>>>   Skipped scope entry (due to false predicate) at index
18254,
>>>>> continuing at 18278
>>>>>   Skipped scope entry (due to false predicate) at index
18279,
>>>>> continuing at 18303
>>>>>   Skipped scope entry (due to false predicate) at index
18304,
>>>>> continuing at 18328
>>>>>   Skipped scope entry (due to false predicate) at index
18329,
>>>>> continuing at 18376
>>>>>   Skipped scope entry (due to false predicate) at index
18377,
>>>>> continuing at 18424
>>>>>   Skipped scope entry (due to false predicate) at index
18425,
>>>>> continuing at 18520
>>>>>   Skipped scope entry (due to false predicate) at index
18521,
>>>>> continuing at 18636
>>>>>   Skipped scope entry (due to false predicate) at index
18637,
>>>>> continuing at 18661
>>>>>   Skipped scope entry (due to false predicate) at index
18662,
>>>>> continuing at 18711
>>>>>   Skipped scope entry (due to false predicate) at index
18712,
>>>>> continuing at 18736
>>>>>   Skipped scope entry (due to false predicate) at index
18737,
>>>>> continuing at 18770
>>>>>   Skipped scope entry (due to false predicate) at index
18771,
>>>>> continuing at 18856
>>>>>   Skipped scope entry (due to false predicate) at index
18857,
>>>>> continuing at 18942
>>>>>   Skipped scope entry (due to false predicate) at index
18943,
>>>>> continuing at 19028
>>>>>   Match failed at index 16201
>>>>>   Continuing at 19029
>>>>> LLVM ERROR: Cannot select: t9: ch = store<ST256[bitcast
([65 x i32]*
>>>>> @c to <64 x
i32>*)](align=16)(tbaa=<0x3817578>)> t8, t7, t11,
>>>>> undef:i64
>>>>>   t7: v64i32 = add t6, t4
>>>>>     t6: v64i32,ch = load<LD256[bitcast ([65 x i32]* @c
to <64 x
>>>>>
i32>*)](align=16)(tbaa=<0x3817578>)(dereferenceable)> t0, t11,
>>>>> undef:i64
>>>>>       t11: i64 = X86ISD::Wrapper
TargetGlobalAddress:i64<[65 x i32]*
>>>>> @c> 0
>>>>>         t10: i64 = TargetGlobalAddress<[65 x i32]*
@c> 0
>>>>>       t3: i64 = undef
>>>>>     t4: v64i32,ch = load<LD256[bitcast ([65 x i32]* @b
to <64 x
>>>>>
i32>*)](align=16)(tbaa=<0x3817578>)(dereferenceable)> t0, t13,
>>>>> undef:i64
>>>>>       t13: i64 = X86ISD::Wrapper
TargetGlobalAddress:i64<[65 x i32]*
>>>>> @b> 0
>>>>>         t12: i64 = TargetGlobalAddress<[65 x i32]*
@b> 0
>>>>>       t3: i64 = undef
>>>>>   t11: i64 = X86ISD::Wrapper TargetGlobalAddress:i64<[65
x i32]* @c> 0
>>>>>     t10: i64 = TargetGlobalAddress<[65 x i32]* @c> 0
>>>>>   t3: i64 = undef
>>>>> In function: foo
>>>>>
>>>>>
>>>>>
>>>>> What could be the reason of this?? Please correct me.
>>>>> I am stuck at this point....
>>>>>
>>>>>
>>>>>
>>>>> On Fri, Jul 7, 2017 at 10:59 PM, Friedman, Eli <
>>>>> efriedma at codeaurora.org> wrote:
>>>>>
>>>>>> The word "fold" is used all over LLVM.  It
generally refers to
>>>>>> transformations which delete an instruction.
>>>>>>
>>>>>> If you're asking about
http://llvm.org/docs/CodeGener
>>>>>> ator.html#instruction-folding , it just means an
instruction which
>>>>>> was produced by the "instruction folding"
transform; there isn't anything
>>>>>> special about the instruction itself.
>>>>>>
>>>>>> -Eli
>>>>>>
>>>>>>
>>>>>> On 7/6/2017 10:51 PM, hameeza ahmed wrote:
>>>>>>
>>>>>> What is meant by folded instructions in LLVM?
>>>>>> How they work?
>>>>>>
>>>>>> On Fri, Jul 7, 2017 at 10:19 AM, hameeza ahmed
<hahmed2305 at gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Thank You.
>>>>>>>
>>>>>>> On Fri, Jul 7, 2017 at 10:03 AM, Craig Topper <
>>>>>>> craig.topper at gmail.com> wrote:
>>>>>>>
>>>>>>>> Yes, that error is from instruction selection.
I think your
>>>>>>>> legalization changes worked fine.
>>>>>>>>
>>>>>>>> ~Craig
>>>>>>>>
>>>>>>>> On Thu, Jul 6, 2017 at 8:21 PM, hameeza ahmed
via llvm-dev <
>>>>>>>> llvm-dev at lists.llvm.org> wrote:
>>>>>>>>
>>>>>>>>> also i further run the following command;
>>>>>>>>> llc -debug filer-knl_o3.ll
>>>>>>>>>
>>>>>>>>> and its output is attached here. by looking
at the output can we
>>>>>>>>> say that legalization runs fine and the
error is due to instruction
>>>>>>>>> selection/ pattern matching which is not
yet implemented?
>>>>>>>>>
>>>>>>>>> so do i need to worry and try to correct it
at this stage or
>>>>>>>>> should i move forward to implement
instruction selection/ pattern matching?
>>>>>>>>>
>>>>>>>>> Please guide me.
>>>>>>>>>
>>>>>>>>> Thank You
>>>>>>>>>
>>>>>>>>> On Fri, Jul 7, 2017 at 8:00 AM, hameeza
ahmed <
>>>>>>>>> hahmed2305 at gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> Thank You. well i have seen these
links. but they dont cover the
>>>>>>>>>> problem that i have mentioned. actually
i am doing all the things step by
>>>>>>>>>> step.
>>>>>>>>>>
>>>>>>>>>> so i havent yet worked with instruction
selection phase/ files.
>>>>>>>>>> rather before that i am trying to do
legalization by allowing vector
>>>>>>>>>> elements>16 i.e 64xi32. here i have
mainly worked with 2 files uptil now,
>>>>>>>>>> i.e registerinfo.td to define register
class to be called in
>>>>>>>>>> legalization. and most importantly i am
dealing with file
>>>>>>>>>> X86ISelLowering.cpp.
>>>>>>>>>>
>>>>>>>>>> Now is there any relation in this and
instruction selection.
>>>>>>>>>> since instruction selection comes after
combine and legalize so i havent
>>>>>>>>>> yet worked on it.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Please correct me, I am stuck here.
>>>>>>>>>>
>>>>>>>>>> Thank You again
>>>>>>>>>>
>>>>>>>>>> On Fri, Jul 7, 2017 at 7:11 AM, Friedman, Eli <
>>>>>>>>>> efriedma at codeaurora.org> wrote:
>>>>>>>>>>
>>>>>>>>>>> Have you read http://llvm.org/docs/WritingAnLLVMBackend.html
>>>>>>>>>>> and http://llvm.org/docs/CodeGenerator.html ?
>>>>>>>>>>> http://llvm.org/docs/WritingAnLLVMBackend.html#instruction-selector
>>>>>>>>>>> describes how to define a store instruction.
>>>>>>>>>>>
>>>>>>>>>>> -Eli
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On 7/6/2017 6:51 PM, hameeza ahmed via llvm-dev wrote:
>>>>>>>>>>>
>>>>>>>>>>> Please correct me; I am stuck at this point.
>>>>>>>>>>>
>>>>>>>>>>> On Jul 6, 2017 5:18 PM, "hameeza ahmed" <hahmed2305 at gmail.com>
>>>>>>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>> Hello,
>>>>>>>>>>> I am experimenting with increasing the register/vector width in
>>>>>>>>>>> the x86 backend to 64 elements of 32 bits instead of 16.
>>>>>>>>>>> For example, I have a loop with 65 iterations. Even if my IR
>>>>>>>>>>> contains one v64i32 operation and one scalar operation, the
>>>>>>>>>>> backend still breaks the v64i32 into 4 v16i32. I want it to
>>>>>>>>>>> retain v64i32. Likewise, if the loop has 128 elements, it should
>>>>>>>>>>> be broken into 2 v64i32 instructions.
>>>>>>>>>>>
>>>>>>>>>>> To do this, I made the necessary changes in X86ISelLowering.cpp
>>>>>>>>>>> and rebuilt LLVM. When I then use the option
>>>>>>>>>>> -view-dag-combine2-dags, I get the required output in the graph
>>>>>>>>>>> but the following error on the console:
>>>>>>>>>>>
>>>>>>>>>>> LLVM ERROR: Cannot select: t10: ch = store<ST256[bitcast ([65 x i32]* @a to <64 x i32>*)](align=16)(tbaa=<0x30c5438>)> t9, t7, t12, undef:i64
>>>>>>>>>>>   t7: v64i32 = add t6, t4
>>>>>>>>>>>     t6: v64i32,ch = load<LD256[bitcast ([65 x i32]* @c to <64 x i32>*)](align=16)(tbaa=<0x30c5438>)(dereferenceable)> t0, t14, undef:i64
>>>>>>>>>>>       t14: i64 = X86ISD::Wrapper TargetGlobalAddress:i64<[65 x i32]* @c> 0
>>>>>>>>>>>         t13: i64 = TargetGlobalAddress<[65 x i32]* @c> 0
>>>>>>>>>>>       t3: i64 = undef
>>>>>>>>>>>     t4: v64i32,ch = load<LD256[bitcast ([65 x i32]* @b to <64 x i32>*)](align=16)(tbaa=<0x30c5438>)(dereferenceable)> t0, t16, undef:i64
>>>>>>>>>>>       t16: i64 = X86ISD::Wrapper TargetGlobalAddress:i64<[65 x i32]* @b> 0
>>>>>>>>>>>         t15: i64 = TargetGlobalAddress<[65 x i32]* @b> 0
>>>>>>>>>>>       t3: i64 = undef
>>>>>>>>>>>   t12: i64 = X86ISD::Wrapper TargetGlobalAddress:i64<[65 x i32]* @a> 0
>>>>>>>>>>>     t11: i64 = TargetGlobalAddress<[65 x i32]* @a> 0
>>>>>>>>>>>   t3: i64 = undef
>>>>>>>>>>> In function: foo
>>>>>>>>>>>
>>>>>>>>>>> The DAG after legalization is also attached here.
>>>>>>>>>>>
>>>>>>>>>>> The source is a vector sum of 65 elements.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Kindly correct me.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> _______________________________________________
>>>>>>>>>>> LLVM Developers mailing list
>>>>>>>>>>> llvm-dev at lists.llvm.org
>>>>>>>>>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> --
>>>>>>>>>>> Employee of Qualcomm Innovation Center, Inc.
>>>>>>>>>>> Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>> --
>>>>>> Employee of Qualcomm Innovation Center, Inc.
>>>>>> Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project
>>>>>>
>>>>>>
>>>>> --
>>>> ~Craig
>>>>
>>>
>>>
>>

Craig Topper via llvm-dev

2017-Jul-08 05:00 UTC

[llvm-dev] Error in v64i32 type in x86 backend

Oops that should have said "REX prefix" in the first sentence.

~Craig

On Fri, Jul 7, 2017 at 9:59 PM, Craig Topper <craig.topper at gmail.com>
wrote:
> You don't want RI. That's used for instructions that need a reg prefix.
> You need to use $src1 and $src2 in the assembly string too. It also looks
> like you have two closing ] brackets.
>
> ~Craig
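
Editor's sketch: applying the three fixes named in the reply above (use I
rather than RI, reference $src1 and $src2 in the assembly string, and keep a
single closing bracket) to the VADD_256B definition quoted below might give
something like the following. VR2048 is the poster's own register class, not
an upstream one, and the opcode and MRMDestReg form are carried over
unchanged. This has not been run through llvm-tblgen, so treat it as an
illustration under those assumptions, not a verified definition.

```tablegen
// Sketch only: VR2048 is the poster's own (non-upstream) register class.
def VADD_256B : I<0xFE, MRMDestReg, (outs VR2048:$dst),
                  (ins VR2048:$src1, VR2048:$src2),
                  // Both sources now appear in the AT&T and Intel variants.
                  "VADD_256B\t{$src2, $src1, $dst|$dst, $src1, $src2}",
                  // One closing bracket; the pattern matches an 'add' node.
                  [(set VR2048:$dst, (add VR2048:$src1, VR2048:$src2))]>;
```

Whether a three-operand register form also needs a different encoding format
or a tied-operand constraint is not settled in this thread.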
>
> On Fri, Jul 7, 2017 at 9:55 PM, hameeza ahmed <hahmed2305 at gmail.com>
> wrote:
>
>> Thank you;
>> I have changed it as follows. Is it fine now?
>>
>> def  VADD_256B  : I<0xFE, MRMDestReg, (outs VR2048:$dst),
>>                    (ins VR2048:$src1, VR2048:$src2),
>>                    "VADD_256B\t{$src, $dst|$dst, $src}",
>>                    [(set VR2048:$dst, (add VR2048:$src1, VR2048:$src2))]]>;
>>
>> Also, I have changed the class from RI to I here. Does it make any difference?
>>
>>
>>
>> On Sat, Jul 8, 2017 at 9:38 AM, Craig Topper <craig.topper at gmail.com>
>> wrote:
>>
>>> IIC_XADD_REG is used to associate latency and other information for use
>>> by the instruction scheduling pass.
>>>
>>> You're missing a pattern in the square bracket to match an add node. You
>>> also need two VR2048 registers in the 'ins'
>>>
>>> ~Craig
>>>
>>> On Fri, Jul 7, 2017 at 9:29 PM, hameeza ahmed <hahmed2305 at gmail.com>
>>> wrote:
>>>
>>>> Can you please tell me whether the following add is correct for adding
>>>> two 64 x i32 vectors?
>>>>
>>>> def VADD_256B  : RI<0xFE, MRMDestReg, (outs VR2048:$dst), (ins VR2048:$src),
>>>>                    "VADD_256B\t{$src, $dst|$dst, $src}", [],
>>>>                    IIC_XADD_REG>, TB;
>>>>
>>>> What is IIC_XADD_REG here?
>>>>
>>>>
>>>>
>>>> On Sat, Jul 8, 2017 at 8:48 AM, Craig Topper <craig.topper at gmail.com>
>>>> wrote:
>>>>
>>>>> Change the i32 in the store pattern to v64i32.
>>>>>
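
Editor's sketch: the advice above, applied to the VMOV_256B_MR definition
quoted further down, might look like the following. VR2048 and its v64i32
register type are the poster's additions, not upstream ones. Dropping the
bitconvert is an assumption made here (it presumes VR2048 holds v64i32
directly); the reply itself only asks for the type to change. The i32mem
operand is kept as in the original. This has not been run through
llvm-tblgen.

```tablegen
// Sketch only: VR2048 and the v64i32 register type are the poster's additions.
def VMOV_256B_MR : I<0x7F, MRMDestMem, (outs), (ins i32mem:$dst, VR2048:$src),
                     "vmov_256B_mr\t{$src, $dst|$dst, $src}",
                     // The stored value is typed v64i32 rather than i32, so
                     // the pattern can match the v64i32 store in the DAG.
                     [(store (v64i32 VR2048:$src), addr:$dst)],
                     IIC_MOV_MEM>, EVEX;
```

The "Cannot select" dump quoted further down has a store of a v64i32 value at
its root, which is why a pattern typed i32 never matches it.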
>>>>> On Fri, Jul 7, 2017 at 8:41 PM hameeza ahmed <hahmed2305 at gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Thank you. I understood how AVX-512 vector instructions are written in
>>>>>> X86InstrAVX512.td. I need to define my own vector instructions, so I
>>>>>> wrote:
>>>>>>
>>>>>> def VMOV_256B_RM : I<0x6F, MRMSrcMem, (outs VR2048:$dst), (ins i32mem:$src),
>>>>>>                     "vmov_256B_rm\t{$src, $dst|$dst, $src}",
>>>>>>                     [(set VR2048:$dst, (v64i32 (scalar_to_vector
>>>>>>                                                 (loadi32 addr:$src))))],
>>>>>>                     IIC_MOV_MEM>, EVEX;
>>>>>>
>>>>>> def VMOV_256B_MR : I<0x7F, MRMDestMem, (outs), (ins i32mem:$dst, VR2048:$src),
>>>>>>                     "vmov_256B_mr\t{$src, $dst|$dst, $src}",
>>>>>>                     [(store (i32 (bitconvert VR2048:$src)), addr:$dst)],
>>>>>>                     IIC_MOV_MEM>, EVEX;
>>>>>>
>>>>>> in X86InstrInfo.td.
>>>>>>
>>>>>> When I build, I get these instructions in X86GenInstrInfo, but my
>>>>>> instruction is still not selected when I run the input file in debug
>>>>>> mode. I get the following errors:
>>>>>>
>>>>>>
>>>>>> ===== Instruction selection begins: BB#1 'vector.body'
>>>>>> Selecting: t9: ch = store<ST256[bitcast ([65 x i32]* @c to <64 x i32>*)](align=16)(tbaa=<0x3817578>)> t8, t7, t11, undef:i64
>>>>>>
>>>>>> ISEL: Starting pattern match on root node: t9: ch = store<ST256[bitcast ([65 x i32]* @c to <64 x i32>*)](align=16)(tbaa=<0x3817578>)> t8, t7, t11, undef:i64
>>>>>>
>>>>>>   Skipped scope entry (due to false predicate) at index 14, continuing at 81
>>>>>>   Skipped scope entry (due to false predicate) at index 82, continuing at 149
>>>>>>   Skipped scope entry (due to false predicate) at index 150, continuing at 217
>>>>>>   Skipped scope entry (due to false predicate) at index 218, continuing at 267
>>>>>>   Skipped scope entry (due to false predicate) at index 268, continuing at 317
>>>>>>   Skipped scope entry (due to false predicate) at index 318, continuing at 367
>>>>>>   Skipped scope entry (due to false predicate) at index 368, continuing at 394
>>>>>>   Skipped scope entry (due to false predicate) at index 395, continuing at 421
>>>>>>   Skipped scope entry (due to false predicate) at index 422, continuing at 471
>>>>>>   Skipped scope entry (due to false predicate) at index 472, continuing at 521
>>>>>>   Skipped scope entry (due to false predicate) at index 522, continuing at 571
>>>>>>   Skipped scope entry (due to false predicate) at index 572, continuing at 639
>>>>>>   Skipped scope entry (due to false predicate) at index 640, continuing at 707
>>>>>>   Skipped scope entry (due to false predicate) at index 708, continuing at 775
>>>>>>   Skipped scope entry (due to false predicate) at index 776, continuing at 804
>>>>>>   Skipped scope entry (due to false predicate) at index 805, continuing at 833
>>>>>>   Skipped scope entry (due to false predicate) at index 834, continuing at 862
>>>>>>   Skipped scope entry (due to false predicate) at index 863, continuing at 891
>>>>>>   Skipped scope entry (due to false predicate) at index 892, continuing at 920
>>>>>>   Skipped scope entry (due to false predicate) at index 921, continuing at 949
>>>>>>   Skipped scope entry (due to false predicate) at index 950, continuing at 987
>>>>>>   Skipped scope entry (due to false predicate) at index 988, continuing at 1025
>>>>>>   Match failed at index 12
>>>>>>   Continuing at 1026
>>>>>>   OpcodeSwitch from 1029 to 5725
>>>>>>   Match failed at index 5743
>>>>>>   Continuing at 5772
>>>>>>   Match failed at index 5776
>>>>>>   Continuing at 5805
>>>>>>   Match failed at index 5809
>>>>>>   Continuing at 5838
>>>>>>   Match failed at index 5842
>>>>>>   Continuing at 5911
>>>>>>   Match failed at index 5915
>>>>>>   Continuing at 5953
>>>>>>   Match failed at index 5957
>>>>>>   Continuing at 5995
>>>>>>   Match failed at index 5999
>>>>>>   Continuing at 6037
>>>>>>   Match failed at index 6041
>>>>>>   Continuing at 6084
>>>>>>   Match failed at index 6088
>>>>>>   Continuing at 6131
>>>>>>   Skipped scope entry (due to false predicate) at index 6138, continuing at 6181
>>>>>>   Skipped scope entry (due to false predicate) at index 6182, continuing at 6228
>>>>>>   Skipped scope entry (due to false predicate) at index 6235, continuing at 6384
>>>>>>   Match failed at index 6388
>>>>>>   Continuing at 6419
>>>>>>   Match failed at index 6423
>>>>>>   Continuing at 6454
>>>>>>   Match failed at index 6458
>>>>>>   Continuing at 6489
>>>>>>   Continuing at 6490
>>>>>>   Continuing at 6491
>>>>>>   Continuing at 6492
>>>>>>   Match failed at index 6514
>>>>>>   Continuing at 6545
>>>>>>   Match failed at index 6562
>>>>>>   Continuing at 6593
>>>>>>   Match failed at index 6610
>>>>>>   Continuing at 6641
>>>>>>   Continuing at 6642
>>>>>>   Match failed at index 6658
>>>>>>   Continuing at 6772
>>>>>>   Match failed at index 6788
>>>>>>   Continuing at 6902
>>>>>>   Continuing at 13636
>>>>>>   Match failed at index 13640
>>>>>>   Continuing at 14940
>>>>>>   Match failed at index 14943
>>>>>>   Continuing at 15415
>>>>>>   Match failed at index 15417
>>>>>>   Continuing at 15570
>>>>>>   Match failed at index 15571
>>>>>>   Continuing at 15598
>>>>>>   Match failed at index 15599
>>>>>>   Continuing at 15716
>>>>>>   Match failed at index 15719
>>>>>>   Continuing at 15837
>>>>>>   Match failed at index 15840
>>>>>>   Continuing at 16198
>>>>>>   Skipped scope entry (due to false predicate) at index 16203, continuing at 16285
>>>>>>   Skipped scope entry (due to false predicate) at index 16286, continuing at 16394
>>>>>>   Skipped scope entry (due to false predicate) at index 16395, continuing at 16464
>>>>>>   Skipped scope entry (due to false predicate) at index 16465, continuing at 16487
>>>>>>   Skipped scope entry (due to false predicate) at index 16488, continuing at 16510
>>>>>>   Skipped scope entry (due to false predicate) at index 16511, continuing at 16533
>>>>>>   Skipped scope entry (due to false predicate) at index 16534, continuing at 16556
>>>>>>   Skipped scope entry (due to false predicate) at index 16557, continuing at 16680
>>>>>>   Skipped scope entry (due to false predicate) at index 16681, continuing at 16804
>>>>>>   Skipped scope entry (due to false predicate) at index 16805, continuing at 16890
>>>>>>   Skipped scope entry (due to false predicate) at index 16891, continuing at 16976
>>>>>>   Skipped scope entry (due to false predicate) at index 16978, continuing at 17169
>>>>>>   Skipped scope entry (due to false predicate) at index 17171, continuing at 17342
>>>>>>   Skipped scope entry (due to false predicate) at index 17344, continuing at 17497
>>>>>>   Skipped scope entry (due to false predicate) at index 17499, continuing at 17632
>>>>>>   Skipped scope entry (due to false predicate) at index 17634, continuing at 17801
>>>>>>   Skipped scope entry (due to false predicate) at index 17803, continuing at 17944
>>>>>>   Skipped scope entry (due to false predicate) at index 17946, continuing at 18074
>>>>>>   Skipped scope entry (due to false predicate) at index 18075, continuing at 18178
>>>>>>   Skipped scope entry (due to false predicate) at index 18179, continuing at 18253
>>>>>>   Skipped scope entry (due to false predicate) at index 18254, continuing at 18278
>>>>>>   Skipped scope entry (due to false predicate) at index 18279, continuing at 18303
>>>>>>   Skipped scope entry (due to false predicate) at index 18304, continuing at 18328
>>>>>>   Skipped scope entry (due to false predicate) at index 18329, continuing at 18376
>>>>>>   Skipped scope entry (due to false predicate) at index 18377, continuing at 18424
>>>>>>   Skipped scope entry (due to false predicate) at index 18425, continuing at 18520
>>>>>>   Skipped scope entry (due to false predicate) at index 18521, continuing at 18636
>>>>>>   Skipped scope entry (due to false predicate) at index 18637, continuing at 18661
>>>>>>   Skipped scope entry (due to false predicate) at index 18662, continuing at 18711
>>>>>>   Skipped scope entry (due to false predicate) at index 18712, continuing at 18736
>>>>>>   Skipped scope entry (due to false predicate) at index 18737, continuing at 18770
>>>>>>   Skipped scope entry (due to false predicate) at index 18771, continuing at 18856
>>>>>>   Skipped scope entry (due to false predicate) at index 18857, continuing at 18942
>>>>>>   Skipped scope entry (due to false predicate) at index 18943, continuing at 19028
>>>>>>   Match failed at index 16201
>>>>>>   Continuing at 19029
>>>>>> LLVM ERROR: Cannot select: t9: ch = store<ST256[bitcast ([65 x i32]* @c to <64 x i32>*)](align=16)(tbaa=<0x3817578>)> t8, t7, t11, undef:i64
>>>>>>   t7: v64i32 = add t6, t4
>>>>>>     t6: v64i32,ch = load<LD256[bitcast ([65 x i32]* @c to <64 x i32>*)](align=16)(tbaa=<0x3817578>)(dereferenceable)> t0, t11, undef:i64
>>>>>>       t11: i64 = X86ISD::Wrapper TargetGlobalAddress:i64<[65 x i32]* @c> 0
>>>>>>         t10: i64 = TargetGlobalAddress<[65 x i32]* @c> 0
>>>>>>       t3: i64 = undef
>>>>>>     t4: v64i32,ch = load<LD256[bitcast ([65 x i32]* @b to <64 x i32>*)](align=16)(tbaa=<0x3817578>)(dereferenceable)> t0, t13, undef:i64
>>>>>>       t13: i64 = X86ISD::Wrapper TargetGlobalAddress:i64<[65 x i32]* @b> 0
>>>>>>         t12: i64 = TargetGlobalAddress<[65 x i32]* @b> 0
>>>>>>       t3: i64 = undef
>>>>>>   t11: i64 = X86ISD::Wrapper TargetGlobalAddress:i64<[65 x i32]* @c> 0
>>>>>>     t10: i64 = TargetGlobalAddress<[65 x i32]* @c> 0
>>>>>>   t3: i64 = undef
>>>>>> In function: foo
>>>>>>
>>>>>>
>>>>>>
>>>>>> What could be the reason for this? Please correct me; I am stuck at
>>>>>> this point.
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Fri, Jul 7, 2017 at 10:59 PM, Friedman, Eli <
>>>>>> efriedma at codeaurora.org> wrote:
>>>>>>
>>>>>>> The word "fold" is used all over LLVM. It generally refers to
>>>>>>> transformations which delete an instruction.
>>>>>>>
>>>>>>> If you're asking about
>>>>>>> http://llvm.org/docs/CodeGenerator.html#instruction-folding , it just
>>>>>>> means an instruction which was produced by the "instruction folding"
>>>>>>> transform; there isn't anything special about the instruction itself.
>>>>>>>
>>>>>>> -Eli
>>>>>>>
>>>>>>>
>>>>>>> On 7/6/2017 10:51 PM, hameeza ahmed wrote:
>>>>>>>
>>>>>>> What is meant by folded instructions in LLVM?
>>>>>>> How do they work?
>>>>>>>
>>>>>>> On Fri, Jul 7, 2017 at 10:19 AM, hameeza ahmed <hahmed2305 at gmail.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Thank You.
>>>>>>>>