thr3ads.net - llvm dev - [llvm-dev] Error in v64i32 type in x86 backend [Jul 2017]

hameeza ahmed via llvm-dev

2017-Jul-07 03:21 UTC

[llvm-dev] Error in v64i32 type in x86 backend

I also ran the following command:
llc -debug filer-knl_o3.ll

Its output is attached here. Looking at the output, can we say that
legalization runs fine and that the error comes from instruction selection /
pattern matching, which is not yet implemented?

So do I need to worry about it and try to correct it at this stage, or should
I move forward and implement instruction selection / pattern matching?

Please guide me.

Thank you.

On Fri, Jul 7, 2017 at 8:00 AM, hameeza ahmed <hahmed2305 at gmail.com>
wrote:
> Thank you. I have seen these links, but they don't cover the problem
> that I mentioned. I am doing everything step by step.
>
> I haven't yet worked on the instruction selection phase or its files.
> Before that, I am trying to get legalization working for vectors with more
> than 16 elements, i.e. 64 x i32. So far I have mainly worked with two files:
> registerinfo.td, to define the register class used in legalization, and,
> most importantly, X86ISelLowering.cpp.
>
> Is there any relation between this and instruction selection? Since
> instruction selection comes after combine and legalize, I haven't worked
> on it yet.
>
> Please correct me, I am stuck here.
>
> Thank you again.
>
> On Fri, Jul 7, 2017 at 7:11 AM, Friedman, Eli <efriedma at codeaurora.org>
> wrote:
>
>> Have you read http://llvm.org/docs/WritingAnLLVMBackend.html and
>> http://llvm.org/docs/CodeGenerator.html ?
>> http://llvm.org/docs/WritingAnLLVMBackend.html#instruction-selector
>> describes how to define a store instruction.
>>
>> -Eli
>>
>>
>> On 7/6/2017 6:51 PM, hameeza ahmed via llvm-dev wrote:
>>
>> Please correct me, I am stuck at this point.
>>
>> On Jul 6, 2017 5:18 PM, "hameeza ahmed" <hahmed2305 at gmail.com> wrote:
>>
>> Hello,
>> I am experimenting with increasing the register/vector width in the x86
>> backend to 64 elements of 32 bits instead of 16.
>> For example, I have a loop with 65 iterations. If my IR generates a
>> v64i32 and 1 scalar, the backend still breaks the v64i32 into 4 v16i32.
>> I want it to retain v64i32. Likewise, if there are 128 elements in the
>> loop, it should break it into 2 v64i32 instructions.
>>
>> To do this I made the necessary changes in X86ISelLowering.cpp and
>> rebuilt LLVM. When I then use the option -view-dag-combine2-dags, I get
>> the required output in the graph, but the following error on the console:
>>
>> LLVM ERROR: Cannot select: t10: ch = store<ST256[bitcast ([65 x i32]* @a to <64 x i32>*)](align=16)(tbaa=<0x30c5438>)> t9, t7, t12, undef:i64
>>   t7: v64i32 = add t6, t4
>>     t6: v64i32,ch = load<LD256[bitcast ([65 x i32]* @c to <64 x i32>*)](align=16)(tbaa=<0x30c5438>)(dereferenceable)> t0, t14, undef:i64
>>       t14: i64 = X86ISD::Wrapper TargetGlobalAddress:i64<[65 x i32]* @c> 0
>>         t13: i64 = TargetGlobalAddress<[65 x i32]* @c> 0
>>       t3: i64 = undef
>>     t4: v64i32,ch = load<LD256[bitcast ([65 x i32]* @b to <64 x i32>*)](align=16)(tbaa=<0x30c5438>)(dereferenceable)> t0, t16, undef:i64
>>       t16: i64 = X86ISD::Wrapper TargetGlobalAddress:i64<[65 x i32]* @b> 0
>>         t15: i64 = TargetGlobalAddress<[65 x i32]* @b> 0
>>       t3: i64 = undef
>>   t12: i64 = X86ISD::Wrapper TargetGlobalAddress:i64<[65 x i32]* @a> 0
>>     t11: i64 = TargetGlobalAddress<[65 x i32]* @a> 0
>>   t3: i64 = undef
>> In function: foo
>>
>> The DAG after legalization is also attached here.
>>
>> The source is a vector sum of 65 elements.
>>
>> Kindly correct me.
>>
>>
>> --
>> Employee of Qualcomm Innovation Center, Inc.
>> Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project
>>
>>
-------------- next part --------------
hameeza at ubuntu:$ llc -debug filer-knl_o3.ll
Args:llc -debug filer-knl_o3.ll 

Features:+64bit,+sse2,+adx,+aes,+avx,+avx2,+avx512cd,+avx512er,+avx512f,+avx512pf,+bmi,+bmi2,+cx16,+f16c,+fma,+fsgsbase,+fxsr,+lzcnt,+mmx,+movbe,+pclmul,+popcnt,+prefetchwt1,+rdrnd,+rdseed,+rtm,+sse,+sse2,+sse3,+sse4.1,+sse4.2,+ssse3,+x87,+xsave,+xsaveopt
CPU:knl

Subtarget features: SSELevel 9, 3DNowLevel 1, 64bit 1
********** Begin Constant Hoisting **********
********** Function: foo
********** End Constant Hoisting **********
*** Interleaved Access Pass: foo
CGP: Found      local addrmode: [GV:@b]
CGP: Found      local addrmode: [GV:@c]
CGP: Found      local addrmode: [GV:@a]
CGP: Found      local addrmode: [GV:@b + 256]
CGP: Found      local addrmode: [GV:@c + 256]
CGP: Found      local addrmode: [GV:@a + 256]
[SafeStack] Function: foo
[SafeStack]     safestack is not requested for this function
---- Branch Probability Info : foo ----

Computing probabilities for scalar.ph
Computing probabilities for vector.body
Computing probabilities for min.iters.checked



=== foo
Initial selection DAG: BB#0 'foo:min.iters.checked'
SelectionDAG has 1 nodes:
  t0: ch = EntryToken



Combining: t0: ch = EntryToken
Optimized lowered selection DAG: BB#0 'foo:min.iters.checked'
SelectionDAG has 1 nodes:
  t0: ch = EntryToken


Legally typed node: t0: ch = EntryToken

Legally typed node: t65535: ch = handlenode t0

Type-legalized selection DAG: BB#0 'foo:min.iters.checked'
SelectionDAG has 1 nodes:
  t0: ch = EntryToken



Legalizing: t0: ch = EntryToken
Legalized selection DAG: BB#0 'foo:min.iters.checked'
SelectionDAG has 1 nodes:
  t0: ch = EntryToken



Legalizing: t0: ch = EntryToken

Combining: t0: ch = EntryToken
Optimized legalized selection DAG: BB#0 'foo:min.iters.checked'
SelectionDAG has 1 nodes:
  t0: ch = EntryToken


===== Instruction selection begins: BB#0 'min.iters.checked'
Selecting: t0: ch = EntryToken

===== Instruction selection ends:
Selected selection DAG: BB#0 'foo:min.iters.checked'
SelectionDAG has 1 nodes:
  t0: ch = EntryToken


********** List Scheduling BB#0 'min.iters.checked' **********
*** Final schedule ***

Total amount of phi nodes to update: 0
Initial selection DAG: BB#1 'foo:vector.body'
SelectionDAG has 11 nodes:
  t0: ch = EntryToken
  t2: i64 = Constant<0>
  t4: v64i32,ch = load<LD256[bitcast ([65 x i32]* @b to <64 x
i32>*)](align=16)(tbaa=<0x4452448>)(dereferenceable)> t0,
GlobalAddress:i64<[65 x i32]* @b> 0, undef:i64
  t6: v64i32,ch = load<LD256[bitcast ([65 x i32]* @c to <64 x
i32>*)](align=16)(tbaa=<0x4452448>)(dereferenceable)> t0,
GlobalAddress:i64<[65 x i32]* @c> 0, undef:i64
    t9: ch = TokenFactor t4:1, t6:1
    t7: v64i32 = add t6, t4
  t10: ch = store<ST256[bitcast ([65 x i32]* @a to <64 x
i32>*)](align=16)(tbaa=<0x4452448>)> t9, t7,
GlobalAddress:i64<[65 x i32]* @a> 0, undef:i64



Combining: t10: ch = store<ST256[bitcast ([65 x i32]* @a to <64 x
i32>*)](align=16)(tbaa=<0x4452448>)> t9, t7,
GlobalAddress:i64<[65 x i32]* @a> 0, undef:i64

Combining: t9: ch = TokenFactor t4:1, t6:1

Combining: t8: i64 = GlobalAddress<[65 x i32]* @a> 0

Combining: t7: v64i32 = add t6, t4

Combining: t6: v64i32,ch = load<LD256[bitcast ([65 x i32]* @c to <64 x
i32>*)](align=16)(tbaa=<0x4452448>)(dereferenceable)> t0,
GlobalAddress:i64<[65 x i32]* @c> 0, undef:i64

Combining: t5: i64 = GlobalAddress<[65 x i32]* @c> 0

Combining: t4: v64i32,ch = load<LD256[bitcast ([65 x i32]* @b to <64 x
i32>*)](align=16)(tbaa=<0x4452448>)(dereferenceable)> t0,
GlobalAddress:i64<[65 x i32]* @b> 0, undef:i64

Combining: t3: i64 = undef

Combining: t1: i64 = GlobalAddress<[65 x i32]* @b> 0

Combining: t0: ch = EntryToken
Optimized lowered selection DAG: BB#1 'foo:vector.body'
SelectionDAG has 10 nodes:
  t0: ch = EntryToken
  t4: v64i32,ch = load<LD256[bitcast ([65 x i32]* @b to <64 x
i32>*)](align=16)(tbaa=<0x4452448>)(dereferenceable)> t0,
GlobalAddress:i64<[65 x i32]* @b> 0, undef:i64
  t6: v64i32,ch = load<LD256[bitcast ([65 x i32]* @c to <64 x
i32>*)](align=16)(tbaa=<0x4452448>)(dereferenceable)> t0,
GlobalAddress:i64<[65 x i32]* @c> 0, undef:i64
    t9: ch = TokenFactor t4:1, t6:1
    t7: v64i32 = add t6, t4
  t10: ch = store<ST256[bitcast ([65 x i32]* @a to <64 x
i32>*)](align=16)(tbaa=<0x4452448>)> t9, t7,
GlobalAddress:i64<[65 x i32]* @a> 0, undef:i64


Legally typed node: t8: i64 = GlobalAddress<[65 x i32]* @a> 0

Legally typed node: t5: i64 = GlobalAddress<[65 x i32]* @c> 0

Legally typed node: t3: i64 = undef

Legally typed node: t1: i64 = GlobalAddress<[65 x i32]* @b> 0

Legally typed node: t0: ch = EntryToken

Legally typed node: t4: v64i32,ch = load<LD256[bitcast ([65 x i32]* @b to
<64 x i32>*)](align=16)(tbaa=<0x4452448>)(dereferenceable)> t0,
GlobalAddress:i64<[65 x i32]* @b> 0, undef:i64

Legally typed node: t6: v64i32,ch = load<LD256[bitcast ([65 x i32]* @c to
<64 x i32>*)](align=16)(tbaa=<0x4452448>)(dereferenceable)> t0,
GlobalAddress:i64<[65 x i32]* @c> 0, undef:i64

Legally typed node: t7: v64i32 = add t6, t4

Legally typed node: t9: ch = TokenFactor t4:1, t6:1

Legally typed node: t10: ch = store<ST256[bitcast ([65 x i32]* @a to <64 x
i32>*)](align=16)(tbaa=<0x4452448>)> t9, t7,
GlobalAddress:i64<[65 x i32]* @a> 0, undef:i64

Legally typed node: t65535: ch = handlenode t10

Type-legalized selection DAG: BB#1 'foo:vector.body'
SelectionDAG has 10 nodes:
  t0: ch = EntryToken
  t4: v64i32,ch = load<LD256[bitcast ([65 x i32]* @b to <64 x
i32>*)](align=16)(tbaa=<0x4452448>)(dereferenceable)> t0,
GlobalAddress:i64<[65 x i32]* @b> 0, undef:i64
  t6: v64i32,ch = load<LD256[bitcast ([65 x i32]* @c to <64 x
i32>*)](align=16)(tbaa=<0x4452448>)(dereferenceable)> t0,
GlobalAddress:i64<[65 x i32]* @c> 0, undef:i64
    t9: ch = TokenFactor t4:1, t6:1
    t7: v64i32 = add t6, t4
  t10: ch = store<ST256[bitcast ([65 x i32]* @a to <64 x
i32>*)](align=16)(tbaa=<0x4452448>)> t9, t7,
GlobalAddress:i64<[65 x i32]* @a> 0, undef:i64



Legalizing: t10: ch = store<ST256[bitcast ([65 x i32]* @a to <64 x
i32>*)](align=16)(tbaa=<0x4452448>)> t9, t7,
GlobalAddress:i64<[65 x i32]* @a> 0, undef:i64

Legalizing: t7: v64i32 = add t6, t4

Legalizing: t9: ch = TokenFactor t4:1, t6:1

Legalizing: t6: v64i32,ch = load<LD256[bitcast ([65 x i32]* @c to <64 x
i32>*)](align=16)(tbaa=<0x4452448>)(dereferenceable)> t0,
GlobalAddress:i64<[65 x i32]* @c> 0, undef:i64

Legalizing: t4: v64i32,ch = load<LD256[bitcast ([65 x i32]* @b to <64 x
i32>*)](align=16)(tbaa=<0x4452448>)(dereferenceable)> t0,
GlobalAddress:i64<[65 x i32]* @b> 0, undef:i64

Legalizing: t8: i64 = GlobalAddress<[65 x i32]* @a> 0
 ... replacing: t8: i64 = GlobalAddress<[65 x i32]* @a> 0
     with:      t12: i64 = X86ISD::Wrapper TargetGlobalAddress:i64<[65 x
i32]* @a> 0

Legalizing: t5: i64 = GlobalAddress<[65 x i32]* @c> 0
 ... replacing: t5: i64 = GlobalAddress<[65 x i32]* @c> 0
     with:      t14: i64 = X86ISD::Wrapper TargetGlobalAddress:i64<[65 x
i32]* @c> 0

Legalizing: t3: i64 = undef

Legalizing: t1: i64 = GlobalAddress<[65 x i32]* @b> 0
 ... replacing: t1: i64 = GlobalAddress<[65 x i32]* @b> 0
     with:      t16: i64 = X86ISD::Wrapper TargetGlobalAddress:i64<[65 x
i32]* @b> 0

Legalizing: t0: ch = EntryToken

Legalizing: t16: i64 = X86ISD::Wrapper TargetGlobalAddress:i64<[65 x i32]*
@b> 0

Legalizing: t15: i64 = TargetGlobalAddress<[65 x i32]* @b> 0

Legalizing: t14: i64 = X86ISD::Wrapper TargetGlobalAddress:i64<[65 x i32]*
@c> 0

Legalizing: t13: i64 = TargetGlobalAddress<[65 x i32]* @c> 0

Legalizing: t12: i64 = X86ISD::Wrapper TargetGlobalAddress:i64<[65 x i32]*
@a> 0

Legalizing: t11: i64 = TargetGlobalAddress<[65 x i32]* @a> 0
Legalized selection DAG: BB#1 'foo:vector.body'
SelectionDAG has 13 nodes:
  t0: ch = EntryToken
    t16: i64 = X86ISD::Wrapper TargetGlobalAddress:i64<[65 x i32]* @b> 0
  t4: v64i32,ch = load<LD256[bitcast ([65 x i32]* @b to <64 x
i32>*)](align=16)(tbaa=<0x4452448>)(dereferenceable)> t0, t16,
undef:i64
    t14: i64 = X86ISD::Wrapper TargetGlobalAddress:i64<[65 x i32]* @c> 0
  t6: v64i32,ch = load<LD256[bitcast ([65 x i32]* @c to <64 x
i32>*)](align=16)(tbaa=<0x4452448>)(dereferenceable)> t0, t14,
undef:i64
    t9: ch = TokenFactor t4:1, t6:1
    t7: v64i32 = add t6, t4
    t12: i64 = X86ISD::Wrapper TargetGlobalAddress:i64<[65 x i32]* @a> 0
  t10: ch = store<ST256[bitcast ([65 x i32]* @a to <64 x
i32>*)](align=16)(tbaa=<0x4452448>)> t9, t7, t12, undef:i64



Legalizing: t16: i64 = X86ISD::Wrapper TargetGlobalAddress:i64<[65 x i32]*
@b> 0

Combining: t16: i64 = X86ISD::Wrapper TargetGlobalAddress:i64<[65 x i32]*
@b> 0

Legalizing: t15: i64 = TargetGlobalAddress<[65 x i32]* @b> 0

Combining: t15: i64 = TargetGlobalAddress<[65 x i32]* @b> 0

Legalizing: t14: i64 = X86ISD::Wrapper TargetGlobalAddress:i64<[65 x i32]*
@c> 0

Combining: t14: i64 = X86ISD::Wrapper TargetGlobalAddress:i64<[65 x i32]*
@c> 0

Legalizing: t13: i64 = TargetGlobalAddress<[65 x i32]* @c> 0

Combining: t13: i64 = TargetGlobalAddress<[65 x i32]* @c> 0

Legalizing: t12: i64 = X86ISD::Wrapper TargetGlobalAddress:i64<[65 x i32]*
@a> 0

Combining: t12: i64 = X86ISD::Wrapper TargetGlobalAddress:i64<[65 x i32]*
@a> 0

Legalizing: t11: i64 = TargetGlobalAddress<[65 x i32]* @a> 0

Combining: t11: i64 = TargetGlobalAddress<[65 x i32]* @a> 0

Legalizing: t10: ch = store<ST256[bitcast ([65 x i32]* @a to <64 x
i32>*)](align=16)(tbaa=<0x4452448>)> t9, t7, t12, undef:i64

Combining: t10: ch = store<ST256[bitcast ([65 x i32]* @a to <64 x
i32>*)](align=16)(tbaa=<0x4452448>)> t9, t7, t12, undef:i64

Legalizing: t7: v64i32 = add t6, t4

Combining: t7: v64i32 = add t6, t4

Legalizing: t9: ch = TokenFactor t4:1, t6:1

Combining: t9: ch = TokenFactor t4:1, t6:1

Legalizing: t6: v64i32,ch = load<LD256[bitcast ([65 x i32]* @c to <64 x
i32>*)](align=16)(tbaa=<0x4452448>)(dereferenceable)> t0, t14,
undef:i64

Combining: t6: v64i32,ch = load<LD256[bitcast ([65 x i32]* @c to <64 x
i32>*)](align=16)(tbaa=<0x4452448>)(dereferenceable)> t0, t14,
undef:i64

Legalizing: t4: v64i32,ch = load<LD256[bitcast ([65 x i32]* @b to <64 x
i32>*)](align=16)(tbaa=<0x4452448>)(dereferenceable)> t0, t16,
undef:i64

Combining: t4: v64i32,ch = load<LD256[bitcast ([65 x i32]* @b to <64 x
i32>*)](align=16)(tbaa=<0x4452448>)(dereferenceable)> t0, t16,
undef:i64

Legalizing: t3: i64 = undef

Combining: t3: i64 = undef

Legalizing: t0: ch = EntryToken

Combining: t0: ch = EntryToken
Optimized legalized selection DAG: BB#1 'foo:vector.body'
SelectionDAG has 13 nodes:
  t0: ch = EntryToken
    t16: i64 = X86ISD::Wrapper TargetGlobalAddress:i64<[65 x i32]* @b> 0
  t4: v64i32,ch = load<LD256[bitcast ([65 x i32]* @b to <64 x i32>*)](align=16)(tbaa=<0x4452448>)(dereferenceable)> t0, t16, undef:i64
    t14: i64 = X86ISD::Wrapper TargetGlobalAddress:i64<[65 x i32]* @c> 0
  t6: v64i32,ch = load<LD256[bitcast ([65 x i32]* @c to <64 x i32>*)](align=16)(tbaa=<0x4452448>)(dereferenceable)> t0, t14, undef:i64
    t9: ch = TokenFactor t4:1, t6:1
    t7: v64i32 = add t6, t4
    t12: i64 = X86ISD::Wrapper TargetGlobalAddress:i64<[65 x i32]* @a> 0
  t10: ch = store<ST256[bitcast ([65 x i32]* @a to <64 x i32>*)](align=16)(tbaa=<0x4452448>)> t9, t7, t12, undef:i64


===== Instruction selection begins: BB#1 'vector.body'
Selecting: t10: ch = store<ST256[bitcast ([65 x i32]* @a to <64 x i32>*)](align=16)(tbaa=<0x4452448>)> t9, t7, t12, undef:i64

ISEL: Starting pattern match on root node: t10: ch = store<ST256[bitcast ([65 x i32]* @a to <64 x i32>*)](align=16)(tbaa=<0x4452448>)> t9, t7, t12, undef:i64

  Skipped scope entry (due to false predicate) at index 14, continuing at 81
  Skipped scope entry (due to false predicate) at index 82, continuing at 149
  Skipped scope entry (due to false predicate) at index 150, continuing at 217
  Skipped scope entry (due to false predicate) at index 218, continuing at 267
  Skipped scope entry (due to false predicate) at index 268, continuing at 317
  Skipped scope entry (due to false predicate) at index 318, continuing at 367
  Skipped scope entry (due to false predicate) at index 368, continuing at 394
  Skipped scope entry (due to false predicate) at index 395, continuing at 421
  Skipped scope entry (due to false predicate) at index 422, continuing at 471
  Skipped scope entry (due to false predicate) at index 472, continuing at 521
  Skipped scope entry (due to false predicate) at index 522, continuing at 571
  Skipped scope entry (due to false predicate) at index 572, continuing at 639
  Skipped scope entry (due to false predicate) at index 640, continuing at 707
  Skipped scope entry (due to false predicate) at index 708, continuing at 775
  Skipped scope entry (due to false predicate) at index 776, continuing at 804
  Skipped scope entry (due to false predicate) at index 805, continuing at 833
  Skipped scope entry (due to false predicate) at index 834, continuing at 862
  Skipped scope entry (due to false predicate) at index 863, continuing at 891
  Skipped scope entry (due to false predicate) at index 892, continuing at 920
  Skipped scope entry (due to false predicate) at index 921, continuing at 949
  Skipped scope entry (due to false predicate) at index 950, continuing at 987
  Skipped scope entry (due to false predicate) at index 988, continuing at 1025
  Match failed at index 12
  Continuing at 1026
  OpcodeSwitch from 1029 to 5725
  Match failed at index 5743
  Continuing at 5772
  Match failed at index 5776
  Continuing at 5805
  Match failed at index 5809
  Continuing at 5838
  Match failed at index 5842
  Continuing at 5911
  Match failed at index 5915
  Continuing at 5953
  Match failed at index 5957
  Continuing at 5995
  Match failed at index 5999
  Continuing at 6037
  Match failed at index 6041
  Continuing at 6084
  Match failed at index 6088
  Continuing at 6131
  Skipped scope entry (due to false predicate) at index 6138, continuing at 6181
  Skipped scope entry (due to false predicate) at index 6182, continuing at 6228
  Skipped scope entry (due to false predicate) at index 6235, continuing at 6384
  Match failed at index 6388
  Continuing at 6419
  Match failed at index 6423
  Continuing at 6454
  Match failed at index 6458
  Continuing at 6489
  Continuing at 6490
  Continuing at 6491
  Continuing at 6492
  Match failed at index 6514
  Continuing at 6545
  Match failed at index 6562
  Continuing at 6593
  Match failed at index 6610
  Continuing at 6641
  Continuing at 6642
  Match failed at index 6658
  Continuing at 6772
  Match failed at index 6788
  Continuing at 6902
  Continuing at 13636
  Match failed at index 13640
  Continuing at 14940
  Match failed at index 14943
  Continuing at 15415
  Match failed at index 15417
  Continuing at 15570
  Match failed at index 15571
  Continuing at 15598
  Match failed at index 15599
  Continuing at 15716
  Match failed at index 15719
  Continuing at 15837
  Match failed at index 15840
  Continuing at 16174
  Skipped scope entry (due to false predicate) at index 16179, continuing at 16261
  Skipped scope entry (due to false predicate) at index 16262, continuing at 16370
  Skipped scope entry (due to false predicate) at index 16371, continuing at 16440
  Skipped scope entry (due to false predicate) at index 16441, continuing at 16463
  Skipped scope entry (due to false predicate) at index 16464, continuing at 16486
  Skipped scope entry (due to false predicate) at index 16487, continuing at 16509
  Skipped scope entry (due to false predicate) at index 16510, continuing at 16532
  Skipped scope entry (due to false predicate) at index 16533, continuing at 16656
  Skipped scope entry (due to false predicate) at index 16657, continuing at 16780
  Skipped scope entry (due to false predicate) at index 16781, continuing at 16866
  Skipped scope entry (due to false predicate) at index 16867, continuing at 16952
  Skipped scope entry (due to false predicate) at index 16954, continuing at 17145
  Skipped scope entry (due to false predicate) at index 17147, continuing at 17318
  Skipped scope entry (due to false predicate) at index 17320, continuing at 17473
  Skipped scope entry (due to false predicate) at index 17475, continuing at 17608
  Skipped scope entry (due to false predicate) at index 17610, continuing at 17777
  Skipped scope entry (due to false predicate) at index 17779, continuing at 17920
  Skipped scope entry (due to false predicate) at index 17922, continuing at 18050
  Skipped scope entry (due to false predicate) at index 18051, continuing at 18154
  Skipped scope entry (due to false predicate) at index 18155, continuing at 18229
  Skipped scope entry (due to false predicate) at index 18230, continuing at 18254
  Skipped scope entry (due to false predicate) at index 18255, continuing at 18279
  Skipped scope entry (due to false predicate) at index 18280, continuing at 18304
  Skipped scope entry (due to false predicate) at index 18305, continuing at 18352
  Skipped scope entry (due to false predicate) at index 18353, continuing at 18400
  Skipped scope entry (due to false predicate) at index 18401, continuing at 18496
  Skipped scope entry (due to false predicate) at index 18497, continuing at 18612
  Skipped scope entry (due to false predicate) at index 18613, continuing at 18637
  Skipped scope entry (due to false predicate) at index 18638, continuing at 18687
  Skipped scope entry (due to false predicate) at index 18688, continuing at 18712
  Skipped scope entry (due to false predicate) at index 18713, continuing at 18746
  Skipped scope entry (due to false predicate) at index 18747, continuing at 18832
  Skipped scope entry (due to false predicate) at index 18833, continuing at 18918
  Skipped scope entry (due to false predicate) at index 18919, continuing at 19004
  Match failed at index 16177
  Continuing at 19005
LLVM ERROR: Cannot select: t10: ch = store<ST256[bitcast ([65 x i32]* @a to <64 x i32>*)](align=16)(tbaa=<0x4452448>)> t9, t7, t12, undef:i64
  t7: v64i32 = add t6, t4
    t6: v64i32,ch = load<LD256[bitcast ([65 x i32]* @c to <64 x i32>*)](align=16)(tbaa=<0x4452448>)(dereferenceable)> t0, t14, undef:i64
      t14: i64 = X86ISD::Wrapper TargetGlobalAddress:i64<[65 x i32]* @c> 0
        t13: i64 = TargetGlobalAddress<[65 x i32]* @c> 0
      t3: i64 = undef
    t4: v64i32,ch = load<LD256[bitcast ([65 x i32]* @b to <64 x i32>*)](align=16)(tbaa=<0x4452448>)(dereferenceable)> t0, t16, undef:i64
      t16: i64 = X86ISD::Wrapper TargetGlobalAddress:i64<[65 x i32]* @b> 0
        t15: i64 = TargetGlobalAddress<[65 x i32]* @b> 0
      t3: i64 = undef
  t12: i64 = X86ISD::Wrapper TargetGlobalAddress:i64<[65 x i32]* @a> 0
    t11: i64 = TargetGlobalAddress<[65 x i32]* @a> 0
  t3: i64 = undef
In function: foo
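
As a rough cross-check of the numbers in the log above, the sketch below redoes the arithmetic behind them: how a 65- or 128-iteration loop splits into v64i32 iterations plus a scalar remainder, and how many bytes one v64i32 occupies. The helper names are illustrative only and are not part of LLVM.

```python
# Illustrative helpers (hypothetical names, not LLVM APIs) for the sizes seen in the log.


def split_trip_count(trip_count, lanes):
    """Return (full vector iterations, leftover scalar iterations)."""
    return divmod(trip_count, lanes)


def vector_bytes(lanes, element_bits):
    """Size in bytes of a vector with `lanes` elements of `element_bits` bits each."""
    return lanes * element_bits // 8


if __name__ == "__main__":
    print(split_trip_count(65, 64))   # one v64i32 iteration plus one scalar iteration
    print(split_trip_count(128, 64))  # two v64i32 iterations, no remainder
    print(vector_bytes(64, 32))       # bytes touched by one v64i32 load or store
    # Number of v16i32 pieces the unmodified backend would split one v64i32 into.
    print(vector_bytes(64, 32) // vector_bytes(16, 32))
```

One v64i32 comes out at 256 bytes, which agrees with the LD256 and ST256 annotations on the loads and the store. The leftover scalar element at index 64 would sit at byte offset 256, which is consistent with the `[GV:@b + 256]` address modes reported by CGP.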

Craig Topper via llvm-dev

2017-Jul-07 05:03 UTC

[llvm-dev] Error in v64i32 type in x86 backend

Yes, that error is from instruction selection. I think your legalization
changes worked fine.

~Craig
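
One way to check this from the log itself is to look at which phase banners `llc -debug` printed for the failing basic block before the error appeared. The sketch below is a minimal Python helper for that. It assumes the banner strings have the form seen in the log above (they may differ in other LLVM versions), and the function name and phase labels are made up for illustration.

```python
import sys

# Banner prefixes as they appear in the `llc -debug` log above, paired with
# informal labels. The labels are this sketch's own wording, not LLVM terminology.
PHASE_BANNERS = [
    ("Initial selection DAG:", "DAG construction"),
    ("Optimized lowered selection DAG:", "first DAG combine"),
    ("Type-legalized selection DAG:", "type legalization"),
    ("Legalized selection DAG:", "operation legalization"),
    ("Optimized legalized selection DAG:", "post-legalization DAG combine"),
    ("===== Instruction selection begins:", "instruction selection"),
]


def phases_before_error(log_text):
    """Return the phases reached in the failing basic block, in order,
    or None if the log contains no 'Cannot select' error."""
    reached = []
    for line in log_text.splitlines():
        if line.startswith("Initial selection DAG:"):
            reached = []  # each basic block gets its own SelectionDAG
        for prefix, label in PHASE_BANNERS:
            if line.startswith(prefix):
                reached.append(label)
        if line.startswith("LLVM ERROR: Cannot select:"):
            return reached
    return None


if __name__ == "__main__":
    # Feed the saved debug log on standard input.
    print(phases_before_error(sys.stdin.read()))
```

If the last entry is "instruction selection" and both legalization labels appear before it, the legalized DAG was produced and the failure came from the pattern matcher. That is the situation in the log above for BB#1 'vector.body'.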


hameeza ahmed via llvm-dev

2017-Jul-07 05:19 UTC

[llvm-dev] Error in v64i32 type in x86 backend

Thank You.

