Displaying 2 results from an estimated 2 matches for "vpgatherqd".
2017 Jul 01
2
KNL Assembly Code for Matrix Multiplication
...>> The gather instruction requires a mask of which elements to read. When
>>>> the gather completes, if there are no faults it will have written the mask
>>>> register to 0. So it needs to be reloaded for each gather.
>>>>
>>>>
>>>>> * vpgatherqd ymm0 {k2}, zmmword ptr [zmm14] ; since zmm14 contains 8
>>>>> indexes (or values at these 8 indexes???), it will load 8 elements, not
>>>>> 16. Here it should be zmm14=[3200,3600,40000,.......54000], but by
>>>>> the above computation these indexes a...
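The mask behaviour described in the quoted reply can be illustrated with a scalar model. This is only a sketch, not code from the thread: the names model_vpgatherqd and model_all_ones_mask are hypothetical, and it assumes the form with no base register, where each of the 8 qword lanes of the index register is used directly as an address and 8 dwords are loaded into the ymm destination. Fault handling is not modelled.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar model of
 *     vpgatherqd ymm_dst {k}, [zmm_ptrs]
 * 8 qword lanes hold addresses, 8 dword results are produced.
 * Lanes whose mask bit is clear keep their old dst value. */
void model_vpgatherqd(int32_t dst[8], uint8_t *mask,
                      const int32_t *const ptrs[8])
{
    assert(dst != 0 && mask != 0 && ptrs != 0);
    for (int i = 0; i < 8; ++i) {
        if (*mask & (1u << i)) {
            dst[i] = *ptrs[i];              /* load one dword */
            *mask &= (uint8_t)~(1u << i);   /* bit cleared once the lane is done */
        }
    }
    /* With no faults, every bit that was set has been cleared: mask == 0. */
}

/* Hypothetical model of an all-ones mask such as kxnorw k, k, k produces;
 * only the low 8 bits matter for an 8-lane gather. */
uint8_t model_all_ones_mask(void)
{
    return 0xFF;
}
```

In this model, calling model_vpgatherqd a second time without setting the mask again loads nothing, which is why the mask register has to be reloaded before each gather.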
2017 Jan 24
7
[X86][AVX512] RFC: make i1 illegal in the Codegen
...x i32*> %p) {
%r = call <8 x i32> @llvm.masked.gather.v8i32(<8 x i32*> %p, i32 4, <8 x i1> <i1 true, i1 true, i1 true, i1 true, i1 true, i1 true, i1 true, i1 true>, <8 x i32> undef)
ret <8 x i32> %r
}
Can be lowered to
# BB#0:
kxnorw %k0, %k0, %k1
vpgatherqd (,%zmm1), %ymm0 {%k1}
retq
Legal vectors of i1's require support for BUILD_VECTOR(i1, i1, .., i1), i1 EXTRACT_VEC_ELEMENT (...) and INSERT_VEC_ELEMENT(i1, ...), so making i1 legal seemed like a sensible decision, and this is the current state at the top of trunk.
However, making i1...