2017 Jun 21
2
AVX 512 Assembly Code Generation issues
When I generate code for a loop with 72 iterations, the compiler emits
AVX-512 zmm operations 4 times (16x4=64), and the remaining 8 iterations
are handled by scalar mov operations using the EAX register. Wouldn't it
be better to use ymm for the remaining 8 iterations, as it does when the
iteration count is between 8 and 15? The same would apply to xmm, and so on.
Please correct me if I am wrong.
Thank you
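
To make the proposed split concrete, here is a minimal sketch in plain C. It is not the compiler's output: fixed-width inner loops stand in for zmm/ymm/xmm operations, 32-bit elements are assumed (so 16, 8 and 4 lanes respectively), and the names `split_t`, `add_chunk` and `add_with_narrow_tail` are hypothetical.

```c
#include <stddef.h>

/* Hypothetical bookkeeping: how many operations of each width were used. */
typedef struct {
    size_t zmm_ops;    /* 16-lane chunks (stand-in for zmm) */
    size_t ymm_ops;    /* 8-lane chunks  (stand-in for ymm) */
    size_t xmm_ops;    /* 4-lane chunks  (stand-in for xmm) */
    size_t scalar_ops; /* leftover single elements */
} split_t;

/* Stand-in for one vector add of `width` 32-bit lanes. */
static void add_chunk(int *dst, const int *a, const int *b, size_t width) {
    for (size_t lane = 0; lane < width; ++lane)
        dst[lane] = a[lane] + b[lane];
}

/* Main loop at 16 lanes, then progressively narrower chunks for the tail,
 * falling back to scalar code only for the last 0..3 elements. */
static split_t add_with_narrow_tail(int *dst, const int *a, const int *b,
                                    size_t n) {
    split_t s = {0, 0, 0, 0};
    size_t i = 0;

    for (; i + 16 <= n; i += 16) {
        add_chunk(dst + i, a + i, b + i, 16);
        s.zmm_ops++;
    }
    if (i + 8 <= n) {
        add_chunk(dst + i, a + i, b + i, 8);
        s.ymm_ops++;
        i += 8;
    }
    if (i + 4 <= n) {
        add_chunk(dst + i, a + i, b + i, 4);
        s.xmm_ops++;
        i += 4;
    }
    for (; i < n; ++i) {
        dst[i] = a[i] + b[i];
        s.scalar_ops++;
    }
    return s;
}
```

With n = 72 this sketch performs four 16-lane chunks and one 8-lane chunk, with no scalar iterations left over, which is the behaviour being asked about.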
On Jun 21, 2017 12...
2014 Jan 21
2
[LLVMdev] Gather load in LLVM IR
Hi Evan, all,
The most obvious thing to me would be to extend the load instruction to
have an additional form that takes a vector of pointers instead of a
single pointer.
This form would return a vector of values instead of a single value.
If a gather instruction is not available on the target, then the load
could be lowered to a series of scalar loads and insertelements.
Thanks,
Nick
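
The fallback lowering described above can be sketched in plain C. This is only an illustration of the idea, not LLVM code: the 4-lane width and the types `vec4p` and `vec4f` and the function `gather_scalarized` are assumptions made for the example.

```c
#include <stddef.h>

#define VLEN 4 /* assumed vector width for this sketch */

/* Hypothetical stand-ins for a vector of pointers and a vector of values. */
typedef struct { const float *lane[VLEN]; } vec4p;
typedef struct { float lane[VLEN]; } vec4f;

/* Gather without hardware support: one scalar load per lane, each followed
 * by writing the loaded value into the corresponding lane of the result,
 * which plays the role of an insertelement. */
static vec4f gather_scalarized(vec4p ptrs) {
    vec4f result = {{0.0f, 0.0f, 0.0f, 0.0f}};
    for (size_t i = 0; i < VLEN; ++i) {
        float loaded = *ptrs.lane[i]; /* scalar load */
        result.lane[i] = loaded;      /* insert into lane i */
    }
    return result;
}
```

The pointers need not be contiguous or distinct; every lane is loaded independently, which is what distinguishes a gather from an ordinary vector load.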
On