search for: 16x4

Displaying 2 results from an estimated 2 matches for "16x4".

2017 Jun 21
2
AVX 512 Assembly Code Generation issues
When I generate code for a loop with 72 iterations, the compiler emits AVX-512 zmm operations 4 times (16x4=64) and handles the remaining 8 iterations with ordinary mov operations on the EAX register. Wouldn't it be better to use ymm for the remaining 8 iterations, as it does when the iteration count is between 8 and 15? The same goes for xmm, and so on. Please correct me if I am wrong. Thank you. On Jun 21, 2017 12...
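The arithmetic behind the question above can be sketched in C. This is only an illustration of the split the poster proposes, not what any compiler actually does. The struct `tail_plan` and the function `plan_tail` are hypothetical names, and the lane counts assume 32-bit elements (16 per zmm, 8 per ymm, 4 per xmm), which matches the 16x4=64 figure in the post.

```c
#include <stddef.h>

/* Hypothetical plan for covering a loop's trip count with
 * progressively narrower vector registers.
 * Assumption: 32-bit elements, so zmm = 16 lanes, ymm = 8, xmm = 4. */
typedef struct {
    size_t zmm_chunks;   /* full 16-lane chunks */
    size_t ymm_chunks;   /* 8-lane chunks taken from the remainder */
    size_t xmm_chunks;   /* 4-lane chunks taken from what is left */
    size_t scalar_iters; /* iterations left for scalar code */
} tail_plan;

/* Greedily split n iterations: widest register first, then the
 * remainder with the next narrower width, and so on. */
static tail_plan plan_tail(size_t n)
{
    tail_plan p;
    p.zmm_chunks = n / 16; n %= 16;
    p.ymm_chunks = n / 8;  n %= 8;
    p.xmm_chunks = n / 4;  n %= 4;
    p.scalar_iters = n;
    return p;
}
```

For a trip count of 72, this plan gives 4 zmm chunks and 1 ymm chunk with no scalar iterations, which is the code shape the poster is asking for in place of 4 zmm chunks followed by 8 scalar iterations.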
2014 Jan 21
2
[LLVMdev] Gather load in LLVM IR
Hi Evan, all, The most obvious thing to me would be to extend the load instruction to have an additional form that takes a vector of pointers instead of a single pointer. This form would return a vector of values instead of a single value. If a gather instruction is not available on the target, then the load could be lowered to a series of scalar loads and insertelements. Thanks, Nick On
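
The fallback lowering described in the snippet above (a series of scalar loads, each inserted into the result vector) can be modeled in plain C. This is a rough sketch of the idea only, not LLVM IR and not LLVM's actual lowering code. The name `gather_i32` and the fixed width `LANES` are assumptions made for the illustration.

```c
#include <stddef.h>

/* Assumed vector width for this sketch; not taken from the thread. */
#define LANES 4

/* Hypothetical scalar fallback for a gather load: the input is a
 * "vector of pointers", the output a "vector of values". Each lane is
 * filled by one scalar load, mirroring a load plus insertelement pair. */
static void gather_i32(const int *const *ptrs, int out[LANES])
{
    for (size_t lane = 0; lane < LANES; ++lane) {
        out[lane] = *ptrs[lane]; /* scalar load, then insert at this lane */
    }
}
```

The pointers need not be contiguous or distinct, which is what separates a gather from an ordinary vector load.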