search for: vmovdqu32zmm0

Displaying 1 result from an estimated 1 matches for "vmovdqu32zmm0".

2017 Jun 24
4
AVX Scheduling and Parallelism
Hello, After generating AVX code for large no of iterations i came to realize that it still uses only 2 registers zmm0 and zmm1 when the loop urnroll factor=1024, i wonder if this register allocation allows operations in parallel? Also i know all the elements within a single vector instruction are computed in parallel but does the elements of multiple instructions computed in parallel? like are