2018 Jul 29
Vectorizing remainder loop
Hello, I'm working on hardware with very large vector widths, up to v2048.
When I vectorize with LLVM's default loop vectorizer, up to 2047 iterations
can be left in the scalar remainder loop. LLVM does not vectorize these, which
increases the cost. Instead, they should be vectorized using the next
available vector widths, i.e. v1024, v512, v256, v128, v64, v32, v16, v8, v4, ...
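
To make the idea concrete, here is a rough sketch in C of how a trip count could be split across successively halved vector widths. It is only an illustration of the arithmetic, not what LLVM does internally; the helper name split_remainder and its parameters (max_vf, min_vf, counts) are made up for this example.

```c
#include <stddef.h>

/* Hypothetical helper (not an LLVM API): split a trip count n into
   chunks handled at successively halved vector widths, from max_vf
   down to min_vf. counts[k] receives the number of vector iterations
   executed at width (max_vf >> k). The return value is the number of
   iterations still left for the scalar loop. */
size_t split_remainder(size_t n, size_t max_vf, size_t min_vf, size_t *counts) {
    size_t k = 0;
    for (size_t vf = max_vf; vf >= min_vf && vf > 0; vf /= 2, ++k) {
        counts[k] = n / vf;  /* full vector iterations at this width */
        n %= vf;             /* what remains for narrower widths */
    }
    return n;                /* scalar leftovers, fewer than min_vf */
}
```

With max_vf = 2048 and min_vf = 4, a trip count of 4095 would run one iteration at each width from v2048 down to v4, leaving only 3 iterations for the scalar loop instead of 2047.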
The issue of scalar remainder loop has