Hello,

I am working on hardware with very large vector widths, up to v2048. When I vectorize with LLVM's default loop vectorizer, as many as 2047 iterations can be left in the scalar remainder loop. LLVM does not vectorize these iterations, which increases the cost. Instead, they should be vectorized using the next available vector widths, i.e. v1024, v512, v256, v128, v64, v32, v16, v8, v4, ...

The scalar remainder loop has long been an issue in LLVM, but with large vector widths its cost grows and can no longer be ignored.

I have been looking at LoopVectorize.cpp but cannot figure out which code needs to change.

It is very important for me to solve this issue. Please help.

Thank you
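To make the requested cascade concrete, here is a minimal sketch of how a remainder trip count could be split across descending power-of-two vector widths. This is plain C++ for illustration only, not LLVM code: `Chunk` and `cascade_remainder` are hypothetical names, and the sketch assumes every power-of-two width between `min_width` and `max_width` is available on the target.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical illustration, not part of LLVM.
struct Chunk {
    std::size_t width;  // vector width used for this step (1 = scalar)
    std::size_t count;  // number of steps executed at this width
};

// Split `remainder` leftover iterations into steps of descending
// power-of-two widths, from max_width down to min_width. Whatever is
// smaller than min_width is left as scalar iterations.
std::vector<Chunk> cascade_remainder(std::size_t remainder,
                                     std::size_t max_width,
                                     std::size_t min_width) {
    // Assumption of this sketch: max_width is a non-zero power of two.
    assert(max_width != 0 && (max_width & (max_width - 1)) == 0);

    std::vector<Chunk> plan;
    for (std::size_t w = max_width; w >= min_width && w > 1; w /= 2) {
        if (remainder >= w) {
            plan.push_back({w, remainder / w});
            remainder %= w;
        }
    }
    if (remainder > 0)
        plan.push_back({1, remainder});  // scalar tail
    return plan;
}
```

For the worst case after a v2048 main loop, `cascade_remainder(2047, 1024, 4)` yields one step at each of the nine widths v1024 down to v4 (covering 2044 iterations) plus 3 scalar iterations, instead of 2047 scalar iterations.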
Please help in solving this issue. The scalar remainder loop is a really big and significant problem with large vector widths.

Thank you

On Sun, Jul 29, 2018 at 2:52 PM, hameeza ahmed <hahmed2305 at gmail.com> wrote:
> Hello, I am working on hardware with very large vector widths, up to v2048.
> [...]
Hi Hameeza,

At this point the Loop Vectorizer does not have the capability to vectorize the epilog/remainder loop. Some time back there was an RFC on epilog loop vectorization, but it did not go through because of concerns. The RFC includes a patch as well; maybe you can give it a try.

http://llvm.1065342.n5.nabble.com/llvm-dev-Proposal-RFC-Epilog-loop-vectorization-tt106322.html#none

- Ashutosh

From: llvm-dev <llvm-dev-bounces at lists.llvm.org> On Behalf Of hameeza ahmed via llvm-dev
Sent: Sunday, July 29, 2018 10:24 PM
To: llvm-dev <llvm-dev at lists.llvm.org>; Craig Topper <craig.topper at gmail.com>; Hal Finkel <hfinkel at anl.gov>; Friedman, Eli <efriedma at codeaurora.org>
Subject: Re: [llvm-dev] Vectorizing remainder loop

Please help in solving this issue. The scalar remainder loop is a really big and significant problem with large vector widths.

Thank you