Hi!

I am trying to get the loop vectorizer to work on a simple example (http://pastebin.com/tGhpc4y0) that doubles every element in a vector.

I've found that 'opt -loop-vectorize -force-vector-width=4 -S -debug double.ll' works as expected. However, removing the -force-vector-width flag results in no vectorization. From the debug output I can see that the issue boils down to:

LV: The Widest type: 32 bits.
LV: The Widest register is:32bits.

I tried to work back through the source code to figure out why the widest register is incorrect, but I got lost following the logic for how TargetTransformInfo gets initialized. Therefore, I have two questions:

1) Can -force-vector-width be specified from the C++ API? And if so, how?
2) What am I neglecting to do so that TargetTransformInfo is set correctly and vectorization happens without forcing a vector width? Ultimately I would like to use vectorization in conjunction with the JIT ExecutionEngine.

Thank you to those of you who have answered my questions in the past; the answers have helped tremendously and I am extremely grateful!

Kindly,
Josh

--
View this message in context: http://llvm.1065342.n5.nabble.com/Simple-Loop-Vectorize-Question-tp57584.html
Sent from the LLVM - Dev mailing list archive at Nabble.com.
Hi Josh,

Your module does not have a triple, so the target machine and TargetTransformInfo have no way of knowing whether you are running on a machine with vector registers. Try adding '-mcpu=XXXX' to opt and see what happens.

Thanks,
Nadav

On May 9, 2013, at 1:42 PM, Josh Klontz <josh.klontz at gmail.com> wrote:
> Hi! I am trying to get the loop vectorizer to work on a simple example
> (http://pastebin.com/tGhpc4y0) that doubles every element in a vector.
> [...]

_______________________________________________
LLVM Developers mailing list
LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
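As a rough sketch of what "having a triple" means for the module: a target triple is declared at the top of the .ll file. The fragment below is only an illustration. The triple value shown is an assumption (a generic 64-bit x86 Linux target) and is not taken from the pastebin; the right value depends on the machine the code is meant to run on.

```llvm
; Hypothetical header for double.ll. The triple string is an assumed
; example for a 64-bit x86 Linux host, not the poster's actual target.
target triple = "x86_64-unknown-linux-gnu"
```

With a triple present, the target machine can be looked up, which is what gives TargetTransformInfo a chance to report real vector register widths instead of a generic default.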
Nadav,

Please forgive my ignorance, but 'opt -mcpu=corei7 -loop-vectorize -S -debug double.ll' doesn't appear to make a difference. In fact, the flag seems to be ignored, as garbage values for -mcpu don't raise an error. Am I overlooking something else as well?

Many Thanks,
Josh

On Thu, May 9, 2013 at 6:06 PM, Nadav Rotem <nrotem at apple.com> wrote:
> Hi Josh,
>
> Your module does not have a triple, so the target machine and
> TargetTransformInfo have no way of knowing whether you are running on a
> machine with vector registers. Try adding '-mcpu=XXXX' to opt and see
> what happens.
> [...]