thr3ads.net - llvm dev - [LLVMdev] Simple Loop Vectorize Question [May 2013]


Joshua Klontz

2013-May-10 00:53 UTC

[LLVMdev] Simple Loop Vectorize Question

Nadav,

Please forgive my ignorance, but 'opt -mcpu=corei7 -loop-vectorize -S
-debug double.ll' doesn't appear to make a difference. In fact, -mcpu seems
to be ignored, as garbage values for it don't raise an error. Am I
overlooking something else also?

Many Thanks,
Josh



On Thu, May 9, 2013 at 6:06 PM, Nadav Rotem <nrotem at apple.com> wrote:
> Hi Josh,
>
> Your module does not have a triple, so the target machine and
> TargetTransformInfo have no way of knowing if you are running on a machine
> with vector registers.  Try adding '-mcpu=XXXX' to opt and see what
> happens.
>
> Thanks,
> Nadav
>
> On May 9, 2013, at 1:42 PM, Josh Klontz <josh.klontz at gmail.com> wrote:
>
> Hi! I am trying to get the loop vectorizer to work on a simple example
> (http://pastebin.com/tGhpc4y0) that doubles every element in a vector.
>
> I've found that 'opt -loop-vectorize -force-vector-width=4 -S -debug
> double.ll' works as expected. However, removing the -force-vector-width
> flag results in no vectorization. From the debug output I can see that the
> issue boils down to:
>
> LV: The Widest type: 32 bits.
> LV: The Widest register is:32bits.
>
> I tried to work back through the source code to figure out why the widest
> register is incorrect, though I get lost following the code logic for how
> TargetTransformInfo gets initialized. Therefore, I have two questions:
>
> 1) Can -force-vector-width be specified from the C++ API? And if so, how?
> 2) What am I neglecting to do so that TargetTransformInfo is set correctly
> and vectorization happens without forcing a vector width? Ultimately I
> would like to use vectorization in conjunction with the JIT ExecutionEngine.
>
> Thank you to those of you who have answered my questions in the past, the
> answers have helped tremendously and I am extremely grateful!
>
> Kindly,
> Josh
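
For readers without access to the pastebin link: the loop in question
doubles every element of a vector. The sketch below is a hypothetical C++
rendering of such a loop, not the poster's IR. The 32-bit float element type
is an assumption based on the 'Widest type: 32 bits' line, and the names
double_scalar and double_width4 are made up for illustration. The second
function spells out by hand roughly what a vector width of 4 means: four
elements per step, plus a scalar loop for the remainder.

```cpp
#include <cstddef>
#include <vector>

// Scalar form: one element is doubled per iteration.
void double_scalar(std::vector<float>& v) {
    for (std::size_t i = 0; i < v.size(); ++i)
        v[i] *= 2.0f;
}

// Hand-unrolled illustration of a vector width of 4 (assumed element type:
// 32-bit float). A real vectorizer would emit one <4 x float> multiply in
// place of the four scalar statements in the main loop.
void double_width4(std::vector<float>& v) {
    const std::size_t n = v.size();
    const std::size_t main_end = n - (n % 4); // largest multiple of 4 <= n
    std::size_t i = 0;
    for (; i < main_end; i += 4) {
        v[i]     *= 2.0f;
        v[i + 1] *= 2.0f;
        v[i + 2] *= 2.0f;
        v[i + 3] *= 2.0f;
    }
    // Scalar remainder loop for the last n % 4 elements.
    for (; i < n; ++i)
        v[i] *= 2.0f;
}
```

Both functions produce the same result for any input length; only the
number of elements handled per step differs.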

Nadav Rotem

2013-May-10 05:27 UTC


[LLVMdev] Simple Loop Vectorize Question

Hi Josh, 

This line works for me:

opt file.ll -loop-vectorize -S -o - -mtriple=x86_64 -mcpu=corei7-avx -debug

You need to specify the triple on the command line if it is not inside the
module.

Thanks,
Nadav
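
A rough way to see why the target matters: in a simplified model of the
vectorizer's choice, the largest vector width considered is the widest
register divided by the widest type in the loop. The function below is an
illustrative sketch of that arithmetic only (max_vector_width is a made-up
name, and the real cost model weighs more than this). With the 32-bit
register reported in the debug output the result is 1, i.e. no
vectorization; with a 128-bit SSE register it would be 4, and with a
256-bit AVX register it would be 8.

```cpp
#include <algorithm>

// Simplified model (an assumption, not LLVM's actual code): the largest
// vector width considered is the widest register size divided by the widest
// element type, and never less than 1 (1 means "stay scalar").
unsigned max_vector_width(unsigned widest_register_bits,
                          unsigned widest_type_bits) {
    if (widest_type_bits == 0)
        return 1; // guard against division by zero
    return std::max(1u, widest_register_bits / widest_type_bits);
}
```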




Joshua Klontz

2013-May-10 16:52 UTC


[LLVMdev] Simple Loop Vectorize Question

Nadav,

Yes, thank you, this works for me as well. One final related question I
hope you can educate me on:

I'm trying to apply code vectorization within the context of the JIT
execution engine. I've tried to initialize my module and execution engine
with information about the target triple and the cpu akin to what worked
from the command line. Unfortunately, I'm still encountering the same error
I mentioned in my first post:

// Pseudo code
InitializeNativeTarget();
Module *m = new Module("test", getGlobalContext());
m->setTargetTriple(sys::getProcessTriple()); // also tried "x86_64"
ExecutionEngine *ee = EngineBuilder(m)
    .setMCPU(sys::getHostCPUName()) // also tried MCPU = "corei7-avx"
    .setEngineKind(EngineKind::JIT)
    .create();
initializeScalarOpts(*PassRegistry::getPassRegistry());
FunctionPassManager *fpm = new FunctionPassManager(m);
fpm->add(createVerifierPass(PrintMessageAction));
fpm->add(new DataLayout(*ee->getDataLayout())); // Is this the correct idiom?
fpm->add(createBasicAliasAnalysisPass());
fpm->add(createLICMPass());
fpm->add(createLoopVectorizePass());
fpm->add(createPrintFunctionPass());
DebugFlag = true;
Function *f = ...; // Build function with IRBuilder (IR code included in first post)
fpm->run(*f);

Do you notice anything obviously wrong with this? Would a complete minimal
reproducing example be helpful?

Thanks again,
Josh

