thr3ads.net - llvm dev - [LLVMdev] Limit loop vectorizer to SSE [Nov 2013]

If this information is useful, please help other people find it:
Share via:

Nadav Rotem

2013-Nov-15 20:58 UTC

[LLVMdev] Limit loop vectorizer to SSE

On Nov 15, 2013, at 12:36 PM, Renato Golin <renato.golin at linaro.org>
wrote:
> On 15 November 2013 20:24, Joshua Klontz <josh.klontz at gmail.com>
wrote:
> Agreed, is there a pass that will insert a runtime alignment check? Also,
what's the easiest way to get at TargetTransformInfo::getRegisterBitWidth()
so I don't have to hard code 32? Thanks!
> 
> I think that's a fair question, and it's about safety. If
you're getting this on the JIT, means we may be generating unsafe
transformations on the vectorizer.
> 
> Arnold, Nadav, I don't remember seeing code to generate any run-time
alignment checks on the incoming pointer, is there such a thing? If not,
shouldn't we add one?

If the the vectorizer generates aligned memory accesses to unaligned addresses
then this is a serious bug.  But I don’t think that Josh said that the
vectorizer generated aligned accesses to unaligned pointers.

There is no point in LLVM checking for alignment because if the memory is
unaligned then the program will crash.  Users who want to crash with a readable
error message can simply write code that checks the pointer (by masking the high
bits and comparing to zero).

-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20131115/d15886ed/attachment.html>

Joshua Klontz

2013-Nov-15 21:49 UTC

head link

[LLVMdev] Limit loop vectorizer to SSE

Nadav,

I believe aligned accesses to unaligned pointers is precisely the issue.
Consider the function `add_u8S` before[1] and after[2] the loop vectorizer
pass. There is no alignment assumption associated with %kernel_data prior
to vectorization. I can't tell if it's the loop vectorizer or the
codegen
at fault, but the alignment assumption seems to sneak in somewhere.

v/r,
Josh

[1] http://pastebin.com/kc95WtGG
[2] http://pastebin.com/VY3ZLVJK


On Fri, Nov 15, 2013 at 3:58 PM, Nadav Rotem <nrotem at apple.com> wrote:
>
> On Nov 15, 2013, at 12:36 PM, Renato Golin <renato.golin at
linaro.org>
> wrote:
>
> On 15 November 2013 20:24, Joshua Klontz <josh.klontz at gmail.com>
wrote:
>
>> Agreed, is there a pass that will insert a runtime alignment check?
Also,
>> what's the easiest way to get at
TargetTransformInfo::getRegisterBitWidth()
>> so I don't have to hard code 32? Thanks!
>>
>
> I think that's a fair question, and it's about safety. If
you're getting
> this on the JIT, means we may be generating unsafe transformations on the
> vectorizer.
>
> Arnold, Nadav, I don't remember seeing code to generate any run-time
> alignment checks on the incoming pointer, is there such a thing? If not,
> shouldn't we add one?
>
>
>
> If the the vectorizer generates aligned memory accesses to unaligned
> addresses then this is a serious bug.  But I don’t think that Josh said
> that the vectorizer generated aligned accesses to unaligned pointers.
>
> There is no point in LLVM checking for alignment because if the memory is
> unaligned then the program will crash.  Users who want to crash with a
> readable error message can simply write code that checks the pointer (by
> masking the high bits and comparing to zero).
>
>-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20131115/aad9554a/attachment.html>

Arnold Schwaighofer

2013-Nov-15 22:03 UTC

head link

[LLVMdev] Limit loop vectorizer to SSE

Aha, this is interesting. I think this is indeed a bug.

We start off with a

= load i8

(A value of 0 or an omitted align argument means that the operation has the ABI
alignment for the target)

We vectorize this as
= load <32 x i8> 

Which again is assumed to be abi aligned for the type <… i8> which is
probably 32byte.


We should be outputting either the abi alignment of the original type or 1 for
such memory accesses (where alignment == 0).

= load <32 x i8>, align 1

I think we don’t see this from code coming from clang because it will output “,
align #” for most accesses.

Thanks,
Arnold

On Nov 15, 2013, at 3:49 PM, Joshua Klontz <josh.klontz at gmail.com>
wrote:
> Nadav,
> 
> I believe aligned accesses to unaligned pointers is precisely the issue.
Consider the function `add_u8S` before[1] and after[2] the loop vectorizer pass.
There is no alignment assumption associated with %kernel_data prior to
vectorization. I can't tell if it's the loop vectorizer or the codegen
at fault, but the alignment assumption seems to sneak in somewhere.
> 
> v/r,
> Josh
> 
> [1] http://pastebin.com/kc95WtGG
> [2] http://pastebin.com/VY3ZLVJK
> 
> 
> On Fri, Nov 15, 2013 at 3:58 PM, Nadav Rotem <nrotem at apple.com>
wrote:
> 
> On Nov 15, 2013, at 12:36 PM, Renato Golin <renato.golin at
linaro.org> wrote:
> 
>> On 15 November 2013 20:24, Joshua Klontz <josh.klontz at
gmail.com> wrote:
>> Agreed, is there a pass that will insert a runtime alignment check?
Also, what's the easiest way to get at
TargetTransformInfo::getRegisterBitWidth() so I don't have to hard code 32?
Thanks!
>> 
>> I think that's a fair question, and it's about safety. If
you're getting this on the JIT, means we may be generating unsafe
transformations on the vectorizer.
>> 
>> Arnold, Nadav, I don't remember seeing code to generate any
run-time alignment checks on the incoming pointer, is there such a thing? If
not, shouldn't we add one?
> 
> 
> If the the vectorizer generates aligned memory accesses to unaligned
addresses then this is a serious bug.  But I don’t think that Josh said that the
vectorizer generated aligned accesses to unaligned pointers.
> 
> There is no point in LLVM checking for alignment because if the memory is
unaligned then the program will crash.  Users who want to crash with a readable
error message can simply write code that checks the pointer (by masking the high
bits and comparing to zero).
> 
>

Arnold Schwaighofer

2013-Nov-15 22:05 UTC

head link

[LLVMdev] Limit loop vectorizer to SSE

Something like:

index 6db7f68..68564cb 100644
--- a/lib/Transforms/Vectorize/LoopVectorize.cpp
+++ b/lib/Transforms/Vectorize/LoopVectorize.cpp
@@ -1208,6 +1208,8 @@ void InnerLoopVectorizer::vectorizeMemoryInstruction(Instr
   Type *DataTy = VectorType::get(ScalarDataTy, VF);
   Value *Ptr = LI ? LI->getPointerOperand() : SI->getPointerOperand();
   unsigned Alignment = LI ? LI->getAlignment() : SI->getAlignment();
+  if (Alignment == 0)
+    Alignment = 1;
   unsigned AddressSpace = Ptr->getType()->getPointerAddressSpace();
   unsigned ScalarAllocatedSize = DL->getTypeAllocSize(ScalarDataTy);
   unsigned VectorElementSize = DL->getTypeStoreSize(DataTy)/VF;

Should fix this.

On Nov 15, 2013, at 3:49 PM, Joshua Klontz <josh.klontz at gmail.com>
wrote:
> Nadav,
> 
> I believe aligned accesses to unaligned pointers is precisely the issue.
Consider the function `add_u8S` before[1] and after[2] the loop vectorizer pass.
There is no alignment assumption associated with %kernel_data prior to
vectorization. I can't tell if it's the loop vectorizer or the codegen
at fault, but the alignment assumption seems to sneak in somewhere.
> 
> v/r,
> Josh
> 
> [1] http://pastebin.com/kc95WtGG
> [2] http://pastebin.com/VY3ZLVJK
> 
> 
> On Fri, Nov 15, 2013 at 3:58 PM, Nadav Rotem <nrotem at apple.com>
wrote:
> 
> On Nov 15, 2013, at 12:36 PM, Renato Golin <renato.golin at
linaro.org> wrote:
> 
>> On 15 November 2013 20:24, Joshua Klontz <josh.klontz at
gmail.com> wrote:
>> Agreed, is there a pass that will insert a runtime alignment check?
Also, what's the easiest way to get at
TargetTransformInfo::getRegisterBitWidth() so I don't have to hard code 32?
Thanks!
>> 
>> I think that's a fair question, and it's about safety. If
you're getting this on the JIT, means we may be generating unsafe
transformations on the vectorizer.
>> 
>> Arnold, Nadav, I don't remember seeing code to generate any
run-time alignment checks on the incoming pointer, is there such a thing? If
not, shouldn't we add one?
> 
> 
> If the the vectorizer generates aligned memory accesses to unaligned
addresses then this is a serious bug.  But I don’t think that Josh said that the
vectorizer generated aligned accesses to unaligned pointers.
> 
> There is no point in LLVM checking for alignment because if the memory is
unaligned then the program will crash.  Users who want to crash with a readable
error message can simply write code that checks the pointer (by masking the high
bits and comparing to zero).
> 
>

Arnold Schwaighofer

2013-Nov-15 23:15 UTC

head link

[LLVMdev] Limit loop vectorizer to SSE

A fix for this is in r194876.

Thanks for reporting this!


On Nov 15, 2013, at 3:49 PM, Joshua Klontz <josh.klontz at gmail.com>
wrote:
> Nadav,
> 
> I believe aligned accesses to unaligned pointers is precisely the issue.
Consider the function `add_u8S` before[1] and after[2] the loop vectorizer pass.
There is no alignment assumption associated with %kernel_data prior to
vectorization. I can't tell if it's the loop vectorizer or the codegen
at fault, but the alignment assumption seems to sneak in somewhere.
> 
> v/r,
> Josh
> 
> [1] http://pastebin.com/kc95WtGG
> [2] http://pastebin.com/VY3ZLVJK
> 
> 
> On Fri, Nov 15, 2013 at 3:58 PM, Nadav Rotem <nrotem at apple.com>
wrote:
> 
> On Nov 15, 2013, at 12:36 PM, Renato Golin <renato.golin at
linaro.org> wrote:
> 
>> On 15 November 2013 20:24, Joshua Klontz <josh.klontz at
gmail.com> wrote:
>> Agreed, is there a pass that will insert a runtime alignment check?
Also, what's the easiest way to get at
TargetTransformInfo::getRegisterBitWidth() so I don't have to hard code 32?
Thanks!
>> 
>> I think that's a fair question, and it's about safety. If
you're getting this on the JIT, means we may be generating unsafe
transformations on the vectorizer.
>> 
>> Arnold, Nadav, I don't remember seeing code to generate any
run-time alignment checks on the incoming pointer, is there such a thing? If
not, shouldn't we add one?
> 
> 
> If the the vectorizer generates aligned memory accesses to unaligned
addresses then this is a serious bug.  But I don’t think that Josh said that the
vectorizer generated aligned accesses to unaligned pointers.
> 
> There is no point in LLVM checking for alignment because if the memory is
unaligned then the program will crash.  Users who want to crash with a readable
error message can simply write code that checks the pointer (by masking the high
bits and comparing to zero).
> 
>

Peter Newman

2013-Nov-19 06:57 UTC

head link

[LLVMdev] Limit loop vectorizer to SSE

On 16/11/2013 7:58 AM, Nadav Rotem wrote:>
> On Nov 15, 2013, at 12:36 PM, Renato Golin <renato.golin at linaro.org 
> <mailto:renato.golin at linaro.org>> wrote:
>
>> On 15 November 2013 20:24, Joshua Klontz <josh.klontz at gmail.com 
>> <mailto:josh.klontz at gmail.com>> wrote:
>>
>>     Agreed, is there a pass that will insert a runtime alignment
>>     check? Also, what's the easiest way to get at
>>     TargetTransformInfo::getRegisterBitWidth() so I don't have to
>>     hard code 32? Thanks!
>>
>>
>> I think that's a fair question, and it's about safety. If
you're
>> getting this on the JIT, means we may be generating unsafe 
>> transformations on the vectorizer.
>>
>> Arnold, Nadav, I don't remember seeing code to generate any
run-time
>> alignment checks on the incoming pointer, is there such a thing? If 
>> not, shouldn't we add one?
>
>
> If the the vectorizer generates aligned memory accesses to unaligned 
> addresses then this is a serious bug.  But I don't think that Josh 
> said that the vectorizer generated aligned accesses to unaligned 
> pointers.I've hit this (or a similar) bug, where generating SSE instructions for 
vector access was assuming stack memory would be aligned, if it was an 
aligned offset from the start of the allocation. But if the stack wasn't 
appropriately aligned when the function was called, the stack memory 
would be unaligned and cause an exception.>
> There is no point in LLVM checking for alignment because if the memory 
> is unaligned then the program will crash.  Users who want to crash 
> with a readable error message can simply write code that checks the 
> pointer (by masking the high bits and comparing to zero).
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20131119/7aa6269f/attachment.html>

Peter Newman

2013-Nov-19 07:07 UTC

head link

[LLVMdev] Limit loop vectorizer to SSE

On 19/11/2013 5:57 PM, Peter Newman wrote:> On 16/11/2013 7:58 AM, Nadav Rotem wrote:
>>
>> On Nov 15, 2013, at 12:36 PM, Renato Golin <renato.golin at
linaro.org
>> <mailto:renato.golin at linaro.org>> wrote:
>>
>>> On 15 November 2013 20:24, Joshua Klontz <josh.klontz at
gmail.com
>>> <mailto:josh.klontz at gmail.com>> wrote:
>>>
>>>     Agreed, is there a pass that will insert a runtime alignment
>>>     check? Also, what's the easiest way to get at
>>>     TargetTransformInfo::getRegisterBitWidth() so I don't have
to
>>>     hard code 32? Thanks!
>>>
>>>
>>> I think that's a fair question, and it's about safety. If
you're
>>> getting this on the JIT, means we may be generating unsafe 
>>> transformations on the vectorizer.
>>>
>>> Arnold, Nadav, I don't remember seeing code to generate any
run-time
>>> alignment checks on the incoming pointer, is there such a thing? If
>>> not, shouldn't we add one?
>>
>>
>> If the the vectorizer generates aligned memory accesses to unaligned 
>> addresses then this is a serious bug.  But I don't think that Josh 
>> said that the vectorizer generated aligned accesses to unaligned 
>> pointers.
> I've hit this (or a similar) bug, where generating SSE instructions 
> for vector access was assuming stack memory would be aligned, if it 
> was an aligned offset from the start of the allocation. But if the 
> stack wasn't appropriately aligned when the function was called, the 
> stack memory would be unaligned and cause an exception.Actually, after reading the rest of the thread and the discussion of 
what was occurring, it seems I was generating a function without 
alignment specified, which via the "no alignment specified means align 0 
which means ABI alignment" meant the generated code thought it was 
aligned appropriately. Specifying an alignment fixed this (and causes 
the generated code to manually ensure its alignment).
>>
>> There is no point in LLVM checking for alignment because if the 
>> memory is unaligned then the program will crash.  Users who want to 
>> crash with a readable error message can simply write code that checks 
>> the pointer (by masking the high bits and comparing to zero).
>>
>>
>
>
>
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20131119/b6b1012d/attachment.html>

Maybe Matching Threads

Search for more seemingly similar threads

llvm dev - Nov 2013 - [LLVMdev] Limit loop vectorizer to SSE

[LLVMdev] Limit loop vectorizer to SSE

[LLVMdev] Limit loop vectorizer to SSE

[LLVMdev] Limit loop vectorizer to SSE

[LLVMdev] Limit loop vectorizer to SSE

[LLVMdev] Limit loop vectorizer to SSE

[LLVMdev] Limit loop vectorizer to SSE

[LLVMdev] Limit loop vectorizer to SSE

Maybe Matching Threads