thr3ads.net - llvm dev - [LLVMdev] Autovectorization questions [Mar 2014]


Arnold Schwaighofer

2014-Mar-12 23:45 UTC

[LLVMdev] Autovectorization questions

On Mar 12, 2014, at 4:05 PM, Chandler Carruth <chandlerc at google.com> wrote:
>
> On Wed, Mar 12, 2014 at 3:50 PM, Arnold Schwaighofer <aschwaighofer at apple.com> wrote:
> > In order to vectorize code like this LLVM needs to prove that “A[i*7]” does
> > not wrap in the address space. It fails to do so
>
> But, why?
>
> I'm moderately sure that neither C nor C++ allows wrapping around the
> end of the address space. If they do, we will fix C++ at least to disallow this.
> 'i' is a signed integer, so we can't wrap in the index space either.
> So why can't LLVM prove this?

The loop vectorizer relies on SCEV’s nowrap flags. We need to improve SCEV for
this.

  %conv = sext i32 %k to i64
  -->  (sext i32 %k to i64)
  %i.06 = phi i64 [ 0, %entry ], [ %inc, %for.body ]
  -->  {0,+,1}<nuw><nsw><%for.body>             Exits: 1023
  %mul1 = mul nsw i64 %i.06, 7
  -->  {0,+,7}<%for.body>               Exits: 7161
  %arrayidx2 = getelementptr inbounds i32* %A, i64 %mul1
  -->  {%A,+,28}<%for.body>           <== we want to see a nw flag here.
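
The bracketed expressions in the dump are SCEV add recurrences: {start,+,step} has the value start + step*n on iteration n. As a rough illustration only (a Python sketch, not LLVM code), the snippet below reproduces the exit values shown above and checks when a pointer recurrence like {%A,+,28} would leave a 64-bit address space. The helper names `addrec_at` and `pointer_addrec_wraps`, the 64-bit pointer width, and the two example base addresses are assumptions made for the illustration; the step of 28 bytes is 7 elements times the 4-byte size of i32.

```python
ADDRESS_BITS = 64  # assumption: 64-bit pointers, matching the i64 index in the dump


def addrec_at(start, step, n):
    """Value of the add recurrence {start,+,step} on iteration n (unbounded integers)."""
    return start + step * n


def pointer_addrec_wraps(base, step_bytes, last_iteration, bits=ADDRESS_BITS):
    """True if {base,+,step_bytes} leaves [0, 2**bits) by last_iteration.

    With a constant step the sequence is monotonic, so for a base that is
    itself in range it is enough to check the value on the last iteration.
    """
    last = addrec_at(base, step_bytes, last_iteration)
    return not (0 <= last < 2 ** bits)


LAST_ITERATION = 1023   # from "Exits: 1023" on {0,+,1}
INDEX_STEP = 7          # from "mul nsw i64 %i.06, 7"
ELEMENT_SIZE = 4        # i32 occupies 4 bytes
BYTE_STEP = INDEX_STEP * ELEMENT_SIZE

print(addrec_at(0, 1, LAST_ITERATION))           # exit value of the induction variable
print(addrec_at(0, INDEX_STEP, LAST_ITERATION))  # exit value of the scaled index
print(BYTE_STEP)                                 # byte stride of the pointer recurrence

# Hypothetical base addresses: one far from the top of the address space, one near it.
print(pointer_addrec_wraps(0x1000, BYTE_STEP, LAST_ITERATION))
print(pointer_addrec_wraps(2**64 - 64, BYTE_STEP, LAST_ITERATION))
```

The compiler cannot plug in a concrete base like this; it has to establish the no-wrap fact symbolically, from guarantees such as the ones Chandler mentions, and the nw flag is where that fact would be recorded.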


SCEV sometimes drops nowrap flags for safety (canonicalization can make them
invalid if the same expression is used in different contexts). See past
discussions on this.
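
One way to picture the hazard: if expressions are uniqued by their operands, a flag proved in one context becomes visible to every other user of the same expression. The Python toy below is only a sketch of that idea and does not mirror SCEV's actual data structures; `ExprTable`, `get_add`, the `keep_flags` switch, and the flag strings are hypothetical names invented for the illustration.

```python
class ExprTable:
    """Toy expression table: one shared node per (opcode, operands) key."""

    def __init__(self):
        self._nodes = {}

    def get_add(self, lhs, rhs, flags=frozenset(), keep_flags=False):
        key = ("add", lhs, rhs)  # flags are deliberately not part of the key
        node = self._nodes.setdefault(key, {"key": key, "flags": set()})
        if keep_flags:
            # Unsafe policy: the flag now applies to every user of this node.
            node["flags"] |= flags
        return node


# Unsafe policy: context 1 proved "nsw" under some guard, context 2 proved nothing,
# yet both receive the same node and therefore the same flag.
unsafe = ExprTable()
guarded = unsafe.get_add("a", "b", flags=frozenset({"nsw"}), keep_flags=True)
unguarded = unsafe.get_add("a", "b")
print(guarded is unguarded, unguarded["flags"])

# Conservative policy: drop the flag, so no user sees a claim it did not prove.
safe = ExprTable()
safe.get_add("a", "b", flags=frozenset({"nsw"}))
print(safe.get_add("a", "b")["flags"])
```

The conservative policy is always correct but loses information, which is the trade-off behind the missing nw flag in the dump above.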

We are thinking about doing something like what is described here:
http://permalink.gmane.org/gmane.comp.compilers.llvm.devel/67476
or in this thread:
http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20131007/190703.html

Chandler Carruth

2014-Mar-12 23:48 UTC


On Wed, Mar 12, 2014 at 4:45 PM, Arnold Schwaighofer <aschwaighofer at apple.com> wrote:
> The loop vectorizer relies on SCEV’s nowrap flags. We need to improve SCEV
> for this.
>
>   %conv = sext i32 %k to i64
>   -->  (sext i32 %k to i64)
>   %i.06 = phi i64 [ 0, %entry ], [ %inc, %for.body ]
>   -->  {0,+,1}<nuw><nsw><%for.body>             Exits: 1023
>   %mul1 = mul nsw i64 %i.06, 7
>   -->  {0,+,7}<%for.body>               Exits: 7161
>   %arrayidx2 = getelementptr inbounds i32* %A, i64 %mul1
>   -->  {%A,+,28}<%for.body>           <== we want to see a nw flag here.
>
>
> SCEV sometimes drops nowrap flags for safety (canonicalization can make them
> invalid if the same expression is used in different contexts). See past
> discussions on this.
>
Sure, but I think it's really important to clarify that the *example* is
fine, and there is nothing fundamental about it that prevents vectorization.

We simply need to fix SCEV.

Arnold Schwaighofer

2014-Mar-12 23:59 UTC


On Mar 12, 2014, at 4:48 PM, Chandler Carruth <chandlerc at google.com> wrote:
>
> On Wed, Mar 12, 2014 at 4:45 PM, Arnold Schwaighofer <aschwaighofer at apple.com> wrote:
> > The loop vectorizer relies on SCEV’s nowrap flags. We need to improve SCEV
> > for this.
> >
> >   %conv = sext i32 %k to i64
> >   -->  (sext i32 %k to i64)
> >   %i.06 = phi i64 [ 0, %entry ], [ %inc, %for.body ]
> >   -->  {0,+,1}<nuw><nsw><%for.body>             Exits: 1023
> >   %mul1 = mul nsw i64 %i.06, 7
> >   -->  {0,+,7}<%for.body>               Exits: 7161
> >   %arrayidx2 = getelementptr inbounds i32* %A, i64 %mul1
> >   -->  {%A,+,28}<%for.body>           <== we want to see a nw flag here.
> >
> > SCEV sometimes drops nowrap flags for safety (canonicalization can make them
> > invalid if the same expression is used in different contexts). See past
> > discussions on this.
>
> Sure, but I think it's really important to clarify that the *example* is
> fine, and there is nothing fundamental about it that prevents vectorization.

Sure. I thought this was clear in my answer (obviously not :). Rereading, I
should probably have added that the code is vectorizable (I assumed Zinovy
knows this).

> In order to vectorize code like this LLVM needs to prove that “A[i*7]” does
> not wrap in the address space. It fails to do so and so LLVM doesn’t vectorize
> this loop even if we try to force it.
