thr3ads.net - llvm dev - [LLVMdev] [LoopVectorizer] Missed vectorization opportunities caused by sext/zext operations [May 2015]

If this information is useful, please help other people find it:
Share via:

Sanjoy Das

2015-May-06 01:59 UTC

[LLVMdev] [LoopVectorizer] Missed vectorization opportunities caused by sext/zext operations

For

void test0(unsigned short a, unsigned short *  in, unsigned short * out) {
  for (unsigned short w = 1; w < a - 1; w++) //this will never overflow
      out[w] = in[w+7] * 2;
}

I think it will be sufficient to add a couple of new cases to
ScalarEvolution::HowManyLessThans --

  zext(A) ult zext(B) == A ult B
  sext(A) slt sext(B) == A slt B

Currently it bails out if it sees a non-add recurrence on the LHS of
the ult/slt.

You could also teach ScalarEvolution::isImpliedCondOperands the
following:

  zext(A) ult zext(B) <=> A ult B
  sext(A) slt sext(B) <=> A slt B

to get more aggressive promotion from ext{A,+,B} to {ext A,+,ext B}.

I'll be happy to review these patches.

-- Sanjoy

Silviu Baranga

2015-May-06 11:03 UTC

head link

[LLVMdev] [LoopVectorizer] Missed vectorization opportunities caused by sext/zext operations

Hi Renato, Sanjoy,
> void test0(unsigned short a, unsigned short *  in, unsigned short * out) {
>   for (unsigned short w = 1; w < a - 1; w++) //this will never overflow
>       out[w] = in[w+7] * 2;
> }
Turns out this can actually overflow for a = 0, and in this case would
be an infinite loop (end condition would be w < MAX_USHORT). Run-time
checks could still be added here:

if (a > 0)
  do vectorized version
else
  default to scalar version

On 06 May 2015 at 03:00, Sanjoy Das <sanjoy at playingwithpointers.com>
wrote:> I think it will be sufficient to add a couple of new cases to
> ScalarEvolution::HowManyLessThans --
> 
>   zext(A) ult zext(B) == A ult B
>   sext(A) slt sext(B) == A slt B
I had a go at this but realized that the problem above would affect
at least the ult case (we need to have B < max(type(B)) for it to hold),
and so I gave up (at least temporarily). I think this should still be
possible, but some range checking might be required.

On 05 May 2015 at 18:54, Renato Golin < renato.golin at linaro.org >
wrote:> > -          Promote values that go into the trip count calculation and
memory
> > access indices to the smallest type
> > which would remove sext/zext/trunc operations from the loop body. This
> > should remove the sext/zext issue, as SCEV wouldn’t have to deal with
> > these operations.
> 
> You mean demote the range specifier (a in your example above), to a more
> constrained type than the induction variable, right?
> 
> If possible, this would help static analysis of loop bounds and not require
run
> time checks.
I was thinking about doing something like promoting the
induction variable (in my case w) to a wider type (int), but there might be
some problems with that. I think we need SCEV working to be able to reason
about when that would be safe, and even if we could do that, there is no
guarantee that this would give us any benefit - we might end up
increasing the code size if vectorization wouldn't be possible
even after that transformation.

On 05 May 2015 at 18:54, Renato Golin < renato.golin at linaro.org >
wrote:> 
> > -          Add runtime checks (outside the loop) to detect overflows
in the
> > original loop
> 
> I think we do some of that. Maybe the two steps above will make the little
> we have today work on most cases already.
I think we're only checking for overflows in the trip count variable
that's
being generated by the loop vectorizer. So any potential overflows in the
actual loop would make us bail out in the legality check part.

I'm currently trying to work around this issue by teaching SCEV
to be able to optimistically fold sext/zext expressions while
accumulating the assumptions that it has made doing so (which
need to be checked at run-time). This allows us to gather all the
assumptions in the vectorization legality check of the loop vectorizer
and even allows us to go through the cost model without modifying
the IR.

This would of course only happen if we would otherwise abort
because we didn't get a SCEV expression that we want to be
an AddRecExpr.

This seems to work at least on a toy example (it can work around
issues with the number of back-edges taken and memory checks),
so I hope to have a patch sent out for RFC soon.

Thanks,
Silviu

Sanjoy Das

2015-May-06 19:00 UTC

head link

[LLVMdev] [LoopVectorizer] Missed vectorization opportunities caused by sext/zext operations

Hi Silviu,
>> void test0(unsigned short a, unsigned short *  in, unsigned short *
out) {
>>   for (unsigned short w = 1; w < a - 1; w++) //this will never
overflow
>>       out[w] = in[w+7] * 2;
>> }
>
> Turns out this can actually overflow for a = 0, and in this case would
> be an infinite loop (end condition would be w < MAX_USHORT). Run-time
> checks could still be added here:
Maybe I'm missing something here, but for a = 0 this will run exactly
(MAX_USHORT - 1) iterations, it will stop once w is MAX_USHORT.

In general, if your loop condition is "I u< N" for any N, I++
cannot
unsigned overflow since the only value of I for which I++ overflows is
MAX_USHORT (= -1) and MAX_USHORT will not be u< N for any N.  A
similar fact follows for "I s< N" and sign overflow in I++.
>> I think it will be sufficient to add a couple of new cases to
>> ScalarEvolution::HowManyLessThans --
>>
>>   zext(A) ult zext(B) == A ult B
>>   sext(A) slt sext(B) == A slt B
>
> I had a go at this but realized that the problem above would affect
> at least the ult case (we need to have B < max(type(B)) for it to hold),
Why?  AFAICT, the above two facts hold for all values of A and B,
including INT_MAX, INT_MIN, INT_UMAX etc.

-- Sanjoy

llvm dev - May 2015 - [LLVMdev] [LoopVectorizer] Missed vectorization opportunities caused by sext/zext operations

[LLVMdev] [LoopVectorizer] Missed vectorization opportunities caused by sext/zext operations

[LLVMdev] [LoopVectorizer] Missed vectorization opportunities caused by sext/zext operations

[LLVMdev] [LoopVectorizer] Missed vectorization opportunities caused by sext/zext operations