thr3ads.net - llvm dev - [llvm-dev] Is it feasible for LV to vectorize a loop accessing A[i] using VF>2 when A has only 2 elements? [Dec 2020]

If this information is useful, please help other people find it:
Share via:

Björn Pettersson A via llvm-dev

2020-Dec-07 17:10 UTC

[llvm-dev] Is it feasible for LV to vectorize a loop accessing A[i] using VF>2 when A has only 2 elements?

Thanks for the answer Philip.

Just to explain the background for the question:

My original problem I had was that DeadStoreElimination eliminated
stores that was intended to be used by a memcpy. It turned out that
the input program was faulty as the memcpy was reading out-of-bounds
(although not being detected by clang, nor clang static analyzer).
In order to do some code washing, I wanted to try to find more files
that might suffer from similar out-of-bounds accesses. I did that by
hacking into BasicAliasAnalysis where it compares access sizes with
size of underlying objects to catch the problem when alias analysis
decided that use in the memcpy was consider as NoAlias due to
reading outside the underlying object (later I realized that the
same kind of check already exist in Lint.cpp in visitMemoryReference
so I could probably just have use -lint to find suspicious files).

The annoying thing with the LoopVectorizer was that I got lots of
false-positives since LV actually introduce UB in some code paths.
I decided to disable LV when doing the code washing to get rid of 
such false-positives.

While it isn't a big problem for me any longer, I still think it is
a bit unfortunate that the vectorizer is creating a vector loop
with UB. No idea if that would be easy to avoid.

/Björn

> -----Original Message-----
> From: Philip Reames <listmail at philipreames.com>
> Sent: den 5 december 2020 00:17
> To: Björn Pettersson A <bjorn.a.pettersson at ericsson.com>; llvm-dev
<llvm-
> dev at lists.llvm.org>
> Subject: Re: [llvm-dev] Is it feasible for LV to vectorize a loop
> accessing A[i] using VF>2 when A has only 2 elements?
> 
> The code the vectorizer emits should be correct for any well defined
> value of N.  (A program which runs this with N > 2 is full UB.)
> 
> The existing code doesn't (yet) reason about out of bounds accesses
from
> known object sizes as a means to imply bounds on loop induction
> variables.  It would be feasible to do so, it's just not something
> anyone has bothered with.
> 
> Philip
> 
> On 12/4/20 2:55 PM, Björn Pettersson A via llvm-dev wrote:
> > Hi!
> >
> > Consider an example like this
> >
> > int G[1000];
> >
> > void gazonk(int N) {
> >      int A[2] = {0};
> >      for (int i = 0; i < N; ++i)
> >        G[i] = A[i] + i;
> > }
> >
> >
> > When compiling with "-O3 -emit-llvm" ( see
> https://protect2.fireeye.com/v1/url?k=ac1ada1d-f3dd08b5-ac1a9a86-
> 86827b6aebbf-48fe48a3db5edb61&q=1&e=bbfa90b2-917e-4967-ab8e-
> 91a9b322c814&u=https%3A%2F%2Fgodbolt.org%2Fz%2Ff6jWfM )
> > the loop is vectorized with VF=4 and we get
> >
> >    %2 = alloca i64, align 8
> >    %3 = bitcast i64* %2 to [2 x i32]*
> >    ...
> >   vector.body:
> >    ...
> >    %24 = getelementptr inbounds [2 x i32], [2 x i32]* %3, i64 0, i64
> %21, !dbg !35
> >    %25 = bitcast i32* %24 to <4 x i32>*, !dbg !35
> >    %26 = load <4 x i32>, <4 x i32>* %25, align 8, !dbg
!35, !tbaa !36
> >    ...
> >
> >
> > Loading <4 x i32> from something pointing into [2 x i32] seems
like a
> bad thing (UB?).
> > And I believe that for example BasicAliasAnalysis will assume that the
> load won't alias with anything else since the size of the access is
> greater than the underlying object, so the code in the vector body is
> just crap afaict.
> >
> > There are some loop guards that perhaps (hopefully) protects from
> running the vector body here,
> > but isn't it a bit weird thing to introduce such code anyway?
> >
> > BR,
> > Björn
> > _______________________________________________
> > LLVM Developers mailing list
> > llvm-dev at lists.llvm.org
> > https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev

Philip Reames via llvm-dev

2020-Dec-07 18:59 UTC

head link

[llvm-dev] Is it feasible for LV to vectorize a loop accessing A[i] using VF>2 when A has only 2 elements?

This sounds like a case where using msan would have caught your 
problem?  Did you try that approach instead of static analysis?

Philip

On 12/7/20 9:10 AM, Björn Pettersson A wrote:> Thanks for the answer Philip.
>
> Just to explain the background for the question:
>
> My original problem I had was that DeadStoreElimination eliminated
> stores that was intended to be used by a memcpy. It turned out that
> the input program was faulty as the memcpy was reading out-of-bounds
> (although not being detected by clang, nor clang static analyzer).
> In order to do some code washing, I wanted to try to find more files
> that might suffer from similar out-of-bounds accesses. I did that by
> hacking into BasicAliasAnalysis where it compares access sizes with
> size of underlying objects to catch the problem when alias analysis
> decided that use in the memcpy was consider as NoAlias due to
> reading outside the underlying object (later I realized that the
> same kind of check already exist in Lint.cpp in visitMemoryReference
> so I could probably just have use -lint to find suspicious files).
>
> The annoying thing with the LoopVectorizer was that I got lots of
> false-positives since LV actually introduce UB in some code paths.
> I decided to disable LV when doing the code washing to get rid of
> such false-positives.
>
> While it isn't a big problem for me any longer, I still think it is
> a bit unfortunate that the vectorizer is creating a vector loop
> with UB. No idea if that would be easy to avoid.
>
> /Björn
>
>
>> -----Original Message-----
>> From: Philip Reames <listmail at philipreames.com>
>> Sent: den 5 december 2020 00:17
>> To: Björn Pettersson A <bjorn.a.pettersson at ericsson.com>;
llvm-dev <llvm-
>> dev at lists.llvm.org>
>> Subject: Re: [llvm-dev] Is it feasible for LV to vectorize a loop
>> accessing A[i] using VF>2 when A has only 2 elements?
>>
>> The code the vectorizer emits should be correct for any well defined
>> value of N.  (A program which runs this with N > 2 is full UB.)
>>
>> The existing code doesn't (yet) reason about out of bounds accesses
from
>> known object sizes as a means to imply bounds on loop induction
>> variables.  It would be feasible to do so, it's just not something
>> anyone has bothered with.
>>
>> Philip
>>
>> On 12/4/20 2:55 PM, Björn Pettersson A via llvm-dev wrote:
>>> Hi!
>>>
>>> Consider an example like this
>>>
>>> int G[1000];
>>>
>>> void gazonk(int N) {
>>>       int A[2] = {0};
>>>       for (int i = 0; i < N; ++i)
>>>         G[i] = A[i] + i;
>>> }
>>>
>>>
>>> When compiling with "-O3 -emit-llvm" ( see
>> https://protect2.fireeye.com/v1/url?k=ac1ada1d-f3dd08b5-ac1a9a86-
>> 86827b6aebbf-48fe48a3db5edb61&q=1&e=bbfa90b2-917e-4967-ab8e-
>> 91a9b322c814&u=https%3A%2F%2Fgodbolt.org%2Fz%2Ff6jWfM )
>>> the loop is vectorized with VF=4 and we get
>>>
>>>     %2 = alloca i64, align 8
>>>     %3 = bitcast i64* %2 to [2 x i32]*
>>>     ...
>>>    vector.body:
>>>     ...
>>>     %24 = getelementptr inbounds [2 x i32], [2 x i32]* %3, i64 0,
i64
>> %21, !dbg !35
>>>     %25 = bitcast i32* %24 to <4 x i32>*, !dbg !35
>>>     %26 = load <4 x i32>, <4 x i32>* %25, align 8, !dbg
!35, !tbaa !36
>>>     ...
>>>
>>>
>>> Loading <4 x i32> from something pointing into [2 x i32]
seems like a
>> bad thing (UB?).
>>> And I believe that for example BasicAliasAnalysis will assume that
the
>> load won't alias with anything else since the size of the access is
>> greater than the underlying object, so the code in the vector body is
>> just crap afaict.
>>> There are some loop guards that perhaps (hopefully) protects from
>> running the vector body here,
>>> but isn't it a bit weird thing to introduce such code anyway?
>>>
>>> BR,
>>> Björn
>>> _______________________________________________
>>> LLVM Developers mailing list
>>> llvm-dev at lists.llvm.org
>>> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev

llvm dev - Dec 2020 - Is it feasible for LV to vectorize a loop accessing A[i] using VF>2 when A has only 2 elements?

[llvm-dev] Is it feasible for LV to vectorize a loop accessing A[i] using VF>2 when A has only 2 elements?

[llvm-dev] Is it feasible for LV to vectorize a loop accessing A[i] using VF>2 when A has only 2 elements?