thr3ads.net - llvm dev - [llvm-dev] Field sensitive alias analysis? [Dec 2015]

If this information is useful, please help other people find it:
Share via:

Dmitry Polukhin via llvm-dev

2015-Dec-07 15:13 UTC

[llvm-dev] Field sensitive alias analysis?

BTW, I have found why it doesn't work for arrays. TBAA information
propagation is not implemented in CodeGenFunction::EmitArraySubscriptExpr
with "TODO: Preserve/extend path TBAA metadata?".

On Fri, Dec 4, 2015 at 1:38 PM, Dmitry Polukhin <dmitry.polukhin at
gmail.com>
wrote:
> As far as I can see it is specifics of arrays inside structs. Current TBAA
> does distinguish non-array members with path sensitive TBAA (see !1 and !6
> in my example below, TBAA has reference to struct !2 and offset). As for
> arrays information that it was member of some struct get lost completely
> (!7 has nothing about struct !2).
>
> struct S {
>   int a;
>   int b;
>   int c[3];
> };
>
> void foo(struct S* p) {
>   p->a = 1;
>   p->b = 2;
>   p->c[0] = 3;
>   p->c[1] = 4;
> }
>
> define void @foo(%struct.S* nocapture %p) #0 {
> entry:
>   %a = getelementptr inbounds %struct.S, %struct.S* %p, i64 0, i32 0
>   store i32 1, i32* %a, align 4, !tbaa !1
>   %b = getelementptr inbounds %struct.S, %struct.S* %p, i64 0, i32 1
>   store i32 2, i32* %b, align 4, !tbaa !6
>   %arrayidx = getelementptr inbounds %struct.S, %struct.S* %p, i64 0, i32
> 2, i64 0
>   store i32 3, i32* %arrayidx, align 4, !tbaa !7
>   %arrayidx2 = getelementptr inbounds %struct.S, %struct.S* %p, i64 0, i32
> 2, i64 1
>   store i32 4, i32* %arrayidx2, align 4, !tbaa !7
>   ret void
> }
>
> !0 = !{!"clang version 3.8.0 "}
> !1 = !{!2, !3, i64 0}
> !2 = !{!"S", !3, i64 0, !3, i64 4, !4, i64 8}
> !3 = !{!"int", !4, i64 0}
> !4 = !{!"omnipotent char", !5, i64 0}
> !5 = !{!"Simple C/C++ TBAA"}
> !6 = !{!2, !3, i64 4}
> !7 = !{!3, !3, i64 0}
>
> I'm just start learning how TBAA in clang works so I don't know why
it was
> implemented this way.
>
> On Fri, Dec 4, 2015 at 11:06 AM, Vaivaswatha Nagaraj via llvm-dev <
> llvm-dev at lists.llvm.org> wrote:
>
>> Hi,
>>
>> I'm trying to optimize a simple C code and came across a situation
where
>> invariant code is not being moved out:
>>
>> On an -O3 compilation, I noticed that the "load" for the loop
bounds
>> (which remain invariant throughout) happens on each iteration of both
the
>> loops, even though it is not modified anywhere in the function
"bigLoop".
>> It seems that alias analysis is not able to say that the writes to one
>> field in the structure does not impact the other field, leading to LICM
>> being ineffective.
>>
>> Do any of the alias analyses currently have some kind of field
>> sensitivity that can help in this case?
>>
>> ------------------------- test case
------------------------------------
>> #include <stdlib.h>
>> #include <stdio.h>
>>
>> #define SIZE 100
>>
>> struct AS {
>>   int a[SIZE+4];
>>   int size;
>> } A;
>>
>> void bigLoop(void)
>> {
>>   unsigned i, j;
>>
>>   for (i = 0; i < A.size; i++) {
>>     A.a[i+2] +=  A.a[i];
>>   }
>>   for (i = 0; i < A.size; i++) {
>>     A.a[i+2] *=  A.a[i];
>>   }
>> }
>>
>> int main()
>> {
>>   A.size = random()%SIZE;
>>   for (unsigned i = 0; i < A.size; i++) {
>>     A.a[i] = random()%23;
>>   }
>>   bigLoop();
>>   return 0;
>> }
>>
>> Thanks,
>>
>>   - Vaivaswatha
>>
>> _______________________________________________
>> LLVM Developers mailing list
>> llvm-dev at lists.llvm.org
>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>
>>
>-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20151207/0cd727a7/attachment.html>

Jeroen Dobbelaere via llvm-dev

2015-Dec-07 16:05 UTC

head link

[llvm-dev] Field sensitive alias analysis?

Hi Vaivaswatha, Dmitry,

Some more analysis of what goes wrong (both in missed optimizations and wrong
code) in the current tbaa representation can be found in following thread:
http://lists.llvm.org/pipermail/cfe-dev/2015-March/042015.html

unfortunately, there is (as far as I am aware) no solution yet.
(I still want to dive deeper into this, but this is currently not on the top of
my stack :( )

Greetings,

Jeroen Dobbealere


From: llvm-dev [mailto:llvm-dev-bounces at lists.llvm.org] On Behalf Of Dmitry
Polukhin via llvm-dev
Sent: Monday, December 07, 2015 4:13 PM
To: Vaivaswatha Nagaraj
Cc: LLVM Dev
Subject: Re: [llvm-dev] Field sensitive alias analysis?

BTW, I have found why it doesn't work for arrays. TBAA information
propagation is not implemented in CodeGenFunction::EmitArraySubscriptExpr with
"TODO: Preserve/extend path TBAA metadata?".

On Fri, Dec 4, 2015 at 1:38 PM, Dmitry Polukhin <dmitry.polukhin at
gmail.com<mailto:dmitry.polukhin at gmail.com>> wrote:
As far as I can see it is specifics of arrays inside structs. Current TBAA does
distinguish non-array members with path sensitive TBAA (see !1 and !6 in my
example below, TBAA has reference to struct !2 and offset). As for arrays
information that it was member of some struct get lost completely (!7 has
nothing about struct !2).

struct S {
  int a;
  int b;
  int c[3];
};

void foo(struct S* p) {
  p->a = 1;
  p->b = 2;
  p->c[0] = 3;
  p->c[1] = 4;
}

define void @foo(%struct.S* nocapture %p) #0 {
entry:
  %a = getelementptr inbounds %struct.S, %struct.S* %p, i64 0, i32 0
  store i32 1, i32* %a, align 4, !tbaa !1
  %b = getelementptr inbounds %struct.S, %struct.S* %p, i64 0, i32 1
  store i32 2, i32* %b, align 4, !tbaa !6
  %arrayidx = getelementptr inbounds %struct.S, %struct.S* %p, i64 0, i32 2, i64
0
  store i32 3, i32* %arrayidx, align 4, !tbaa !7
  %arrayidx2 = getelementptr inbounds %struct.S, %struct.S* %p, i64 0, i32 2,
i64 1
  store i32 4, i32* %arrayidx2, align 4, !tbaa !7
  ret void
}

!0 = !{!"clang version 3.8.0 "}
!1 = !{!2, !3, i64 0}
!2 = !{!"S", !3, i64 0, !3, i64 4, !4, i64 8}
!3 = !{!"int", !4, i64 0}
!4 = !{!"omnipotent char", !5, i64 0}
!5 = !{!"Simple C/C++ TBAA"}
!6 = !{!2, !3, i64 4}
!7 = !{!3, !3, i64 0}

I'm just start learning how TBAA in clang works so I don't know why it
was implemented this way.

On Fri, Dec 4, 2015 at 11:06 AM, Vaivaswatha Nagaraj via llvm-dev <llvm-dev
at lists.llvm.org<mailto:llvm-dev at lists.llvm.org>> wrote:
Hi,
I'm trying to optimize a simple C code and came across a situation where
invariant code is not being moved out:
On an -O3 compilation, I noticed that the "load" for the loop bounds
(which remain invariant throughout) happens on each iteration of both the loops,
even though it is not modified anywhere in the function "bigLoop". It
seems that alias analysis is not able to say that the writes to one field in the
structure does not impact the other field, leading to LICM being ineffective.
Do any of the alias analyses currently have some kind of field sensitivity that
can help in this case?

------------------------- test case ------------------------------------
#include <stdlib.h>
#include <stdio.h>

#define SIZE 100

struct AS {
  int a[SIZE+4];
  int size;
} A;

void bigLoop(void)
{
  unsigned i, j;

  for (i = 0; i < A.size; i++) {
    A.a[i+2] +=  A.a[i];
  }
  for (i = 0; i < A.size; i++) {
    A.a[i+2] *=  A.a[i];
  }
}

int main()
{
  A.size = random()%SIZE;
  for (unsigned i = 0; i < A.size; i++) {
    A.a[i] = random()%23;
  }
  bigLoop();
  return 0;
}
Thanks,

  - Vaivaswatha

_______________________________________________
LLVM Developers mailing list
llvm-dev at lists.llvm.org<mailto:llvm-dev at lists.llvm.org>
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev


-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20151207/fd21caf4/attachment.html>

Daniel Berlin via llvm-dev

2015-Dec-07 16:43 UTC

head link

[llvm-dev] Field sensitive alias analysis?

TBAA is about types, not accesses.
TBAA "struct path data" is about access paths and types that are the
end of
access paths, for the most part.

It has no notion of access size, etc, only offset.

It is possible to extend it and handle *very basic* cases in the frontend
(IE structs not containing unions anywhere, with constant accesses, etc)
But it would degrade quite quickly (you would likely need a sane way to say
"this is an access to an unknown offset into this type")

Generally, rather than just try to produce metadata in the frontend, most
compilers perform field-sensitive points-to or something similar for
fields, and then rely on data dependence for differentiating array
subscripts.

(This is what GCC does, LLVM has CFL-AA, but it's not field sensitive yet)

So handling .size vs .a[] is probably possible in the frontend.

Doing array subscript analysis in general, probably not something you want
in the frontend.
Handling tricky cases of what pointers to structs point to, probably the
domain of field-sensitive points-to

On Mon, Dec 7, 2015 at 7:13 AM, Dmitry Polukhin via llvm-dev <
llvm-dev at lists.llvm.org> wrote:
> BTW, I have found why it doesn't work for arrays. TBAA information
> propagation is not implemented in CodeGenFunction::EmitArraySubscriptExpr
> with "TODO: Preserve/extend path TBAA metadata?".
>
> On Fri, Dec 4, 2015 at 1:38 PM, Dmitry Polukhin <dmitry.polukhin at
gmail.com
> > wrote:
>
>> As far as I can see it is specifics of arrays inside structs. Current
>> TBAA does distinguish non-array members with path sensitive TBAA (see
!1
>> and !6 in my example below, TBAA has reference to struct !2 and
offset). As
>> for arrays information that it was member of some struct get lost
>> completely (!7 has nothing about struct !2).
>>
>> struct S {
>>   int a;
>>   int b;
>>   int c[3];
>> };
>>
>> void foo(struct S* p) {
>>   p->a = 1;
>>   p->b = 2;
>>   p->c[0] = 3;
>>   p->c[1] = 4;
>> }
>>
>> define void @foo(%struct.S* nocapture %p) #0 {
>> entry:
>>   %a = getelementptr inbounds %struct.S, %struct.S* %p, i64 0, i32 0
>>   store i32 1, i32* %a, align 4, !tbaa !1
>>   %b = getelementptr inbounds %struct.S, %struct.S* %p, i64 0, i32 1
>>   store i32 2, i32* %b, align 4, !tbaa !6
>>   %arrayidx = getelementptr inbounds %struct.S, %struct.S* %p, i64 0,
i32
>> 2, i64 0
>>   store i32 3, i32* %arrayidx, align 4, !tbaa !7
>>   %arrayidx2 = getelementptr inbounds %struct.S, %struct.S* %p, i64 0,
>> i32 2, i64 1
>>   store i32 4, i32* %arrayidx2, align 4, !tbaa !7
>>   ret void
>> }
>>
>> !0 = !{!"clang version 3.8.0 "}
>> !1 = !{!2, !3, i64 0}
>> !2 = !{!"S", !3, i64 0, !3, i64 4, !4, i64 8}
>> !3 = !{!"int", !4, i64 0}
>> !4 = !{!"omnipotent char", !5, i64 0}
>> !5 = !{!"Simple C/C++ TBAA"}
>> !6 = !{!2, !3, i64 4}
>> !7 = !{!3, !3, i64 0}
>>
>> I'm just start learning how TBAA in clang works so I don't know
why it
>> was implemented this way.
>>
>> On Fri, Dec 4, 2015 at 11:06 AM, Vaivaswatha Nagaraj via llvm-dev <
>> llvm-dev at lists.llvm.org> wrote:
>>
>>> Hi,
>>>
>>> I'm trying to optimize a simple C code and came across a
situation where
>>> invariant code is not being moved out:
>>>
>>> On an -O3 compilation, I noticed that the "load" for the
loop bounds
>>> (which remain invariant throughout) happens on each iteration of
both the
>>> loops, even though it is not modified anywhere in the function
"bigLoop".
>>> It seems that alias analysis is not able to say that the writes to
one
>>> field in the structure does not impact the other field, leading to
LICM
>>> being ineffective.
>>>
>>> Do any of the alias analyses currently have some kind of field
>>> sensitivity that can help in this case?
>>>
>>> ------------------------- test case
------------------------------------
>>> #include <stdlib.h>
>>> #include <stdio.h>
>>>
>>> #define SIZE 100
>>>
>>> struct AS {
>>>   int a[SIZE+4];
>>>   int size;
>>> } A;
>>>
>>> void bigLoop(void)
>>> {
>>>   unsigned i, j;
>>>
>>>   for (i = 0; i < A.size; i++) {
>>>     A.a[i+2] +=  A.a[i];
>>>   }
>>>   for (i = 0; i < A.size; i++) {
>>>     A.a[i+2] *=  A.a[i];
>>>   }
>>> }
>>>
>>> int main()
>>> {
>>>   A.size = random()%SIZE;
>>>   for (unsigned i = 0; i < A.size; i++) {
>>>     A.a[i] = random()%23;
>>>   }
>>>   bigLoop();
>>>   return 0;
>>> }
>>>
>>> Thanks,
>>>
>>>   - Vaivaswatha
>>>
>>> _______________________________________________
>>> LLVM Developers mailing list
>>> llvm-dev at lists.llvm.org
>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>
>>>
>>
>
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>
>-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20151207/aa6fcb91/attachment.html>

Dmitry Polukhin via llvm-dev

2015-Dec-08 11:38 UTC

head link

[llvm-dev] Field sensitive alias analysis?

Jeroen, thank you for very useful link with the context. Indeed union cases
are very complicated and I see places in code when TBAA gives up.

Daniel, I completely agree that TBAA has limited power and can solve
relatively simple cases only. So anything more complicated that involves
intermediate variables that points to struct or array elements cannot be
solved by TBAA alone. Differentiating array subscripts also seems like out
of the scope of TBAA but adding type information that this memory assess
accessing array not a random object of given type could be useful.
Therefore TBAA can compliment points-to analysis and sometimes help in
cases when points-to may not have enough information.

On Mon, Dec 7, 2015 at 7:43 PM, Daniel Berlin <dberlin at dberlin.org>
wrote:
> TBAA is about types, not accesses.
> TBAA "struct path data" is about access paths and types that are
the end
> of access paths, for the most part.
>
> It has no notion of access size, etc, only offset.
>
> It is possible to extend it and handle *very basic* cases in the frontend
> (IE structs not containing unions anywhere, with constant accesses, etc)
> But it would degrade quite quickly (you would likely need a sane way to
> say "this is an access to an unknown offset into this type")
>
> Generally, rather than just try to produce metadata in the frontend, most
> compilers perform field-sensitive points-to or something similar for
> fields, and then rely on data dependence for differentiating array
> subscripts.
>
> (This is what GCC does, LLVM has CFL-AA, but it's not field sensitive
yet)
>
> So handling .size vs .a[] is probably possible in the frontend.
>
> Doing array subscript analysis in general, probably not something you want
> in the frontend.
> Handling tricky cases of what pointers to structs point to, probably the
> domain of field-sensitive points-to
>
> On Mon, Dec 7, 2015 at 7:13 AM, Dmitry Polukhin via llvm-dev <
> llvm-dev at lists.llvm.org> wrote:
>
>> BTW, I have found why it doesn't work for arrays. TBAA information
>> propagation is not implemented in
CodeGenFunction::EmitArraySubscriptExpr
>> with "TODO: Preserve/extend path TBAA metadata?".
>>
>> On Fri, Dec 4, 2015 at 1:38 PM, Dmitry Polukhin <
>> dmitry.polukhin at gmail.com> wrote:
>>
>>> As far as I can see it is specifics of arrays inside structs.
Current
>>> TBAA does distinguish non-array members with path sensitive TBAA
(see !1
>>> and !6 in my example below, TBAA has reference to struct !2 and
offset). As
>>> for arrays information that it was member of some struct get lost
>>> completely (!7 has nothing about struct !2).
>>>
>>> struct S {
>>>   int a;
>>>   int b;
>>>   int c[3];
>>> };
>>>
>>> void foo(struct S* p) {
>>>   p->a = 1;
>>>   p->b = 2;
>>>   p->c[0] = 3;
>>>   p->c[1] = 4;
>>> }
>>>
>>> define void @foo(%struct.S* nocapture %p) #0 {
>>> entry:
>>>   %a = getelementptr inbounds %struct.S, %struct.S* %p, i64 0, i32
0
>>>   store i32 1, i32* %a, align 4, !tbaa !1
>>>   %b = getelementptr inbounds %struct.S, %struct.S* %p, i64 0, i32
1
>>>   store i32 2, i32* %b, align 4, !tbaa !6
>>>   %arrayidx = getelementptr inbounds %struct.S, %struct.S* %p, i64
0,
>>> i32 2, i64 0
>>>   store i32 3, i32* %arrayidx, align 4, !tbaa !7
>>>   %arrayidx2 = getelementptr inbounds %struct.S, %struct.S* %p, i64
0,
>>> i32 2, i64 1
>>>   store i32 4, i32* %arrayidx2, align 4, !tbaa !7
>>>   ret void
>>> }
>>>
>>> !0 = !{!"clang version 3.8.0 "}
>>> !1 = !{!2, !3, i64 0}
>>> !2 = !{!"S", !3, i64 0, !3, i64 4, !4, i64 8}
>>> !3 = !{!"int", !4, i64 0}
>>> !4 = !{!"omnipotent char", !5, i64 0}
>>> !5 = !{!"Simple C/C++ TBAA"}
>>> !6 = !{!2, !3, i64 4}
>>> !7 = !{!3, !3, i64 0}
>>>
>>> I'm just start learning how TBAA in clang works so I don't
know why it
>>> was implemented this way.
>>>
>>> On Fri, Dec 4, 2015 at 11:06 AM, Vaivaswatha Nagaraj via llvm-dev
<
>>> llvm-dev at lists.llvm.org> wrote:
>>>
>>>> Hi,
>>>>
>>>> I'm trying to optimize a simple C code and came across a
situation
>>>> where invariant code is not being moved out:
>>>>
>>>> On an -O3 compilation, I noticed that the "load" for
the loop bounds
>>>> (which remain invariant throughout) happens on each iteration
of both the
>>>> loops, even though it is not modified anywhere in the function
"bigLoop".
>>>> It seems that alias analysis is not able to say that the writes
to one
>>>> field in the structure does not impact the other field, leading
to LICM
>>>> being ineffective.
>>>>
>>>> Do any of the alias analyses currently have some kind of field
>>>> sensitivity that can help in this case?
>>>>
>>>> ------------------------- test case
------------------------------------
>>>> #include <stdlib.h>
>>>> #include <stdio.h>
>>>>
>>>> #define SIZE 100
>>>>
>>>> struct AS {
>>>>   int a[SIZE+4];
>>>>   int size;
>>>> } A;
>>>>
>>>> void bigLoop(void)
>>>> {
>>>>   unsigned i, j;
>>>>
>>>>   for (i = 0; i < A.size; i++) {
>>>>     A.a[i+2] +=  A.a[i];
>>>>   }
>>>>   for (i = 0; i < A.size; i++) {
>>>>     A.a[i+2] *=  A.a[i];
>>>>   }
>>>> }
>>>>
>>>> int main()
>>>> {
>>>>   A.size = random()%SIZE;
>>>>   for (unsigned i = 0; i < A.size; i++) {
>>>>     A.a[i] = random()%23;
>>>>   }
>>>>   bigLoop();
>>>>   return 0;
>>>> }
>>>>
>>>> Thanks,
>>>>
>>>>   - Vaivaswatha
>>>>
>>>> _______________________________________________
>>>> LLVM Developers mailing list
>>>> llvm-dev at lists.llvm.org
>>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>>
>>>>
>>>
>>
>> _______________________________________________
>> LLVM Developers mailing list
>> llvm-dev at lists.llvm.org
>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>
>>
>-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20151208/9cb9e557/attachment.html>

Possibly Parallel Threads

Search for more possibly parallel threads

llvm dev - Dec 2015 - Field sensitive alias analysis?

[llvm-dev] Field sensitive alias analysis?

[llvm-dev] Field sensitive alias analysis?

[llvm-dev] Field sensitive alias analysis?

[llvm-dev] Field sensitive alias analysis?

Possibly Parallel Threads