Jeroen Dobbelaere via llvm-dev
2019-Jun-05 12:45 UTC
[llvm-dev] llvm-ir: TBAA and struct copies
Hi Ivan,
The code that we have is indeed different from what the 'standard llvm'
expects. Let me explain:
in our version we came into this situation in two steps:
1) I added support for 'special types' that map directly to types
supported by hardware.
These types are represented by a struct containing a single iXXX member,
providing the necessary bits
of the type, and at the same time avoiding any change in the bits (if it would
have been represented
by a pure iXXX). They need to be handled similarly as a fundamental type, so we
provided support for that,
resulting in clang generating load/store instructions that work on a
'struct'.
In order to get this working (for tbaa), I had to relax the check for access
type, as it now could see an aggregate type
(that behaved as a fundamental type).
2) later on, we noticed that sroa (if I remember correctly) produced better code
if it could work on load/store instructions of aggregates instead of
llvm.memcpy. I adapted clang to also emit load/store pairs for small structs.
Just like with the special 'struct type', we the load/store's are
decorated with
the struct and the size of the struct as access TBAAAccessInfo information.
At that time all looked fine, but recently, csmith testing has been showing
cases where alias analysis is now too aggressive.
An easy way out is of course to fall back to llvm.memcpy, but that blocks a lot
of possible optimizations, as it loses all the TBAA information.
(In the original example: because of the llvm.memcpy, it does no matter if the
*p=y copy is for a 'S' or for a different 'struct T'. In both
cases, it blocks the optimization of the 'x.f' )
With the current understanding that I have about the problem and about the TBAA
implementation, I am not sure that it is easy to fix with the default struct
path tbaa.
With the 'new struct path tbaa', as it also tracks the access size, I do
see a possible solution:
in TypeBasedAliasAnalysis.cpp, in function bool mayBeAccessToSubobjectOf(...):
https://github.com/llvm/llvm-project/blob/master/llvm/lib/Analysis/TypeBasedAliasAnalysis.cpp#L619
if (BaseType.getNode() == SubobjectTag.getBaseType()) {
bool SameMemberAccess = OffsetInBase == SubobjectTag.getOffset(); //
**** HERE
if (GenericTag) {
*GenericTag = SameMemberAccess ? SubobjectTag.getNode() :
createAccessTag(CommonType);
}
MayAlias = SameMemberAccess;
return true;
}
For the 'NewFormat', The 'MayAlias' should be based on overlap
of {OffsetInBase, BaseTag.getSize} and {SubobjecTag.getOffset(),
SubobjectTag.getSize()}
I am not sure if the same holds for the 'GenericTag'.
Do you think that would cause other problems ?
Thanks,
Jeroen Dobbelaere
> -----Original Message-----
> From: Ivan Kosarev <ivan at kosarev.info>
> Sent: Wednesday, June 5, 2019 12:25
> To: Jeroen Dobbelaere <Jeroen.Dobbelaere at synopsys.com>; llvm-
> dev at lists.llvm.org
> Cc: Ivan Kosarev <ikosarev at accesssoftek.com>
> Subject: Re: [llvm-dev] llvm-ir: TBAA and struct copies
>
> Hello Jeroen,
>
> AFAIR, with the current TBAA format the access type shall never be an
> aggregate, so the !10 node looks suspicious.
>
> Tried to reproduce this on 362464 with that same input snippet, but got
> something completely different, see below.
>
> Regards,
>
> ---
> $ clang -cc1 -S -emit-llvm -O1 -disable-llvm-passes x.cpp
> $ cat x.ll
> ...
> define i32 @_Z6foobarv() #0 {
> entry:
> store i16 42, i16* getelementptr inbounds (%struct.S, %struct.S* @x,
> i32 0, i32 2), align 2, !tbaa !2
> %0 = load %struct.S*, %struct.S** @p, align 8, !tbaa !8
> %1 = bitcast %struct.S* %0 to i8*
> call void @llvm.memcpy.p0i8.p0i8.i64(i8* align 4 %1, i8* align 4
> bitcast (%struct.S* @y to i8*), i64 8, i1 false), !tbaa.struct !10
> %2 = load i16, i16* getelementptr inbounds (%struct.S, %struct.S* @x,
> i32 0, i32 2), align 2, !tbaa !2
> %conv = sext i16 %2 to i32
> ret i32 %conv
> }
> ...
> !2 = !{!3, !7, i64 6}
> !3 = !{!"_ZTS1S", !4, i64 0, !7, i64 4, !7, i64 6}
> !4 = !{!"int", !5, i64 0}
> !5 = !{!"omnipotent char", !6, i64 0}
> !6 = !{!"Simple C++ TBAA"}
> !7 = !{!"short", !5, i64 0}
> !8 = !{!9, !9, i64 0}
> !9 = !{!"any pointer", !5, i64 0}
> !10 = !{i64 0, i64 4, !11, i64 4, i64 2, !12, i64 6, i64 2, !12}
> !11 = !{!4, !4, i64 0}
> !12 = !{!7, !7, i64 0}
> ---
>
>
>
> On 04/06/2019 17:34, Jeroen Dobbelaere via llvm-dev wrote:
> > Hi,
> >
> > I have a question about the current definition of TBAA (See [1]).
> >
> > In the LLVM-IR code that we produce, we generate load/stores of struct
> types. (See [2] and [3] for a godbolt example showing the issue)
> >
> > For following c-alike code:
> >
> > struct S { int dummy; short e, f; } x,y;
> > struct S* p = &x;
> > int foobar() {
> > x.f=42;
> > *p=y; //**** struct copy
> > return x.f;
> > }
> >
> > We produce:
> >
> > [...]
> > %struct.S = type { i32, i16, i16 }
> > ; Function Attrs: norecurse nounwind uwtable
> > define dso_local i32 @foobar() local_unnamed_addr #0 {
> > store i16 42, i16* getelementptr inbounds (%struct.S, %struct.S*
@x, i64
> 0, i32 2), align 2, !tbaa !2
> > %1 = load %struct.S*, %struct.S** @p, align 8, !tbaa !8
> > %2 = load %struct.S, %struct.S* @y, align 8, !tbaa !10
;*************
> > store %struct.S %2, %struct.S* %1, align 4, !tbaa !10 ;
*************
> > %3 = load i16, i16* getelementptr inbounds (%struct.S, %struct.S*
@x,
> i64 0, i32 2), align 2, !tbaa !2
> > %4 = sext i16 %3 to i32
> > ret i32 %4
> > }
> > [...]
> > !0 = !{i32 1, !"wchar_size", i32 4}
> > !1 = !{!"clang version 9.0.0 (trunk 362464)"}
> > !2 = !{!3, !7, i64 6}
> > !3 = !{!"_ZTS1S", !4, i64 0, !7, i64 4, !7, i64 6}
> > !4 = !{!"int", !5, i64 0}
> > !5 = !{!"omnipotent char", !6, i64 0}
> > !6 = !{!"Simple C++ TBAA"}
> > !7 = !{!"short", !5, i64 0}
> > !8 = !{!9, !9, i64 0}
> > !9 = !{!"any pointer", !5, i64 0}
> > !10 = !{!3, !3, i64 0} ; ***********************
> >
> >
> > In order to allow other optimizations, we have also attached TBAA
> information to the load/store of the struct (!tbaa !10).
> > The logical solution would be to construct '!tbaa !10' as
shown:
> > - the base type is '!3 (_ZTS1S)'
> > - the access type is also '!3 (_ZTS1S)'
> > - the offset is 'i64 0'
> >
> > I observe following issues:
> > - issue 1: the tbaa semantics will not detect aliasing between
'!10' and '!2'
> (See [2] and [3]; the load i16 in bar_wrong should not be optimized away)
> > - issue 2: according to the pure definition of the 'access
tags', the base
> type and the access type can not be the same struct type.
> > As such, the provided example could be found to be
'invalid'. Still, by
> adding an extra indirection, a similar 'valid' example (with wrong
behavior)
> > can be constructed. See [3]
> >
> > - For 'issue 2', I think that the definition of 'access
tag' should be relaxed,
> allowing the description of a full copy' of a struct.
> > - For 'issue 1', the lookup algorithm should be enhanced so
that aliasing
> can be detected for these cases.
> >
> > Is this a correct interpretation ? Any input is welcome !
> >
> > Thanks,
> >
> > Jeroen Dobbelaere
> > ----
> > [1] tbaa metadata: https://urldefense.proofpoint.com/v2/url?u=https-
> 3A__llvm.org_docs_LangRef.html-23tbaa-
> 2Dmetadata&d=DwIDaQ&c=DPL6_X_6JkXFx7AXWqB0tg&r=ELyOnT0WepII6
> UnFk-OSzxlGOXXSfAvOLT6E8iPwwJk&m=b7XmqIDwVRomY-
> z_Vgfx6dv1OKq85hoK5rRXP8m6Rxs&s=TmYSKjBxhgw5sSVeq-
> v7p1UmvzV_81OIMz-_pFakoZQ&e> > [2] godbolt example with single
struct:
> https://urldefense.proofpoint.com/v2/url?u=https-
> 3A__www.godbolt.org_z_XcQy-
> 2DU&d=DwIDaQ&c=DPL6_X_6JkXFx7AXWqB0tg&r=ELyOnT0WepII6UnFk-
> OSzxlGOXXSfAvOLT6E8iPwwJk&m=b7XmqIDwVRomY-
> z_Vgfx6dv1OKq85hoK5rRXP8m6Rxs&s=4OL_Z9OECS7bHERaUU-
> WMxWhDPt-f_17zpehoMGVgj4&e> > [3] godbolt example with nested
struct:
> https://urldefense.proofpoint.com/v2/url?u=https-
> 3A__www.godbolt.org_z_mNd-
> 2DiI&d=DwIDaQ&c=DPL6_X_6JkXFx7AXWqB0tg&r=ELyOnT0WepII6UnFk-
> OSzxlGOXXSfAvOLT6E8iPwwJk&m=b7XmqIDwVRomY-
> z_Vgfx6dv1OKq85hoK5rRXP8m6Rxs&s=0y6mskMKXX94CzNTFu1wVHF5m3Oc
> RPyoZBc-ShGZrqM&e> >
> > _______________________________________________
> > LLVM Developers mailing list
> > llvm-dev at lists.llvm.org
> >
https://urldefense.proofpoint.com/v2/url?u=https-3A__lists.llvm.org_cgi-
> 2Dbin_mailman_listinfo_llvm-
> 2Ddev&d=DwIDaQ&c=DPL6_X_6JkXFx7AXWqB0tg&r=ELyOnT0WepII6UnFk-
> OSzxlGOXXSfAvOLT6E8iPwwJk&m=b7XmqIDwVRomY-
> z_Vgfx6dv1OKq85hoK5rRXP8m6Rxs&s=410QYrHOxKaioN8wS4O3Sgfu9vh-
> 2Jnl5Z-xDqr94r0&e=
Ivan Kosarev via llvm-dev
2019-Jun-05 13:38 UTC
[llvm-dev] llvm-ir: TBAA and struct copies
Hi Jeroen,
> With the current understanding that I have about the problem and
about the TBAA implementation, I am not sure that it is easy to fix with
the default struct path tbaa.
To handle aggregate accesses correctly, we need to know the size of the
access in addition to the offset. Since the current format doesn't store
sizes, it doesn't seem possible to do much more than we already do about
aggregate accesses. This is further complicated by the fact that calls
like memcpy() have their own format of metadata nodes.
> For the 'NewFormat', The 'MayAlias' should be based on
overlap of
{OffsetInBase, BaseTag.getSize} and {SubobjecTag.getOffset(),
SubobjectTag.getSize()}
> I am not sure if the same holds for the 'GenericTag'.
Correct. As to the generic tag, this needs some very careful analysis
and extensive testing, but generally I would expect it to be built so that:
- its base type is the matching type in the considered access chains,
- its access type is the generic type of the two access types being
matched and
- its range (offset + size) is the union of the matched access ranges.
Regards,
On 05/06/2019 15:45, Jeroen Dobbelaere wrote:> Hi Ivan,
>
> The code that we have is indeed different from what the 'standard
llvm' expects. Let me explain:
>
> in our version we came into this situation in two steps:
> 1) I added support for 'special types' that map directly to types
supported by hardware.
> These types are represented by a struct containing a single iXXX member,
providing the necessary bits
> of the type, and at the same time avoiding any change in the bits (if it
would have been represented
> by a pure iXXX). They need to be handled similarly as a fundamental type,
so we provided support for that,
> resulting in clang generating load/store instructions that work on a
'struct'.
>
> In order to get this working (for tbaa), I had to relax the check for
access type, as it now could see an aggregate type
> (that behaved as a fundamental type).
>
> 2) later on, we noticed that sroa (if I remember correctly) produced better
code if it could work on load/store instructions of aggregates instead of
> llvm.memcpy. I adapted clang to also emit load/store pairs for small
structs. Just like with the special 'struct type', we the
load/store's are decorated with
> the struct and the size of the struct as access TBAAAccessInfo information.
>
> At that time all looked fine, but recently, csmith testing has been showing
cases where alias analysis is now too aggressive.
>
> An easy way out is of course to fall back to llvm.memcpy, but that blocks a
lot of possible optimizations, as it loses all the TBAA information.
>
> (In the original example: because of the llvm.memcpy, it does no matter if
the *p=y copy is for a 'S' or for a different 'struct T'. In
both cases, it blocks the optimization of the 'x.f' )
>
> With the current understanding that I have about the problem and about the
TBAA implementation, I am not sure that it is easy to fix with the default
struct path tbaa.
> With the 'new struct path tbaa', as it also tracks the access size,
I do see a possible solution:
>
> in TypeBasedAliasAnalysis.cpp, in function bool
mayBeAccessToSubobjectOf(...):
>
>
https://github.com/llvm/llvm-project/blob/master/llvm/lib/Analysis/TypeBasedAliasAnalysis.cpp#L619
>
> if (BaseType.getNode() == SubobjectTag.getBaseType()) {
> bool SameMemberAccess = OffsetInBase == SubobjectTag.getOffset();
// **** HERE
> if (GenericTag) {
> *GenericTag = SameMemberAccess ? SubobjectTag.getNode() :
> createAccessTag(CommonType);
> }
> MayAlias = SameMemberAccess;
> return true;
> }
>
> For the 'NewFormat', The 'MayAlias' should be based on
overlap of {OffsetInBase, BaseTag.getSize} and {SubobjecTag.getOffset(),
SubobjectTag.getSize()}
> I am not sure if the same holds for the 'GenericTag'.
>
> Do you think that would cause other problems ?
>
> Thanks,
>
> Jeroen Dobbelaere
>
>> -----Original Message-----
>> From: Ivan Kosarev <ivan at kosarev.info>
>> Sent: Wednesday, June 5, 2019 12:25
>> To: Jeroen Dobbelaere <Jeroen.Dobbelaere at synopsys.com>; llvm-
>> dev at lists.llvm.org
>> Cc: Ivan Kosarev <ikosarev at accesssoftek.com>
>> Subject: Re: [llvm-dev] llvm-ir: TBAA and struct copies
>>
>> Hello Jeroen,
>>
>> AFAIR, with the current TBAA format the access type shall never be an
>> aggregate, so the !10 node looks suspicious.
>>
>> Tried to reproduce this on 362464 with that same input snippet, but got
>> something completely different, see below.
>>
>> Regards,
>>
>> ---
>> $ clang -cc1 -S -emit-llvm -O1 -disable-llvm-passes x.cpp
>> $ cat x.ll
>> ...
>> define i32 @_Z6foobarv() #0 {
>> entry:
>> store i16 42, i16* getelementptr inbounds (%struct.S, %struct.S*
@x,
>> i32 0, i32 2), align 2, !tbaa !2
>> %0 = load %struct.S*, %struct.S** @p, align 8, !tbaa !8
>> %1 = bitcast %struct.S* %0 to i8*
>> call void @llvm.memcpy.p0i8.p0i8.i64(i8* align 4 %1, i8* align 4
>> bitcast (%struct.S* @y to i8*), i64 8, i1 false), !tbaa.struct !10
>> %2 = load i16, i16* getelementptr inbounds (%struct.S, %struct.S*
@x,
>> i32 0, i32 2), align 2, !tbaa !2
>> %conv = sext i16 %2 to i32
>> ret i32 %conv
>> }
>> ...
>> !2 = !{!3, !7, i64 6}
>> !3 = !{!"_ZTS1S", !4, i64 0, !7, i64 4, !7, i64 6}
>> !4 = !{!"int", !5, i64 0}
>> !5 = !{!"omnipotent char", !6, i64 0}
>> !6 = !{!"Simple C++ TBAA"}
>> !7 = !{!"short", !5, i64 0}
>> !8 = !{!9, !9, i64 0}
>> !9 = !{!"any pointer", !5, i64 0}
>> !10 = !{i64 0, i64 4, !11, i64 4, i64 2, !12, i64 6, i64 2, !12}
>> !11 = !{!4, !4, i64 0}
>> !12 = !{!7, !7, i64 0}
>> ---
>>
>>
>>
>> On 04/06/2019 17:34, Jeroen Dobbelaere via llvm-dev wrote:
>>> Hi,
>>>
>>> I have a question about the current definition of TBAA (See [1]).
>>>
>>> In the LLVM-IR code that we produce, we generate load/stores of
struct
>> types. (See [2] and [3] for a godbolt example showing the issue)
>>> For following c-alike code:
>>>
>>> struct S { int dummy; short e, f; } x,y;
>>> struct S* p = &x;
>>> int foobar() {
>>> x.f=42;
>>> *p=y; //**** struct copy
>>> return x.f;
>>> }
>>>
>>> We produce:
>>>
>>> [...]
>>> %struct.S = type { i32, i16, i16 }
>>> ; Function Attrs: norecurse nounwind uwtable
>>> define dso_local i32 @foobar() local_unnamed_addr #0 {
>>> store i16 42, i16* getelementptr inbounds (%struct.S,
%struct.S* @x, i64
>> 0, i32 2), align 2, !tbaa !2
>>> %1 = load %struct.S*, %struct.S** @p, align 8, !tbaa !8
>>> %2 = load %struct.S, %struct.S* @y, align 8, !tbaa !10
;*************
>>> store %struct.S %2, %struct.S* %1, align 4, !tbaa !10 ;
*************
>>> %3 = load i16, i16* getelementptr inbounds (%struct.S,
%struct.S* @x,
>> i64 0, i32 2), align 2, !tbaa !2
>>> %4 = sext i16 %3 to i32
>>> ret i32 %4
>>> }
>>> [...]
>>> !0 = !{i32 1, !"wchar_size", i32 4}
>>> !1 = !{!"clang version 9.0.0 (trunk 362464)"}
>>> !2 = !{!3, !7, i64 6}
>>> !3 = !{!"_ZTS1S", !4, i64 0, !7, i64 4, !7, i64 6}
>>> !4 = !{!"int", !5, i64 0}
>>> !5 = !{!"omnipotent char", !6, i64 0}
>>> !6 = !{!"Simple C++ TBAA"}
>>> !7 = !{!"short", !5, i64 0}
>>> !8 = !{!9, !9, i64 0}
>>> !9 = !{!"any pointer", !5, i64 0}
>>> !10 = !{!3, !3, i64 0} ; ***********************
>>>
>>>
>>> In order to allow other optimizations, we have also attached TBAA
>> information to the load/store of the struct (!tbaa !10).
>>> The logical solution would be to construct '!tbaa !10' as
shown:
>>> - the base type is '!3 (_ZTS1S)'
>>> - the access type is also '!3 (_ZTS1S)'
>>> - the offset is 'i64 0'
>>>
>>> I observe following issues:
>>> - issue 1: the tbaa semantics will not detect aliasing between
'!10' and '!2'
>> (See [2] and [3]; the load i16 in bar_wrong should not be optimized
away)
>>> - issue 2: according to the pure definition of the 'access
tags', the base
>> type and the access type can not be the same struct type.
>>> As such, the provided example could be found to be
'invalid'. Still, by
>> adding an extra indirection, a similar 'valid' example (with
wrong behavior)
>>> can be constructed. See [3]
>>>
>>> - For 'issue 2', I think that the definition of
'access tag' should be relaxed,
>> allowing the description of a full copy' of a struct.
>>> - For 'issue 1', the lookup algorithm should be
enhanced so that aliasing
>> can be detected for these cases.
>>> Is this a correct interpretation ? Any input is welcome !
>>>
>>> Thanks,
>>>
>>> Jeroen Dobbelaere
>>> ----
>>> [1] tbaa metadata:
https://urldefense.proofpoint.com/v2/url?u=https-
>> 3A__llvm.org_docs_LangRef.html-23tbaa-
>>
2Dmetadata&d=DwIDaQ&c=DPL6_X_6JkXFx7AXWqB0tg&r=ELyOnT0WepII6
>> UnFk-OSzxlGOXXSfAvOLT6E8iPwwJk&m=b7XmqIDwVRomY-
>> z_Vgfx6dv1OKq85hoK5rRXP8m6Rxs&s=TmYSKjBxhgw5sSVeq-
>> v7p1UmvzV_81OIMz-_pFakoZQ&e>>> [2] godbolt example with
single struct:
>> https://urldefense.proofpoint.com/v2/url?u=https-
>> 3A__www.godbolt.org_z_XcQy-
>> 2DU&d=DwIDaQ&c=DPL6_X_6JkXFx7AXWqB0tg&r=ELyOnT0WepII6UnFk-
>> OSzxlGOXXSfAvOLT6E8iPwwJk&m=b7XmqIDwVRomY-
>> z_Vgfx6dv1OKq85hoK5rRXP8m6Rxs&s=4OL_Z9OECS7bHERaUU-
>> WMxWhDPt-f_17zpehoMGVgj4&e>>> [3] godbolt example with
nested struct:
>> https://urldefense.proofpoint.com/v2/url?u=https-
>> 3A__www.godbolt.org_z_mNd-
>> 2DiI&d=DwIDaQ&c=DPL6_X_6JkXFx7AXWqB0tg&r=ELyOnT0WepII6UnFk-
>> OSzxlGOXXSfAvOLT6E8iPwwJk&m=b7XmqIDwVRomY-
>> z_Vgfx6dv1OKq85hoK5rRXP8m6Rxs&s=0y6mskMKXX94CzNTFu1wVHF5m3Oc
>> RPyoZBc-ShGZrqM&e>>>
_______________________________________________
>>> LLVM Developers mailing list
>>> llvm-dev at lists.llvm.org
>>>
https://urldefense.proofpoint.com/v2/url?u=https-3A__lists.llvm.org_cgi-
>> 2Dbin_mailman_listinfo_llvm-
>>
2Ddev&d=DwIDaQ&c=DPL6_X_6JkXFx7AXWqB0tg&r=ELyOnT0WepII6UnFk-
>> OSzxlGOXXSfAvOLT6E8iPwwJk&m=b7XmqIDwVRomY-
>> z_Vgfx6dv1OKq85hoK5rRXP8m6Rxs&s=410QYrHOxKaioN8wS4O3Sgfu9vh-
>> 2Jnl5Z-xDqr94r0&e=
Jeroen Dobbelaere via llvm-dev
2019-Jun-06 14:50 UTC
[llvm-dev] llvm-ir: TBAA and struct copies
Thanks ! I'l try to come up with a patch to achieve this. Greetings, Jeroen Dobbelaere> -----Original Message----- > From: Ivan Kosarev <ivan at kosarev.info> > Sent: Wednesday, June 5, 2019 15:39 > To: Jeroen Dobbelaere <Jeroen.Dobbelaere at synopsys.com>; llvm- > dev at lists.llvm.org > Cc: Ivan Kosarev <ikosarev at accesssoftek.com> > Subject: Re: [llvm-dev] llvm-ir: TBAA and struct copies > > Hi Jeroen, > > > With the current understanding that I have about the problem and > about the TBAA implementation, I am not sure that it is easy to fix with > the default struct path tbaa. > > To handle aggregate accesses correctly, we need to know the size of the > access in addition to the offset. Since the current format doesn't store > sizes, it doesn't seem possible to do much more than we already do about > aggregate accesses. This is further complicated by the fact that calls > like memcpy() have their own format of metadata nodes. > > > For the 'NewFormat', The 'MayAlias' should be based on overlap of > {OffsetInBase, BaseTag.getSize} and {SubobjecTag.getOffset(), > SubobjectTag.getSize()} > > I am not sure if the same holds for the 'GenericTag'. > > Correct. As to the generic tag, this needs some very careful analysis > and extensive testing, but generally I would expect it to be built so that: > - its base type is the matching type in the considered access chains, > - its access type is the generic type of the two access types being > matched and > - its range (offset + size) is the union of the matched access ranges. > > Regards, > > > On 05/06/2019 15:45, Jeroen Dobbelaere wrote: > > Hi Ivan, > > > > The code that we have is indeed different from what the 'standard llvm' > expects. Let me explain: > > > > in our version we came into this situation in two steps: > > 1) I added support for 'special types' that map directly to types supported > by hardware. > > These types are represented by a struct containing a single iXXX member, > providing the necessary bits > > of the type, and at the same time avoiding any change in the bits (if it > would have been represented > > by a pure iXXX). They need to be handled similarly as a fundamental type, > so we provided support for that, > > resulting in clang generating load/store instructions that work on a 'struct'. > > > > In order to get this working (for tbaa), I had to relax the check for access > type, as it now could see an aggregate type > > (that behaved as a fundamental type). > > > > 2) later on, we noticed that sroa (if I remember correctly) produced better > code if it could work on load/store instructions of aggregates instead of > > llvm.memcpy. I adapted clang to also emit load/store pairs for small structs. > Just like with the special 'struct type', we the load/store's are decorated with > > the struct and the size of the struct as access TBAAAccessInfo information. > > > > At that time all looked fine, but recently, csmith testing has been showing > cases where alias analysis is now too aggressive. > > > > An easy way out is of course to fall back to llvm.memcpy, but that blocks a > lot of possible optimizations, as it loses all the TBAA information. > > > > (In the original example: because of the llvm.memcpy, it does no matter if > the *p=y copy is for a 'S' or for a different 'struct T'. In both cases, it blocks > the optimization of the 'x.f' ) > > > > With the current understanding that I have about the problem and about > the TBAA implementation, I am not sure that it is easy to fix with the default > struct path tbaa. > > With the 'new struct path tbaa', as it also tracks the access size, I do see a > possible solution: > > > > in TypeBasedAliasAnalysis.cpp, in function bool > mayBeAccessToSubobjectOf(...): > > > > https://urldefense.proofpoint.com/v2/url?u=https- > 3A__github.com_llvm_llvm- > 2Dproject_blob_master_llvm_lib_Analysis_TypeBasedAliasAnalysis.cpp- > 23L619&d=DwIDaQ&c=DPL6_X_6JkXFx7AXWqB0tg&r=ELyOnT0WepII6UnFk- > OSzxlGOXXSfAvOLT6E8iPwwJk&m=wjZvxSyGva9_KdUwg3GJlKe3d- > Xupw002Pbq1baYxCE&s=KoEC1PwWo0_GryWTGjQ5nhZdePTLJ8nlEcOOs0C_ > Ov8&e> > > > if (BaseType.getNode() == SubobjectTag.getBaseType()) { > > bool SameMemberAccess = OffsetInBase => SubobjectTag.getOffset(); // **** HERE > > if (GenericTag) { > > *GenericTag = SameMemberAccess ? SubobjectTag.getNode() : > > createAccessTag(CommonType); > > } > > MayAlias = SameMemberAccess; > > return true; > > } > > > > For the 'NewFormat', The 'MayAlias' should be based on overlap of > {OffsetInBase, BaseTag.getSize} and {SubobjecTag.getOffset(), > SubobjectTag.getSize()} > > I am not sure if the same holds for the 'GenericTag'. > > > > Do you think that would cause other problems ? > > > > Thanks, > > > > Jeroen Dobbelaere > > > >> -----Original Message----- > >> From: Ivan Kosarev <ivan at kosarev.info> > >> Sent: Wednesday, June 5, 2019 12:25 > >> To: Jeroen Dobbelaere <Jeroen.Dobbelaere at synopsys.com>; llvm- > >> dev at lists.llvm.org > >> Cc: Ivan Kosarev <ikosarev at accesssoftek.com> > >> Subject: Re: [llvm-dev] llvm-ir: TBAA and struct copies > >> > >> Hello Jeroen, > >> > >> AFAIR, with the current TBAA format the access type shall never be an > >> aggregate, so the !10 node looks suspicious. > >> > >> Tried to reproduce this on 362464 with that same input snippet, but got > >> something completely different, see below. > >> > >> Regards, > >> > >> --- > >> $ clang -cc1 -S -emit-llvm -O1 -disable-llvm-passes x.cpp > >> $ cat x.ll > >> ... > >> define i32 @_Z6foobarv() #0 { > >> entry: > >> store i16 42, i16* getelementptr inbounds (%struct.S, %struct.S* @x, > >> i32 0, i32 2), align 2, !tbaa !2 > >> %0 = load %struct.S*, %struct.S** @p, align 8, !tbaa !8 > >> %1 = bitcast %struct.S* %0 to i8* > >> call void @llvm.memcpy.p0i8.p0i8.i64(i8* align 4 %1, i8* align 4 > >> bitcast (%struct.S* @y to i8*), i64 8, i1 false), !tbaa.struct !10 > >> %2 = load i16, i16* getelementptr inbounds (%struct.S, %struct.S* @x, > >> i32 0, i32 2), align 2, !tbaa !2 > >> %conv = sext i16 %2 to i32 > >> ret i32 %conv > >> } > >> ... > >> !2 = !{!3, !7, i64 6} > >> !3 = !{!"_ZTS1S", !4, i64 0, !7, i64 4, !7, i64 6} > >> !4 = !{!"int", !5, i64 0} > >> !5 = !{!"omnipotent char", !6, i64 0} > >> !6 = !{!"Simple C++ TBAA"} > >> !7 = !{!"short", !5, i64 0} > >> !8 = !{!9, !9, i64 0} > >> !9 = !{!"any pointer", !5, i64 0} > >> !10 = !{i64 0, i64 4, !11, i64 4, i64 2, !12, i64 6, i64 2, !12} > >> !11 = !{!4, !4, i64 0} > >> !12 = !{!7, !7, i64 0} > >> --- > >> > >> > >> > >> On 04/06/2019 17:34, Jeroen Dobbelaere via llvm-dev wrote: > >>> Hi, > >>> > >>> I have a question about the current definition of TBAA (See [1]). > >>> > >>> In the LLVM-IR code that we produce, we generate load/stores of struct > >> types. (See [2] and [3] for a godbolt example showing the issue) > >>> For following c-alike code: > >>> > >>> struct S { int dummy; short e, f; } x,y; > >>> struct S* p = &x; > >>> int foobar() { > >>> x.f=42; > >>> *p=y; //**** struct copy > >>> return x.f; > >>> } > >>> > >>> We produce: > >>> > >>> [...] > >>> %struct.S = type { i32, i16, i16 } > >>> ; Function Attrs: norecurse nounwind uwtable > >>> define dso_local i32 @foobar() local_unnamed_addr #0 { > >>> store i16 42, i16* getelementptr inbounds (%struct.S, %struct.S* @x, > i64 > >> 0, i32 2), align 2, !tbaa !2 > >>> %1 = load %struct.S*, %struct.S** @p, align 8, !tbaa !8 > >>> %2 = load %struct.S, %struct.S* @y, align 8, !tbaa !10 > ;************* > >>> store %struct.S %2, %struct.S* %1, align 4, !tbaa !10 ; > ************* > >>> %3 = load i16, i16* getelementptr inbounds (%struct.S, %struct.S* > @x, > >> i64 0, i32 2), align 2, !tbaa !2 > >>> %4 = sext i16 %3 to i32 > >>> ret i32 %4 > >>> } > >>> [...] > >>> !0 = !{i32 1, !"wchar_size", i32 4} > >>> !1 = !{!"clang version 9.0.0 (trunk 362464)"} > >>> !2 = !{!3, !7, i64 6} > >>> !3 = !{!"_ZTS1S", !4, i64 0, !7, i64 4, !7, i64 6} > >>> !4 = !{!"int", !5, i64 0} > >>> !5 = !{!"omnipotent char", !6, i64 0} > >>> !6 = !{!"Simple C++ TBAA"} > >>> !7 = !{!"short", !5, i64 0} > >>> !8 = !{!9, !9, i64 0} > >>> !9 = !{!"any pointer", !5, i64 0} > >>> !10 = !{!3, !3, i64 0} ; *********************** > >>> > >>> > >>> In order to allow other optimizations, we have also attached TBAA > >> information to the load/store of the struct (!tbaa !10). > >>> The logical solution would be to construct '!tbaa !10' as shown: > >>> - the base type is '!3 (_ZTS1S)' > >>> - the access type is also '!3 (_ZTS1S)' > >>> - the offset is 'i64 0' > >>> > >>> I observe following issues: > >>> - issue 1: the tbaa semantics will not detect aliasing between '!10' and > '!2' > >> (See [2] and [3]; the load i16 in bar_wrong should not be optimized away) > >>> - issue 2: according to the pure definition of the 'access tags', the base > >> type and the access type can not be the same struct type. > >>> As such, the provided example could be found to be 'invalid'. Still, by > >> adding an extra indirection, a similar 'valid' example (with wrong behavior) > >>> can be constructed. See [3] > >>> > >>> - For 'issue 2', I think that the definition of 'access tag' should be > relaxed, > >> allowing the description of a full copy' of a struct. > >>> - For 'issue 1', the lookup algorithm should be enhanced so that aliasing > >> can be detected for these cases. > >>> Is this a correct interpretation ? Any input is welcome ! > >>> > >>> Thanks, > >>> > >>> Jeroen Dobbelaere > >>> ---- > >>> [1] tbaa metadata: https://urldefense.proofpoint.com/v2/url?u=https- > >> 3A__llvm.org_docs_LangRef.html-23tbaa- > >> > 2Dmetadata&d=DwIDaQ&c=DPL6_X_6JkXFx7AXWqB0tg&r=ELyOnT0WepII6 > >> UnFk-OSzxlGOXXSfAvOLT6E8iPwwJk&m=b7XmqIDwVRomY- > >> z_Vgfx6dv1OKq85hoK5rRXP8m6Rxs&s=TmYSKjBxhgw5sSVeq- > >> v7p1UmvzV_81OIMz-_pFakoZQ&e> >>> [2] godbolt example with single struct: > >> https://urldefense.proofpoint.com/v2/url?u=https- > >> 3A__www.godbolt.org_z_XcQy- > >> 2DU&d=DwIDaQ&c=DPL6_X_6JkXFx7AXWqB0tg&r=ELyOnT0WepII6UnFk- > >> OSzxlGOXXSfAvOLT6E8iPwwJk&m=b7XmqIDwVRomY- > >> z_Vgfx6dv1OKq85hoK5rRXP8m6Rxs&s=4OL_Z9OECS7bHERaUU- > >> WMxWhDPt-f_17zpehoMGVgj4&e> >>> [3] godbolt example with nested struct: > >> https://urldefense.proofpoint.com/v2/url?u=https- > >> 3A__www.godbolt.org_z_mNd- > >> 2DiI&d=DwIDaQ&c=DPL6_X_6JkXFx7AXWqB0tg&r=ELyOnT0WepII6UnFk- > >> OSzxlGOXXSfAvOLT6E8iPwwJk&m=b7XmqIDwVRomY- > >> > z_Vgfx6dv1OKq85hoK5rRXP8m6Rxs&s=0y6mskMKXX94CzNTFu1wVHF5m3Oc > >> RPyoZBc-ShGZrqM&e> >>> _______________________________________________ > >>> LLVM Developers mailing list > >>> llvm-dev at lists.llvm.org > >>> https://urldefense.proofpoint.com/v2/url?u=https- > 3A__lists.llvm.org_cgi- > >> 2Dbin_mailman_listinfo_llvm- > >> > 2Ddev&d=DwIDaQ&c=DPL6_X_6JkXFx7AXWqB0tg&r=ELyOnT0WepII6UnFk- > >> OSzxlGOXXSfAvOLT6E8iPwwJk&m=b7XmqIDwVRomY- > >> z_Vgfx6dv1OKq85hoK5rRXP8m6Rxs&s=410QYrHOxKaioN8wS4O3Sgfu9vh- > >> 2Jnl5Z-xDqr94r0&e=