thr3ads.net - llvm dev - [LLVMdev] Alias analysis issue with structs on PPC [Mar 2015]

If this information is useful, please help other people find it:
Share via:

Daniel Berlin

2015-Mar-15 23:48 UTC

[LLVMdev] Alias analysis issue with structs on PPC

On Sun, Mar 15, 2015 at 4:34 PM Olivier Sallenave <ol.sall at gmail.com>
wrote:
> Hi Daniel,
>
> Thanks for your feedback. I would prefer not to write a new AA. Can't
we
> directly implement that traversal in BasicAA?
>Can I ask why?
Outside of the "well, it's another pass", i mean?

BasicAA is stateless, so you can't cache, and you really don't want to
redo
these walks repeatedly (especially when the answer doesn't change unless AA
is invalidated).  It would be really expensive.
Doing this in a separate analysis pass, caching the answer, and producing
AA results, seems to me exactly the right thing.

> Otherwise, I'll investigate why this i64 was generated in the first
place
> (but like you, I don't really want to know why :-)
>
Reid walked over to my desk and told me all the gory details - the clang
ABI lowering of  structs like these for anything but x86 basically says
"oh, this  really should be 2 8 byte GPR's, and the way it makes this
happen is by using i64's :)

If you want to do it at a clang level, the right thing to do is to fixup
the ABI lowerings for pointers to keep them pointers in this case.

For something simpler, if you wanted, you could do the *exact* opposite of
happens now and you'd get better results.

This is, pass everything that needs to be 2 8 byte regs as 8 byte pointers,
and cast it back to something if it's not really a pointer, using ptrtoint.

if it's not really a pointer, we don't care what AA says about it.
If it is a pointer, now we don't get bad AA answers, because there is no
inttoptr being used like there is now.

Of course, i have no idea if this strategy is going to produce correct ABI
results on all platforms, it depends on whether they treat pointers as
special.

Olivier>
> 2015-03-13 18:51 GMT-04:00 Daniel Berlin <dberlin at dberlin.org>:
>
>>
>>
>> On Fri, Mar 13, 2015 at 2:54 PM Daniel Berlin <dberlin at
dberlin.org>
>> wrote:
>>
>>> On Fri, Mar 13, 2015 at 2:39 PM Olivier H Sallenave <ohsallen at
us.ibm.com>
>>> wrote:
>>>
>>>> Hi,
>>>>
>>>> I have the following C loop to vectorize:
>>>>
>>>> struct box {
>>>>     double* source;
>>>> };
>>>>
>>>> void test(double* restrict result, struct box my_struct, int
len)
>>>> {
>>>>     for (int i=0 ; i<len; i++) {
>>>>         result[i] = my_struct.source[i] * my_struct.source[i];
>>>>     }
>>>> }
>>>>
>>>> There are two references in the loop, result[i] (restrict) and
>>>> my_struct.source[i] (readonly). The compiler should easily
figure out that
>>>> they do not alias.
>>>>
>>>> Compiling for x86, the loop alias analysis works just fine:
>>>>   AST: Alias Set Tracker: 2 alias sets for 2 pointer values.
>>>>   AliasSet[0x7fd8e2f32290, 1] must alias, No access Pointers:
(double*
>>>> %arrayidx5, 18446744073709551615)
>>>>   AliasSet[0x7fd8e2f322e0, 1] must alias, No access Pointers:
(double*
>>>> %arrayidx, 18446744073709551615)
>>>>
>>>> Compiling for PPC with -target powerpc64le-ibm-linux-gnu, the
two
>>>> addresses now alias:
>>>>   AST: Alias Set Tracker: 1 alias sets for 2 pointer values.
>>>>   AliasSet[0x7f931bd5bdc0, 2] may alias, No access Pointers:
(double*
>>>> %arrayidx5, 18446744073709551615), (double* %arrayidx,
18446744073709551615)
>>>>
>>>> BasicAA is used for both targets by default. The difference is
that in
>>>> PPC, the IR obtained from Clang takes an i64 as parameter
instead of a
>>>> double* for my_struct.
>>>>
>>>
>>> I don't even want to know why this would be the case :)
>>>
>>>
>>>> This parameter is then coerced into double* using an inttoptr
>>>> instruction. The code in BasicAliasAnalysis.cpp which is
triggered for x86
>>>> is the following:
>>>>
>>>>     // Function arguments can't alias with things that are
known to be
>>>>     // unambigously identified at the function level.
>>>>     if ((isa<Argument>(O1) &&
isIdentifiedFunctionLocal(O2)) ||
>>>>         (isa<Argument>(O2) &&
isIdentifiedFunctionLocal(O1)))
>>>>       return NoAlias;
>>>>
>>>> isIdentifiedFunctionLocal(V) returns true for a noalias
argument (such
>>>> as result), but the other address (my_struct) must be a
function argument
>>>> in order to return NoAlias, which is not the case anymore for
PPC (since
>>>> my_struct is now the result from an inttoptr instruction). If I
understand,
>>>> the problem is that we cannot trust the fact that locals do not
alias with
>>>> restrict parameters (because the compiler could generate some
locals which
>>>> alias)?
>>>>
>>> Yes, because pointers *based on* the noalias'd argument are
legal
>>> aliases.
>>>
>>> So if you don't know it's an argument or an identified
local, it could
>>> be based on the restricted pointer, and thus, alias it.
>>>
>>>
>>> If someone has suggestions about this, that would help a lot.
>>>>
>>>
>>> The only way you could prove something in this case would be to
walk the
>>> chain and prove the value comes directly from an argument with no
>>> modification.
>>>
>>
>> Actually, you could do the opposite, too, pretty cheaply.
>>
>> You could write a new pass or AA.
>> It traverses chains in the reverse direction (IE it goes from the
>> arguments, and walks down the immediate use chain, marking things as
based
>> on arguments or not), and makes a lookup table of things it can prove
are
>> also  unmolested identified objects.
>> (which would be the result of inttoptr in your case).
>>
>> You can then use this simple lookup table to answer the
>> isIdentifiedObject question better.
>> (You'd have to make isIdentifiedObject part of the AA interface, or
take
>> an optional table, blah blah blah)
>>
>>
>> _______________________________________________
>> LLVM Developers mailing list
>> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>>
>>-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20150315/a2177f32/attachment.html>

Olivier Sallenave

2015-Mar-16 02:01 UTC

head link

[LLVMdev] Alias analysis issue with structs on PPC

Reason is that I want this case to be supported by default, without
specific flags (should we change the default AA for PPC if we do a separate
one?).

Fixing at the Clang level might be the right thing to do... But if there
are other cases like this, the traversal would definitely help in general
to get a proper handling of restrict.

Olivier

On Sunday, March 15, 2015, Daniel Berlin <dberlin at dberlin.org> wrote:
>
>
> On Sun, Mar 15, 2015 at 4:34 PM Olivier Sallenave <ol.sall at gmail.com
> <javascript:_e(%7B%7D,'cvml','ol.sall at
gmail.com');>> wrote:
>
>> Hi Daniel,
>>
>> Thanks for your feedback. I would prefer not to write a new AA.
Can't we
>> directly implement that traversal in BasicAA?
>>
> Can I ask why?
> Outside of the "well, it's another pass", i mean?
>
> BasicAA is stateless, so you can't cache, and you really don't want
to
> redo these walks repeatedly (especially when the answer doesn't change
> unless AA is invalidated).  It would be really expensive.
> Doing this in a separate analysis pass, caching the answer, and producing
> AA results, seems to me exactly the right thing.
>
>
>> Otherwise, I'll investigate why this i64 was generated in the first
place
>> (but like you, I don't really want to know why :-)
>>
>
> Reid walked over to my desk and told me all the gory details - the clang
> ABI lowering of  structs like these for anything but x86 basically says
> "oh, this  really should be 2 8 byte GPR's, and the way it makes
this
> happen is by using i64's :)
>
> If you want to do it at a clang level, the right thing to do is to fixup
> the ABI lowerings for pointers to keep them pointers in this case.
>
> For something simpler, if you wanted, you could do the *exact* opposite of
> happens now and you'd get better results.
>
> This is, pass everything that needs to be 2 8 byte regs as 8 byte
> pointers, and cast it back to something if it's not really a pointer,
using
> ptrtoint.
>
> if it's not really a pointer, we don't care what AA says about it.
> If it is a pointer, now we don't get bad AA answers, because there is
no
> inttoptr being used like there is now.
>
> Of course, i have no idea if this strategy is going to produce correct ABI
> results on all platforms, it depends on whether they treat pointers as
> special.
>
> Olivier
>>
>> 2015-03-13 18:51 GMT-04:00 Daniel Berlin <dberlin at dberlin.org
>> <javascript:_e(%7B%7D,'cvml','dberlin at
dberlin.org');>>:
>>
>>>
>>>
>>> On Fri, Mar 13, 2015 at 2:54 PM Daniel Berlin <dberlin at
dberlin.org
>>> <javascript:_e(%7B%7D,'cvml','dberlin at
dberlin.org');>> wrote:
>>>
>>>> On Fri, Mar 13, 2015 at 2:39 PM Olivier H Sallenave <
>>>> ohsallen at us.ibm.com
>>>> <javascript:_e(%7B%7D,'cvml','ohsallen at
us.ibm.com');>> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> I have the following C loop to vectorize:
>>>>>
>>>>> struct box {
>>>>>     double* source;
>>>>> };
>>>>>
>>>>> void test(double* restrict result, struct box my_struct,
int len)
>>>>> {
>>>>>     for (int i=0 ; i<len; i++) {
>>>>>         result[i] = my_struct.source[i] *
my_struct.source[i];
>>>>>     }
>>>>> }
>>>>>
>>>>> There are two references in the loop, result[i] (restrict)
and
>>>>> my_struct.source[i] (readonly). The compiler should easily
figure out that
>>>>> they do not alias.
>>>>>
>>>>> Compiling for x86, the loop alias analysis works just fine:
>>>>>   AST: Alias Set Tracker: 2 alias sets for 2 pointer
values.
>>>>>   AliasSet[0x7fd8e2f32290, 1] must alias, No access
Pointers: (double*
>>>>> %arrayidx5, 18446744073709551615)
>>>>>   AliasSet[0x7fd8e2f322e0, 1] must alias, No access
Pointers: (double*
>>>>> %arrayidx, 18446744073709551615)
>>>>>
>>>>> Compiling for PPC with -target powerpc64le-ibm-linux-gnu,
the two
>>>>> addresses now alias:
>>>>>   AST: Alias Set Tracker: 1 alias sets for 2 pointer
values.
>>>>>   AliasSet[0x7f931bd5bdc0, 2] may alias, No access
Pointers: (double*
>>>>> %arrayidx5, 18446744073709551615), (double* %arrayidx,
18446744073709551615)
>>>>>
>>>>> BasicAA is used for both targets by default. The difference
is that in
>>>>> PPC, the IR obtained from Clang takes an i64 as parameter
instead of a
>>>>> double* for my_struct.
>>>>>
>>>>
>>>> I don't even want to know why this would be the case :)
>>>>
>>>>
>>>>> This parameter is then coerced into double* using an
inttoptr
>>>>> instruction. The code in BasicAliasAnalysis.cpp which is
triggered for x86
>>>>> is the following:
>>>>>
>>>>>     // Function arguments can't alias with things that
are known to be
>>>>>     // unambigously identified at the function level.
>>>>>     if ((isa<Argument>(O1) &&
isIdentifiedFunctionLocal(O2)) ||
>>>>>         (isa<Argument>(O2) &&
isIdentifiedFunctionLocal(O1)))
>>>>>       return NoAlias;
>>>>>
>>>>> isIdentifiedFunctionLocal(V) returns true for a noalias
argument (such
>>>>> as result), but the other address (my_struct) must be a
function argument
>>>>> in order to return NoAlias, which is not the case anymore
for PPC (since
>>>>> my_struct is now the result from an inttoptr instruction).
If I understand,
>>>>> the problem is that we cannot trust the fact that locals do
not alias with
>>>>> restrict parameters (because the compiler could generate
some locals which
>>>>> alias)?
>>>>>
>>>> Yes, because pointers *based on* the noalias'd argument are
legal
>>>> aliases.
>>>>
>>>> So if you don't know it's an argument or an identified
local, it could
>>>> be based on the restricted pointer, and thus, alias it.
>>>>
>>>>
>>>> If someone has suggestions about this, that would help a lot.
>>>>>
>>>>
>>>> The only way you could prove something in this case would be to
walk
>>>> the chain and prove the value comes directly from an argument
with no
>>>> modification.
>>>>
>>>
>>> Actually, you could do the opposite, too, pretty cheaply.
>>>
>>> You could write a new pass or AA.
>>> It traverses chains in the reverse direction (IE it goes from the
>>> arguments, and walks down the immediate use chain, marking things
as based
>>> on arguments or not), and makes a lookup table of things it can
prove are
>>> also  unmolested identified objects.
>>> (which would be the result of inttoptr in your case).
>>>
>>> You can then use this simple lookup table to answer the
>>> isIdentifiedObject question better.
>>> (You'd have to make isIdentifiedObject part of the AA
interface, or take
>>> an optional table, blah blah blah)
>>>
>>>
>>> _______________________________________________
>>> LLVM Developers mailing list
>>> LLVMdev at cs.uiuc.edu
>>> <javascript:_e(%7B%7D,'cvml','LLVMdev at
cs.uiuc.edu');>
>>> http://llvm.cs.uiuc.edu
>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>>>
>>>-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20150315/001e2549/attachment.html>

Daniel Berlin

2015-Mar-16 02:13 UTC

head link

[LLVMdev] Alias analysis issue with structs on PPC

On Sun, Mar 15, 2015 at 7:01 PM Olivier Sallenave <ol.sall at gmail.com>
wrote:
> Reason is that I want this case to be supported by default, without
> specific flags (should we change the default AA for PPC if we do a separate
> one?).

Yeah, just change the default AA to also include this in the stack.
It seems like it would be a win in all cases.
>
> Fixing at the Clang level might be the right thing to do... But if there
> are other cases like this, the traversal would definitely help in general
> to get a proper handling of restrict.
>
> You mean noalias ;)(restrict has weird semantics, and doesn't exist at the LLVM level.
noalias has sane semantics, and is close to restrict, but better)


In any case, this will help more than just your case, so it seems obvious
to me that unless it's expensive, it should be on by default.


Olivier>
>
> On Sunday, March 15, 2015, Daniel Berlin <dberlin at dberlin.org>
wrote:
>
>>
>>
>> On Sun, Mar 15, 2015 at 4:34 PM Olivier Sallenave <ol.sall at
gmail.com>
>> wrote:
>>
>>> Hi Daniel,
>>>
>>> Thanks for your feedback. I would prefer not to write a new AA.
Can't we
>>> directly implement that traversal in BasicAA?
>>>
>> Can I ask why?
>> Outside of the "well, it's another pass", i mean?
>>
>> BasicAA is stateless, so you can't cache, and you really don't
want to
>> redo these walks repeatedly (especially when the answer doesn't
change
>> unless AA is invalidated).  It would be really expensive.
>> Doing this in a separate analysis pass, caching the answer, and
producing
>> AA results, seems to me exactly the right thing.
>>
>>
>>> Otherwise, I'll investigate why this i64 was generated in the
first
>>> place (but like you, I don't really want to know why :-)
>>>
>>
>> Reid walked over to my desk and told me all the gory details - the
clang
>> ABI lowering of  structs like these for anything but x86 basically says
>> "oh, this  really should be 2 8 byte GPR's, and the way it
makes this
>> happen is by using i64's :)
>>
>> If you want to do it at a clang level, the right thing to do is to
fixup
>> the ABI lowerings for pointers to keep them pointers in this case.
>>
>> For something simpler, if you wanted, you could do the *exact* opposite
>> of happens now and you'd get better results.
>>
>> This is, pass everything that needs to be 2 8 byte regs as 8 byte
>> pointers, and cast it back to something if it's not really a
pointer, using
>> ptrtoint.
>>
>> if it's not really a pointer, we don't care what AA says about
it.
>> If it is a pointer, now we don't get bad AA answers, because there
is no
>> inttoptr being used like there is now.
>>
>> Of course, i have no idea if this strategy is going to produce correct
>> ABI results on all platforms, it depends on whether they treat pointers
as
>> special.
>>
>> Olivier
>>>
>>> 2015-03-13 18:51 GMT-04:00 Daniel Berlin <dberlin at
dberlin.org>:
>>>
>>>>
>>>>
>>>> On Fri, Mar 13, 2015 at 2:54 PM Daniel Berlin <dberlin at
dberlin.org>
>>>> wrote:
>>>>
>>>>> On Fri, Mar 13, 2015 at 2:39 PM Olivier H Sallenave <
>>>>> ohsallen at us.ibm.com> wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> I have the following C loop to vectorize:
>>>>>>
>>>>>> struct box {
>>>>>>     double* source;
>>>>>> };
>>>>>>
>>>>>> void test(double* restrict result, struct box
my_struct, int len)
>>>>>> {
>>>>>>     for (int i=0 ; i<len; i++) {
>>>>>>         result[i] = my_struct.source[i] *
my_struct.source[i];
>>>>>>     }
>>>>>> }
>>>>>>
>>>>>> There are two references in the loop, result[i]
(restrict) and
>>>>>> my_struct.source[i] (readonly). The compiler should
easily figure out that
>>>>>> they do not alias.
>>>>>>
>>>>>> Compiling for x86, the loop alias analysis works just
fine:
>>>>>>   AST: Alias Set Tracker: 2 alias sets for 2 pointer
values.
>>>>>>   AliasSet[0x7fd8e2f32290, 1] must alias, No access
Pointers:
>>>>>> (double* %arrayidx5, 18446744073709551615)
>>>>>>   AliasSet[0x7fd8e2f322e0, 1] must alias, No access
Pointers:
>>>>>> (double* %arrayidx, 18446744073709551615)
>>>>>>
>>>>>> Compiling for PPC with -target
powerpc64le-ibm-linux-gnu, the two
>>>>>> addresses now alias:
>>>>>>   AST: Alias Set Tracker: 1 alias sets for 2 pointer
values.
>>>>>>   AliasSet[0x7f931bd5bdc0, 2] may alias, No access
Pointers: (double*
>>>>>> %arrayidx5, 18446744073709551615), (double* %arrayidx,
18446744073709551615)
>>>>>>
>>>>>> BasicAA is used for both targets by default. The
difference is that
>>>>>> in PPC, the IR obtained from Clang takes an i64 as
parameter instead of a
>>>>>> double* for my_struct.
>>>>>>
>>>>>
>>>>> I don't even want to know why this would be the case :)
>>>>>
>>>>>
>>>>>> This parameter is then coerced into double* using an
inttoptr
>>>>>> instruction. The code in BasicAliasAnalysis.cpp which
is triggered for x86
>>>>>> is the following:
>>>>>>
>>>>>>     // Function arguments can't alias with things
that are known to
>>>>>> be
>>>>>>     // unambigously identified at the function level.
>>>>>>     if ((isa<Argument>(O1) &&
isIdentifiedFunctionLocal(O2)) ||
>>>>>>         (isa<Argument>(O2) &&
isIdentifiedFunctionLocal(O1)))
>>>>>>       return NoAlias;
>>>>>>
>>>>>> isIdentifiedFunctionLocal(V) returns true for a noalias
argument
>>>>>> (such as result), but the other address (my_struct)
must be a function
>>>>>> argument in order to return NoAlias, which is not the
case anymore for PPC
>>>>>> (since my_struct is now the result from an inttoptr
instruction). If I
>>>>>> understand, the problem is that we cannot trust the
fact that locals do not
>>>>>> alias with restrict parameters (because the compiler
could generate some
>>>>>> locals which alias)?
>>>>>>
>>>>> Yes, because pointers *based on* the noalias'd argument
are legal
>>>>> aliases.
>>>>>
>>>>> So if you don't know it's an argument or an
identified local, it could
>>>>> be based on the restricted pointer, and thus, alias it.
>>>>>
>>>>>
>>>>> If someone has suggestions about this, that would help a
lot.
>>>>>>
>>>>>
>>>>> The only way you could prove something in this case would
be to walk
>>>>> the chain and prove the value comes directly from an
argument with no
>>>>> modification.
>>>>>
>>>>
>>>> Actually, you could do the opposite, too, pretty cheaply.
>>>>
>>>> You could write a new pass or AA.
>>>> It traverses chains in the reverse direction (IE it goes from
the
>>>> arguments, and walks down the immediate use chain, marking
things as based
>>>> on arguments or not), and makes a lookup table of things it can
prove are
>>>> also  unmolested identified objects.
>>>> (which would be the result of inttoptr in your case).
>>>>
>>>> You can then use this simple lookup table to answer the
>>>> isIdentifiedObject question better.
>>>> (You'd have to make isIdentifiedObject part of the AA
interface, or
>>>> take an optional table, blah blah blah)
>>>>
>>>>
>>>> _______________________________________________
>>>> LLVM Developers mailing list
>>>> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
>>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>>>>
>>>>-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20150316/6634ea18/attachment.html>

Hal Finkel

2015-Mar-16 16:56 UTC

head link

[LLVMdev] Alias analysis issue with structs on PPC

----- Original Message -----
> From: "Daniel Berlin" <dberlin at dberlin.org>
> To: "Olivier Sallenave" <ol.sall at gmail.com>
> Cc: llvmdev at cs.uiuc.edu
> Sent: Sunday, March 15, 2015 6:48:41 PM
> Subject: Re: [LLVMdev] Alias analysis issue with structs on PPC
> On Sun, Mar 15, 2015 at 4:34 PM Olivier Sallenave < ol.sall at gmail.com
> > wrote:
> > Hi Daniel,
> 
> > Thanks for your feedback. I would prefer not to write a new AA.
> > Can't
> > we directly implement that traversal in BasicAA?
> 
> Can I ask why?
> Outside of the "well, it's another pass", i mean?
> BasicAA is stateless, so you can't cache, and you really don't want
> to redo these walks repeatedly (especially when the answer doesn't
> change unless AA is invalidated). It would be really expensive.
> Doing this in a separate analysis pass, caching the answer, and
> producing AA results, seems to me exactly the right thing.
> > Otherwise, I'll investigate why this i64 was generated in the
first
> > place (but like you, I don't really want to know why :-)
> 
> Reid walked over to my desk and told me all the gory details - the
> clang ABI lowering of structs like these for anything but x86
> basically says "oh, this really should be 2 8 byte GPR's, and the
> way it makes this happen is by using i64's :)
> If you want to do it at a clang level, the right thing to do is to
> fixup the ABI lowerings for pointers to keep them pointers in this
> case.So this is an artifact of the way that we pass structures, and constructing a
general solution at the ABI level might be tricky. I've cc'd Uli, who
did most of the recent work here.

For the single-element struct case, we could fix this by keeping it a pointer
type. The relevant code in Clang is in lib/CodeGen/TargetInfo.cpp (look at
PPC64_SVR4_ABIInfo::classifyArgumentType and nearby code). But that does not
really address the underlying issue:

If I take your example and modify it so that we have: 

struct box { 
double* source; 
double* source2; 
}; 

then the parameter is passed as: 

define void @test(double* noalias nocapture %result, [2 x i64]
%my_struct.coerce, i32 signext %len) #0

and is extracted the same way: 

%my_struct.coerce.fca.0.extract = extractvalue [2 x i64] %my_struct.coerce, 0 
%0 = inttoptr i64 %my_struct.coerce.fca.0.extract to double* 

but, it is also important to realize that the i64 here in the array type is
actually chosen to satisfy alignment requirements. If we have this:

typedef float __attribute__((ext_vector_type(4))) vt; 
struct box { 
double* source; 
double* source2; 
vt v; 
}; 

then the struct is passed as: 

define void @test(double* noalias nocapture %result, [2 x i128]
%my_struct.coerce, i32 signext %len)

and the extraction code looks like: 

%my_struct.coerce.fca.0.extract = extractvalue [2 x i128] %my_struct.coerce, 0 
%my_struct.sroa.0.sroa.0.0.extract.shift = lshr i128
%my_struct.coerce.fca.0.extract, 64
%my_struct.sroa.0.sroa.0.0.extract.trunc = trunc i128
%my_struct.sroa.0.sroa.0.0.extract.shift to i64
%0 = inttoptr i64 %my_struct.sroa.0.sroa.0.0.extract.trunc to double* 

so just using pointer types instead of i64 will help common cases, but will not
address the general issue. Now part of this does some down to using array
parameters as a substitute for byval/direct parameters. As I recall, this was
done because it allowed a natural partial decomposition between GPRs and stack
for structures that straddle the number of available parameter-passing GPRs. If
we could accomplish that with regular byval parameters and regular direct
parameters, then we'd not need any of this array coercion, and the system,
including for the purposes of aliasing analysis, would work as intended. There
may be some infrastructure work required in the backend (SelectionDAG builder,
etc.) -- Uli, if you know please comment -- but I think moving away from the
array coercions might be the right solution, even if that requires some
infrastructure enhancements.

-Hal 
> For something simpler, if you wanted, you could do the *exact*
> opposite of happens now and you'd get better results.
> This is, pass everything that needs to be 2 8 byte regs as 8 byte
> pointers, and cast it back to something if it's not really a
> pointer, using ptrtoint.
> if it's not really a pointer, we don't care what AA says about it.
> If it is a pointer, now we don't get bad AA answers, because there is
> no inttoptr being used like there is now.
> Of course, i have no idea if this strategy is going to produce
> correct ABI results on all platforms, it depends on whether they
> treat pointers as special.
> > Olivier
> 
> > 2015-03-13 18:51 GMT-04:00 Daniel Berlin < dberlin at dberlin.org
> :
> 
> > > On Fri, Mar 13, 2015 at 2:54 PM Daniel Berlin <
> > > dberlin at dberlin.org
> > > >
> > > wrote:
> > 
> 
> > > > On Fri, Mar 13, 2015 at 2:39 PM Olivier H Sallenave <
> > > > ohsallen at us.ibm.com > wrote:
> > > 
> > 
> 
> > > > > Hi,
> > > > 
> > > 
> > 
> 
> > > > > I have the following C loop to vectorize:
> > > > 
> > > 
> > 
> 
> > > > > struct box {
> > > > 
> > > 
> > 
> 
> > > > > double* source;
> > > > 
> > > 
> > 
> 
> > > > > };
> > > > 
> > > 
> > 
> 
> > > > > void test(double* restrict result, struct box
my_struct, int
> > > > > len)
> > > > 
> > > 
> > 
> 
> > > > > {
> > > > 
> > > 
> > 
> 
> > > > > for (int i=0 ; i<len; i++) {
> > > > 
> > > 
> > 
> 
> > > > > result[i] = my_struct.source[i] * my_struct.source[i];
> > > > 
> > > 
> > 
> 
> > > > > }
> > > > 
> > > 
> > 
> 
> > > > > }
> > > > 
> > > 
> > 
> 
> > > > > There are two references in the loop, result[i]
(restrict)
> > > > > and
> > > > > my_struct.source[i] (readonly). The compiler should
easily
> > > > > figure
> > > > > out that they do not alias.
> > > > 
> > > 
> > 
> 
> > > > > Compiling for x86, the loop alias analysis works just
fine:
> > > > 
> > > 
> > 
> 
> > > > > AST: Alias Set Tracker: 2 alias sets for 2 pointer
values.
> > > > 
> > > 
> > 
> 
> > > > > AliasSet[0x7fd8e2f32290, 1] must alias, No access
Pointers:
> > > > > (double*
> > > > > %arrayidx5, 18446744073709551615)
> > > > 
> > > 
> > 
> 
> > > > > AliasSet[0x7fd8e2f322e0, 1] must alias, No access
Pointers:
> > > > > (double*
> > > > > %arrayidx, 18446744073709551615)
> > > > 
> > > 
> > 
> 
> > > > > Compilin g for PPC with -target
powerpc64le-ibm-linux-gnu,
> > > > > the
> > > > > two
> > > > > addresses now alias:
> > > > 
> > > 
> > 
> 
> > > > > AST: Alias Set Tracker: 1 alias sets for 2 pointer
values.
> > > > 
> > > 
> > 
> 
> > > > > AliasSet[0x7f931bd5bdc0, 2] may alias, No access
Pointers:
> > > > > (double*
> > > > > %arrayidx5, 18446744073709551615), (double* %arrayidx,
> > > > > 18446744073709551615)
> > > > 
> > > 
> > 
> 
> > > > > BasicAA is used for both targets by default. The
difference
> > > > > is
> > > > > that
> > > > > in PPC, the IR obtained from Clang takes an i64 as
parameter
> > > > > instead
> > > > > of a double* for my_struct.
> > > > 
> > > 
> > 
> 
> > > > I don't even want to know why this would be the case :)
> > > 
> > 
> 
> > > > > This parameter is then coerced into double* using an
inttoptr
> > > > > instruction. The code in BasicAliasAnalysis.cpp which
is
> > > > > triggered
> > > > > for x86 is the following:
> > > > 
> > > 
> > 
> 
> > > > > // Function arguments can't alias with things that
are known
> > > > > to
> > > > > be
> > > > 
> > > 
> > 
> 
> > > > > // unambigously identified at the function level.
> > > > 
> > > 
> > 
> 
> > > > > if (( isa<Argument> ( O1 ) && isId
entifiedFunctionLocal ( O2
> > > > > ))
> > > > > ||
> > > > 
> > > 
> > 
> 
> > > > > ( isa<Argument> ( O2 ) && isIdenti
fiedFunctionLocal ( O1 )))
> > > > 
> > > 
> > 
> 
> > > > > return NoAlias ;
> > > > 
> > > 
> > 
> 
> > > > > isIdentifiedFunctionLocal(V) returns true for a noalias
> > > > > argument
> > > > > (such as result), but the other address (my_struct)
must be a
> > > > > function argument in order to return NoAlias, which is
not
> > > > > the
> > > > > case
> > > > > anymore for PPC (since my_struct is now the result from
an
> > > > > inttoptr
> > > > > instruction). If I understand, the problem is that we
cannot
> > > > > trust
> > > > > the fact that locals do not alias with restrict
parameters
> > > > > (because
> > > > > the compiler could generate some locals which alias)?
> > > > 
> > > 
> > 
> 
> > > > Yes, because pointers *based on* the noalias'd argument
are
> > > > legal
> > > > aliases.
> > > 
> > 
> 
> > > > So if you don't know it's an argument or an
identified local,
> > > > it
> > > > could be based on the restricted pointer, and thus, alias
it.
> > > 
> > 
> 
> > > > > If someone has suggestions about this, that would help
a lot.
> > > > 
> > > 
> > 
> 
> > > > The only way you could prove something in this case would be
to
> > > > walk
> > > > the chain and prove the value comes directly from an
argument
> > > > with
> > > > no modification.
> > > 
> > 
> 
> > > Actually, you could do the opposite, too, pretty cheaply.
> > 
> 
> > > You could write a new pass or AA.
> > 
> 
> > > It traverses chains in the reverse direction (IE it goes from the
> > > arguments, and walks down the immediate use chain, marking things
> > > as
> > > based on arguments or not), and makes a lookup table of things it
> > > can prove are also unmolested identified objects.
> > 
> 
> > > (which would be the result of inttoptr in your case).
> > 
> 
> > > You can then use this simple lookup table to answer the
> > > isIdentifiedObject question better.
> > 
> 
> > > (You'd have to make isIdentifiedObject part of the AA
interface,
> > > or
> > > take an optional table, blah blah blah)
> > 
> 
> > > _______________________________________________
> > 
> 
> > > LLVM Developers mailing list
> > 
> 
> > > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu
> > 
> 
> > > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
> > 
> 
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
-- 

Hal Finkel 
Assistant Computational Scientist 
Leadership Computing Facility 
Argonne National Laboratory 
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20150316/4e84e2c7/attachment.html>

Reid Kleckner

2015-Mar-16 18:24 UTC

head link

[LLVMdev] Alias analysis issue with structs on PPC

On Mon, Mar 16, 2015 at 9:56 AM, Hal Finkel <hfinkel at anl.gov>
wrote:>
> so just using pointer types instead of i64 will help common cases, but
> will not address the general issue. Now part of this does some down to
> using array parameters as a substitute for byval/direct parameters. As I
> recall, this was done because it allowed a natural partial decomposition
> between GPRs and stack for structures that straddle the number of available
> parameter-passing GPRs. If we could accomplish that with regular byval
> parameters and regular direct parameters, then we'd not need any of
this
> array coercion, and the system, including for the purposes of aliasing
> analysis, would work as intended. There may be some infrastructure work
> required in the backend (SelectionDAG builder, etc.) -- Uli, if you know
> please comment -- but I think moving away from the array coercions might be
> the right solution, even if that requires some infrastructure enhancements.
>
So, every backend interprets 'byval' differently, but it usually means
"pass this whole thing in stack memory". It also requires extra copies
through memory at the IR level, so I don't think we should be moving
towards this construct.

If you want to pass things in registers, it's usually best to use SSA
values. Even though the extra 'extractvalue' instructions look expensive
in
the IR, they lower down to simple virtual register copies in the selection
dag. The shift and trunc, on the other hand, don't model the machine code
at all, and it would be good if we could eliminate them.

I wonder if we could solve this parameter alignment problem via the
'align'
parameter attribute. Unfortunately, I think for pointer types it's already
overloaded to describe the alignment of the pointee and not the argument
itself. In fact, I think you did this Hal. :)

I think, in the long term, we should probably use a direct FCA. I believe
this is what ARM does. It's also nice to flatten the FCA if we can detect
that we're in a simple case where no interesting alignment is required.
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20150316/861bc32b/attachment.html>

Ulrich Weigand

2015-Mar-17 09:56 UTC

head link

[LLVMdev] Alias analysis issue with structs on PPC

Hal Finkel <hfinkel at anl.gov> wrote on 16.03.2015 17:56:20:
> If you want to do it at a clang level, the right thing to do is to
> fixup the ABI lowerings for pointers to keep them pointers in this case.
> So this is an artifact of the way that we pass structures, and
> constructing a general solution at the ABI level might be tricky.
> I've cc'd Uli, who did most of the recent work here.
>
> For the single-element struct case, we could fix this by keeping it
> a pointer type.  The relevant code in Clang is in lib/CodeGen/
> TargetInfo.cpp (look at PPC64_SVR4_ABIInfo::classifyArgumentType and
> nearby code). But that does not really address the underlying issue:
>
> If I take your example and modify it so that we have:
>
> struct box {
>     double* source;
>     double* source2;
> };
>
> then the parameter is passed as:
>
> define void @test(double* noalias nocapture %result, [2 x i64] %
> my_struct.coerce, i32 signext %len) #0
>
> and is extracted the same way:
>
>   %my_struct.coerce.fca.0.extract = extractvalue [2 x i64] %
> my_struct.coerce, 0
>   %0 = inttoptr i64 %my_struct.coerce.fca.0.extract to double*
>
> but, it is also important to realize that the i64 here in the array
> type is actually chosen to satisfy alignment requirements. If we have
this:>
> typedef float __attribute__((ext_vector_type(4))) vt;
> struct box {
>     double* source;
>     double* source2;
>     vt v;
> };
>
> then the struct is passed as:
>
> define void @test(double* noalias nocapture %result, [2 x i128] %
> my_struct.coerce, i32 signext %len)
>
> and the extraction code looks like:
>
>   %my_struct.coerce.fca.0.extract = extractvalue [2 x i128] %
> my_struct.coerce, 0
>   %my_struct.sroa.0.sroa.0.0.extract.shift = lshr i128 %
> my_struct.coerce.fca.0.extract, 64
>   %my_struct.sroa.0.sroa.0.0.extract.trunc = trunc i128 %
> my_struct.sroa.0.sroa.0.0.extract.shift to i64
>   %0 = inttoptr i64 %my_struct.sroa.0.sroa.0.0.extract.trunc to double*
>
> so just using pointer types instead of i64 will help common cases,
> but will not address the general issue. Now part of this does some
> down to using array parameters as a substitute for byval/direct
> parameters. As I recall, this was done because it allowed a natural
> partial decomposition between GPRs and stack for structures that
> straddle the number of available parameter-passing GPRs. If we could
> accomplish that with regular byval parameters and regular direct
> parameters, then we'd not need any of this array coercion, and the
> system, including for the purposes of aliasing analysis, would work
> as intended. There may be some infrastructure work required in the
> backend (SelectionDAG builder, etc.) -- Uli, if you know please
> comment -- but I think moving away from the array coercions might be
> the right solution, even if that requires some infrastructureenhancements.

I still think "byval" is the wrong approach here.  Using
"byval",
the LLVM target-independent codegen layers assume the ABI requires
actually passing a *pointer* to the argument -- but that's not true
on PowerPC64, no pointer is in fact being passed.

This means that the back-end, in those cases where we use byval,
has to fake up an address when common code asks for the value of
the argument.  This is not too bad if the argument was in fact
actually passed (fully) on the stack, but if it wasn't, we have
to write the argument (partially) to the stack, and possibly
even allocate stack space, just to satisfy common code ...

Now, passing such arguments as "direct" might make more sense.  But
with the current infrastructure, this doesn't really work either.
For one, there is currently no place to attach alignment information
to a "direct" argument, and the back-end is not able in all cases to
reconstruct the required on-stack alignment from the LLVM type info.
In addition, passing a struct type "direct" causes clang codegen
common code in many cases to "flatten" the argument, i.e. pass the
struct elements as separate arguments on the LLVM IR level.  This
may in some cases be OK since it results in the same binary ABI;
but in other cases it is definitively wrong.

Still, if we need to move away from coerced array types, fixing
the infrastructure such that "direct" can be used would be my
prefered solution.  If I understood the original email thread
correctly, the current situation does not lead to incorrect code
generation, just potentially suboptimal code generation?  In this
case, I guess an incremental solution could be employed; we could
start with avoiding the array coercion in those cases where using
the struct type as direct type is already safe even with the
current infrastructure, and then expand the set of types where
this is true by successively improving the infrastructure.

Bye,
Ulrich

llvm dev - Mar 2015 - [LLVMdev] Alias analysis issue with structs on PPC

[LLVMdev] Alias analysis issue with structs on PPC

[LLVMdev] Alias analysis issue with structs on PPC

[LLVMdev] Alias analysis issue with structs on PPC

[LLVMdev] Alias analysis issue with structs on PPC

[LLVMdev] Alias analysis issue with structs on PPC

[LLVMdev] Alias analysis issue with structs on PPC