thr3ads.net - llvm dev - [LLVMdev] [Polly] Analysis of the expensive compile-time overhead of Polly Dependence pass [Jul 2013]

If this information is useful, please help other people find it:
Share via:

Star Tan

2013-Jul-26 04:01 UTC

[LLVMdev] [Polly] Analysis of the expensive compile-time overhead of Polly Dependence pass

Hi Sebastian,


Recently, I found the "Polly - Calculate dependences" pass would lead
to significant compile-time overhead when compiling some loop-intensive source
code. Tobias told me you found similar problem as follows:
http://llvm.org/bugs/show_bug.cgi?id=14240


My evaluation shows that "Polly - Calculate dependences" pass consumes
96.4% of total compile-time overhead when compiling the program you proposed. It
seems that the expensive compile-time overhead comes from those loop operations
in ISL library. Some preliminary results can be seen on
http://llvm.org/bugs/show_bug.cgi?id=14240.


Sebastian, I wonder whether you have further results or suggestions besides
those words posted on the Bugzilla page? Any information or suggestion would be
appreciated.


Thanks,
Star Tan
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20130726/79eb5206/attachment.html>

Tobias Grosser

2013-Jul-26 06:14 UTC

head link

[LLVMdev] [Polly] Analysis of the expensive compile-time overhead of Polly Dependence pass

On 07/25/2013 09:01 PM, Star Tan wrote:> Hi Sebastian,
>
>
> Recently, I found the "Polly - Calculate dependences" pass would
lead to significant compile-time overhead when compiling some loop-intensive
source code. Tobias told me you found similar problem as follows:
> http://llvm.org/bugs/show_bug.cgi?id=14240
>
>
> My evaluation shows that "Polly - Calculate dependences" pass
consumes 96.4% of total compile-time overhead when compiling the program you
proposed. It seems that the expensive compile-time overhead comes from those
loop operations in ISL library. Some preliminary results can be seen on
http://llvm.org/bugs/show_bug.cgi?id=14240.
>
>
> Sebastian, I wonder whether you have further results or suggestions besides
those words posted on the Bugzilla page? Any information or suggestion would be
appreciated.
Hi Star Tan,

thanks for looking into this. Just to sum up, the test case that
shows this slowness is the following:

void bar();
void foo(short *input) {
   short x0, x1;
   char i, ctr;

   for(i = 0; i < 8; i++) {

     // SCoP begin
     for (ctr = 0; ctr < 8; ctr++) {
       x0 = input[i*64 + ctr*8 + 0] ;
       x1 = input[i*64 + ctr*8 + 1] ;

       input[i*64 + ctr*8 + 0] = x0 - x1;
       input[i*64 + ctr*8 + 1] = x0 + x1;
       input[i*64 + ctr*8 + 2] = x0 * x1;
     }
     // SCoP end

     bar(); // Unknown function call stops further expansion of SCoP
   }
}

Which is translated to the following scop:

     Context:
     [p_0, p_1, p_2] -> {  : p_0 >= -2147483648 and p_0 <= 2147483647 
and p_1 >= -2147483648 and p_1 <= 2147483647 and p_2 >= -2147483648 and
p_2 <= 2147483647 }
     p_0: {0,+,128}<%for.cond2.preheader>
     p_1: {2,+,128}<%for.cond2.preheader>
     p_2: {4,+,128}<%for.cond2.preheader>
[...]

The interesting observation is, that Polly introduces three parameters 
(p_0, p_1, p_2) for this SCoP, even though in the C source code only the 
variable 'i' is SCoP invariant. However, due to the way 
SCEVExpr(essions) in LLVM are nested, Polly sees three scop-invariant 
SCEVExpr(essions) which are all translated into independent parameters.
However, as we can see, the only difference between the three
parameters is a different constant in the base of the AddRecExpr. If we 
would just introduce p_0 (the parameter where the scev-base is zero) and 
express any use of p_1 as p_0 + 2 and p_2 as p_0 + 4, isl could solve 
this problem very quickly.
There are other ways to improve performance which include further tuning 
isl or reducing the precision of the analysis, but in this case
I don't think we need to look into them. The above fix should give us 
good performance and additionally also increases the precision of the 
result (as isl now understands the relation between the different 
parameters).

To fix this, you probably would like to look into the SCEVAffinator 
class and the way parameters are handled.

Cheers,
Tobias

Star Tan

2013-Jul-26 15:30 UTC

head link

[LLVMdev] [Polly] Analysis of the expensive compile-time overhead of Polly Dependence pass

At 2013-07-26 14:14:51,"Tobias Grosser" <tobias at grosser.es>
wrote:
>On 07/25/2013 09:01 PM, Star Tan wrote:
>> Hi Sebastian,
>>
>>
>> Recently, I found the "Polly - Calculate dependences" pass
would lead to significant compile-time overhead when compiling some
loop-intensive source code. Tobias told me you found similar problem as follows:
>> http://llvm.org/bugs/show_bug.cgi?id=14240
>>
>>
>> My evaluation shows that "Polly - Calculate dependences" pass
consumes 96.4% of total compile-time overhead when compiling the program you
proposed. It seems that the expensive compile-time overhead comes from those
loop operations in ISL library. Some preliminary results can be seen on
http://llvm.org/bugs/show_bug.cgi?id=14240.
>>
>>
>> Sebastian, I wonder whether you have further results or suggestions
besides those words posted on the Bugzilla page? Any information or suggestion
would be appreciated.
>
>Hi Star Tan,
>
>thanks for looking into this. Just to sum up, the test case that
>shows this slowness is the following:
>
>void bar();
>void foo(short *input) {
>   short x0, x1;
>   char i, ctr;
>
>   for(i = 0; i < 8; i++) {
>
>     // SCoP begin
>     for (ctr = 0; ctr < 8; ctr++) {
>       x0 = input[i*64 + ctr*8 + 0] ;
>       x1 = input[i*64 + ctr*8 + 1] ;
>
>       input[i*64 + ctr*8 + 0] = x0 - x1;
>       input[i*64 + ctr*8 + 1] = x0 + x1;
>       input[i*64 + ctr*8 + 2] = x0 * x1;
>     }
>     // SCoP end
>
>     bar(); // Unknown function call stops further expansion of SCoP
>   }
>}
>
>Which is translated to the following scop:
>
>     Context:
>     [p_0, p_1, p_2] -> {  : p_0 >= -2147483648 and p_0 <=
2147483647
>and p_1 >= -2147483648 and p_1 <= 2147483647 and p_2 >= -2147483648
and
>p_2 <= 2147483647 }
>     p_0: {0,+,128}<%for.cond2.preheader>
>     p_1: {2,+,128}<%for.cond2.preheader>
>     p_2: {4,+,128}<%for.cond2.preheader>
>[...]
>
>The interesting observation is, that Polly introduces three parameters 
>(p_0, p_1, p_2) for this SCoP, even though in the C source code only the 
>variable 'i' is SCoP invariant. However, due to the way 
>SCEVExpr(essions) in LLVM are nested, Polly sees three scop-invariant 
>SCEVExpr(essions) which are all translated into independent parameters.
>However, as we can see, the only difference between the three
>parameters is a different constant in the base of the AddRecExpr. If we 
>would just introduce p_0 (the parameter where the scev-base is zero) and 
>express any use of p_1 as p_0 + 2 and p_2 as p_0 + 4, isl could solve 
>this problem very quickly.
>There are other ways to improve performance which include further tuning 
>isl or reducing the precision of the analysis, but in this case
>I don't think we need to look into them. The above fix should give us 
>good performance and additionally also increases the precision of the 
>result (as isl now understands the relation between the different 
>parameters).
>
>To fix this, you probably would like to look into the SCEVAffinator 
>class and the way parameters are handled.
>Thank you so much. Maybe I was in the wrong direction at first -:)

Yes, you are right. I should first investigate why Polly introduces three
parameters rather than just one parameter.
Best wishes,
Star Tan
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20130726/ec4c2b2d/attachment.html>

Tobias Grosser

2013-Jul-28 23:42 UTC

head link

[LLVMdev] [Polly] Analysis of the expensive compile-time overhead of Polly Dependence pass

On 07/28/2013 06:52 AM, Star Tan wrote:> Hi Tobias,
>
> I tried to investigated the problem related to ScopInfo, but I need your
> help on handling some problems about ISL and SCEV.
I copied the list as the discussion may be helpful for others.

@Sven, no need to read all. Just search for your name.

[..]
>>The interesting observation is, that Polly introduces three parameters
>>(p_0, p_1, p_2) for this SCoP, even though in the C source code only the
>>variable 'i' is SCoP invariant. However, due to the way
>>SCEVExpr(essions) in LLVM are nested, Polly sees three scop-invariant
>>SCEVExpr(essions) which are all translated into independent parameters.
>>However, as we can see, the only difference between the three
>>parameters is a different constant in the base of the AddRecExpr. If we
>>would just introduce p_0 (the parameter where the scev-base is zero) and
>>express any use of p_1 as p_0 + 2 and p_2 as p_0 + 4, isl could solve
>>this problem very quickly.
>
> Currently, Polly introduces three parameters because it treats (base, step)
as a whole for each parameter. You are right, it seems we could usea single
parameter to represent all these accesses.
>
> I have attached a patch file for this purpose. This patch file is just a
hack file to see whether three parameters can be merged into one parameter and
whether the compile-time could be reduce after that. The key of this patch is
the added source code inScop::addParams, which only add parameters if the new
parameter has different "step" value. In other ways, we skip the
"base" value when compare parameters.
> With this patch file, the Scop becomes:
>
> Context:
>      p0: {0,+,128}<%for.cond2.preheader>
>      Statements {
>      	Stmt_for_body6
>              Domain :>                  { Stmt_for_body6[i0] : i0 >=
0 and i0 <= 7 };
>              Scattering :>                  { Stmt_for_body6[i0] ->
scattering[0, i0, 0] };
>              ReadAccess :>                  [p_0] -> {
Stmt_for_body6[i0] -> MemRef_input[o0] : 2o0 = p_0 + 16i0 };
>              ReadAccess :>                  [p_0] -> {
Stmt_for_body6[i0] -> MemRef_input[o0] : 2o0 = 2 + p_0 + 16i0 };
>              MustWriteAccess :>                  [p_0] -> {
Stmt_for_body6[i0] -> MemRef_input[o0] : 2o0 = p_0 + 16i0 };
>              MustWriteAccess :>                  [p_0] -> {
Stmt_for_body6[i0] -> MemRef_input[o0] : 2o0 = 2 + p_0 + 16i0 };
>              MustWriteAccess :>                  [p_0] -> {
Stmt_for_body6[i0] -> MemRef_input[o0] : 2o0 = 4 + p_0 + 16i0 };
>      }
>
> Unfortunately, the compile-time is still very high!
This _looks_ already a lot better. Very nice!
> I have investigated a little about it. Maybe this is
becauseisl_union_map_add_mapin Polly-dependencewould still introduces new
parameters based on the resulted produced by ScopInfo.
>
> Without this patch file, the Read, Write, MayWrite (input to ISL
library)calculated by collectInfo are:
>
> Read:[p_0, p_1, p_2] -> { Stmt_for_body6[i0] -> MemRef_input[o0] :
(2o0 = p_0 + 16i0 and i0 >= 0 and i0 <= 7) or (2o0 = p_1 + 16i0 and i0
>= 0 and i0 <= 7) }
> Write:[p_0, p_1, p_2] -> { Stmt_for_body6[i0] -> MemRef_input[o0] :
(2o0 = p_0 + 16i0 and i0 >= 0 and i0 <= 7) or (2o0 = p_1 + 16i0 and i0
>= 0 and i0 <= 7) or (2o0 = p_2 + 16i0 and i0 >= 0 and i0 <= 7) }
> Maywrite:[p_0, p_1, p_2] -> {  }
> Schedule:[p_0, p_1, p_2] -> { Stmt_for_body6[i0] -> scattering[0, i0,
0] }
>
> But with our patch file, they become:
>
> Read: [p_0, p_0'] -> { Stmt_for_body6[i0] -> MemRef_input[o0] :
(2o0 = p_0' + 16i0 and i0 >= 0 and i0 <= 7) or (2o0 = 2 + p_0 + 16i0
and i0 >= 0 and i0 <= 7) }
> Write: [p_0, p_0', p_0''] -> { Stmt_for_body6[i0] ->
MemRef_input[o0] : (2o0 = p_0'' + 16i0 and i0 >= 0 and i0 <= 7) or
(2o0 = 2 + p_0' + 16i0 and i0 >= 0 and i0 <= 7) or (2o0 = 4 + p_0 +
16i0 and i0 >= 0 and i0 <= 7) }
> MayWrite: {  }
> Schedule: { Stmt_for_body6[i0] -> scattering[0, i0, 0] }
> It seems that theisl_union_map_add_map function called from 
Dependences::collectInfoautomatically introduces extra parameters, but I have no
idea why this happens.
>
> Could you give me some further suggestion on this problem?
Oh, you are on the right way. Even though we only see parameters 'p_0', 
isl seems to believe those values 'p_0' are not identical. We see this 
as isl renames them such that the difference becomes obvious. 
Unfortunately, as long as isl does not believe those parameters are 
identical, it will still be stuck in long calculations.

To understand the above behaviour we need to understand how isl 
determines if two parameter dimensions refer to the same parameter. What 
you would probably expect is that the union of the two sets [p] -> {[i] 
= p} and [p] -> {[i] = p} is a single set [p] -> {[i] = p} as the two 
parameters have the very same name. However, this is not always true. 
Even though parameters share the same name, they may in fact be 
different isl_id objects. isl_id_alloc() takes a parameter name and a 
pointer and only if both of them are identical, the very same isl_id 
object is returned.

Sven: In terms of making the behaviour of isl easier to understand,
it may make sense to fail/assert in case operands have parameters that
are named identical, but that refer to different pointer values.

So the question is why do you obtain such different isl_ids and what
can be done to fix this. I don't have a solution ready, but I have
some questions in the patch that may help you when looking into this.
> diff --git a/include/polly/ScopInfo.h b/include/polly/ScopInfo.h
> index 8c56582..bab7763 100755
> --- a/include/polly/ScopInfo.h
> +++ b/include/polly/ScopInfo.h
> @@ -445,6 +445,7 @@ class Scop {
>     /// The isl_ids that are used to represent the parameters
>     typedef std::map<const SCEV *, int> ParamIdType;
>     ParamIdType ParameterIds;
> +  ParamIdType ParameterIdsSpace;
Why are you introducing another member variable here? Why don't you keep
using ParameterIds? What is the this variable supposed to track?
> diff --git a/lib/Analysis/Dependences.cpp b/lib/Analysis/Dependences.cpp
> index 5a185d0..9f918f3 100644
> --- a/lib/Analysis/Dependences.cpp
> +++ b/lib/Analysis/Dependences.cpp
> @@ -30,6 +30,9 @@
>   #include <isl/map.h>
>   #include <isl/set.h>
>
> +#define DEBUG_TYPE "polly-dependence"
> +#include "llvm/Support/Debug.h"
> +
>   using namespace polly;
>   using namespace llvm;
>
> @@ -88,8 +91,15 @@ void Dependences::collectInfo(Scop &S, isl_union_map
**Read,
>   void Dependences::calculateDependences(Scop &S) {
>     isl_union_map *Read, *Write, *MayWrite, *Schedule;
>
> +  DEBUG(dbgs() << "Scop: " << S <<
"\n");
> +
>     collectInfo(S, &Read, &Write, &MayWrite, &Schedule);
>
> +  DEBUG(dbgs() << "Read: " << Read <<
"\n";
> +        dbgs() << "Write: " << Write <<
"\n";
> +        dbgs() << "MayWrite: " << MayWrite <<
"\n";
> +        dbgs() << "Schedule: " << Schedule <<
"\n");
> +
>     if (OptAnalysisType == VALUE_BASED_ANALYSIS) {
>       isl_union_map_compute_flow(
>           isl_union_map_copy(Read), isl_union_map_copy(Write),
> @@ -131,6 +141,8 @@ void Dependences::calculateDependences(Scop &S) {
>     RAW = isl_union_map_coalesce(RAW);
>     WAW = isl_union_map_coalesce(WAW);
>     WAR = isl_union_map_coalesce(WAR);
> +
> +  DEBUG(printScop(dbgs()));
>   }
This patch looks good/helpful by itself. I propose to submit it 
separately for commit.

>   bool Dependences::runOnScop(Scop &S) {
> diff --git a/lib/Analysis/ScopInfo.cpp b/lib/Analysis/ScopInfo.cpp
> index aa72f3e..6fc838e 100644
> --- a/lib/Analysis/ScopInfo.cpp
> +++ b/lib/Analysis/ScopInfo.cpp
> @@ -86,6 +86,14 @@ public:
>         isl_aff *Affine >            
isl_aff_zero_on_domain(isl_local_space_from_space(Space));
>         Affine = isl_aff_add_coefficient_si(Affine, isl_dim_param, 0, 1);
> +      if (Scev->getSCEVType() == scAddRecExpr) {
> +        const SCEVAddRecExpr *AR = cast<SCEVAddRecExpr>(Scev);
The canonical pattern for this is:

if (SCEVAddRecExpr *AR = dyn_cast<SCEVAddRecExpr>(Scev)) {
> +        const SCEVConstant *c =
cast<SCEVConstant>(AR->getOperand(0));
This is obviously a hack. The base is not always a constant.

You can probably just call use something like,

isl_pw_aff *BaseValue = visit(AR->getOperand(0))
Affine = isl_pw_aff_sum(Affine, BaseValue);
> +        isl_int t;
> +        isl_int_init(t);
> +        isl_int_set_si(t, c->getValue()->getSExtValue());
We now use isl_val instead of isl_int.
>         return isl_pw_aff_alloc(Domain, Affine);
>       }
> @@ -717,6 +725,34 @@ void Scop::addParams(std::vector<const SCEV *>
NewParameters) {
>       if (ParameterIds.find(Parameter) != ParameterIds.end())
>         continue;
>
> +    if (Parameter->getSCEVType() == scAddRecExpr) {
> +      int index, maxIndex = Parameters.size();
> +      const SCEVAddRecExpr *AR = cast<SCEVAddRecExpr>(Parameter);
> +
> +      for (index = 0; index < Parameters.size(); ++index) {
> +        const SCEV *pOld = Parameters[index];
> +
> +        if (pOld->getSCEVType() != scAddRecExpr)
> +          continue;
> +
> +        const SCEVAddRecExpr *OAR = cast<SCEVAddRecExpr>(pOld);
> +        if (AR->getNumOperands() != OAR->getNumOperands())
> +          continue;
> +
> +        unsigned i, e;
> +        for (i = 1, e = AR->getNumOperands(); i != e; ++i){
> +          if (AR->getOperand(i) != OAR->getOperand(i))
> +            break;
> +        }
> +
> +        if (i == e) {
> +          // Parameters.push_back(Parameter);
> +          ParameterIds[Parameter] = index;
> +          return;
> +        }
> +      }
> +    }
I think this is the right idea, but probably the wrong place to put it.

I would put this into SCEVValidator::visitAddRecExpr. This function 
always adds the AddRecExpr itself as a parameter, whenever it is found 
to be parametric. However, what we should do is to create a new ScevExpr
that starts at zero and is otherwise identical. We then add this as a 
parameter. When doing this, we now also need to keep all the parameters
that have been found previously in the base expression.
>       int dimension = Parameters.size();
>
>       Parameters.push_back(Parameter);
> @@ -777,9 +813,9 @@ void Scop::addParameterBounds() {
>
>   void Scop::realignParams() {
>     // Add all parameters into a common model.
> -  isl_space *Space = isl_space_params_alloc(IslCtx, ParameterIds.size());
> +  isl_space *Space = isl_space_params_alloc(IslCtx,
ParameterIdsSpace.size());
>
> -  for (ParamIdType::iterator PI = ParameterIds.begin(), PE =
ParameterIds.end();
> +  for (ParamIdType::iterator PI = ParameterIdsSpace.begin(), PE =
ParameterIdsSpace.end();
I do not see why those changes are necessary.
>          PI != PE; ++PI) {
>       const SCEV *Parameter = PI->first;
>       isl_id *id = getIdForParam(Parameter);
> @@ -787,7 +823,7 @@ void Scop::realignParams() {
>     }
>
>     // Align the parameters of all data structures to the model.
> -  Context = isl_set_align_params(Context, Space);
> +//  Context = isl_set_align_params(Context, Space);
Also no idea why this one is necessary.

Hope this helps,
Tobias

Tobias Grosser

2013-Jul-29 14:37 UTC

head link

[LLVMdev] [Polly] Analysis of the expensive compile-time overhead of Polly Dependence pass

On 07/29/2013 03:18 AM, Sven Verdoolaege wrote:> On Sun, Jul 28, 2013 at 04:42:25PM -0700, Tobias Grosser wrote:
>> Sven: In terms of making the behaviour of isl easier to understand,
>> it may make sense to fail/assert in case operands have parameters that
>> are named identical, but that refer to different pointer values.
>
> No, you are allowed to have different identifiers with the same name.
> I could optionally print the pointer values, but then I'd have
> to think about what to do with them when reading a textual
> representation of a set with such pointer values in them.
Yes, this is how it is today. I wondered if there is actually a need to
allow the use of different identifiers with the same name (except all 
being unnamed?). I personally do not see such a need and would prefer 
isl to assert/fail in case someone tries to do so. This may avoid 
confusions as happened here. Do you see a reason why isl should allow this?

Cheers,
Tobias

Tobias Grosser

2013-Jul-29 16:27 UTC

head link

[LLVMdev] [Polly] Analysis of the expensive compile-time overhead of Polly Dependence pass

On 07/29/2013 09:15 AM, Sven Verdoolaege wrote:> On Mon, Jul 29, 2013 at 07:37:14AM -0700, Tobias Grosser wrote:
>> On 07/29/2013 03:18 AM, Sven Verdoolaege wrote:
>>> On Sun, Jul 28, 2013 at 04:42:25PM -0700, Tobias Grosser wrote:
>>>> Sven: In terms of making the behaviour of isl easier to
understand,
>>>> it may make sense to fail/assert in case operands have
parameters that
>>>> are named identical, but that refer to different pointer
values.
>>>
>>> No, you are allowed to have different identifiers with the same
name.
>>> I could optionally print the pointer values, but then I'd have
>>> to think about what to do with them when reading a textual
>>> representation of a set with such pointer values in them.
>>
>> Yes, this is how it is today.
>
> No, the pointer values are currently not printed.
I was referring to the first sentence. I do not think printing pointer 
values is what we want. It would make the output unpredictable not only 
when address space randomisation is involved.
>> I wondered if there is actually a need to
>> allow the use of different identifiers with the same name (except all
being
>> unnamed?). I personally do not see such a need and would prefer isl to
>> assert/fail in case someone tries to do so. This may avoid confusions
as
>> happened here. Do you see a reason why isl should allow this?
>
> Removing this feature would break existing users.
Even if it would, the benefits for future users may outweigh this.
Also, are you aware of a user that actually breaks?

Anyway, on the Polly side we know the behaviour and can handle it. So 
this is nothing I am very strong about. I just mentioned it as it seemed 
to be a good idea.

Cheers,
Tobias

Star Tan

2013-Jul-31 04:13 UTC

head link

[LLVMdev] [Polly] Analysis of the expensive compile-time overhead of Polly Dependence pass

Hi Tobias and Sven,


Thanks for your discussion and suggestion. 


@Sven: ISL actually allows users to have different identifiers with the same
name. The problem that we have discussed is caused by incorrect usage of
isl_space in Polly, so please do not worry about ISL library. You can skip the
following information related to Polly implementation.


@Tobias and Polly developers:
I have attached a patch file to fix this problem. The key idea is to keep ISL
library always seeing the same single parameter for those memory accesses that
use the same index variables even though they have different constant base
values.


For the example we have seen before:
for(i = 0; i < 8; i++) { for (ctr = 0; ctr < 8; ctr++) { x1 = input[i*64 +
ctr*8 + 1] ; x0 = input[i*64 + ctr*8 + 0] ; input[i*64 + ctr*8 + 0] = x0 - x1;
input[i*64 + ctr*8 + 1] = x0 + x1; input[i*64 + ctr*8 + 2] = x0 * x1; }


Without this patch file, Polly would produce the Context as follows:


  Context:
[p_0, p_1, p_2] -> { : p_0 >= -9223372036854775808 and p_0 <=
9223372036854775807 and p_1 >= -9223372036854775808 and p_1 <=
9223372036854775807 and p_2 >= -9223372036854775808 and p_2 <=
9223372036854775807 } p0: {0,+,128}<%for.cond2.preheader> p1:
{2,+,128}<%for.cond2.preheader> p2: {4,+,128}<%for.cond2.preheader>
Statements { Stmt_for_body6 Domain := [p_0, p_1, p_2] -> { Stmt_for_body6[i0]
: i0 >= 0 and i0 <= 7 }; Scattering := [p_0, p_1, p_2] -> {
Stmt_for_body6[i0] -> scattering[0, i0, 0] }; ReadAccess := [p_0, p_1, p_2]
-> { Stmt_for_body6[i0] -> MemRef_input[o0] : 2o0 = p_0 + 16i0 };
ReadAccess := [p_0, p_1, p_2] -> { Stmt_for_body6[i0] -> MemRef_input[o0]
: 2o0 = p_1 + 16i0 }; MustWriteAccess := [p_0, p_1, p_2] -> {
Stmt_for_body6[i0] -> MemRef_input[o0] : 2o0 = p_0 + 16i0 }; MustWriteAccess
:= [p_0, p_1, p_2] -> { Stmt_for_body6[i0] -> MemRef_input[o0] : 2o0 = p_1
+ 16i0 }; MustWriteAccess := [p_0, p_1, p_2] -> { Stmt_for_body6[i0] ->
MemRef_input[o0] : 2o0 = p_2 + !
 16i0 }; }


Furthermore, it will produce very complex RAW, WAW and WAR dependence as
follows:


RAW dependences: [p_0, p_1, p_2] -> { Stmt_for_body6[i0] ->
Stmt_for_body6[o0] : (exists (e0 = [(p_1)/2], e1 = [(7p_1 + p_2 + 112i0)/16]:
16o0 = -p_0 + p_1 + 16i0 and 2e0 = p_1 and 16i0 >= -p_1 + p_2 and 16i0 <=
112 + p_0 - p_1 and p_1 >= 16 + p_0 and i0 >= 0 and 16e1 >= -15 + 7p_1
+ p_2 + 112i0 and 16e1 <= -1 + 7p_1 + p_2 + 112i0 and p_2 >= 16 + p_0)) or
... or (exists (e0 = [(p_1)/2], e1 = [(-p_1 + p_2)/16]: 16o0 = p_0 - p_1 + 16i0
and 2e0 = p_1 and 16e1 = -p_1 + p_2 and 16i0 >= -p_0 + p_2 and 16i0 <= 112
- p_0 + p_1 and p_1 <= -16 + p_0 and p_2 >= 16 + p_0)) }


It not only leads to significant compile-time overhead, but also produces
incorrect dependence results.  As we can see from the source code, there should
be no RAW, WAW and WAR across loop iterations at all.


With this patch file, Polly would produce the following Context and RAW, WAW,
WAR dependence:


Context:
    [p_0] -> {  : p_0 >= -9223372036854775808 and p_0 <=
9223372036854775807 }
    p0: {0,+,128}<%for.cond2.preheader>
    Statements {
    Stmt_for_body6
            Domain :                [p_0] -> { Stmt_for_body6[i0] : i0 >=
0 and i0 <= 7 };
            Scattering :                [p_0] -> { Stmt_for_body6[i0] ->
scattering[0, i0, 0] };
            ReadAccess := 
                [p_0] -> { Stmt_for_body6[i0] -> MemRef_input[o0] : 2o0 =
p_0 + 16i0 };
            ReadAccess := 
                [p_0] -> { Stmt_for_body6[i0] -> MemRef_input[o0] : 2o0 =
2 + p_0 + 16i0 };
            MustWriteAccess := 
                [p_0] -> { Stmt_for_body6[i0] -> MemRef_input[o0] : 2o0 =
p_0 + 16i0 };
            MustWriteAccess := 
                [p_0] -> { Stmt_for_body6[i0] -> MemRef_input[o0] : 2o0 =
2 + p_0 + 16i0 };
            MustWriteAccess := 
                [p_0] -> { Stmt_for_body6[i0] -> MemRef_input[o0] : 2o0 =
4 + p_0 + 16i0 };
    }


RAW dependences:
[p_0] -> {  }
WAR dependences:
[p_0] -> {  }
WAW dependences:
[p_0] -> {  }


We can see it only creates a single parameter and it can catch the exact RAR,
WAW and WAW dependence.


To see whether our patch file could catch normal data dependence, I have
constructed another example as follows:


  for(i = 0; i < 8; i++) {
    for (ctr = 0; ctr < 8; ctr++) {
      x0 = input[i*64 + ctr + 0] ; 
      x1 = input[i*64 + ctr + 1] ;
      input[i*64 + ctr + 0] = x0 - x1;
      input[i*64 + ctr + 1] = x0 + x1;
      input[i*64 + ctr + 2] = x0 * x1;
    }


The original Polly would produce similar complex results. However, with the
attached patch file, Polly would produce the following Context and RAW, WAW, WAR
dependence:
Context:
    [p_0] -> {  : p_0 >= -2147483648 and p_0 <= 2147483647 }
    p0: {0,+,128}<%for.cond2.preheader>
    Statements {
    Stmt_for_body6
            Domain :                [p_0] -> { Stmt_for_body6[i0] : i0 >=
0 and i0 <= 7 };
            Scattering :                [p_0] -> { Stmt_for_body6[i0] ->
scattering[0, i0, 0] };
            ReadAccess := 
                [p_0] -> { Stmt_for_body6[i0] -> MemRef_input[o0] : 2o0 =
p_0 + 2i0 };
            ReadAccess := 
                [p_0] -> { Stmt_for_body6[i0] -> MemRef_input[o0] : 2o0 =
2 + p_0 + 2i0 };
            MustWriteAccess := 
                [p_0] -> { Stmt_for_body6[i0] -> MemRef_input[o0] : 2o0 =
p_0 + 2i0 };
            MustWriteAccess := 
                [p_0] -> { Stmt_for_body6[i0] -> MemRef_input[o0] : 2o0 =
2 + p_0 + 2i0 };
            MustWriteAccess := 
                [p_0] -> { Stmt_for_body6[i0] -> MemRef_input[o0] : 2o0 =
4 + p_0 + 2i0 };
    }


RAW dependences:
[p_0] -> { Stmt_for_body6[i0] -> Stmt_for_body6[1 + i0] : exists (e0 =
[(p_0)/2]: 2e0 = p_0 and i0 >= 0 and i0 <= 6) }
WAR dependences:
[p_0] -> {  }
WAW dependences:
[p_0] -> { Stmt_for_body6[i0] -> Stmt_for_body6[1 + i0] : exists (e0 =
[(p_0)/2]: 2e0 = p_0 and i0 >= 0 and i0 <= 6)


Results show that it still creates a single parameter and catches the exact RAR,
WAW and WAW dependence.  Of course, this patch file makes the compiling very
faster, i.e., compile-time is reduced from several minutes to less then 1
second.


@Tobias:>This is obviously a hack. The base is not always a constant.
>You can probably just call use something like,
>isl_pw_aff *BaseValue = visit(AR->getOperand(0))
>Affine = isl_pw_aff_sum(Affine, BaseValue);Currently, I only handle constant base values because I have not found a good
way to handle general base values. isl_pw_aff_add requires two isl_pw_aff
parameters, but Affine is actually of isl_aff type. Perhaps we could first
commit a patch file to handle common cases, then we can considering submitting
another patch file to handle general cases.>I think this is the right idea, but probably the wrong place to put it.
>I would put this into SCEVValidator::visitAddRecExpr. This function 
>always adds the AddRecExpr itself as a parameter, whenever it is found 
>to be parametric. However, what we should do is to create a new ScevExpr
>that starts at zero and is otherwise identical. We then add this as a 
>parameter. When doing this, we now also need to keep all the parameters
>that have been found previously in the base expression.A lot of Polly functions access ParameterIds and Parameters using existing
ScevExpr. If we create new ScevExpr and put them into Parameters and
ParameterIds, it may require some extra tricks to handle the mapping from
existing ScevExpr to newly created ScevExpr. I think we can consider fixing it
later.
Cheers,
Star Tan

At 2013-07-30 00:27:27,"Tobias Grosser" <tobias at grosser.es>
wrote:>On 07/29/2013 09:15 AM, Sven Verdoolaege wrote:
>> On Mon, Jul 29, 2013 at 07:37:14AM -0700, Tobias Grosser wrote:
>>> On 07/29/2013 03:18 AM, Sven Verdoolaege wrote:
>>>> On Sun, Jul 28, 2013 at 04:42:25PM -0700, Tobias Grosser wrote:
>>>>> Sven: In terms of making the behaviour of isl easier to
understand,
>>>>> it may make sense to fail/assert in case operands have
parameters that
>>>>> are named identical, but that refer to different pointer
values.
>>>>
>>>> No, you are allowed to have different identifiers with the same
name.
>>>> I could optionally print the pointer values, but then I'd
have
>>>> to think about what to do with them when reading a textual
>>>> representation of a set with such pointer values in them.
>>>
>>> Yes, this is how it is today.
>>
>> No, the pointer values are currently not printed.
>
>I was referring to the first sentence. I do not think printing pointer 
>values is what we want. It would make the output unpredictable not only 
>when address space randomisation is involved.
>
>>> I wondered if there is actually a need to
>>> allow the use of different identifiers with the same name (except
all being
>>> unnamed?). I personally do not see such a need and would prefer isl
to
>>> assert/fail in case someone tries to do so. This may avoid
confusions as
>>> happened here. Do you see a reason why isl should allow this?
>>
>> Removing this feature would break existing users.
>
>Even if it would, the benefits for future users may outweigh this.
>Also, are you aware of a user that actually breaks?
>
>Anyway, on the Polly side we know the behaviour and can handle it. So 
>this is nothing I am very strong about. I just mentioned it as it seemed 
>to be a good idea.
>
>Cheers,
>Tobias
>-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20130731/274bbbb4/attachment.html>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-ScopInfo-Create-shared-parameter-for-memory-accesses.patch
Type: application/octet-stream
Size: 6694 bytes
Desc: not available
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20130731/274bbbb4/attachment.obj>

Seemingly Similar Threads

Search for more apparently analagous threads

llvm dev - Jul 2013 - [LLVMdev] [Polly] Analysis of the expensive compile-time overhead of Polly Dependence pass

[LLVMdev] [Polly] Analysis of the expensive compile-time overhead of Polly Dependence pass

[LLVMdev] [Polly] Analysis of the expensive compile-time overhead of Polly Dependence pass

[LLVMdev] [Polly] Analysis of the expensive compile-time overhead of Polly Dependence pass

[LLVMdev] [Polly] Analysis of the expensive compile-time overhead of Polly Dependence pass

[LLVMdev] [Polly] Analysis of the expensive compile-time overhead of Polly Dependence pass

[LLVMdev] [Polly] Analysis of the expensive compile-time overhead of Polly Dependence pass

[LLVMdev] [Polly] Analysis of the expensive compile-time overhead of Polly Dependence pass

Seemingly Similar Threads