thr3ads.net - llvm dev - [LLVMdev] Analysis of polly-detect overhead in oggenc [Jul 2013]

If this information is useful, please help other people find it:
Share via:

Tobias Grosser

2013-Jul-21 17:40 UTC

[LLVMdev] Analysis of polly-detect overhead in oggenc

On 07/21/2013 09:49 AM, Star Tan wrote:> Hi all,
>
>
> I have attached a patch file to reduce the polly-detect overhead.
Great.
> My idea is to avoid calling TypeFinder in Non-DEBUG mode, so
> TypeFinder is only called in DEBUG mode with the DEBUG macro.  This
> patch file did this work with following modifications:
>
>
> First, it keeps most of string information by replacing 
"<<" with "+" operation. For example, code like this:
>      INVALID(CFG, "Non branch instruction terminates BB: " +
BB.getName());
> would be converted into:
>     LastFailure = "Non branch instruction terminates BB: " +
BB.getName().str();
>
>
> Second, it simplifies some complex operations like:
>         INVALID(AffFunc,
>                "Non affine branch in BB '" <<
BB.getName() << "' with LHS: "
>                                            << *LHS << "
and RHS: " << *RHS);
> into:
>        LastFailure = "Non affine branch in BB: " +
BB.getName().str();
> In such cases, some information for "LHS" and "RHS" are
missed.
Yes. And unfortunately, we may also loose the 'name' of unnamed basic
blocks. Basic blocks without a name get formatted as %111 with '111'
being their sequence number. Your above change will not be able to
derive this number.
> However, it only has little affects on the variable
"LastFailure",
> while keeping all information for DEBUG information.
Why is that? It seems the DEBUG output is the very same that gets
written to "LastFailure".
> Since the
> variable "LastFailure" is only used in ScopGraphPrinter, which
should
> only show critical information in graph, I hope such modification is
> acceptable.
Why should we only show critical information? In the GraphPrinter we do
not worry about compile time so much, such that we can easily show
helpful information. We just need to make sure that we do not slow down
the compile-time in the generic case.
> Results show that it has almost the same performance improvements as
> my previous hack patch file, i.e., reducing the compile-time of
> Polly-detect pass from 90s (>80%) to 0.5s (2.5%) when compiling oggenc
> 16X.
Great.
> Postscript to  Tobias:
>   I have also implemented your initial proposal, which uses some class
>   hierarchy to show different Failure information. Unfortunately, I
>   found it would significantly complicate the source code. It not only
>   introduces a lot of dynamic object "new" and "copy"
operations, but
>   also makes source code hard to understand. I argue that very
>   detailed information for the variable "LastFailure" is not
essential
>   because it should only show critical information in the graph. If
>   users want to know detailed failure information, they should refer
>   to DEBUG information. This new patch file keeps most of critical
>   information for "LastFailure" except some detailed information
about
>   Instruction and SCEV.
Interesting. This was also something I was afraid of. Passing new/delete
stuff through the scop detection is probably not what we want.
> Do you have any suggestion?
I do.

Your patch goes in the right direction and it helped to get a better
idea of what should be done. I think the first point that we learned is
that passing class pointers around is probably too distrubing. I also
believe having the formatting of the error messages in the normal scop
detection is not great, as it both adds unrelated stuff to the code, but
also makes it harder to conditionally disable the error reporting. On
the other side, I believe it is good to make the 'return false'
explicit.

Hence, I propose to transform the code in something like the following:

Instead of

  if (checkSomething())
    INVALID(AffFunc, "Test" << SCEV <<);

we should get something like:

  if (checkSomething()) {
    reportInvalidAffFunc(SCEV);
    return false;
  }

The reportInvalidAffFunc is then either a NO-OP (during normal
execution) or it reports the error (in case we need it).

I am not yet fully sure how the reportInvalid* functions should look
like. However, the part of your patch that is easy to commit without
further discussion is to move the 'return false' out of the INVALID
macro. Meaning, translating the above code to:

  if (checkSomething()) {
    INVALID(AffFunc, "Test" << SCEV <<);
    return false;
  }

I propose to submit such a patch first and then focus on the remaining
problems.

Cheers,
Tobias

Star Tan

2013-Jul-22 02:33 UTC

head link

[LLVMdev] Analysis of polly-detect overhead in oggenc

At 2013-07-22 01:40:31,"Tobias Grosser" <tobias at grosser.es>
wrote:
>On 07/21/2013 09:49 AM, Star Tan wrote:
>> Hi all,
>>
>>
>> I have attached a patch file to reduce the polly-detect overhead.
>
>Great.
>
>> My idea is to avoid calling TypeFinder in Non-DEBUG mode, so
>> TypeFinder is only called in DEBUG mode with the DEBUG macro.  This
>> patch file did this work with following modifications:
>>
>>
>> First, it keeps most of string information by replacing 
"<<" with "+" operation. For example, code like this:
>>      INVALID(CFG, "Non branch instruction terminates BB: " +
BB.getName());
>> would be converted into:
>>     LastFailure = "Non branch instruction terminates BB: " +
BB.getName().str();
>>
>>
>> Second, it simplifies some complex operations like:
>>         INVALID(AffFunc,
>>                "Non affine branch in BB '" <<
BB.getName() << "' with LHS: "
>>                                            << *LHS <<
" and RHS: " << *RHS);
>> into:
>>        LastFailure = "Non affine branch in BB: " +
BB.getName().str();
>
>
>> In such cases, some information for "LHS" and "RHS"
are missed.
>
>Yes. And unfortunately, we may also loose the 'name' of unnamed
basic
>blocks. Basic blocks without a name get formatted as %111 with '111'
>being their sequence number. Your above change will not be able to
>derive this number.Yes, but it is much cheaper by using BB.getName().str(). 
Currently, I cannot find a better way to derive unnamed basic block without
calling TypeFinder.>
>> However, it only has little affects on the variable
"LastFailure",
>> while keeping all information for DEBUG information.
>
>Why is that? It seems the DEBUG output is the very same that gets
>written to "LastFailure".No, they are different. For example, the code like this:
        INVALID(AffFunc,
                "Non affine branch in BB '" << BB.getName()
<< "' with LHS: "
                                            << *LHS << " and
RHS: " << *RHS);
would be converted to:
      LastFailure = "Non affine branch in BB: " + BB.getName().str();
       INVALID(AffFunc,
              LastFailure << "' with LHS: " << *LHS
<< " and RHS: " << *RHS);
You see, information about LHS and RHS would be kept in INVALID DEBUG
information.
To keep the information about unnamed basic blocks, I think we can rewrite it
as:
      FailString = "Non affine branch in BB: ";
      INVALID(AffFunc,
              FailString << BB.getName() << "' with LHS:
"
                         << *LHS << " and RHS: " <<
*RHS);
      LastFailure = FailString + BB.getName().str();>
>> Since the
>> variable "LastFailure" is only used in ScopGraphPrinter,
which should
>> only show critical information in graph, I hope such modification is
>> acceptable.
>
>Why should we only show critical information? In the GraphPrinter we do
>not worry about compile time so much, such that we can easily show
>helpful information. We just need to make sure that we do not slow down
>the compile-time in the generic case.To my knowledge, all of those expensive operations are only used for the
variable "LastFailure", which is only used in GraphPrinter. Do you
mean the Graph Printer does not execute in the generic case?  If that is true, I
think we should control those operations for "LastFailure" as follows:
if (In GraphPrinter mode) {
  LastFailure = ...
}
In this way, we can completely remove those operations for
"LastFailure" in the generic case except in GraphPrinter
mode.>
>> Results show that it has almost the same performance improvements as
>> my previous hack patch file, i.e., reducing the compile-time of
>> Polly-detect pass from 90s (>80%) to 0.5s (2.5%) when compiling
oggenc
>> 16X.
>
>Great.
>
>> Postscript to  Tobias:
>>   I have also implemented your initial proposal, which uses some class
>>   hierarchy to show different Failure information. Unfortunately, I
>>   found it would significantly complicate the source code. It not only
>>   introduces a lot of dynamic object "new" and
"copy" operations, but
>>   also makes source code hard to understand. I argue that very
>>   detailed information for the variable "LastFailure" is not
essential
>>   because it should only show critical information in the graph. If
>>   users want to know detailed failure information, they should refer
>>   to DEBUG information. This new patch file keeps most of critical
>>   information for "LastFailure" except some detailed
information about
>>   Instruction and SCEV.
>
>Interesting. This was also something I was afraid of. Passing new/delete
>stuff through the scop detection is probably not what we want.
>
>> Do you have any suggestion?
>
>I do.
>
>Your patch goes in the right direction and it helped to get a better
>idea of what should be done. I think the first point that we learned is
>that passing class pointers around is probably too distrubing. I also
>believe having the formatting of the error messages in the normal scop
>detection is not great, as it both adds unrelated stuff to the code, but
>also makes it harder to conditionally disable the error reporting. On
>the other side, I believe it is good to make the 'return false'
>explicit.
>
>Hence, I propose to transform the code in something like the following:
>
>Instead of
>
>  if (checkSomething())
>    INVALID(AffFunc, "Test" << SCEV <<);
>
>we should get something like:
>
>  if (checkSomething()) {
>    reportInvalidAffFunc(SCEV);
>    return false;
>  }
>
>The reportInvalidAffFunc is then either a NO-OP (during normal
>execution) or it reports the error (in case we need it).What do you mean with "Normal Execution"?  Does the GraphPrinter
executes in "normal case"?
If GraphPrinter is not executes in "normal case", we should move all
operations for "LastFailure" under the condition of 
"GraphPrinter" mode.
>
>I am not yet fully sure how the reportInvalid* functions should look
>like. However, the part of your patch that is easy to commit without
>further discussion is to move the 'return false' out of the INVALID
>macro. Meaning, translating the above code to:
>
>  if (checkSomething()) {
>    INVALID(AffFunc, "Test" << SCEV <<);
>    return false;
>  }
>
>I propose to submit such a patch first and then focus on the remaining
>problems.Yes, I agree with you.  I have attached a patch file to move "return
false" out of INVALID macro.
>
>Cheers,
>Tobias


Thanks,
Star Tan
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20130722/1bc228f8/attachment.html>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-ScopDetect-move-return-false-out-of-INVALID-macro.patch
Type: application/octet-stream
Size: 9214 bytes
Desc: not available
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20130722/1bc228f8/attachment.obj>

Tobias Grosser

2013-Jul-22 04:16 UTC

head link

[LLVMdev] Analysis of polly-detect overhead in oggenc

On 07/21/2013 07:33 PM, Star Tan wrote:> At 2013-07-22 01:40:31,"Tobias Grosser" <tobias at
grosser.es> wrote:
>
>> On 07/21/2013 09:49 AM, Star Tan wrote:
>>> Hi all,
>>>
>>>
>>> I have attached a patch file to reduce the polly-detect overhead.
>>
>> Great.
>>
>>> My idea is to avoid calling TypeFinder in Non-DEBUG mode, so
>>> TypeFinder is only called in DEBUG mode with the DEBUG macro.  This
>>> patch file did this work with following modifications:
>>>
>>>
>>> First, it keeps most of string information by replacing 
"<<" with "+" operation. For example, code like this:
>>>       INVALID(CFG, "Non branch instruction terminates BB:
" + BB.getName());
>>> would be converted into:
>>>      LastFailure = "Non branch instruction terminates BB:
" + BB.getName().str();
>>>
>>>
>>> Second, it simplifies some complex operations like:
>>>          INVALID(AffFunc,
>>>                 "Non affine branch in BB '" <<
BB.getName() << "' with LHS: "
>>>                                             << *LHS <<
" and RHS: " << *RHS);
>>> into:
>>>         LastFailure = "Non affine branch in BB: " +
BB.getName().str();
>>
>>
>>> In such cases, some information for "LHS" and
"RHS" are missed.
>>
>> Yes. And unfortunately, we may also loose the 'name' of unnamed
basic
>> blocks. Basic blocks without a name get formatted as %111 with
'111'
>> being their sequence number. Your above change will not be able to
>> derive this number.
> Yes, but it is much cheaper by using BB.getName().str().
> Currently, I cannot find a better way to derive unnamed basic block without
calling TypeFinder.
Yes, that's the reason why it was used in the first place.
>>> However, it only has little affects on the variable
"LastFailure",
>>> while keeping all information for DEBUG information.
>>
>> Why is that? It seems the DEBUG output is the very same that gets
>> written to "LastFailure".
> No, they are different. For example, the code like this:
>          INVALID(AffFunc,
>                  "Non affine branch in BB '" <<
BB.getName() << "' with LHS: "
>                                              << *LHS << "
and RHS: " << *RHS);
> would be converted to:
>        LastFailure = "Non affine branch in BB: " +
BB.getName().str();
>         INVALID(AffFunc,
>                LastFailure << "' with LHS: " <<
*LHS << " and RHS: " << *RHS);
> You see, information about LHS and RHS would be kept in INVALID DEBUG
information.
> To keep the information about unnamed basic blocks, I think we can rewrite
it as:
>        FailString = "Non affine branch in BB: ";
>        INVALID(AffFunc,
>                FailString << BB.getName() << "' with
LHS: "
>                           << *LHS << " and RHS: "
<< *RHS);
>        LastFailure = FailString + BB.getName().str();
>>
>>> Since the
>>> variable "LastFailure" is only used in ScopGraphPrinter,
which should
>>> only show critical information in graph, I hope such modification
is
>>> acceptable.
>>
>> Why should we only show critical information? In the GraphPrinter we do
>> not worry about compile time so much, such that we can easily show
>> helpful information. We just need to make sure that we do not slow down
>> the compile-time in the generic case.
> To my knowledge, all of those expensive operations are only used for the
variable "LastFailure", which is only used in GraphPrinter. Do you
mean the Graph Printer does not execute in the generic case?  If that is true, I
think we should control those operations for "LastFailure" as follows:
> if (In GraphPrinter mode) {
>    LastFailure = ...
> }
> In this way, we can completely remove those operations for
"LastFailure" in the generic case except in GraphPrinter mode.
Yes.

> What do you mean with "Normal Execution"?  Does the GraphPrinter
executes in "normal case"?
> If GraphPrinter is not executes in "normal case", we should move
all operations for "LastFailure" under the condition of 
"GraphPrinter" mode.
It is not executed in normal mode. So yes, we should move all this under 
the condition of 'GraphPrintingMode'.
>> I am not yet fully sure how the reportInvalid* functions should look
>> like. However, the part of your patch that is easy to commit without
>> further discussion is to move the 'return false' out of the
INVALID
>> macro. Meaning, translating the above code to:
>>
>>   if (checkSomething()) {
>>     INVALID(AffFunc, "Test" << SCEV <<);
>>     return false;
>>   }
>>
>> I propose to submit such a patch first and then focus on the remaining
>> problems.
> Yes, I agree with you.  I have attached a patch file to move "return
false" out of INVALID macro.
Great. Committed in r186805.

I propose two more patches:

	1) Transform the INVALID macro into function calls, that format
            the text and that set LastFailure.

         2) Add checks at the beginning of those function calls and
            continue only if LogErrors is set

The second one is slightly more involved as we would like to turn this 
option on automatically either in -debug mode or if -polly-show or 
-polly-show-only is set.

What do you think? Does this sound right?

Cheers,
Tobias

Possibly Parallel Threads

Search for more maybe matching threads

llvm dev - Jul 2013 - [LLVMdev] Analysis of polly-detect overhead in oggenc

[LLVMdev] Analysis of polly-detect overhead in oggenc

[LLVMdev] Analysis of polly-detect overhead in oggenc

[LLVMdev] Analysis of polly-detect overhead in oggenc

Possibly Parallel Threads