thr3ads.net - llvm dev - [LLVMdev] The implementation algorithm behind LLVM's RegionInfo class [Jun 2014]

If this information is useful, please help other people find it:
Share via:

Paul Vario

2014-Feb-06 20:14 UTC

[LLVMdev] The implementation algorithm behind LLVM's RegionInfo class

Hi Tobias,

         Thanks a lot for the detailed reply. I am working on several new
optimizations for OpenCL kernels for a DSP architecture. The project itself
has an NDA associated with it, so I cannot go into more details, but the
source will be open to public after completion. One of the first steps is
to serialize the work-items in a work-group (e.g., insert nested loops
around REGIONs in between barriers). At this point, I thought I should
implement a full-featured structural analysis, but some one in my team
suggested to first explore what was already in the LLVM. That's what lead
me to the RegionInfo class. If it's not much trouble, could you send the
draft to my forum email? Also, are you aware of any plan on having the
interval/structural analysis supported in LLVM. Thanks again.

P.S. One thing I need to find out from reading the implementation is why a
sequence of basic blocks do not get to form a region as a whole.

Best Regards,
Paul


On Thu, Feb 6, 2014 at 10:07 AM, Tobias Grosser <tobias at grosser.es>
wrote:
> On 02/06/2014 04:28 PM, Paul Vario wrote:
>
>> Hi fellows,
>>
>>       I am writing to ask what is the algorithm implemented in
LLVM's
>> RegionInfo.h/cpp. In the header file "RegionInfo.h", it says
"Calculates a
>> program structure tree built out of single entry single exit regions
>> (defined in a 1994 paper called "The Program Structure
Tree"). ... ... The
>> algorithm to calculate these data structures however is COMPLETELY
>> DIFFERENT, as it takes advantage of existing information already
available
>> ... ...". Does anyone know any papers talking about the exact
>> implementation? Also, how is the RegionInfo class related to a general
>> interval analysis? Thanks a lot.
>>
>
> Hi Paul,
>
> I should probably reply. I implemented this algorithm four years ago.
>
> Yes, this algorithm is on purpose different. I implemented it after
> investigating previous approaches. There are two main reasons for the
> difference:
>
> 1) The Program Structure Tree does not access so-called refined regions
>
> Refined regions are program regions that do not have a single entry and a
> single exit edge, but can be transformed into such a region very easily. My
> experience was, that most programs have a large number of refined regions
> which we want to detect.
>
> 2) Use of the dominator tree
>
> When I first came to the mailing list and asked for advice people
> (including Chris) strongly suggested to rely on existing analysis as much
> as possible. The RegionInfo pass does this by using the DominatorTree as
> well as a couple of other analysis. My measurements
> at that time have shown that building the RegionTree itself is in average
> twice as fast as building the dominator tree. The other necessary analysis
> have taken time comparable to a dominator tree construction. The use of
> DominanceFrontier in the RegionInfo tree has
> been a point criticized because this analysis is normally not used in LLVM
> (and needs to be built just for RegionInfo). Also the dominance frontier is
> in exceptional cases expensive to construct. On the other side, the
> analysis has shown useful not only for Polly, but a couple of other
> projects (R600 backend?, some OpenCL vectorizers?). Still, if we find a way
> to remove the reliance on the DomFrontier while still being able to detect
> refined regions, this would be very valuable.
>
> I never bothered to publish the Region Info paper, but I have an
> unfinished draft that I am happy to send you. Also, if the documentation
> in the source code is too sparse, please let me know and I add the missing
> information.
>
> Out of interest, why are you interested in the RegionInfo implementation?
>
> Cheers,
> Tobias
>
>-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20140206/ba94771b/attachment.html>

Tobias Grosser

2014-Feb-06 21:40 UTC

head link

[LLVMdev] The implementation algorithm behind LLVM's RegionInfo class

On 02/06/2014 09:14 PM, Paul Vario wrote:> Hi Tobias,
>
>           Thanks a lot for the detailed reply. I am working on several new
> optimizations for OpenCL kernels for a DSP architecture. The project itself
> has an NDA associated with it, so I cannot go into more details, but the
> source will be open to public after completion. One of the first steps is
> to serialize the work-items in a work-group (e.g., insert nested loops
> around REGIONs in between barriers). At this point, I thought I should
> implement a full-featured structural analysis, but some one in my team
> suggested to first explore what was already in the LLVM. That's what
lead
> me to the RegionInfo class. If it's not much trouble, could you send
the
> draft to my forum email? Also, are you aware of any plan on having the
> interval/structural analysis supported in LLVM. Thanks again.
Maybe this is interesting to you:

lib/Transforms/Scalar/StructurizeCFG.cpp
> P.S. One thing I need to find out from reading the implementation is why a
> sequence of basic blocks do not get to form a region as a whole.
A sequence of basic blocks can form a region. The reason the RegionInfo 
does not form such reason is that there is no canonical way to form such 
regions. Hence, RegionInfo is only forming minimal regions, which can 
then be easily joined to larger reasons by connecting sequences of them 
(e.g. using the expandRegion()).

Btw, depending how complex your transformations will be, you might also 
be interested in polly.llvm.org or other automatic transformation in the 
area of automatic GPU/accelerator code generation. Those won't be 
finished solutions, but the frameworks/algorithms that exist there may 
allow to develop optimizations for your hardware a lot faster.

Tobi

Paul Vario

2014-Jun-27 20:04 UTC

head link

[LLVMdev] The implementation algorithm behind LLVM's RegionInfo class

Hi Tobi,

       I have one additional question about the RegionInfo::isRegion
function. In the second case (i.e. Entry dominates Exit),  why is checking
the following two conditions are equivalent to checking it's a refined
region:

For any BB in DF(exit), 1)  BB should be in DF(entry)
                                   2)  BB reachable only from entry through
a path passing exit.

       I.e., why is checking these two conditions is equivalent to ensure
single-entry, single-exit in the sense of Refined Region.

Best Regards,
Paul


On Thu, Feb 6, 2014 at 3:40 PM, Tobias Grosser <tobias at grosser.es>
wrote:
> On 02/06/2014 09:14 PM, Paul Vario wrote:
>
>> Hi Tobias,
>>
>>           Thanks a lot for the detailed reply. I am working on several
new
>> optimizations for OpenCL kernels for a DSP architecture. The project
>> itself
>> has an NDA associated with it, so I cannot go into more details, but
the
>> source will be open to public after completion. One of the first steps
is
>> to serialize the work-items in a work-group (e.g., insert nested loops
>> around REGIONs in between barriers). At this point, I thought I should
>> implement a full-featured structural analysis, but some one in my team
>> suggested to first explore what was already in the LLVM. That's
what lead
>> me to the RegionInfo class. If it's not much trouble, could you
send the
>> draft to my forum email? Also, are you aware of any plan on having the
>> interval/structural analysis supported in LLVM. Thanks again.
>>
>
> Maybe this is interesting to you:
>
> lib/Transforms/Scalar/StructurizeCFG.cpp
>
>
>  P.S. One thing I need to find out from reading the implementation is why a
>> sequence of basic blocks do not get to form a region as a whole.
>>
>
> A sequence of basic blocks can form a region. The reason the RegionInfo
> does not form such reason is that there is no canonical way to form such
> regions. Hence, RegionInfo is only forming minimal regions, which can then
> be easily joined to larger reasons by connecting sequences of them (e.g.
> using the expandRegion()).
>
> Btw, depending how complex your transformations will be, you might also be
> interested in polly.llvm.org or other automatic transformation in the
> area of automatic GPU/accelerator code generation. Those won't be
finished
> solutions, but the frameworks/algorithms that exist there may allow to
> develop optimizations for your hardware a lot faster.
>
> Tobi
>-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20140627/b93f2155/attachment.html>

Seemingly Similar Threads

Search for more possibly parallel threads

llvm dev - Jun 2014 - [LLVMdev] The implementation algorithm behind LLVM's RegionInfo class

[LLVMdev] The implementation algorithm behind LLVM's RegionInfo class

[LLVMdev] The implementation algorithm behind LLVM's RegionInfo class

[LLVMdev] The implementation algorithm behind LLVM's RegionInfo class

Seemingly Similar Threads