thr3ads.net - llvm dev - [LLVMdev] GSOC Proposal: Implement Decoupled Software Pipeline [Apr 2013]

If this information is useful, please help other people find it:
Share via:
César
2013-Apr-29 00:05 UTC
[LLVMdev] GSOC Proposal: Implement Decoupled Software Pipeline

Hello,

below is the first draft of my proposal for GSoC. Any comments/advices
are apreciated.




# Implement Decoupled Software Pipeline (DSWP)
-------------------------------------------------------------------------------


# Abstract
-------------------------------------------------------------------------------

The goal of this project is to implement the automatic parallelization
technique called Decoupled Software Pipeline [1] in LLVM.


# Contents
-------------------------------------------------------------------------------

### Personal Details

Name: Divino César S. Lucas
Email: divcesar AT gmail DOT com
Other contact methods: gtalk, +55 19 8294 0276

### Project Proposal

The goal of this project is to implement DSWP in LLVM. DSWP is an automatic
parallelizing technique targeted to parallelize loops that have cross
iteration dependencies. Being so, DSWP have a more widely application scope
than other automatic parallelizing techniques such as DOALL and Vectorization.
The DSWP algorithm is shown below and a detailed description follows.

%%%%%%%% DSWP Algorithm %%%%%%%%%%%%
DSWP_Procedure(L)
 G = build_dependence_graph(L);
 SCCs = find_SCCs(G)

 If (|SCCs| == 1)
return ;
 EndIf

 DAG_SCC = coalesce_SCCs(G, SCCs)
 P = TPP_algorithm(DAG_SCC, L)

 If (|P| == 1)
return ;
 EndIf

 split_code_into_loops(L, P)

 insert_necessary_flows(L,P)
EndProcedure
%%%%%%%% DSWP Algorithm %%%%%%%%%%%%

This procedure receive as input the loop L in an intermediate representation
and modifies it as a side-effect. The first step in the algorithm is construct
the Program Dependence Graph (PDG)[2], this is a graph that represents all
data, memory and control dependence for the loop. After constructing the PDG
all dependency cycles are identified by finding all the strongly connected
components (SCC) in the PDG. At this point DSWP check if only one SCC was
found, this means that the whole loop form one dependency cycle and thus it is
not possible to decouple the execution of parts of the loop body (it is not
possible to split the body into more than one thread). If the loop has more
than one SCC the next step coalesce each SCC in just one node and form the
DAG_SCC which is a graph that represent the dependency pattern among
the SCCs. Potentially each SCC would be executed by one different
processing element(PE), however due to load balancing and/or
communication latency it may
be preferable to execute more than one SCC in each PE. The procedure
TPP_Algorithm tries to partition the SCCs the best possible among all PEs. If
just one partition was created the algorithm finish since no parallelism would
be extracted. The last but one step is responsible for extracting the SCCs
from the original loop and creating new thread/loops for each SCCs. The last
step is responsible to insert communication between the stages to enable
data forwarding and synchronization.

The main goal of this project is to implement DSWP as a pass in the LLVM opt
tool, however as a "side-effect" of this project new analyzis can be
added to
opt such as -print-pdg to print the program dependence graph in a DOT[3] format
and/or -print-ddg to print the data dependence graph in DOT format.


### Schedule of Deliverables

Week 01: The first week I'll be involved in constructing the control
dependence
graph.

Week 02: The second week I'll have finished the CDG and start implementing
the
data dependence graph construction.

Week 03: In this week I'll work to integrate the CDG and the DDG to form the
program dependence graph.

Week 04: On the first week I'll work to construct the DAG_SCC, that
is, to find the
SCCs in the PDG and coalesce them forming the DAG_SCC.

At the end of the first month I expect to have all analyzis pass implemented,
that is, at this point I'll have the infrastructure to construct a program
dependence graph, find its strongly connected components and coalesce them to
form the DAG_SCC.

Week 05: I plan to spend this week implementing the algorith to partition the
SCCs in pipeline stages.

Week 06: In this week and the next I'll work on implementing the code to
split the original loop body among the pipeline stages.

Week 07: Split the original loop body into stages.

Week 08: In this week and the next I'll work on implementing the necessary
infrastructure to enable the communication and synchronization between
the pipeline stages.

Week 09: Insert communication and synchronization channels.

By the end of the nineth week I expect to have implemented all
mechanism to enable DSWP working. That is, at this point I'll be ready
to start testing the system with real benchmarks.

Week 10: Benchmark the system using LLVM test-suite and correct errors.

Week 11: Benchmark the system using LLVM test-suite and correct errors.

Week 12: Produce final report and documentation of the tasks realized.


### Open Source Development Experience

I've started a project by my own for educational purposes, its goal is to
enable
the emulation of Linux x86 processes [4]. Currently the system is
capable of emulating statically linked programs compiled with
micro-libc. Plans are to improve the system to emulate dynamically
linked programs compiled with micro-libc and libc.

### Academic Experience

Currently I'm PhD. student at University of Campinas - Brazil. My
research is focused in Compilers and Virtual Machines, specifically I
work with automatic parallelization and adaptative optimization
systems.

### Why Me and Why LLVM

I've always looked for a opportunity to contribute to LLVM, however I
find it hard to
contribute without defining clear goals. I expect this project to be
the first of
many contributions I can give to LLVM. In fact I'm decided to start
contributing to
LLVM independent of this project being accepted or not. I see LLVM as a great
opportunity to put in practice many of the concepts I've seem just in
theory.

### References

[1] http://dl.acm.org/citation.cfm?id=1100543
[2] http://dl.acm.org/citation.cfm?id=24041
[3] http://www.graphviz.org/
[4] https://github.com/JohnTortugo/Box
Seemingly Similar Threads

Search for more reasonably related threads
llvm dev - Apr 2013 - [LLVMdev] GSOC Proposal: Implement Decoupled Software Pipeline

[LLVMdev] GSOC Proposal: Implement Decoupled Software Pipeline

Seemingly Similar Threads

Wisdom of the Ancients