On 03/21/2011 02:20 PM, raghesh wrote:> Dear all,
>
> I am Raghesh, a student pursuing M.Tech at Indian Institute of
> Technology, Madras, India.
>
> I would like to make contribution to the Polly project
> (http://wiki.llvm.org/Polyhedral_optimization_framework) as part of
> GSOC 2011. I have gained some experience working in OpenMP Code
> generation for Polly. This is almost stable now and planning to test
> with the polybench benchmarks.
>
> Some of the ideas that can be implemented in Polly is listed below.
> Please let me know your comments on this.
Hi Raghesh,
great that you decided to apply for the summer of code. I put a list of
Polly related ideas on the LLVM wiki, in case someone needs further
ideas (http://wiki.llvm.org/Polly).
In respect of your ideas.
> 1. Increasing the Stability and Coverage of Polly
> -------------------------------------------------------------
>
> Polly can show good speedup on several test cases. there are still
> many programs that cannot be optimized by it. One reason is that Polly
> does not yet support casts like (sext, zext, trunk). Those often
> appear implicit in programs, as they often have 32 bit induction
> variables, but require 64 bit indexes for array subscripts.
>
> For example:
>
> for (int i = 0; i< N; i++)
> A[i] >
> If we translate this to LLVM-IR and keep i an i32 but use an i64 to
> calculate the access to A[i] there will be a sext necessary. Polly
> currently do not handle this.
I think this is a very important part. As Polly now shows some benefits,
we should work on increasing its impact. Especially the extension of our
handling of affine expressions is important for other interesting and
important features, like the detection of variable size, multi
dimensional arrays.
> 2. Testing Real Programs.
> ---------------------------------
> Testing with well known benchmarks like SPEC CPU2006. This will help
> us to find out cases where Polly cannot optimize and improve the
> coverage of Polly on existing Programs.
I also like this part and we should combine it with some public Polly
regression testers. I had one at Ohio State, but it was just testing
make polly-test, but not the test-suite. (Which is just tested
internally be me and Andreas). As I now move back to Europe, we should
make sure the new tester runs both and is located at a public place.
> 3. Profiling in polly
> -----------------------
> The idea is explained below with a few examples. Consider the following
code.
> scanf ( ”%d” ,&b ) ;
>
> for ( i = 0 ; i< N; i +=b) {
> body ;
> }
>
> Polly may not detect this as a SCoP because the variable b is read as
> an user input. So to detect this as a SCoP we instrument the IR with
> the information provided by profiling.
> Suppose using profiling we figure out that most of the time the value
> of b is say 2. we can convert the
> above code as follows.
>
> scanf ( ”%d” ,&b ) ;
> if ( b == 2 ) {
> for ( i = 0 ; i< N; i += 2 ) {
> body ;
> }
> } else {
> f o r ( i = 0 ; i< N; i += b ) {
> body ;
> }
> }
>
> Now with the transformed code the for loop inside ’if’ will be
> detected as a SCoP and can be parallelised.
> Since value of N is 100 most of the time, the overall performance will
> be improved.
>
> Consider another scenario.
> f o r ( i = 0 ; i< N; i ++) {
> body ;
> }
>
> Suppose using profiling we know that N is always very small. So there
> wont be much gain from parallelising
> it. So we have to tell polly that don’t detect this as a SCoP if N is
> less than a specific value.
>
> Some other immediate applications would be
>
> * Automatially derive the best scheduling strategy supported by OpenMP.
> * Adding simple profiling support, to understand how much time is
> spent inside each scop.
>
> Andreas Simbuerger has done some significant work on this and can be
extended.
Yes, profiling is also interesting, though I personally have not thought
too much about this topic, yet. You should probably get input from other
LLVM developers. We may also try to not be too Polly centric. Those kind
of optimizations seem to be useful in other parts of LLVM too. It would
be great to work on a generic infrastructure, where Polly is just one
place where we do function/loop/region versioning.
> 5. Porting Polly to Various architectures.
> -------------------------------------------------
>
> Currently Polly generates everything as 64 bit integer, which is
> problamatic for embedded platforms.
Yes. See the answer to ether.
> 6. Vectorization in Polly
> ------------------------------
> Lot of work needed to be done in this area.
You are right, even if we already got basic vectorization, this needs
still a lot of work. My personal plan is to first integrate Pluto as a
library into Polly. Like this we already get a good locality optimizer.
Than we improve the Pluto algorithm to take SIMDization and cache line
locality into account. (And than we consider alignment, data layout
transformations, ...)
I propose you choose one or two related topics and you extend the
application a little. It would be good to see what problems exist at the
moment and how you plan to solve them. I think it is also helpful to
have a time line that gives an impression how much time you estimate for
each feature.
Cheers
Tobi