thr3ads.net - llvm dev - [LLVMdev] speculative parallelization in LLVM [Jul 2011]

If this information is useful, please help other people find it:
Share via:

Jimborean Alexandra

2011-Jul-19 09:12 UTC

[LLVMdev] speculative parallelization in LLVM

Hi Tobi,

Thank you for your reply :). 

I know that array accesses are handled as pointers in LLVM, but as I understood 
Polly is focused on statically analysable code. As you mentioned: proving that 
pointer accesses actually represent virtual array accesses.

In the case of a linked list for example, parsed with a pointer p = p->next,
I
expect that Polly will not handle this code. So I intend to change this pointer 
into an array (I do not expect to get correct code, but a correct SCoP). 

Something along the lines of:

struct linked{
    int val;
    struct linked* next;
};

typedef struct linked item;
item *curr, *head;

...

while(curr) {
     do_something();
      curr = curr->next ;
 }

This becomes in LLVM IR:

%curr = alloca %struct.linked*, align 8

while..

%tmp15 = load %struct.linked** %curr, align 8

%tmp16 = getelementptr inbounds %struct.linked* %tmp15, i32 0, i32 1      ( curr
= curr->next ;)

And I want to transform this into a SCoP like this:

%curr_array = alloca [10 x %struct.linked], align 8

while..
 %tmp16 = getelementptr inbounds [10 x %struct.linked]* %curr_array, i32 0, i32 
1 

(replace all pointers similarly)

I only want Polly to accept the code (although incorrect at this point) and to 
generate optimized code. Next I will replace back all the fake arrays that I 
introduced, with the original pointer accesses and correct the code. Finally, at
runtime, I perform code instrumentation on the optimized code to check its 
correctness.

Is this approach going to work with Polly? Or can I generate optimized code 
versions with Polly in a different manner when there are pointers and indirect 
references inside the code?

Thanks again and good luck with all your work on Polly!

Alexandra

________________________________
From: Tobias Grosser <tobias at grosser.es>
To: llvmdev at cs.uiuc.edu; Jimborean Alexandra <xinfinity_a at yahoo.com>
Sent: Tue, July 19, 2011 10:19:54 AM
Subject: Re: [LLVMdev] speculative parallelization in LLVM

On 07/18/2011 07:03 PM, Jimborean Alexandra wrote:> Hi,
> 
> I plan to do some speculative parallelization in LLVM using Polly and I
> target loops that contain pointers and indirect references. As far as I
> know, Polly generates optimized code starting from the SCoPs, therefore
> I plan to replace all pointer accesses with array accesses, such that
> Polly will accept the code. Each array access should use a liner
> function of the enclosing loops indices.
> 
> This is how I plan to implement it:
> 
> 1) parse the alloca instructions and find declarations of pointers
> 2) for each pointer
> 3) delete the alloca instruction
> 4) create a new alloca instruction for the new array
> 5) replace all uses of the pointer with the new array
> 
> 
> Feed this code into Polly, get an optimized version and replace back the
> array accesses with the original corresponding pointers.
> 
> Please advise me whether this is a good strategy to replace the pointer
> accesses.

Hi Alexandra,

good to read you and even better that you plan to look into Polly. ;-)

In respect of your plan I still do not exactly get what you want to do. (I still
feel pretty new to all this stuff, so just let me know if something I say sounds
wrong)

In LLVM load and store instructions always load and store from a pointer and 
there is no way to express an array access inside LLVM-IR. What comes closes to 
an array access is the combination of a getElementPtr instruction that 
calculates a pointer into an array plus a load/store instruction. However, this 
is actually not very different from directly calculating the pointer into the 
array based on normal pointer arithmetic.

As a result, in Polly we always analyze the pointer arithmetic (or the effects 
of it) and try to prove that it is conceptually equivalent to an access to a 
normal array. To prove this we use the normal LLVM provided alias analysis. Here
we can also derive that the virtual arrays do not overlap (as mentioned by 
Renato). However, we never transform any pointers into arrays. We just treat 
them conceptually as virtual arrays.

Maybe an example would help me to understand what you plan to do. Can you give a
very simple example (at best in both C and LLVM-IR) that shows what 
transformation you plan to perform and what kind of code you would be able to 
optimize.

There are currently quite some parts in and around Polly that need to be 
improved. Preparing code for Polly such that it fits the format we expect is 
definitely one of this. Indirect references is something, I did not even start 
to think about. Hence, ideas in this direction are more then welcome.

All the best to Strasbourg
Tobi
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20110719/da11c46e/attachment.html>

Renato Golin

2011-Jul-19 09:39 UTC

head link

[LLVMdev] speculative parallelization in LLVM

On 19 July 2011 10:12, Jimborean Alexandra <xinfinity_a at yahoo.com>
wrote:> %curr_array = alloca [10 x %struct.linked], align 8
>
> while..
>  %tmp16 = getelementptr inbounds [10 x %struct.linked]* %curr_array, i32 0,
> i32 1
Hi Alexandra,

Can you guarantee that the linked list will be allocated in contiguous memory?

cheers,
--renato

Jimborean Alexandra

2011-Jul-19 09:46 UTC

head link

[LLVMdev] speculative parallelization in LLVM

Hi Renato,

No, I cannot, but in case it is, I want to take advantage of this. In case it is
not, the instrumentation code will detect this at runtime and simply roll back 
to the original version. I will always keep an original version available, in 
addition to the ones I modify with Polly. However, initially I will speculate 
that it is allocated contiguously.

Thanks,
Alexandra

________________________________
From: Renato Golin <rengolin at systemcall.org>
To: Jimborean Alexandra <xinfinity_a at yahoo.com>
Cc: Tobias Grosser <tobias at grosser.es>; llvmdev at cs.uiuc.edu
Sent: Tue, July 19, 2011 11:39:02 AM
Subject: Re: [LLVMdev] speculative parallelization in LLVM

On 19 July 2011 10:12, Jimborean Alexandra <xinfinity_a at yahoo.com>
wrote:> %curr_array = alloca [10 x %struct.linked], align 8
>
> while..
>  %tmp16 = getelementptr inbounds [10 x %struct.linked]* %curr_array, i32 0,
> i32 1
Hi Alexandra,

Can you guarantee that the linked list will be allocated in contiguous memory?

cheers,
--renato
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20110719/000446d3/attachment.html>

Tobias Grosser

2011-Jul-19 09:53 UTC

head link

[LLVMdev] speculative parallelization in LLVM

On 07/19/2011 11:12 AM, Jimborean Alexandra wrote:> Hi Tobi,
>
> Thank you for your reply :).
>
> I know that array accesses are handled as pointers in LLVM, but as I
> understood Polly is focused on statically analysable code. As you
> mentioned: proving that pointer accesses actually represent virtual
> array accesses.
>
[...]>
> Is this approach going to work with Polly? Or can I generate optimized
> code versions with Polly in a different manner when there are pointers
> and indirect references inside the code?
OK. Perfect. Here we are. Thanks for the nice explanation. Now I get 
what you plan to do and I am extremely interested in how you will solve 
this.

So yes, I assume the translation of your code into statically analyzable 
code should work. The only problem I see is that it may take some time 
to generate code that is really statically analyzable and that at the 
same time can easily be converted back to the original code. Especially 
if afterwords the code is/was further optimized.
Furthermore, it you may trigger some cases that Polly cannot yet handle.

One thing I was reasoning about for a while, is if it is possible to
simplify the generation of code that Polly can recognize, such that 
frontends like clang, but also your tool can generate code that we can 
be sure Polly can handle. Here, I think it would be especially 
interesting to be able to make Polly also parse some kind of virtual 
accesses, were the details of the accesses are hidden and Polly only 
gets the information, that this access acts like an access to a virtual 
array.

Describing this virtual accesses by using some kind of intrinsics 
combined with meta data may be possible.

%1 = polly.virtual_read() !polly !"{A[i][2j][3k]}"
polly.virtual_write(%ptr) !polly !"{A[i][2j][3k]}"

Like this you can simply transform your linked list into virtual 
accesses, and Polly generates the code that executes these accesses,
and at the end you replace them with the actual code.

(The definition of this is far from complete and the example above needs 
probably be changed to be actually usable. Still, I hope it gives the 
general idea.)

Let me know what you think and what you need exactly. Maybe we can work 
out together a good way to represent such virtual accesses.

Cheers
Tobi

Jimborean Alexandra

2011-Jul-19 12:11 UTC

head link

[LLVMdev] speculative parallelization in LLVM

This is exactly want I need to achieve with Polly actually. I think a good idea 
would be to define intrinsics / metadata, as you mentioned, to notify Polly that
even though it cannot analyse these accesses, to ignore them and perform the 
code transformations. We can go even further and maybe describe these accesses 
with some parametric linear functions. 

For instance:

while (cond1){
    while(cond2){
           p=p->next;
    }
}

to introduce virtual iterators of the enclosing loops, i and j , and replace the
accesses inside the loop with virtual accesses that have the form a*i + b*j + c

%1 = polly.virtual_read() !polly !"  {a1*i + b1*j + c1}"
polly.virtual_write(%ptr) !polly !"    {a2*i + b2*j + c2}"

Next at runtime it will be easier to change the virtual accesses to the original
pointers, and to compute the values to the coefficients a1, b1 ... to check if 
they follow the linearity.  I perform dynamic instrumentation to compute the 
coefficients.

However, for applying the transformations, Polly should either totally ignore 
the virtual accesses, or assign some default values to the coefficients and take
them into consideration. Our plan is to create several versions, some with 
different values, lets say  a = 1, b= 1, c = 0, and one version where all the 
virtual accesses are ignored.

What is important is to be able to track the virtual accesses and to be able to 
replace them with the original ones.

Do you think this represents a lot of work in Polly? And do you plan to include 
this kind of support to handle non-statically analysable code? In case this 
doesn't imply significant changes in Polly, I could start working on this.
It
might be a better approach than converting the code into a form accepted by 
Polly :)

Alexandra

________________________________
From: Tobias Grosser <tobias at grosser.es>
To: Jimborean Alexandra <xinfinity_a at yahoo.com>
Cc: llvmdev at cs.uiuc.edu
Sent: Tue, July 19, 2011 11:53:23 AM
Subject: Re: [LLVMdev] speculative parallelization in LLVM

On 07/19/2011 11:12 AM, Jimborean Alexandra wrote:> Hi Tobi,
> 
> Thank you for your reply :).
> 
> I know that array accesses are handled as pointers in LLVM, but as I
> understood Polly is focused on statically analysable code. As you
> mentioned: proving that pointer accesses actually represent virtual
> array accesses.
> 
[...]> 
> Is this approach going to work with Polly? Or can I generate optimized
> code versions with Polly in a different manner when there are pointers
> and indirect references inside the code?
OK. Perfect. Here we are. Thanks for the nice explanation. Now I get what you 
plan to do and I am extremely interested in how you will solve this.

So yes, I assume the translation of your code into statically analyzable code 
should work. The only problem I see is that it may take some time to generate 
code that is really statically analyzable and that at the same time can easily 
be converted back to the original code. Especially if afterwords the code is/was
further optimized.
Furthermore, it you may trigger some cases that Polly cannot yet handle.

One thing I was reasoning about for a while, is if it is possible to
simplify the generation of code that Polly can recognize, such that frontends 
like clang, but also your tool can generate code that we can be sure Polly can 
handle. Here, I think it would be especially interesting to be able to make 
Polly also parse some kind of virtual accesses, were the details of the accesses
are hidden and Polly only gets the information, that this access acts like an 
access to a virtual array.

Describing this virtual accesses by using some kind of intrinsics combined with 
meta data may be possible.

%1 = polly.virtual_read() !polly !"{A[i][2j][3k]}"
polly.virtual_write(%ptr) !polly !"{A[i][2j][3k]}"

Like this you can simply transform your linked list into virtual accesses, and 
Polly generates the code that executes these accesses,
and at the end you replace them with the actual code.

(The definition of this is far from complete and the example above needs 
probably be changed to be actually usable. Still, I hope it gives the general 
idea.)

Let me know what you think and what you need exactly. Maybe we can work out 
together a good way to represent such virtual accesses.

Cheers
Tobi
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20110719/7c082780/attachment.html>

Apparently Analagous Threads

Search for more reasonably related threads

llvm dev - Jul 2011 - [LLVMdev] speculative parallelization in LLVM

[LLVMdev] speculative parallelization in LLVM

[LLVMdev] speculative parallelization in LLVM

[LLVMdev] speculative parallelization in LLVM

[LLVMdev] speculative parallelization in LLVM

[LLVMdev] speculative parallelization in LLVM

Apparently Analagous Threads