Tobias Grosser
2010-Jun-18 14:03 UTC
[LLVMdev] Speculative Loop Parallelization on LLVM IR
Hi Javed,

On 06/18/10 14:07, Javed Absar wrote:
> Hi:
> I worked on loop-optimization techniques previously using ORC.
> Currently I see lots of research on speculative parallelization of
> loops ... especially because multicores [for embedded systems] are
> becoming popular. In other words, because you have multiple cores, you
> can start some loops [Fast-Track] as if there is no or low
> data-dependence [Partial Parallel Loop-Nest]. The normal part is to
> check later if there was some violation and then roll back etc.
> [something like that ... I don't think people want to hear the full
> story here]. What do you guys think about the feasibility of
> "speculative run-time loop optimization" using LLVM IR as the base?
> I know there are lots of issues with it -
> 1. Loop transformation requires tools like PolyLib (polyhedral
> analysis). They are too computationally expensive for run-time
> optimization.

There is Polly (still in development):
http://wiki.llvm.org/Polyhedral_optimization_framework

And to be clear: computational complexity is not very often the current
problem. It may be, however, that we have not yet reached the point
where it becomes relevant.

> 2. I am not clear how to launch threads at the LLVM IR level (you do
> need to create new threads for some of these RT speculative loop
> parallelization techniques).

You could add calls to the POSIX functions that create a thread to the
LLVM IR, or you could use some higher-level libraries that support
thread creation. The OpenMP runtime libraries, for example, can help
you with this. Here is a link to the GNU libraries that could be used
to try this approach:
http://gcc.gnu.org/onlinedocs/libgomp/

> Is somebody already doing something like this? Does this proposal make
> sense even to start...

I do not yet understand what you want to do. You just want to execute
some loops in parallel without knowing whether they have data
dependencies, and then afterwards you would like to check whether there
were dependencies.
How would you check for violated dependencies? Can you give an example
of a loop that has no dependencies and that you would like to improve?

Tobi
Hi Tobias:

Thanks for replying. So if I understand correctly, in LLVM currently
the polyhedral model is being built (LLVM IR --> Poly Model --> LLVM
IR). This is for compile-time optimization of loop nests [e.g. loop
transformations to expose parallelism or improve locality etc.]. Yes,
that's great for optimizing loop nests.

In addition, since the real value of LLVM to me is its run-time
optimization possibilities, some loop-nest optimizations "need" to be
done at run time. I say "need" because many loops have data
dependencies that cannot be resolved at compile time [for example:

  FOR i = 1 to N
    A[ B[i] ] = A[ i ] + func(...)
  END

Can the iterations of this loop be run in parallel?]

There are two approaches for this kind of problem. One is
inspector/executor; the other is speculation. To keep the story short -
basically, you run the loop iterations in parallel and verify at the
end whether data dependencies were violated. If yes, you rewind and run
the loop sequentially. If for that particular case no data dependencies
were violated, you have gained in execution time [yes, there is a cost
involved in verifying, and the net gain is not always positive].

OK, so what's my point? To be able to do at least some loop
transformations at run time to expose parallelism etc., perhaps some
kind of LLVM IR --> Poly --> LLVM IR support at run time may be
required. Definitely a scaled-down version, since polyhedral
transformations need a lot of processing in my opinion. So if I
understand, we are not there yet ... and maybe I can come back with
some proposal/ideas and cross-check it with you guys.

Thanks
Javed

On 18 June 2010 19:33, Tobias Grosser <grosser at fim.uni-passau.de> wrote:
[quoted text clipped]
--
my homepage: http://www.javedabsar.com
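[The run-in-parallel / verify / rollback scheme Javed describes can be sketched for exactly this `A[B[i]] = A[i] + func(...)` loop. This is an illustrative model, not a real runtime: the "parallel" phase is modelled as an unordered pass over a snapshot, the conflict rule is deliberately conservative (every written index must be unique and must not collide with another iteration's read index), and all names are invented:]

```cpp
#include <cstddef>
#include <functional>
#include <vector>

// Speculatively execute  A[B[i]] = A[i] + f(i)  for i = 0..n-1.
// Reads are served from a snapshot (as an unordered parallel run would
// see them); a shadow array counts writes per element. If a dependence
// is violated, roll back to the snapshot and rerun sequentially.
// Returns true if the speculation succeeded.
bool speculative_loop(std::vector<int>& A, const std::vector<std::size_t>& B,
                      const std::function<int(std::size_t)>& f) {
    std::size_t n = B.size();              // iteration count; reads A[0..n-1]
    std::vector<int> snapshot = A;         // rollback copy + read source
    std::vector<int> writes(A.size(), 0);  // shadow: write counts per element

    bool conflict = false;
    for (std::size_t i = 0; i < n && !conflict; ++i) {  // "parallel" phase
        A[B[i]] = snapshot[i] + f(i);
        if (++writes[B[i]] > 1) conflict = true;        // output dependence
        if (B[i] < n && B[i] != i) conflict = true;     // write hits another
                                                        // iteration's read slot
    }
    if (conflict) {                        // rewind and run sequentially
        A = snapshot;
        for (std::size_t i = 0; i < n; ++i) A[B[i]] = A[i] + f(i);
        return false;                      // verify cost paid, no gain
    }
    return true;                           // speculation paid off
}
```

[The verification and copying here are exactly the "extra cost" Javed mentions: the net gain is positive only when the speculative phase is cheap relative to the loop body and conflicts are rare.]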
On 21 June 2010 06:12, Javed Absar <javed.absar at gmail.com> wrote:
> OK, so what's my point? To be able to do at least some loop
> transformations at run time to expose parallelism etc., perhaps some
> kind of LLVM IR --> Poly --> LLVM IR support at run time may be
> required. Definitely a scaled-down version, since polyhedral
> transformations need a lot of processing in my opinion.

Hi Javed,

This is quite interesting, but there are many dangers that can easily
be forgotten.

First and most obvious, you'll have to take all computation into
account when judging whether there was a net gain, not only the
vectorized execution. It might not be trivial to calculate the gain if
you don't have control of the cycles or if the pipeline is too deep. In
big loops it makes little difference, but there will be a threshold
(depending on machine configuration etc.) below which there will be a
big hit on performance, especially at program start-up, which is the
most critical to the user.

Another phantom is that, when dealing with pointers (not arrays, as you
suggest), locality is relative and depends on *each* execution. You
might tag the loop as "safe" if at first it seems so, but you still
have to check *every* time for disparities, so as not to get caught in
that trap. That also adds to the "extra cost" this mechanism brings.

Last, as you said, the polyhedral machinery is very expensive.
Stripped-down versions might not be suitable to analyse all cases, so
you might end up with a handful of sub-versions, and that adds time and
space complexity to the program. This might also be running in a JITed
environment, which adds further invisible cost, with unpredictable
cache hits, memory usage etc.

My tuppence.

cheers,
--renato
On Mon, Jun 21, 2010 at 1:12 AM, Javed Absar <javed.absar at gmail.com> wrote:
> Hi Tobias:
>
> Thanks for replying. So if I understand correctly, in LLVM currently
> the polyhedral model is being built (LLVM IR --> Poly Model --> LLVM IR).
> This is for compile-time optimization of loop nests [e.g. loop
> transformations to expose parallelism or improve locality etc.]. Yes,
> that's great for optimizing loop nests.
>
> In addition, since the real value of LLVM to me is its run-time
> optimization possibilities, some loop-nest optimizations "need" to be
> done at run time. I say "need" because many loops have data
> dependencies that cannot be resolved at compile time.

In the cases where it matters, most compilers will perform function
versioning and insert data-dependence checks to decide at run time
which version (parallel or not) can be run. The JIT can then simply
eliminate the branches it proves dead, and you are left with either the
parallelized version or the original. As you say, this is the standard
approach - why would you need to run Poly at run time?
On Sun, Jun 20, 2010 at 10:12 PM, Javed Absar <javed.absar at gmail.com> wrote:
> OK, so what's my point? To be able to do at least some loop
> transformations at run time to expose parallelism etc., perhaps some
> kind of LLVM IR --> Poly --> LLVM IR support at run time may be
> required. Definitely a scaled-down version, since polyhedral
> transformations need a lot of processing in my opinion. So if I
> understand, we are not there yet ... and maybe I can come back with
> some proposal/ideas and cross-check it with you guys.

Do you really need to do transformations at run time?
http://www.springerlink.com/content/ly5t6f8fyyr28m8y/

:)
- Devang