thr3ads.net - llvm dev - [LLVMdev] [RFC] Parallelization metadata and intrinsics in LLVM (for OpenMP, etc.) [Oct 2012]

If this information is useful, please help other people find it:
Share via:

Adve, Vikram Sadanand

2012-Oct-02 14:28 UTC

[LLVMdev] [RFC] Parallelization metadata and intrinsics in LLVM (for OpenMP, etc.)

Hal, Andrey, Alexey,
>From the LLVM design viewpoint, there is a fundamental problem with both
Hal's approach and the Intel approach: both are quite language-specific. 
OpenMP is a particular parallel language, with particular constructs (e.g.,
parallel regions) and semantics.  LLVM is a language-neutral IR and
infrastructure and OpenMP-specific concepts should not creep into it.  I've
included an excerpt from Hal's proposal below, which shows what I mean: the 
design is couched in terms of OpenMP parallel regions.  Other parallel
languages, e.g, Cilk, have no such notion.  The latest Intel proposal is at
least as OpenMP-specific.
I do agree with the general goal of trying to support parallel programming
languages in a more first-class manner in LLVM than we do today.  But the right
approach for that is to be as language-neutral as possible.  For example, any
parallelism primitives in the LLVM IR and any LLVM- or machine-level
optimizations that apply to parallel code should be applicable not only to
OpenMP but also to languages like Cilk, Java/C#, and several others.  I think
"libraries" like Intel's TBB should be supported, too: they have a
(reasonably) well-defined semantics, just like languages do, and are become
widely used.

I also do not think LLVM metadata is the way to represent the primitives,
because they are simply too fragile.   But you don't have to argue that one
with me :-), others have argued this already.  You really need more first class,
language-neutral, LLVM mechanisms for parallelism.  I'm not pretending I
know how to do this, though there are papers on the subject, including one from
an Intel team (Pillar: A Parallel Implementation Language, LCPC 2007).

--Vikram
Professor, Computer Science
University of Illinois at Urbana-Champaign
http://llvm.org/~vadve



> To mark this function as a parallel region, a module-level
'parallel'
> metadata entry is created. The call site(s) of this function are marked
> with this metadata,. The metadata has entries:
>  - The string "region"
>  - A reference to the parallel-region function
>  - If applicable, a list of metadata references specifying
> special-handling child regions (parallel loops and serialized/critical
> regions)
> 
> If the special-handling region metadata is no longer referenced by code
> within the parallel region, then the region has become invalid, and
> will be removed (meaning all parallelization metadata will be removed)
> by the ParallelizationCleanup. The same is true for all other
> cross-referenced metadata below.
> 
> Note that parallel regions can be nested.
> 
> As a quick example, something like:
> int main() {
>   int a;
> #pragma omp parallel firstprivate(a) 
>   do_something(a)
>   ...
> }
> 
> becomes something like:
> 
> define private void @parreg(i32 %a) {
> entry:
>   call void @do_something(i32 %a)
>   ret
> }
> 
> define i32 @main() {
> entry:
> ...
> call void @parreg1(i32 %a) !parallel !0
> ...
> 
> !0 = metadata !{ metadata !"region", @parreg }
> 

--Vikram
Professor, Computer Science
University of Illinois at Urbana-Champaign
http://llvm.org/~vadve

Sanjoy Das

2012-Oct-02 15:24 UTC

head link

[LLVMdev] [RFC] Parallelization metadata and intrinsics in LLVM (for OpenMP, etc.)

Hi,
> constructs (e.g., parallel regions) and semantics.  LLVM is a language-
> neutral IR and infrastructure and OpenMP-specific concepts should not
This is exactly the reason I proposed [1] -- mirroring the openmp
directives in LLVM IR doesn't seem very elegant.  The parallelization
information in the IR should be general and orthogonal.

I do realize that boxing the loops into procedures in the frontend
will initially inhibit loop optimizations, but that can be resolved
with some work.  Ideally, I'd like to have a different class
(ParallelLoop, maybe) altogether representing parallel loops and make
the relevant passes aware of it.  More work, yes, but I think such an
approach will pay off eventually.

[1] http://lists.cs.uiuc.edu/pipermail/llvmdev/2012-September/053798.html
-- 
Sanjoy Das
http://playingwithpointers.com

Andrey Bokhanko

2012-Oct-02 15:38 UTC

head link

[LLVMdev] [RFC] Parallelization metadata and intrinsics in LLVM (for OpenMP, etc.)

Vikram,

I agree with basically everything you said.

Yes, our (Intel) proposal is clearly OpenMP-specific. This is
obviously a deficiency that probably makes it unsuitable for inclusion
as a "first class" part of LLVM IR. What we intended is to have a
simple extension of LLVM IR (based on existing IR constructs) that
establishes common ground for front-end, back-end and runtime library
writers -- and thus, further progress of OpenMP support in LLVM
compiler toolchain.

Yours,
Andrey
---
Software Engineer
Intel Compiler Team
Intel Corp.

On Tue, Oct 2, 2012 at 6:28 PM, Adve, Vikram Sadanand
<vadve at illinois.edu> wrote:> Hal, Andrey, Alexey,
>
> From the LLVM design viewpoint, there is a fundamental problem with both
Hal's approach and the Intel approach: both are quite language-specific. 
OpenMP is a particular parallel language, with particular constructs (e.g.,
parallel regions) and semantics.  LLVM is a language-neutral IR and
infrastructure and OpenMP-specific concepts should not creep into it.  I've
included an excerpt from Hal's proposal below, which shows what I mean: the 
design is couched in terms of OpenMP parallel regions.  Other parallel
languages, e.g, Cilk, have no such notion.  The latest Intel proposal is at
least as OpenMP-specific.
>
> I do agree with the general goal of trying to support parallel programming
languages in a more first-class manner in LLVM than we do today.  But the right
approach for that is to be as language-neutral as possible.  For example, any
parallelism primitives in the LLVM IR and any LLVM- or machine-level
optimizations that apply to parallel code should be applicable not only to
OpenMP but also to languages like Cilk, Java/C#, and several others.  I think
"libraries" like Intel's TBB should be supported, too: they have a
(reasonably) well-defined semantics, just like languages do, and are become
widely used.
>
> I also do not think LLVM metadata is the way to represent the primitives,
because they are simply too fragile.   But you don't have to argue that one
with me :-), others have argued this already.  You really need more first class,
language-neutral, LLVM mechanisms for parallelism.  I'm not pretending I
know how to do this, though there are papers on the subject, including one from
an Intel team (Pillar: A Parallel Implementation Language, LCPC 2007).
>
> --Vikram
> Professor, Computer Science
> University of Illinois at Urbana-Champaign
> http://llvm.org/~vadve
>
>
>
>
>> To mark this function as a parallel region, a module-level
'parallel'
>> metadata entry is created. The call site(s) of this function are marked
>> with this metadata,. The metadata has entries:
>>  - The string "region"
>>  - A reference to the parallel-region function
>>  - If applicable, a list of metadata references specifying
>> special-handling child regions (parallel loops and serialized/critical
>> regions)
>>
>> If the special-handling region metadata is no longer referenced by code
>> within the parallel region, then the region has become invalid, and
>> will be removed (meaning all parallelization metadata will be removed)
>> by the ParallelizationCleanup. The same is true for all other
>> cross-referenced metadata below.
>>
>> Note that parallel regions can be nested.
>>
>> As a quick example, something like:
>> int main() {
>>   int a;
>> #pragma omp parallel firstprivate(a)
>>   do_something(a)
>>   ...
>> }
>>
>> becomes something like:
>>
>> define private void @parreg(i32 %a) {
>> entry:
>>   call void @do_something(i32 %a)
>>   ret
>> }
>>
>> define i32 @main() {
>> entry:
>> ...
>> call void @parreg1(i32 %a) !parallel !0
>> ...
>>
>> !0 = metadata !{ metadata !"region", @parreg }
>>
>
>
> --Vikram
> Professor, Computer Science
> University of Illinois at Urbana-Champaign
> http://llvm.org/~vadve
>
>
>
>
>
>
>
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev

Hal Finkel

2012-Oct-02 18:53 UTC

head link

[LLVMdev] [RFC] Parallelization metadata and intrinsics in LLVM (for OpenMP, etc.)

On Tue, 2 Oct 2012 20:54:03 +0530
Sanjoy Das <sanjoy at playingwithpointers.com> wrote:
> Hi,
> 
> > constructs (e.g., parallel regions) and semantics.  LLVM is a
> > language- neutral IR and infrastructure and OpenMP-specific
> > concepts should not
> 
> This is exactly the reason I proposed [1] -- mirroring the openmp
> directives in LLVM IR doesn't seem very elegant.  The parallelization
> information in the IR should be general and orthogonal.
> 
> I do realize that boxing the loops into procedures in the frontend
> will initially inhibit loop optimizations, but that can be resolved
> with some work.
It is hard to evaluate this without more details. Would you convert
LoopInfo, SE, etc. into module-level passes? Would you inline these
functions early with some special attached semantics? If the second,
how is this different from attaching the special semantics using
intrinsics or metadata?
>  Ideally, I'd like to have a different class
> (ParallelLoop, maybe) altogether representing parallel loops and make
> the relevant passes aware of it.  More work, yes, but I think such an
> approach will pay off eventually.
Can you be more specific? Pay off how?

Thanks again,
Hal
> 
> [1]
> http://lists.cs.uiuc.edu/pipermail/llvmdev/2012-September/053798.html


-- 
Hal Finkel
Postdoctoral Appointee
Leadership Computing Facility
Argonne National Laboratory

Hal Finkel

2012-Oct-02 20:00 UTC

head link

[LLVMdev] [RFC] Parallelization metadata and intrinsics in LLVM (for OpenMP, etc.)

On Tue, 2 Oct 2012 14:28:25 +0000
"Adve, Vikram Sadanand" <vadve at illinois.edu> wrote:
> Hal, Andrey, Alexey,
> 
> From the LLVM design viewpoint, there is a fundamental problem with
> both Hal's approach and the Intel approach: both are quite
> language-specific.  OpenMP is a particular parallel language, with
> particular constructs (e.g., parallel regions) and semantics.  LLVM
> is a language-neutral IR and infrastructure and OpenMP-specific
> concepts should not creep into it. 
This is a matter of perspective. One could also argue that the LLVM IR
should be target neutral. Nevertheless, we have target-specific
intrinsics. Similarly, there is a legitimate case to be made for
producing code that targets existing OpenMP runtime ABIs. The most
natural way to do this is for some of the ABIs semantics, and thus some
of OpenMP's language semantics, to leak into the IR level. Otherwise,
one ends up playing too much of a double-translation game.

Consider, for example, an OpenMP loop with runtime scheduling. This
implies a certain interaction with the OpenMP runtime library (and so is
explicitly OpenMP-specific). Nevertheless, we don't want to lower the
parallelization too early because we'd like to perform look analysis
and transformations (like LICM) first. The only way to do this properly
seems to be to push some of the OpenMP-specific nature of the loop into
the IR. This is not necessarily bad.
> I've included an excerpt from
> Hal's proposal below, which shows what I mean: the  design is couched
> in terms of OpenMP parallel regions.  Other parallel languages, e.g,
> Cilk, have no such notion.
The approach that I proposed was certainly inspired by OpenMP, and
designed to fully support OpenMP, but was not limited to it. As a
practical matter, OpenMP includes both loop-based parallelism and
task-based parallelism, which is a pretty broad foundation for
supporting parallelism in general.

I looked at the Cilk documentation when writing my proposal. Is there a
reason why Cilk's semantics cannot be mapped onto the proposed support
for parallel tasks?
>  The latest Intel proposal is at least as
> OpenMP-specific.
> 
> I do agree with the general goal of trying to support parallel
> programming languages in a more first-class manner in LLVM than we do
> today.  But the right approach for that is to be as language-neutral
> as possible.  For example, any parallelism primitives in the LLVM IR
> and any LLVM- or machine-level optimizations that apply to parallel
> code should be applicable not only to OpenMP but also to languages
> like Cilk, Java/C#, and several others.  I think "libraries" like
> Intel's TBB should be supported, too: they have a (reasonably)
> well-defined semantics, just like languages do, and are become widely
> used.  
> 
> I also do not think LLVM metadata is the way to represent the
> primitives, because they are simply too fragile.   But you don't have
> to argue that one with me :-), others have argued this already.
I've never argued that the mechanism is not fragile. However, I do
think that, with a proper design, it is possible to use the existing
metadata infrastructure (with some minimal changes, for example, to
inhibit inlining). I am not committed to a metadata-based approach, but
I think such an approach is workable.
>  You
> really need more first class, language-neutral, LLVM mechanisms for
> parallelism.  I'm not pretending I know how to do this, though there
> are papers on the subject, including one from an Intel team (Pillar:
> A Parallel Implementation Language, LCPC 2007).
I'll look at the paper, thanks for the reference! The problem is not
just in supporting parallelism in general, the problem is specifically
in supporting OpenMP, with its mix of language semantics, runtime
semantics, and the interaction of the two, while not inhibiting
optimization.

Thanks again,
Hal
> 
> --Vikram
> Professor, Computer Science
> University of Illinois at Urbana-Champaign
> http://llvm.org/~vadve
> 
> 
> 
> 
> > To mark this function as a parallel region, a module-level
> > 'parallel' metadata entry is created. The call site(s) of this
> > function are marked with this metadata,. The metadata has entries:
> >  - The string "region"
> >  - A reference to the parallel-region function
> >  - If applicable, a list of metadata references specifying
> > special-handling child regions (parallel loops and
> > serialized/critical regions)
> > 
> > If the special-handling region metadata is no longer referenced by
> > code within the parallel region, then the region has become
> > invalid, and will be removed (meaning all parallelization metadata
> > will be removed) by the ParallelizationCleanup. The same is true
> > for all other cross-referenced metadata below.
> > 
> > Note that parallel regions can be nested.
> > 
> > As a quick example, something like:
> > int main() {
> >   int a;
> > #pragma omp parallel firstprivate(a) 
> >   do_something(a)
> >   ...
> > }
> > 
> > becomes something like:
> > 
> > define private void @parreg(i32 %a) {
> > entry:
> >   call void @do_something(i32 %a)
> >   ret
> > }
> > 
> > define i32 @main() {
> > entry:
> > ...
> > call void @parreg1(i32 %a) !parallel !0
> > ...
> > 
> > !0 = metadata !{ metadata !"region", @parreg }
> > 
> 
> 
> --Vikram
> Professor, Computer Science
> University of Illinois at Urbana-Champaign
> http://llvm.org/~vadve
> 
> 
> 
> 
> 
> 


-- 
Hal Finkel
Postdoctoral Appointee
Leadership Computing Facility
Argonne National Laboratory

Mahesha HS

2012-Oct-02 21:39 UTC

head link

[LLVMdev] [RFC] Parallelization metadata and intrinsics in LLVM (for OpenMP, etc.)

I am not an optimizer guy, but, I am just thinking, if we can solve the
problems that we are discussing in this mail chain, by introducing a
middle-end in between front-end and LLVM. We may need to introduce GGC
GIMPLE kind of IR (or any new suitable IR) in the middle-end so that
front-end can produce this new IR, middle-end can consume it, and do all
the parallelization and subsequent optimizations, and generate LLVM IR to
take it forward by LLVM. Again going through middle-end can be made
optional depending on the requirements. This means, front-end must have to
have a capability to produce either new IR to middle-end or LLVM IR
directly to LLVM.

--
mahesha


On Wed, Oct 3, 2012 at 1:30 AM, Hal Finkel <hfinkel at anl.gov> wrote:
> On Tue, 2 Oct 2012 14:28:25 +0000
> "Adve, Vikram Sadanand" <vadve at illinois.edu> wrote:
>
> > Hal, Andrey, Alexey,
> >
> > From the LLVM design viewpoint, there is a fundamental problem with
> > both Hal's approach and the Intel approach: both are quite
> > language-specific.  OpenMP is a particular parallel language, with
> > particular constructs (e.g., parallel regions) and semantics.  LLVM
> > is a language-neutral IR and infrastructure and OpenMP-specific
> > concepts should not creep into it.
>
> This is a matter of perspective. One could also argue that the LLVM IR
> should be target neutral. Nevertheless, we have target-specific
> intrinsics. Similarly, there is a legitimate case to be made for
> producing code that targets existing OpenMP runtime ABIs. The most
> natural way to do this is for some of the ABIs semantics, and thus some
> of OpenMP's language semantics, to leak into the IR level. Otherwise,
> one ends up playing too much of a double-translation game.
>
> Consider, for example, an OpenMP loop with runtime scheduling. This
> implies a certain interaction with the OpenMP runtime library (and so is
> explicitly OpenMP-specific). Nevertheless, we don't want to lower the
> parallelization too early because we'd like to perform look analysis
> and transformations (like LICM) first. The only way to do this properly
> seems to be to push some of the OpenMP-specific nature of the loop into
> the IR. This is not necessarily bad.
>
> > I've included an excerpt from
> > Hal's proposal below, which shows what I mean: the  design is
couched
> > in terms of OpenMP parallel regions.  Other parallel languages, e.g,
> > Cilk, have no such notion.
>
> The approach that I proposed was certainly inspired by OpenMP, and
> designed to fully support OpenMP, but was not limited to it. As a
> practical matter, OpenMP includes both loop-based parallelism and
> task-based parallelism, which is a pretty broad foundation for
> supporting parallelism in general.
>
> I looked at the Cilk documentation when writing my proposal. Is there a
> reason why Cilk's semantics cannot be mapped onto the proposed support
> for parallel tasks?
>
> >  The latest Intel proposal is at least as
> > OpenMP-specific.
> >
> > I do agree with the general goal of trying to support parallel
> > programming languages in a more first-class manner in LLVM than we do
> > today.  But the right approach for that is to be as language-neutral
> > as possible.  For example, any parallelism primitives in the LLVM IR
> > and any LLVM- or machine-level optimizations that apply to parallel
> > code should be applicable not only to OpenMP but also to languages
> > like Cilk, Java/C#, and several others.  I think "libraries"
like
> > Intel's TBB should be supported, too: they have a (reasonably)
> > well-defined semantics, just like languages do, and are become widely
> > used.
> >
> > I also do not think LLVM metadata is the way to represent the
> > primitives, because they are simply too fragile.   But you don't
have
> > to argue that one with me :-), others have argued this already.
>
> I've never argued that the mechanism is not fragile. However, I do
> think that, with a proper design, it is possible to use the existing
> metadata infrastructure (with some minimal changes, for example, to
> inhibit inlining). I am not committed to a metadata-based approach, but
> I think such an approach is workable.
>
> >  You
> > really need more first class, language-neutral, LLVM mechanisms for
> > parallelism.  I'm not pretending I know how to do this, though
there
> > are papers on the subject, including one from an Intel team (Pillar:
> > A Parallel Implementation Language, LCPC 2007).
>
> I'll look at the paper, thanks for the reference! The problem is not
> just in supporting parallelism in general, the problem is specifically
> in supporting OpenMP, with its mix of language semantics, runtime
> semantics, and the interaction of the two, while not inhibiting
> optimization.
>
> Thanks again,
> Hal
>
> >
> > --Vikram
> > Professor, Computer Science
> > University of Illinois at Urbana-Champaign
> > http://llvm.org/~vadve
> >
> >
> >
> >
> > > To mark this function as a parallel region, a module-level
> > > 'parallel' metadata entry is created. The call site(s) of
this
> > > function are marked with this metadata,. The metadata has
entries:
> > >  - The string "region"
> > >  - A reference to the parallel-region function
> > >  - If applicable, a list of metadata references specifying
> > > special-handling child regions (parallel loops and
> > > serialized/critical regions)
> > >
> > > If the special-handling region metadata is no longer referenced
by
> > > code within the parallel region, then the region has become
> > > invalid, and will be removed (meaning all parallelization
metadata
> > > will be removed) by the ParallelizationCleanup. The same is true
> > > for all other cross-referenced metadata below.
> > >
> > > Note that parallel regions can be nested.
> > >
> > > As a quick example, something like:
> > > int main() {
> > >   int a;
> > > #pragma omp parallel firstprivate(a)
> > >   do_something(a)
> > >   ...
> > > }
> > >
> > > becomes something like:
> > >
> > > define private void @parreg(i32 %a) {
> > > entry:
> > >   call void @do_something(i32 %a)
> > >   ret
> > > }
> > >
> > > define i32 @main() {
> > > entry:
> > > ...
> > > call void @parreg1(i32 %a) !parallel !0
> > > ...
> > >
> > > !0 = metadata !{ metadata !"region", @parreg }
> > >
> >
> >
> > --Vikram
> > Professor, Computer Science
> > University of Illinois at Urbana-Champaign
> > http://llvm.org/~vadve
> >
> >
> >
> >
> >
> >
>
>
>
> --
> Hal Finkel
> Postdoctoral Appointee
> Leadership Computing Facility
> Argonne National Laboratory
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>


-- 
mahesha
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20121003/3fa0625c/attachment.html>

Hal Finkel

2012-Oct-03 04:41 UTC

head link

[LLVMdev] [RFC] Parallelization metadata and intrinsics in LLVM (for OpenMP, etc.)

On Tue, 2 Oct 2012 14:28:25 +0000
"Adve, Vikram Sadanand" <vadve at illinois.edu> wrote:
> Hal, Andrey, Alexey,
> 
> From the LLVM design viewpoint, there is a fundamental problem with
> both Hal's approach and the Intel approach: both are quite
> language-specific.  OpenMP is a particular parallel language, with
> particular constructs (e.g., parallel regions) and semantics.  LLVM
> is a language-neutral IR and infrastructure and OpenMP-specific
> concepts should not creep into it.  I've included an excerpt from
> Hal's proposal below, which shows what I mean: the  design is couched
> in terms of OpenMP parallel regions.  Other parallel languages, e.g,
> Cilk, have no such notion.  The latest Intel proposal is at least as
> OpenMP-specific.
> 
> I do agree with the general goal of trying to support parallel
> programming languages in a more first-class manner in LLVM than we do
> today.  But the right approach for that is to be as language-neutral
> as possible.  For example, any parallelism primitives in the LLVM IR
> and any LLVM- or machine-level optimizations that apply to parallel
> code should be applicable not only to OpenMP but also to languages
> like Cilk, Java/C#, and several others.  I think "libraries" like
> Intel's TBB should be supported, too: they have a (reasonably)
> well-defined semantics, just like languages do, and are become widely
> used.  
> 
> I also do not think LLVM metadata is the way to represent the
> primitives, because they are simply too fragile.   But you don't have
> to argue that one with me :-), others have argued this already.  You
> really need more first class, language-neutral, LLVM mechanisms for
> parallelism.  I'm not pretending I know how to do this, though there
> are papers on the subject, including one from an Intel team (Pillar:
> A Parallel Implementation Language, LCPC 2007).
I took a quick look at this paper; essentially, their language
introduces continuations and several different types of 'parallel call'
functions. Do we want continuations? If we want to go down this road,
then I think that something like Sanoy's parallel_map is a good idea. My
worry with these restricted approaches is that correctly implementing
OpenMP's semantics may prove inefficient (if not impossible). The
specification dictates specific interactions between the runtime
library, certain environmental variables, and the pragmas. I think that
showing that this will work will require a specific proposal and an
explicit construction of the mapping. Moreover, I fear that restricting
parallelism to some special types of call instructions will inhibit
useful loop optimizations on parallelized loops. Since parallel loop
performance is one of the most important measures for the quality of a
parallelizing compiler, we need to consider the impact on loop
optimizations carefully.

 -Hal
> 
> --Vikram
> Professor, Computer Science
> University of Illinois at Urbana-Champaign
> http://llvm.org/~vadve
> 
> 
> 
> 
> > To mark this function as a parallel region, a module-level
> > 'parallel' metadata entry is created. The call site(s) of this
> > function are marked with this metadata,. The metadata has entries:
> >  - The string "region"
> >  - A reference to the parallel-region function
> >  - If applicable, a list of metadata references specifying
> > special-handling child regions (parallel loops and
> > serialized/critical regions)
> > 
> > If the special-handling region metadata is no longer referenced by
> > code within the parallel region, then the region has become
> > invalid, and will be removed (meaning all parallelization metadata
> > will be removed) by the ParallelizationCleanup. The same is true
> > for all other cross-referenced metadata below.
> > 
> > Note that parallel regions can be nested.
> > 
> > As a quick example, something like:
> > int main() {
> >   int a;
> > #pragma omp parallel firstprivate(a) 
> >   do_something(a)
> >   ...
> > }
> > 
> > becomes something like:
> > 
> > define private void @parreg(i32 %a) {
> > entry:
> >   call void @do_something(i32 %a)
> >   ret
> > }
> > 
> > define i32 @main() {
> > entry:
> > ...
> > call void @parreg1(i32 %a) !parallel !0
> > ...
> > 
> > !0 = metadata !{ metadata !"region", @parreg }
> > 
> 
> 
> --Vikram
> Professor, Computer Science
> University of Illinois at Urbana-Champaign
> http://llvm.org/~vadve
> 
> 
> 
> 
> 
> 


-- 
Hal Finkel
Postdoctoral Appointee
Leadership Computing Facility
Argonne National Laboratory

Hal Finkel

2012-Oct-03 17:49 UTC

head link

[LLVMdev] [RFC] Parallelization metadata and intrinsics in LLVM (for OpenMP, etc.)

On Tue, 2 Oct 2012 14:28:25 +0000
"Adve, Vikram Sadanand" <vadve at illinois.edu> wrote:
> Hal, Andrey, Alexey,
> 
> From the LLVM design viewpoint, there is a fundamental problem with
> both Hal's approach and the Intel approach: both are quite
> language-specific.  OpenMP is a particular parallel language, with
> particular constructs (e.g., parallel regions) and semantics.
Also, after thinking about it, I object to this characterization.
OpenMP is a parallelization model and runtime interface (with some
specific language bindings pre-specified). There is very little in the
standard that is language specific. It would not be difficult to create
OpenMP for Python, or Go, etc. OpenMP covers both task-based and
region/loop-based parallelism. As far as choosing a conceptual
substrate on which to base LLVM's parallelism, using OpenMP would not
be a bad idea. If LLVM has parallelization support, such support will
also be a, "a particular parallel language, with particular constructs
(e.g., parallel regions) and semantics."

I think that there is a dangerous inclination to oversimply the
parallelization interface. While it is true that parallel algorithms
can often be modeled using a few distinct operators, implementing a
programming language this way is not a good idea. You would not, for
example, argue that C should eliminate all integer types except for the
largest. It is true that integer operations on shorter types can be
implemented in terms of the larger types combined with appropriate
masking operations. However, it would neither be convenient,
aesthetically pleasing, nor easy to optimize in practice.

 -Hal
>  LLVM
> is a language-neutral IR and infrastructure and OpenMP-specific
> concepts should not creep into it.  I've included an excerpt from
> Hal's proposal below, which shows what I mean: the  design is couched
> in terms of OpenMP parallel regions.  Other parallel languages, e.g,
> Cilk, have no such notion.  The latest Intel proposal is at least as
> OpenMP-specific.
> 
> I do agree with the general goal of trying to support parallel
> programming languages in a more first-class manner in LLVM than we do
> today.  But the right approach for that is to be as language-neutral
> as possible.  For example, any parallelism primitives in the LLVM IR
> and any LLVM- or machine-level optimizations that apply to parallel
> code should be applicable not only to OpenMP but also to languages
> like Cilk, Java/C#, and several others.  I think "libraries" like
> Intel's TBB should be supported, too: they have a (reasonably)
> well-defined semantics, just like languages do, and are become widely
> used.  
> 
> I also do not think LLVM metadata is the way to represent the
> primitives, because they are simply too fragile.   But you don't have
> to argue that one with me :-), others have argued this already.  You
> really need more first class, language-neutral, LLVM mechanisms for
> parallelism.  I'm not pretending I know how to do this, though there
> are papers on the subject, including one from an Intel team (Pillar:
> A Parallel Implementation Language, LCPC 2007).
> 
> --Vikram
> Professor, Computer Science
> University of Illinois at Urbana-Champaign
> http://llvm.org/~vadve
> 
> 
> 
> 
> > To mark this function as a parallel region, a module-level
> > 'parallel' metadata entry is created. The call site(s) of this
> > function are marked with this metadata,. The metadata has entries:
> >  - The string "region"
> >  - A reference to the parallel-region function
> >  - If applicable, a list of metadata references specifying
> > special-handling child regions (parallel loops and
> > serialized/critical regions)
> > 
> > If the special-handling region metadata is no longer referenced by
> > code within the parallel region, then the region has become
> > invalid, and will be removed (meaning all parallelization metadata
> > will be removed) by the ParallelizationCleanup. The same is true
> > for all other cross-referenced metadata below.
> > 
> > Note that parallel regions can be nested.
> > 
> > As a quick example, something like:
> > int main() {
> >   int a;
> > #pragma omp parallel firstprivate(a) 
> >   do_something(a)
> >   ...
> > }
> > 
> > becomes something like:
> > 
> > define private void @parreg(i32 %a) {
> > entry:
> >   call void @do_something(i32 %a)
> >   ret
> > }
> > 
> > define i32 @main() {
> > entry:
> > ...
> > call void @parreg1(i32 %a) !parallel !0
> > ...
> > 
> > !0 = metadata !{ metadata !"region", @parreg }
> > 
> 
> 
> --Vikram
> Professor, Computer Science
> University of Illinois at Urbana-Champaign
> http://llvm.org/~vadve
> 
> 
> 
> 
> 
> 


-- 
Hal Finkel
Postdoctoral Appointee
Leadership Computing Facility
Argonne National Laboratory

Reasonably Related Threads

Search for more possibly parallel threads

llvm dev - Oct 2012 - [LLVMdev] [RFC] Parallelization metadata and intrinsics in LLVM (for OpenMP, etc.)

[LLVMdev] [RFC] Parallelization metadata and intrinsics in LLVM (for OpenMP, etc.)

[LLVMdev] [RFC] Parallelization metadata and intrinsics in LLVM (for OpenMP, etc.)

[LLVMdev] [RFC] Parallelization metadata and intrinsics in LLVM (for OpenMP, etc.)

[LLVMdev] [RFC] Parallelization metadata and intrinsics in LLVM (for OpenMP, etc.)

[LLVMdev] [RFC] Parallelization metadata and intrinsics in LLVM (for OpenMP, etc.)

[LLVMdev] [RFC] Parallelization metadata and intrinsics in LLVM (for OpenMP, etc.)

[LLVMdev] [RFC] Parallelization metadata and intrinsics in LLVM (for OpenMP, etc.)

[LLVMdev] [RFC] Parallelization metadata and intrinsics in LLVM (for OpenMP, etc.)

Reasonably Related Threads