thr3ads.net - llvm dev - [LLVMdev] [RFC] OpenMP Representation in LLVM IR [Oct 2012]

If this information is useful, please help other people find it:
Share via:

Andrey Bokhanko

2012-Sep-28 18:42 UTC

[LLVMdev] [RFC] OpenMP Representation in LLVM IR

Hi All,

We'd like to make a proposal for OpenMP representation in LLVM IR.

Our goal is to reach an agreement in the community on a simple,
complete and extensible representation of OpenMP language constructs
in LLVM IR. Hopefully, this would serve as a common ground and would
enable further development of OpenMP support both in Clang and LLVM
compiler toolchain.

We seek feedback on the proposed design and ways to improve it.

Main authors of the proposal are Andrey Bokhanko and Alexey Bataev.
Also, we'd like to acknowledge valuable contributions and advice
provided by Kevin B Smith, Xinmin Tian, Stefanus Du Toit, Brian
Minard, Dmitry Babokin and other colleagues from Intel and Diego
Novillo from Google. NB, this *does not* automatically imply support
of the proposal by said individuals.

Please find the proposal in *.pdf (attached to the message, for
reading convenience) and plain text (below the message, for quoting
convenience) formats. Their content is identical.

Full disclosure: both of us are associated with Intel and Intel Compiler team.

Yours,
Andrey Bokhanko
Alexey Bataev
============Software Engineers
Intel Compiler Team
Intel Corp.


OpenMP Representation in LLVM IR
===============================Design Proposal

Alexey Bataev, Andrey Bokhanko


Introduction
===========This document describes design proposal for OpenMP representation in
LLVM IR.

Authors assume that readers have basic understanding of:
1) OpenMP principles and language constructs
2) General design of LLVM compiler system


Goals
====Our end goal is to provide a simple, complete and extensible support
for OpenMP representation in LLVM IR.

We aim to extend LLVM IR as little as possible, preferably without
adding any new types / language constructs at all.

Also, we’d like to keep opportunities for optimization of parallel
code as widely opened as possible –obviously, within the boundaries of
preserving correct semantics of a user program.

And finally, in the spirit of OpenMP Specification ([OpenMP11]), LLVM
IR based compilers should be able to easily skip IR extensions related
to OpenMP support and still generate correct, albeit sequential, code.


Function Outlining
=================Eventually, parallel regions should be put into separate
routines, a
process usually called “function outlining” or “procedurization”. This
can happen as early as in front-end, and as late as right before code
generation.

As can be seen in the following sections, the IR extension we propose
doesn’t involve explicit procedurization. Thus, we assume that
function outlining should happen somewhere in the LLVM back-end, and
usually this should be aligned with how chosen OpenMP runtime library
works and what it expects. This is a deliberate decision on our part.
We believe it provides the following benefits (when compared with
designs involving procedurization done in a front-end):

1) Function outlining doesn’t depend on source language; thus, it can
be implemented once and used with any front-ends.

2) Optimizations are usually restricted by a single function boundary.
If procedurization is done in a front-end, this effectively kills any
optimizations – as simple ones as loop invariant code motion. Refer to
[Tian2005] for more information on why this is important for efficient
optimization of OpenMP programs.

It should be stressed, though, that in order to preserve correct
semantics of a user program, optimizations should be made thread-aware
(which, to the best of our knowledge, is not the case with LLVM
optimizations).

By “thread-aware” we mean an optimization that performs legal
transformations on parallel programs. Refer to [Novillo00] for more
information on correctness of optimizations for parallel programs.

Given that LLVM optimizations are currently not thread-aware, initial
implementation should call procedurization pass right at the start of
the optimizer. In the future, this pass can be moved further and
further down the optimizer, as more optimizations will become
thread-aware.

Essentially, this initial implementation is not that different from
implementations with procedurization done in front-end; however, it
keeps window of possibility opened.


Scope
====Scope of this design is OpenMP representation in LLVM IR. Thus, all
other issues related to OpenMP support, like runtime library, ABI, etc
are not covered.

We also included a set of requirements for front-ends and back-ends,
which establish mutual expectations and is an important addition to
the design.

Our proposal is based on the latest published OpenMP specification,
which is version 3.1 ([OpenMP31]) at the time of writing. However, the
design approach we employed is general enough to allow easy adaptation
for future versions of OpenMP standard.


Front-End/Back-End Contract
==========================While not a part of OpenMP representation design per
se, the following
pre- and post-conditions help to establish a set of mutual expectation
between front-ends (that generate LLVM IR with OpenMP support) and
back-ends (that consume it). Without this kind of a “contract”
([Meyer92]), a back-end has to verify too much and basically repeat
the work already done by a front-end.

While it is possible to generate LLVM IR that violates these
conditions, we consider it to be non-conformant. Obviously,
conformance can be verified, if necessary.

Pre- and Post-conditions for Front-Ends
---------------------------------------
1) Guarantee correct semantics of directives and clauses. This
includes nomenclature and number of clauses in directives, correct
nesting of directives, values of clauses, etc.

For example, front-ends should guarantee that omp single directive has
at most one private, firstprivate, copyprivate and nowait clause, and
nothing else; omp section directive is nested inside omp sections
directive; the only possible value of private clause is a list of
variables available at the point where this clause is present; etc.

2) Guarantee correct semantics of statements / structured blocks
following directives.
As an example, it should be guaranteed that only for-loops follow omp
parallel for directive.

3) Set _OPENMP macro. This is required in section 2.2 of [OpenMP11].

4) Support a command-line option that enables/disables OpenMP support.
This is required in Chapter 2 of [OpenMP11].

If a user program violates OpenMP semantics and thus, makes compliance
with first two conditions impossible, a front-end should report a
meaningful error message and stop compilation.

It should be noted that front-ends are not required to guarantee full
conformance of generated LLVM IR to OpenMP specification. As an
example, the specification deems programs that branch into or out of
parallel regions to be non-conformant. We believe that verification of
such properties is too complex for most front-ends. This matches well
with what is written in the specification itself: “compliant
implementations are not required to check for code sequences that
cause a program to be classified as non-conforming” ([OpenMP11],
Section 1.1).

Pre- and Post-conditions for Back-Ends
--------------------------------------
1) Intrinsics, builtins, library routines and language implementation
should be thread-safe. This is equally applicable to front-ends and is
required in Section 1.5 of [OpenMP11].

2) Optimizations should be thread-aware.

3) Support a command-line option that enables/disables OpenMP support.
This is required in Chapter 2 of [OpenMP11].

4) Rely on post-conditions guaranteed by front-end, and nothing else.

Conditions 2)-5) are only applicable to optimizations working on IR
before procedurization.

As noted in the previous section, LLVM IR is not guaranteed to be
fully conformant with OpenMP specification. Back-ends should be able
to compile cleanly non-compliant programs, but no promises are made on
correct behavior of resulting machine code.


Elements of OpenMP
=================OpenMP is comprised of four components (as of version 3.1):
* Directives (+ Clauses)
* Internal Control Variables
* Runtime Library Routines
* Environment Variables

Internal Control Variables, Runtime Library Routines and Environment
Variables are provided / handled by OpenMP runtime library. These
components of OpenMP don’t require special support in LLVM IR and thus
are out of scope of this document.

All OpenMP directives in C/C++ are specified with #pragma
preprocessing directive and have the following format:

#pragma omp directive-name [clause [[,] clause]…].

Next two chapters describe our design proposal for representation of,
correspondingly, directives and clauses.


Directives
=========Directives in LLVM IR are represented as calls to “llvm.omp.directive”
intrinsic with a single argument referencing LLVM IR metadata. The
metadata contains an identifier of a directive.

Almost all OpenMP directives are represented with two intrinsic calls:
one for entry and one for exit point of the directive’s context. It is
enough to have just one intrinsic call for several directives which
are supposed to be enclosed in other directives (like omp section,
which must appear only within omp sections context) or specify a
single instruction (like omp flush) or are declarative (like omp
threadprivate). Exit point for such directives is either non-existent
or can be determined by examining other intrinsic calls and thus, not
required to be explicitly present.

Metadata specify only one thing: type of a directive. Currently LLVM
IR does not support integer or enumeration metadata types; thus, we
decided to use MDString (metadata string type) to represent the type.
The list of directives and their identifiers is shown in Table 1.

Table 1. OpenMP Directives

Type of directive #pragma omp ...      | MDString in metadata
------------------------------------------------------------
parallel                               | OMP_PARALLEL
                                       | OMP_END_PARALLEL
------------------------------------------------------------
[parallel] for                         | OMP_[PARALLEL_]LOOP
                                       | OMP_END_[PARALLEL_]LOOP
------------------------------------------------------------
[parallel] sections                    | OMP_[PARALLEL_]SECTIONS
                                       | OMP_END_[PARALLEL_]SECTIONS
------------------------------------------------------------
section                                | OMP_SECTION
------------------------------------------------------------
single                                 | OMP_SINGLE
                                       | OMP_END_SINGLE
------------------------------------------------------------
task                                   | OMP_PTASK
                                       | OMP_END_PTASK
------------------------------------------------------------
taskyield                              | OMP_PTASKYIELD
------------------------------------------------------------
master                                 | OMP_MASTER
                                       | OMP_END_MASTER
------------------------------------------------------------
critical                               | OMP_CRITICAL
                                       | OMP_END_CRITICAL
------------------------------------------------------------
barrier                                | OMP_BARRIER
------------------------------------------------------------
taskwait                               | OMP_PTASKWAIT
------------------------------------------------------------
atomic                                 | OMP_ATOMIC
                                       | OMP_END_ATOMIC
------------------------------------------------------------
flush                                  | OMP_FLUSH
------------------------------------------------------------
ordered                                | OMP_ORDERED
                                       | OMP_END_ORDERED

An example of an OpenMP directive and its representation in LLVM IR:

C/C++
-----
#pragma omp parallel

LLVM IR
-------
call void @llvm.omp.directive(metadata !0)
...
call void @llvm.omp.directive(metadata !1)

!0 = metadata !{metadata !”OMP_PARALLEL”}
!1 = metadata !{metadata !”OMP_END_PARALLEL”}


Clauses
======All OpenMP clauses can be divided into four groups:
* Ones with predefined values (default, ordered, nowait, untied, read,
write, update, capture)
* Ones with a list of variables (shared, private, firstprivate,
lastprivate, copyin, copyprivate, directives flush, threadprivate)
* Ones with a scalar or integer expression (if, num_threads, final, collapse)
* Compound ones (reduction, schedule, directive critical)

Clauses in LLVM IR are represented as intrinsic calls. Each intrinsic
call representing a clause has one mandatory argument and arbitrary
number of optional arguments. The mandatory argument references LLVM
IR metadata. The metadata contains identifier of the clause (MDString)
and, for some compound clauses, additional data represented as another
MDString. Optional arguments reference LLVM variables or expressions.


Clauses with Predefined Values
------------------------------
Clauses with predefined values are represented as calls to
“llvm.omp.simple” intrinsic. For this group of clauses metadata
contains an MDString with a clause’s identifier.

The list of clauses and their identifiers is shown in Table 2.

Table 2. OpenMP Clauses with Predefined Values

Clause                                 | MDString in Metadata
------------------------------------------------------------
default(none)                          | OMP_DEFAULT_NONE
------------------------------------------------------------
default(shared)                        | OMP_DEFAULT_SHARED
------------------------------------------------------------
ordered                                | OMP_ORDERED
------------------------------------------------------------
nowait                                 | OMP_NOWAIT
------------------------------------------------------------
untied                                 | OMP_UNTIED
------------------------------------------------------------
read                                   | OMP_READ
------------------------------------------------------------
write                                  | OMP_WRITE
------------------------------------------------------------
update                                 | OMP_UPDATE
------------------------------------------------------------
capture                                | OMP_CAPTURE

Here is an example of OpenMP clause with a predefined value and its
representation in LLVM IR:

C/C++
-----
#pragma omp atomic read

LLVM IR
-------
call void @llvm.omp.simple(metadata !1)

!1 = metadata !{metadata !”OMP_READ”}


Clauses with a List of Variables
--------------------------------
Clauses with a list of variables are represented as calls to
“llvm.omp.list” intrinsic. For this group of clauses metadata contains
an MDString with a clause’s identifier.

Additional arguments of the intrinsic reference LLVM variables
associated with a clause. It is important to reference variables
directly in intrinsic calls and not in metadata, in order to preserve
data dependency.

Each variable of a non user-defined type is represented with a single
argument (referencing the variable).

Each variable of a user-defined type is represented with four
arguments: one referencing the variable itself, one referencing its
default constructor, one referencing its copy constructor and one
referencing its destructor. References to constructors / destructors
are required to correctly create and destroy private copies of
variables.

The list of clauses and their identifiers is shown in Table 3.

Table 3. OpenMP Clauses with a List of Variables

Clause                                 | MDString in Metadata
------------------------------------------------------------
private(list)                          | OMP_PRIVATE
------------------------------------------------------------
firstprivate(list)                     | OMP_FIRSTPRIVATE
------------------------------------------------------------
lastprivate(list)                      | OMP_LASTPRIVATE
------------------------------------------------------------
shared(list)                           | OMP_SHARED
------------------------------------------------------------
copyin(list)                           | OMP_COPYIN
------------------------------------------------------------
Directive flush(list)                  | OMP_FLUSH
------------------------------------------------------------
Directive threadprivate(list)          | OMP_THREADPRIVATE

Here is an example of OpenMP clause with a list of variables and its
representation in LLVM IR:

C/C++
-----
#pragma omp parallel private(a,b)

LLVM IR
-------
call void (metadata, ...)* @llvm.omp.list(metadata !1, i32* @a, i32* @b)

!1 = metadata !{metadata !”OMP_PRIVATE”}


Clauses with Scalar or Integer Expressions
------------------------------------------
Clauses with scalar or integer expressions are represented as calls to
“llvm.omp.expr” intrinsic. For this group of clauses metadata contains
an MDString with a clause’s identifier.

Second argument of the intrinsic references a scalar or integer LLVM
expression associated with a clause. It is important to reference
expressions directly in intrinsic calls and not in metadata, in order
to preserve data dependency.

The list of clauses and their identifiers is shown in Table 4.

Table 4. OpenMP Clauses with Scalar or Integer Expressions

Clause                                 | MDString in Metadata
------------------------------------------------------------
if(scalar_expr)                        | OMP_IF
------------------------------------------------------------
num_threads(integer_expr)              | OMP_NUM_THREADS
------------------------------------------------------------
final(scalar_expr)                     | OMP_FINAL
------------------------------------------------------------
collapse(const_integer_expr)           | OMP_COLLAPSE

Here is an example of OpenMP clause with a scalar expression and its
representation in LLVM IR:

C/C++
-----
#pragma omp parallel if(a)

LLVM IR
-------
%4 = load i32* @a, align 4
%5 = icmp ne i32 %4, 0
call void @llvm.omp.expr(metadata !1, i32 %5)
...
!1 = metadata !{metadata !”OMP_IF”}


Compound Clauses
----------------
Compound clauses are represented as calls to “llvm.omp.compound”
intrinsic. For this group of clauses metadata contains an MDString
with a clause’s identifier and additional data, also represented as an
MDString.

Additional arguments of an intrinsic call for reduction clause
reference LLVM variables associated with a clause (see format of
“Clauses with a List of Variables”).

Second argument of an intrinsic call for schedule (static), schedule
(dynamic) and schedule (guided) clauses references an integer LLVM
expression associated with a clause (see format of “Clauses with
Scalar or Integer Expressions”).

The list of compound clauses along with their representation in
metadata is shown in Table 5.

Table 5. OpenMP Compound Clauses

Clause                                 | Representation in Metadata
------------------------------------------------------------
reduction(operator : list)             | OMP_REDUCTION, <operator>
------------------------------------------------------------
schedule(static [, integer_expr])      | OMP_SCHEDULE, STATIC
------------------------------------------------------------
schedule(dynamic [, integer_expr])     | OMP_SCHEDULE, DYNAMIC
------------------------------------------------------------
schedule(guided [, integer_expr])      | OMP_SCHEDULE, GUIDED
------------------------------------------------------------
schedule(auto)                         | OMP_SCHEDULE, AUTO
------------------------------------------------------------
schedule(runtime)                      | OMP_SCHEDULE, RUNTIME
------------------------------------------------------------
Directive critical(name)               | OMP_NAME, <name>

Here is an example of a compound OpenMP clause and its representation
in LLVM IR:

C/C++
-----
#pragma omp parallel reduction(+ : a, b)

LLVM IR
-------
call void (metadata, ...)* @llvm.omp.compound(metadata !1, i32* @a, i32* @b)

!1 = metadata !{metadata !”OMP_REDUCTION”, metadata !”+”}


An Example
----------
Here is a more complex example, demonstrating LLVM IR representation
of a directive with several clauses of different types.

C/C++
-----
int a, gVar;
int main() {
  int lVar;
#pragma omp parallel default(shared), private(gVar, a), if(lVar)
  {
    ...
  }
  return (0);
}

LLVM IR
-------
@a = global i32 0, align 4
@gVar = global i32 0, align 4
define i32 @main () nounwind uwtable ssp {
  %lVar = alloca i32, align 4
  call void @llvm.omp.directive(metadata !0)
  call void @llvm.omp.simple(metadata !1)
  call void (metadata, ...)* @llvm.omp.list(metadata !2, i32* @gVar, i32* @a)
  call void @llvm.omp.expr(metadata !3, i32* %lVar)
  ...
  call void @llvm.omp.directive(metadata !4)
  ret i32 0
}
!0 = metadata !{metadata !”OMP_PARALLEL”}
!1 = metadata !{metadata !”OMP_DEFAULT_SHARED”}
!2 = metadata !{metadata !”OMP_PRIVATE”}
!3 = metadata !{metadata !”OMP_IF”}
!4 = metadata !{metadata !”OMP_END_PARALLEL”}


Design Alternatives
==================It is possible to propose several viable alternatives to
representing
information on OpenMP directives and clauses while keeping general
design approach intact.

Some things that we considered:

a) Represent information on types of directives and clauses as
constant expressions instead of MDStrings.
b) Place MDStrings with identifiers of directives and clauses directly
into intrinsic calls instead of metadata.
c) Do not represent clauses as separate intrinsics with references to
metadata; instead, put a list of references to metadata for all
clauses associated with a directive at the end of metadata describing
the directive.

As we said, all these are viable alternatives. We decided to choose
what we chose and not employ the alternatives listed above in order to
keep true the following three design principles:

1) Absolute simplicity and readability (including human readability).
This ruled out alternatives a) and [partially] c).
2) Uniformity of representation, for both directives and clauses. This
ruled out alternatives b) and c).
3) As much extensibility and openness for future changes in both LLVM
IR and OpenMP as possible. This ruled out alternatives b) and c).

We believe that the only downside of our choice is larger size of LLVM
IR required to represent same number of directives and clauses.

However, we consider this as a relatively minor element of our design;
one might argue in favor of these or other similar design
alternatives.


Conclusion
=========In this document we described our design proposal for OpenMP
representation in LLVM IR. We believe the IR extensions we proposed
are:

1) Simple

They rely on existing LLVM IR types and language constructs; the only
new things are a few new intrinsics. Thus, LLVM IR consumers lacking
OpenMP support can simply ignore these two intrinsics and still
generate correct, albeit sequential, code.

2) Complete

OpenMP 3.1 Specification ([OpenMP31]) is fully covered.

3) Extensible

The design approach is general enough to readily incorporate future
versions of OpenMP standard.

4) Enables both early and late procedurization and aggressive optimization

This is when compared with designs implying explicit procedurization
done right in front-ends.


Acknowledgements
===============Authors would like to acknowledge valuable contributions and
advice
provided by Kevin B Smith, Xinmin Tian, Stefanus Du Toit, Brian
Minard, Dmitry Babokin and other colleagues from Intel and Diego
Novillo from Google. This *does not* automatically imply approval of
this proposal by said individuals.


References
=========
[Lattner10] Chris Lattner, Devang Patel, “Extensible Metadata in LLVM
IR”. Available at:
http://blog.llvm.org/2010/04/extensible-metadata-in-llvm-ir.html

[Meyer92] Bertrand Meyer, “Applying “Design by Contract”, IEEE
Computer, October 1992, pp. 40-51. Available at:
http://se.ethz.ch/~meyer/publications/computer/contract.pdf

[Novillo00] Diego Novillo, “Analysis and Optimization of Explicitly
Parallel Programs”, PhD Thesis, University of Alberta, Edmonton,
Alberta, Canada, 2000. Available at:
http://www.airs.com/dnovillo/Papers/Thesis.pdf

[OpenMP11] “OpenMP Application Program Interface”, Version 3.1, July
2011. Available at: http://www.openmp.org/mp-documents/OpenMP3.1.pdf

[Tian05] Xinmin Tian, Milind Girkar, Aart Bik and Hideki Saito,
“Practical Compiler Techniques on Efficient Multithreaded Code
Generation for OpenMP Programs”, The Computer Journal, September 2005,
pp.588-601. Available at: http://dl.acm.org/citation.cfm?id=1095017
-------------- next part --------------
A non-text attachment was scrubbed...
Name: OpenMP_LLVM_IR.pdf
Type: application/pdf
Size: 666518 bytes
Desc: not available
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20120928/aaca2b70/attachment.pdf>

Hal Finkel

2012-Sep-28 21:56 UTC

head link

[LLVMdev] [RFC] OpenMP Representation in LLVM IR

Andrey,

I am very glad to see that you're interested in working on this! I
have a few comments:

As you may know, this is the third such proposal over the past two
months, one by me
(http://lists.cs.uiuc.edu/pipermail/llvmdev/2012-August/052472.html)
and the other, based somewhat on mine, by Sanjoy
(http://lists.cs.uiuc.edu/pipermail/llvmdev/2012-September/053798.html)

In order for your proposal to work well, there will be a lot of
infrastructure work required (more than with my proposal); many passes
will need to be made explicitly aware of how they can, or can't, reorder
things with respect to the parallelization intrinsics; loop
restructuring may require special care, etc. How this is done depends
in part on where the state information is stored: Do we keep the
parallelization information in the intrinsics during mid-level
optimization, or do we move its state into an analysis pass? In any
case, I don't object to this approach so long as we have a good plan
for how this work will be done.

When we discussed this earlier this year, there seemed to be some
consensus that we wanted to avoid, to the extent possible, introducing
OpenMP-specific intrinsics into the LLVM layer. Rather, we should
define some kind of parallelization API (in the form of metadata,
intrinsics, etc.) onto which OpenMP can naturally map along with other
paradigms. There is interest in supporting OpenACC, for example, which
will require data copying clauses, and it would make sense to share
as much of the infrastructure as possible with OpenMP. Are you
interested in providing Cilk support as well? We probably don't want to
have NxM slightly-different ways of expressing 'this is a parallel
region'. There are obviously cases in which things need to be specific
to the interface (like runtime loop scheduling in OpenMP which implies
a specific interaction with the runtime library), but such cases may be
the exception rather than the rule.

We don't need 'omp' in the intrinsic names and also 'OMP_'
on all of
the string specifiers. Maybe, to my previous point, we could call the
intrinsics 'parallel' and use 'OMP_' only when something is
really
OpenMP-specific?

You don't seem to want to map thread-private variables onto the
existing TLS support. Why?

Sincerely,
Hal

On Fri, 28 Sep 2012 22:42:37 +0400
Andrey Bokhanko <andreybokhanko at gmail.com> wrote:
> Hi All,
> 
> We'd like to make a proposal for OpenMP representation in LLVM IR.
> 
> Our goal is to reach an agreement in the community on a simple,
> complete and extensible representation of OpenMP language constructs
> in LLVM IR. Hopefully, this would serve as a common ground and would
> enable further development of OpenMP support both in Clang and LLVM
> compiler toolchain.
> 
> We seek feedback on the proposed design and ways to improve it.
> 
> Main authors of the proposal are Andrey Bokhanko and Alexey Bataev.
> Also, we'd like to acknowledge valuable contributions and advice
> provided by Kevin B Smith, Xinmin Tian, Stefanus Du Toit, Brian
> Minard, Dmitry Babokin and other colleagues from Intel and Diego
> Novillo from Google. NB, this *does not* automatically imply support
> of the proposal by said individuals.
> 
> Please find the proposal in *.pdf (attached to the message, for
> reading convenience) and plain text (below the message, for quoting
> convenience) formats. Their content is identical.
> 
> Full disclosure: both of us are associated with Intel and Intel
> Compiler team.
> 
> Yours,
> Andrey Bokhanko
> Alexey Bataev
> ============> Software Engineers
> Intel Compiler Team
> Intel Corp.
> 
> 
> OpenMP Representation in LLVM IR
> ===============================> Design Proposal
> 
> Alexey Bataev, Andrey Bokhanko
> 
> 
> Introduction
> ===========> This document describes design proposal for OpenMP
representation in
> LLVM IR.
> 
> Authors assume that readers have basic understanding of:
> 1) OpenMP principles and language constructs
> 2) General design of LLVM compiler system
> 
> 
> Goals
> ====> Our end goal is to provide a simple, complete and extensible
support
> for OpenMP representation in LLVM IR.
> 
> We aim to extend LLVM IR as little as possible, preferably without
> adding any new types / language constructs at all.
> 
> Also, we’d like to keep opportunities for optimization of parallel
> code as widely opened as possible –obviously, within the boundaries of
> preserving correct semantics of a user program.
> 
> And finally, in the spirit of OpenMP Specification ([OpenMP11]), LLVM
> IR based compilers should be able to easily skip IR extensions related
> to OpenMP support and still generate correct, albeit sequential, code.
> 
> 
> Function Outlining
> =================> Eventually, parallel regions should be put into
separate routines, a
> process usually called “function outlining” or “procedurization”. This
> can happen as early as in front-end, and as late as right before code
> generation.
> 
> As can be seen in the following sections, the IR extension we propose
> doesn’t involve explicit procedurization. Thus, we assume that
> function outlining should happen somewhere in the LLVM back-end, and
> usually this should be aligned with how chosen OpenMP runtime library
> works and what it expects. This is a deliberate decision on our part.
> We believe it provides the following benefits (when compared with
> designs involving procedurization done in a front-end):
> 
> 1) Function outlining doesn’t depend on source language; thus, it can
> be implemented once and used with any front-ends.
> 
> 2) Optimizations are usually restricted by a single function boundary.
> If procedurization is done in a front-end, this effectively kills any
> optimizations – as simple ones as loop invariant code motion. Refer to
> [Tian2005] for more information on why this is important for efficient
> optimization of OpenMP programs.
> 
> It should be stressed, though, that in order to preserve correct
> semantics of a user program, optimizations should be made thread-aware
> (which, to the best of our knowledge, is not the case with LLVM
> optimizations).
> 
> By “thread-aware” we mean an optimization that performs legal
> transformations on parallel programs. Refer to [Novillo00] for more
> information on correctness of optimizations for parallel programs.
> 
> Given that LLVM optimizations are currently not thread-aware, initial
> implementation should call procedurization pass right at the start of
> the optimizer. In the future, this pass can be moved further and
> further down the optimizer, as more optimizations will become
> thread-aware.
> 
> Essentially, this initial implementation is not that different from
> implementations with procedurization done in front-end; however, it
> keeps window of possibility opened.
> 
> 
> Scope
> ====> Scope of this design is OpenMP representation in LLVM IR. Thus,
all
> other issues related to OpenMP support, like runtime library, ABI, etc
> are not covered.
> 
> We also included a set of requirements for front-ends and back-ends,
> which establish mutual expectations and is an important addition to
> the design.
> 
> Our proposal is based on the latest published OpenMP specification,
> which is version 3.1 ([OpenMP31]) at the time of writing. However, the
> design approach we employed is general enough to allow easy adaptation
> for future versions of OpenMP standard.
> 
> 
> Front-End/Back-End Contract
> ==========================> While not a part of OpenMP representation
design per se, the following
> pre- and post-conditions help to establish a set of mutual expectation
> between front-ends (that generate LLVM IR with OpenMP support) and
> back-ends (that consume it). Without this kind of a “contract”
> ([Meyer92]), a back-end has to verify too much and basically repeat
> the work already done by a front-end.
> 
> While it is possible to generate LLVM IR that violates these
> conditions, we consider it to be non-conformant. Obviously,
> conformance can be verified, if necessary.
> 
> Pre- and Post-conditions for Front-Ends
> ---------------------------------------
> 1) Guarantee correct semantics of directives and clauses. This
> includes nomenclature and number of clauses in directives, correct
> nesting of directives, values of clauses, etc.
> 
> For example, front-ends should guarantee that omp single directive has
> at most one private, firstprivate, copyprivate and nowait clause, and
> nothing else; omp section directive is nested inside omp sections
> directive; the only possible value of private clause is a list of
> variables available at the point where this clause is present; etc.
> 
> 2) Guarantee correct semantics of statements / structured blocks
> following directives.
> As an example, it should be guaranteed that only for-loops follow omp
> parallel for directive.
> 
> 3) Set _OPENMP macro. This is required in section 2.2 of [OpenMP11].
> 
> 4) Support a command-line option that enables/disables OpenMP support.
> This is required in Chapter 2 of [OpenMP11].
> 
> If a user program violates OpenMP semantics and thus, makes compliance
> with first two conditions impossible, a front-end should report a
> meaningful error message and stop compilation.
> 
> It should be noted that front-ends are not required to guarantee full
> conformance of generated LLVM IR to OpenMP specification. As an
> example, the specification deems programs that branch into or out of
> parallel regions to be non-conformant. We believe that verification of
> such properties is too complex for most front-ends. This matches well
> with what is written in the specification itself: “compliant
> implementations are not required to check for code sequences that
> cause a program to be classified as non-conforming” ([OpenMP11],
> Section 1.1).
> 
> Pre- and Post-conditions for Back-Ends
> --------------------------------------
> 1) Intrinsics, builtins, library routines and language implementation
> should be thread-safe. This is equally applicable to front-ends and is
> required in Section 1.5 of [OpenMP11].
> 
> 2) Optimizations should be thread-aware.
> 
> 3) Support a command-line option that enables/disables OpenMP support.
> This is required in Chapter 2 of [OpenMP11].
> 
> 4) Rely on post-conditions guaranteed by front-end, and nothing else.
> 
> Conditions 2)-5) are only applicable to optimizations working on IR
> before procedurization.
> 
> As noted in the previous section, LLVM IR is not guaranteed to be
> fully conformant with OpenMP specification. Back-ends should be able
> to compile cleanly non-compliant programs, but no promises are made on
> correct behavior of resulting machine code.
> 
> 
> Elements of OpenMP
> =================> OpenMP is comprised of four components (as of version
3.1):
> * Directives (+ Clauses)
> * Internal Control Variables
> * Runtime Library Routines
> * Environment Variables
> 
> Internal Control Variables, Runtime Library Routines and Environment
> Variables are provided / handled by OpenMP runtime library. These
> components of OpenMP don’t require special support in LLVM IR and thus
> are out of scope of this document.
> 
> All OpenMP directives in C/C++ are specified with #pragma
> preprocessing directive and have the following format:
> 
> #pragma omp directive-name [clause [[,] clause]…].
> 
> Next two chapters describe our design proposal for representation of,
> correspondingly, directives and clauses.
> 
> 
> Directives
> =========> Directives in LLVM IR are represented as calls to
“llvm.omp.directive”
> intrinsic with a single argument referencing LLVM IR metadata. The
> metadata contains an identifier of a directive.
> 
> Almost all OpenMP directives are represented with two intrinsic calls:
> one for entry and one for exit point of the directive’s context. It is
> enough to have just one intrinsic call for several directives which
> are supposed to be enclosed in other directives (like omp section,
> which must appear only within omp sections context) or specify a
> single instruction (like omp flush) or are declarative (like omp
> threadprivate). Exit point for such directives is either non-existent
> or can be determined by examining other intrinsic calls and thus, not
> required to be explicitly present.
> 
> Metadata specify only one thing: type of a directive. Currently LLVM
> IR does not support integer or enumeration metadata types; thus, we
> decided to use MDString (metadata string type) to represent the type.
> The list of directives and their identifiers is shown in Table 1.
> 
> Table 1. OpenMP Directives
> 
> Type of directive #pragma omp ...      | MDString in metadata
> ------------------------------------------------------------
> parallel                               | OMP_PARALLEL
>                                        | OMP_END_PARALLEL
> ------------------------------------------------------------
> [parallel] for                         | OMP_[PARALLEL_]LOOP
>                                        | OMP_END_[PARALLEL_]LOOP
> ------------------------------------------------------------
> [parallel] sections                    | OMP_[PARALLEL_]SECTIONS
>                                        | OMP_END_[PARALLEL_]SECTIONS
> ------------------------------------------------------------
> section                                | OMP_SECTION
> ------------------------------------------------------------
> single                                 | OMP_SINGLE
>                                        | OMP_END_SINGLE
> ------------------------------------------------------------
> task                                   | OMP_PTASK
>                                        | OMP_END_PTASK
> ------------------------------------------------------------
> taskyield                              | OMP_PTASKYIELD
> ------------------------------------------------------------
> master                                 | OMP_MASTER
>                                        | OMP_END_MASTER
> ------------------------------------------------------------
> critical                               | OMP_CRITICAL
>                                        | OMP_END_CRITICAL
> ------------------------------------------------------------
> barrier                                | OMP_BARRIER
> ------------------------------------------------------------
> taskwait                               | OMP_PTASKWAIT
> ------------------------------------------------------------
> atomic                                 | OMP_ATOMIC
>                                        | OMP_END_ATOMIC
> ------------------------------------------------------------
> flush                                  | OMP_FLUSH
> ------------------------------------------------------------
> ordered                                | OMP_ORDERED
>                                        | OMP_END_ORDERED
> 
> An example of an OpenMP directive and its representation in LLVM IR:
> 
> C/C++
> -----
> #pragma omp parallel
> 
> LLVM IR
> -------
> call void @llvm.omp.directive(metadata !0)
> ...
> call void @llvm.omp.directive(metadata !1)
> 
> !0 = metadata !{metadata !”OMP_PARALLEL”}
> !1 = metadata !{metadata !”OMP_END_PARALLEL”}
> 
> 
> Clauses
> ======> All OpenMP clauses can be divided into four groups:
> * Ones with predefined values (default, ordered, nowait, untied, read,
> write, update, capture)
> * Ones with a list of variables (shared, private, firstprivate,
> lastprivate, copyin, copyprivate, directives flush, threadprivate)
> * Ones with a scalar or integer expression (if, num_threads, final,
> collapse)
> * Compound ones (reduction, schedule, directive critical)
> 
> Clauses in LLVM IR are represented as intrinsic calls. Each intrinsic
> call representing a clause has one mandatory argument and arbitrary
> number of optional arguments. The mandatory argument references LLVM
> IR metadata. The metadata contains identifier of the clause (MDString)
> and, for some compound clauses, additional data represented as another
> MDString. Optional arguments reference LLVM variables or expressions.
> 
> 
> Clauses with Predefined Values
> ------------------------------
> Clauses with predefined values are represented as calls to
> “llvm.omp.simple” intrinsic. For this group of clauses metadata
> contains an MDString with a clause’s identifier.
> 
> The list of clauses and their identifiers is shown in Table 2.
> 
> Table 2. OpenMP Clauses with Predefined Values
> 
> Clause                                 | MDString in Metadata
> ------------------------------------------------------------
> default(none)                          | OMP_DEFAULT_NONE
> ------------------------------------------------------------
> default(shared)                        | OMP_DEFAULT_SHARED
> ------------------------------------------------------------
> ordered                                | OMP_ORDERED
> ------------------------------------------------------------
> nowait                                 | OMP_NOWAIT
> ------------------------------------------------------------
> untied                                 | OMP_UNTIED
> ------------------------------------------------------------
> read                                   | OMP_READ
> ------------------------------------------------------------
> write                                  | OMP_WRITE
> ------------------------------------------------------------
> update                                 | OMP_UPDATE
> ------------------------------------------------------------
> capture                                | OMP_CAPTURE
> 
> Here is an example of OpenMP clause with a predefined value and its
> representation in LLVM IR:
> 
> C/C++
> -----
> #pragma omp atomic read
> 
> LLVM IR
> -------
> call void @llvm.omp.simple(metadata !1)
> 
> !1 = metadata !{metadata !”OMP_READ”}
> 
> 
> Clauses with a List of Variables
> --------------------------------
> Clauses with a list of variables are represented as calls to
> “llvm.omp.list” intrinsic. For this group of clauses metadata contains
> an MDString with a clause’s identifier.
> 
> Additional arguments of the intrinsic reference LLVM variables
> associated with a clause. It is important to reference variables
> directly in intrinsic calls and not in metadata, in order to preserve
> data dependency.
> 
> Each variable of a non user-defined type is represented with a single
> argument (referencing the variable).
> 
> Each variable of a user-defined type is represented with four
> arguments: one referencing the variable itself, one referencing its
> default constructor, one referencing its copy constructor and one
> referencing its destructor. References to constructors / destructors
> are required to correctly create and destroy private copies of
> variables.
> 
> The list of clauses and their identifiers is shown in Table 3.
> 
> Table 3. OpenMP Clauses with a List of Variables
> 
> Clause                                 | MDString in Metadata
> ------------------------------------------------------------
> private(list)                          | OMP_PRIVATE
> ------------------------------------------------------------
> firstprivate(list)                     | OMP_FIRSTPRIVATE
> ------------------------------------------------------------
> lastprivate(list)                      | OMP_LASTPRIVATE
> ------------------------------------------------------------
> shared(list)                           | OMP_SHARED
> ------------------------------------------------------------
> copyin(list)                           | OMP_COPYIN
> ------------------------------------------------------------
> Directive flush(list)                  | OMP_FLUSH
> ------------------------------------------------------------
> Directive threadprivate(list)          | OMP_THREADPRIVATE
> 
> Here is an example of OpenMP clause with a list of variables and its
> representation in LLVM IR:
> 
> C/C++
> -----
> #pragma omp parallel private(a,b)
> 
> LLVM IR
> -------
> call void (metadata, ...)* @llvm.omp.list(metadata !1, i32* @a, i32*
> @b)
> 
> !1 = metadata !{metadata !”OMP_PRIVATE”}
> 
> 
> Clauses with Scalar or Integer Expressions
> ------------------------------------------
> Clauses with scalar or integer expressions are represented as calls to
> “llvm.omp.expr” intrinsic. For this group of clauses metadata contains
> an MDString with a clause’s identifier.
> 
> Second argument of the intrinsic references a scalar or integer LLVM
> expression associated with a clause. It is important to reference
> expressions directly in intrinsic calls and not in metadata, in order
> to preserve data dependency.
> 
> The list of clauses and their identifiers is shown in Table 4.
> 
> Table 4. OpenMP Clauses with Scalar or Integer Expressions
> 
> Clause                                 | MDString in Metadata
> ------------------------------------------------------------
> if(scalar_expr)                        | OMP_IF
> ------------------------------------------------------------
> num_threads(integer_expr)              | OMP_NUM_THREADS
> ------------------------------------------------------------
> final(scalar_expr)                     | OMP_FINAL
> ------------------------------------------------------------
> collapse(const_integer_expr)           | OMP_COLLAPSE
> 
> Here is an example of OpenMP clause with a scalar expression and its
> representation in LLVM IR:
> 
> C/C++
> -----
> #pragma omp parallel if(a)
> 
> LLVM IR
> -------
> %4 = load i32* @a, align 4
> %5 = icmp ne i32 %4, 0
> call void @llvm.omp.expr(metadata !1, i32 %5)
> ...
> !1 = metadata !{metadata !”OMP_IF”}
> 
> 
> Compound Clauses
> ----------------
> Compound clauses are represented as calls to “llvm.omp.compound”
> intrinsic. For this group of clauses metadata contains an MDString
> with a clause’s identifier and additional data, also represented as an
> MDString.
> 
> Additional arguments of an intrinsic call for reduction clause
> reference LLVM variables associated with a clause (see format of
> “Clauses with a List of Variables”).
> 
> Second argument of an intrinsic call for schedule (static), schedule
> (dynamic) and schedule (guided) clauses references an integer LLVM
> expression associated with a clause (see format of “Clauses with
> Scalar or Integer Expressions”).
> 
> The list of compound clauses along with their representation in
> metadata is shown in Table 5.
> 
> Table 5. OpenMP Compound Clauses
> 
> Clause                                 | Representation in Metadata
> ------------------------------------------------------------
> reduction(operator : list)             | OMP_REDUCTION, <operator>
> ------------------------------------------------------------
> schedule(static [, integer_expr])      | OMP_SCHEDULE, STATIC
> ------------------------------------------------------------
> schedule(dynamic [, integer_expr])     | OMP_SCHEDULE, DYNAMIC
> ------------------------------------------------------------
> schedule(guided [, integer_expr])      | OMP_SCHEDULE, GUIDED
> ------------------------------------------------------------
> schedule(auto)                         | OMP_SCHEDULE, AUTO
> ------------------------------------------------------------
> schedule(runtime)                      | OMP_SCHEDULE, RUNTIME
> ------------------------------------------------------------
> Directive critical(name)               | OMP_NAME, <name>
> 
> Here is an example of a compound OpenMP clause and its representation
> in LLVM IR:
> 
> C/C++
> -----
> #pragma omp parallel reduction(+ : a, b)
> 
> LLVM IR
> -------
> call void (metadata, ...)* @llvm.omp.compound(metadata !1, i32* @a,
> i32* @b)
> 
> !1 = metadata !{metadata !”OMP_REDUCTION”, metadata !”+”}
> 
> 
> An Example
> ----------
> Here is a more complex example, demonstrating LLVM IR representation
> of a directive with several clauses of different types.
> 
> C/C++
> -----
> int a, gVar;
> int main() {
>   int lVar;
> #pragma omp parallel default(shared), private(gVar, a), if(lVar)
>   {
>     ...
>   }
>   return (0);
> }
> 
> LLVM IR
> -------
> @a = global i32 0, align 4
> @gVar = global i32 0, align 4
> define i32 @main () nounwind uwtable ssp {
>   %lVar = alloca i32, align 4
>   call void @llvm.omp.directive(metadata !0)
>   call void @llvm.omp.simple(metadata !1)
>   call void (metadata, ...)* @llvm.omp.list(metadata !2, i32* @gVar,
> i32* @a) call void @llvm.omp.expr(metadata !3, i32* %lVar)
>   ...
>   call void @llvm.omp.directive(metadata !4)
>   ret i32 0
> }
> !0 = metadata !{metadata !”OMP_PARALLEL”}
> !1 = metadata !{metadata !”OMP_DEFAULT_SHARED”}
> !2 = metadata !{metadata !”OMP_PRIVATE”}
> !3 = metadata !{metadata !”OMP_IF”}
> !4 = metadata !{metadata !”OMP_END_PARALLEL”}
> 
> 
> Design Alternatives
> ==================> It is possible to propose several viable
alternatives to representing
> information on OpenMP directives and clauses while keeping general
> design approach intact.
> 
> Some things that we considered:
> 
> a) Represent information on types of directives and clauses as
> constant expressions instead of MDStrings.
> b) Place MDStrings with identifiers of directives and clauses directly
> into intrinsic calls instead of metadata.
> c) Do not represent clauses as separate intrinsics with references to
> metadata; instead, put a list of references to metadata for all
> clauses associated with a directive at the end of metadata describing
> the directive.
> 
> As we said, all these are viable alternatives. We decided to choose
> what we chose and not employ the alternatives listed above in order to
> keep true the following three design principles:
> 
> 1) Absolute simplicity and readability (including human readability).
> This ruled out alternatives a) and [partially] c).
> 2) Uniformity of representation, for both directives and clauses. This
> ruled out alternatives b) and c).
> 3) As much extensibility and openness for future changes in both LLVM
> IR and OpenMP as possible. This ruled out alternatives b) and c).
> 
> We believe that the only downside of our choice is larger size of LLVM
> IR required to represent same number of directives and clauses.
> 
> However, we consider this as a relatively minor element of our design;
> one might argue in favor of these or other similar design
> alternatives.
> 
> 
> Conclusion
> =========> In this document we described our design proposal for OpenMP
> representation in LLVM IR. We believe the IR extensions we proposed
> are:
> 
> 1) Simple
> 
> They rely on existing LLVM IR types and language constructs; the only
> new things are a few new intrinsics. Thus, LLVM IR consumers lacking
> OpenMP support can simply ignore these two intrinsics and still
> generate correct, albeit sequential, code.
> 
> 2) Complete
> 
> OpenMP 3.1 Specification ([OpenMP31]) is fully covered.
> 
> 3) Extensible
> 
> The design approach is general enough to readily incorporate future
> versions of OpenMP standard.
> 
> 4) Enables both early and late procedurization and aggressive
> optimization
> 
> This is when compared with designs implying explicit procedurization
> done right in front-ends.
> 
> 
> Acknowledgements
> ===============> Authors would like to acknowledge valuable
contributions and advice
> provided by Kevin B Smith, Xinmin Tian, Stefanus Du Toit, Brian
> Minard, Dmitry Babokin and other colleagues from Intel and Diego
> Novillo from Google. This *does not* automatically imply approval of
> this proposal by said individuals.
> 
> 
> References
> =========> 
> [Lattner10] Chris Lattner, Devang Patel, “Extensible Metadata in LLVM
> IR”. Available at:
> http://blog.llvm.org/2010/04/extensible-metadata-in-llvm-ir.html
> 
> [Meyer92] Bertrand Meyer, “Applying “Design by Contract”, IEEE
> Computer, October 1992, pp. 40-51. Available at:
> http://se.ethz.ch/~meyer/publications/computer/contract.pdf
> 
> [Novillo00] Diego Novillo, “Analysis and Optimization of Explicitly
> Parallel Programs”, PhD Thesis, University of Alberta, Edmonton,
> Alberta, Canada, 2000. Available at:
> http://www.airs.com/dnovillo/Papers/Thesis.pdf
> 
> [OpenMP11] “OpenMP Application Program Interface”, Version 3.1, July
> 2011. Available at: http://www.openmp.org/mp-documents/OpenMP3.1.pdf
> 
> [Tian05] Xinmin Tian, Milind Girkar, Aart Bik and Hideki Saito,
> “Practical Compiler Techniques on Efficient Multithreaded Code
> Generation for OpenMP Programs”, The Computer Journal, September 2005,
> pp.588-601. Available at: http://dl.acm.org/citation.cfm?id=1095017


-- 
Hal Finkel
Postdoctoral Appointee
Leadership Computing Facility
Argonne National Laboratory

Pekka Jääskeläinen

2012-Sep-29 11:01 UTC

head link

[LLVMdev] [RFC] OpenMP Representation in LLVM IR

On 09/29/2012 12:56 AM, Hal Finkel wrote:> There is interest in supporting OpenACC, for example, which
> will require data copying clauses, and it would make sense to share
> as much of the infrastructure as possible with OpenMP. Are you
> interested in providing Cilk support as well?
I'd like to remind that more generic parallelism constructs in LLVM
would also help the Portable OpenCL implementation cause. The SPMD
multi work-item work-group functions (produced from OpenCL C kernels) should be
easily mappable to such constructs after which the actual parallelization (DLP,
TLP, ILP, or a combination of them) can be target-specific and done in generic
LLVM passes which, e.g., OpenMP benefits from.

-- 
--Pekka

Andrey Bokhanko

2012-Sep-29 17:16 UTC

head link

[LLVMdev] [RFC] OpenMP Representation in LLVM IR

Hal,

Thank you for the reply!
> As you may know, this is the third such proposal over the past two
> months, one by me
> (http://lists.cs.uiuc.edu/pipermail/llvmdev/2012-August/052472.html)
> and the other, based somewhat on mine, by Sanjoy
> (http://lists.cs.uiuc.edu/pipermail/llvmdev/2012-September/053798.html)
Yes, I was aware of your proposal. I hesitated to make any comments or
criticism -- as I am, obviously, biased.

In my opinion, two most important differences between our proposals are:

1) Your design employs explicit procedurization done in front-end,
while our design allows both early (right after front-end) and late
(later in the back-end) procedurization.
2) You aim to provide a general support for all (or at least most)
parallel standards, while our aim is more modest -- just OpenMP.

Please see discussion of 1) in "Function Outlining" section of our
proposal.

As for 2), there are many arguments one might use in favor of more
general or more specialized solution. What is easier to implement?
What is better for LLVM IR development? Are we sure what we see as
necessary and sufficient today would be suitable for future parallel
standards -- given all the developments happening in this area as we
speak? Whatever one answers, it would be quite subjective. My personal
preference is for simplest and most focused solution -- but then again
this is subjective.
> In order for your proposal to work well, there will be a lot of
> infrastructure work required (more than with my proposal); many passes
> will need to be made explicitly aware of how they can, or can't,
reorder
> things with respect to the parallelization intrinsics; loop
> restructuring may require special care, etc. How this is done depends
> in part on where the state information is stored: Do we keep the
> parallelization information in the intrinsics during mid-level
> optimization, or do we move its state into an analysis pass? In any
> case, I don't object to this approach so long as we have a good plan
> for how this work will be done.
No -- only passes than happen before procedurization should be aware
of these intrinsics.

I agree that it is not so easy to make optimizations "thread-aware".
But the problem essentially the same, no matter how parallel extension
is manifested in the IR.
> When we discussed this earlier this year, there seemed to be some
> consensus that we wanted to avoid, to the extent possible, introducing
> OpenMP-specific intrinsics into the LLVM layer. Rather, we should
> define some kind of parallelization API (in the form of metadata,
> intrinsics, etc.) onto which OpenMP can naturally map along with other
> paradigms. There is interest in supporting OpenACC, for example, which
> will require data copying clauses, and it would make sense to share
> as much of the infrastructure as possible with OpenMP. Are you
> interested in providing Cilk support as well? We probably don't want to
> have NxM slightly-different ways of expressing 'this is a parallel
> region'. There are obviously cases in which things need to be specific
> to the interface (like runtime loop scheduling in OpenMP which implies
> a specific interaction with the runtime library), but such cases may be
> the exception rather than the rule.
>
> We don't need 'omp' in the intrinsic names and also
'OMP_' on all of
> the string specifiers. Maybe, to my previous point, we could call the
> intrinsics 'parallel' and use 'OMP_' only when something is
really
> OpenMP-specific?
As I said before, our aim was quite simple -- OpenMP support only.

Can the design be extended to allow more general form of parallel
extensions support? Probably... but this is definitely more than what
we intended.
> You don't seem to want to map thread-private variables onto the
> existing TLS support. Why?
Because we don't employ explicit procedurization. What happens after
procedurization (including how thread-private variables are manifested
in the IR) is heavily dependent on OpenMP runtime library one relies
upon and out of scope of our proposal.

Yours,
Andrey Bokhanko
---
Software Engineer
Intel Compiler Team
Intel Corp.

greened at obbligato.org

2012-Oct-02 01:06 UTC

head link

[LLVMdev] [RFC] OpenMP Representation in LLVM IR

Andrey Bokhanko <andreybokhanko at gmail.com> writes:
> Hi All,
>
> We'd like to make a proposal for OpenMP representation in LLVM IR.
I'm providing some brief comments after a skim of this..
> Our goal is to reach an agreement in the community on a simple,
> complete and extensible representation of OpenMP language constructs
> in LLVM IR.
I think this is a bad idea.  OpenMP is not Low Level and I can't think
of a good reason to start putting OpenMP support in the IR.  Cray has
complete functioning OpenMP stack that performs very well without any
LLVM IR support at all.
> As can be seen in the following sections, the IR extension we propose
> doesn’t involve explicit procedurization. Thus, we assume that
> function outlining should happen somewhere in the LLVM back-end, and
> usually this should be aligned with how chosen OpenMP runtime library
> works and what it expects. This is a deliberate decision on our part.
> We believe it provides the following benefits (when compared with
> designs involving procedurization done in a front-end):
This is a very high-level transformation.  I don't think it belongs
in a low-level backend.
> 1) Function outlining doesn’t depend on source language; thus, it can
> be implemented once and used with any front-ends.
A higher-level IR would be more appropriate for this, either something
provided by Clang or another frontend or a some other mid-level IR.
> 2) Optimizations are usually restricted by a single function boundary.
> If procedurization is done in a front-end, this effectively kills any
> optimizations – as simple ones as loop invariant code motion. Refer to
> [Tian2005] for more information on why this is important for efficient
> optimization of OpenMP programs.
You're assuming all optimization is done by LLVM.  That's not true in
general.
> It should be stressed, though, that in order to preserve correct
> semantics of a user program, optimizations should be made thread-aware
> (which, to the best of our knowledge, is not the case with LLVM
> optimizations).
Another reason not to do this in LLVM.
> We also included a set of requirements for front-ends and back-ends,
> which establish mutual expectations and is an important addition to
> the design.
This will increase coupling between the "front ends" and LLVM.  That
would be very unfortunate.  One of LLVM's great strength is its
flexibility.

I didn't look at the details of how you map OMP directives to LLVM IR.
I think this is really the wrong way to go.

                       -David

greened at obbligato.org

2012-Oct-02 01:15 UTC

head link

[LLVMdev] [RFC] OpenMP Representation in LLVM IR

Hal Finkel <hfinkel at anl.gov> writes:

Hi Hal,
> As you may know, this is the third such proposal over the past two
> months, one by me
> (http://lists.cs.uiuc.edu/pipermail/llvmdev/2012-August/052472.html)
This link seems to be broken.  I missed your earlier proposal and would
like to read it.  As with this proposal, I fear any direct
parallelization support in LLVM is going to take us out of the "low
level" feature of LLVM which is a huge strength.
> and the other, based somewhat on mine, by Sanjoy
> (http://lists.cs.uiuc.edu/pipermail/llvmdev/2012-September/053798.html)
I read this proposal quickly.  I don't understand why we need
intrinsics.  Won't calls to runtime routines work just fine?

Ah, Sanjoy had a link to your proposal in his message.

Again, I only skimmed the document, but I was left with the question,
"why not just make calls to runtime routines?"  What is the reason for
the "paralleliation metadata?" It seems to me this implies/requires
that
LLVM have knowledge of parallel semantics.  That would be very
unfortunate.

                     -David

Hal Finkel

2012-Oct-02 04:03 UTC

head link

[LLVMdev] [RFC] OpenMP Representation in LLVM IR

On Mon, 01 Oct 2012 20:06:20 -0500
greened at obbligato.org wrote:
> Andrey Bokhanko <andreybokhanko at gmail.com> writes:
> 
> > Hi All,
> >
> > We'd like to make a proposal for OpenMP representation in LLVM IR.
> 
> I'm providing some brief comments after a skim of this..
> 
> > Our goal is to reach an agreement in the community on a simple,
> > complete and extensible representation of OpenMP language constructs
> > in LLVM IR.
> 
> I think this is a bad idea.  OpenMP is not Low Level and I can't think
> of a good reason to start putting OpenMP support in the IR.  Cray has
> complete functioning OpenMP stack that performs very well without any
> LLVM IR support at all.
OpenMP provides a mechanism to expresses parallel semantics, and the
best way to implement those semantics is highly target-dependent. On
some targets early lowering into a runtime library will perform well,
and optimization opportunities lost by doing so will prove fairly
insignificant in many cases. I can believe that this is true on those
systems that Cray targets. However, that will not be true everywhere. 
> 
> > As can be seen in the following sections, the IR extension we
> > propose doesn’t involve explicit procedurization. Thus, we assume
> > that function outlining should happen somewhere in the LLVM
> > back-end, and usually this should be aligned with how chosen OpenMP
> > runtime library works and what it expects. This is a deliberate
> > decision on our part. We believe it provides the following benefits
> > (when compared with designs involving procedurization done in a
> > front-end):
> 
> This is a very high-level transformation.  I don't think it belongs
> in a low-level backend.
> 
> > 1) Function outlining doesn’t depend on source language; thus, it
> > can be implemented once and used with any front-ends.
> 
> A higher-level IR would be more appropriate for this, either something
> provided by Clang or another frontend or a some other mid-level IR.
For some things, yes, but at the moment we don't have anything else
besides the LLVM IR. The LLVM IR is currently where vectorization is
done, loop-restructuring is done, aliasing analysis is performed, etc.
and so is where the parallelization should be done as well. For other
things, like atomics, lowering later may be better.
> 
> > 2) Optimizations are usually restricted by a single function
> > boundary. If procedurization is done in a front-end, this
> > effectively kills any optimizations – as simple ones as loop
> > invariant code motion. Refer to [Tian2005] for more information on
> > why this is important for efficient optimization of OpenMP programs.
> 
> You're assuming all optimization is done by LLVM.  That's not true
in
> general.
Even if LLVM has support for parallelization, no customer is required
to use it. If you'd like to lower parallelization semantics into
runtime calls before lowering in LLVM, you're free to do that.

Nevertheless, LLVM is where, for example, loop-invariant code motion is
performed. We don't want to procedurize parallel loops before that
happens.
> 
> > It should be stressed, though, that in order to preserve correct
> > semantics of a user program, optimizations should be made
> > thread-aware (which, to the best of our knowledge, is not the case
> > with LLVM optimizations).
> 
> Another reason not to do this in LLVM.
> 
> > We also included a set of requirements for front-ends and back-ends,
> > which establish mutual expectations and is an important addition to
> > the design.
> 
> This will increase coupling between the "front ends" and LLVM. 
That
> would be very unfortunate.  One of LLVM's great strength is its
> flexibility.
Most users do not use every feature of a programming language, and LLVM
is no different.
> 
> I didn't look at the details of how you map OMP directives to LLVM IR.
> I think this is really the wrong way to go.
Respectfully, I disagree.

 -Hal
> 
>                        -David
> 
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev


-- 
Hal Finkel
Postdoctoral Appointee
Leadership Computing Facility
Argonne National Laboratory

Marc J. Driftmeyer

2012-Oct-02 08:22 UTC

head link

[LLVMdev] [RFC] OpenMP Representation in LLVM IR

Not to distract, but the word, `procedurization' is not an English word. 
It's just leaping out at me when it is either procedure(s) (noun) or 
proceduralize (verb). Even processes would make sense. I couldn't help 
myself because the word was distracting.

- Marc

P.S. Not that my vote counts, but I'm more in the camp of Hal whose 
approach to tackling the parallelization foundation within LLVM [OpenMP 
just one aspect] makes more sense hitting that now by providing a clean 
and generic container without being hardwired to any one particular 
third party extension, ala the `complete and extensible representation 
of OpenMP.'

Speaking from not having the necessary foundation to really sway either 
direction it just seems more the spirit of Hal's project would fall 
in-line, with what has been historically how Lattner and company prefer 
to keep LLVM as clean a design and extensible as possible without being 
interdependent upon other projects hooks into extending value to the 
project.

As a former Apple/NeXT alum I would have already expected Enderby and 
company to have designed LLVM/Clang with OpenMP hooks if it were 
intended to have an intimate relationship with the OpenMP project. It 
seems to me that OpenCL, OpenMP and other solutions are best served as 
add-ons to the project freeing the project from being design compromised 
from it's original goals.

I'd make the same observations whether AMD, ARM or anyone else made the 
proposal attempting to interweave OpenMP as you hope to sway the 
community to allow. I prefer Hal's approach to the who problem space.

On 09/28/2012 11:42 AM, Andrey Bokhanko wrote:> Hi All,
>
> We'd like to make a proposal for OpenMP representation in LLVM IR.
>
> Our goal is to reach an agreement in the community on a simple,
> complete and extensible representation of OpenMP language constructs
> in LLVM IR. Hopefully, this would serve as a common ground and would
> enable further development of OpenMP support both in Clang and LLVM
> compiler toolchain.
>
> We seek feedback on the proposed design and ways to improve it.
>
> Main authors of the proposal are Andrey Bokhanko and Alexey Bataev.
> Also, we'd like to acknowledge valuable contributions and advice
> provided by Kevin B Smith, Xinmin Tian, Stefanus Du Toit, Brian
> Minard, Dmitry Babokin and other colleagues from Intel and Diego
> Novillo from Google. NB, this *does not* automatically imply support
> of the proposal by said individuals.
>
> Please find the proposal in *.pdf (attached to the message, for
> reading convenience) and plain text (below the message, for quoting
> convenience) formats. Their content is identical.
>
> Full disclosure: both of us are associated with Intel and Intel Compiler
team.
>
> Yours,
> Andrey Bokhanko
> Alexey Bataev
> ============> Software Engineers
> Intel Compiler Team
> Intel Corp.
>
>
> OpenMP Representation in LLVM IR
> ===============================> Design Proposal
>
> Alexey Bataev, Andrey Bokhanko
>
>
> Introduction
> ===========> This document describes design proposal for OpenMP
representation in LLVM IR.
>
> Authors assume that readers have basic understanding of:
> 1) OpenMP principles and language constructs
> 2) General design of LLVM compiler system
>
>
> Goals
> ====> Our end goal is to provide a simple, complete and extensible
support
> for OpenMP representation in LLVM IR.
>
> We aim to extend LLVM IR as little as possible, preferably without
> adding any new types / language constructs at all.
>
> Also, we’d like to keep opportunities for optimization of parallel
> code as widely opened as possible –obviously, within the boundaries of
> preserving correct semantics of a user program.
>
> And finally, in the spirit of OpenMP Specification ([OpenMP11]), LLVM
> IR based compilers should be able to easily skip IR extensions related
> to OpenMP support and still generate correct, albeit sequential, code.
>
>
> Function Outlining
> =================> Eventually, parallel regions should be put into
separate routines, a
> process usually called “function outlining” or “procedurization”. This
> can happen as early as in front-end, and as late as right before code
> generation.
>
> As can be seen in the following sections, the IR extension we propose
> doesn’t involve explicit procedurization. Thus, we assume that
> function outlining should happen somewhere in the LLVM back-end, and
> usually this should be aligned with how chosen OpenMP runtime library
> works and what it expects. This is a deliberate decision on our part.
> We believe it provides the following benefits (when compared with
> designs involving procedurization done in a front-end):
>
> 1) Function outlining doesn’t depend on source language; thus, it can
> be implemented once and used with any front-ends.
>
> 2) Optimizations are usually restricted by a single function boundary.
> If procedurization is done in a front-end, this effectively kills any
> optimizations – as simple ones as loop invariant code motion. Refer to
> [Tian2005] for more information on why this is important for efficient
> optimization of OpenMP programs.
>
> It should be stressed, though, that in order to preserve correct
> semantics of a user program, optimizations should be made thread-aware
> (which, to the best of our knowledge, is not the case with LLVM
> optimizations).
>
> By “thread-aware” we mean an optimization that performs legal
> transformations on parallel programs. Refer to [Novillo00] for more
> information on correctness of optimizations for parallel programs.
>
> Given that LLVM optimizations are currently not thread-aware, initial
> implementation should call procedurization pass right at the start of
> the optimizer. In the future, this pass can be moved further and
> further down the optimizer, as more optimizations will become
> thread-aware.
>
> Essentially, this initial implementation is not that different from
> implementations with procedurization done in front-end; however, it
> keeps window of possibility opened.
>
>
> Scope
> ====> Scope of this design is OpenMP representation in LLVM IR. Thus,
all
> other issues related to OpenMP support, like runtime library, ABI, etc
> are not covered.
>
> We also included a set of requirements for front-ends and back-ends,
> which establish mutual expectations and is an important addition to
> the design.
>
> Our proposal is based on the latest published OpenMP specification,
> which is version 3.1 ([OpenMP31]) at the time of writing. However, the
> design approach we employed is general enough to allow easy adaptation
> for future versions of OpenMP standard.
>
>
> Front-End/Back-End Contract
> ==========================> While not a part of OpenMP representation
design per se, the following
> pre- and post-conditions help to establish a set of mutual expectation
> between front-ends (that generate LLVM IR with OpenMP support) and
> back-ends (that consume it). Without this kind of a “contract”
> ([Meyer92]), a back-end has to verify too much and basically repeat
> the work already done by a front-end.
>
> While it is possible to generate LLVM IR that violates these
> conditions, we consider it to be non-conformant. Obviously,
> conformance can be verified, if necessary.
>
> Pre- and Post-conditions for Front-Ends
> ---------------------------------------
> 1) Guarantee correct semantics of directives and clauses. This
> includes nomenclature and number of clauses in directives, correct
> nesting of directives, values of clauses, etc.
>
> For example, front-ends should guarantee that omp single directive has
> at most one private, firstprivate, copyprivate and nowait clause, and
> nothing else; omp section directive is nested inside omp sections
> directive; the only possible value of private clause is a list of
> variables available at the point where this clause is present; etc.
>
> 2) Guarantee correct semantics of statements / structured blocks
> following directives.
> As an example, it should be guaranteed that only for-loops follow omp
> parallel for directive.
>
> 3) Set _OPENMP macro. This is required in section 2.2 of [OpenMP11].
>
> 4) Support a command-line option that enables/disables OpenMP support.
> This is required in Chapter 2 of [OpenMP11].
>
> If a user program violates OpenMP semantics and thus, makes compliance
> with first two conditions impossible, a front-end should report a
> meaningful error message and stop compilation.
>
> It should be noted that front-ends are not required to guarantee full
> conformance of generated LLVM IR to OpenMP specification. As an
> example, the specification deems programs that branch into or out of
> parallel regions to be non-conformant. We believe that verification of
> such properties is too complex for most front-ends. This matches well
> with what is written in the specification itself: “compliant
> implementations are not required to check for code sequences that
> cause a program to be classified as non-conforming” ([OpenMP11],
> Section 1.1).
>
> Pre- and Post-conditions for Back-Ends
> --------------------------------------
> 1) Intrinsics, builtins, library routines and language implementation
> should be thread-safe. This is equally applicable to front-ends and is
> required in Section 1.5 of [OpenMP11].
>
> 2) Optimizations should be thread-aware.
>
> 3) Support a command-line option that enables/disables OpenMP support.
> This is required in Chapter 2 of [OpenMP11].
>
> 4) Rely on post-conditions guaranteed by front-end, and nothing else.
>
> Conditions 2)-5) are only applicable to optimizations working on IR
> before procedurization.
>
> As noted in the previous section, LLVM IR is not guaranteed to be
> fully conformant with OpenMP specification. Back-ends should be able
> to compile cleanly non-compliant programs, but no promises are made on
> correct behavior of resulting machine code.
>
>
> Elements of OpenMP
> =================> OpenMP is comprised of four components (as of version
3.1):
> * Directives (+ Clauses)
> * Internal Control Variables
> * Runtime Library Routines
> * Environment Variables
>
> Internal Control Variables, Runtime Library Routines and Environment
> Variables are provided / handled by OpenMP runtime library. These
> components of OpenMP don’t require special support in LLVM IR and thus
> are out of scope of this document.
>
> All OpenMP directives in C/C++ are specified with #pragma
> preprocessing directive and have the following format:
>
> #pragma omp directive-name [clause [[,] clause]…].
>
> Next two chapters describe our design proposal for representation of,
> correspondingly, directives and clauses.
>
>
> Directives
> =========> Directives in LLVM IR are represented as calls to
“llvm.omp.directive”
> intrinsic with a single argument referencing LLVM IR metadata. The
> metadata contains an identifier of a directive.
>
> Almost all OpenMP directives are represented with two intrinsic calls:
> one for entry and one for exit point of the directive’s context. It is
> enough to have just one intrinsic call for several directives which
> are supposed to be enclosed in other directives (like omp section,
> which must appear only within omp sections context) or specify a
> single instruction (like omp flush) or are declarative (like omp
> threadprivate). Exit point for such directives is either non-existent
> or can be determined by examining other intrinsic calls and thus, not
> required to be explicitly present.
>
> Metadata specify only one thing: type of a directive. Currently LLVM
> IR does not support integer or enumeration metadata types; thus, we
> decided to use MDString (metadata string type) to represent the type.
> The list of directives and their identifiers is shown in Table 1.
>
> Table 1. OpenMP Directives
>
> Type of directive #pragma omp ...      | MDString in metadata
> ------------------------------------------------------------
> parallel                               | OMP_PARALLEL
>                                         | OMP_END_PARALLEL
> ------------------------------------------------------------
> [parallel] for                         | OMP_[PARALLEL_]LOOP
>                                         | OMP_END_[PARALLEL_]LOOP
> ------------------------------------------------------------
> [parallel] sections                    | OMP_[PARALLEL_]SECTIONS
>                                         | OMP_END_[PARALLEL_]SECTIONS
> ------------------------------------------------------------
> section                                | OMP_SECTION
> ------------------------------------------------------------
> single                                 | OMP_SINGLE
>                                         | OMP_END_SINGLE
> ------------------------------------------------------------
> task                                   | OMP_PTASK
>                                         | OMP_END_PTASK
> ------------------------------------------------------------
> taskyield                              | OMP_PTASKYIELD
> ------------------------------------------------------------
> master                                 | OMP_MASTER
>                                         | OMP_END_MASTER
> ------------------------------------------------------------
> critical                               | OMP_CRITICAL
>                                         | OMP_END_CRITICAL
> ------------------------------------------------------------
> barrier                                | OMP_BARRIER
> ------------------------------------------------------------
> taskwait                               | OMP_PTASKWAIT
> ------------------------------------------------------------
> atomic                                 | OMP_ATOMIC
>                                         | OMP_END_ATOMIC
> ------------------------------------------------------------
> flush                                  | OMP_FLUSH
> ------------------------------------------------------------
> ordered                                | OMP_ORDERED
>                                         | OMP_END_ORDERED
>
> An example of an OpenMP directive and its representation in LLVM IR:
>
> C/C++
> -----
> #pragma omp parallel
>
> LLVM IR
> -------
> call void @llvm.omp.directive(metadata !0)
> ...
> call void @llvm.omp.directive(metadata !1)
>
> !0 = metadata !{metadata !”OMP_PARALLEL”}
> !1 = metadata !{metadata !”OMP_END_PARALLEL”}
>
>
> Clauses
> ======> All OpenMP clauses can be divided into four groups:
> * Ones with predefined values (default, ordered, nowait, untied, read,
> write, update, capture)
> * Ones with a list of variables (shared, private, firstprivate,
> lastprivate, copyin, copyprivate, directives flush, threadprivate)
> * Ones with a scalar or integer expression (if, num_threads, final,
collapse)
> * Compound ones (reduction, schedule, directive critical)
>
> Clauses in LLVM IR are represented as intrinsic calls. Each intrinsic
> call representing a clause has one mandatory argument and arbitrary
> number of optional arguments. The mandatory argument references LLVM
> IR metadata. The metadata contains identifier of the clause (MDString)
> and, for some compound clauses, additional data represented as another
> MDString. Optional arguments reference LLVM variables or expressions.
>
>
> Clauses with Predefined Values
> ------------------------------
> Clauses with predefined values are represented as calls to
> “llvm.omp.simple” intrinsic. For this group of clauses metadata
> contains an MDString with a clause’s identifier.
>
> The list of clauses and their identifiers is shown in Table 2.
>
> Table 2. OpenMP Clauses with Predefined Values
>
> Clause                                 | MDString in Metadata
> ------------------------------------------------------------
> default(none)                          | OMP_DEFAULT_NONE
> ------------------------------------------------------------
> default(shared)                        | OMP_DEFAULT_SHARED
> ------------------------------------------------------------
> ordered                                | OMP_ORDERED
> ------------------------------------------------------------
> nowait                                 | OMP_NOWAIT
> ------------------------------------------------------------
> untied                                 | OMP_UNTIED
> ------------------------------------------------------------
> read                                   | OMP_READ
> ------------------------------------------------------------
> write                                  | OMP_WRITE
> ------------------------------------------------------------
> update                                 | OMP_UPDATE
> ------------------------------------------------------------
> capture                                | OMP_CAPTURE
>
> Here is an example of OpenMP clause with a predefined value and its
> representation in LLVM IR:
>
> C/C++
> -----
> #pragma omp atomic read
>
> LLVM IR
> -------
> call void @llvm.omp.simple(metadata !1)
>
> !1 = metadata !{metadata !”OMP_READ”}
>
>
> Clauses with a List of Variables
> --------------------------------
> Clauses with a list of variables are represented as calls to
> “llvm.omp.list” intrinsic. For this group of clauses metadata contains
> an MDString with a clause’s identifier.
>
> Additional arguments of the intrinsic reference LLVM variables
> associated with a clause. It is important to reference variables
> directly in intrinsic calls and not in metadata, in order to preserve
> data dependency.
>
> Each variable of a non user-defined type is represented with a single
> argument (referencing the variable).
>
> Each variable of a user-defined type is represented with four
> arguments: one referencing the variable itself, one referencing its
> default constructor, one referencing its copy constructor and one
> referencing its destructor. References to constructors / destructors
> are required to correctly create and destroy private copies of
> variables.
>
> The list of clauses and their identifiers is shown in Table 3.
>
> Table 3. OpenMP Clauses with a List of Variables
>
> Clause                                 | MDString in Metadata
> ------------------------------------------------------------
> private(list)                          | OMP_PRIVATE
> ------------------------------------------------------------
> firstprivate(list)                     | OMP_FIRSTPRIVATE
> ------------------------------------------------------------
> lastprivate(list)                      | OMP_LASTPRIVATE
> ------------------------------------------------------------
> shared(list)                           | OMP_SHARED
> ------------------------------------------------------------
> copyin(list)                           | OMP_COPYIN
> ------------------------------------------------------------
> Directive flush(list)                  | OMP_FLUSH
> ------------------------------------------------------------
> Directive threadprivate(list)          | OMP_THREADPRIVATE
>
> Here is an example of OpenMP clause with a list of variables and its
> representation in LLVM IR:
>
> C/C++
> -----
> #pragma omp parallel private(a,b)
>
> LLVM IR
> -------
> call void (metadata, ...)* @llvm.omp.list(metadata !1, i32* @a, i32* @b)
>
> !1 = metadata !{metadata !”OMP_PRIVATE”}
>
>
> Clauses with Scalar or Integer Expressions
> ------------------------------------------
> Clauses with scalar or integer expressions are represented as calls to
> “llvm.omp.expr” intrinsic. For this group of clauses metadata contains
> an MDString with a clause’s identifier.
>
> Second argument of the intrinsic references a scalar or integer LLVM
> expression associated with a clause. It is important to reference
> expressions directly in intrinsic calls and not in metadata, in order
> to preserve data dependency.
>
> The list of clauses and their identifiers is shown in Table 4.
>
> Table 4. OpenMP Clauses with Scalar or Integer Expressions
>
> Clause                                 | MDString in Metadata
> ------------------------------------------------------------
> if(scalar_expr)                        | OMP_IF
> ------------------------------------------------------------
> num_threads(integer_expr)              | OMP_NUM_THREADS
> ------------------------------------------------------------
> final(scalar_expr)                     | OMP_FINAL
> ------------------------------------------------------------
> collapse(const_integer_expr)           | OMP_COLLAPSE
>
> Here is an example of OpenMP clause with a scalar expression and its
> representation in LLVM IR:
>
> C/C++
> -----
> #pragma omp parallel if(a)
>
> LLVM IR
> -------
> %4 = load i32* @a, align 4
> %5 = icmp ne i32 %4, 0
> call void @llvm.omp.expr(metadata !1, i32 %5)
> ...
> !1 = metadata !{metadata !”OMP_IF”}
>
>
> Compound Clauses
> ----------------
> Compound clauses are represented as calls to “llvm.omp.compound”
> intrinsic. For this group of clauses metadata contains an MDString
> with a clause’s identifier and additional data, also represented as an
> MDString.
>
> Additional arguments of an intrinsic call for reduction clause
> reference LLVM variables associated with a clause (see format of
> “Clauses with a List of Variables”).
>
> Second argument of an intrinsic call for schedule (static), schedule
> (dynamic) and schedule (guided) clauses references an integer LLVM
> expression associated with a clause (see format of “Clauses with
> Scalar or Integer Expressions”).
>
> The list of compound clauses along with their representation in
> metadata is shown in Table 5.
>
> Table 5. OpenMP Compound Clauses
>
> Clause                                 | Representation in Metadata
> ------------------------------------------------------------
> reduction(operator : list)             | OMP_REDUCTION,<operator>
> ------------------------------------------------------------
> schedule(static [, integer_expr])      | OMP_SCHEDULE, STATIC
> ------------------------------------------------------------
> schedule(dynamic [, integer_expr])     | OMP_SCHEDULE, DYNAMIC
> ------------------------------------------------------------
> schedule(guided [, integer_expr])      | OMP_SCHEDULE, GUIDED
> ------------------------------------------------------------
> schedule(auto)                         | OMP_SCHEDULE, AUTO
> ------------------------------------------------------------
> schedule(runtime)                      | OMP_SCHEDULE, RUNTIME
> ------------------------------------------------------------
> Directive critical(name)               | OMP_NAME,<name>
>
> Here is an example of a compound OpenMP clause and its representation
> in LLVM IR:
>
> C/C++
> -----
> #pragma omp parallel reduction(+ : a, b)
>
> LLVM IR
> -------
> call void (metadata, ...)* @llvm.omp.compound(metadata !1, i32* @a, i32*
@b)
>
> !1 = metadata !{metadata !”OMP_REDUCTION”, metadata !”+”}
>
>
> An Example
> ----------
> Here is a more complex example, demonstrating LLVM IR representation
> of a directive with several clauses of different types.
>
> C/C++
> -----
> int a, gVar;
> int main() {
>    int lVar;
> #pragma omp parallel default(shared), private(gVar, a), if(lVar)
>    {
>      ...
>    }
>    return (0);
> }
>
> LLVM IR
> -------
> @a = global i32 0, align 4
> @gVar = global i32 0, align 4
> define i32 @main () nounwind uwtable ssp {
>    %lVar = alloca i32, align 4
>    call void @llvm.omp.directive(metadata !0)
>    call void @llvm.omp.simple(metadata !1)
>    call void (metadata, ...)* @llvm.omp.list(metadata !2, i32* @gVar, i32*
@a)
>    call void @llvm.omp.expr(metadata !3, i32* %lVar)
>    ...
>    call void @llvm.omp.directive(metadata !4)
>    ret i32 0
> }
> !0 = metadata !{metadata !”OMP_PARALLEL”}
> !1 = metadata !{metadata !”OMP_DEFAULT_SHARED”}
> !2 = metadata !{metadata !”OMP_PRIVATE”}
> !3 = metadata !{metadata !”OMP_IF”}
> !4 = metadata !{metadata !”OMP_END_PARALLEL”}
>
>
> Design Alternatives
> ==================> It is possible to propose several viable
alternatives to representing
> information on OpenMP directives and clauses while keeping general
> design approach intact.
>
> Some things that we considered:
>
> a) Represent information on types of directives and clauses as
> constant expressions instead of MDStrings.
> b) Place MDStrings with identifiers of directives and clauses directly
> into intrinsic calls instead of metadata.
> c) Do not represent clauses as separate intrinsics with references to
> metadata; instead, put a list of references to metadata for all
> clauses associated with a directive at the end of metadata describing
> the directive.
>
> As we said, all these are viable alternatives. We decided to choose
> what we chose and not employ the alternatives listed above in order to
> keep true the following three design principles:
>
> 1) Absolute simplicity and readability (including human readability).
> This ruled out alternatives a) and [partially] c).
> 2) Uniformity of representation, for both directives and clauses. This
> ruled out alternatives b) and c).
> 3) As much extensibility and openness for future changes in both LLVM
> IR and OpenMP as possible. This ruled out alternatives b) and c).
>
> We believe that the only downside of our choice is larger size of LLVM
> IR required to represent same number of directives and clauses.
>
> However, we consider this as a relatively minor element of our design;
> one might argue in favor of these or other similar design
> alternatives.
>
>
> Conclusion
> =========> In this document we described our design proposal for OpenMP
> representation in LLVM IR. We believe the IR extensions we proposed
> are:
>
> 1) Simple
>
> They rely on existing LLVM IR types and language constructs; the only
> new things are a few new intrinsics. Thus, LLVM IR consumers lacking
> OpenMP support can simply ignore these two intrinsics and still
> generate correct, albeit sequential, code.
>
> 2) Complete
>
> OpenMP 3.1 Specification ([OpenMP31]) is fully covered.
>
> 3) Extensible
>
> The design approach is general enough to readily incorporate future
> versions of OpenMP standard.
>
> 4) Enables both early and late procedurization and aggressive optimization
>
> This is when compared with designs implying explicit procedurization
> done right in front-ends.
>
>
> Acknowledgements
> ===============> Authors would like to acknowledge valuable
contributions and advice
> provided by Kevin B Smith, Xinmin Tian, Stefanus Du Toit, Brian
> Minard, Dmitry Babokin and other colleagues from Intel and Diego
> Novillo from Google. This *does not* automatically imply approval of
> this proposal by said individuals.
>
>
> References
> =========>
> [Lattner10] Chris Lattner, Devang Patel, “Extensible Metadata in LLVM
> IR”. Available at:
> http://blog.llvm.org/2010/04/extensible-metadata-in-llvm-ir.html
>
> [Meyer92] Bertrand Meyer, “Applying “Design by Contract”, IEEE
> Computer, October 1992, pp. 40-51. Available at:
> http://se.ethz.ch/~meyer/publications/computer/contract.pdf
>
> [Novillo00] Diego Novillo, “Analysis and Optimization of Explicitly
> Parallel Programs”, PhD Thesis, University of Alberta, Edmonton,
> Alberta, Canada, 2000. Available at:
> http://www.airs.com/dnovillo/Papers/Thesis.pdf
>
> [OpenMP11] “OpenMP Application Program Interface”, Version 3.1, July
> 2011. Available at: http://www.openmp.org/mp-documents/OpenMP3.1.pdf
>
> [Tian05] Xinmin Tian, Milind Girkar, Aart Bik and Hideki Saito,
> “Practical Compiler Techniques on Efficient Multithreaded Code
> Generation for OpenMP Programs”, The Computer Journal, September 2005,
> pp.588-601. Available at: http://dl.acm.org/citation.cfm?id=1095017
>
>
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
-- 
Marc J. Driftmeyer
Email :: mjd at reanimality.com <mailto:mjd at reanimality.com>
Web :: http://www.reanimality.com
Cell :: (509) 435-5212
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20121002/3104b480/attachment.html>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: mjd.vcf
Type: text/x-vcard
Size: 317 bytes
Desc: not available
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20121002/3104b480/attachment.vcf>

Andrey Bokhanko

2012-Oct-02 12:00 UTC

head link

[LLVMdev] [RFC] OpenMP Representation in LLVM IR

Mark,
> Speaking from not having the necessary foundation to really sway either
> direction it just seems more the spirit of Hal's project would fall
in-line,
> with what has been historically how Lattner and company prefer to keep LLVM
> as clean a design and extensible as possible without being interdependent
> upon other projects hooks into extending value to the project.
I have hard time understanding why one proposal is different from
another in this regard.

Both rely on intrinsics and metadata... why intrinsics used in one
proposal are better (or worse) than ones used in another?

Yours,
Andrey
---
Software Engineer
Intel Compiler Team
Intel Corp.

P.S.: I tried diligently not to use "parallelization" word in this
message. :-)

On Tue, Oct 2, 2012 at 12:22 PM, Marc J. Driftmeyer <mjd at
reanimality.com> wrote:> Not to distract, but the word, `procedurization' is not an English
word.
> It's just leaping out at me when it is either procedure(s) (noun) or
> proceduralize (verb). Even processes would make sense. I couldn't help
> myself because the word was distracting.
>
> - Marc
>
> P.S. Not that my vote counts, but I'm more in the camp of Hal whose
approach
> to tackling the parallelization foundation within LLVM [OpenMP just one
> aspect] makes more sense hitting that now by providing a clean and generic
> container without being hardwired to any one particular third party
> extension, ala the `complete and extensible representation of OpenMP.'
>
> Speaking from not having the necessary foundation to really sway either
> direction it just seems more the spirit of Hal's project would fall
in-line,
> with what has been historically how Lattner and company prefer to keep LLVM
> as clean a design and extensible as possible without being interdependent
> upon other projects hooks into extending value to the project.
>
> As a former Apple/NeXT alum I would have already expected Enderby and
> company to have designed LLVM/Clang with OpenMP hooks if it were intended
to
> have an intimate relationship with the OpenMP project. It seems to me that
> OpenCL, OpenMP and other solutions are best served as add-ons to the
project
> freeing the project from being design compromised from it's original
goals.
>
> I'd make the same observations whether AMD, ARM or anyone else made the
> proposal attempting to interweave OpenMP as you hope to sway the community
> to allow. I prefer Hal's approach to the who problem space.
>
>
> On 09/28/2012 11:42 AM, Andrey Bokhanko wrote:
>
> Hi All,
>
> We'd like to make a proposal for OpenMP representation in LLVM IR.
>
> Our goal is to reach an agreement in the community on a simple,
> complete and extensible representation of OpenMP language constructs
> in LLVM IR. Hopefully, this would serve as a common ground and would
> enable further development of OpenMP support both in Clang and LLVM
> compiler toolchain.
>
> We seek feedback on the proposed design and ways to improve it.
>
> Main authors of the proposal are Andrey Bokhanko and Alexey Bataev.
> Also, we'd like to acknowledge valuable contributions and advice
> provided by Kevin B Smith, Xinmin Tian, Stefanus Du Toit, Brian
> Minard, Dmitry Babokin and other colleagues from Intel and Diego
> Novillo from Google. NB, this *does not* automatically imply support
> of the proposal by said individuals.
>
> Please find the proposal in *.pdf (attached to the message, for
> reading convenience) and plain text (below the message, for quoting
> convenience) formats. Their content is identical.
>
> Full disclosure: both of us are associated with Intel and Intel Compiler
> team.
>
> Yours,
> Andrey Bokhanko
> Alexey Bataev
> ============> Software Engineers
> Intel Compiler Team
> Intel Corp.
>
>
> OpenMP Representation in LLVM IR
> ===============================> Design Proposal
>
> Alexey Bataev, Andrey Bokhanko
>
>
> Introduction
> ===========> This document describes design proposal for OpenMP
representation in LLVM
> IR.
>
> Authors assume that readers have basic understanding of:
> 1) OpenMP principles and language constructs
> 2) General design of LLVM compiler system
>
>
> Goals
> ====> Our end goal is to provide a simple, complete and extensible
support
> for OpenMP representation in LLVM IR.
>
> We aim to extend LLVM IR as little as possible, preferably without
> adding any new types / language constructs at all.
>
> Also, we’d like to keep opportunities for optimization of parallel
> code as widely opened as possible –obviously, within the boundaries of
> preserving correct semantics of a user program.
>
> And finally, in the spirit of OpenMP Specification ([OpenMP11]), LLVM
> IR based compilers should be able to easily skip IR extensions related
> to OpenMP support and still generate correct, albeit sequential, code.
>
>
> Function Outlining
> =================> Eventually, parallel regions should be put into
separate routines, a
> process usually called “function outlining” or “procedurization”. This
> can happen as early as in front-end, and as late as right before code
> generation.
>
> As can be seen in the following sections, the IR extension we propose
> doesn’t involve explicit procedurization. Thus, we assume that
> function outlining should happen somewhere in the LLVM back-end, and
> usually this should be aligned with how chosen OpenMP runtime library
> works and what it expects. This is a deliberate decision on our part.
> We believe it provides the following benefits (when compared with
> designs involving procedurization done in a front-end):
>
> 1) Function outlining doesn’t depend on source language; thus, it can
> be implemented once and used with any front-ends.
>
> 2) Optimizations are usually restricted by a single function boundary.
> If procedurization is done in a front-end, this effectively kills any
> optimizations – as simple ones as loop invariant code motion. Refer to
> [Tian2005] for more information on why this is important for efficient
> optimization of OpenMP programs.
>
> It should be stressed, though, that in order to preserve correct
> semantics of a user program, optimizations should be made thread-aware
> (which, to the best of our knowledge, is not the case with LLVM
> optimizations).
>
> By “thread-aware” we mean an optimization that performs legal
> transformations on parallel programs. Refer to [Novillo00] for more
> information on correctness of optimizations for parallel programs.
>
> Given that LLVM optimizations are currently not thread-aware, initial
> implementation should call procedurization pass right at the start of
> the optimizer. In the future, this pass can be moved further and
> further down the optimizer, as more optimizations will become
> thread-aware.
>
> Essentially, this initial implementation is not that different from
> implementations with procedurization done in front-end; however, it
> keeps window of possibility opened.
>
>
> Scope
> ====> Scope of this design is OpenMP representation in LLVM IR. Thus,
all
> other issues related to OpenMP support, like runtime library, ABI, etc
> are not covered.
>
> We also included a set of requirements for front-ends and back-ends,
> which establish mutual expectations and is an important addition to
> the design.
>
> Our proposal is based on the latest published OpenMP specification,
> which is version 3.1 ([OpenMP31]) at the time of writing. However, the
> design approach we employed is general enough to allow easy adaptation
> for future versions of OpenMP standard.
>
>
> Front-End/Back-End Contract
> ==========================> While not a part of OpenMP representation
design per se, the following
> pre- and post-conditions help to establish a set of mutual expectation
> between front-ends (that generate LLVM IR with OpenMP support) and
> back-ends (that consume it). Without this kind of a “contract”
> ([Meyer92]), a back-end has to verify too much and basically repeat
> the work already done by a front-end.
>
> While it is possible to generate LLVM IR that violates these
> conditions, we consider it to be non-conformant. Obviously,
> conformance can be verified, if necessary.
>
> Pre- and Post-conditions for Front-Ends
> ---------------------------------------
> 1) Guarantee correct semantics of directives and clauses. This
> includes nomenclature and number of clauses in directives, correct
> nesting of directives, values of clauses, etc.
>
> For example, front-ends should guarantee that omp single directive has
> at most one private, firstprivate, copyprivate and nowait clause, and
> nothing else; omp section directive is nested inside omp sections
> directive; the only possible value of private clause is a list of
> variables available at the point where this clause is present; etc.
>
> 2) Guarantee correct semantics of statements / structured blocks
> following directives.
> As an example, it should be guaranteed that only for-loops follow omp
> parallel for directive.
>
> 3) Set _OPENMP macro. This is required in section 2.2 of [OpenMP11].
>
> 4) Support a command-line option that enables/disables OpenMP support.
> This is required in Chapter 2 of [OpenMP11].
>
> If a user program violates OpenMP semantics and thus, makes compliance
> with first two conditions impossible, a front-end should report a
> meaningful error message and stop compilation.
>
> It should be noted that front-ends are not required to guarantee full
> conformance of generated LLVM IR to OpenMP specification. As an
> example, the specification deems programs that branch into or out of
> parallel regions to be non-conformant. We believe that verification of
> such properties is too complex for most front-ends. This matches well
> with what is written in the specification itself: “compliant
> implementations are not required to check for code sequences that
> cause a program to be classified as non-conforming” ([OpenMP11],
> Section 1.1).
>
> Pre- and Post-conditions for Back-Ends
> --------------------------------------
> 1) Intrinsics, builtins, library routines and language implementation
> should be thread-safe. This is equally applicable to front-ends and is
> required in Section 1.5 of [OpenMP11].
>
> 2) Optimizations should be thread-aware.
>
> 3) Support a command-line option that enables/disables OpenMP support.
> This is required in Chapter 2 of [OpenMP11].
>
> 4) Rely on post-conditions guaranteed by front-end, and nothing else.
>
> Conditions 2)-5) are only applicable to optimizations working on IR
> before procedurization.
>
> As noted in the previous section, LLVM IR is not guaranteed to be
> fully conformant with OpenMP specification. Back-ends should be able
> to compile cleanly non-compliant programs, but no promises are made on
> correct behavior of resulting machine code.
>
>
> Elements of OpenMP
> =================> OpenMP is comprised of four components (as of version
3.1):
> * Directives (+ Clauses)
> * Internal Control Variables
> * Runtime Library Routines
> * Environment Variables
>
> Internal Control Variables, Runtime Library Routines and Environment
> Variables are provided / handled by OpenMP runtime library. These
> components of OpenMP don’t require special support in LLVM IR and thus
> are out of scope of this document.
>
> All OpenMP directives in C/C++ are specified with #pragma
> preprocessing directive and have the following format:
>
> #pragma omp directive-name [clause [[,] clause]…].
>
> Next two chapters describe our design proposal for representation of,
> correspondingly, directives and clauses.
>
>
> Directives
> =========> Directives in LLVM IR are represented as calls to
“llvm.omp.directive”
> intrinsic with a single argument referencing LLVM IR metadata. The
> metadata contains an identifier of a directive.
>
> Almost all OpenMP directives are represented with two intrinsic calls:
> one for entry and one for exit point of the directive’s context. It is
> enough to have just one intrinsic call for several directives which
> are supposed to be enclosed in other directives (like omp section,
> which must appear only within omp sections context) or specify a
> single instruction (like omp flush) or are declarative (like omp
> threadprivate). Exit point for such directives is either non-existent
> or can be determined by examining other intrinsic calls and thus, not
> required to be explicitly present.
>
> Metadata specify only one thing: type of a directive. Currently LLVM
> IR does not support integer or enumeration metadata types; thus, we
> decided to use MDString (metadata string type) to represent the type.
> The list of directives and their identifiers is shown in Table 1.
>
> Table 1. OpenMP Directives
>
> Type of directive #pragma omp ...      | MDString in metadata
> ------------------------------------------------------------
> parallel                               | OMP_PARALLEL
>                                        | OMP_END_PARALLEL
> ------------------------------------------------------------
> [parallel] for                         | OMP_[PARALLEL_]LOOP
>                                        | OMP_END_[PARALLEL_]LOOP
> ------------------------------------------------------------
> [parallel] sections                    | OMP_[PARALLEL_]SECTIONS
>                                        | OMP_END_[PARALLEL_]SECTIONS
> ------------------------------------------------------------
> section                                | OMP_SECTION
> ------------------------------------------------------------
> single                                 | OMP_SINGLE
>                                        | OMP_END_SINGLE
> ------------------------------------------------------------
> task                                   | OMP_PTASK
>                                        | OMP_END_PTASK
> ------------------------------------------------------------
> taskyield                              | OMP_PTASKYIELD
> ------------------------------------------------------------
> master                                 | OMP_MASTER
>                                        | OMP_END_MASTER
> ------------------------------------------------------------
> critical                               | OMP_CRITICAL
>                                        | OMP_END_CRITICAL
> ------------------------------------------------------------
> barrier                                | OMP_BARRIER
> ------------------------------------------------------------
> taskwait                               | OMP_PTASKWAIT
> ------------------------------------------------------------
> atomic                                 | OMP_ATOMIC
>                                        | OMP_END_ATOMIC
> ------------------------------------------------------------
> flush                                  | OMP_FLUSH
> ------------------------------------------------------------
> ordered                                | OMP_ORDERED
>                                        | OMP_END_ORDERED
>
> An example of an OpenMP directive and its representation in LLVM IR:
>
> C/C++
> -----
> #pragma omp parallel
>
> LLVM IR
> -------
> call void @llvm.omp.directive(metadata !0)
> ...
> call void @llvm.omp.directive(metadata !1)
>
> !0 = metadata !{metadata !”OMP_PARALLEL”}
> !1 = metadata !{metadata !”OMP_END_PARALLEL”}
>
>
> Clauses
> ======> All OpenMP clauses can be divided into four groups:
> * Ones with predefined values (default, ordered, nowait, untied, read,
> write, update, capture)
> * Ones with a list of variables (shared, private, firstprivate,
> lastprivate, copyin, copyprivate, directives flush, threadprivate)
> * Ones with a scalar or integer expression (if, num_threads, final,
> collapse)
> * Compound ones (reduction, schedule, directive critical)
>
> Clauses in LLVM IR are represented as intrinsic calls. Each intrinsic
> call representing a clause has one mandatory argument and arbitrary
> number of optional arguments. The mandatory argument references LLVM
> IR metadata. The metadata contains identifier of the clause (MDString)
> and, for some compound clauses, additional data represented as another
> MDString. Optional arguments reference LLVM variables or expressions.
>
>
> Clauses with Predefined Values
> ------------------------------
> Clauses with predefined values are represented as calls to
> “llvm.omp.simple” intrinsic. For this group of clauses metadata
> contains an MDString with a clause’s identifier.
>
> The list of clauses and their identifiers is shown in Table 2.
>
> Table 2. OpenMP Clauses with Predefined Values
>
> Clause                                 | MDString in Metadata
> ------------------------------------------------------------
> default(none)                          | OMP_DEFAULT_NONE
> ------------------------------------------------------------
> default(shared)                        | OMP_DEFAULT_SHARED
> ------------------------------------------------------------
> ordered                                | OMP_ORDERED
> ------------------------------------------------------------
> nowait                                 | OMP_NOWAIT
> ------------------------------------------------------------
> untied                                 | OMP_UNTIED
> ------------------------------------------------------------
> read                                   | OMP_READ
> ------------------------------------------------------------
> write                                  | OMP_WRITE
> ------------------------------------------------------------
> update                                 | OMP_UPDATE
> ------------------------------------------------------------
> capture                                | OMP_CAPTURE
>
> Here is an example of OpenMP clause with a predefined value and its
> representation in LLVM IR:
>
> C/C++
> -----
> #pragma omp atomic read
>
> LLVM IR
> -------
> call void @llvm.omp.simple(metadata !1)
>
> !1 = metadata !{metadata !”OMP_READ”}
>
>
> Clauses with a List of Variables
> --------------------------------
> Clauses with a list of variables are represented as calls to
> “llvm.omp.list” intrinsic. For this group of clauses metadata contains
> an MDString with a clause’s identifier.
>
> Additional arguments of the intrinsic reference LLVM variables
> associated with a clause. It is important to reference variables
> directly in intrinsic calls and not in metadata, in order to preserve
> data dependency.
>
> Each variable of a non user-defined type is represented with a single
> argument (referencing the variable).
>
> Each variable of a user-defined type is represented with four
> arguments: one referencing the variable itself, one referencing its
> default constructor, one referencing its copy constructor and one
> referencing its destructor. References to constructors / destructors
> are required to correctly create and destroy private copies of
> variables.
>
> The list of clauses and their identifiers is shown in Table 3.
>
> Table 3. OpenMP Clauses with a List of Variables
>
> Clause                                 | MDString in Metadata
> ------------------------------------------------------------
> private(list)                          | OMP_PRIVATE
> ------------------------------------------------------------
> firstprivate(list)                     | OMP_FIRSTPRIVATE
> ------------------------------------------------------------
> lastprivate(list)                      | OMP_LASTPRIVATE
> ------------------------------------------------------------
> shared(list)                           | OMP_SHARED
> ------------------------------------------------------------
> copyin(list)                           | OMP_COPYIN
> ------------------------------------------------------------
> Directive flush(list)                  | OMP_FLUSH
> ------------------------------------------------------------
> Directive threadprivate(list)          | OMP_THREADPRIVATE
>
> Here is an example of OpenMP clause with a list of variables and its
> representation in LLVM IR:
>
> C/C++
> -----
> #pragma omp parallel private(a,b)
>
> LLVM IR
> -------
> call void (metadata, ...)* @llvm.omp.list(metadata !1, i32* @a, i32* @b)
>
> !1 = metadata !{metadata !”OMP_PRIVATE”}
>
>
> Clauses with Scalar or Integer Expressions
> ------------------------------------------
> Clauses with scalar or integer expressions are represented as calls to
> “llvm.omp.expr” intrinsic. For this group of clauses metadata contains
> an MDString with a clause’s identifier.
>
> Second argument of the intrinsic references a scalar or integer LLVM
> expression associated with a clause. It is important to reference
> expressions directly in intrinsic calls and not in metadata, in order
> to preserve data dependency.
>
> The list of clauses and their identifiers is shown in Table 4.
>
> Table 4. OpenMP Clauses with Scalar or Integer Expressions
>
> Clause                                 | MDString in Metadata
> ------------------------------------------------------------
> if(scalar_expr)                        | OMP_IF
> ------------------------------------------------------------
> num_threads(integer_expr)              | OMP_NUM_THREADS
> ------------------------------------------------------------
> final(scalar_expr)                     | OMP_FINAL
> ------------------------------------------------------------
> collapse(const_integer_expr)           | OMP_COLLAPSE
>
> Here is an example of OpenMP clause with a scalar expression and its
> representation in LLVM IR:
>
> C/C++
> -----
> #pragma omp parallel if(a)
>
> LLVM IR
> -------
> %4 = load i32* @a, align 4
> %5 = icmp ne i32 %4, 0
> call void @llvm.omp.expr(metadata !1, i32 %5)
> ...
> !1 = metadata !{metadata !”OMP_IF”}
>
>
> Compound Clauses
> ----------------
> Compound clauses are represented as calls to “llvm.omp.compound”
> intrinsic. For this group of clauses metadata contains an MDString
> with a clause’s identifier and additional data, also represented as an
> MDString.
>
> Additional arguments of an intrinsic call for reduction clause
> reference LLVM variables associated with a clause (see format of
> “Clauses with a List of Variables”).
>
> Second argument of an intrinsic call for schedule (static), schedule
> (dynamic) and schedule (guided) clauses references an integer LLVM
> expression associated with a clause (see format of “Clauses with
> Scalar or Integer Expressions”).
>
> The list of compound clauses along with their representation in
> metadata is shown in Table 5.
>
> Table 5. OpenMP Compound Clauses
>
> Clause                                 | Representation in Metadata
> ------------------------------------------------------------
> reduction(operator : list)             | OMP_REDUCTION, <operator>
> ------------------------------------------------------------
> schedule(static [, integer_expr])      | OMP_SCHEDULE, STATIC
> ------------------------------------------------------------
> schedule(dynamic [, integer_expr])     | OMP_SCHEDULE, DYNAMIC
> ------------------------------------------------------------
> schedule(guided [, integer_expr])      | OMP_SCHEDULE, GUIDED
> ------------------------------------------------------------
> schedule(auto)                         | OMP_SCHEDULE, AUTO
> ------------------------------------------------------------
> schedule(runtime)                      | OMP_SCHEDULE, RUNTIME
> ------------------------------------------------------------
> Directive critical(name)               | OMP_NAME, <name>
>
> Here is an example of a compound OpenMP clause and its representation
> in LLVM IR:
>
> C/C++
> -----
> #pragma omp parallel reduction(+ : a, b)
>
> LLVM IR
> -------
> call void (metadata, ...)* @llvm.omp.compound(metadata !1, i32* @a, i32*
@b)
>
> !1 = metadata !{metadata !”OMP_REDUCTION”, metadata !”+”}
>
>
> An Example
> ----------
> Here is a more complex example, demonstrating LLVM IR representation
> of a directive with several clauses of different types.
>
> C/C++
> -----
> int a, gVar;
> int main() {
>   int lVar;
> #pragma omp parallel default(shared), private(gVar, a), if(lVar)
>   {
>     ...
>   }
>   return (0);
> }
>
> LLVM IR
> -------
> @a = global i32 0, align 4
> @gVar = global i32 0, align 4
> define i32 @main () nounwind uwtable ssp {
>   %lVar = alloca i32, align 4
>   call void @llvm.omp.directive(metadata !0)
>   call void @llvm.omp.simple(metadata !1)
>   call void (metadata, ...)* @llvm.omp.list(metadata !2, i32* @gVar, i32*
> @a)
>   call void @llvm.omp.expr(metadata !3, i32* %lVar)
>   ...
>   call void @llvm.omp.directive(metadata !4)
>   ret i32 0
> }
> !0 = metadata !{metadata !”OMP_PARALLEL”}
> !1 = metadata !{metadata !”OMP_DEFAULT_SHARED”}
> !2 = metadata !{metadata !”OMP_PRIVATE”}
> !3 = metadata !{metadata !”OMP_IF”}
> !4 = metadata !{metadata !”OMP_END_PARALLEL”}
>
>
> Design Alternatives
> ==================> It is possible to propose several viable
alternatives to representing
> information on OpenMP directives and clauses while keeping general
> design approach intact.
>
> Some things that we considered:
>
> a) Represent information on types of directives and clauses as
> constant expressions instead of MDStrings.
> b) Place MDStrings with identifiers of directives and clauses directly
> into intrinsic calls instead of metadata.
> c) Do not represent clauses as separate intrinsics with references to
> metadata; instead, put a list of references to metadata for all
> clauses associated with a directive at the end of metadata describing
> the directive.
>
> As we said, all these are viable alternatives. We decided to choose
> what we chose and not employ the alternatives listed above in order to
> keep true the following three design principles:
>
> 1) Absolute simplicity and readability (including human readability).
> This ruled out alternatives a) and [partially] c).
> 2) Uniformity of representation, for both directives and clauses. This
> ruled out alternatives b) and c).
> 3) As much extensibility and openness for future changes in both LLVM
> IR and OpenMP as possible. This ruled out alternatives b) and c).
>
> We believe that the only downside of our choice is larger size of LLVM
> IR required to represent same number of directives and clauses.
>
> However, we consider this as a relatively minor element of our design;
> one might argue in favor of these or other similar design
> alternatives.
>
>
> Conclusion
> =========> In this document we described our design proposal for OpenMP
> representation in LLVM IR. We believe the IR extensions we proposed
> are:
>
> 1) Simple
>
> They rely on existing LLVM IR types and language constructs; the only
> new things are a few new intrinsics. Thus, LLVM IR consumers lacking
> OpenMP support can simply ignore these two intrinsics and still
> generate correct, albeit sequential, code.
>
> 2) Complete
>
> OpenMP 3.1 Specification ([OpenMP31]) is fully covered.
>
> 3) Extensible
>
> The design approach is general enough to readily incorporate future
> versions of OpenMP standard.
>
> 4) Enables both early and late procedurization and aggressive optimization
>
> This is when compared with designs implying explicit procedurization
> done right in front-ends.
>
>
> Acknowledgements
> ===============> Authors would like to acknowledge valuable
contributions and advice
> provided by Kevin B Smith, Xinmin Tian, Stefanus Du Toit, Brian
> Minard, Dmitry Babokin and other colleagues from Intel and Diego
> Novillo from Google. This *does not* automatically imply approval of
> this proposal by said individuals.
>
>
> References
> =========>
> [Lattner10] Chris Lattner, Devang Patel, “Extensible Metadata in LLVM
> IR”. Available at:
> http://blog.llvm.org/2010/04/extensible-metadata-in-llvm-ir.html
>
> [Meyer92] Bertrand Meyer, “Applying “Design by Contract”, IEEE
> Computer, October 1992, pp. 40-51. Available at:
> http://se.ethz.ch/~meyer/publications/computer/contract.pdf
>
> [Novillo00] Diego Novillo, “Analysis and Optimization of Explicitly
> Parallel Programs”, PhD Thesis, University of Alberta, Edmonton,
> Alberta, Canada, 2000. Available at:
> http://www.airs.com/dnovillo/Papers/Thesis.pdf
>
> [OpenMP11] “OpenMP Application Program Interface”, Version 3.1, July
> 2011. Available at: http://www.openmp.org/mp-documents/OpenMP3.1.pdf
>
> [Tian05] Xinmin Tian, Milind Girkar, Aart Bik and Hideki Saito,
> “Practical Compiler Techniques on Efficient Multithreaded Code
> Generation for OpenMP Programs”, The Computer Journal, September 2005,
> pp.588-601. Available at: http://dl.acm.org/citation.cfm?id=1095017
>
>
>
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>
>
> --
> Marc J. Driftmeyer
> Email :: mjd at reanimality.com
> Web :: http://www.reanimality.com
> Cell :: (509) 435-5212
>
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>

Hal Finkel

2012-Oct-03 02:44 UTC

head link

[LLVMdev] [RFC] OpenMP Representation in LLVM IR

Andrey,

While I think that it will be relatively easy to have the intrinsics
serve as code-motion barriers for other code that might be threads
sensitive (like other external function calls), we would need to think
through exactly how this would work. The easiest thing would be to make
the intrinsics have having unmodeled side effects, although we might
want to do something more intelligent.

Where do you propose placing the parallel loop intrinsics calls
relative to the loop code? Will this inhibit restructuring (like loop
interchange), fusion, etc. if necessary?

 -Hal

On Fri, 28 Sep 2012 22:42:37 +0400
Andrey Bokhanko <andreybokhanko at gmail.com> wrote:
> Hi All,
> 
> We'd like to make a proposal for OpenMP representation in LLVM IR.
> 
> Our goal is to reach an agreement in the community on a simple,
> complete and extensible representation of OpenMP language constructs
> in LLVM IR. Hopefully, this would serve as a common ground and would
> enable further development of OpenMP support both in Clang and LLVM
> compiler toolchain.
> 
> We seek feedback on the proposed design and ways to improve it.
> 
> Main authors of the proposal are Andrey Bokhanko and Alexey Bataev.
> Also, we'd like to acknowledge valuable contributions and advice
> provided by Kevin B Smith, Xinmin Tian, Stefanus Du Toit, Brian
> Minard, Dmitry Babokin and other colleagues from Intel and Diego
> Novillo from Google. NB, this *does not* automatically imply support
> of the proposal by said individuals.
> 
> Please find the proposal in *.pdf (attached to the message, for
> reading convenience) and plain text (below the message, for quoting
> convenience) formats. Their content is identical.
> 
> Full disclosure: both of us are associated with Intel and Intel
> Compiler team.
> 
> Yours,
> Andrey Bokhanko
> Alexey Bataev
> ============> Software Engineers
> Intel Compiler Team
> Intel Corp.
> 
> 
> OpenMP Representation in LLVM IR
> ===============================> Design Proposal
> 
> Alexey Bataev, Andrey Bokhanko
> 
> 
> Introduction
> ===========> This document describes design proposal for OpenMP
representation in
> LLVM IR.
> 
> Authors assume that readers have basic understanding of:
> 1) OpenMP principles and language constructs
> 2) General design of LLVM compiler system
> 
> 
> Goals
> ====> Our end goal is to provide a simple, complete and extensible
support
> for OpenMP representation in LLVM IR.
> 
> We aim to extend LLVM IR as little as possible, preferably without
> adding any new types / language constructs at all.
> 
> Also, we’d like to keep opportunities for optimization of parallel
> code as widely opened as possible –obviously, within the boundaries of
> preserving correct semantics of a user program.
> 
> And finally, in the spirit of OpenMP Specification ([OpenMP11]), LLVM
> IR based compilers should be able to easily skip IR extensions related
> to OpenMP support and still generate correct, albeit sequential, code.
> 
> 
> Function Outlining
> =================> Eventually, parallel regions should be put into
separate routines, a
> process usually called “function outlining” or “procedurization”. This
> can happen as early as in front-end, and as late as right before code
> generation.
> 
> As can be seen in the following sections, the IR extension we propose
> doesn’t involve explicit procedurization. Thus, we assume that
> function outlining should happen somewhere in the LLVM back-end, and
> usually this should be aligned with how chosen OpenMP runtime library
> works and what it expects. This is a deliberate decision on our part.
> We believe it provides the following benefits (when compared with
> designs involving procedurization done in a front-end):
> 
> 1) Function outlining doesn’t depend on source language; thus, it can
> be implemented once and used with any front-ends.
> 
> 2) Optimizations are usually restricted by a single function boundary.
> If procedurization is done in a front-end, this effectively kills any
> optimizations – as simple ones as loop invariant code motion. Refer to
> [Tian2005] for more information on why this is important for efficient
> optimization of OpenMP programs.
> 
> It should be stressed, though, that in order to preserve correct
> semantics of a user program, optimizations should be made thread-aware
> (which, to the best of our knowledge, is not the case with LLVM
> optimizations).
> 
> By “thread-aware” we mean an optimization that performs legal
> transformations on parallel programs. Refer to [Novillo00] for more
> information on correctness of optimizations for parallel programs.
> 
> Given that LLVM optimizations are currently not thread-aware, initial
> implementation should call procedurization pass right at the start of
> the optimizer. In the future, this pass can be moved further and
> further down the optimizer, as more optimizations will become
> thread-aware.
> 
> Essentially, this initial implementation is not that different from
> implementations with procedurization done in front-end; however, it
> keeps window of possibility opened.
> 
> 
> Scope
> ====> Scope of this design is OpenMP representation in LLVM IR. Thus,
all
> other issues related to OpenMP support, like runtime library, ABI, etc
> are not covered.
> 
> We also included a set of requirements for front-ends and back-ends,
> which establish mutual expectations and is an important addition to
> the design.
> 
> Our proposal is based on the latest published OpenMP specification,
> which is version 3.1 ([OpenMP31]) at the time of writing. However, the
> design approach we employed is general enough to allow easy adaptation
> for future versions of OpenMP standard.
> 
> 
> Front-End/Back-End Contract
> ==========================> While not a part of OpenMP representation
design per se, the following
> pre- and post-conditions help to establish a set of mutual expectation
> between front-ends (that generate LLVM IR with OpenMP support) and
> back-ends (that consume it). Without this kind of a “contract”
> ([Meyer92]), a back-end has to verify too much and basically repeat
> the work already done by a front-end.
> 
> While it is possible to generate LLVM IR that violates these
> conditions, we consider it to be non-conformant. Obviously,
> conformance can be verified, if necessary.
> 
> Pre- and Post-conditions for Front-Ends
> ---------------------------------------
> 1) Guarantee correct semantics of directives and clauses. This
> includes nomenclature and number of clauses in directives, correct
> nesting of directives, values of clauses, etc.
> 
> For example, front-ends should guarantee that omp single directive has
> at most one private, firstprivate, copyprivate and nowait clause, and
> nothing else; omp section directive is nested inside omp sections
> directive; the only possible value of private clause is a list of
> variables available at the point where this clause is present; etc.
> 
> 2) Guarantee correct semantics of statements / structured blocks
> following directives.
> As an example, it should be guaranteed that only for-loops follow omp
> parallel for directive.
> 
> 3) Set _OPENMP macro. This is required in section 2.2 of [OpenMP11].
> 
> 4) Support a command-line option that enables/disables OpenMP support.
> This is required in Chapter 2 of [OpenMP11].
> 
> If a user program violates OpenMP semantics and thus, makes compliance
> with first two conditions impossible, a front-end should report a
> meaningful error message and stop compilation.
> 
> It should be noted that front-ends are not required to guarantee full
> conformance of generated LLVM IR to OpenMP specification. As an
> example, the specification deems programs that branch into or out of
> parallel regions to be non-conformant. We believe that verification of
> such properties is too complex for most front-ends. This matches well
> with what is written in the specification itself: “compliant
> implementations are not required to check for code sequences that
> cause a program to be classified as non-conforming” ([OpenMP11],
> Section 1.1).
> 
> Pre- and Post-conditions for Back-Ends
> --------------------------------------
> 1) Intrinsics, builtins, library routines and language implementation
> should be thread-safe. This is equally applicable to front-ends and is
> required in Section 1.5 of [OpenMP11].
> 
> 2) Optimizations should be thread-aware.
> 
> 3) Support a command-line option that enables/disables OpenMP support.
> This is required in Chapter 2 of [OpenMP11].
> 
> 4) Rely on post-conditions guaranteed by front-end, and nothing else.
> 
> Conditions 2)-5) are only applicable to optimizations working on IR
> before procedurization.
> 
> As noted in the previous section, LLVM IR is not guaranteed to be
> fully conformant with OpenMP specification. Back-ends should be able
> to compile cleanly non-compliant programs, but no promises are made on
> correct behavior of resulting machine code.
> 
> 
> Elements of OpenMP
> =================> OpenMP is comprised of four components (as of version
3.1):
> * Directives (+ Clauses)
> * Internal Control Variables
> * Runtime Library Routines
> * Environment Variables
> 
> Internal Control Variables, Runtime Library Routines and Environment
> Variables are provided / handled by OpenMP runtime library. These
> components of OpenMP don’t require special support in LLVM IR and thus
> are out of scope of this document.
> 
> All OpenMP directives in C/C++ are specified with #pragma
> preprocessing directive and have the following format:
> 
> #pragma omp directive-name [clause [[,] clause]…].
> 
> Next two chapters describe our design proposal for representation of,
> correspondingly, directives and clauses.
> 
> 
> Directives
> =========> Directives in LLVM IR are represented as calls to
“llvm.omp.directive”
> intrinsic with a single argument referencing LLVM IR metadata. The
> metadata contains an identifier of a directive.
> 
> Almost all OpenMP directives are represented with two intrinsic calls:
> one for entry and one for exit point of the directive’s context. It is
> enough to have just one intrinsic call for several directives which
> are supposed to be enclosed in other directives (like omp section,
> which must appear only within omp sections context) or specify a
> single instruction (like omp flush) or are declarative (like omp
> threadprivate). Exit point for such directives is either non-existent
> or can be determined by examining other intrinsic calls and thus, not
> required to be explicitly present.
> 
> Metadata specify only one thing: type of a directive. Currently LLVM
> IR does not support integer or enumeration metadata types; thus, we
> decided to use MDString (metadata string type) to represent the type.
> The list of directives and their identifiers is shown in Table 1.
> 
> Table 1. OpenMP Directives
> 
> Type of directive #pragma omp ...      | MDString in metadata
> ------------------------------------------------------------
> parallel                               | OMP_PARALLEL
>                                        | OMP_END_PARALLEL
> ------------------------------------------------------------
> [parallel] for                         | OMP_[PARALLEL_]LOOP
>                                        | OMP_END_[PARALLEL_]LOOP
> ------------------------------------------------------------
> [parallel] sections                    | OMP_[PARALLEL_]SECTIONS
>                                        | OMP_END_[PARALLEL_]SECTIONS
> ------------------------------------------------------------
> section                                | OMP_SECTION
> ------------------------------------------------------------
> single                                 | OMP_SINGLE
>                                        | OMP_END_SINGLE
> ------------------------------------------------------------
> task                                   | OMP_PTASK
>                                        | OMP_END_PTASK
> ------------------------------------------------------------
> taskyield                              | OMP_PTASKYIELD
> ------------------------------------------------------------
> master                                 | OMP_MASTER
>                                        | OMP_END_MASTER
> ------------------------------------------------------------
> critical                               | OMP_CRITICAL
>                                        | OMP_END_CRITICAL
> ------------------------------------------------------------
> barrier                                | OMP_BARRIER
> ------------------------------------------------------------
> taskwait                               | OMP_PTASKWAIT
> ------------------------------------------------------------
> atomic                                 | OMP_ATOMIC
>                                        | OMP_END_ATOMIC
> ------------------------------------------------------------
> flush                                  | OMP_FLUSH
> ------------------------------------------------------------
> ordered                                | OMP_ORDERED
>                                        | OMP_END_ORDERED
> 
> An example of an OpenMP directive and its representation in LLVM IR:
> 
> C/C++
> -----
> #pragma omp parallel
> 
> LLVM IR
> -------
> call void @llvm.omp.directive(metadata !0)
> ...
> call void @llvm.omp.directive(metadata !1)
> 
> !0 = metadata !{metadata !”OMP_PARALLEL”}
> !1 = metadata !{metadata !”OMP_END_PARALLEL”}
> 
> 
> Clauses
> ======> All OpenMP clauses can be divided into four groups:
> * Ones with predefined values (default, ordered, nowait, untied, read,
> write, update, capture)
> * Ones with a list of variables (shared, private, firstprivate,
> lastprivate, copyin, copyprivate, directives flush, threadprivate)
> * Ones with a scalar or integer expression (if, num_threads, final,
> collapse)
> * Compound ones (reduction, schedule, directive critical)
> 
> Clauses in LLVM IR are represented as intrinsic calls. Each intrinsic
> call representing a clause has one mandatory argument and arbitrary
> number of optional arguments. The mandatory argument references LLVM
> IR metadata. The metadata contains identifier of the clause (MDString)
> and, for some compound clauses, additional data represented as another
> MDString. Optional arguments reference LLVM variables or expressions.
> 
> 
> Clauses with Predefined Values
> ------------------------------
> Clauses with predefined values are represented as calls to
> “llvm.omp.simple” intrinsic. For this group of clauses metadata
> contains an MDString with a clause’s identifier.
> 
> The list of clauses and their identifiers is shown in Table 2.
> 
> Table 2. OpenMP Clauses with Predefined Values
> 
> Clause                                 | MDString in Metadata
> ------------------------------------------------------------
> default(none)                          | OMP_DEFAULT_NONE
> ------------------------------------------------------------
> default(shared)                        | OMP_DEFAULT_SHARED
> ------------------------------------------------------------
> ordered                                | OMP_ORDERED
> ------------------------------------------------------------
> nowait                                 | OMP_NOWAIT
> ------------------------------------------------------------
> untied                                 | OMP_UNTIED
> ------------------------------------------------------------
> read                                   | OMP_READ
> ------------------------------------------------------------
> write                                  | OMP_WRITE
> ------------------------------------------------------------
> update                                 | OMP_UPDATE
> ------------------------------------------------------------
> capture                                | OMP_CAPTURE
> 
> Here is an example of OpenMP clause with a predefined value and its
> representation in LLVM IR:
> 
> C/C++
> -----
> #pragma omp atomic read
> 
> LLVM IR
> -------
> call void @llvm.omp.simple(metadata !1)
> 
> !1 = metadata !{metadata !”OMP_READ”}
> 
> 
> Clauses with a List of Variables
> --------------------------------
> Clauses with a list of variables are represented as calls to
> “llvm.omp.list” intrinsic. For this group of clauses metadata contains
> an MDString with a clause’s identifier.
> 
> Additional arguments of the intrinsic reference LLVM variables
> associated with a clause. It is important to reference variables
> directly in intrinsic calls and not in metadata, in order to preserve
> data dependency.
> 
> Each variable of a non user-defined type is represented with a single
> argument (referencing the variable).
> 
> Each variable of a user-defined type is represented with four
> arguments: one referencing the variable itself, one referencing its
> default constructor, one referencing its copy constructor and one
> referencing its destructor. References to constructors / destructors
> are required to correctly create and destroy private copies of
> variables.
> 
> The list of clauses and their identifiers is shown in Table 3.
> 
> Table 3. OpenMP Clauses with a List of Variables
> 
> Clause                                 | MDString in Metadata
> ------------------------------------------------------------
> private(list)                          | OMP_PRIVATE
> ------------------------------------------------------------
> firstprivate(list)                     | OMP_FIRSTPRIVATE
> ------------------------------------------------------------
> lastprivate(list)                      | OMP_LASTPRIVATE
> ------------------------------------------------------------
> shared(list)                           | OMP_SHARED
> ------------------------------------------------------------
> copyin(list)                           | OMP_COPYIN
> ------------------------------------------------------------
> Directive flush(list)                  | OMP_FLUSH
> ------------------------------------------------------------
> Directive threadprivate(list)          | OMP_THREADPRIVATE
> 
> Here is an example of OpenMP clause with a list of variables and its
> representation in LLVM IR:
> 
> C/C++
> -----
> #pragma omp parallel private(a,b)
> 
> LLVM IR
> -------
> call void (metadata, ...)* @llvm.omp.list(metadata !1, i32* @a, i32*
> @b)
> 
> !1 = metadata !{metadata !”OMP_PRIVATE”}
> 
> 
> Clauses with Scalar or Integer Expressions
> ------------------------------------------
> Clauses with scalar or integer expressions are represented as calls to
> “llvm.omp.expr” intrinsic. For this group of clauses metadata contains
> an MDString with a clause’s identifier.
> 
> Second argument of the intrinsic references a scalar or integer LLVM
> expression associated with a clause. It is important to reference
> expressions directly in intrinsic calls and not in metadata, in order
> to preserve data dependency.
> 
> The list of clauses and their identifiers is shown in Table 4.
> 
> Table 4. OpenMP Clauses with Scalar or Integer Expressions
> 
> Clause                                 | MDString in Metadata
> ------------------------------------------------------------
> if(scalar_expr)                        | OMP_IF
> ------------------------------------------------------------
> num_threads(integer_expr)              | OMP_NUM_THREADS
> ------------------------------------------------------------
> final(scalar_expr)                     | OMP_FINAL
> ------------------------------------------------------------
> collapse(const_integer_expr)           | OMP_COLLAPSE
> 
> Here is an example of OpenMP clause with a scalar expression and its
> representation in LLVM IR:
> 
> C/C++
> -----
> #pragma omp parallel if(a)
> 
> LLVM IR
> -------
> %4 = load i32* @a, align 4
> %5 = icmp ne i32 %4, 0
> call void @llvm.omp.expr(metadata !1, i32 %5)
> ...
> !1 = metadata !{metadata !”OMP_IF”}
> 
> 
> Compound Clauses
> ----------------
> Compound clauses are represented as calls to “llvm.omp.compound”
> intrinsic. For this group of clauses metadata contains an MDString
> with a clause’s identifier and additional data, also represented as an
> MDString.
> 
> Additional arguments of an intrinsic call for reduction clause
> reference LLVM variables associated with a clause (see format of
> “Clauses with a List of Variables”).
> 
> Second argument of an intrinsic call for schedule (static), schedule
> (dynamic) and schedule (guided) clauses references an integer LLVM
> expression associated with a clause (see format of “Clauses with
> Scalar or Integer Expressions”).
> 
> The list of compound clauses along with their representation in
> metadata is shown in Table 5.
> 
> Table 5. OpenMP Compound Clauses
> 
> Clause                                 | Representation in Metadata
> ------------------------------------------------------------
> reduction(operator : list)             | OMP_REDUCTION, <operator>
> ------------------------------------------------------------
> schedule(static [, integer_expr])      | OMP_SCHEDULE, STATIC
> ------------------------------------------------------------
> schedule(dynamic [, integer_expr])     | OMP_SCHEDULE, DYNAMIC
> ------------------------------------------------------------
> schedule(guided [, integer_expr])      | OMP_SCHEDULE, GUIDED
> ------------------------------------------------------------
> schedule(auto)                         | OMP_SCHEDULE, AUTO
> ------------------------------------------------------------
> schedule(runtime)                      | OMP_SCHEDULE, RUNTIME
> ------------------------------------------------------------
> Directive critical(name)               | OMP_NAME, <name>
> 
> Here is an example of a compound OpenMP clause and its representation
> in LLVM IR:
> 
> C/C++
> -----
> #pragma omp parallel reduction(+ : a, b)
> 
> LLVM IR
> -------
> call void (metadata, ...)* @llvm.omp.compound(metadata !1, i32* @a,
> i32* @b)
> 
> !1 = metadata !{metadata !”OMP_REDUCTION”, metadata !”+”}
> 
> 
> An Example
> ----------
> Here is a more complex example, demonstrating LLVM IR representation
> of a directive with several clauses of different types.
> 
> C/C++
> -----
> int a, gVar;
> int main() {
>   int lVar;
> #pragma omp parallel default(shared), private(gVar, a), if(lVar)
>   {
>     ...
>   }
>   return (0);
> }
> 
> LLVM IR
> -------
> @a = global i32 0, align 4
> @gVar = global i32 0, align 4
> define i32 @main () nounwind uwtable ssp {
>   %lVar = alloca i32, align 4
>   call void @llvm.omp.directive(metadata !0)
>   call void @llvm.omp.simple(metadata !1)
>   call void (metadata, ...)* @llvm.omp.list(metadata !2, i32* @gVar,
> i32* @a) call void @llvm.omp.expr(metadata !3, i32* %lVar)
>   ...
>   call void @llvm.omp.directive(metadata !4)
>   ret i32 0
> }
> !0 = metadata !{metadata !”OMP_PARALLEL”}
> !1 = metadata !{metadata !”OMP_DEFAULT_SHARED”}
> !2 = metadata !{metadata !”OMP_PRIVATE”}
> !3 = metadata !{metadata !”OMP_IF”}
> !4 = metadata !{metadata !”OMP_END_PARALLEL”}
> 
> 
> Design Alternatives
> ==================> It is possible to propose several viable
alternatives to representing
> information on OpenMP directives and clauses while keeping general
> design approach intact.
> 
> Some things that we considered:
> 
> a) Represent information on types of directives and clauses as
> constant expressions instead of MDStrings.
> b) Place MDStrings with identifiers of directives and clauses directly
> into intrinsic calls instead of metadata.
> c) Do not represent clauses as separate intrinsics with references to
> metadata; instead, put a list of references to metadata for all
> clauses associated with a directive at the end of metadata describing
> the directive.
> 
> As we said, all these are viable alternatives. We decided to choose
> what we chose and not employ the alternatives listed above in order to
> keep true the following three design principles:
> 
> 1) Absolute simplicity and readability (including human readability).
> This ruled out alternatives a) and [partially] c).
> 2) Uniformity of representation, for both directives and clauses. This
> ruled out alternatives b) and c).
> 3) As much extensibility and openness for future changes in both LLVM
> IR and OpenMP as possible. This ruled out alternatives b) and c).
> 
> We believe that the only downside of our choice is larger size of LLVM
> IR required to represent same number of directives and clauses.
> 
> However, we consider this as a relatively minor element of our design;
> one might argue in favor of these or other similar design
> alternatives.
> 
> 
> Conclusion
> =========> In this document we described our design proposal for OpenMP
> representation in LLVM IR. We believe the IR extensions we proposed
> are:
> 
> 1) Simple
> 
> They rely on existing LLVM IR types and language constructs; the only
> new things are a few new intrinsics. Thus, LLVM IR consumers lacking
> OpenMP support can simply ignore these two intrinsics and still
> generate correct, albeit sequential, code.
> 
> 2) Complete
> 
> OpenMP 3.1 Specification ([OpenMP31]) is fully covered.
> 
> 3) Extensible
> 
> The design approach is general enough to readily incorporate future
> versions of OpenMP standard.
> 
> 4) Enables both early and late procedurization and aggressive
> optimization
> 
> This is when compared with designs implying explicit procedurization
> done right in front-ends.
> 
> 
> Acknowledgements
> ===============> Authors would like to acknowledge valuable
contributions and advice
> provided by Kevin B Smith, Xinmin Tian, Stefanus Du Toit, Brian
> Minard, Dmitry Babokin and other colleagues from Intel and Diego
> Novillo from Google. This *does not* automatically imply approval of
> this proposal by said individuals.
> 
> 
> References
> =========> 
> [Lattner10] Chris Lattner, Devang Patel, “Extensible Metadata in LLVM
> IR”. Available at:
> http://blog.llvm.org/2010/04/extensible-metadata-in-llvm-ir.html
> 
> [Meyer92] Bertrand Meyer, “Applying “Design by Contract”, IEEE
> Computer, October 1992, pp. 40-51. Available at:
> http://se.ethz.ch/~meyer/publications/computer/contract.pdf
> 
> [Novillo00] Diego Novillo, “Analysis and Optimization of Explicitly
> Parallel Programs”, PhD Thesis, University of Alberta, Edmonton,
> Alberta, Canada, 2000. Available at:
> http://www.airs.com/dnovillo/Papers/Thesis.pdf
> 
> [OpenMP11] “OpenMP Application Program Interface”, Version 3.1, July
> 2011. Available at: http://www.openmp.org/mp-documents/OpenMP3.1.pdf
> 
> [Tian05] Xinmin Tian, Milind Girkar, Aart Bik and Hideki Saito,
> “Practical Compiler Techniques on Efficient Multithreaded Code
> Generation for OpenMP Programs”, The Computer Journal, September 2005,
> pp.588-601. Available at: http://dl.acm.org/citation.cfm?id=1095017


-- 
Hal Finkel
Postdoctoral Appointee
Leadership Computing Facility
Argonne National Laboratory

Andrey Bokhanko

2012-Oct-03 08:15 UTC

head link

[LLVMdev] [RFC] OpenMP Representation in LLVM IR

Hal,
> While I think that it will be relatively easy to have the intrinsics
> serve as code-motion barriers for other code that might be threads
> sensitive (like other external function calls), we would need to think
> through exactly how this would work. The easiest thing would be to make
> the intrinsics have having unmodeled side effects, although we might
> want to do something more intelligent.
Yes, that's exactly the idea.
> Where do you propose placing the parallel loop intrinsics calls
> relative to the loop code?
In preloop ("opening" intrinsic) and postloop ("closing"
one).
> Will this inhibit restructuring (like loop
> interchange), fusion, etc. if necessary?
I guess so... Loops usually deal with reading/writing memory, and if
an intrinsic is marked as "modifies everything", this hardly leaves
any possibility for [at least] the optimizations you mentioned.

But this is different from what I have in mind. Basically, the plan is
to perform analysis and some optimizations before procedurization, and
do the rest (including loop restructuring) after it. This is not
mentioned in the proposal (we tried to be succint -- only 20 pages
long! :-)), but explained in detail in [Tian05] (sorry, the link in
the proposal doesn't lead you directly to pdf file; use this one
instead:
http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.97.3763&rep=rep1&type=pdf).

Yours,
Andrey

Krzysztof Parzyszek

2012-Nov-07 17:31 UTC

head link

[LLVMdev] [RFC] OpenMP Representation in LLVM IR

On 10/1/2012 8:06 PM, greened at obbligato.org wrote:>
> I think this is a bad idea.  OpenMP is not Low Level and I can't think
> of a good reason to start putting OpenMP support in the IR.
OpenMP is as low-level as loop nest optimizations or inlining.  There is 
nothing about it that would make it not fit in the LLVM's model.

As for reasons, here are two:
1. Implementing parallelization directives in the IR would allow 
multiple front-ends to use the same interface to express SMP semantics 
without having each one of them to do the actual implementation.

2. Having the paralellization infrastructure implemented in LLVM would 
allow auto-parallelization to be added at some point without adding a 
new parallelization infrastructure to the optimizer (on top of what the 
front-ends would have already implemented).

-Krzysztof

-- 
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, 
hosted by The Linux Foundation

Reasonably Related Threads

Search for more seemingly similar threads

llvm dev - Oct 2012 - [LLVMdev] [RFC] OpenMP Representation in LLVM IR

[LLVMdev] [RFC] OpenMP Representation in LLVM IR

[LLVMdev] [RFC] OpenMP Representation in LLVM IR

[LLVMdev] [RFC] OpenMP Representation in LLVM IR

[LLVMdev] [RFC] OpenMP Representation in LLVM IR

[LLVMdev] [RFC] OpenMP Representation in LLVM IR

[LLVMdev] [RFC] OpenMP Representation in LLVM IR

[LLVMdev] [RFC] OpenMP Representation in LLVM IR

[LLVMdev] [RFC] OpenMP Representation in LLVM IR

[LLVMdev] [RFC] OpenMP Representation in LLVM IR

[LLVMdev] [RFC] OpenMP Representation in LLVM IR

[LLVMdev] [RFC] OpenMP Representation in LLVM IR

[LLVMdev] [RFC] OpenMP Representation in LLVM IR

Reasonably Related Threads