thr3ads.net - llvm dev - [LLVMdev] Upstreaming PNaCl's IR simplification passes [Mar 2014]

Mark Seaborn

2014-Mar-04 21:04 UTC

[LLVMdev] Upstreaming PNaCl's IR simplification passes

The PNaCl project has implemented various IR simplification passes that
simplify LLVM IR by lowering complex features to simpler features.  We'd
like to upstream some of these IR passes to LLVM.  We'd like to explore if
this is acceptable, and if so, how we should go about doing this.

The immediate reason is that Emscripten is reusing PNaCl's IR passes for
its new "fastcomp" backend [1].  It would be really useful if PNaCl and
Emscripten could collaborate via upstream LLVM rather than a branch.

Some background:  There are two related use cases for these IR
simplification passes:

 1) Simplifying the task of writing a new LLVM backend.  This is
Emscripten's use case.  The IR simplification passes reduce the number of
cases a backend has to handle, so they would be useful for anyone else
creating a new backend.

 2) Using a subset of LLVM IR as a stable distribution format for portable
executables.  This is PNaCl's use case.  PNaCl's IR subset omits various
complex IR features, which we lower using the IR simplification passes [2].
 Renderscript is an example of another project that uses IR as a stable
distribution format, though I think currently Renderscript is not
subsetting IR much.

Some examples of PNaCl's IR simplification passes are:

 * Calling conventions lowering:  ExpandVarArgs and ExpandByVal lower
varargs and by-value argument passing respectively.  They would be useful
for any backend that doesn't want to implement varargs or by-value calling
conventions.

 * Instruction-level lowering:
    * ExpandStructRegs splits up struct values into scalars, removing the
"insertvalue" and "extractvalue" instructions.
    * PromoteIntegers legalizes integer types (e.g. i30 is converted to
i32).

 * Module-level lowering:  This implements, at the IR level, functionality
that is traditionally provided by "ld".  e.g. ExpandCtors lowers
llvm.global_ctors to the __init_array_start and __init_array_end symbols
that are used by C libraries at startup.
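
To make the ExpandCtors example concrete, here is a minimal Python sketch of
the idea, not the pass itself.  It assumes llvm.global_ctors can be modelled
as a list of (priority, function) pairs that run in ascending priority order;
expand_ctors and run_init_array are hypothetical names used only for
illustration.

```python
def expand_ctors(global_ctors):
    """Lower (priority, function) pairs to a flat, ordered init array."""
    # Lower priority values run first. sorted() is stable, so entries with
    # equal priority keep their original order here (an assumption of this
    # sketch, chosen only to make the result deterministic).
    ordered = sorted(global_ctors, key=lambda entry: entry[0])
    return [func for _priority, func in ordered]


def run_init_array(init_array):
    """Model of startup code walking the array between the
    __init_array_start and __init_array_end symbols."""
    for func in init_array:
        func()


log = []
ctors = [
    (65535, lambda: log.append("default")),
    (101, lambda: log.append("early")),
]
run_init_array(expand_ctors(ctors))
print(log)
```

After a lowering of this kind, a backend only has to emit an ordinary array
of function pointers and never needs to know about llvm.global_ctors.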

PNaCl's IR simplification passes are modular -- most are independent of
each other -- so they allow projects to pick and choose which IR features
to support and which to pre-lower.  The modularity of these passes makes
them low-maintenance and easy to write targeted tests for.

The code for these passes can be found here:
https://chromium.googlesource.com/native_client/pnacl-llvm/+/master/lib/Transforms/NaCl/

There seems to be plenty of precedent for IR-to-IR lowering passes -- LLVM
already contains passes such as LowerInvoke, LowerSwitch and LowerAtomic.

The PNaCl team (which I'm a member of) is happy to take on the work of
maintaining this code, such as updating it as LLVM IR evolves and doing
code reviews.  We would upstream this gradually, pass by pass, so the
changes would be manageable.

Cheers,
Mark

[1] https://github.com/kripken/emscripten/wiki/LLVM-Backend
[2] https://groups.google.com/forum/#!topic/llvm-dev/lk6dZzwW0ls - PNaCl
Bitcode reference manual

Alon Zakai

2014-Mar-04 22:25 UTC

To add to what Mark mentioned about Emscripten's new backend [1] using the
PNaCl passes: they made writing the backend much easier than it otherwise
would have been, given our requirements. We are an 'odd' target in that we
want to transform LLVM IR into JavaScript and then run it through our
existing external JavaScript optimizer tool, which does very
JavaScript-specific optimizations (on a JavaScript AST, which is the natural
form for us). For that reason we don't use the common backend codegen path.
Basically the PNaCl simplification passes convert LLVM IR into a smaller and
simpler subset of LLVM IR, which makes writing a backend that processes LLVM
IR more convenient.

I think there are other use cases as well that could benefit from these
passes being upstream. While typically a backend would want to use the
common codegen to get register allocation and so forth, there are
situations where you just want to transform LLVM IR into something else.
For example in a university course you could teach people compiler
optimizations using LLVM IR, then have them write a tiny backend that
compiles that IR into a familiar language (Python, Java, anything else that
they already know) to execute it (lli also works of course, but this might
feel more "concrete" for the students, and they would learn more I
suspect). Writing that backend in a way that processes LLVM IR means you
only need them to understand LLVM IR and not anything about the selection
DAG etc. Also, there are situations where performance is really not a
concern, like someone writing a backend for a little VM they invented for
fun who just wants to execute small amounts of C code on it - for example
this happened with the DCPU-16 spec, and people made an LLVM backend for it.
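
As a rough illustration of the "tiny backend" idea, the sketch below
translates a made-up, straight-line mini-IR into Python source and runs it.
The tuple format, the opcode names and emit_python are all assumptions
invented for this example; this is not real LLVM IR and not how Emscripten's
backend works.  The point is only that the fewer opcodes reach the backend,
the fewer branches it needs.

```python
# Hypothetical mini-IR, not real LLVM IR: (dest, opcode, operands) tuples.
BINARY_OPS = {"add": "+", "sub": "-", "mul": "*"}


def emit_python(name, params, body):
    """Translate one straight-line function to Python source text."""
    lines = [f"def {name}({', '.join(params)}):"]
    for dest, opcode, operands in body:
        if opcode in BINARY_OPS:
            lhs, rhs = operands
            # Mask to 32 bits so results wrap like i32 arithmetic.
            lines.append(
                f"    {dest} = ({lhs} {BINARY_OPS[opcode]} {rhs}) & 0xFFFFFFFF"
            )
        elif opcode == "ret":
            lines.append(f"    return {operands[0]}")
        else:
            # Anything else is assumed to have been lowered away earlier.
            raise NotImplementedError(opcode)
    return "\n".join(lines)


program = [
    ("t0", "add", ("a", "b")),
    ("t1", "mul", ("t0", "t0")),
    (None, "ret", ("t1",)),
]
source = emit_python("square_of_sum", ["a", "b"], program)
namespace = {}
exec(source, namespace)  # run the generated Python
print(namespace["square_of_sum"](2, 3))  # prints 25
```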

In summary, I think the shared thing in these examples is that LLVM IR is
very nice to work with, and there are some situations where you're using it
and you have a reason to convert it into something else, and you want to do
that in as _simple_ a way as possible as opposed to generating the most
_optimal_ results. The PNaCl IR simplification passes are in my opinion a
big help there.

- Alon

[1] https://github.com/kripken/emscripten/wiki/LLVM-Backend


Sean Silva

2014-Mar-04 23:11 UTC

On Tue, Mar 4, 2014 at 4:04 PM, Mark Seaborn <mseaborn at chromium.org> wrote:
>  1) Simplifying the task of writing a new LLVM backend.  This is
> Emscripten's use case.  The IR simplification passes reduce the number of
> cases a backend has to handle, so they would be useful for anyone else
> creating a new backend.
>
FWIW, this sounds to me like a sufficiently compelling use case to support
getting this in-tree.

-- Sean Silva


Chandler Carruth

2014-Mar-04 23:17 UTC

On Tue, Mar 4, 2014 at 1:04 PM, Mark Seaborn <mseaborn at chromium.org> wrote:
> The PNaCl project has implemented various IR simplification passes that
> simplify LLVM IR by lowering complex features to simpler features.  We'd
> like to upstream some of these IR passes to LLVM.  We'd like to explore if
> this is acceptable, and if so, how we should go about doing this.
>
My question is somewhat different. I'm not questioning whether these are
acceptable, I'm questioning why these are interesting and important for the
LLVM project.

Neither PNaCl nor Emscripten open source projects have extensive developer
overlap with the LLVM community, and the developers have not (so far)
become super active maintainers of LLVM, although your recent patches to
fix some bugs uncovered by PNaCl have been much appreciated. These lowering
passes are likely to have few (most likely, zero) in-tree users for the
foreseeable future. I'm not enthusiastic about the community taking on the
maintenance, update, and code review burden of these.

I would point you at the several emails I have written to folks adding
significant new features to LLVM about how to offset this: by contributing
maintenance and improvements to the core infrastructure, fixing bugs, and
generally making things better to a degree sufficient to offset the ongoing
complexity cost of the new features. Fortunately, the PNaCl passes seem
somewhat less
complex than (for instance) the x32 backend, but they seem likely to still
add a reasonable amount of complexity. They will certainly be challenging
to review and get the design into an acceptable state across the community.
At this point, I'm not really optimistic about there being a large enough
body of community members excited about getting these passes in to offset
these costs. I'm happy to be proven wrong of course, and would also be
happy to see you, other PNaCl developers, or Emscripten developers become
more active in the community in order to build this trust and establish a
good basis for these to go into LLVM.

>
> The immediate reason is that Emscripten is reusing PNaCl's IR passes for
> its new "fastcomp" backend [1].  It would be really useful if PNaCl and
> Emscripten could collaborate via upstream LLVM rather than a branch.
>
While this does seem like a useful thing for your two projects, it isn't
clear why this benefits the LLVM community. Perhaps it does, but I'd like
to see that clarified.

> Some background:  There are two related use cases for these IR
> simplification passes:
>
>  1) Simplifying the task of writing a new LLVM backend.  This is
> Emscripten's use case.  The IR simplification passes reduce the number of
> cases a backend has to handle, so they would be useful for anyone else
> creating a new backend.
>
If these simplify writing a backend, why wouldn't the patches include
commensurate simplifications to LLVM's backends? That would both give them
an in-tree customer, and more immediate value to the community and project
as a whole.

>
>  2) Using a subset of LLVM IR as a stable distribution format for portable
> executables.  This is PNaCl's use case.  PNaCl's IR subset omits various
> complex IR features, which we lower using the IR simplification passes [2].
>  Renderscript is an example of another project that uses IR as a stable
> distribution format, though I think currently Renderscript is not
> subsetting IR much.
>
Given that the bitcode is stable, I don't understand why this is important.
What technical problems are you solving other than making the IR match some
predetermined form chosen by PNaCl?

>
> Some examples of PNaCl's IR simplification passes are:
>
I have a bunch of questions about the specific passes you mention. Perhaps
these questions are better answered in the review thread for the patches,
but they are at least things that I would think about and try to address if
and when you send out the code review.

>
>  * Calling conventions lowering:  ExpandVarArgs and ExpandByVal lower
> varargs and by-value argument passing respectively.  They would be useful
> for any backend that doesn't want to implement varargs or by-value calling
> conventions.
>
Why wouldn't these be applicable to existing backends? What is hard about
the existing representations?
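
For readers unfamiliar with the kind of lowering under discussion, the
general idea behind a varargs lowering can be modelled in a few lines of
Python.  This is only a sketch of the technique (pack the variable arguments
into an explicit buffer and pass that instead), not PNaCl's ExpandVarArgs
code; every function name here is hypothetical.

```python
def sum_variadic(count, *rest):
    """Before: the callee takes a variable number of arguments."""
    return sum(rest[:count])


def sum_lowered(count, arg_buffer):
    """After: fixed arity; the extra arguments arrive in one buffer."""
    cursor = 0  # stands in for the va_list
    total = 0
    for _ in range(count):
        total += arg_buffer[cursor]  # va_arg: read the current slot...
        cursor += 1                  # ...then advance to the next one
    return total


def call_site_lowered(count, *rest):
    """The rewritten call site packs the variable arguments itself."""
    arg_buffer = list(rest)  # models a caller-allocated stack buffer
    return sum_lowered(count, arg_buffer)


print(call_site_lowered(3, 10, 20, 30) == sum_variadic(3, 10, 20, 30))
```

With this shape, the backend only ever sees fixed-arity calls and plain
memory accesses.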

>
>  * Instruction-level lowering:
>     * ExpandStructRegs splits up struct values into scalars, removing the
> "insertvalue" and "extractvalue" instructions.
>
There are already passes that do this outside of function arguments and
return values. Why is a new one needed? How do you handle the
overflow-detecting operations?


>     * PromoteIntegers legalizes integer types (e.g. i30 is converted to
> i32).
>
Does it split up too-wide integers? Do we really want another integer
legalization framework in LLVM? I am actually interested in doing (partial)
legalization in the IR during lowering (codegenprep time) in order to
simplify the backend, but I don't think we should develop such a framework
independently of the legalization currently used in the backends.
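
The two kinds of integer legalization mentioned here, promoting an odd width
and splitting a too-wide one, can be sketched as follows.  This is a generic
Python model under the assumption of a 32-bit target; it makes no claim about
what PromoteIntegers actually implements, and the function names are made up
for the example.

```python
MASK30 = (1 << 30) - 1
MASK32 = (1 << 32) - 1


def add_i30_promoted(a, b):
    """Promotion: an i30 add carried out at 32 bits, then truncated."""
    wide = (a + b) & MASK32  # the add a 32-bit machine would perform
    return wide & MASK30     # clear the two bits that i30 does not have


def add_i64_split(a_lo, a_hi, b_lo, b_hi):
    """Splitting: a 64-bit add expressed as two 32-bit adds plus a carry."""
    lo = (a_lo + b_lo) & MASK32
    carry = 1 if lo < a_lo else 0  # unsigned wraparound means a carry out
    hi = (a_hi + b_hi + carry) & MASK32
    return lo, hi


x, y = 0x1FFFFFFFF, 0xFFFFFFFF00000001
lo, hi = add_i64_split(x & MASK32, x >> 32, y & MASK32, y >> 32)
print(hex((hi << 32) | lo))
```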

>
>  * Module-level lowering:  This implements, at the IR level, functionality
> that is traditionally provided by "ld".  e.g. ExpandCtors lowers
> llvm.global_ctors to the __init_array_start and __init_array_end symbols
> that are used by C libraries at startup.
>
This doesn't make any sense to me. The IR representation is strictly
simpler. It is trivially lowered in a backend. I don't understand what this
would benefit.

> There seems to be plenty of precedent for IR-to-IR lowering passes -- LLVM
> already contains passes such as LowerInvoke, LowerSwitch and LowerAtomic.
>
Note that these are quite different -- they lower from a front-end
convenient form toward the canonical IR form. You are talking about
something totally different that deals with target-oriented lowering. The
correct place to look for analogies is CodeGenPrep.

> The PNaCl team (which I'm a member of) is happy to take on the work of
> maintaining this code, such as updating it as LLVM IR evolves and doing
> code reviews.  We would upstream this gradually, pass by pass, so the
> changes would be manageable.
>
While this is appreciated, the PNaCl team should work to much more actively
contribute to the core of LLVM if it wants to be trusted to maintain this
code.



All of that said, while I have a lot of concerns, I do want to clarify
something: I actually think that this is the correct fundamental direction
for LLVM. I *want* to see PNaCl and Emscripten both be significantly more
involved in the community, and I think that using lowering to simplify
backends is a Very Good Thing. However, I think that unless there is a
significant consensus amongst the active LLVM developers that they are OK
accepting and maintaining these patches (currently, I'm not), I think that
the community engagement needs to happen first.

-Chandler

Chandler Carruth

2014-Mar-04 23:18 UTC

On Tue, Mar 4, 2014 at 3:11 PM, Sean Silva <chisophugis at gmail.com> wrote:
> On Tue, Mar 4, 2014 at 4:04 PM, Mark Seaborn <mseaborn at chromium.org> wrote:
>>  1) Simplifying the task of writing a new LLVM backend.  This is
>> Emscripten's use case.  The IR simplification passes reduce the number of
>> cases a backend has to handle, so they would be useful for anyone else
>> creating a new backend.
>>
>
> FWIW, this sounds to me like a sufficiently compelling use case to support
> getting this in-tree.
>
Just in case it gets lost in my longer reply, I want to emphasize that if
these will be used to simplify the in-tree backends and those backend
maintainers are on board, then I am *totally* in favor of this going into
the tree. My concerns are heavily based on the fact that as proposed, none
of that seems likely to happen.

Filip Pizlo

2014-Mar-04 23:35 UTC

I like this and would love to see it in the tree. I think it's broadly
useful to projects that want to take IR as input and then do interesting
things with it.

-Fil

Sean Silva

2014-Mar-05 00:12 UTC

On Tue, Mar 4, 2014 at 6:17 PM, Chandler Carruth <chandlerc at google.com> wrote:
>
> While this does seem like a useful thing for your two projects, it isn't
> clear why this benefits the LLVM community. Perhaps it does, but I'd like
> to see that clarified.
>
I think Alon's point about easing the task for students/people learning (or
playing with) LLVM is pretty strong. People playing around with LLVM today
are tomorrow's contributors. If we can get them to that feeling of "win"
faster, they are more likely to stick with the project.

> If these simplify writing a backend, why wouldn't the patches include
> commensurate simplifications to LLVM's backends? That would both give them
> an in-tree customer, and more immediate value to the community and project
> as a whole.
>
I'd also like to add:
If these simplify writing a backend, should there be commensurate changes
to any relevant documentation for getting started writing backends? (we
don't have much such documentation though...)

(such documentation could also be construed as an in-tree customer if
indeed this would simplify it).

>>  * Module-level lowering:  This implements, at the IR level,
>> functionality that is traditionally provided by "ld".  e.g. ExpandCtors
>> lowers llvm.global_ctors to the __init_array_start and __init_array_end
>> symbols that are used by C libraries at startup.
>>
>
> This doesn't make any sense to me. The IR representation is strictly
> simpler. It is trivially lowered in a backend. I don't understand what
> this would benefit.
>
It might be simpler to do in the backend, but I think that the point is
that it is a recurring cost in every backend; in particular for backends
written by people starting out/playing around with LLVM (i.e. potential
future contributors), where any potential performance loss is acceptable
for the sake of simplifying things.

>
>
>> There seems to be plenty of precedent for IR-to-IR lowering passes --
>> LLVM already contains passes such as LowerInvoke, LowerSwitch and
>> LowerAtomic.
>>
>
> Note that these are quite different -- they lower from a front-end
> convenient form toward the canonical IR form. You are talking about
> something totally different that deals with target-oriented lowering. The
> correct place to look for analogies is CodeGenPrep.
>
>
>> The PNaCl team (which I'm a member of) is happy to take on the work of
>> maintaining this code, such as updating it as LLVM IR evolves and doing
>> code reviews.  We would upstream this gradually, pass by pass, so the
>> changes would be manageable.
>>
>
> While this is appreciated, the PNaCl team should work to much more
> actively contribute to the core of LLVM if it wants to be trusted to
> maintain this code.
>
Is eliben still on the PNaCl team? (e.g.
<http://lists.cs.uiuc.edu/pipermail/llvmdev/2013-June/063010.html>)

I'd also like to point out that IR-level passes are pretty much LLVM's
strongest point of decoupling and modularization, so of all code changes to
have no in-tree users (if indeed there are none), this is probably a
best-case scenario from a maintainability perspective (especially if it
becomes the point of collaboration for Emscripten and PNaCl).

-- Sean Silva

>
>
>
> All of that said, while I have a lot of concerns, I do want to clarify
> something: I actually think that this is the correct fundamental direction
> for LLVM. I *want* to see PNaCl and Emscripten both be significantly more
> involved in the community, and I think that using lowering to simplify
> backends is a Very Good Thing. However, I think that unless there is a
> significant consensus amongst the active LLVM developers that they are OK
> accepting and maintaining these patches (currently, I'm not), I think that
> the community engagement needs to happen first.
>
> -Chandler
>

Chris Lattner

2014-Mar-05 01:15 UTC

[LLVMdev] Upstreaming PNaCl's IR simplification passes

On Mar 4, 2014, at 3:17 PM, Chandler Carruth <chandlerc at google.com> wrote:
> On Tue, Mar 4, 2014 at 1:04 PM, Mark Seaborn <mseaborn at chromium.org> wrote:
>> The PNaCl project has implemented various IR simplification passes that
>> simplify LLVM IR by lowering complex features to simpler features.  We'd
>> like to upstream some of these IR passes to LLVM.  We'd like to explore if
>> this is acceptable, and if so, how we should go about doing this.
>
> My question is somewhat different. I'm not questioning whether these are
> acceptable, I'm questioning why these are interesting and important for
> the LLVM project.

I share Chandler's concern.  If these aren't actively used by something
in tree, they will bit rot.  The way to counter the bit rot would be to add
extensive testcases... but that would just add an even larger burden on core
LLVM developers to keep them up to date.

We have seen similar "obviously useful" pieces of infrastructure fall
to the same fate (e.g., the C backend, which incidentally had very similar
utilities back when it was alive).  Why would this be any different?

-Chris


Alon Zakai

2014-Mar-05 04:45 UTC

[LLVMdev] Upstreaming PNaCl's IR simplification passes

Sorry to reply to myself, I thought of something else I should have
mentioned before. Emscripten hopes to eventually upstream its JavaScript
backend, if there is interest. It's a work in progress and far from ready
for that right now, and there are probably lots of issues to figure out
regarding that (again, far too early to get into detail), but one thing
will be the dependence of the backend on the PNaCl IR simplification passes
- I guess if they are not upstream at that point, we'd have to figure
things out then.

- Alon



On Tue, Mar 4, 2014 at 2:25 PM, Alon Zakai <alonzakai at gmail.com> wrote:
> To add to what Mark mentioned about Emscripten's new backend [1] using the
> PNaCl passes: It made writing the backend much easier than it otherwise
> would have been, given our requirements - we are an 'odd' target in that we
> want to transform LLVM IR into JavaScript, then run it through our existing
> external JavaScript optimizer tool, which does very JavaScript-specific
> optimizations (on a JavaScript AST which is the natural form for us), and
> for that reason we don't use the common backend codegen path. Basically the
> PNaCl simplification passes convert LLVM IR into a smaller and simpler
> subset of LLVM IR, which makes writing a backend that processes LLVM IR
> more convenient.
>
> I think there are other use cases as well that could benefit from these
> passes being upstream. While typically a backend would want to use the
> common codegen to get register allocation and so forth, there are
> situations where you just want to transform LLVM IR into something else.
> For example in a university course you could teach people compiler
> optimizations using LLVM IR, then have them write a tiny backend that
> compiles that IR into a familiar language (Python, Java, anything else that
> they already know) to execute it (lli also works of course, but this might
> feel more "concrete" for the students, and they would learn more I
> suspect). Writing that backend in a way that processes LLVM IR means you
> only need them to understand LLVM IR and not anything about the selection
> DAG etc. Also, there are situations where performance is really not a
> concern, like someone writing a backend for a little VM they invented for
> fun and just want to execute small amounts of C code on it - for example
> this happened with the DCPU-16 spec, and people made an LLVM backend for it.
>
> In summary, I think the shared thing in these examples is that LLVM IR is
> very nice to work with, and there are some situations where you're using it
> and you have a reason to convert it into something else, and you want to do
> that in as _simple_ a way as possible as opposed to generating the most
> _optimal_ results. The PNaCl IR simplification passes are in my opinion a
> big help there.
>
> - Alon
>
> [1] https://github.com/kripken/emscripten/wiki/LLVM-Backend

Mark Seaborn

2014-Mar-07 02:09 UTC

[LLVMdev] Upstreaming PNaCl's IR simplification passes

There are a lot of questions in your post, so I'll focus on the technical
questions about specific IR passes in this first reply...

On 4 March 2014 15:17, Chandler Carruth <chandlerc at google.com> wrote:
> On Tue, Mar 4, 2014 at 1:04 PM, Mark Seaborn <mseaborn at chromium.org> wrote:
>
>> Some background:  There are two related use cases for these IR
>> simplification passes:
>>
>
>>  1) Simplifying the task of writing a new LLVM backend.  This is
>> Emscripten's use case.  The IR simplification passes reduce the number of
>> cases a backend has to handle, so they would be useful for anyone else
>> creating a new backend.
>>
>
> If these simplify writing a backend, why wouldn't the patches include
> commensurate simplifications to LLVM's backends? That would both give them
> an in-tree customer, and more immediate value to the community and project
> as a whole.
>
That's a good question.  I'll have to have a look around in the LLVM
backend code and see what parts could be replaced by one of PNaCl's
simplification passes.

One answer is that, in some cases, such as calling conventions and global
constructor arrays, LLVM's backend is constrained to follow the ABIs for
particular OSes and architectures.  Compatibility makes complexity harder
to remove.  I'll elaborate more below.  This only applies to a few of
PNaCl's IR passes though.

>
>
>>  2) Using a subset of LLVM IR as a stable distribution format for
>> portable executables.  This is PNaCl's use case.  PNaCl's IR subset omits
>> various complex IR features, which we lower using the IR simplification
>> passes [2].  Renderscript is an example of another project that uses IR as
>> a stable distribution format, though I think currently Renderscript is not
>>
>
> Given that the bitcode is stable, I don't understand why this is important.
>
Is the bitcode format stable now?  I heard talk that LLVM is trying to do
this now, but I don't remember seeing an llvmdev thread stating that for
sure.  Was there a thread about it that I missed?  I just remember hearing
complaints last year that the format was still getting changed. :-)

>>  * Calling conventions lowering:  ExpandVarArgs and ExpandByVal lower
>> varargs and by-value argument passing respectively.  They would be useful
>> for any backend that doesn't want to implement varargs or by-value calling
>> conventions.
>>
>
> Why wouldn't these be applicable to existing backends? What is hard about
> the existing representations?
>
For the calling conventions lowering passes, you wouldn't want to use them
in backends that have to match some existing architecture-specific ABI for
calling conventions.  For example, if you use ExpandVarArgs on x86, your .o
file won't be able to successfully call the printf() function provided by
libc.so, because the varargs calling conventions won't match.

But for many targets that is not an issue, either because:

 * there is no existing architecture-specific ABI that LLVM must match, or
 * you're using static linking, or can make similar "closed world"
assumptions, so that a module can use any calling conventions as long as
they're used consistently within the module.

Both of these are true for PNaCl and Emscripten.

My suspicion is that one or both of these conditions will be true for other
novel backends, such as for specialised architectures like GPUs.

Aside from PNaCl and Emscripten, I am less familiar with other novel
backends.  So one of the things I had hoped to learn from this discussion
was whether other backends would find these passes useful.  So far we've
had some people say that yes, they would.
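
To make the ABI point concrete, here is a minimal Python sketch of the general idea behind a varargs-lowering pass, assuming the lowering has the caller pack the variable arguments into one buffer and pass a single pointer to it. This is only an illustration, not PNaCl's actual ExpandVarArgs code: the names pack_varargs and lowered_sum, and the 4-byte little-endian slot layout, are hypothetical. The buffer layout is private to the module, which is why a natively compiled printf() could not consume it.

```python
import struct

# Assumed layout: every variadic argument occupies one 4-byte little-endian int slot.
SLOT = "<i"
SLOT_SIZE = struct.calcsize(SLOT)


def pack_varargs(*args):
    """Caller side: store the variadic arguments into one contiguous buffer."""
    return b"".join(struct.pack(SLOT, a) for a in args)


def lowered_sum(count, va_buffer):
    """Callee side: a fixed-arity stand-in for a lowered `int sum(int count, ...)`.

    Each va_arg becomes a load from the buffer followed by an offset bump.
    """
    total = 0
    offset = 0
    for _ in range(count):
        (value,) = struct.unpack_from(SLOT, va_buffer, offset)
        total += value
        offset += SLOT_SIZE
    return total


result = lowered_sum(3, pack_varargs(10, 20, 12))
```

Because caller and callee only need to agree with each other, any layout works under a "closed world" assumption; it stops working as soon as one side is compiled against the platform's native varargs convention.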

>>  * Instruction-level lowering:
>>     * ExpandStructRegs splits up struct values into scalars, removing the
>> "insertvalue" and "extractvalue" instructions.
>>
>
> There are already passes that do this outside of function arguments and
> return values. Why is a new one needed?
>
Are you referring to the work that SelectionDAGBuilder.cpp does to convert
insertvalue/extractvalue to a SelectionDAG?  I don't think there's an
IR-to-IR pass in LLVM for doing this, is there?

The reason PNaCl needs an IR-to-IR pass is that PNaCl's stable IR omits
insertvalue/extractvalue, in order to keep the format simple and reduce the
set of constructs that a PNaCl translator implementation needs to handle.
The reason Emscripten's fastcomp uses ExpandStructRegs is to keep
Emscripten's backend simple, in the context that it doesn't use
lib/CodeGen.

And the reason we have to handle insertvalue/extractvalue at all is largely
that Clang outputs them for uses of C++ method pointers.  Otherwise,
structs-as-registers aren't really used.  At least, that was the case in
3.3 -- maybe some more uses have appeared since then.

> How do you handle the overflow-detecting operations?
>
PNaCl has the ExpandArithWithOverflow pass, which lowers uses of
llvm.*.with.overflow.*.
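
As a rough illustration of what lowering an overflow-detecting operation involves, the following Python sketch computes the (result, overflow) pair using only a plain wrapping add and comparisons or bit operations. It models 32-bit values as integers in the range 0 to 2^32 - 1. The function names are hypothetical, and this is not necessarily the expansion that ExpandArithWithOverflow emits.

```python
MASK32 = 0xFFFFFFFF
SIGN32 = 0x80000000


def uadd_with_overflow_i32(a, b):
    """Unsigned 32-bit add: returns (wrapped result, overflow flag)."""
    result = (a + b) & MASK32
    # An unsigned add wrapped around exactly when the result is below an operand.
    overflow = result < a
    return result, overflow


def sadd_with_overflow_i32(a, b):
    """Signed 32-bit add on two's-complement bit patterns."""
    result = (a + b) & MASK32
    # Overflow happens when both operands share a sign that the result lacks.
    overflow = ((a ^ result) & (b ^ result) & SIGN32) != 0
    return result, overflow
```

After such a rewrite, a consumer of the IR only has to handle ordinary add, compare and bitwise instructions, and the intrinsic call and its struct return value disappear.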

>
>>     * PromoteIntegers legalizes integer types (e.g. i30 is converted to
>> i32).
>>
>
> Does it split up too-wide integers?
>
PNaCl's version currently doesn't.  Emscripten's fastcomp has a version
which splits up 64-bit integer operations into 32-bit operations, which
they need because JavaScript doesn't support 64-bit integer arithmetic.

PNaCl's version didn't need to do that because we were happy to support
64-bit arithmetic in PNaCl's stable ABI.  However, we did find that unusual
C bitfields caused Clang to generate integer types larger than 64-bit
(which we don't support in PNaCl's stable ABI), so we started implementing
a pass to split those up.  We should probably sync up with Emscripten and
reuse their code for that.
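
The two kinds of integer legalization discussed here, promoting an odd-sized type and splitting a too-wide one, can be sketched in Python as follows. This is a simplified model with hypothetical function names. It assumes a promoted i30 value is kept zero-extended by masking after the operation, which is one possible convention and may not match how PromoteIntegers treats the upper bits.

```python
MASK32 = 0xFFFFFFFF
MASK30 = (1 << 30) - 1


def add_i30_promoted(a, b):
    """An i30 add carried out as an i32 add, then masked back to 30 bits."""
    wide = (a + b) & MASK32   # the i32 add that a backend would actually see
    return wide & MASK30      # restore the i30 value range


def split64(x):
    """Split a 64-bit value into (low, high) 32-bit halves."""
    return x & MASK32, (x >> 32) & MASK32


def join64(lo, hi):
    """Recombine 32-bit halves into a 64-bit value."""
    return (hi << 32) | lo


def add_i64_split(a_lo, a_hi, b_lo, b_hi):
    """A 64-bit add expressed only with 32-bit adds and a carry."""
    lo = (a_lo + b_lo) & MASK32
    carry = 1 if lo < a_lo else 0      # the low half wrapped around
    hi = (a_hi + b_hi + carry) & MASK32
    return lo, hi
```

For example, join64(*add_i64_split(*split64(x), *split64(y))) gives the same value as (x + y) modulo 2^64, using nothing wider than 32 bits along the way.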

> Do we really want another integer legalization framework in LLVM?
>
At the risk of not answering your question directly, LLVM already has two
instruction selectors, SelectionDAG and FastISel.  So another question
might be, when is it OK to have multiple implementations that perform
similar tasks using different approaches, and when is it not OK?  What are
the trade-offs involved here?

> I am actually interested in doing (partial) legalization in the IR during
> lowering (codegenprep time) in order to simplify the backend, but I don't
> think we should develop such a framework independently of the legalization
> currently used in the backends.
>
>
>>
>>  * Module-level lowering:  This implements, at the IR level,
>> functionality that is traditionally provided by "ld".  e.g. ExpandCtors
>> lowers llvm.global_ctors to the __init_array_start and __init_array_end
>> symbols that are used by C libraries at startup.
>>
>
> This doesn't make any sense to me. The IR representation is strictly
> simpler. It is trivially lowered in a backend. I don't understand what this
> would benefit.
>
To elaborate:  In PNaCl, pexes are statically linked modules in which
running global constructors is handled by user code inside the pexe.  The
special llvm.global_ctors array isn't part of PNaCl's stable subset of IR,
because there's no need for it to be.  Running constructors is done in
normal IR by the pexe's entry point, without constructors needing to be
handled specially by PNaCl's IR format.

LLVM's global_ctors construct is incomplete:  it provides a mechanism, at
the IR level, to declare functions to be run at startup, but it assumes
that running these functions will be done by a runtime library.  At the IR
level, LLVM doesn't provide a way to implement a runtime library that can
read that constructor list.  ld linker scripts provide a way to do that --
e.g. on Linux, see /usr/lib/ldscripts/elf_i386.x, which defines
__init_array_{start,end} -- but that's not at the IR level.

ExpandCtors just provides a mechanism for a runtime library to list the
constructor functions, purely at the IR level, without constructors having
to be a special feature in the PNaCl ABI or in the Emscripten backend.
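
A small Python model of that arrangement, for illustration only: expand_ctors stands in for the pass, turning a list of (priority, function) entries into a flat array ordered by ascending priority, and run_init_array stands in for the runtime library code that walks from __init_array_start to __init_array_end. Both function names and the sample constructors are hypothetical.

```python
def expand_ctors(global_ctors):
    """Lower (priority, function) entries to a flat array in ascending priority order."""
    # sorted() is stable, so entries with equal priority keep their original order.
    ordered = sorted(global_ctors, key=lambda entry: entry[0])
    return [func for _, func in ordered]


def run_init_array(init_array):
    """Startup code in a runtime library: call every entry from start to end."""
    for ctor in init_array:
        ctor()


log = []
global_ctors = [
    (65535, lambda: log.append("default")),
    (101, lambda: log.append("early")),
    (65535, lambda: log.append("default2")),
]
run_init_array(expand_ctors(global_ctors))
```

The point of the sketch is that once the list is an ordinary array with known bounds, running constructors is ordinary code, and the consumer of the IR needs no special knowledge of constructors.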

>> There seems to be plenty of precedent for IR-to-IR lowering passes --
>> LLVM already contains passes such as LowerInvoke, LowerSwitch and
>> LowerAtomic.
>>
>
> Note that these are quite different -- they lower from a front-end
> convenient form toward the canonical IR form.
>
Those three passes don't lower towards canonical IR form -- unless we are
taking "canonical IR form" to mean quite different things?

LowerInvoke and LowerAtomic both strip out information irreversibly.

LowerAtomic "lowers atomic intrinsics to non-atomic form for use in a known
non-preemptible environment".  LowerInvoke strips out exception handling by
converting invokes to calls, so that landingpads, resumes, etc. become dead
and can be removed by a later pass.

(As an aside, LowerInvoke has an option for using SJLJ exception handling,
but that option appears to be unused and replaced
by lib/CodeGen/SjLjEHPrepare.cpp.)

LowerSwitch "rewrites switch instructions with a sequence of branches,
which allows targets to get away with not implementing the switch
instruction until it is convenient".
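
As a sketch of what "a sequence of branches" means, the Python below lowers a switch, given as (constant, target) pairs plus a default label, into a linear chain of compare-and-branch steps and then evaluates that chain. The step format and function names are hypothetical, and the real LowerSwitch pass may arrange its comparisons differently from this simple linear chain.

```python
def lower_switch(cases, default):
    """Rewrite a switch as compare-and-branch steps ending in an unconditional branch.

    cases: list of (constant, target_label) pairs.
    default: label taken when no constant matches.
    """
    chain = [("br_if_eq", const, target) for const, target in cases]
    chain.append(("br", default))
    return chain


def run_chain(chain, value):
    """Evaluate the lowered chain and return the label that control reaches."""
    for step in chain:
        if step[0] == "br_if_eq":
            _, const, target = step
            if value == const:
                return target
        else:
            return step[1]


chain = lower_switch([(0, "zero"), (1, "one")], "other")
```

A backend that consumes the lowered form only has to support conditional and unconditional branches, at the cost of a less compact representation of the original switch.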

These three are very similar in function to PNaCl's IR simplification
passes, since they reduce the set of language features that must be
supported by a backend or by a stable IR format.

> You are talking about something totally different that deals with
> target-oriented lowering. The correct place to look for analogies is
> CodeGenPrep.
>
CodeGenPrepare.cpp just contains optimisations, doesn't it?  It doesn't
lower any language features such that the feature is removed from the
module, so it doesn't seem to be analogous to PNaCl's IR simplification
passes, which do do that.  e.g. LowerAtomic strips out atomicrmw entirely
so that anything processing LowerAtomic's output doesn't have to handle
atomicrmw at all.  Similarly, ExpandByVal expands out "byval" entirely.

If you're looking for backend IR-to-IR passes which lower language
features, DwarfEHPrepare and SjLjEHPrepare are analogous to PNaCl's passes.
DwarfEHPrepare only lowers resume instructions, while SjLjEHPrepare
handles more.

Cheers,
Mark