thr3ads.net - llvm dev - [LLVMdev] Upstreaming PNaCl's IR simplification passes [Mar 2014]


Sean Silva

2014-Mar-05 00:12 UTC

[LLVMdev] Upstreaming PNaCl's IR simplification passes

On Tue, Mar 4, 2014 at 6:17 PM, Chandler Carruth <chandlerc at google.com> wrote:
> On Tue, Mar 4, 2014 at 1:04 PM, Mark Seaborn <mseaborn at chromium.org> wrote:
>
>> The PNaCl project has implemented various IR simplification passes that
>> simplify LLVM IR by lowering complex features to simpler features.  We'd
>> like to upstream some of these IR passes to LLVM.  We'd like to explore if
>> this is acceptable, and if so, how we should go about doing this.
>>
>
> My question is somewhat different. I'm not questioning whether these are
> acceptable, I'm questioning why these are interesting and important for the
> LLVM project.
>
> Neither PNaCl nor Emscripten open source projects have extensive developer
> overlap with the LLVM community, and the developers have not (so far)
> become super active maintainers of LLVM, although your recent patches to
> fix some bugs uncovered by PNaCl have been much appreciated. These lowering
> passes are likely to have few (most likely, zero) in-tree users for the
> foreseeable future. I'm not enthusiastic about the community taking on the
> maintenance, update, and code review burden of these.
>
> I would point you at the several emails I have written to folks adding new
> significant features to LLVM about how to offset this by contributing
> maintenance and improvements to the core infrastructure, fixing bugs and
> generally making things better sufficient to offset the ongoing complexity
> cost of the new features. Fortunately, the PNaCl passes seem somewhat less
> complex than (for instance) the x32 backend, but they seem likely to still
> add a reasonable amount of complexity. They will certainly be challenging
> to review and get the design into an acceptable state across the community.
> At this point, I'm not really optimistic about there being a large enough
> body of community members excited about getting these passes in to offset
> these costs. I'm happy to be proven wrong of course, and would also be
> happy to see you, other PNaCl developers, or Emscripten developers become
> more active in the community in order to build this trust and establish a
> good basis for these to go into LLVM.
>
>
>>
>> The immediate reason is that Emscripten is reusing PNaCl's IR passes for
>> its new "fastcomp" backend [1].  It would be really useful if PNaCl and
>> Emscripten could collaborate via upstream LLVM rather than a branch.
>>
>
> While this does seem like a useful thing for your two projects, it isn't
> clear why this benefits the LLVM community. Perhaps it does, but I'd like
> to see that clarified.
>
I think Alon's point about easing the task for students/people learning (or
playing with) LLVM is pretty strong. People playing around with LLVM today
are tomorrow's contributors. If we can get them to that feeling of "win"
faster, they are more likely to stick with the project.

>
>
>> Some background:  There are two related use cases for these IR
>> simplification passes:
>>
>>  1) Simplifying the task of writing a new LLVM backend.  This is
>> Emscripten's use case.  The IR simplification passes reduce the number of
>> cases a backend has to handle, so they would be useful for anyone else
>> creating a new backend.
>>
>
> If these simplify writing a backend, why wouldn't the patches include
> commensurate simplifications to LLVM's backends? That would both give them
> an in-tree customer, and more immediate value to the community and project
> as a whole.
>
I'd also like to add:
If these simplify writing a backend, should there be commensurate changes
to any relevant documentation for getting started writing backends? (we
don't have much such documentation though...)

(such documentation could also be construed as an in-tree customer if
indeed this would simplify it).

>
>
>>
>>  2) Using a subset of LLVM IR as a stable distribution format for
>> portable executables.  This is PNaCl's use case.  PNaCl's IR subset omits
>> various complex IR features, which we lower using the IR simplification
>> passes [2].  Renderscript is an example of another project that uses IR as
>> a stable distribution format, though I think currently Renderscript is not
>> subsetting IR much.
>>
>
> Given that the bitcode is stable, I don't understand why this is
> important. What technical problems are you solving other than making the IR
> match some predetermined form chosen by PNaCl?
>
>
>>
>> Some examples of PNaCl's IR simplification passes are:
>>
>
> I have a bunch of questions about the specific passes you mention. Perhaps
> these questions are better answered in the review thread for the patches,
> but they are at least things that I would think about and try to address if
> and when you send out the code review.
>
>
>>
>>  * Calling conventions lowering:  ExpandVarArgs and ExpandByVal lower
>> varargs and by-value argument passing respectively.  They would be useful
>> for any backend that doesn't want to implement varargs or by-value calling
>> conventions.
>>
>
> Why wouldn't these be applicable to existing backends? What is hard about
> the existing representations?
>
>
>>
>>  * Instruction-level lowering:
>>     * ExpandStructRegs splits up struct values into scalars, removing the
>> "insertvalue" and "extractvalue" instructions.
>>
>
> There are already passes that do this outside of function arguments and
> return values. Why is a new one needed? How do you handle the
> overflow-detecting operations?
>
>
>
>>     * PromoteIntegers legalizes integer types (e.g. i30 is converted to
>> i32).
>>
>
> Does it split up too-wide integers? Do we really want another integer
> legalization framework in LLVM? I am actually interested in doing (partial)
> legalization in the IR during lowering (codegenprep time) in order to
> simplify the backend, but I don't think we should develop such a framework
> independently of the legalization currently used in the backends.
>
>
>>
>>  * Module-level lowering:  This implements, at the IR level,
>> functionality that is traditionally provided by "ld".  e.g. ExpandCtors
>> lowers llvm.global_ctors to the __init_array_start and __init_array_end
>> symbols that are used by C libraries at startup.
>>
>
> This doesn't make any sense to me. The IR representation is strictly
> simpler. It is trivially lowered in a backend. I don't understand what this
> would benefit.
>
It might be simpler to do in the backend, but I think that the point is
that it is a recurring cost in every backend; in particular for backends
written by people starting out/playing around with LLVM (i.e. potential
future contributors), where any potential performance loss is acceptable
for the sake of simplifying things.

>
>
>> There seems to be plenty of precedent for IR-to-IR lowering passes --
>> LLVM already contains passes such as LowerInvoke, LowerSwitch and
>> LowerAtomic.
>>
>
> Note that these are quite different -- they lower from a front-end
> convenient form toward the canonical IR form. You are talking about
> something totally different that deals with target-oriented lowering. The
> correct place to look for analogies is CodeGenPrep.
>
>
>> The PNaCl team (which I'm a member of) is happy to take on the work of
>> maintaining this code, such as updating it as LLVM IR evolves and doing
>> code reviews.  We would upstream this gradually, pass by pass, so the
>> changes would be manageable.
>>
>
> While this is appreciated, the PNaCl team should work to much more
> actively contribute to the core of LLVM if it wants to be trusted to
> maintain this code.
>
Is eliben still on the PNaCl team? (e.g.
<http://lists.cs.uiuc.edu/pipermail/llvmdev/2013-June/063010.html>)

I'd also like to point out that IR-level passes are pretty much LLVM's
strongest point of decoupling and modularization, so of all code changes to
have no in-tree users (if indeed there are none), this is probably a
best-case scenario from a maintainability perspective (especially if it
becomes the point of collaboration for Emscripten and PNaCl).

-- Sean Silva

>
>
>
> All of that said, while I have a lot of concerns, I do want to clarify
> something: I actually think that this is the correct fundamental direction
> for LLVM. I *want* to see PNaCl and Emscripten both be significantly more
> involved in the community, and I think that using lowering to simplify
> backends is a Very Good Thing. However, I think that unless there is a
> significant consensus amongst the active LLVM developers that they are OK
> accepting and maintaining these patches (currently, I'm not), I think that
> the community engagement needs to happen first.
>
> -Chandler
>

Chandler Carruth

2014-Mar-05 00:27 UTC


[LLVMdev] Upstreaming PNaCl's IR simplification passes

On Tue, Mar 4, 2014 at 4:12 PM, Sean Silva <silvas at purdue.edu> wrote:
>
>
>
> On Tue, Mar 4, 2014 at 6:17 PM, Chandler Carruth <chandlerc at google.com> wrote:
>
>> On Tue, Mar 4, 2014 at 1:04 PM, Mark Seaborn <mseaborn at chromium.org> wrote:
>>
>>> The PNaCl project has implemented various IR simplification passes that
>>> simplify LLVM IR by lowering complex features to simpler features.  We'd
>>> like to upstream some of these IR passes to LLVM.  We'd like to explore if
>>> this is acceptable, and if so, how we should go about doing this.
>>>
>>
>> My question is somewhat different. I'm not questioning whether these are
>> acceptable, I'm questioning why these are interesting and important for the
>> LLVM project.
>>
>> Neither PNaCl nor Emscripten open source projects have extensive
>> developer overlap with the LLVM community, and the developers have not (so
>> far) become super active maintainers of LLVM, although your recent patches
>> to fix some bugs uncovered by PNaCl have been much appreciated. These
>> lowering passes are likely to have few (most likely, zero) in-tree users
>> for the foreseeable future. I'm not enthusiastic about the community taking
>> on the maintenance, update, and code review burden of these.
>>
>> I would point you at the several emails I have written to folks adding
>> new significant features to LLVM about how to offset this by contributing
>> maintenance and improvements to the core infrastructure, fixing bugs and
>> generally making things better sufficient to offset the ongoing complexity
>> cost of the new features. Fortunately, the PNaCl passes seem somewhat less
>> complex than (for instance) the x32 backend, but they seem likely to still
>> add a reasonable amount of complexity. They will certainly be challenging
>> to review and get the design into an acceptable state across the community.
>> At this point, I'm not really optimistic about there being a large enough
>> body of community members excited about getting these passes in to offset
>> these costs. I'm happy to be proven wrong of course, and would also be
>> happy to see you, other PNaCl developers, or Emscripten developers become
>> more active in the community in order to build this trust and establish a
>> good basis for these to go into LLVM.
>>
>>
>>>
>>> The immediate reason is that Emscripten is reusing PNaCl's IR passes for
>>> its new "fastcomp" backend [1].  It would be really useful if PNaCl and
>>> Emscripten could collaborate via upstream LLVM rather than a branch.
>>>
>>
>> While this does seem like a useful thing for your two projects, it isn't
>> clear why this benefits the LLVM community. Perhaps it does, but I'd like
>> to see that clarified.
>>
>
> I think Alon's point about easing the task for students/people learning
> (or playing with) LLVM is pretty strong. People playing around with LLVM
> today are tomorrow's contributors. If we can get them to that feeling of
> "win" faster, they are more likely to stick with the project.
>
Sure, but I don't think this direction is a necessary step there, or even a
very significant one. I don't think any part of this is going to make it
easier to get up and rolling with LLVM for newcomers.


>
>
>>
>>
>>> Some background:  There are two related use cases for these IR
>>> simplification passes:
>>>
>>>  1) Simplifying the task of writing a new LLVM backend.  This is
>>> Emscripten's use case.  The IR simplification passes reduce the number of
>>> cases a backend has to handle, so they would be useful for anyone else
>>> creating a new backend.
>>>
>>
>> If these simplify writing a backend, why wouldn't the patches include
>> commensurate simplifications to LLVM's backends? That would both give them
>> an in-tree customer, and more immediate value to the community and project
>> as a whole.
>>
>
> I'd also like to add:
> If these simplify writing a backend, should there be commensurate changes
> to any relevant documentation for getting started writing backends? (we
> don't have much such documentation though...)
>
Very much so, yes.

>
> (such documentation could also be construed as an in-tree customer if
> indeed this would simplify it).
>
I won't go that far. It won't keep it well tested or correct.

>
>>
>>
>>>
>>>  2) Using a subset of LLVM IR as a stable distribution format for
>>> portable executables.  This is PNaCl's use case.  PNaCl's IR subset omits
>>> various complex IR features, which we lower using the IR simplification
>>> passes [2].  Renderscript is an example of another project that uses IR as
>>> a stable distribution format, though I think currently Renderscript is not
>>> subsetting IR much.
>>>
>>
>> Given that the bitcode is stable, I don't understand why this is
>> important. What technical problems are you solving other than making the IR
>> match some predetermined form chosen by PNaCl?
>>
>>
>>>
>>> Some examples of PNaCl's IR simplification passes are:
>>>
>>
>> I have a bunch of questions about the specific passes you mention.
>> Perhaps these questions are better answered in the review thread for the
>> patches, but they are at least things that I would think about and try to
>> address if and when you send out the code review.
>>
>>
>>>
>>>  * Calling conventions lowering:  ExpandVarArgs and ExpandByVal lower
>>> varargs and by-value argument passing respectively.  They would be useful
>>> for any backend that doesn't want to implement varargs or by-value calling
>>> conventions.
>>>
>>
>> Why wouldn't these be applicable to existing backends? What is hard about
>> the existing representations?
>>
>>
>>>
>>>  * Instruction-level lowering:
>>>     * ExpandStructRegs splits up struct values into scalars, removing
>>> the "insertvalue" and "extractvalue" instructions.
>>>
>>
>> There are already passes that do this outside of function arguments and
>> return values. Why is a new one needed? How do you handle the
>> overflow-detecting operations?
>>
>>
>>
>>>     * PromoteIntegers legalizes integer types (e.g. i30 is converted to
>>> i32).
>>>
>>
>> Does it split up too-wide integers? Do we really want another integer
>> legalization framework in LLVM? I am actually interested in doing (partial)
>> legalization in the IR during lowering (codegenprep time) in order to
>> simplify the backend, but I don't think we should develop such a framework
>> independently of the legalization currently used in the backends.
>>
>>
>>>
>>>  * Module-level lowering:  This implements, at the IR level,
>>> functionality that is traditionally provided by "ld".  e.g. ExpandCtors
>>> lowers llvm.global_ctors to the __init_array_start and __init_array_end
>>> symbols that are used by C libraries at startup.
>>>
>>
>> This doesn't make any sense to me. The IR representation is strictly
>> simpler. It is trivially lowered in a backend. I don't understand what this
>> would benefit.
>>
>
> It might be simpler to do in the backend, but I think that the point is
> that it is a recurring cost in every backend; in particular for backends
> written by people starting out/playing around with LLVM (i.e. potential
> future contributors), where any potential performance loss is acceptable
> for the sake of simplifying things.
>
I don't understand this at all.

We have a *target independent* backend. There is only one, so there should
be no recurring cost.

If people are writing a totally independent backend, then the cost of
handling this very trivial construct is ... completely unimportant compared
to the challenge of a new backend.


Also, I don't think this is about performance at all. Today, we have a
clear declarative construct that marks a special "on startup" thing with a
clear spec in the langref. With this patch we'll have an ad-hoc implicit
contract with an implementation detail of some systems' libc ABIs. I don't
see how the latter is easier on any level.
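
For readers who have not seen the two forms being contrasted here, a minimal
Python sketch may help. It models the declarative llvm.global_ctors list as
(priority, function) pairs and the lowered form as a flat array that a C
library would walk at startup. The names lower_global_ctors and
run_init_array are hypothetical and chosen for illustration only; this is not
the PNaCl ExpandCtors code, and equal-priority ordering (left unspecified by
the langref) is simply kept stable here.

```python
# Illustrative model only: function names are hypothetical, not LLVM/PNaCl API.

def lower_global_ctors(global_ctors):
    """Turn declarative (priority, function) entries into a flat array.

    Lower priority values run first. sorted() is stable, so entries with
    equal priority keep their original order (a choice of this sketch).
    """
    ordered = sorted(global_ctors, key=lambda entry: entry[0])
    return [fn for _priority, fn in ordered]


def run_init_array(init_array):
    """Model of libc startup: call every entry between the start and end
    markers (__init_array_start .. __init_array_end) in order."""
    for fn in init_array:
        fn()


log = []
global_ctors = [
    (65535, lambda: log.append("default")),
    (101, lambda: log.append("early")),
    (65535, lambda: log.append("default2")),
]
init_array = lower_global_ctors(global_ctors)
run_init_array(init_array)
```

In the declarative form the priorities are still visible to any later pass;
in the lowered form only the array order remains, which is the implicit
contract being objected to above.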

>
>
>>
>>
>>> There seems to be plenty of precedent for IR-to-IR lowering passes --
>>> LLVM already contains passes such as LowerInvoke, LowerSwitch and
>>> LowerAtomic.
>>>
>>
>> Note that these are quite different -- they lower from a front-end
>> convenient form toward the canonical IR form. You are talking about
>> something totally different that deals with target-oriented lowering. The
>> correct place to look for analogies is CodeGenPrep.
>>
>>
>>> The PNaCl team (which I'm a member of) is happy to take on the work of
>>> maintaining this code, such as updating it as LLVM IR evolves and doing
>>> code reviews.  We would upstream this gradually, pass by pass, so the
>>> changes would be manageable.
>>>
>>
>> While this is appreciated, the PNaCl team should work to much more
>> actively contribute to the core of LLVM if it wants to be trusted to
>> maintain this code.
>>
>
> Is eliben still on the PNaCl team? (e.g. <
> http://lists.cs.uiuc.edu/pipermail/llvmdev/2013-June/063010.html>)
>
Nope.

>
> I'd also like to point out that IR-level passes are pretty much LLVM's
> strongest point of decoupling and modularization, so of all code changes to
> have no in-tree users (if indeed there are none), this is probably a
> best-case scenario from a maintainability perspective (especially if it
> becomes the point of collaboration for Emscripten and PNaCl).
>
Yep, it's definitely a best case scenario. Note that I started off saying
that this was less complex than the proposed x32 changes. I think IR passes
are reasonably well factored for this.

However, it does still have a cost. Having fixed bugs in RegionInfo (prior
to the current excellent Polly bots) and deleted a large number of stale IR
passes that were not used, I can say that such passes cause confusion and
ongoing maintenance headaches. These aren't extreme; they are eminently
surmountable, even! But we do need to have something to overcome them, and
currently I'm not seeing it.

Eli Bendersky

2014-Mar-05 00:53 UTC


[LLVMdev] Upstreaming PNaCl's IR simplification passes

<snip>
>
>>> The PNaCl team (which I'm a member of) is happy to take on the work of
>>> maintaining this code, such as updating it as LLVM IR evolves and doing
>>> code reviews.  We would upstream this gradually, pass by pass, so the
>>> changes would be manageable.
>>>
>>
>> While this is appreciated, the PNaCl team should work to much more
>> actively contribute to the core of LLVM if it wants to be trusted to
>> maintain this code.
>>
>
> Is eliben still on the PNaCl team? (e.g. <
> http://lists.cs.uiuc.edu/pipermail/llvmdev/2013-June/063010.html>)
>
Technically, no. While I'm still collaborating with the PNaCl team on some
tasks, I'm not likely to be maintaining these passes as part of my day job
(beyond the usual upstream gardening I do from time to time).

That said, personally I think these passes are very useful in upstream
LLVM. For example, just recently I found the constant-expr elimination
extremely handy in a completely PNaCl-unrelated out-of-tree project I'm
working on. Having this code in upstream LLVM would be wonderful --
otherwise I just maintain my own copy. Not only do the simplification passes
make LLVM IR more palatable for non-traditional backends, they also strike
at the very important conundrum of whether LLVM IR is, or can be, target
independent. PNaCl is an interesting proof of concept that LLVM IR can
indeed, under some circumstances, be useful as a target-independent IR. I
think this opens up many interesting opportunities for LLVM.
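
As a rough illustration of what constant-expr elimination buys a simple
backend, the Python sketch below flattens a nested constant expression into
one instruction per operator, so that a consumer only ever sees plain
operands. The tuple encoding, the "%expanded" naming and the function
expand_constant_expr are assumptions made for this example; they do not
reproduce the PNaCl pass or any LLVM API.

```python
import itertools

# Illustrative model only: the expression encoding and names are assumptions.

def expand_constant_expr(expr, out, counter):
    """Flatten a nested constant expression into separate instructions.

    expr is either a leaf (a global name or an integer) or a tuple of the
    form (opcode, operand, ...). Each operator becomes one entry
    (result_name, opcode, operands) appended to `out`; the return value is
    the name or leaf that now stands for `expr`.
    """
    if not isinstance(expr, tuple):
        return expr  # plain constant or global: nothing to expand
    opcode, *operands = expr
    flat = [expand_constant_expr(op, out, counter) for op in operands]
    name = f"%expanded{next(counter)}"
    out.append((name, opcode, flat))
    return name


instructions = []
# Models an operand such as: add (ptrtoint @g), 4
operand = expand_constant_expr(
    ("add", ("ptrtoint", "@g"), 4), instructions, itertools.count(1)
)
```

After the call, `instructions` holds a ptrtoint followed by an add that uses
its result, and `operand` names the final value.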

As for the maintenance cost... the passes are really quite simple in
essence. Moreover, if two very significant projects rely on them (PNaCl is
officially released in Chrome, Emscripten is extremely popular too), it
seems unlikely to me that they will bit-rot.

The detailed technical concerns are very interesting, of course (for
example, where it's most proper to do integer legalization). These should
definitely be discussed in detail on a case-by-case basis, but I don't see
them as a strong reason not to add this to LLVM at all.
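
To make the integer legalization question concrete, here is a small Python
sketch of promoting an i30 value to i32, the example from the original
proposal. It assumes one common strategy: do arithmetic in the wider type,
then clear or sign-extend the upper bits only when an operation depends on
them. Whether PromoteIntegers works this way is not stated in this thread;
the helper names are hypothetical.

```python
# Illustrative model only: strategy and helper names are assumptions.

WIDTH = 30
MASK = (1 << WIDTH) - 1   # low 30 bits
U32 = 0xFFFFFFFF          # models a 32-bit register


def add_i30_via_i32(a, b):
    """i30 add carried out as a 32-bit add.

    The low 30 bits of the 32-bit result equal the 30-bit result, so the
    two upper bits may hold garbage until something needs them.
    """
    return (a + b) & U32


def zext_i30(x):
    """Clear the two upper bits before an unsigned use (e.g. udiv)."""
    return x & MASK


def sext_i30(x):
    """Sign-extend bit 29 before a signed use (e.g. sdiv)."""
    x &= MASK
    return x - (1 << WIDTH) if x & (1 << (WIDTH - 1)) else x


wrapped = zext_i30(add_i30_via_i32(MASK, 1))  # all-ones i30 plus 1 wraps
minus_one = sext_i30(MASK)                    # all-ones i30 read as signed
```

Splitting integers that are wider than the target supports is a separate
problem that this sketch does not address.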

Eli

Chandler Carruth

2014-Mar-05 01:14 UTC


[LLVMdev] Upstreaming PNaCl's IR simplification passes

On Tue, Mar 4, 2014 at 4:53 PM, Eli Bendersky <eliben at google.com> wrote:
> As for the maintenance cost... the passes are really quite simple in
> essence. Moreover, if two very significant projects rely on them (PNaCl is
> officially released in Chrome, Emscripten is extremely popular too), it
> seems unlikely to me that they will bit-rot.

I want to be clear, I'm not claiming they will bitrot. I'm claiming that
they are a technical burden that is being added to that of the community,
and I don't currently see the balancing contributions from the developers
on those projects to the core of LLVM.

Could the project tolerate the burden? I am not optimistic. For example, I
don't think that the community has the free bandwidth to give the technical
review to the patches that they need.

The thing is, there are simple (if not "easy" as it requires a lot of work)
ways to address this: PNaCl folks could become more active contributors to
the project, or they could make the changes have a significant positive
effect on the existing complexity of the system. Or even better, both! But
I've not yet seen real evidence of either.

Alp Toker

2014-Mar-05 12:32 UTC


[LLVMdev] Upstreaming PNaCl's IR simplification passes

On 05/03/2014 00:12, Sean Silva wrote:
> I'd also like to point out that IR-level passes are pretty much LLVM's
> strongest point of decoupling and modularization, so of all code 
> changes to have no in-tree users (if indeed there are none), this is 
> probably a best-case scenario from a maintainability perspective 
> (especially if it becomes the point of collaboration for Emscripten 
> and PNaCl).
Just to chime in with another use case, these passes would have been 
useful for lowering to MSIL in our C++/CLI compiler.

We initially experimented with LLVM as the backend for our C++/CLI 
compiler but hit upon problems just like these -- and without the skill 
set to solve them at the time, we ended up resorting to just blasting 
out bytecode from clang IRGen.

The IRGen kludge "works for us" but it's always been a regret of mine
that we missed out on a lot of what makes LLVM great at the last mile. 
We're kind of stuck with that decision today but +1 for facilities that 
help others avoid that fate.

I can see how such facilities may appear orthogonal to people working on 
"real" machine backends but the same could be said for JIT / MCJIT which
currently doesn't have in-tree users.

If a real in-tree user would help how about expediting the inclusion of 
PNaCl or Emscripten? I know developers on both teams and have confidence 
in their ability to keep with the programme.

Alp.

-- 
http://www.nuanti.com
the browser experts
