thr3ads.net - llvm dev - [llvm-dev] WebKit B3 (was LLVM Weekly

Andrew Trick via llvm-dev

2016-Feb-16 07:42 UTC

[llvm-dev] WebKit B3 (was LLVM Weekly - #110, Feb 8th 2016)

> On Feb 15, 2016, at 5:34 PM, Philip Reames <listmail at philipreames.com> wrote:
> 
> 
> 
> On 02/15/2016 04:57 PM, Andrew Trick wrote:
>> 
>>> On Feb 15, 2016, at 4:25 PM, Philip Reames <listmail at philipreames.com> wrote:
>>> 
>>> After reading https://webkit.org/blog/5852/introducing-the-b3-jit-compiler/, I jotted down a couple of thoughts of my own here: http://www.philipreames.com/Blog/2016/02/15/quick-thoughts-on-webkits-b3/
>> 
>> Thanks for sharing. I think it’s worth noting that what you are doing would be considered 5th tier for WebKit, since you already had a decent optimizing backend without LLVM.
> So, serious, but naive question: what are the other tiers for?  My mental model is generally:
> tier 0 - interpreter or splat compiler -- (to deal with run once code)

You combined two tiers in one, and I start at 1. So using my terminology inspired by WebKit:
tier 1: interpreter
tier 2: splat compiler

> tier 1 - a fast (above all else) but decent compiler which gets the obvious stuff -- (does most of the compilation by methods)

or tier 3: compiling methods into IR or bytecode, applying high-level optimization, splatting codegen

> tier 2 - a good, but fast, compiler which generates good quality code without burning too much time -- (specifically for the hotish stuff)

or tier 4: high level optimization using profile data from tier 3, nontrivial codegen

> tier 3 - "a compile time does not matter, get this hot method" compiler, decidedly optional -- (compiles only *really* hot stuff)

or tier 5: bolt a C compiler onto the JIT.

> 
> (Profiling is handled by tier 0, and tier 1, in the above.)

Profiling needs to be done by all tiers up to and including at least the first round of high-level optimization where the optimizer registers some assumptions about runtime types (tier 3 in my case).

I’m not saying it’s a good idea to have all those tiers, it’s just a way to compare JIT levels. The point is, you are a tier higher than B3.

- Andy
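
The five-tier numbering above, together with the rule that profiling continues up to and including tier 3, can be sketched as a small state machine. This is only an illustration of the terminology in this message: the class and field names are hypothetical, and the promotion thresholds are invented numbers, not anything WebKit uses.

```python
from dataclasses import dataclass, field
from enum import IntEnum


class Tier(IntEnum):
    INTERPRETER = 1      # tier 1
    SPLAT = 2            # tier 2: splat compiler
    HIGH_LEVEL_OPT = 3   # tier 3: IR + high-level optimization, splatting codegen
    PROFILE_GUIDED = 4   # tier 4: uses profile data from tier 3, nontrivial codegen
    C_COMPILER = 5       # tier 5: "bolt a C compiler onto the JIT"


# Hypothetical call counts at which a function moves up one tier.
PROMOTE_AT = {
    Tier.INTERPRETER: 10,
    Tier.SPLAT: 100,
    Tier.HIGH_LEVEL_OPT: 1_000,
    Tier.PROFILE_GUIDED: 10_000,
}

# Profiling runs in every tier up to and including the first optimizing tier.
LAST_PROFILING_TIER = Tier.HIGH_LEVEL_OPT


@dataclass
class FunctionState:
    tier: Tier = Tier.INTERPRETER
    calls: int = 0
    seen_types: set = field(default_factory=set)

    def call(self, arg):
        """Record one execution and promote the function if it is hot enough."""
        self.calls += 1
        if self.tier <= LAST_PROFILING_TIER:
            # Lower tiers record the runtime types that higher tiers speculate on.
            self.seen_types.add(type(arg).__name__)
        threshold = PROMOTE_AT.get(self.tier)  # None for the top tier
        if threshold is not None and self.calls >= threshold:
            self.tier = Tier(self.tier + 1)
```

With these made-up thresholds, a function called 1,000 times with integers ends up in tier 4, and a later call with a string no longer updates the type profile, which is why the assumptions have to be registered by tier 3.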
> 
> It really sounds to me like FTL is positioned somewhere between tier 1 and tier 2 in the above.  Is that about right?
>> You also have more room for background compilation threads and aren’t benchmarking on a MacBook Air.
> True! Both definitely matter.
>> 
>> Andy
>> 
>>> 
>>> Philip
>>> 
>>> On 02/15/2016 03:12 PM, Andrew Trick via llvm-dev wrote:
>>>> 
>>>>> On Feb 9, 2016, at 9:55 AM, Rafael Espíndola via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>>>>> >
>>>>> > JavaScriptCore's [FTL JIT](https://trac.webkit.org/wiki/FTLJIT) is moving away
>>>>> > from using LLVM as its backend, towards [B3 (Bare Bones
>>>>> > Backend)](https://webkit.org/docs/b3/). This includes its own [SSA
>>>>> > IR](https://webkit.org/docs/b3/intermediate-representation.html),
>>>>> > optimisations, and instruction selection backend.
>>>>> 
>>>>> In the end, what was the main motivation for creating a new IR?
>>>>> 
>>>> 
>>>> I can't speak to the motivation of the WebKit team. Those are outlined in https://webkit.org/blog/5852/introducing-the-b3-jit-compiler/.
>>>> I'll give you my personal perspective on using LLVM for JITs, which may be interesting to the LLVM community.
>>>> 
>>>> Most of the payoff for high level languages comes from the language-specific optimizer. It was simpler for JavaScriptCore to perform loop optimization at that level, so it doesn't even make use of LLVM's most powerful optimizations, particularly SCEV based optimization. There is a relatively small, finite amount of low-level optimization that is going to be important for JavaScript benchmarks (most of InstCombine is not relevant).
>>>> 
>>>> SelectionDAG ISEL's compile time makes it a very poor choice for a JIT. We never put the effort into making x86 FastISEL competitive for WebKit's needs. The focus now is on Global ISEL, but that won't be ready for a while.
>>>> 
>>>> Even when LLVM's compile time problems are largely solved, and I believe they can be, there will always be systemic compile time and memory overhead from design decisions that achieve generality, flexibility, and layering. These are software engineering tradeoffs.
>>>> 
>>>> It is possible to design an extremely lightweight SSA IR that works well in a carefully controlled, fixed optimization pipeline. You then benefit from basic SSA optimizations, which are not hard to write. You end up working with an IR of arrays, where identifiers are indices into the array. It's a different way of writing passes, but very efficient. It's probably worth it for WebKit, but not LLVM.
>>>> 
>>>> LLVM's patchpoints and stackmaps features are critical for managed runtimes. However, directly supporting these features in a custom IR is simply more convenient. It takes more time to make design changes to LLVM IR vs. a custom IR. For example, LLVM does not yet support TBAA on calls, which would be very useful for optimizing around patchpoints and runtime calls.
>>>> 
>>>> Prior to FTL, JavaScriptCore had no dependence on the LLVM project. Maintaining a dependence on an external project naturally has integration overhead.
>>>> 
>>>> So, while LLVM is not the perfect JIT IR, it is very useful for JIT developers who want a quick solution for low-level optimization and retargetable codegen. WebKit FTL was a great example of using it to bootstrap a higher tier JIT.
>>>> 
>>>> To that end, I think it is important for LLVM to have a well-supported -Ojit pipeline (compile fast) with the right set of passes for higher-level languages (e.g. Tail Duplication).
>>>> 
>>>> -Andy
>>>> 
>>>> 
>>> 
>> 
> 
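
The quoted message above describes a lightweight SSA IR as "an IR of arrays, where identifiers are indices into the array". A minimal sketch of that style follows. It is not B3's actual representation: the `Inst` type, the opcode names, and the restriction to straight-line code (no basic blocks or phis) are all simplifying assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Inst:
    op: str            # "const", "add", "mul", or "ret" (hypothetical opcodes)
    args: tuple = ()   # operand value ids, i.e. indices into the instruction list
    imm: int = 0       # payload for "const"


def fold_constants(code):
    """Replace add/mul of two constants with a const, in one forward pass."""
    out = list(code)
    for i, inst in enumerate(out):
        if inst.op in ("add", "mul"):
            a, b = (out[j] for j in inst.args)  # defs precede uses, so already folded
            if a.op == "const" and b.op == "const":
                value = a.imm + b.imm if inst.op == "add" else a.imm * b.imm
                out[i] = Inst("const", (), value)
    return out


def eliminate_dead(code):
    """Drop unused values and renumber operands to the compacted indices."""
    live = [False] * len(code)
    for i in range(len(code) - 1, -1, -1):  # walk backwards: users before defs
        if code[i].op == "ret":
            live[i] = True
        if live[i]:
            for j in code[i].args:
                live[j] = True
    new_index = {}
    out = []
    for i, inst in enumerate(code):
        if live[i]:
            new_index[i] = len(out)
            out.append(Inst(inst.op, tuple(new_index[j] for j in inst.args), inst.imm))
    return out


code = [
    Inst("const", imm=2),   # v0
    Inst("const", imm=3),   # v1
    Inst("add", (0, 1)),    # v2 = v0 + v1
    Inst("const", imm=4),   # v3
    Inst("mul", (2, 3)),    # v4 = v2 * v3
    Inst("ret", (4,)),      # return v4
]
optimized = eliminate_dead(fold_constants(code))
```

Both passes are single linear scans over one flat list, with no use lists or pointer-linked nodes to maintain. That is the efficiency being traded against generality: the renumbering step in `eliminate_dead` shows the kind of bookkeeping a pass author takes on in exchange.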

Philip Reames via llvm-dev

2016-Feb-16 15:47 UTC



On 02/15/2016 11:42 PM, Andrew Trick wrote:
>> On Feb 15, 2016, at 5:34 PM, Philip Reames <listmail at philipreames.com> wrote:
>>
>>
>>
>> On 02/15/2016 04:57 PM, Andrew Trick wrote:
>>>
>>>> On Feb 15, 2016, at 4:25 PM, Philip Reames <listmail at philipreames.com> wrote:
>>>>
>>>> After reading https://webkit.org/blog/5852/introducing-the-b3-jit-compiler/, I jotted down a couple of thoughts of my own here: http://www.philipreames.com/Blog/2016/02/15/quick-thoughts-on-webkits-b3/
>>>
>>> Thanks for sharing. I think it’s worth noting that what you are doing would be considered 5th tier for WebKit, since you already had a decent optimizing backend without LLVM.
>> So, serious, but naive question: what are the other tiers for?  My mental model is generally:
>> tier 0 - interpreter or splat compiler -- (to deal with run once code)
>
> You combined two tiers in one, and I start at 1. So using my terminology inspired by WebKit:
> tier 1: interpreter
> tier 2: splat compiler

Ah, this was the piece I was missing.  I didn't realize you had both an interpreter and a splat compiler.  That makes the numbering make a lot more sense.  I had come up with the possible off-by-one myself, but that didn't fully explain the difference.

Can you say anything about the reasoning for having both?  Do you see warmish code for which the splat compiler is worthwhile?  I'm used to interpreters and splat compilers being positioned as an either-or choice.  When do you decide to promote something to the splat compiler, but not the "tier 3" compiler?

David Chisnall via llvm-dev

2016-Feb-16 16:28 UTC



On 16 Feb 2016, at 15:47, Philip Reames via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> 
> Can you say anything about the reasoning for having both?  Do you see warmish code for which the splat compiler is worthwhile?  I'm used to interpreters and splat compilers being positioned as an either-or choice.  When do you decide to promote something to the splat compiler, but not the "tier 3" compiler?

I’m not a WebKit developer (though I do use JSC as a case study in the course
that I teach), so this may be wide of the mark, but it’s worth noting that the
requirements of JavaScript in a web browser are quite different from those of
most languages.  A *lot* of JavaScript code has execution time completely
dominated by the time spent in the DOM and users notice if memory consumption
for this code is high, but don’t notice if the JavaScript execution is slow
(even a naïve AST interpreter will find performance massively dominated by the
code in the DOM).  Additionally, a lot is executed only once (JavaScript is
purely imperative, so a lot of code that would be declarative in Java / C#,
for example creating classes and attaching methods to them, is imperative
code in JavaScript and is executed precisely once).  Much of this code must be
executed in the few tens of milliseconds that exist between the user clicking on
a link and the user complaining that the browser is slow.

The baseline JIT is around an order of magnitude faster than the interpreter[1],
but consumes memory for the generated code and does not give a user-noticeable
speedup for code that is executed only once.  The baseline JIT and the
interpreter both use the same stack layout (this was one of the motivations for
replacing the old C++ interpreter with one written in JSC’s custom macro
assembly), so it’s comparatively cheap to move from the interpreter to the
baseline JIT.
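
The trade-off in the paragraph above can be put as simple arithmetic. The sketch below assumes the "order of magnitude" speedup mentioned here and an invented compile cost; the function name and all numbers are hypothetical, and it deliberately ignores the memory and power costs that the rest of this message argues also matter.

```python
import math


def break_even_runs(interp_time, compile_time, speedup=10.0):
    """Smallest run count at which compiling beats staying in the interpreter.

    Interpreting n runs costs n * interp_time. Compiling first costs
    compile_time + n * interp_time / speedup. Compiling wins once
    n > compile_time / (interp_time * (1 - 1 / speedup)).
    """
    saved_per_run = interp_time * (1.0 - 1.0 / speedup)
    return math.floor(compile_time / saved_per_run) + 1


# Hypothetical units: one interpreted run costs 1, compiling costs 50.
runs_needed = break_even_runs(interp_time=1.0, compile_time=50.0)
```

With these numbers the compiled code only pays for itself on the 56th run, so code that executes precisely once never recovers the compile cost, however fast the generated code is.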

Finally, WebKit / JSC runs on a lot of mobile devices (iPhones up to MacBooks
Pro) where power consumption is a vital design consideration.  It is very
important in these situations not to speculatively burn cycles optimising code
where the user won’t notice the difference, because they will notice if their
battery doesn’t last as long.

These constraints don’t exist in server workloads and are even quite rare on the
desktop.  Few people care if their desktop app takes a couple of seconds to
start (and, if they do, they won’t mind if the second time it starts a lot
faster because it has cached the generated code).  If a web page takes a couple
of seconds to load, then a lot of people will close the tab before it finishes.

David

[1] https://webkit.org/blog/3362/introducing-the-webkit-ftl-jit/
