Andrew Trick via llvm-dev
2016-Feb-16 07:42 UTC
[llvm-dev] WebKit B3 (was LLVM Weekly - #110, Feb 8th 2016)
> On Feb 15, 2016, at 5:34 PM, Philip Reames <listmail at philipreames.com> wrote:
>
> On 02/15/2016 04:57 PM, Andrew Trick wrote:
>>
>>> On Feb 15, 2016, at 4:25 PM, Philip Reames <listmail at philipreames.com> wrote:
>>>
>>> After reading https://webkit.org/blog/5852/introducing-the-b3-jit-compiler/, I jotted down a couple of thoughts of my own here: http://www.philipreames.com/Blog/2016/02/15/quick-thoughts-on-webkits-b3/
>>
>> Thanks for sharing. I think it’s worth noting that what you are doing would be considered 5th tier for WebKit, since you already had a decent optimizing backend without LLVM.
> So, serious, but naive question: what are the other tiers for? My mental model is generally:
> tier 0 - interpreter or splat compiler -- (to deal with run once code)

You combined two tiers in one, and I start at 1. So using my terminology inspired by WebKit:
tier 1: interpreter
tier 2: splat compiler

> tier 1 - a fast (above all else) but decent compiler which gets the obvious stuff -- (does most of the compilation by methods)

or tier 3: compiling methods into IR or bytecode, applying high-level optimization, splatting codegen

> tier 2 - a good, but fast, compiler which generates good quality code without burning too much time -- (specifically for the hotish stuff)

or tier 4: high level optimization using profile data from tier3, nontrivial codegen

> tier 3 - "a compile time does not matter, get this hot method" compiler, decidedly optional -- (compiles only *really* hot stuff)

or tier 5: bolt a C compiler onto the JIT.

>
> (Profiling is handled by tier 0, and tier 1, in the above.)

Profiling needs to be done by all tiers up to and including at least the first round of high-level optimization where the optimizer registers some assumptions about runtime types (tier 3 in my case).

I’m not saying it’s a good idea to have all those tiers, it’s just a way to compare JIT levels. The point is, you are a tier higher than B3.

- Andy

>
> It really sounds to me like FTL is positioned somewhere between tier 1 and tier 2 in the above. Is that about right?
>> You also have more room for background compilation threads and aren’t benchmarking on a MacBook Air.
> True! Both definitely matter.
>>
>> Andy
>>
>>>
>>> Philip
>>>
>>> On 02/15/2016 03:12 PM, Andrew Trick via llvm-dev wrote:
>>>>
>>>>> On Feb 9, 2016, at 9:55 AM, Rafael Espíndola via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>>>>>
>>>>> > JavaScriptCore's [FTL JIT](https://trac.webkit.org/wiki/FTLJIT) is moving away
>>>>> > from using LLVM as its backend, towards [B3 (Bare Bones
>>>>> > Backend)](https://webkit.org/docs/b3/).
>>>>> > This includes its own [SSA
>>>>> > IR](https://webkit.org/docs/b3/intermediate-representation.html),
>>>>> > optimisations, and instruction selection backend.
>>>>>
>>>>> In the end, what was the main motivation for creating a new IR?
>>>>>
>>>>
>>>> I can't speak to the motivation of the WebKit team. Those are outlined in https://webkit.org/blog/5852/introducing-the-b3-jit-compiler/.
>>>> I'll give you my personal perspective on using LLVM for JITs, which may be interesting to the LLVM community.
>>>>
>>>> Most of the payoff for high level languages comes from the language-specific optimizer. It was simpler for JavaScriptCore to perform loop optimization at that level, so it doesn't even make use of LLVM's most powerful optimizations, particularly SCEV based optimization. There is a relatively small, finite amount of low-level optimization that is going to be important for JavaScript benchmarks (most of InstCombine is not relevant).
>>>>
>>>> SelectionDAG ISEL's compile time makes it a very poor choice for a JIT. We never put the effort into making x86 FastISEL competitive for WebKit's needs. The focus now is on Global ISEL, but that won't be ready for a while.
>>>>
>>>> Even when LLVM's compile time problems are largely solved, and I believe they can be, there will always be systemic compile time and memory overhead from design decisions that achieve generality, flexibility, and layering. These are software engineering tradeoffs.
>>>>
>>>> It is possible to design an extremely lightweight SSA IR that works well in a carefully controlled, fixed optimization pipeline. You then benefit from basic SSA optimizations, which are not hard to write. You end up working with an IR of arrays, where identifiers are indices into the array. It's a different way of writing passes, but very efficient. It's probably worth it for WebKit, but not LLVM.
>>>>
>>>> LLVM's patchpoints and stackmaps features are critical for managed runtimes. However, directly supporting these features in a custom IR is simply more convenient. It takes more time to make design changes to LLVM IR vs. a custom IR. For example, LLVM does not yet support TBAA on calls, which would be very useful for optimizing around patchpoints and runtime calls.
>>>>
>>>> Prior to FTL, JavaScriptCore had no dependence on the LLVM project. Maintaining a dependence on an external project naturally has integration overhead.
>>>>
>>>> So, while LLVM is not the perfect JIT IR, it is very useful for JIT developers who want a quick solution for low-level optimization and retargetable codegen. WebKit FTL was a great example of using it to bootstrap a higher tier JIT.
>>>>
>>>> To that end, I think it is important for LLVM to have a well-supported -Ojit pipeline (compile fast) with the right set of passes for higher-level languages (e.g. Tail Duplication).
>>>>
>>>> -Andy
>>>>
>>>> _______________________________________________
>>>> LLVM Developers mailing list
>>>> llvm-dev at lists.llvm.org
>>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
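The quoted remark above about "an IR of arrays, where identifiers are indices into the array" can be made concrete with a small sketch. This is not B3's actual data structure: the `Inst` record, the three opcodes, and `fold_constants` are hypothetical names invented for illustration, and Python is used only for brevity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Inst:
    """One SSA value. Its identifier is simply its index in the code list."""
    op: str            # "const", "add", or "mul" in this toy IR (hypothetical)
    args: tuple = ()   # operand identifiers: indices of earlier instructions
    imm: int = 0       # immediate value, used only by "const"


def fold_constants(code):
    """Single forward pass of constant folding over a list of Inst.

    Operands always refer to smaller indices, so one in-order walk
    sees every operand (already folded) before it reaches the user.
    """
    out = list(code)
    for i, inst in enumerate(out):
        if inst.op in ("add", "mul") and all(out[a].op == "const" for a in inst.args):
            x, y = (out[a].imm for a in inst.args)
            out[i] = Inst("const", (), x + y if inst.op == "add" else x * y)
    return out


# v0 = 2; v1 = 3; v2 = v0 + v1; v3 = v2 * v1
code = [
    Inst("const", imm=2),
    Inst("const", imm=3),
    Inst("add", (0, 1)),
    Inst("mul", (2, 1)),
]
folded = fold_constants(code)
```

Each slot is rewritten in place, so every operand index stays valid and the pass needs no use-list bookkeeping. That is one plausible reading of "a different way of writing passes", not a description of how B3 itself is organized.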
Philip Reames via llvm-dev
2016-Feb-16 15:47 UTC
[llvm-dev] WebKit B3 (was LLVM Weekly - #110, Feb 8th 2016)
On 02/15/2016 11:42 PM, Andrew Trick wrote:
>
>> On Feb 15, 2016, at 5:34 PM, Philip Reames <listmail at philipreames.com> wrote:
>>
>> On 02/15/2016 04:57 PM, Andrew Trick wrote:
>>>
>>>> On Feb 15, 2016, at 4:25 PM, Philip Reames
>>>> <listmail at philipreames.com> wrote:
>>>>
>>>> After reading
>>>> https://webkit.org/blog/5852/introducing-the-b3-jit-compiler/, I
>>>> jotted down a couple of thoughts of my own here:
>>>> http://www.philipreames.com/Blog/2016/02/15/quick-thoughts-on-webkits-b3/
>>>
>>> Thanks for sharing. I think it’s worth noting that what you are
>>> doing would be considered 5th tier for WebKit, since you already had
>>> a decent optimizing backend without LLVM.
>> So, serious, but naive question: what are the other tiers for? My
>> mental model is generally:
>> tier 0 - interpreter or splat compiler -- (to deal with run once code)
>
> You combined two tiers in one, and I start at 1. So using my
> terminology inspired by WebKit:
> tier 1: interpreter
> tier 2: splat compiler

Ah, this was the piece I was missing. I didn't realize you had both an interpreter and a splat compiler. That makes the numbering make a lot more sense. I had come up with the possible off-by-one myself, but that didn't fully explain the difference.

Can you say anything about the reasoning for having both? Do you see warmish code that the splat compiler is worthwhile? I'm used to interpreters and splat compilers being positioned as an either-or choice. When do you decide to promote something to the splat compiler, but not the "tier 3" compiler?

>
>> tier 1 - a fast (above all else) but decent compiler which gets the
>> obvious stuff -- (does most of the compilation by methods)
>
> or tier 3: compiling methods into IR or bytecode, applying high-level
> optimization, splatting codegen
>
>> tier 2 - a good, but fast, compiler which generates good quality code
>> without burning too much time -- (specifically for the hotish stuff)
>
> or tier 4: high level optimization using profile data from tier3,
> nontrivial codegen
>
>> tier 3 - "a compile time does not matter, get this hot method"
>> compiler, decidedly optional -- (compiles only *really* hot stuff)
>
> or tier 5: bolt a C compiler onto the JIT.
>
>>
>> (Profiling is handled by tier 0, and tier 1, in the above.)
>
> Profiling needs to be done by all tiers up to and including at least
> the first round of high-level optimization where the optimizer
> registers some assumptions about runtime types (tier 3 in my case).
>
> I’m not saying it’s a good idea to have all those tiers, it’s just a
> way to compare JIT levels. The point is, you are a tier higher than B3.
>
> - Andy
>
>>
>> It really sounds to me like FTL is positioned somewhere between tier
>> 1 and tier 2 in the above. Is that about right?
>>> You also have more room for background compilation threads and
>>> aren’t benchmarking on a MacBook Air.
>> True! Both definitely matter.
>>>
>>> Andy
>>>
>>>>
>>>> Philip
>>>>
>>>> On 02/15/2016 03:12 PM, Andrew Trick via llvm-dev wrote:
>>>>>
>>>>>> On Feb 9, 2016, at 9:55 AM, Rafael Espíndola via llvm-dev
>>>>>> <llvm-dev at lists.llvm.org> wrote:
>>>>>>
>>>>>> > JavaScriptCore's [FTL JIT](https://trac.webkit.org/wiki/FTLJIT)
>>>>>> > is moving away
>>>>>> > from using LLVM as its backend, towards [B3 (Bare Bones
>>>>>> > Backend)](https://webkit.org/docs/b3/).
David Chisnall via llvm-dev
2016-Feb-16 16:28 UTC
[llvm-dev] WebKit B3 (was LLVM Weekly - #110, Feb 8th 2016)
On 16 Feb 2016, at 15:47, Philip Reames via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>
> Can you say anything about the reasoning for having both? Do you see warmish code that the splat compiler is worthwhile? I'm used to interpreters and splat compilers being positioned as an either-or choice. When do you decide to promote something to the splat compiler, but not the "tier 3" compiler?

I’m not a WebKit developer (though I do use JSC as a case study in the course that I teach), so this may be wide of the mark, but it’s worth noting that the requirements of JavaScript in a web browser are quite different from those of most languages.

A *lot* of JavaScript code has execution time completely dominated by the time spent in the DOM, and users notice if memory consumption for this code is high, but don’t notice if the JavaScript execution is slow (even a naïve AST interpreter will find performance massively dominated by the code in the DOM).

Additionally, a lot is executed only once (JavaScript is purely imperative, so a lot of code that would be declarative in Java / C#, for example creating classes and attaching methods to them, is imperative code in JavaScript and is executed precisely once). Much of this code must be executed in the few tens of milliseconds that exist between the user clicking on a link and the user complaining that the browser is slow.

The baseline JIT is around an order of magnitude faster than the interpreter[1], but consumes memory for the generated code and does not give a user-noticeable speedup for code that is executed only once. The baseline JIT and the interpreter both use the same stack layout (this was one of the motivations for replacing the old C++ interpreter with one written in JSC’s custom macro assembly), so it’s comparatively cheap to move from the interpreter to the baseline JIT.

Finally, WebKit / JSC runs on a lot of mobile devices (iPhones up to MacBooks Pro) where power consumption is a vital design consideration. It is very important in these situations not to speculatively burn cycles optimising code where the user won’t notice the difference, because they will notice if their battery doesn’t last as long.

These constraints don’t exist in server workloads and are even quite rare on the desktop. Few people care if their desktop app takes a couple of seconds to start (and, if they do, they won’t mind if the second time it starts a lot faster because it has cached the generated code). If a web page takes a couple of seconds to load, then a lot of people will close the tab before it finishes.

David

[1] https://webkit.org/blog/3362/introducing-the-webkit-ftl-jit/
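
The trade-off described above (run-once code stays in the interpreter, warmer code moves to the baseline JIT, only hot code is optimised) can be illustrated with a counter-driven promotion policy. This is a sketch under stated assumptions, not JavaScriptCore's actual heuristics: the thresholds, the tier names, and the `Function` and `record_call` helpers are all hypothetical.

```python
from dataclasses import dataclass

# Hypothetical call-count thresholds, highest tier first. Illustrative only;
# these are not the values any real engine uses.
TIER_THRESHOLDS = [
    (1000, "optimizing JIT"),  # hot: worth spending compile time and power
    (10, "baseline JIT"),      # warmish: cheap splat-compiled code
]


@dataclass
class Function:
    name: str
    calls: int = 0
    tier: str = "interpreter"  # everything starts in the lowest tier


def record_call(fn):
    """Count one invocation and promote fn if it has crossed a threshold."""
    fn.calls += 1
    for threshold, tier in TIER_THRESHOLDS:
        if fn.calls >= threshold:
            fn.tier = tier
            break
    return fn.tier


# Run-once setup code never leaves the interpreter.
run_once = Function("attach_methods")
record_call(run_once)

# Warm code is promoted to the baseline JIT.
warm = Function("on_scroll")
for _ in range(10):
    record_call(warm)

# Only genuinely hot code reaches the optimizing tier.
hot = Function("inner_loop")
for _ in range(1000):
    record_call(hot)
```

With these made-up thresholds, code that runs once costs no compile time or code memory at all, which matches the argument that most page-load JavaScript should never be compiled speculatively.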