Viktor Pavlu
2011-Apr-01 13:53 UTC
[LLVMdev] GSoC 2011: Fast JIT Code Generation for x86-64
Hi All,

I'd like to propose a fast path through code generation for x86-64 in the JIT execution engine as part of 2011's Google Summer of Code program.

While the LLVM-JIT is very popular as a first try at jitting programming languages, projects have abandoned the LLVM-JIT when disappointed with the overall runtime. The problem is that the benefit of faster execution is traded for longer compile time -- which only pays off for code that is executed frequently. One solution to this problem is an adaptive compilation scheme with separate compile strategies for cold and hot code.

The aim of my project is to create the fast path for code that is compiled the first time in such an adaptive compilation scheme: a code generator that produces unoptimized code in a very short time.

I plan to implement a two-pass (almost) linear code generator specifically for x86-64 that

- performs analyses (e.g. live-range analysis) on LLVM-IR in the first pass and
- then generates x86-64 instructions directly from IR in a second pass that writes to the executable memory (e.g. in X86CodeEmitter.cpp), circumventing the more expensive backend passes.

This code generator can then be part of an adaptive compilation framework within LLVM (a GSoC proposal in this direction is currently being discussed on the llvm-dev mailing list[1]), or be used from outside of LLVM -- the latter being my main motivation.

I currently work on generating fast cycle-accurate simulators[2]. For this, our institute has implemented a two-part adaptive compilation scheme using the LLVM-JIT. Although most optimizations are turned off already and the FastISel instruction selector is used, the "fast" path for first-time code generation is still the bottleneck of the simulators. This is for the largest part due to the SelectionDAG instruction selection process, hence the motivation for a simpler, two-pass code generator.
As for my personal details, I'm a PhD student at Vienna University of Technology (TU Wien) with a strong background in compiler theory, acquired in a wide variety of undergraduate- and graduate-level courses.

I appreciate any suggestions and would be very excited if someone is interested in mentoring this. Please note that I'm offline until April 4, so I cannot respond before next Tuesday.

- Viktor Pavlu

---
[1]: GSoC Proposal: Adaptive Compilation Framework for LLVM JIT Compiler
     http://groups.google.com/group/llvm-dev/browse_thread/thread/b4dfd837e208f9dc/
[2]: Optimal Code Generation for Explicitly Parallel Processors
     http://www.complang.tuwien.ac.at/epicopt/
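Editorial illustration: the two-pass generator proposed in this message could look roughly like the sketch below. It is not the author's design. The toy three-address IR, the `Instr` type, and the function names are all assumptions made up for this example, and it emits assembly text rather than machine code. Pass 1 records the last use of every value; pass 2 walks the instructions once, hands out registers, and releases them as values die.

```python
from typing import NamedTuple

class Instr(NamedTuple):
    dest: str     # name of the value this instruction defines
    op: str       # "const", "add" or "mul" (toy IR, not LLVM-IR)
    args: tuple   # operand value names, or one literal for "const"

X86_MNEMONIC = {"add": "add", "mul": "imul"}

def last_uses(block):
    """Pass 1: map each value to the index of the last instruction reading it."""
    last = {}
    for i, ins in enumerate(block):
        if ins.op != "const":
            for v in ins.args:
                last[v] = i
    return last

def emit(block, regs=("rax", "rcx", "rdx", "rbx")):
    """Pass 2: assign registers and emit assembly text in one forward walk."""
    last = last_uses(block)
    free = list(regs)   # assumption: never runs out, so no spill code
    loc = {}            # value name -> register currently holding it
    out = []
    for i, ins in enumerate(block):
        dst = free.pop(0)   # fresh register, so it never aliases an operand
        if ins.op == "const":
            out.append(f"mov {dst}, {ins.args[0]}")
        else:
            a, b = ins.args
            out.append(f"mov {dst}, {loc[a]}")
            out.append(f"{X86_MNEMONIC[ins.op]} {dst}, {loc[b]}")
            # Operands whose last use is this instruction free their register.
            for v in dict.fromkeys(ins.args):
                if last[v] == i:
                    free.append(loc.pop(v))
        loc[ins.dest] = dst
    return out

example = [
    Instr("a", "const", (2,)),
    Instr("b", "const", (3,)),
    Instr("c", "add", ("a", "b")),
    Instr("d", "mul", ("c", "c")),
]
```

Calling `emit(example)` returns a list of assembly lines. A real implementation would also need spilling, calling conventions, and binary encoding, none of which this sketch attempts.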
Joshua Warner
2011-Apr-01 15:06 UTC
[LLVMdev] GSoC 2011: Fast JIT Code Generation for x86-64
Hi Viktor,

I think this is a great idea overall! This problem is something that has *almost* turned me away from LLVM several times now. I'm by no means an influential member of the community (and hence have no real say in GSoC projects), but I do have a few comments.

> I plan to implement a two-pass (almost) linear code generator
> specifically for x86-64 that
>
> - performs analyses (e.g. live-range analysis) on LLVM-IR in the
>   first pass and

I assume this is for collecting information for register allocation? For fast code generation, I would go with a local, bottom-up, linear register allocator, which shouldn't require an explicit live-range analysis pass. It only needs to know liveness information within a single block (mostly), which should be easier and faster to compute on demand instead of in an analysis pass.

> - then generates x86-64 instructions directly from IR in a second
>   pass that writes to the executable memory (e.g. in
>   X86CodeEmitter.cpp),

It sounds as if you intend to mostly hand-write the code generation part. If this is the case, I would suggest that it would be significantly more valuable to generate it from the *.td files instead. That way, it should be a lot easier to port to other architectures.

Sincerely,
Joshua
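Editorial illustration: a local allocator of the kind suggested in this message might be sketched as follows. The block representation (a list of `(dest, operands)` tuples), the function names, and the "evict the value whose next use is farthest away" heuristic are assumptions for this example; the message itself does not specify them. Liveness is never precomputed: `next_use` scans forward inside the block only when the allocator needs an answer.

```python
def next_use(block, start, value):
    """Index of the first instruction at or after `start` that reads `value`."""
    for i in range(start, len(block)):
        if value in block[i][1]:
            return i
    return None

def allocate_block(block, num_regs):
    """Allocate registers for one basic block; returns a log of decisions.

    Assumes num_regs is at least the largest operand count in the block.
    """
    in_reg = {}                    # value -> register index
    free = list(range(num_regs))
    log = []

    def take(i, keep):
        """Get a register, evicting the value with the farthest next use."""
        if free:
            return free.pop(0)

        def distance(v):
            n = next_use(block, i, v)
            return float("inf") if n is None else n

        victim = max((v for v in in_reg if v not in keep), key=distance)
        log.append(("spill", victim))   # conservative: it may be live-out
        return in_reg.pop(victim)

    for i, (dest, operands) in enumerate(block):
        # Bring every operand into a register (reload or block live-in).
        for v in operands:
            if v not in in_reg:
                in_reg[v] = take(i, operands)
                log.append(("reload", v, in_reg[v]))
        # Operands with no later use in this block give up their register.
        for v in operands:
            if v in in_reg and next_use(block, i + 1, v) is None:
                free.append(in_reg.pop(v))
        in_reg[dest] = take(i + 1, ())
        log.append(("def", dest, in_reg[dest]))
    return log

block = [
    ("a", ()),
    ("b", ()),
    ("c", ("a", "b")),
    ("d", ("a", "c")),
    ("e", ("b", "d")),
]
```

With two registers, `allocate_block(block, 2)` has to spill `b` when `c` is defined and reload it for the last instruction. The forward scans make the worst case quadratic in block length, which is usually acceptable for short blocks but is a trade-off worth measuring.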
Óscar Fuentes
2011-Apr-02 05:09 UTC
[LLVMdev] GSoC 2011: Fast JIT Code Generation for x86-64
Viktor Pavlu <vpavlu at gmail.com> writes:

[snip]

> While the LLVM-JIT is very popular as a first try at jitting
> programming languages, projects have abandoned the LLVM-JIT when
> disappointed with the overall runtime.

Hear, hear.

> The problem is, that the benefit of faster execution is traded for
> longer compile time -- which only pays off for code that is executed
> frequently. One solution to this problem is an adaptive compilation
> scheme with separate compile strategies for cold and hot code.

[snip]

> I currently work on generating fast cycle-accurate simulators[2]. For
> this, our institute has implemented a two-part adaptive compilation
> scheme using the LLVM-JIT. Although most optimizations are turned off
> already and the FastISel instruction selector is used, the "fast" path
> for first-time code generation is still the bottleneck of the
> simulators. This is for the largest part due to the SelectionDAG
> instruction selection process, hence the motivation for a simpler,
> two-pass code generator.

Well, anything that makes the JIT usable for those of us compiling medium-sized code (on the order of hundreds of KB to a few MB of generated native code) is greatly appreciated.

As a means of improving runtime performance, my compiler supports the LLVM JIT and a dumb X86 assembler generator that makes very simple optimizations and has some hard constraints. The latter runs in a fraction of the time, and its output performs about as well as, or better than, that of the LLVM JIT (without LLVM's optimization passes). So I'm pretty sure that it is possible to dramatically reduce the time required by the JIT without a severe impact on performance.

[snip]
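Editorial illustration: the adaptive scheme quoted above, with a cheap compiler for cold code and an expensive one for hot code, can be sketched as a call-counting wrapper. The class name, the `threshold` parameter, and the two compiler callbacks are hypothetical and do not correspond to any LLVM API.

```python
class TieredFunction:
    """Compile cheaply first; recompile with the slow compiler once hot."""

    def __init__(self, source, fast_compile, opt_compile, threshold=3):
        self.source = source
        self.opt_compile = opt_compile
        self.threshold = threshold       # calls before promotion (assumed policy)
        self.calls = 0
        self.tier = "cold"
        self.code = fast_compile(source)  # quick, unoptimized first version

    def __call__(self, *args):
        self.calls += 1
        if self.tier == "cold" and self.calls >= self.threshold:
            # The function proved hot: pay for the expensive compile once.
            self.code = self.opt_compile(self.source)
            self.tier = "hot"
        return self.code(*args)


# Stand-in "compilers": both return a Python callable and count invocations.
compile_counts = {"fast": 0, "opt": 0}

def fast_compile(source):
    compile_counts["fast"] += 1
    return lambda x: eval(source, {"x": x})

def opt_compile(source):
    compile_counts["opt"] += 1
    code = compile(source, "<jit>", "eval")   # precompile once, reuse per call
    return lambda x: eval(code, {"x": x})

double = TieredFunction("x * 2", fast_compile, opt_compile, threshold=3)
results = [double(n) for n in range(5)]
```

A counter per function is only one possible trigger. Real systems often sample instead, and they have to deal with replacing code that is currently executing, which this sketch ignores.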
Eric Christopher
2011-Apr-04 19:50 UTC
[LLVMdev] GSoC 2011: Fast JIT Code Generation for x86-64
On Apr 1, 2011, at 6:53 AM, Viktor Pavlu wrote:

> I currently work on generating fast cycle-accurate simulators[2]. For
> this, our institute has implemented a two-part adaptive compilation
> scheme using the LLVM-JIT. Although most optimizations are turned off
> already and the FastISel instruction selector is used, the "fast" path
> for first-time code generation is still the bottleneck of the
> simulators. This is for the largest part due to the SelectionDAG
> instruction selection process, hence the motivation for a simpler,
> two-pass code generator.

This is effectively what fastisel was created for - there are just IR constructs that don't go through that path. The idea is that fastisel will get most of the IR and everything that'd be really hard we just punt to the DAG. I imagine running more things through fastisel would help.

That won't help the slow register allocation problem though - even the fast allocator is pretty slow. I haven't seen what your plan is for register allocation or were you planning on just using a few registers in defined ways?

Also, X86CodeEmitter.cpp is going away to be replaced with the MC emitters.

-eric
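Editorial illustration: the "handle the easy cases fast, punt the rest" structure described in this message can be sketched as a table lookup with a fallback. This is not how LLVM's FastISel is implemented. The opcode table, the tuple format, and both selector functions are invented for the example, and the slow path is a placeholder string rather than a real selector.

```python
# Opcodes the hypothetical fast selector knows how to lower directly.
FAST_TABLE = {"add": "add", "sub": "sub", "mul": "imul"}

def fast_select(instr):
    """Return assembly lines, or None when the instruction is not covered."""
    op, dst, src = instr
    mnemonic = FAST_TABLE.get(op)
    if mnemonic is None:
        return None
    return [f"{mnemonic} {dst}, {src}"]

def slow_select(instr):
    """Stand-in for the general (and expensive) selector."""
    op, dst, src = instr
    return [f"<slow-path> {op} {dst}, {src}"]

def select_block(block):
    """Lower a block, counting how many instructions fell to the slow path."""
    code, punted = [], 0
    for instr in block:
        selected = fast_select(instr)
        if selected is None:
            selected = slow_select(instr)
            punted += 1
        code.extend(selected)
    return code, punted

code, punted = select_block([
    ("add", "rax", "rcx"),
    ("sdiv", "rax", "rcx"),   # not in FAST_TABLE, so it is punted
    ("mul", "rdx", "rax"),
])
```

Tracking `punted` is one way to measure how much of the input actually takes the fast path, which is the coverage question raised in the message.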
Viktor Pavlu
2011-Apr-05 09:56 UTC
[LLVMdev] GSoC 2011: Fast JIT Code Generation for x86-64
On Mon, Apr 4, 2011 at 9:50 PM, Eric Christopher <echristo at apple.com> wrote:
>
> On Apr 1, 2011, at 6:53 AM, Viktor Pavlu wrote:
>
>> [...] Although most optimizations are turned off
>> already and the FastISel instruction selector is used, the "fast" path
>> for first-time code generation is still the bottleneck [...]
>
> This is effectively what fastisel was created for - there are just IR
> constructs that don't go through that path. The idea is that fastisel
> will get most of the IR and everything that'd be really hard we just
> punt to the DAG. I imagine running more things through fastisel would
> help.

To me, increasing the coverage of FastISel seemed more involved than directly emitting opcodes to memory, and less likely to reduce overhead.

> That won't help the slow register allocation problem though - even
> the fast allocator is pretty slow. I haven't seen what your plan
> is for register allocation or were you planning on just using a few
> registers in defined ways?

My first idea was to implement a linear scan allocator integrated into the code generation pass.

> Also, X86CodeEmitter.cpp is going away to be replaced with the MC
> emitters.

Yes, I remember reading about this on the mailing list. With our simulator generators we are still living in 2.2/2.6 land, but we will change that.

X86CodeEmitter was only meant to indicate that in my intended fast path there is nothing in between the LLVM-IR passes and the final emission of the code, i.e. an LLVM-IR pass that produces x86-64.

- Viktor
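Editorial illustration: a linear scan allocator, as mentioned in this message, processes live intervals in order of their start point and keeps a list of currently active intervals. The sketch below follows the commonly described textbook form of the algorithm as a standalone function. It is not integrated into code generation as the author intends, and the interval format and function name are assumptions.

```python
def linear_scan(intervals, num_regs):
    """intervals: {name: (start, end)}. Returns (register map, spilled names)."""
    free = list(range(num_regs))
    active = []      # (end, name) pairs for intervals holding a register
    reg = {}         # name -> register index
    spilled = []
    by_start = sorted(intervals.items(), key=lambda kv: kv[1][0])
    for name, (start, end) in by_start:
        # Expire intervals that ended before this one starts.
        while active and active[0][0] < start:
            _, old = active.pop(0)
            free.append(reg[old])
        if free:
            reg[name] = free.pop(0)
            active.append((end, name))
            active.sort()
            continue
        # No register left: spill whichever interval ends last.
        last_end, last_name = active[-1]
        if last_end > end:
            reg[name] = reg.pop(last_name)   # take over its register
            spilled.append(last_name)
            active[-1] = (end, name)
            active.sort()
        else:
            spilled.append(name)
    return reg, spilled

assignment, spilled = linear_scan(
    {"a": (0, 5), "b": (1, 2), "c": (3, 8), "d": (4, 6)},
    num_regs=2,
)
```

In this run `c` first receives a register and later loses it to `d`, because `c` has the longest remaining interval when the registers run out. Keeping `active` sorted with a full `sort()` is the simplest choice here; an insertion into an already sorted list would be cheaper.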