Viktor Pavlu
2011-Apr-05 09:56 UTC
[LLVMdev] GSoC 2011: Fast JIT Code Generation for x86-64
On Mon, Apr 4, 2011 at 9:50 PM, Eric Christopher <echristo at apple.com> wrote:
>
> On Apr 1, 2011, at 6:53 AM, Viktor Pavlu wrote:
>
>> [...] Although most optimizations are turned off
>> already and the FastISel instruction selector is used, the "fast" path
>> for first-time code generation is still the bottleneck [...]
>
> This is effectively what fastisel was created for - there are just IR
> constructs that don't go through that path. The idea is that fastisel
> will get most of the IR and everything that'd be really hard we just
> punt to the DAG. I imagine running more things through fastisel would
> help.

To me, increasing coverage of the FastISel seemed more involved than
directly emitting opcodes to memory, with a lesser outlook on
reducing overhead.

> That won't help the slow register allocation problem though - even
> the fast allocator is pretty slow. I haven't seen what your plan
> is for register allocation or were you planning on just using a few
> registers in defined ways?

My first idea was to implement a linear scan allocator integrated
into the code generation pass.

> Also, X86CodeEmitter.cpp is going away to be replaced with the MC
> emitters.

Yes, I remember reading about this on the mailing list.
With our simulator generators we are still living in 2.2/2.6 land,
though, but we will change that.

X86CodeEmitter was only meant to indicate that in my intended fast
path there is nothing in between the LLVM-IR passes and the final
emission of the code, i.e. an LLVM-IR pass that produces x86-64.

- Viktor
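The linear-scan allocator Viktor mentions is presumably the classic Poletto/Sarkar scheme: walk live intervals in order of increasing start point, hand out free registers, and when none remain spill whichever interval ends last. A minimal sketch of that idea follows; the types and names are illustrative stand-ins, not LLVM code, and it assumes at least one physical register is available.

    // Minimal sketch of classic linear-scan register allocation (Poletto/Sarkar).
    // Everything here is an illustrative stand-in, not LLVM code.
    #include <algorithm>
    #include <iterator>
    #include <set>
    #include <vector>

    struct Interval { int start, end, vreg; int phys = -1; bool spilled = false; };

    void linearScan(std::vector<Interval> &intervals, int numPhysRegs) {
      // Process intervals in order of increasing start point.
      std::sort(intervals.begin(), intervals.end(),
                [](const Interval &a, const Interval &b) { return a.start < b.start; });

      std::vector<int> freeRegs;
      for (int r = 0; r < numPhysRegs; ++r)   // assumes numPhysRegs >= 1
        freeRegs.push_back(r);

      // Intervals currently holding a register, ordered by increasing end point.
      auto byEnd = [](const Interval *a, const Interval *b) { return a->end < b->end; };
      std::multiset<Interval *, decltype(byEnd)> active(byEnd);

      for (Interval &cur : intervals) {
        // Expire intervals that ended before 'cur' starts and reclaim their registers.
        while (!active.empty() && (*active.begin())->end < cur.start) {
          freeRegs.push_back((*active.begin())->phys);
          active.erase(active.begin());
        }

        if (!freeRegs.empty()) {
          cur.phys = freeRegs.back();
          freeRegs.pop_back();
          active.insert(&cur);
        } else {
          // No register free: spill whichever interval ends last.
          Interval *last = *std::prev(active.end());
          if (last->end > cur.end) {
            cur.phys = last->phys;   // steal the register from the longest-lived interval
            last->spilled = true;
            last->phys = -1;
            active.erase(std::prev(active.end()));
            active.insert(&cur);
          } else {
            cur.spilled = true;      // the current interval itself is spilled
          }
        }
      }
    }

The appeal of this algorithm for a JIT fast path is that it is a single pass over the live intervals with no interference graph, which is why it is the usual choice when compile time dominates.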
Jim Grosbach
2011-Apr-05 17:16 UTC
[LLVMdev] GSoC 2011: Fast JIT Code Generation for x86-64
On Apr 5, 2011, at 2:56 AM, Viktor Pavlu wrote:

> On Mon, Apr 4, 2011 at 9:50 PM, Eric Christopher <echristo at apple.com> wrote:
>>
>> On Apr 1, 2011, at 6:53 AM, Viktor Pavlu wrote:
>>
>>> [...] Although most optimizations are turned off
>>> already and the FastISel instruction selector is used, the "fast" path
>>> for first-time code generation is still the bottleneck [...]
>>
>> This is effectively what fastisel was created for - there are just IR
>> constructs that don't go through that path. The idea is that fastisel
>> will get most of the IR and everything that'd be really hard we just
>> punt to the DAG. I imagine running more things through fastisel would
>> help.
>
> To me, increasing coverage of the FastISel seemed more involved than
> directly emitting opcodes to memory, with a lesser outlook on
> reducing overhead.

That seems extremely unlikely. You'd be effectively re-implementing both
fast-isel and the MC binary emitter layers, and it sounds like a new
register allocator as well.

What Eric is suggesting is instead locating which IR constructs are not
being handled by fast-isel and are causing problems (i.e., are being
frequently encountered in your code-base) and implementing fast-isel
handling for them. That will remove the selectiondag overhead that
you've identified as the primary compile-time problem.

-Jim

>> That won't help the slow register allocation problem though - even
>> the fast allocator is pretty slow. I haven't seen what your plan
>> is for register allocation or were you planning on just using a few
>> registers in defined ways?
>
> My first idea was to implement a linear scan allocator integrated
> into the code generation pass.
>
>> Also, X86CodeEmitter.cpp is going away to be replaced with the MC
>> emitters.
>
> Yes, I remember reading about this on the mailing list.
> With our simulator generators we are still living in 2.2/2.6 land,
> though, but we will change that.
>
> X86CodeEmitter was only meant to indicate that in my intended fast
> path there is nothing in between the LLVM-IR passes and the final
> emission of the code, i.e. an LLVM-IR pass that produces x86-64.
>
> - Viktor
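To make the trade-off Jim describes concrete: fast-isel is essentially a per-instruction dispatcher that handles common IR forms directly and punts everything else to the SelectionDAG, so "increasing coverage" means adding cases to that dispatch rather than rebuilding an emitter and allocator from scratch. A schematic sketch of that structure, using stand-in types rather than the real FastISel interface:

    // Schematic only: stand-in types illustrating the fast-isel structure, not
    // the real LLVM FastISel interface. Common IR forms are matched directly;
    // anything unhandled is punted to the general (SelectionDAG) path, and every
    // punt costs the compile time Viktor is trying to avoid.
    enum class Opcode { Add, Load, Store, Call, Other };
    struct Instr { Opcode op; /* operands elided for brevity */ };

    struct FastPathSelector {
      bool selectAdd(const Instr &)  { /* emit one machine add here  */ return true; }
      bool selectLoad(const Instr &) { /* emit one machine load here */ return true; }

      // Returns false when the instruction must fall back to the slow path;
      // "increasing coverage" means turning more of these cases into true.
      bool trySelect(const Instr &I) {
        switch (I.op) {
        case Opcode::Add:  return selectAdd(I);
        case Opcode::Load: return selectLoad(I);
        default:           return false; // punt to the SelectionDAG-style path
        }
      }
    };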
----- Original Message ----
> From: Jim Grosbach <grosbach at apple.com>
> To: Viktor Pavlu <vpavlu at gmail.com>
> Cc: llvmdev at cs.uiuc.edu
> Sent: Tue, April 5, 2011 1:16:34 PM
> Subject: Re: [LLVMdev] GSoC 2011: Fast JIT Code Generation for x86-64
>
> On Apr 5, 2011, at 2:56 AM, Viktor Pavlu wrote:
>
> > On Mon, Apr 4, 2011 at 9:50 PM, Eric Christopher <echristo at apple.com> wrote:
> >>
> >> On Apr 1, 2011, at 6:53 AM, Viktor Pavlu wrote:
> >>
> >>> [...] Although most optimizations are turned off
> >>> already and the FastISel instruction selector is used, the "fast" path
> >>> for first-time code generation is still the bottleneck [...]
> >>
> >> This is effectively what fastisel was created for - there are just IR
> >> constructs that don't go through that path. The idea is that fastisel
> >> will get most of the IR and everything that'd be really hard we just
> >> punt to the DAG. I imagine running more things through fastisel would
> >> help.
> >
> > To me, increasing coverage of the FastISel seemed more involved than
> > directly emitting opcodes to memory, with a lesser outlook on
> > reducing overhead.
>
> That seems extremely unlikely. You'd be effectively re-implementing both
> fast-isel and the MC binary emitter layers, and it sounds like a new
> register allocator as well.
>
> What Eric is suggesting is instead locating which IR constructs are not
> being handled by fast-isel and are causing problems (i.e., are being
> frequently encountered in your code-base) and implementing fast-isel
> handling for them. That will remove the selectiondag overhead that
> you've identified as the primary compile-time problem.
>
> -Jim

An alternative that would expand LLVM's capabilities would be to write an
interpreter for the LLVM IR itself. A well-written interpretation
framework could be used by the compiler as well.

- Jan
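LLVM already ships an IR interpreter (lli selects it with -force-interpreter), and a host program can request it through the ExecutionEngine API. A rough sketch using that API roughly as it looked around the 2.x releases; header paths and constructor details shifted between versions, so treat this as an outline rather than copy-paste code:

    // Sketch: running a module through LLVM's bundled IR interpreter via the
    // ExecutionEngine API of the 2.x/3.0 era (details varied by release).
    #include "llvm/ExecutionEngine/ExecutionEngine.h"
    #include "llvm/ExecutionEngine/GenericValue.h"
    #include "llvm/ExecutionEngine/Interpreter.h"  // pulls the interpreter into the link
    #include "llvm/Module.h"                       // pre-3.3 header location

    #include <string>
    #include <vector>

    llvm::GenericValue interpretMain(llvm::Module *M) {
      std::string Err;
      llvm::ExecutionEngine *EE = llvm::EngineBuilder(M)
                                      .setEngineKind(llvm::EngineKind::Interpreter)
                                      .setErrorStr(&Err)
                                      .create();   // null on failure; Err holds the reason
      llvm::Function *F = M->getFunction("main");
      std::vector<llvm::GenericValue> NoArgs;      // call main() with no arguments
      return EE->runFunction(F, NoArgs);
    }

Whether this helps Viktor's use case depends on the workload: the interpreter avoids code-generation latency entirely but executes far more slowly than even unoptimized JITted code.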
Eric Christopher
2011-Apr-05 18:33 UTC
[LLVMdev] GSoC 2011: Fast JIT Code Generation for x86-64
On Apr 5, 2011, at 2:56 AM, Viktor Pavlu wrote:

> On Mon, Apr 4, 2011 at 9:50 PM, Eric Christopher <echristo at apple.com> wrote:
>>
>> On Apr 1, 2011, at 6:53 AM, Viktor Pavlu wrote:
>>
>>> [...] Although most optimizations are turned off
>>> already and the FastISel instruction selector is used, the "fast" path
>>> for first-time code generation is still the bottleneck [...]
>>
>> This is effectively what fastisel was created for - there are just IR
>> constructs that don't go through that path. The idea is that fastisel
>> will get most of the IR and everything that'd be really hard we just
>> punt to the DAG. I imagine running more things through fastisel would
>> help.
>
> To me, increasing coverage of the FastISel seemed more involved than
> directly emitting opcodes to memory, with a lesser outlook on
> reducing overhead.

Then you're not quite understanding what fast-isel does. The idea behind
fast-isel is that common code that can easily be splatted out as what is
effectively assembly is handled that way. If you're seeing a lot of time
in DAG instruction selection, then the code you're putting through
fast-isel isn't getting all the way through and fast-isel is punting to
the SelectionDAG. If this is happening a lot you'll probably want to
change the IR that you're generating, if possible.

If you'd like to see the constructs that fast-isel is punting to the
SelectionDAG on, there are options to make it more verbose, or even
abort.

>> That won't help the slow register allocation problem though - even
>> the fast allocator is pretty slow. I haven't seen what your plan
>> is for register allocation or were you planning on just using a few
>> registers in defined ways?
>
> My first idea was to implement a linear scan allocator integrated
> into the code generation pass.

You may want to look at the fast register allocator then.

>> Also, X86CodeEmitter.cpp is going away to be replaced with the MC
>> emitters.
>
> Yes, I remember reading about this on the mailing list.
> With our simulator generators we are still living in 2.2/2.6 land,
> though, but we will change that.
>
> X86CodeEmitter was only meant to indicate that in my intended fast
> path there is nothing in between the LLVM-IR passes and the final
> emission of the code, i.e. an LLVM-IR pass that produces x86-64.

Effectively, what you're talking about then is a pass that rewrites
fast-isel to use MC instead of machine instructions.

Also, one of the advantages of fast-isel is that there's very little
that has to go through the hand-coded parts - a great deal is
autogenerated from the .td files. This is something that any
replacement should do as well.

-eric
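The verbosity/abort knobs Eric refers to were, in trees of that era, the hidden options -fast-isel-verbose and -fast-isel-abort (names recalled from memory; verify against your checkout). A JIT host program can flip the same cl::opts programmatically, for example:

    // Sketch: a JIT host turning on fast-isel diagnostics by feeding the
    // corresponding cl::opts through ParseCommandLineOptions. The option names
    // ("fast-isel-verbose", "fast-isel-abort") are recalled from the 2.9-era
    // tree and are hidden options; double-check them in your checkout.
    #include "llvm/Support/CommandLine.h"

    void enableFastIselDiagnostics() {
      const char *Args[] = { "my-jit", "-fast-isel-verbose", "-fast-isel-abort" };
      llvm::cl::ParseCommandLineOptions(3, const_cast<char **>(Args));
    }

With -fast-isel-abort set, any IR construct that falls off the fast path stops the compiler immediately, which makes the problematic constructs easy to enumerate in a test run.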
Óscar Fuentes
2011-Apr-05 19:41 UTC
[LLVMdev] GSoC 2011: Fast JIT Code Generation for x86-64
Jim Grosbach <grosbach at apple.com> writes:

>> To me, increasing coverage of the FastISel seemed more involved than
>> directly emitting opcodes to memory, with a lesser outlook on
>> reducing overhead.
>
> That seems extremely unlikely. You'd be effectively re-implementing
> both fast-isel and the MC binary emitter layers, and it sounds like a
> new register allocator as well.
>
> What Eric is suggesting is instead locating which IR constructs are
> not being handled by fast-isel and are causing problems (i.e., are
> being frequently encountered in your code-base) and implementing
> fast-isel handling for them. That will remove the selectiondag
> overhead that you've identified as the primary compile-time problem.

At some point in the past someone was kind enough to add fast-isel
support for some instructions frequently emitted by my compiler, hoping
that it would speed up JITting. The results were disappointing
(negligible, IIRC). Either fast-isel does not make much of a difference
or the main inefficiency is elsewhere.