thr3ads.net - llvm dev - [LLVMdev] A faster instruction selector? [Dec 2008]

If this information is useful, please help other people find it:
Share via:

Mark Shannon

2008-Dec-15 17:14 UTC

[LLVMdev] A faster instruction selector?

Hi everyone,

llvm is great!

But there is one exception ;)
llvm components are generally fast, but instruction selection is slooow.

Let me explain.
I am developing a toolkit for building virtual machines which can automatically
generate a JIT compiler using the interpreter specification.
llvm does the hard work of machine code generation. (Thanks to you all)

I discovered that JIT compilation is taking much longer than had hoped.
Some profiling showed that is almost entirely due 
to the llvm instruction selector (The register allocator and optimisation passes
are fast)

The split in execution time for JIT compilation being approximately:
(My) IR code generation:      1
(llvm) Register allocation:   1
(llvm) Instruction selection: 12 

The ladyvm JVM paper noted the same problem: 
http://vmkit.llvm.org/ladyvm.html

So I did a quick experiment with llvm and lcc
(http://www.cs.princeton.edu/software/lcc),
in order to see how much faster instruction selection could be. 
lcc uses a BURG-type (called lburg) instruction-selector.

The following is for x86/linux (ubuntu)
I am interested in JIT performance so I have only counted user+sys time.

Compiling a small test program (the dhrystone benchmark):
lcc (Fraser and Hanson) (after pre-processing): 4ms.
llc                                           : 16ms.
Incidentally, optimisation is respectably fast
opt -O3                                       : 16ms.

(My machine is quite slow)

Using -time-passes shows that almost all of the time spent by llc is in the
DAG-to-DAG
instruction selection.

lcc does lexing, parsing, type-checking, IR code-generation, register allocation
AND final code generation in much less time that llvm spends doing instruction
selection.
Also, there is no way that instruction selection should take as long as -O3
optimisation.

It would seem that lcc's lburg instruction selector is at least ten times
faster than
than llvm, possibly 20 times faster.

Does a fast instruction selector for llvm exist?

A BURG based instruction selector would improve the JIT compilation capabilities
of llvm hugely,
and benefit the static compiler as well.
A fast compiler is really useful in the compile-test-debug cycle.

Sadly, I do not have the time to implement such an instruction selector myself.

Cheers,
Mark.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: winmail.dat
Type: application/ms-tnef
Size: 3683 bytes
Desc: not available
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20081215/22b34198/attachment.bin>

Nicolas Geoffray

2008-Dec-15 18:00 UTC

head link

[LLVMdev] A faster instruction selector?

Hi Mark,

I don't know the internals of the instruction selection in LLVM (vburg 
and stuff), but I do know that work has been made to improve it. The 
"-fast" command line argument in lli and llc generates less optimized 
machine code quickly. You can also use the local register allocator 
which is faster than the default linear scan.

I did some benchmarks in vmkit (the project that encompass ladyvm) and 
compilation time is roughly 50% faster with both fast and local register 
allocator.

Nicolas

Mark Shannon wrote:> Hi everyone,
>
> llvm is great!
>
> But there is one exception ;)
> llvm components are generally fast, but instruction selection is slooow.
>
> Let me explain.
> I am developing a toolkit for building virtual machines which can
automatically
> generate a JIT compiler using the interpreter specification.
> llvm does the hard work of machine code generation. (Thanks to you all)
>
> I discovered that JIT compilation is taking much longer than had hoped.
> Some profiling showed that is almost entirely due 
> to the llvm instruction selector (The register allocator and optimisation
passes are fast)
>
> The split in execution time for JIT compilation being approximately:
> (My) IR code generation:      1
> (llvm) Register allocation:   1
> (llvm) Instruction selection: 12 
>
> The ladyvm JVM paper noted the same problem: 
> http://vmkit.llvm.org/ladyvm.html
>
> So I did a quick experiment with llvm and lcc
(http://www.cs.princeton.edu/software/lcc),
> in order to see how much faster instruction selection could be. 
> lcc uses a BURG-type (called lburg) instruction-selector.
>
> The following is for x86/linux (ubuntu)
> I am interested in JIT performance so I have only counted user+sys time.
>
> Compiling a small test program (the dhrystone benchmark):
> lcc (Fraser and Hanson) (after pre-processing): 4ms.
> llc                                           : 16ms.
> Incidentally, optimisation is respectably fast
> opt -O3                                       : 16ms.
>
> (My machine is quite slow)
>
> Using -time-passes shows that almost all of the time spent by llc is in the
DAG-to-DAG
> instruction selection.
>
> lcc does lexing, parsing, type-checking, IR code-generation, register
allocation
> AND final code generation in much less time that llvm spends doing
instruction selection.
> Also, there is no way that instruction selection should take as long as -O3
optimisation.
>
> It would seem that lcc's lburg instruction selector is at least ten
times faster than
> than llvm, possibly 20 times faster.
>
> Does a fast instruction selector for llvm exist?
>
> A BURG based instruction selector would improve the JIT compilation
capabilities of llvm hugely,
> and benefit the static compiler as well.
> A fast compiler is really useful in the compile-test-debug cycle.
>
> Sadly, I do not have the time to implement such an instruction selector
myself.
>
> Cheers,
> Mark.
>   
> ------------------------------------------------------------------------
>
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>

Dan Gohman

2008-Dec-15 18:39 UTC

head link

[LLVMdev] A faster instruction selector?

On Dec 15, 2008, at 9:14 AM, Mark Shannon wrote:> I discovered that JIT compilation is taking much longer than had  
> hoped.
> Some profiling showed that is almost entirely due
> to the llvm instruction selector (The register allocator and  
> optimisation passes are fast)
>
> The split in execution time for JIT compilation being approximately:
> (My) IR code generation:      1
> (llvm) Register allocation:   1
> (llvm) Instruction selection: 12
Interesting. What version of LLVM are you using here?
>
>
> The ladyvm JVM paper noted the same problem:
> http://vmkit.llvm.org/ladyvm.html
>
> So I did a quick experiment with llvm and lcc
(http://www.cs.princeton.edu/software/lcc
> ),
> in order to see how much faster instruction selection could be.
> lcc uses a BURG-type (called lburg) instruction-selector.
>
> The following is for x86/linux (ubuntu)
> I am interested in JIT performance so I have only counted user+sys  
> time.
>
> Compiling a small test program (the dhrystone benchmark):
> lcc (Fraser and Hanson) (after pre-processing): 4ms.
> llc                                           : 16ms.
> Incidentally, optimisation is respectably fast
> opt -O3                                       : 16ms.
>
> (My machine is quite slow)
>
> Using -time-passes shows that almost all of the time spent by llc is  
> in the DAG-to-DAG
> instruction selection.
>
> lcc does lexing, parsing, type-checking, IR code-generation,  
> register allocation
> AND final code generation in much less time that llvm spends doing  
> instruction selection.
> Also, there is no way that instruction selection should take as long  
> as -O3 optimisation.
Note that these proportions can vary greatly depending on the nature  
of the
input, and dhrystone is a very small amount of code.
>
>
> It would seem that lcc's lburg instruction selector is at least ten  
> times faster than
> than llvm, possibly 20 times faster.
>
> Does a fast instruction selector for llvm exist?
The -fast option directs llc to generate code quickly, at the expense
of the quality of the generated code.  If your JIT is designed such that
JIT compilation time is the dominating factor, this can be a very
valuable tradeoff.  A lot of work was done in this area for
LLVM 2.4.
>
>
> A BURG based instruction selector would improve the JIT compilation  
> capabilities of llvm hugely,
> and benefit the static compiler as well.
This may be true, though in the data I've seen currently the selection
algorithm itself isn't the most significant bottleneck.

FWIW, what -time-passes calls "instruction selection" includes a lot  
more
work than what its name suggests. A fair amount of pre-selection  
optimization
and lowering happens during this phase, as well as scheduling.
> A fast compiler is really useful in the compile-test-debug cycle.
Yep.
>
>
> Sadly, I do not have the time to implement such an instruction  
> selector myself.
Do you have time to do more detailed profiling? It might be interesting
to see which parts of codegen are hot in your use cases.

Dan

Reasonably Related Threads

Search for more apparently analagous threads

llvm dev - Dec 2008 - [LLVMdev] A faster instruction selector?

[LLVMdev] A faster instruction selector?

[LLVMdev] A faster instruction selector?

[LLVMdev] A faster instruction selector?

Reasonably Related Threads