On Nov 19, 2009, at 2:10 PM, Jon Harrop wrote:

> On Thursday 19 November 2009 19:48:18 Owen Anderson wrote:
>> On Nov 19, 2009, at 10:25 AM, Jon Harrop wrote:
>>>> In this case, the assertion that LLVM is slow is correct: it's
>>>> definitely slower than a non-optimizing compiler.
>>>
>>> I'm *very* surprised by this and will test it myself...
>
> I've tested it and LLVM is indeed 2x slower to compile, although it
> generates code that is 2x faster to run...
>
>> Compared to a compiler in the same category as PCC, whose pinnacle of
>> optimization is doing register allocation? I'm not surprised at all.
>
> What else does LLVM do with optimizations turned off that makes it
> slower?

I haven't looked at Go at all, but in general, there is a significant overhead to creating a compiler intermediate representation. If you produce assembly code straight out of the parser, you can compile faster. Even though LLVM does little optimization at -O0, there is still a fair amount of work involved in translating to LLVM IR.
Bob Wilson <bob.wilson at apple.com> writes:

>> What else does LLVM do with optimizations turned off that makes it
>> slower?
>
> I haven't looked at Go at all, but in general, there is a significant
> overhead to creating a compiler intermediate representation. If you
> produce assembly code straight out of the parser, you can compile
> faster.

Not for me. Producing LLVM IR is *very* fast, faster than producing poorly optimized assembler code (I know, I do both things).

> Even though LLVM does little optimization at -O0, there is still a
> fair amount of work involved in translating to LLVM IR.

As I said in my previous message, the most significant part of the work is in generating native code from the LLVM IR.

--
Óscar
>> Even though LLVM does little optimization at -O0, there is still a
>> fair amount of work involved in translating to LLVM IR.
>
> As said on my previous message, the most significant part of the work is
> in generating native code from the LLVM IR.

And register allocation. That said, John's email sums it up rather well.

-eric
On Nov 19, 2009, at 1:04 PM, Bob Wilson wrote:

>> I've tested it and LLVM is indeed 2x slower to compile, although it
>> generates code that is 2x faster to run...
>>
>>> Compared to a compiler in the same category as PCC, whose pinnacle of
>>> optimization is doing register allocation? I'm not surprised at all.
>>
>> What else does LLVM do with optimizations turned off that makes it
>> slower?
>
> I haven't looked at Go at all, but in general, there is a significant
> overhead to creating a compiler intermediate representation. If you
> produce assembly code straight out of the parser, you can compile
> faster.

Right. Another common comparison is between clang and TCC. TCC generates terrible code, but it is a great example of a one-pass compiler that doesn't even build an AST. Generating code as you parse will be much, much faster than building an AST, then generating LLVM IR, then generating assembly from that. On X86 at -O0, we use FastISel, which avoids creating the SelectionDAG intermediate representation in most cases (it fast-paths LLVM IR -> MachineInstrs, instead of going IR -> SelectionDAG -> MachineInstrs).

I'm still really interested in making Clang (and thus LLVM) faster at -O0 (while still preserving debuggability, of course). One way to do this (which would be a disaster and not worth it) would be to implement a new X86 backend directly translating from Clang ASTs, or something like that. However, this would obviously lose all of the portability benefits that LLVM IR provides.

That said, there is a lot that we can do to make the compiler faster at -O0. FastISel could be improved in several dimensions, including going bottom-up instead of top-down (eliminating the need for the 'dead instruction elimination' pass), integrating simple register allocation into it for the common case of single-use instructions, etc. Another good way to speed up -O0 codegen is to avoid generating as much horrible code in the frontend that the optimizer (which isn't run at -O0) is expected to clean up.

-Chris
Hi there,

Just my tuppence worth: I for one would love it if the code-gen pass were quicker. It makes LLVM even more appealing for JIT compilers.

One approach might be to generate code directly from the standard IR, rather than create yet another IR (the instruction DAG). Would it be possible to convert the standard IR DAG to a forest of trees with a simple linear pass, either before or after register allocation, then use a BURG code generator on the trees? BURG selectors are both fast and optimal (in theory, assuming all instructions can be given a cost and ignoring scheduling issues).

Chris Lattner wrote:

> Right. Another common comparison is between clang and TCC. TCC generates
> terrible code, but it is a great example of a one pass compiler that
> doesn't even build an AST. Generating code as you parse will be much much
> much faster than building an AST, then generating llvm ir, then generating
> assembly from it. On X86 at -O0, we use FastISel which avoids creating the
> SelectionDAG intermediate representation in most cases (it fast paths LLVM
> IR -> MachineInstrs, instead of going IR -> SelectionDAG -> MachineInstrs).
>
> I'm still really interested in making Clang (and thus LLVM) faster at -O0
> (while still preserving debuggability of course). One way to do this
> (which would be a disaster and not worth it) would be to implement a new
> X86 backend directly translating from Clang ASTs or something like that.
> However, this would obviously lose all of the portability benefits that
> LLVM IR provides.
>
> That said, there is a lot that we can do to make the compiler faster at
> O0. FastISel could be improved in several dimensions, including going
> bottom-up instead of top-down (eliminating the need for the 'dead
> instruction elimination pass'), integrating simple register allocation
> into it for the common case of single-use instructions, etc. Another good
> way to speed up O0 codegen is to avoid generating as much horrible code in
> the frontend that the optimizer (which isn't run at O0) is expected to
> clean up.
>
> -Chris
Chris Lattner writes:

> I'm still really interested in making Clang (and thus LLVM) faster at
> -O0 (while still preserving debuggability of course).

Why?

Arnt
On Saturday 21 November 2009 14:27:15 Chris Lattner wrote:

> Right. Another common comparison is between clang and TCC. TCC generates
> terrible code, but it is a great example of a one pass compiler that
> doesn't even build an AST. Generating code as you parse will be much much
> much faster than building an AST, then generating llvm ir, then generating
> assembly from it. On X86 at -O0, we use FastISel which avoids creating the
> SelectionDAG intermediate representation in most cases (it fast paths LLVM
> IR -> MachineInstrs, instead of going IR -> SelectionDAG -> MachineInstrs).

I found LLVM was 2x slower than Go at a simple 10,000-Fibonacci-functions test. Do you have any data on Clang vs TCC compilation speed?

> I'm still really interested in making Clang (and thus LLVM) faster at -O0
> (while still preserving debuggability of course). One way to do this
> (which would be a disaster and not worth it) would be to implement a new
> X86 backend directly translating from Clang ASTs or something like that.
> However, this would obviously lose all of the portability benefits that
> LLVM IR provides.

That sounds like a lot of work for relatively little gain.

> That said, there is a lot that we can do to make the compiler faster at
> O0. FastISel could be improved in several dimensions, including going
> bottom-up instead of top-down (eliminating the need for the 'dead
> instruction elimination pass'), integrating simple register allocation
> into it for the common case of single-use instructions, etc. Another good
> way to speed up O0 codegen is to avoid generating as much horrible code in
> the frontend that the optimizer (which isn't run at O0) is expected to
> clean up.

HLVM generates quite sane and efficient IR directly, and I've been more than happy with LLVM's JIT compilation times when using it interactively from a REPL. So I'm not sure that LLVM is so slow as to make it worth ploughing much effort into optimizing compilation times. If you want to go down that route then I'd certainly start with higher-level optimizations like memoizing previous compilations and reusing them. What about parallelization?

--
Dr Jon Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/?e