thr3ads.net - llvm dev - [LLVMdev] selection dag speedups / llc speedups [May 2010]


Rafael Espindola

2010-May-18 04:09 UTC

[LLVMdev] selection dag speedups / llc speedups

> The fast and local register allocators are meant to be used on unoptimized
> code, a 'Debug build'. While they do work on optimized code, they do not
> give good results. Their primary goal is compile time, not code quality.

Yes, we have a somewhat uncommon use case. It is fine to spend time
optimizing bitcode (LTO is OK), but we want to make the final IL ->
executable translation as fast as possible.

> /jakob
Cheers,
-- 
Rafael Ávila de Espíndola

Jakob Stoklund Olesen

2010-May-18 04:33 UTC

[LLVMdev] selection dag speedups / llc speedups

On May 17, 2010, at 9:09 PM, Rafael Espindola wrote:
>> The fast and local register allocators are meant to be used on
>> unoptimized code, a 'Debug build'. While they do work on optimized code,
>> they do not give good results. Their primary goal is compile time, not code
>> quality.
>
> Yes, we have a somewhat uncommon use case. It is fine to spend time
> optimizing bitcode (LTO is OK), but we want to make the final IL ->
> executable translation as fast as possible.

Do you know how the fast allocator performs in these conditions? Have you
compared it to the local allocator? I really focused my efforts on unoptimized
code.

/jakob

Jan Voung

2010-May-18 19:07 UTC

[LLVMdev] selection dag speedups / llc speedups

Here are some recent stats for the fast, local, and linear scan allocators
at -O0 on bitcode files processed with "opt -std-compile-opts". The fast
regalloc is still clearly faster at codegen than local on such bitcode
files. Let me know if the link doesn't work:

https://spreadsheets.google.com/a/google.com/ccc?key=0At5EJFcCBf-wdDgtd2FoZjU4bFBzcFBtT25rQkgzMEE&hl=en

Misc stuff: I ran into an "UNREACHABLE executed" using linear scan on
revision 104021, so I used an older version for that.

0  llc.hg          0x0000000000af4d7f
1  llc.hg          0x0000000000af54fa
2  libpthread.so.0 0x00007fb1734b67d0
3  libc.so.6       0x00007fb1725d2095 gsignal + 53
4  libc.so.6       0x00007fb1725d3af0 abort + 272
5  llc.hg          0x0000000000ad4932 llvm::llvm_unreachable_internal(char const*, char const*, unsigned int) + 370
6  llc.hg          0x0000000000886426 llvm::LiveIntervals::handleVirtualRegisterDef(llvm::MachineBasicBlock*, llvm::ilist_iterator<llvm::MachineInstr>, llvm::SlotIndex, llvm::MachineOperand&, unsigned int, llvm::LiveInterval&) + 3910
7  llc.hg          0x0000000000888429 llvm::LiveIntervals::handleRegisterDef(llvm::MachineBasicBlock*, llvm::ilist_iterator<llvm::MachineInstr>, llvm::SlotIndex, llvm::MachineOperand&, unsigned int) + 409
8  llc.hg          0x000000000088ade0 llvm::LiveIntervals::computeIntervals() + 2496
9  llc.hg          0x000000000088b56f llvm::LiveIntervals::runOnMachineFunction(llvm::MachineFunction&) + 447
10 llc.hg          0x00000000007b3493 llvm::MachineFunctionPass::runOnFunction(llvm::Function&) + 115
11 llc.hg          0x0000000000a79ec0 llvm::FPPassManager::runOnFunction(llvm::Function&) + 688
12 llc.hg          0x0000000000a79f13 llvm::FPPassManager::runOnModule(llvm::Module&) + 67
13 llc.hg          0x0000000000a79a63 llvm::MPPassManager::runOnModule(llvm::Module&) + 515
14 llc.hg          0x0000000000a79b42 llvm::PassManagerImpl::run(llvm::Module&) + 114
15 llc.hg          0x0000000000a79bdd llvm::PassManager::run(llvm::Module&) + 13
16 llc.hg          0x00000000004d3112 main + 2802
17 libc.so.6       0x00007fb1725be1c4 __libc_start_main + 244
18 llc.hg          0x00000000004d0b09


Stack dump:
0.      Program arguments: llc.hg -asm-verbose=false -O0 32/403.gcc/403.gcc.linked.bc -o 32/403.gcc/output/403.gcc.linked.bc.llc_O0.s
1.      Running pass 'Function Pass Manager' on module '32/403.gcc/403.gcc.linked.bc'.
2.      Running pass 'Live Interval Analysis' on function '@nonlocal_mentioned_p'

- Jan
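
For anyone wanting to repeat a comparison like the one described above, here
is a minimal sketch of a timing harness. It is not the script used to produce
the spreadsheet. It assumes an llc of roughly this vintage that accepts
-regalloc=fast, -regalloc=local and -regalloc=linearscan (newer llc builds
offer a different set of allocators), and the helper names
build_llc_command, time_llc and compare_allocators are made up for
illustration. The input is expected to be bitcode already run through
"opt -std-compile-opts".

```python
import subprocess
import time

# Assumption: allocator names accepted by llc's -regalloc option around
# May 2010. Adjust this tuple for the llc build actually being measured.
ALLOCATORS = ("fast", "local", "linearscan")


def build_llc_command(llc, allocator, bitcode, output):
    """Return the argv list for one llc run at -O0 with the given allocator."""
    return [
        llc,
        "-asm-verbose=false",
        "-O0",
        "-regalloc=" + allocator,
        bitcode,
        "-o",
        output,
    ]


def time_llc(llc, allocator, bitcode, output):
    """Run llc once and return wall-clock seconds; raises if llc fails."""
    cmd = build_llc_command(llc, allocator, bitcode, output)
    start = time.perf_counter()
    subprocess.run(cmd, check=True)
    return time.perf_counter() - start


def compare_allocators(llc, bitcode):
    """Time one llc run per allocator and return {allocator: seconds}."""
    results = {}
    for allocator in ALLOCATORS:
        output = bitcode + "." + allocator + ".s"
        results[allocator] = time_llc(llc, allocator, bitcode, output)
    return results

# Example (hypothetical paths):
#   compare_allocators("llc", "403.gcc.linked.bc")
```

A single wall-clock run per allocator is noisy, so repeating each run a few
times and taking the minimum gives steadier numbers. A crash such as the
"UNREACHABLE executed" above surfaces here as a CalledProcessError.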

