Hi, I have a question about x86 code quality.

I have run a few benchmarks and compared the running time of executables created by LLVM to executables created by gcc.

It appears that code generated by LLVM is 1.5x to 3x slower than code generated by gcc for the x86.

For some of the benchmarks the linear scan regalloc works. When it does, results are in the 1.0x to 1.5x range. Unfortunately, the linear scan allocator breaks on most of my code.

Question:
1) Do my observations fit your general experience?

I haven't looked into the details of the generated x86 code. I have the following observation, though:

When using gcc as a backend (compiling to the 'c' target and then recompiling with gcc) results are generally a lot better than just using the LLVM->x86 backend. This indicates that the performance difference is mostly located in the LLVM->x86 backend. Further, for those of my codes where the new allocator works, results are much better. Whether this is due to the allocator, or some interaction between it and code generation, I do not know.

Currently, I am just playing with LLVM, but the long-term plan is to build a new backend for a new machine. It won't be register starved as the x86 is.

Question:
2) Is there a similar performance differential between LLVM->sparc and gcc on sparc, or are they much closer because the sparc has more registers and thus should be less dependent on good register allocation?
3) What is the expected timeframe for the new regalloc to become stable?
.. or perhaps I should ask a more general question: what is the perceived status in terms of performance for the two compiler backends and for the compiler backend part of the infrastructure?

Finally, I think LLVM looks *very* nice and appears to be a substantial contribution to the world of open source compiler infrastructure.

Best regards, and thanks in advance,
/Finn
On Wed, 21 Apr 2004, Finn S Andersen wrote:
> Hi, I have a question about x86 code quality.
>
> I have run a few benchmarks and compared the
> running time of executables created by LLVM to
> executables created by gcc.
>
> It appears that code generated by LLVM is x1.5 - x3
> times slower than code generated by gcc, for the x86
>
> For some of the benchmarks the linear scan regalloc
> works. When it does, results are in the x1.0 - 1.5
> range. Unfortunately, the linear scan allocator breaks
> on most of my code.
>
> Question:
> 1) Do my observations fit your general experience ?

Yes, they do. I assume you are working with LLVM 1.2?

> I haven't looked into the details of the generated
> x86 code. I have the following observation, though:
>
> When using gcc as a backend (compiling to the 'c' target
> and then recompiling with gcc) results are generally a lot
> better than just using the LLVM->x86 backend. This
> indicates that the performance difference is mostly
> located to the LLVM->x86 backend. Further, for those
> of my codes where the new allocator works, results are
> much better. Whether this is due to the allocator, or
> some interaction between it and cogen, I do not know.

The LLVM 1.2 X86 code quality problems are due to a couple of serious issues.

1. The default register allocator is a purely local algorithm, which cannot hold (e.g.) the counter of a loop in a register across the loop. This is *clearly* bad, and switching to the new allocator obviously makes a big difference :)

2. Even with the new allocator, we are not able to globally allocate floating point registers (yet), due to some interaction with the X86 floating point stack. This is just something that needs to be worked on, but unfortunately no one has had time to do the work recently.

3. When compiling with the native X86 backend, very little additional optimization is performed. When compiling with the C backend & GCC, GCC does its own optimizations that can make a big difference.
For example, LLVM 1.2 could only index into arrays with 64-bit integers (the getelementptr only accepted a 'long' operand). This could cause huge performance problems on the X86, which the GCC optimizer happily stomped out. (This issue has been fixed in LLVM CVS: http://llvm.cs.uiuc.edu/PR309)

4. In LLVM 1.2, several LLVM->LLVM optimizations were doing very obviously silly things, and have subsequently been fixed. See the "1.3" release notes for information: http://llvm.cs.uiuc.edu/docs/ReleaseNotes.html

5. One of our goals for LLVM 1.3 is to get one of the scalable pointer analyses that I have been working on turned on by default in the optimizing linker. This should have a pretty noticeable performance impact.

> Currently, I am just playing with LLVM, but the longterm
> plan is to build a new backend for a new machine. It won't
> be register starved as the x86 is.

Of the above, #1 would directly affect your target, #2 is X86 specific, #3 would have affected your target if it's 32-bit or smaller, #4 would have hurt your target, and #5 will almost certainly help your target.

> Question:
> 2) Is there a similar performance differential between
> LLVM->sparc and gcc on sparc, or are they much closer
> because the sparc has more registers and thus should
> be less dependent on good register allocation ?

I truly have no idea. I don't use the Sparc target very much, and I don't know if anyone has looked into the actual performance of it. One of the problems is that the LLVM Sparc backend doesn't share much code with the target-independent code generator, so it's very hard to compare. Our long-term goal is to merge the Sparc code generator into the target-independent code paths.

> 3) What is the expected timeframe for the new regalloc to
> become stable ?

I am hoping/planning for the new allocator to be in LLVM 1.3 as the default allocator.
From what I understand there is one bug left related to spill code insertion, but Alkis has been very busy with other projects (it's nearing the end of the semester already :). If he doesn't get to it by 1.3, I will.

> .. or perhaps I should make a more general
> question: what is the perceived status in terms of performance
> for the two compiler backends and for the compiler backend
> part of the infrastructure ?

At this point we haven't actually spent a lot of time evaluating and measuring code quality. In fact, if you notice a piece of code that is not being optimized or code generated well, please file a bug (with a suggestion on what the code should have been compiled to). Generally we separate optimizations into the categories of LLVM->LLVM or codegen optimizations, but both are important.

> Finally I think LLVM looks *very* nice and appears to be a substantial
> contribution to the world of open source compiler infrastructure.

Thanks! If you have any more questions, please feel free to ask.

-Chris

--
http://llvm.cs.uiuc.edu/
http://www.nondot.org/~sabre/Projects/
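As a small illustration of point 3 in the reply above (array indexing with 64-bit integers on a 32-bit target), the following C++ sketch shows the same loop written with a 64-bit and a 32-bit index. The function names `sumWide` and `sumNarrow` are hypothetical and appear nowhere in LLVM. On a 32-bit x86, a 64-bit index generally needs a register pair and multi-instruction arithmetic unless an optimizer narrows it, which is presumably the kind of overhead GCC was removing; the sketch only demonstrates that the two forms compute the same result, not any particular code generation.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical example: the loop index is a 64-bit integer, similar in
// spirit to what the LLVM 1.2 getelementptr required ('long' operand).
int sumWide(const int* data, std::int64_t n) {
    assert(n >= 0);
    int total = 0;
    for (std::int64_t i = 0; i < n; ++i) {
        total += data[i];  // 64-bit index arithmetic on every iteration
    }
    return total;
}

// The same loop with a 32-bit index, which fits in a single register
// on a 32-bit x86.
int sumNarrow(const int* data, std::int32_t n) {
    assert(n >= 0);
    int total = 0;
    for (std::int32_t i = 0; i < n; ++i) {
        total += data[i];
    }
    return total;
}
```

Both functions return the same sum for any array that fits in 32-bit indexing, so narrowing the index is a safe transformation in that case.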
On Wed, Apr 21, 2004 at 11:01:48AM +0200, Finn S Andersen wrote:
> For some of the benchmarks the linear scan regalloc
> works. When it does, results are in the x1.0 - 1.5
> range. Unfortunately, the linear scan allocator breaks
> on most of my code.

Is there a chance you can try cvs? I would be interested to get a simplified test case where the allocator breaks. A lot of improvements went into the x86 backend since 1.2, and we currently have no test cases where the allocator breaks.

> Currently, I am just playing with LLVM, but the longterm
> plan is to build a new backend for a new machine. It won't
> be register starved as the x86 is.

It would be very interesting to see the performance difference between the linear scan and local allocators on a machine that is less spill happy than the x86. In that case I expect to see a much bigger difference between the two.

> 3) What is the expected timeframe for the new regalloc to
> become stable ? .. or perhaps I should make a more general
> question: what is the perceived status in terms of performance
> for the two compiler backends and for the compiler backend
> part of the infrastructure ?

As Chris said, I have been held back by other projects. I hope that right after finals I will have some time to fix the regression in the linear scan register allocator. There are some improvements I have in mind as well, so expect the linear scan register allocator to be much better in 1.3.

--
Alkis
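Since the thread keeps referring to the linear scan allocator and to "spill happy" machines, here is a minimal sketch of the textbook linear scan idea: walk live intervals in order of start point, hand out free registers, and spill when none are left. This is not the code in RegAllocLinearScan.cpp. The names `Interval`, `Allocation`, and `linearScan` are hypothetical, virtual registers are assumed to be numbered 0..n-1, and spilling the interval that ends last is one common heuristic, assumed here for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical live interval: virtual register 'vreg' is live over [start, end].
struct Interval {
    int vreg;
    int start;
    int end;
};

struct Allocation {
    std::vector<int> reg;  // reg[v] = physical register index, or -1 if spilled
    int spills = 0;        // number of intervals that ended up in memory
};

// Textbook-style linear scan over intervals, with 'numRegs' physical registers.
Allocation linearScan(std::vector<Interval> intervals, int numRegs) {
    assert(numRegs > 0);
    std::sort(intervals.begin(), intervals.end(),
              [](const Interval& a, const Interval& b) { return a.start < b.start; });

    Allocation out;
    out.reg.assign(intervals.size(), -1);

    std::vector<int> freeRegs;
    for (int r = numRegs - 1; r >= 0; --r) freeRegs.push_back(r);

    std::vector<Interval> active;  // intervals currently holding a register

    for (const Interval& cur : intervals) {
        assert(cur.vreg >= 0 && static_cast<std::size_t>(cur.vreg) < out.reg.size());

        // Expire intervals that ended before the current one starts.
        for (auto it = active.begin(); it != active.end();) {
            if (it->end < cur.start) {
                freeRegs.push_back(out.reg[it->vreg]);
                it = active.erase(it);
            } else {
                ++it;
            }
        }

        if (!freeRegs.empty()) {
            out.reg[cur.vreg] = freeRegs.back();
            freeRegs.pop_back();
            active.push_back(cur);
            continue;
        }

        // No free register: spill whichever interval ends last.
        auto victim = std::max_element(
            active.begin(), active.end(),
            [](const Interval& a, const Interval& b) { return a.end < b.end; });
        if (victim->end > cur.end) {
            out.reg[cur.vreg] = out.reg[victim->vreg];  // steal the register
            out.reg[victim->vreg] = -1;                 // victim goes to memory
            *victim = cur;
        }  // otherwise 'cur' itself stays spilled (reg remains -1)
        ++out.spills;
    }
    return out;
}
```

For the four intervals v0=[0,10], v1=[1,3], v2=[2,8], v3=[4,6], this sketch spills two intervals with one register, one with two registers, and none with three, which is the sense in which a target with more registers is less "spill happy".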
Alkis Evlogimenos wrote:
> On Wed, Apr 21, 2004 at 11:01:48AM +0200, Finn S Andersen wrote:
>
>> For some of the benchmarks the linear scan regalloc
>> works. When it does, results are in the x1.0 - 1.5
>> range. Unfortunately, the linear scan allocator breaks
>> on most of my code.
>
> Is there a chance you can try cvs? I would be interested to
> get a simplified test case where the allocator breaks. A lot of
> improvements went into the x86 backend since 1.2 and we currently have
> no test cases where the allocator breaks today.

I would, if I could. However, it seems that there have been a lot of changes since release 1.2. The cvsweb interface only allows me to download one file at a time. I have grabbed "llvm/lib/CodeGen/RegAllocLinearScan.cpp" and run make and make install, but the problem is still there. The error message says:

lli: /home/finna/llvm/llvm/include/llvm/Target/MRegisterInfo.h:144: static bool llvm::MRegisterInfo::isPhysicalRegister(unsigned int): Assertion `Reg && "this is not a register!"' failed.

But trying cvsweb I cannot locate the file mentioned above. I guess you have removed it, which likely means there are many files I should update. But cvsweb is not the right interface for that.

How do I proceed?

Best regards
/Finn
Alkis Evlogimenos wrote:
> Is there a chance you can try cvs? I would be interested to
> get a simplified test case where the allocator breaks. A lot of
> improvements went into the x86 backend since 1.2 and we currently have
> no test cases where the allocator breaks today.

I updated and recompiled and the error is still there.

It turns out that I cannot use the bugpoint utility to narrow down the error, because it is not a miscompilation and it is not a compiler pass. It is a code generation pass, and to provoke it I need to pass -regalloc=linearscan to llc or lli, but the bugpoint utility does not support that.

I attach a small bytecode file that triggers the bug. My apologies for trying to submit a bug through email to this list, but there appears to be some problem with bugzilla. Although I have opened an account, registered a password and confirmed it through mail, I am still rejected by bugzilla when I try to log in.

I hope you can use the attached bc to narrow down the bug. Thanks a lot for any help.

----------------------
The message I get when running the attached bc:

[finna@coplin11 fft]$ lli -regalloc=linearscan a.out.bc
lli: /home/finna/llvm/llvm/include/llvm/Target/MRegisterInfo.h:144: static bool llvm::MRegisterInfo::isPhysicalRegister(unsigned int): Assertion `Reg && "this is not a register!"' failed.
lli[0x849e768]
lli[0x849e974]
/lib/tls/libc.so.6[0x420277b8]
/lib/tls/libc.so.6(abort+0x1d5)[0x42028c55]
/lib/tls/libc.so.6[0x42021043]
lli(llvm::MRegisterInfo::isPhysicalRegister(unsigned)+0x25)[0x8314863]
lli((anonymous namespace)::RA::assignRegOrStackSlotAtInterval(llvm::LiveIntervals::Interval*)+0x844)[0x8307d16]
lli((anonymous namespace)::RA::linearScan()+0x3c1)[0x8306aff]
lli((anonymous namespace)::RA::runOnMachineFunction(llvm::MachineFunction&)+0x1d7)[0x83066f1]
lli(llvm::MachineFunctionPass::runOnFunction(llvm::Function&)+0x28)[0x83144fe]
lli(llvm::PassManagerTraits<llvm::Function>::runPass(llvm::FunctionPass*, llvm::Function*)+0x1b)[0x84786d5]
lli(llvm::PassManagerT<llvm::Function>::runOnUnit(llvm::Function*)+0x5e4)[0x8473ba0]
lli(llvm::PassManagerTraits<llvm::Function>::runOnFunction(llvm::Function&)+0x1b)[0x8471a1b]
lli(llvm::FunctionPass::run(llvm::Function&)+0x6b)[0x84259c3]
lli(llvm::FunctionPassManager::run(llvm::Function&)+0x34)[0x8424f16]
lli(llvm::JIT::runJITOnFunction(llvm::Function*)+0x3e)[0x82f4062]
lli(llvm::JIT::getPointerToFunction(llvm::Function*)+0xf3)[0x82f41b5]
lli(llvm::JIT::runFunction(llvm::Function*, std::vector<llvm::GenericValue, std::allocator<llvm::GenericValue> > const&)+0x5c)[0x82f3f1a]
lli(llvm::ExecutionEngine::runFunctionAsMain(llvm::Function*, std::vector<std::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::basic_string<char, std::char_traits<char>, std::allocator<char> > > > const&, char const* const*)+0x25b)[0x8338111]
lli(main+0x23c)[0x82db5dc]
/lib/tls/libc.so.6(__libc_start_main+0xe4)[0x42015704]
lli(dlopen+0x41)[0x82db311]
Abort (core dumped)

-------------- next part --------------
A non-text attachment was scrubbed...
Name: a.out.bc
Type: application/octet-stream
Size: 1092 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20040426/6d0bca0f/attachment.obj>