Jakob Stoklund Olesen
2010-Jun-03 00:53 UTC
[LLVMdev] Heads up: Local register allocator going away
I just changed the default register allocator for -O0 builds to the fast allocator. This means that the local register allocator is no longer used, and since it does more or less the same as the fast allocator, there is no reason to keep it around.

I am going to delete it in a week or two.

If you are using the local register allocator, please try switching to the fast allocator and report any bugs you find.

Thanks,
/jakob
Kalle.Raiskila at nokia.com
2010-Jun-04 08:57 UTC
[LLVMdev] Heads up: Local register allocator going away
On Thu, 2010-06-03 at 02:53 +0200, Jakob Stoklund Olesen wrote:
> If you are using the local register allocator, please try switching to the fast allocator and report any bugs you find.

Tried it, and it seems to break quite a big chunk of our tests on SPU :)

Before r103488 ("Mostly rewrite RegAllocFast") there was no problem. But with r103488, I get:

    llvm/lib/CodeGen/RegisterScavenging.cpp:196: void llvm::RegScavenger::forward(): Assertion `SubUsed && "Using an undefined register!"' failed.

In r103685 ("More asserts around physreg uses") the error changed to:

    llvm/lib/CodeGen/RegAllocFast.cpp:629: void<unnamed>::RAFast::AllocateBasicBlock(llvm::MachineBasicBlock&): Assertion `PhysRegState[Reg] <= regReserved && "Using clobbered physreg"' failed.

And with latest it is now:

    Instruction uses an allocated register
    UNREACHABLE executed at /home/kraiskil/llvm/svn-clean/llvm/lib/CodeGen/RegAllocFast.cpp:302!

This is probably in the SPU backend, as all (most) other backends compile the example just fine? Where do I start to look if I want to fix this? I can file a PR if this is not in the SPU backend.

kalle

P.S. This is a simplification of programs that crash:

    declare [8 x [8 x float]]* @extFunc()

    define void @testFunc() {
      %sl8_5 = tail call [8 x [8 x float]]* @extFunc()
      br label %Entry

    Entry:
      %idx = phi i64 [ 0, %0 ], [ %next, %Entry ]
      %scevgep = getelementptr [8 x [8 x float]]* %sl8_5, i64 0, i64 %idx, i64 0
      %next = add i64 %idx, 1
      %exitcond = icmp eq i64 %next, 8
      br i1 %exitcond, label %Exit, label %Entry

    Exit:
      ret void
    }

(Sorry for the cluttered simplification: removing any of the call, getelementptr or loop removes the llc crash)
Jakob Stoklund Olesen
2010-Jun-04 18:05 UTC
[LLVMdev] Heads up: Local register allocator going away
On Jun 4, 2010, at 1:57 AM, <Kalle.Raiskila at nokia.com> wrote:
> On Thu, 2010-06-03 at 02:53 +0200, Jakob Stoklund Olesen wrote:
>> If you are using the local register allocator, please try switching to the fast allocator and report any bugs you find.
>
> Tried it, and it seems to break quite a big chunk of our tests on SPU :)

Thanks for testing it!

[...]

> And with latest it is now:
> Instruction uses an allocated register
> UNREACHABLE executed
> at /home/kraiskil/llvm/svn-clean/llvm/lib/CodeGen/RegAllocFast.cpp:302!

This is RegAllocFast's way of saying "Oops, I clobbered the return value from your CALL. Didn't think you would need it."

The problem is this code:

    BB#0: derived from LLVM BB %0
        BRASL <ga:@extFunc>, %R0<imp-def>, %R1<imp-def>, %R3<imp-def>, %R0<imp-use>, ...
        %reg1028<def> = ILv4i32 0
        %reg1027<def> = ORi64_v2i64 %reg1028
        ADJCALLSTACKUP 0, %R1<imp-def>, %R1<imp-use>
        %reg1029<def> = LRr32 %R3

The return value from the call is in %R3, but %reg1027 and %reg1028 are also allocated to %R3 before it is copied to a safe place (%reg1029).

RegAllocFast does not distinguish between call-clobbered registers and return value registers. They are all considered 'free' after the call. It expects return value physregs to be copied to virtregs immediately after the call instruction, so they are not clobbered. This is a bit risky, but it works when the return value CopyFromReg is scheduled immediately following the call.

So the question is: What are those ILv4i32 and ORi64_v2i64 doing inside the call sequence? Can you get rid of them?

If you look at the scheduler DAG for this BB (llc -view-sched-dags), you can see that the return value CopyFromReg is tied to ADJCALLSTACKUP with a flag, but there is no flag from ADJCALLSTACKUP to BRASL. This allows the scheduler to put other instructions in the gap, and you don't want that.
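The clobbering described above can be reproduced with a small toy model. This is only a sketch, not RegAllocFast itself: the `allocate` function, the `v`/`R` naming scheme, and the second free register `R4` are assumptions made for illustration. It treats every call-clobbered register as free after the call and reports an error when an instruction reads a physical register that a virtual register already occupies.

```python
def allocate(block, call_clobbered=("R3", "R4")):
    """Greedy one-pass allocation over a single basic block (toy model).

    Each instruction is (name, defs, uses). Names starting with "v" are
    virtual registers, names starting with "R" are physical registers.
    "R4" is a hypothetical second free register, not taken from the dump.
    """
    # Index of the last instruction that reads each virtual register.
    last_use = {}
    for i, (_, _, uses) in enumerate(block):
        for u in uses:
            if u.startswith("v"):
                last_use[u] = i

    free = set()    # physregs available for virtual registers
    owner = {}      # physreg -> virtreg currently held in it
    mapping = {}    # virtreg -> physreg
    for i, (name, defs, uses) in enumerate(block):
        # Reading a physreg that now holds a virtreg means it was clobbered.
        for u in uses:
            if u.startswith("R") and u in owner:
                raise RuntimeError(
                    f"{name} uses allocated register {u} (holds {owner[u]})")
        # Release virtregs whose last read is this instruction.
        for u in set(uses):
            if u.startswith("v") and last_use[u] == i:
                preg = mapping[u]
                del owner[preg]
                free.add(preg)
        if name == "CALL":
            # Model assumption: all call-clobbered registers, including the
            # return value register, are considered free after the call.
            free = set(call_clobbered)
            owner.clear()
        for d in defs:
            if d.startswith("v"):
                preg = min(free)    # pick the lowest-named free register
                free.remove(preg)
                owner[preg] = d
                mapping[d] = preg
    return mapping


# Order from the dump: two instructions sit between the call and the copy.
broken = [
    ("CALL", ["R0", "R1", "R3"], ["R0"]),
    ("ILv4i32", ["v1028"], []),
    ("ORi64_v2i64", ["v1027"], ["v1028"]),
    ("ADJCALLSTACKUP", ["R1"], ["R1"]),
    ("LRr32", ["v1029"], ["R3"]),
]

# Order with the return value copied before anything else is allocated.
fixed = [
    ("CALL", ["R0", "R1", "R3"], ["R0"]),
    ("ADJCALLSTACKUP", ["R1"], ["R1"]),
    ("LRr32", ["v1029"], ["R3"]),
    ("ILv4i32", ["v1028"], []),
    ("ORi64_v2i64", ["v1027"], ["v1028"]),
]

if __name__ == "__main__":
    try:
        allocate(broken)
    except RuntimeError as err:
        print("broken order:", err)
    print("fixed order:", allocate(fixed))
```

In the broken order the model hands `R3` to `v1028` and then to `v1027`, so the later copy from `R3` fails. In the fixed order the copy runs while `R3` still holds the return value.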
You should fix SPUTargetLowering::LowerCall to make sure there is an unbroken chain of flag ties between CopyFromReg and BRASL. At least ARM, MBlaze, and Blackfin are doing this, if you need example code.

> This is probably in the SPU backend, as all (most) other backends
> compile the example just fine? Where do I start to look if I want to fix
> this? I can file a PR if this is not in the SPU backend.

In this case it was RegAllocFast making special assumptions, but in general -verify-machineinstrs and -debug-only=regalloc might help. And feel free to file PRs for the SPU backend as well.

> (Sorry for the cluttered simplification: removing any of the call,
> getelementptr or loop removes the llc crash)

Do you know about bugpoint?

    bugpoint test.ll -run-llc -tool-args -O0 -march=cellspu

will reduce the test case for you. It is awesome.

/jakob
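The "unbroken chain of flag ties" can be pictured as a linked list from the return value copy back to the call. The sketch below is a hypothetical model, not the SelectionDAG API: the dictionary representation and the `flag_chain_reaches` helper are assumptions. It only checks whether following flag links from CopyFromReg eventually reaches BRASL.

```python
def flag_chain_reaches(flags, start, target):
    """Follow flag links from start; return True if target is reached.

    `flags` maps a node name to the node it is flagged to (toy model).
    """
    node = start
    seen = set()
    while node is not None and node not in seen:
        if node == target:
            return True
        seen.add(node)
        node = flags.get(node)
    return False


# As described in the thread: CopyFromReg is tied to ADJCALLSTACKUP,
# but ADJCALLSTACKUP is not tied to BRASL.
spu_flags = {"CopyFromReg": "ADJCALLSTACKUP"}

# With the missing tie added, the chain is unbroken.
fixed_flags = {"CopyFromReg": "ADJCALLSTACKUP", "ADJCALLSTACKUP": "BRASL"}

if __name__ == "__main__":
    print(flag_chain_reaches(spu_flags, "CopyFromReg", "BRASL"))
    print(flag_chain_reaches(fixed_flags, "CopyFromReg", "BRASL"))
```

Any gap in the chain is a place where the scheduler is free to insert unrelated instructions, which is how ILv4i32 and ORi64_v2i64 ended up between the call and the copy.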