Jakob Stoklund Olesen
2010-Jun-04 18:05 UTC
[LLVMdev] Heads up: Local register allocator going away
On Jun 4, 2010, at 1:57 AM, <Kalle.Raiskila at nokia.com> wrote:

> On Thu, 2010-06-03 at 02:53 +0200, Jakob Stoklund Olesen wrote:
>> If you are using the local register allocator, please try switching to the fast allocator and report any bugs you find.
>
> Tried it, and it seems to break quite a big chunk of our tests on SPU :)

Thanks for testing it!

[...]

> And with latest it is now:
> Instruction uses an allocated register
> UNREACHABLE executed
> at /home/kraiskil/llvm/svn-clean/llvm/lib/CodeGen/RegAllocFast.cpp:302!

This is RegAllocFast's way of saying "Oops, I clobbered the return value from your CALL. Didn't think you would need it."

The problem is this code:

BB#0: derived from LLVM BB %0
    BRASL <ga:@extFunc>, %R0<imp-def>, %R1<imp-def>, %R3<imp-def>, %R0<imp-use>, ...
    %reg1028<def> = ILv4i32 0
    %reg1027<def> = ORi64_v2i64 %reg1028
    ADJCALLSTACKUP 0, %R1<imp-def>, %R1<imp-use>
    %reg1029<def> = LRr32 %R3

The return value from the call is in %R3, but %reg1027 and %reg1028 are also allocated to %R3 before it is copied to a safe place (%reg1029).

RegAllocFast does not distinguish between call-clobbered registers and return value registers. They are all considered 'free' after the call. It expects return value physregs to be copied to virtregs immediately after the call instruction, so they are not clobbered. This is a bit risky, but it works when the return value CopyFromReg is scheduled immediately following the call.

So the question is: What are those ILv4i32 and ORi64_v2i64 doing inside the call sequence? Can you get rid of them?

If you look at the scheduler DAG for this BB (llc -view-sched-dags), you can see that the return value CopyFromReg is tied to ADJCALLSTACKUP with a flag, but there is no flag from ADJCALLSTACKUP to BRASL. This allows the scheduler to put other instructions in the gap, and you don't want that.
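To make the clobbering concrete, here is a minimal toy model in Python. It is not RegAllocFast itself; it only assumes the behaviour described above: after a call every physreg is treated as free, and each virtual register definition grabs the first free physreg from a hypothetical allocation order. The `Instr` class, `clobbered_return_values`, and the allocation order `%R3, %R4, %R5` are all invented for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Instr:
    opcode: str
    defs: list = field(default_factory=list)  # registers written
    uses: list = field(default_factory=list)  # registers read
    is_call: bool = False


def is_virtual(reg):
    # Toy convention borrowed from the dump: virtregs look like %reg1028.
    return reg.startswith("%reg")


def clobbered_return_values(block, ret_regs, alloc_order):
    """Return (physreg, vreg) pairs where a vreg was placed in a
    return-value physreg before that physreg was read."""
    # Index of the last instruction that reads each vreg.
    last_use = {}
    for i, ins in enumerate(block):
        for r in ins.uses:
            if is_virtual(r):
                last_use[r] = i

    assigned = {}    # vreg -> physreg currently holding it
    pending = set()  # return-value physregs nobody has read yet
    clobbers = []
    for i, ins in enumerate(block):
        # Reads happen before writes within one instruction.
        for r in ins.uses:
            if not is_virtual(r):
                pending.discard(r)      # return value has been read
            elif last_use[r] == i:
                assigned.pop(r, None)   # vreg dies here, physreg is free
        if ins.is_call:
            assigned.clear()            # toy: nothing lives across the call
            pending = set(ret_regs)     # ...but these hold live results
            continue
        for r in ins.defs:
            if not is_virtual(r):
                continue
            busy = set(assigned.values())
            # Assumes alloc_order is long enough to always have a free entry.
            phys = next(p for p in alloc_order if p not in busy)
            if phys in pending:
                clobbers.append((phys, r))
            if r in last_use:
                assigned[r] = phys
    return clobbers


call = Instr("BRASL", defs=["%R0", "%R1", "%R3"], uses=["%R0"], is_call=True)
il = Instr("ILv4i32", defs=["%reg1028"])
or_ = Instr("ORi64_v2i64", defs=["%reg1027"], uses=["%reg1028"])
adj = Instr("ADJCALLSTACKUP", defs=["%R1"], uses=["%R1"])
copy = Instr("LRr32", defs=["%reg1029"], uses=["%R3"])

order = ["%R3", "%R4", "%R5"]
bad_block = [call, il, or_, adj, copy]    # order from the dump above
good_block = [call, adj, copy, il, or_]   # copy right after the call sequence

bad = clobbered_return_values(bad_block, {"%R3"}, order)
good = clobbered_return_values(good_block, {"%R3"}, order)
```

In this model `bad` lists both %reg1028 and %reg1027 landing in %R3 before the LRr32 reads it, while `good` is empty because the copy runs before anything else is allocated.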
You should fix SPUTargetLowering::LowerCall to make sure there is an unbroken chain of flag ties between CopyFromReg and BRASL. At least ARM, MBlaze, and Blackfin are doing this, if you need example code.

> This is probably in the SPU backend, as all (most) other backends
> compile the example just fine? Where do I start to look if I want to fix
> this? I can file a PR if this is not in the SPU backend.

In this case it was RegAllocFast making special assumptions, but in general -verify-machineinstrs and -debug-only=regalloc might help. And feel free to file PRs for the SPU backend as well.

> (Sorry for the cluttered simplification: removing any of the call,
> getelementptr or loop removes the llc crash)

Do you know about bugpoint? "bugpoint test.ll -run-llc -tool-args -O0 -march=cellspu" will reduce the test case for you. It is awesome.

/jakob
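The "unbroken chain of flag ties" requirement can be pictured with a small Python sketch. This is only a model of the idea, not SelectionDAG code: `DagNode`, `glued_to`, and `flag_chain` are hypothetical names, and each node simply records the one node it takes a flag from.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DagNode:
    name: str
    glued_to: Optional["DagNode"] = None  # node this one takes a flag from


def flag_chain(node):
    """Follow flag ties backwards from node and return the names visited."""
    chain = [node.name]
    while node.glued_to is not None:
        node = node.glued_to
        chain.append(node.name)
    return chain


# Situation described above: CopyFromReg is tied to ADJCALLSTACKUP,
# but ADJCALLSTACKUP is not tied to BRASL.
brasl = DagNode("BRASL")
broken_end = DagNode("ADJCALLSTACKUP")
broken_copy = DagNode("CopyFromReg", glued_to=broken_end)

# Intended situation: the chain runs all the way back to the call.
fixed_end = DagNode("ADJCALLSTACKUP", glued_to=brasl)
fixed_copy = DagNode("CopyFromReg", glued_to=fixed_end)

broken = flag_chain(broken_copy)  # stops at ADJCALLSTACKUP
fixed = flag_chain(fixed_copy)    # reaches BRASL
```

When the chain stops short of BRASL, nothing in this model pins the call to the rest of the sequence, which is the gap the scheduler was free to fill.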
Kalle.Raiskila at nokia.com
2010-Jun-08 08:24 UTC
[LLVMdev] Heads up: Local register allocator going away
On Fri, 2010-06-04 at 20:05 +0200, Jakob Stoklund Olesen wrote:

> You should fix SPUTargetLowering::LowerCall to make sure there is an unbroken chain of flag ties between CopyFromReg and BRASL. At least ARM, MBlaze, and Blackfin are doing this, if you need example code.

Thanks for the tip. This got fixed in 105601.

And with that, half of the problematic tests appearing with the --regalloc=fast flag started working. Unfortunately the second half started to miscompile :(

Also, I now see some rather suboptimal code, e.g.:

    ...
    brasl $lr, extFunc
    lr $3, $3
    lr $3, $3
    ...

('lr rt, ra' moves ra->rt). I guess the miscompilations are due to the same problem that causes this sort of stuff to get generated. I'm looking into it, but do you have any more of those useful tips? ;)

The code I used is the same test case as earlier in this chain. Compiled with:

    llc --march=cellspu --regalloc=fast test.ll -o -

thanks,
kalle
Jakob Stoklund Olesen
2010-Jun-08 14:08 UTC
[LLVMdev] Heads up: Local register allocator going away
On Jun 8, 2010, at 1:24 AM, <Kalle.Raiskila at nokia.com> wrote:

> On Fri, 2010-06-04 at 20:05 +0200, Jakob Stoklund Olesen wrote:
>
>> You should fix SPUTargetLowering::LowerCall to make sure there is an unbroken chain of flag ties between CopyFromReg and BRASL. At least ARM, MBlaze, and Blackfin are doing this, if you need example code.
>
> Thanks for the tip. This got fixed in 105601.

Great!

> And with that, half of the problematic tests appearing with
> --regalloc=fast flag started working. Unfortunately the second half
> started to miscompile :(

Do you know how they are being miscompiled? Is the return value from a call being clobbered? Did they crash the compiler before your fix?

> Also, I now see some rather suboptimal code, e.g.:
> ...
> brasl $lr, extFunc
> lr $3, $3
> lr $3, $3

That's odd, identity copies should get coalesced by RegAllocFast. Is isMoveInstr working correctly for these instructions? Is it setting DstReg and SrcReg correctly?

> ...
> ('lr rt, ra' moves ra->rt). I guess the miscompilations are due to the
> same problem as this sort of stuff gets generated. I'm looking into it,
> but do you have any more of those useful tips? ;) The code I used is the
> same test case as earlier in this chain. Compiled with:
> llc --march=cellspu --regalloc=fast test.ll -o -

Look into the identity copy issue first. It could be related. Does it make a difference if you run llc with -O0?

/jakob
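As a rough illustration of why the move-recognition hook matters for identity copies, here is a small Python stand-in. It is not the LLVM hook (which is a C++ target method that fills in SrcReg and DstReg); `is_move_instr` and `drop_identity_copies` are hypothetical helpers that work on assembly text, and the only instruction they understand is the 'lr rt, ra' form quoted above.

```python
def is_move_instr(line):
    """Return (dst, src) if line is a register move 'lr rt, ra', else None."""
    parts = line.replace(",", " ").split()
    if len(parts) == 3 and parts[0] == "lr":
        return parts[1], parts[2]
    return None


def drop_identity_copies(lines, is_move=is_move_instr):
    """Remove moves whose source and destination are the same register."""
    kept = []
    for line in lines:
        move = is_move(line)
        if move is not None and move[0] == move[1]:
            continue  # identity copy, nothing to do at run time
        kept.append(line)
    return kept


asm = ["brasl $lr, extFunc", "lr $3, $3", "lr $3, $3"]

# With a working hook the two identity copies disappear.
cleaned = drop_identity_copies(asm)

# If the hook fails to recognise the move (modelled here by a hook that
# always returns None), the identity copies survive.
uncleaned = drop_identity_copies(asm, is_move=lambda line: None)
```

In this sketch a hook that does not report the move leaves both `lr $3, $3` lines in place, which matches the symptom in the generated code.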