Pranav Bhandarkar
2012-May-14 23:15 UTC
[LLVMdev] Register coalescing (Subregs and SuperRegs)
Hi,

Consider this MI code from the Hexagon backend.
------------------------------------------------------------------
16B  %vreg0<def> = COPY %R0<kill>; IntRegs:%vreg0
32B  %vreg1<def> = LDriw %vreg0, 0; mem:LD4[%a],IntRegs:%vreg1,%vreg0
48B  %vreg2<def> = LDriw_indexed %vreg0<kill>, 4; mem:LD4[%add.ptr] IntRegs:%vreg2,%vreg0
64B  %vreg7<def> = COMBINE_rr %vreg2<kill>, %vreg1<kill>; DoubleRegs:%vreg7 IntRegs:%vreg2,%vreg1
80B  %D0<def> = COPY %vreg7<kill>; DoubleRegs:%vreg7
------------------------------------------------------------------

LDriw and LDriw_indexed load 32-bit words, so %vreg1 and %vreg2 are both 32-bit virtual registers. Hexagon has register pairs: an even-numbered register and the odd-numbered register that follows it can be paired to form a 64-bit register. For instance, physical registers R0 and R1 form the register pair R1:R0, and R2 and R3 form R3:R2. The odd-numbered register holds the higher 32 bits and the even-numbered register holds the lower 32 bits.

Consider now the COMBINE_rr instruction:
------------------------------------------------------------------
%vreg7<def> = COMBINE_rr %vreg2<kill>, %vreg1<kill>; DoubleRegs:%vreg7,IntRegs:%vreg2,%vreg1
------------------------------------------------------------------

It creates a 64-bit vreg by making %vreg2 the higher word and %vreg1 the lower word of the DoubleReg. The optimization opportunity here is that if %vreg2 and %vreg1 are allocated the right registers (odd for %vreg2 and the adjacent even one for %vreg1), then the COMBINE_rr instruction becomes redundant. For instance, if %vreg2 is allocated the physical register R3 and %vreg1 is allocated R2, then %vreg7 can simply be the register pair R3:R2, i.e. %D1 in the Hexagon backend.

The question is: is this possible in the current setup of the register coalescer and the register allocator? Or is there some target hook that will help me inform the register coalescer or the allocator?

@Jakob: I noticed your commit last week regarding TRI::getCommonSuperRegClass(). Can that have a role to play here?
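The pairing condition described above can be sketched as a small standalone check. This is only a toy model under stated assumptions, not LLVM code: registers are plain integers (R0..R31 as 0..31), pair Dn is assumed to cover R(2n+1):R(2n), and the names formsPair, pairIndex, and pairIndexChecked are hypothetical helpers invented for illustration.

```cpp
#include <cassert>

// Toy model (NOT the LLVM API): a 32-bit physical register Rk is just the
// integer k. Assumption: pair Dn covers R(2n+1):R(2n).

// True if hiReg:loReg is already a legal even/odd pair, i.e. a COMBINE of
// (hiReg, loReg) would only restate what the pair register already holds.
constexpr bool formsPair(unsigned hiReg, unsigned loReg) {
  return loReg % 2 == 0 && hiReg == loReg + 1;
}

// Index n of the pair Dn formed by hiReg:loReg, or -1 if they do not pair.
constexpr int pairIndex(unsigned hiReg, unsigned loReg) {
  return formsPair(hiReg, loReg) ? static_cast<int>(loReg / 2) : -1;
}

// Variant for callers that have already established the pair is legal.
inline unsigned pairIndexChecked(unsigned hiReg, unsigned loReg) {
  assert(formsPair(hiReg, loReg) && "registers do not form an even/odd pair");
  return loReg / 2;
}

// Compile-time checks of the cases discussed in the text.
static_assert(formsPair(3, 2), "R3:R2 is a legal pair");
static_assert(formsPair(1, 0), "R1:R0 is a legal pair");
static_assert(!formsPair(2, 3), "high word must be the odd register");
static_assert(!formsPair(2, 1), "R2:R1 straddles two pairs");
```

In this model, an assignment such as %vreg2 = R3 and %vreg1 = R2 satisfies formsPair(3, 2), which is the case where the COMBINE_rr would be redundant; an assignment such as R2 and R1 does not, so the combine would still be needed.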
FWIW, the relevant patterns for COMBINE_rr are shown below.
------------------------------------------------------------------
// Combine.
let isPredicable = 1, neverHasSideEffects = 1 in
def COMBINE_rr : ALU32_rr<(outs DoubleRegs:$dst),
            (ins IntRegs:$src1, IntRegs:$src2),
            "$dst = combine($src1, $src2)",
            []>;

def: Pat<(i64 (or (i64 (shl (i64 DoubleRegs:$srcHigh),
                            (i32 32))),
                  (i64 DoubleRegs:$srcLow))),
         (i64 (COMBINE_rr (EXTRACT_SUBREG (i64 DoubleRegs:$srcHigh), subreg_loreg),
                          (EXTRACT_SUBREG (i64 DoubleRegs:$srcLow), subreg_loreg)))>;
------------------------------------------------------------------

Thanks,
Pranav

Qualcomm Innovation Center (QuIC) is a member of the Code Aurora Forum.