You may have noticed Andy and me committing TableGen patches for "register
units". I thought I'd better explain what they are.
Some targets have instructions that operate on sequences of registers. I'll
use ARM examples because ARM is the most notorious case. ARM has, for example:
vld1.64 {d1, d2}, [r0]
The instruction loads two d-registers, but they must be consecutive. ARM also
names even-odd pairs of d-registers, q0 = {d0, d1}, but odd-even pairs have no
other name than {d1, d2}.
LLVM models the sequence constraint as a single super-register, so we define
pseudo-registers D1_D2, D3_D4, … in addition to the existing Q0, Q1, … From the
register allocator's point of view, the vld1.64 instruction has two
operands: a GPR operand for the address, and a DPair operand representing the
two consecutive d-registers.
This model makes it easy to handle register sequence constraints, and we
don't have to worry about what ISA designers decide to call a
'register'. The cost is a fairly high number of pseudo-registers that
the ISA didn't dare to name. ARM currently has 277 registers, some named by
the ISA, some pseudo-registers representing sequence constraints.
Register units help alleviate the pain of having many registers. They represent
the 'basic units of interference', typically corresponding to the leaf
registers (those without any sub-registers). A target has fewer register units
than registers, and forming pseudo-super-registers to model constraints
doesn't create more register units.
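To make that last point concrete, here is a minimal C++ sketch. It is not how TableGen computes register units; it just treats the leaf registers as the units, which is the typical case. The names SubRegMap, collectUnits, unitsOf, countUnits and armSlice are made up for this illustration. The only ARM fact it relies on is that d1 is built from s2/s3 and d2 from s4/s5.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical register description: each register lists its direct
// sub-registers. A register with an empty list is a leaf.
using SubRegMap = std::map<std::string, std::vector<std::string>>;

// Collect the leaf registers reachable from Reg. In this sketch the
// leaves play the role of register units.
void collectUnits(const SubRegMap &Regs, const std::string &Reg,
                  std::set<std::string> &Units) {
  auto It = Regs.find(Reg);
  assert(It != Regs.end() && "unknown register");
  if (It->second.empty()) {
    Units.insert(Reg);
    return;
  }
  for (const std::string &Sub : It->second)
    collectUnits(Regs, Sub, Units);
}

// Units of a single register.
std::set<std::string> unitsOf(const SubRegMap &Regs, const std::string &Reg) {
  std::set<std::string> Units;
  collectUnits(Regs, Reg, Units);
  return Units;
}

// Number of distinct units across every register in the description.
std::size_t countUnits(const SubRegMap &Regs) {
  std::set<std::string> All;
  for (const auto &Entry : Regs)
    collectUnits(Regs, Entry.first, All);
  return All.size();
}

// A small slice of the ARM register file: D1 and D2 built from S2..S5.
SubRegMap armSlice() {
  return {{"S2", {}},           {"S3", {}},
          {"S4", {}},           {"S5", {}},
          {"D1", {"S2", "S3"}}, {"D2", {"S4", "S5"}}};
}
```

Starting from armSlice(), adding an entry D1_D2 with sub-registers D1 and D2 raises the register count from 6 to 7, but countUnits() stays at 4, since the new pseudo-register only reuses existing leaves.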
Each register is assigned a list of register units such that:
RegA overlaps RegB if and only if Units(RegA) intersects Units(RegB).
On X86, for example, the register units are the 8-bit registers: AH, AL, BH, BL,
… The 64-bit register %rax is assigned units (AH, AL), and so is %eax. It is
easy to check that %rax and %eax overlap because they have the same register
units. In general, two registers overlap as soon as they have at least one
register unit in common.
These are some of the X86 register to register unit mappings:
%rax -> {AH, AL}
%eax -> {AH, AL}
%ax -> {AH, AL}
%ah -> {AH}
%al -> {AL}
%rbx -> {BH, BL}
…
%r8 -> {R8B}
…
%xmm0 -> {XMM0}
%ymm0 -> {XMM0}
X86 has 87 register units compared to 160 registers, making it possible for the
register allocator to track interference more compactly.
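The overlap rule can be sketched in a few lines of C++. This is only an illustration, assuming units are stored as sets of strings; a real implementation would use small integers. The table is a subset of the X86 mappings above, with %al mapped to the AL unit, and the names UnitSet, x86Units and regsOverlap are invented for the example.

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <set>
#include <string>

using UnitSet = std::set<std::string>;

// Hypothetical register-to-unit table for a handful of X86 registers.
// A real target has many more entries.
const std::map<std::string, UnitSet> &x86Units() {
  static const std::map<std::string, UnitSet> Units = {
      {"rax", {"AH", "AL"}}, {"eax", {"AH", "AL"}}, {"ax", {"AH", "AL"}},
      {"ah", {"AH"}},        {"al", {"AL"}},        {"rbx", {"BH", "BL"}},
      {"r8", {"R8B"}},       {"xmm0", {"XMM0"}},    {"ymm0", {"XMM0"}}};
  return Units;
}

// RegA overlaps RegB if and only if their unit sets intersect, i.e.
// they share at least one register unit.
bool regsOverlap(const std::string &A, const std::string &B) {
  const auto &Units = x86Units();
  auto ItA = Units.find(A);
  auto ItB = Units.find(B);
  assert(ItA != Units.end() && ItB != Units.end() && "unknown register");
  return std::any_of(
      ItA->second.begin(), ItA->second.end(),
      [&](const std::string &U) { return ItB->second.count(U) != 0; });
}
```

With this table, regsOverlap("rax", "al") is true because both contain the AL unit, while regsOverlap("ah", "al") is false because {AH} and {AL} are disjoint.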
Register units also enable accurate register pressure tracking in spite of
overlapping register classes and aliasing registers. This is what Andy has been
implementing in RegisterPressure.h. Register classes are mapped to corresponding
sets of register units, and by counting units instead of registers, problems
with aliasing registers go away.
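As a rough illustration of why counting units helps, here is a small C++ sketch. It is not the algorithm in RegisterPressure.h; it only assumes that pressure is measured as the number of distinct register units covered by the live registers. The names UnitSet, regUnits, unitsCovered and unitPressure are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <set>
#include <string>
#include <vector>

using UnitSet = std::set<std::string>;

// Hypothetical register-to-unit table for a few aliasing X86 registers.
const std::map<std::string, UnitSet> &regUnits() {
  static const std::map<std::string, UnitSet> Units = {
      {"ax", {"AH", "AL"}}, {"ah", {"AH"}}, {"al", {"AL"}},
      {"bx", {"BH", "BL"}}, {"bh", {"BH"}}, {"bl", {"BL"}}};
  return Units;
}

// Union of the units of every register in Regs. Applied to the members
// of a register class, this gives the set of units the class covers.
UnitSet unitsCovered(const std::vector<std::string> &Regs) {
  UnitSet Covered;
  for (const std::string &Reg : Regs) {
    auto It = regUnits().find(Reg);
    assert(It != regUnits().end() && "unknown register");
    Covered.insert(It->second.begin(), It->second.end());
  }
  return Covered;
}

// Pressure of a set of live registers, counted in units, not registers.
std::size_t unitPressure(const std::vector<std::string> &LiveRegs) {
  return unitsCovered(LiveRegs).size();
}
```

Counting registers, a live %ax looks cheaper than a live %ah plus a live %al, although both occupy the same storage. Counting units gives 2 in both cases, and a 16-bit class {ax, bx} covers the same four units as the 8-bit class {ah, al, bh, bl}.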
/jakob