Jakob Stoklund Olesen
2013-May-22 21:20 UTC
[LLVMdev] Avoiding MCRegAliasIterator with register units
LLVM can model some quite complicated register banks now, and we even use registers to model some encoding constraints. For example, a few ARM instructions like strexd have two register operands that must be an aligned pair of consecutive GPR registers (like r0, r1). This constraint is modeled with the GPRPair register class containing R0_R1, R2_R3, ... pseudo-registers. Sometimes ISAs also assign assembly names to such pseudo-registers, again from ARM: SPR: (s0, s1, ...) 32-bit floating point registers. DPR: (d0, d1, ...) Even-odd pairs of consecutive S-registers. QPR: (q0, q1, ...) Even-odd pairs of consecutive D-registers. But not all constraints are given 'register' names by the ISA. One vld1 instruction variant can load two consecutive D-registers, both even-odd and odd-even pairs. An even-odd pair like {d0, d1} is also called q0, but an odd-even pair like {d1, d2} has no other ISA name. Since it's a bit random what an ISA decides to call a register and what it decides to call an encoding constraint, LLVM's concept of a physical register is usually a bit broader, but more consistent. We will normally define physical registers for all reasonable encoding constraints on register operands. For example, the LLVM ARM target has physical registers like D1_D2 which don't exist in the ISA. In a target with many encoding constraints like that, some registers can have a high number of super-registers, and even more aliases. On the last count, some of the ARM NEON registers had more than 40 aliasing registers. The register allocator uses register units to deal with the complexity of these register banks. Register units are more or less the same as leaf registers in the sub-register graph. Register interference is tracked in terms of register units, and that means it is no longer necessary to scan through the long list of aliasing registers. We still have an MCRegAliasIterator that's used here and there in the code generator, and TableGen is still emitting tables backing this iterator. I would like to minimize the use of MCRegAliasIterator because some of the alias lists are so long. In most cases, it should be possible to express algorithms in terms of register units instead. I also want to avoid emitting tables for driving MCRegAliasIterator. If required, the set of aliasing registers can be computed from regunits and super-registers. (See the block comment for MCRegUnitRootIterator or LiveIntervals::computeRegUnitInterval). /jakob
Chad Rosier
2013-May-24 16:39 UTC
[LLVMdev] Avoiding MCRegAliasIterator with register units
Jakob, I've implemented a patch that reworks the MCRegAliasIterator to dynamically compute the register aliases. The size reduction in the RegDiffLists are rather dramatic. Here are a few size differences for MCTargetDesc.o files (before and after) in bytes: R600 - 36160B - 11184B - 69% reduction ARM - 28480B - 8368B - 71% reduction Mips - 816B - 576B - 29% reduction One side effect of dynamically computing the aliases is that the iterator does not guarantee that the entries are ordered or that duplicates have been removed. The documentation seems to imply this is a safe assumption and I haven't found a client that requires these attributes (i.e., strict ordering and uniqueness). I also setup a LNT tester and found no execution-time failures on x86 for -O0 and -O2 using the llvm test-suite and externals test-suite (i.e., SPEC95/2K/2K6, etc.). I'm still in the process of verifying there are no compile-time regressions. Chad On May 22, 2013, at 2:20 PM, Jakob Stoklund Olesen <stoklund at 2pi.dk> wrote:> LLVM can model some quite complicated register banks now, and we even use registers to model some encoding constraints. > > For example, a few ARM instructions like strexd have two register operands that must be an aligned pair of consecutive GPR registers (like r0, r1). This constraint is modeled with the GPRPair register class containing R0_R1, R2_R3, ... pseudo-registers. > > Sometimes ISAs also assign assembly names to such pseudo-registers, again from ARM: > > SPR: (s0, s1, ...) 32-bit floating point registers. > DPR: (d0, d1, ...) Even-odd pairs of consecutive S-registers. > QPR: (q0, q1, ...) Even-odd pairs of consecutive D-registers. > > But not all constraints are given 'register' names by the ISA. One vld1 instruction variant can load two consecutive D-registers, both even-odd and odd-even pairs. An even-odd pair like {d0, d1} is also called q0, but an odd-even pair like {d1, d2} has no other ISA name. > > Since it's a bit random what an ISA decides to call a register and what it decides to call an encoding constraint, LLVM's concept of a physical register is usually a bit broader, but more consistent. We will normally define physical registers for all reasonable encoding constraints on register operands. For example, the LLVM ARM target has physical registers like D1_D2 which don't exist in the ISA. > > In a target with many encoding constraints like that, some registers can have a high number of super-registers, and even more aliases. On the last count, some of the ARM NEON registers had more than 40 aliasing registers. > > The register allocator uses register units to deal with the complexity of these register banks. Register units are more or less the same as leaf registers in the sub-register graph. Register interference is tracked in terms of register units, and that means it is no longer necessary to scan through the long list of aliasing registers. > > We still have an MCRegAliasIterator that's used here and there in the code generator, and TableGen is still emitting tables backing this iterator. > > I would like to minimize the use of MCRegAliasIterator because some of the alias lists are so long. In most cases, it should be possible to express algorithms in terms of register units instead. > > I also want to avoid emitting tables for driving MCRegAliasIterator. If required, the set of aliasing registers can be computed from regunits and super-registers. (See the block comment for MCRegUnitRootIterator or LiveIntervals::computeRegUnitInterval). > > /jakob > > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Renato Golin
2013-May-24 17:08 UTC
[LLVMdev] Avoiding MCRegAliasIterator with register units
On 24 May 2013 17:39, Chad Rosier <mcrosier at apple.com> wrote:> One side effect of dynamically computing the aliases is that the iterator > does not guarantee that the entries are ordered or that duplicates have > been removed. >Hi Chad, Sounds like you're growing the list (thus the lookup time), rather than shrinking, as I take it was Jacob's original intention? The documentation seems to imply this is a safe assumption and I haven't> found a client that requires these attributes (i.e., strict ordering and > uniqueness). >I don't think it should matter for the purposes of aliasing. I also setup a LNT tester and found no execution-time failures on x86 for> -O0 and -O2 using the llvm test-suite and externals test-suite (i.e., > SPEC95/2K/2K6, etc.). >I'm wondering why not O3, too? That'll expose vector selection as well, not just VFP, which is a big source of super-registers, at least on ARM. Somehow it sounds as though you'll find more performance regressions than improvements, but that's my poor assumptions based on nothing concrete. ;) cheers, --renato -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20130524/0f7fe6fd/attachment.html>
Possibly Parallel Threads
- [LLVMdev] Avoiding MCRegAliasIterator with register units
- [LLVMdev] Avoiding MCRegAliasIterator with register units
- [LLVMdev] Performance regression in the LiveIntevals phase
- Incorrect placement of an instruction after PostRAScheduler pass
- [LLVMdev] Registers and Register Units