This is probably diving a bit deeper into register allocator internals than any sane person should ever want to go, but I'm curious how one would go about teaching LLVM about a register swap instruction, for an architecture where swaps are as cheap as moves.

It would of course be easy enough to match the pattern in a PostRegAlloc pass, except that by that point most of the opportunities are gone. In particular, many of my test functions for the backend I'm working on produce code like this:

    ; %bb.0:                ; %entry
        mov r0, r4
        mov r2, r0

Here the allocator is trying to get the function's parameter in r2 into r0 (the result register), and is using r4 as a temporary for the displaced parameter in r0. This could be replaced by a single:

        exch r0, r2

which saves a register, reduces code size, and provides a speed boost - but I'm unsure where one would even begin to look to add support for this, even as an academic exercise.

Can anyone provide any guidance here? Assuming LLVM doesn't already support it via a little-documented TII callback, of course.
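
To make the mov/exch comparison above concrete, here is a toy model of the two sequences. It is only a sketch and has nothing to do with LLVM's actual data structures: the `run` helper, the tuple encoding of instructions, and the placeholder register contents are all made up for illustration, and the `mov src, dst` operand order is inferred from the description of the listing. It shows that both sequences leave the old r2 value in r0, but the displaced parameter ends up in r4 with the moves and in r2 with the swap, so any later use of r4 would have to be rewritten to r2. That is presumably part of why a late peephole sees so few opportunities.

```python
def run(program, regs):
    """Execute a toy program on a copy of the register file and return it."""
    regs = dict(regs)  # copy so the caller's initial state is untouched
    for op, a, b in program:
        if op == "mov":
            # mov src, dst (operand order assumed from the post's description)
            regs[b] = regs[a]
        elif op == "exch":
            # exch a, b swaps the contents of both registers
            regs[a], regs[b] = regs[b], regs[a]
        else:
            raise ValueError(f"unknown opcode: {op}")
    return regs


# Hypothetical initial state: two incoming parameters and a free register.
initial = {"r0": "param0", "r2": "param2", "r4": "unused"}

# The sequence the allocator emits today: copy r0 aside, then r2 into r0.
with_moves = run([("mov", "r0", "r4"), ("mov", "r2", "r0")], initial)

# The proposed single-instruction replacement.
with_exch = run([("exch", "r0", "r2")], initial)

print(with_moves)
print(with_exch)
```

In the swap version r4 is never written, which is the register saving mentioned above.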