thr3ads.net - llvm dev - [llvm-dev] Register spilling fix for experimental 6502 backend [Feb 2016]

If this information is useful, please help other people find it:
Share via:

N. E. C. via llvm-dev

2016-Feb-13 22:54 UTC

[llvm-dev] Register spilling fix for experimental 6502 backend

So I've been designing an experimental 6502 backend for LLVM. Link:
<https://github.com/beholdnec/llvm-m6502>

The 6502 only has a few registers, one accumulator and two indexes, none of
which are large enough to hold an absolute pointer. Because of this, the
backend really tests the power of LLVM's register allocator (RA).

I've made a change to the RA that might be of interest to devs of other
backends, read on.

Other devs suggested reserving the zero page of memory (addresses $00-$FF) and
treating it as a giant set of registers. And I don't know, because
doesn't that
sort of defeat the purpose of register allocation? It might be inevitable that
we have to lie to LLVM about the 6502's registers, but, for now, I want to
see
how far I can get using the allocator as intended.

The registers are A, X, and Y. They belong to a hierarchy of Register Classes:
  Acc class: A
  Index class: X, Y
  General class: Acc, Index

There's a fictional instruction "SUBidx Acc, Index". It subtracts
an Index from
an Acc. Bear with me.

What if we want to subtract A from X?
We can't say "SUBidx X, A". The operands must be Acc, Index.

The DAG looks something like this:

  %vreg2 = COPY %X ; %vreg2 is Index
  %vreg0 = COPY %A ; %vreg0 is Acc
  SUBidx %vreg2, %vreg0

To make this work, we first have to swap A and X.
For technical reasons, we can't use Y as temporary storage. Instead, we must
spill to the stack.

When spilling a reg to the stack, the RA needs two virtual regs, one to store
and one to load. Currently, the loaded reg always has the same class as the
stored reg. Because of this limitation, I ran into a problem: the RA would try
to assign two Accs to physical regs, but only one is available, and the RA
would fail.

Line 1290 of InlineSpiller.cpp, inside the function spillAroundUses, says:
"// FIXME: Infer regclass from instruction alone"
(As opposed to the current behavior of always using the same class as the reg
being spilled.)
The line refers to this very limitation. It was written in 2010, so we've
known
about the limitation since the beginning.

My RA patch can be found in this commit:
<https://github.com/beholdnec/llvm-m6502/commit/a4b492f63>

This patch allows spills to use a more general reg class, so a spilled Acc can
be reloaded as a General.

Previously, the RA would fail because it would spill X, but it couldn't
reload
X into an Acc.

Now, with the patch, the DAG compiles into this:

  # store X to stack
  STX_stack [0]
  # transfer A to X
  TAX
  # load A from stack
  LDA_stack [0]
  # now that A and X are swapped, we can subtract!
  SUBidx A, X

It works, but I'm not sure if I used the best approach. If you're
familiar with
the RA, feel free to look at the patch and leave feedback.

- Nolan

David Chisnall via llvm-dev

2016-Feb-14 10:07 UTC

head link

[llvm-dev] Register spilling fix for experimental 6502 backend

On 13 Feb 2016, at 22:54, N. E. C. via llvm-dev <llvm-dev at
lists.llvm.org> wrote:> 
> Other devs suggested reserving the zero page of memory (addresses $00-$FF)
and
> treating it as a giant set of registers. And I don't know, because
doesn't that
> sort of defeat the purpose of register allocation? It might be inevitable
that
> we have to lie to LLVM about the 6502's registers, but, for now, I want
to see
> how far I can get using the allocator as intended.
I don’t think that it contradicts the purpose of register allocation.  Register
allocation is really shorthand for operand location selection (though registers
do have the nice property that, on the compiler’s side of the instruction
decoder, they never alias).  On a RISC-like register-register architecture,
these are precisely the same.  On modern x86 processors (at least when running
in 32-bit mode), if you don’t move the stack pointer then the top few stack
slots are stored in rename registers so register-memory instructions that
operate on them can use the same rename logic as register-register instructions,
so it would be entirely valid to treat a few esp-relative addresses that the
compiler guarantees are not address taken as registers.

For the 6502, you might want to look at the papers describing compilers for the
Xerox Alto as a good reference.  As was common in that era, they defined a
virtual instruction set implemented in microcode as the compiler target.  The
Alto’s instruction set was specifically designed to be used in this way, with
compilers generating bytecode and shipping a bytecode interpreter.  If you think
that this is slow, remember that Smalltalk on the Alto ran an entire
multitasking GUI on a processor not much faster than a 6502, though with a bit
more RAM.  Without long pipelines, a jump-threaded bytecode interpreter is very
fast.  See also P-code from the UCSD Pascal compiler.

SWEET16 is an example of the same concept for the 6502, providing effectively a
16-bit virtual ISA for compilers and interpreters to target.  When targeting a
processor like the 6502 from a vaguely high-level language (which, given the
amount of structure that it implies, includes LLVM IR in this context), you are
going to find so many repeated instruction sequences that it will be very
difficult to fit within the memory constraints if you expand them at every use.

Even Clang and GCC targeting modern architectures end up doing this to some
degree, for example providing software floating point routines in compiler-rt /
libgcc.  If you tried to inline them at every use, then you’d end up with code
that would suffer from instruction cache footprint on larger processors and
completely fail to fit on M-profile processors.

If I were writing a 6502 back end, then I would define an SWEET16-derived
virtual ISA, include the 6502 assembly for that code in compiler-rt, and target
that ISA from the LLVM back end.  I’d end up with much denser code at the end
and not have to fight the LLVM infrastructure so much,

David

Steve Montgomery via llvm-dev

2016-Feb-14 11:49 UTC

head link

[llvm-dev] Register spilling fix for experimental 6502 backend

Have you tried implementing getLargestLegalSuperClass(RC) in M6502RegisterInfo?

The default implementation returns RC but if you make it return the General
class then live range splitting will use General registers. That might avoid the
need for a patch to the register allocator.

Steve
> On 13 Feb 2016, at 22:54, N. E. C. via llvm-dev <llvm-dev at
lists.llvm.org> wrote:
> 
> So I've been designing an experimental 6502 backend for LLVM. Link:
> <https://github.com/beholdnec/llvm-m6502>
> 
> The 6502 only has a few registers, one accumulator and two indexes, none of
> which are large enough to hold an absolute pointer. Because of this, the
> backend really tests the power of LLVM's register allocator (RA).
> 
> I've made a change to the RA that might be of interest to devs of other
> backends, read on.
> 
> Other devs suggested reserving the zero page of memory (addresses $00-$FF)
and
> treating it as a giant set of registers. And I don't know, because
doesn't that
> sort of defeat the purpose of register allocation? It might be inevitable
that
> we have to lie to LLVM about the 6502's registers, but, for now, I want
to see
> how far I can get using the allocator as intended.
> 
> The registers are A, X, and Y. They belong to a hierarchy of Register
Classes:
> Acc class: A
> Index class: X, Y
> General class: Acc, Index
> 
> There's a fictional instruction "SUBidx Acc, Index". It
subtracts an Index from
> an Acc. Bear with me.
> 
> What if we want to subtract A from X?
> We can't say "SUBidx X, A". The operands must be Acc, Index.
> 
> The DAG looks something like this:
> 
> %vreg2 = COPY %X ; %vreg2 is Index
> %vreg0 = COPY %A ; %vreg0 is Acc
> SUBidx %vreg2, %vreg0
> 
> To make this work, we first have to swap A and X.
> For technical reasons, we can't use Y as temporary storage. Instead, we
must
> spill to the stack.
> 
> When spilling a reg to the stack, the RA needs two virtual regs, one to
store
> and one to load. Currently, the loaded reg always has the same class as the
> stored reg. Because of this limitation, I ran into a problem: the RA would
try
> to assign two Accs to physical regs, but only one is available, and the RA
> would fail.
> 
> Line 1290 of InlineSpiller.cpp, inside the function spillAroundUses, says:
> "// FIXME: Infer regclass from instruction alone"
> (As opposed to the current behavior of always using the same class as the
reg
> being spilled.)
> The line refers to this very limitation. It was written in 2010, so
we've known
> about the limitation since the beginning.
> 
> My RA patch can be found in this commit:
> <https://github.com/beholdnec/llvm-m6502/commit/a4b492f63>
> 
> This patch allows spills to use a more general reg class, so a spilled Acc
can
> be reloaded as a General.
> 
> Previously, the RA would fail because it would spill X, but it couldn't
reload
> X into an Acc.
> 
> Now, with the patch, the DAG compiles into this:
> 
> # store X to stack
> STX_stack [0]
> # transfer A to X
> TAX
> # load A from stack
> LDA_stack [0]
> # now that A and X are swapped, we can subtract!
> SUBidx A, X
> 
> It works, but I'm not sure if I used the best approach. If you're
familiar with
> the RA, feel free to look at the patch and leave feedback.
> 
> - Nolan
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev

Quentin Colombet via llvm-dev

2016-Feb-15 18:36 UTC

head link

[llvm-dev] Register spilling fix for experimental 6502 backend

Hi Nolan,

One thing you could try is to define a register class that is the union of X and
Acc.
That register class would be used for spill and reload instructions.
In other words, both store and load would use the “same" register class.

By doing so, you shouldn’t need to patch any of the RA pipeline.

Cheers,
-Quentin> On Feb 13, 2016, at 2:54 PM, N. E. C. via llvm-dev <llvm-dev at
lists.llvm.org> wrote:
> 
> So I've been designing an experimental 6502 backend for LLVM. Link:
> <https://github.com/beholdnec/llvm-m6502>
> 
> The 6502 only has a few registers, one accumulator and two indexes, none of
> which are large enough to hold an absolute pointer. Because of this, the
> backend really tests the power of LLVM's register allocator (RA).
> 
> I've made a change to the RA that might be of interest to devs of other
> backends, read on.
> 
> Other devs suggested reserving the zero page of memory (addresses $00-$FF)
and
> treating it as a giant set of registers. And I don't know, because
doesn't that
> sort of defeat the purpose of register allocation? It might be inevitable
that
> we have to lie to LLVM about the 6502's registers, but, for now, I want
to see
> how far I can get using the allocator as intended.
> 
> The registers are A, X, and Y. They belong to a hierarchy of Register
Classes:
>  Acc class: A
>  Index class: X, Y
>  General class: Acc, Index
> 
> There's a fictional instruction "SUBidx Acc, Index". It
subtracts an Index from
> an Acc. Bear with me.
> 
> What if we want to subtract A from X?
> We can't say "SUBidx X, A". The operands must be Acc, Index.
> 
> The DAG looks something like this:
> 
>  %vreg2 = COPY %X ; %vreg2 is Index
>  %vreg0 = COPY %A ; %vreg0 is Acc
>  SUBidx %vreg2, %vreg0
> 
> To make this work, we first have to swap A and X.
> For technical reasons, we can't use Y as temporary storage. Instead, we
must
> spill to the stack.
> 
> When spilling a reg to the stack, the RA needs two virtual regs, one to
store
> and one to load. Currently, the loaded reg always has the same class as the
> stored reg. Because of this limitation, I ran into a problem: the RA would
try
> to assign two Accs to physical regs, but only one is available, and the RA
> would fail.
> 
> Line 1290 of InlineSpiller.cpp, inside the function spillAroundUses, says:
> "// FIXME: Infer regclass from instruction alone"
> (As opposed to the current behavior of always using the same class as the
reg
> being spilled.)
> The line refers to this very limitation. It was written in 2010, so
we've known
> about the limitation since the beginning.
> 
> My RA patch can be found in this commit:
> <https://github.com/beholdnec/llvm-m6502/commit/a4b492f63>
> 
> This patch allows spills to use a more general reg class, so a spilled Acc
can
> be reloaded as a General.
> 
> Previously, the RA would fail because it would spill X, but it couldn't
reload
> X into an Acc.
> 
> Now, with the patch, the DAG compiles into this:
> 
>  # store X to stack
>  STX_stack [0]
>  # transfer A to X
>  TAX
>  # load A from stack
>  LDA_stack [0]
>  # now that A and X are swapped, we can subtract!
>  SUBidx A, X
> 
> It works, but I'm not sure if I used the best approach. If you're
familiar with
> the RA, feel free to look at the patch and leave feedback.
> 
> - Nolan
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev

N. E. C. via llvm-dev

2016-Feb-16 07:09 UTC

head link

[llvm-dev] Register spilling fix for experimental 6502 backend

Thank you for pointing me to getLargestLegalSuperClass, implementing
this is adequate to fix the reg allocation problem without patching
LLVM's register allocator.

It is understandable that I should avoid touching the Reg Allocator,
as doing so might potentially break other backends.

Nolan Check

On Mon, Feb 15, 2016 at 10:36 AM, Quentin Colombet <qcolombet at
apple.com> wrote:> Hi Nolan,
>
> One thing you could try is to define a register class that is the union of
X and Acc.
> That register class would be used for spill and reload instructions.
> In other words, both store and load would use the “same" register
class.
>
> By doing so, you shouldn’t need to patch any of the RA pipeline.
>
> Cheers,
> -Quentin
>> On Feb 13, 2016, at 2:54 PM, N. E. C. via llvm-dev <llvm-dev at
lists.llvm.org> wrote:
>>
>> So I've been designing an experimental 6502 backend for LLVM. Link:
>> <https://github.com/beholdnec/llvm-m6502>
>>
>> The 6502 only has a few registers, one accumulator and two indexes,
none of
>> which are large enough to hold an absolute pointer. Because of this,
the
>> backend really tests the power of LLVM's register allocator (RA).
>>
>> I've made a change to the RA that might be of interest to devs of
other
>> backends, read on.
>>
>> Other devs suggested reserving the zero page of memory (addresses
$00-$FF) and
>> treating it as a giant set of registers. And I don't know, because
doesn't that
>> sort of defeat the purpose of register allocation? It might be
inevitable that
>> we have to lie to LLVM about the 6502's registers, but, for now, I
want to see
>> how far I can get using the allocator as intended.
>>
>> The registers are A, X, and Y. They belong to a hierarchy of Register
Classes:
>>  Acc class: A
>>  Index class: X, Y
>>  General class: Acc, Index
>>
>> There's a fictional instruction "SUBidx Acc, Index". It
subtracts an Index from
>> an Acc. Bear with me.
>>
>> What if we want to subtract A from X?
>> We can't say "SUBidx X, A". The operands must be Acc,
Index.
>>
>> The DAG looks something like this:
>>
>>  %vreg2 = COPY %X ; %vreg2 is Index
>>  %vreg0 = COPY %A ; %vreg0 is Acc
>>  SUBidx %vreg2, %vreg0
>>
>> To make this work, we first have to swap A and X.
>> For technical reasons, we can't use Y as temporary storage.
Instead, we must
>> spill to the stack.
>>
>> When spilling a reg to the stack, the RA needs two virtual regs, one to
store
>> and one to load. Currently, the loaded reg always has the same class as
the
>> stored reg. Because of this limitation, I ran into a problem: the RA
would try
>> to assign two Accs to physical regs, but only one is available, and the
RA
>> would fail.
>>
>> Line 1290 of InlineSpiller.cpp, inside the function spillAroundUses,
says:
>> "// FIXME: Infer regclass from instruction alone"
>> (As opposed to the current behavior of always using the same class as
the reg
>> being spilled.)
>> The line refers to this very limitation. It was written in 2010, so
we've known
>> about the limitation since the beginning.
>>
>> My RA patch can be found in this commit:
>> <https://github.com/beholdnec/llvm-m6502/commit/a4b492f63>
>>
>> This patch allows spills to use a more general reg class, so a spilled
Acc can
>> be reloaded as a General.
>>
>> Previously, the RA would fail because it would spill X, but it
couldn't reload
>> X into an Acc.
>>
>> Now, with the patch, the DAG compiles into this:
>>
>>  # store X to stack
>>  STX_stack [0]
>>  # transfer A to X
>>  TAX
>>  # load A from stack
>>  LDA_stack [0]
>>  # now that A and X are swapped, we can subtract!
>>  SUBidx A, X
>>
>> It works, but I'm not sure if I used the best approach. If
you're familiar with
>> the RA, feel free to look at the patch and leave feedback.
>>
>> - Nolan
>> _______________________________________________
>> LLVM Developers mailing list
>> llvm-dev at lists.llvm.org
>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>

llvm dev - Feb 2016 - Register spilling fix for experimental 6502 backend

[llvm-dev] Register spilling fix for experimental 6502 backend

[llvm-dev] Register spilling fix for experimental 6502 backend

[llvm-dev] Register spilling fix for experimental 6502 backend

[llvm-dev] Register spilling fix for experimental 6502 backend

[llvm-dev] Register spilling fix for experimental 6502 backend