thr3ads.net - llvm dev - [llvm-dev] Experimental 6502 backend; memory operand folding problem [Feb 2016]

If this information is useful, please help other people find it:
Share via:

N. E. C. via llvm-dev

2016-Feb-12 09:39 UTC

[llvm-dev] Experimental 6502 backend; memory operand folding problem

Greetings, LLVM devs,

For the past few weeks, I have been putting together a 6502 backend for LLVM.
The 6502 and its derivatives, of course, have powered countless microcomputers,
game consoles and arcade machines over the past 40 years.

The backend is just an experimental hobby project right now. The code is
available here: <https://github.com/beholdnec/llvm-m6502>. This branch
introduces
a target called "m6502", which can be used in llc to compile some very
simple
functions. Only a few instructions are implemented, it's not useful for
anything
yet.

There was another attempt in August of last year by c64scene-ar on GitHub to
design a 6502 backend, however, the project appears to be stalled with no
substantial progress. As far as I know, my backend is the only one able to
generate 6502 instructions.

Here is a test file:
<https://gist.github.com/beholdnec/910eba79391bb24ba2fa>.

I would like to ask for help as I'm stuck on one particularly sticky
problem.
I'll describe the problem shortly.

Occasionally, the topic of a 6502 backend comes up on this mailing list. Here
is an old thread talking about some of the challenges involved:
<https://groups.google.com/forum/#!topic/llvm-dev/w37MfNU_Ag8>.

The 6502 has only three 8-bit registers: A, X and Y, and 256 bytes of hardware-
supported stack. Generating code for such a constrained system pushes LLVM to
its limits.

For one thing, LLVM couldn't figure out how to lower an ADD instruction that
added a reg to a reg. The 6502's ADD instruction can only add register A to
an
immediate or a value loaded from memory. There is no instruction that adds A to
another register.

I had thought LLVM would allocate a stack object for the second operand, but
it didn't, and LLVM threw an ISel matching error. I currently solve this
with
a custom ADD lowering function, see LowerADD in M6502ISelLowering.cpp.
Question: Is custom lowering ideal for this situation? Or, is there another way
to coax LLVM into recognizing ADD?

The problem I'm stuck on is folding memory operands. In the test file above,
in @testSum, switch %a, %b to %b, %a. llc will assert in Register Spilling:
"Remaining use wasn't a snippet copy". Debug output shows STRabs
being
generated,
followed by an attempted fold of a stack-load into ADDabs.

I must be on the wrong track in M6502InstrInfo::foldMemoryOperandImpl. If
someone could please explain this error, it would really help. Thanks!

- Nolan

Steve Montgomery via llvm-dev

2016-Feb-12 10:14 UTC

head link

[llvm-dev] Experimental 6502 backend; memory operand folding problem

I have a similar problem with my CPU12 family backend. There are very few
instructions that operate on two register operands so I have to force the
right-hand operand to be an immediate or load from memory.

I’ve been through several attempts at dealing with this issue but my current
favourite is to define pseudo instructions for register-register instructions,
e.g. ADD8rr, ADD16rr, SUB8rr, SUB16rr, AND8rr and so on. I then use a pre-RA
pass to convert the pseudos into rImm or rMem versions. Some instructions are
essentially free to convert because the RHS operand can be rematerialised but in
some cases it’s necessary to store to a new stack object and then fold the load
into the pseudo instruction to give the rMem version. It’s made more complicated
by the fact that some instructions are commutative and some actually do have rr
variants but it’s not always most efficient to use them, but that’s not really
important to the principle of the approach I’ve taken.

I did try custom lowering and I also tried using preprocessISelDAG() but both
those approaches are limited in scope to basic blocks. If you do it after
post-ISel then you get to examine the whole function and might be able to make
better decisions as to what can be rematerialised and what needs to be spilled
to the stack.

It’s been a long time since I used a 6502 so I’m a bit rusty on the instruction
set but IIRC it’s not just ADD that gives you this problem. Won’t SUB, CMP, AND,
EOR and XOR all have the same issue?

I’ve not seen that assertion failure before so can’t really help, sorry.

Steve
> On 12 Feb 2016, at 09:39, N. E. C. via llvm-dev <llvm-dev at
lists.llvm.org> wrote:
> 
> Greetings, LLVM devs,
> 
> For the past few weeks, I have been putting together a 6502 backend for
LLVM.
> The 6502 and its derivatives, of course, have powered countless
microcomputers,
> game consoles and arcade machines over the past 40 years.
> 
> The backend is just an experimental hobby project right now. The code is
> available here: <https://github.com/beholdnec/llvm-m6502>. This
branch
> introduces
> a target called "m6502", which can be used in llc to compile some
very simple
> functions. Only a few instructions are implemented, it's not useful for
anything
> yet.
> 
> There was another attempt in August of last year by c64scene-ar on GitHub
to
> design a 6502 backend, however, the project appears to be stalled with no
> substantial progress. As far as I know, my backend is the only one able to
> generate 6502 instructions.
> 
> Here is a test file:
<https://gist.github.com/beholdnec/910eba79391bb24ba2fa>.
> 
> I would like to ask for help as I'm stuck on one particularly sticky
problem.
> I'll describe the problem shortly.
> 
> Occasionally, the topic of a 6502 backend comes up on this mailing list.
Here
> is an old thread talking about some of the challenges involved:
> <https://groups.google.com/forum/#!topic/llvm-dev/w37MfNU_Ag8>.
> 
> The 6502 has only three 8-bit registers: A, X and Y, and 256 bytes of
hardware-
> supported stack. Generating code for such a constrained system pushes LLVM
to
> its limits.
> 
> For one thing, LLVM couldn't figure out how to lower an ADD instruction
that
> added a reg to a reg. The 6502's ADD instruction can only add register
A to an
> immediate or a value loaded from memory. There is no instruction that adds
A to
> another register.
> 
> I had thought LLVM would allocate a stack object for the second operand,
but
> it didn't, and LLVM threw an ISel matching error. I currently solve
this with
> a custom ADD lowering function, see LowerADD in M6502ISelLowering.cpp.
> Question: Is custom lowering ideal for this situation? Or, is there another
way
> to coax LLVM into recognizing ADD?
> 
> The problem I'm stuck on is folding memory operands. In the test file
above,
> in @testSum, switch %a, %b to %b, %a. llc will assert in Register Spilling:
> "Remaining use wasn't a snippet copy". Debug output shows
STRabs being
> generated,
> followed by an attempted fold of a stack-load into ADDabs.
> 
> I must be on the wrong track in M6502InstrInfo::foldMemoryOperandImpl. If
> someone could please explain this error, it would really help. Thanks!
> 
> - Nolan
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev

Bruce Hoult via llvm-dev

2016-Feb-12 13:23 UTC

head link

[llvm-dev] Experimental 6502 backend; memory operand folding problem

I haven't seen what you are doing, but if I was writing a back end for the
6502, I'd lie to LLVM and describe RAM page 0 as being the real registers,
and A, X and Y as being special purpose registers used for temporaries.

If your code is dealing with 8 bit values then you can keep a value in A
for some time, but if there are 16 bit variables then you have no choice
but to compile a = b + c + d into sequences like

clc
lda 10
adc 20
sta 40
lda 11
adc 21
sta 41
clc
lda 40
adc 30
sta 40
lda 41
adc 31
sta 41

(assuming a, b, c, d are stored in RAM at 40-41, 30-31, 20-21, and 10-11.)

I don't think there is any way you can do better by adding all the low
bytes together .. is there? You'd have to save the carries and add them one
at at time.

Hmm.

clc
lda 10
adc 20
php
clc
adc 30
sta 40
lda 11
adc 21
plp
adc 31
sta 41

Interesting .. saves two instructions (12 vs 14), six bytes (20 vs 26),
seven clock cycles (35 vs 40).

But I'm not sure it's worth the middle end of LLVM knowing about this.
Better to treat it as a 16 bit machine and use the actual 6502 register
only in a kind of macro way in code generation?

Also, this kind of code is simply *big*, and on a machine with small
memory. A typical RISC with 32 bit instructions does this in 8 bytes and
two instructions, and thumb does it in 6 bytes and three instructions...

Have you looked at compiling most code to Sweet16 (or an updated version,
but Woz did a nice job on that), and only innermost loops to real code?

http://www.6502.org/source/interpreters/sweet16.htm

This interacts well with native code -- it's easy to pop into Sweet16 and
out again.

Example use of Sweet16:

300  B9 00 02           LDA   IN,Y     ;get a char
303  C9 CD              CMP   #"M"     ;"M" for move
305  D0 09              BNE   NOMOVE   ;No. Skip move
307  20 89 F6           JSR   SW16     ;Yes, call SWEET 16
30A  41         MLOOP   LD    @R1      ;R1 holds source
30B  52                 ST    @R2      ;R2 holds dest. addr.
30C  F3                 DCR   R3       ;Decr. length
30D  07 FB              BNZ   MLOOP    ;Loop until done
30F  00                 RTN            ;Return to 6502 mode.
310  C9 C5      NOMOVE  CMP   #"E"     ;"E" char?
312  D0 13              BEQ   EXIT     ;Yes, exit
314  C8                 INY            ;No, cont.

An alternative would be to compile inner loops to native code and
everything else to a pseudo register machine (probably similar in concept
to Sweet16) but using indirect threaded code. i.e. The "code" is a
list of
addresses of functions, and the first two bytes of each function is the
address of the "interpreter" for that function. For native functions,
the
interpreter is the function itself i.e. the first two bytes of the function
point to the 3rd byte of the function, not to an interpreter.

Of course, all this assumes that you can compile to native code in the
first place, even if it is big. So that's a good first step.

On Fri, Feb 12, 2016 at 12:39 PM, N. E. C. via llvm-dev <
llvm-dev at lists.llvm.org> wrote:
> Greetings, LLVM devs,
>
> For the past few weeks, I have been putting together a 6502 backend for
> LLVM.
> The 6502 and its derivatives, of course, have powered countless
> microcomputers,
> game consoles and arcade machines over the past 40 years.
>
> The backend is just an experimental hobby project right now. The code is
> available here: <https://github.com/beholdnec/llvm-m6502>. This
branch
> introduces
> a target called "m6502", which can be used in llc to compile some
very
> simple
> functions. Only a few instructions are implemented, it's not useful for
> anything
> yet.
>
> There was another attempt in August of last year by c64scene-ar on GitHub
> to
> design a 6502 backend, however, the project appears to be stalled with no
> substantial progress. As far as I know, my backend is the only one able to
> generate 6502 instructions.
>
> Here is a test file: <
> https://gist.github.com/beholdnec/910eba79391bb24ba2fa>.
>
> I would like to ask for help as I'm stuck on one particularly sticky
> problem.
> I'll describe the problem shortly.
>
> Occasionally, the topic of a 6502 backend comes up on this mailing list.
> Here
> is an old thread talking about some of the challenges involved:
> <https://groups.google.com/forum/#!topic/llvm-dev/w37MfNU_Ag8>.
>
> The 6502 has only three 8-bit registers: A, X and Y, and 256 bytes of
> hardware-
> supported stack. Generating code for such a constrained system pushes LLVM
> to
> its limits.
>
> For one thing, LLVM couldn't figure out how to lower an ADD instruction
> that
> added a reg to a reg. The 6502's ADD instruction can only add register
A
> to an
> immediate or a value loaded from memory. There is no instruction that adds
> A to
> another register.
>
> I had thought LLVM would allocate a stack object for the second operand,
> but
> it didn't, and LLVM threw an ISel matching error. I currently solve
this
> with
> a custom ADD lowering function, see LowerADD in M6502ISelLowering.cpp.
> Question: Is custom lowering ideal for this situation? Or, is there
> another way
> to coax LLVM into recognizing ADD?
>
> The problem I'm stuck on is folding memory operands. In the test file
> above,
> in @testSum, switch %a, %b to %b, %a. llc will assert in Register Spilling:
> "Remaining use wasn't a snippet copy". Debug output shows
STRabs being
> generated,
> followed by an attempted fold of a stack-load into ADDabs.
>
> I must be on the wrong track in M6502InstrInfo::foldMemoryOperandImpl. If
> someone could please explain this error, it would really help. Thanks!
>
> - Nolan
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20160212/d4cf722e/attachment.html>

Krzysztof Parzyszek via llvm-dev

2016-Feb-12 13:54 UTC

head link

[llvm-dev] Experimental 6502 backend; memory operand folding problem

On 2/12/2016 7:23 AM, Bruce Hoult via llvm-dev wrote:> I haven't seen what you are doing, but if I was writing a back end for
> the 6502, I'd lie to LLVM and describe RAM page 0 as being the real
> registers, and A, X and Y as being special purpose registers used for
> temporaries.
How did you get the "(z), x" and "(z, y)" addressing modes
to work with
this scheme?  The "z" is two adjacent zero-page bytes---did you model 
16-bit registers as all possible pairs of adjacent 0p addresses?

-Krzysztof

-- 
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, 
hosted by The Linux Foundation

Possibly Parallel Threads

Search for more maybe matching threads

llvm dev - Feb 2016 - Experimental 6502 backend; memory operand folding problem

[llvm-dev] Experimental 6502 backend; memory operand folding problem

[llvm-dev] Experimental 6502 backend; memory operand folding problem

[llvm-dev] Experimental 6502 backend; memory operand folding problem

[llvm-dev] Experimental 6502 backend; memory operand folding problem

Possibly Parallel Threads