thr3ads.net - llvm dev - [LLVMdev] Instructions on a target with no general purpose registers [Jul 2014]

If this information is useful, please help other people find it:
Share via:

Edwin Amsler

2014-Jul-05 13:59 UTC

[LLVMdev] Instructions on a target with no general purpose registers

I've mentioned my sneaky plans to target the MOS6502 here before.

The big issue I think is that a lot of instructions don't really have a
choice for output register. It all just goes into the accumulator, X index, or Y
index based on the specific instruction.

So, my question is, when I'm defining my ins, outs and registers for these
instructions, is it going to be a problem that different instructions outputs
are defined by the function that is called? And I guess, how do I tell LLVM it
can't pick any old register for a given instruction?

Thanks,

Edwin

Owen Anderson

2014-Jul-05 17:55 UTC

head link

[LLVMdev] Instructions on a target with no general purpose registers

Hi Edwin

I’ve refrained from raining on your other thread, but LLVM’s instruction
selection model, and in fact a large portion of its code generation model, is
not well-fitted to modeling the 6502.  I don’t think you’re going to get
significant benefit from trying to bend it to work for the 6502, and you’ll
spend a lot of time working around baked in assumptions that aren’t true for
your target.

—Owen

On Jul 5, 2014, at 6:59 AM, Edwin Amsler <edwinguy at gmail.com> wrote:
> I've mentioned my sneaky plans to target the MOS6502 here before.
> 
> The big issue I think is that a lot of instructions don't really have a
choice for output register. It all just goes into the accumulator, X index, or Y
index based on the specific instruction.
> 
> So, my question is, when I'm defining my ins, outs and registers for
these instructions, is it going to be a problem that different instructions
outputs are defined by the function that is called? And I guess, how do I tell
LLVM it can't pick any old register for a given instruction?
> 
> Thanks,
> 
> Edwin
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev

David Chisnall

2014-Jul-05 18:07 UTC

head link

[LLVMdev] Instructions on a target with no general purpose registers

Hi Edwin,

I'm not sure how actively he follows the lists, but you should talk to Steve
Montgomery.  He has some experience in porting LLVM to 8-bit microcontrollers
(and gave a talk about it at the Cambridge LLVM Day in 2012).

David


On 5 Jul 2014, at 14:59, Edwin Amsler <edwinguy at gmail.com> wrote:
> I've mentioned my sneaky plans to target the MOS6502 here before.
> 
> The big issue I think is that a lot of instructions don't really have a
choice for output register. It all just goes into the accumulator, X index, or Y
index based on the specific instruction.
> 
> So, my question is, when I'm defining my ins, outs and registers for
these instructions, is it going to be a problem that different instructions
outputs are defined by the function that is called? And I guess, how do I tell
LLVM it can't pick any old register for a given instruction?
> 
> Thanks,
> 
> Edwin
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev

Owen Anderson

2014-Jul-05 19:51 UTC

head link

[LLVMdev] Instructions on a target with no general purpose registers

On Jul 5, 2014, at 11:47 AM, Bruce Hoult <bruce at hoult.org> wrote:
> Although I'm new to LLVM, I don't think that can really be true.
Sure, LLVM has the assumption of a machine with "lots" of registers,
but it works with the x86, which doesn't have many registers, and has
instructions that work only with fixed input or output registers.
Actually, LLVM is *also* poorly suited to machines with lots of registers.  A
lot of the register allocation infrastructure is built around small-ish register
files with relatively sparse aliasing relationships.  From experience, modeling
a machine with 256 GPRs and extensive use of register sequences results in a lot
of pain in various parts of the code generator.
> To a first approximation, you can treat the 6502 as a machine with 256 8
bit registers, and register to register (and register to memory)
pseudo-instructions that are expanded at a late stage to short sequences using
A, X, Y as temps. That will certainly *work*. Not optimally for 8 bit values,
but it's pretty much the best you can do anyway if you're dealing with
16 or 32 bit values.
This will definitely work, but you’re not really compiling for 6502 at that
point.  You’re compiling for a bytecode VM that has 256 GPRs, where each
bytecode opcode is implemented as a sequence of 6502 instructions.  If your goal
is just to get something to run on a 6502, this will definitely work, but it’s
not actually solving the problem of making LLVM work as a proper 6502 compiler.

Fundamentally, you have to ask what you’re looking to get out of this.  If it’s
just for fun, or if you just want to get (say) C++11 to run on your 6502 without
caring about the low-level performance, that’s definitely doable.  But if you’re
trying to actually build a production-grade 6502 compiler, you’ll find that
you’re unlikely to benefit from most of the facilities that LLVM provides in the
SelectionDAG and Machine layers.  They’re simply built for machines that work
very, very differently.  I’m not claiming it’s impossible to bend them to your
will, but I am questioning the value in doing so.  I think you’d get a better
result with less work if you translated out of LLVM IR into a code generation
pipeline more specialized for the needs of a 6502.

—Owen
> On Sun, Jul 6, 2014 at 5:55 AM, Owen Anderson <resistor at mac.com>
wrote:
> Hi Edwin
> 
> I’ve refrained from raining on your other thread, but LLVM’s instruction
selection model, and in fact a large portion of its code generation model, is
not well-fitted to modeling the 6502.  I don’t think you’re going to get
significant benefit from trying to bend it to work for the 6502, and you’ll
spend a lot of time working around baked in assumptions that aren’t true for
your target.
> 
> —Owen
> 
> 
> On Jul 5, 2014, at 6:59 AM, Edwin Amsler <edwinguy at gmail.com>
wrote:
> 
> > I've mentioned my sneaky plans to target the MOS6502 here before.
> >
> > The big issue I think is that a lot of instructions don't really
have a choice for output register. It all just goes into the accumulator, X
index, or Y index based on the specific instruction.
> >
> > So, my question is, when I'm defining my ins, outs and registers
for these instructions, is it going to be a problem that different instructions
outputs are defined by the function that is called? And I guess, how do I tell
LLVM it can't pick any old register for a given instruction?
> >
> > Thanks,
> >
> > Edwin
> > _______________________________________________
> > LLVM Developers mailing list
> > LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
> > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
> 
> 
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
> 
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20140705/a43c02ae/attachment.html>

David Given

2014-Jul-05 21:09 UTC

head link

[LLVMdev] Instructions on a target with no general purpose registers

On 7/5/14, 3:59 PM, Edwin Amsler wrote:
[...]> So, my question is, when I'm defining my ins, outs and registers for
these instructions, is it going to be a problem that different instructions
outputs are defined by the function that is called? And I guess, how do I tell
LLVM it can't pick any old register for a given instruction?
You can define a register class which matches a single register only and
then define an instruction which accepts parameters of only that class.
Unfortunately I actually tried this for my Videocore code generator and
the most likely outcome is that the register allocator will turn up its
toes and die. It really doesn't like this.

However, if you squint hard enough the 6502 becomes a processor with 256
byte-sized registers --- zero page. You may have better luck with a
processor model which uses a subset of these as your registers, and
possibly not telling LLVM about A, X and Y at all. Plus that gets you
useful free stuff like indirect indexed, which you can use to implement
*(addr+offset) for 8-bit offsets, and sneak a 16-bit add for free out of
the processor.

...

In another life I help look ater the ACK compiler suite, which is
ancient and crufty and produces terrible code, but dates from about the
era when the 6502 was still a thing and I've played with its 6502 code
generator (and the 8080 code generator); these processors will basically
not run modern languages acceptably. It's because modern languages are
all stack-frame based, and these processors don't have any support for
stack relative accesses.

On the Z80 you can just about fake it by using ix or iy as a user stack
pointer, but it's painfully slow, painfully bloaty (8 bytes of code to
read or write a 16-bit value from the stack frame!) and basically just
painful. I believe later Z80s eventually gained a stack-relative load
instruction, but I don't know if there was a corresponding store.

The 6502 isn't quite as bad. If your user stack pointer is in zero page,
you can use indirect indexed to give access to a 256-byte stack frame,
but it's not pretty:

ldy #offset
lda (sp), y # load high(?) byte
iny
ldx (sp), y # load low byte

...seven bytes, I think.

At least it's not as bad as the 8080.

Being able to forbid recursion and therefore map every stack frame
statically into RAM would improve things no end, as now every stack slot
has a compile-time known address; I remember asking here about this ages
back and was told that there wasn't a built-in pass to do this, but it
would be fairly easy to do. *shrug*

Or you could just start using FORTRAN.

--
┌─── ｄｇ＠ｃｏｗｌａｒｋ．ｃｏｍ ───── http://www.cowlark.com ─────
│ "Feminism encourages women to leave their husbands, kill their
│ children, practice withcraft, destroy capitalism and become lesbians."
│ --- Rev. Pat Robertson

-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 876 bytes
Desc: OpenPGP digital signature
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20140705/89a5d5a4/attachment.sig>

Edwin Amsler

2014-Jul-06 18:39 UTC

head link

[LLVMdev] Instructions on a target with no general purpose registers

So far, this is mostly a for-fun project as opposed to a production-grade
compiler. If it somehow magically gets there, hurray. If I lose interest in it,
no worries.

For an end goal, if I can write in C and produce native code, I’ll be happy
(ideally no interpreter). Even if I need to inline literally everything to make
my programs work. Current goal is to produce an assembler so I can hand-code
stuff using the instructions in my handy C64 Programmer’s Reference Guide.

I have to admit I’m somewhat worried about function calls, the stack, and
performance. Despite that, I’ll just keep plowing ahead until I bang into a wall
somewhere. I enjoy learning something new, and since the whole Target backend is
way different than anything I’ve worked with, I’m already pretty happy with
where I’ve gotten and where this is going.

Now to specifics (inline):

On Jul 5, 2014, at 4:09 PM, David Given <dg at cowlark.com> wrote:
> ...
> However, if you squint hard enough the 6502 becomes a processor with 256
> byte-sized registers --- zero page. You may have better luck with a
> processor model which uses a subset of these as your registers, and
> possibly not telling LLVM about A, X and Y at all. Plus that gets you
> useful free stuff like indirect indexed, which you can use to implement
> *(addr+offset) for 8-bit offsets, and sneak a 16-bit add for free out of
> the processor.
> 
I had planed on implementing a register bank in zero page already (register
definitions are already in place!), but hadn’t thought about outright hiding A,
X, and Y. That’s a great plan. It shouldn’t be much work to see what I can try
to optimize after that fact.
> ...
> 
> In another life I help look ater the ACK compiler suite, which is
> ancient and crufty and produces terrible code, but dates from about the
> era when the 6502 was still a thing and I've played with its 6502 code
> generator (and the 8080 code generator); these processors will basically
> not run modern languages acceptably. It's because modern languages are
> all stack-frame based, and these processors don't have any support for
> stack relative accesses.
Disappointing. 
> 
> On the Z80 you can just about fake it by using ix or iy as a user stack
> pointer, but it's painfully slow, painfully bloaty (8 bytes of code to
> read or write a 16-bit value from the stack frame!) and basically just
> painful. I believe later Z80s eventually gained a stack-relative load
> instruction, but I don't know if there was a corresponding store.
> 
> The 6502 isn't quite as bad. If your user stack pointer is in zero
page,
> you can use indirect indexed to give access to a 256-byte stack frame,
> but it's not pretty:
> 
>  ldy #offset
>  lda (sp), y    # load high(?) byte
>  iny
>  ldx (sp), y    # load low byte
> 
> ...seven bytes, I think.
> 
> At least it's not as bad as the 8080.
That’s not a bad plan. I’m assuming though that this would only be needed when a
16 bit value is passed in by reference? Sorry I don’t have any experience with
compilers, so I may be completely off here.

As for efficiency, the 6502 is pretty much garbage at handling anything given
its age, so I can’t really see a way even with pure assembly to get it to do
anything efficiently. Doing a 32bit add requires just a tonne of work, no matter
what you’re doing. 40 years of progress can iron out a heck of a lot efficiency
problems.

I probably need to go and read up on Sweet-16 to see how it handles things like
large integers and the like.
> 
> Being able to forbid recursion and therefore map every stack frame
> statically into RAM would improve things no end, as now every stack slot
> has a compile-time known address; I remember asking here about this ages
> back and was told that there wasn't a built-in pass to do this, but it
> would be fairly easy to do. *shrug*
Getting ahead of me here, but it could be good times.
> Or you could just start using FORTRAN.
Cobol’s more my jam. :P
> 
> -- 
> ┌─── ｄｇ＠ｃｏｗｌａｒｋ．ｃｏｍ ───── http://www.cowlark.com ─────
> │ "Feminism encourages women to leave their husbands, kill their
> │ children, practice withcraft, destroy capitalism and become lesbians.”
> │ --- Rev. Pat Robertson
> 
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev

Steve Montgomery

2014-Jul-08 07:48 UTC

head link

[LLVMdev] Instructions on a target with no general purpose registers

Hi Edwin,

as David says, I've implemented a back-end for the Freescale 8 and 16-bit
microcontrollers, HC08, HC12, HCS12, S12, S12X, ... It's far from perfect
and I've not finished but, as indicated in my off-list e-mail to you,
I'd be happy to share tips and tricks.

I didn't have a particular problem with the small number of registers. I
wrote instruction patterns in which the operand register classes often contained
a single register. I found the default greedy register allocator handled this
pretty well provided you remember to allow register classes to be inflated
(implement MOS6502RegisterInfo::getLargestLegalSuperClass). If you don't
them the register allocator has to reload into the same class that a vreg was
spilled from which means it can't break circular dependencies sometimes. If
a dependency can't be broken then the allocator will report running out of
registers.

If you go this route, you will also need to implement
MOS6502InstrInfo::foldMemoryOperandImpl so you can fold the reloads into other
instructions.

LLVM assumes that each binary operation has a corresponding instruction that can
operate on register operands. This isn't the case for the microcontrollers
I've been working on and isn't the case for the 6502 either. I think the
best solution is to implement pseudo instructions and let the register
allocator's spill/reload deal with them. Much of the time, an operand will
be reloaded, so can be folded. Any pseudos that remain after register allocation
can be handled by spilling to memory, e.g. a dedicated fixed offset in the stack
frame.

If you decide to use zero-page as registers, and not tell LLVM about A, X and Y,
then you've got more registers and they can all be the same register class.
I imagine it might be best in this situation to lower the IR mainly to library
calls.

Hope that helps. Let me know if you want any further info about how I
implemented the CPU12 backend.

Regards,

Steve

On 5 Jul 2014, at 19:07, David Chisnall <David.Chisnall at cl.cam.ac.uk>
wrote:
> Hi Edwin,
> 
> I'm not sure how actively he follows the lists, but you should talk to
Steve Montgomery.  He has some experience in porting LLVM to 8-bit
microcontrollers (and gave a talk about it at the Cambridge LLVM Day in 2012).
> 
> David
> 
> 
> On 5 Jul 2014, at 14:59, Edwin Amsler <edwinguy at gmail.com> wrote:
> 
>> I've mentioned my sneaky plans to target the MOS6502 here before.
>> 
>> The big issue I think is that a lot of instructions don't really
have a choice for output register. It all just goes into the accumulator, X
index, or Y index based on the specific instruction.
>> 
>> So, my question is, when I'm defining my ins, outs and registers
for these instructions, is it going to be a problem that different instructions
outputs are defined by the function that is called? And I guess, how do I tell
LLVM it can't pick any old register for a given instruction?
>> 
>> Thanks,
>> 
>> Edwin
>> _______________________________________________
>> LLVM Developers mailing list
>> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
> 
> 
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev

Possibly Parallel Threads

Search for more reasonably related threads

llvm dev - Jul 2014 - [LLVMdev] Instructions on a target with no general purpose registers

[LLVMdev] Instructions on a target with no general purpose registers

[LLVMdev] Instructions on a target with no general purpose registers

[LLVMdev] Instructions on a target with no general purpose registers

[LLVMdev] Instructions on a target with no general purpose registers

[LLVMdev] Instructions on a target with no general purpose registers

[LLVMdev] Instructions on a target with no general purpose registers

[LLVMdev] Instructions on a target with no general purpose registers

Possibly Parallel Threads