Edwin Amsler
2014-Jul-05 13:59 UTC
[LLVMdev] Instructions on a target with no general purpose registers
I've mentioned my sneaky plans to target the MOS6502 here before. The big issue I think is that a lot of instructions don't really have a choice for output register. It all just goes into the accumulator, X index, or Y index based on the specific instruction. So, my question is, when I'm defining my ins, outs and registers for these instructions, is it going to be a problem that different instructions outputs are defined by the function that is called? And I guess, how do I tell LLVM it can't pick any old register for a given instruction? Thanks, Edwin
Owen Anderson
2014-Jul-05 17:55 UTC
[LLVMdev] Instructions on a target with no general purpose registers
Hi Edwin I’ve refrained from raining on your other thread, but LLVM’s instruction selection model, and in fact a large portion of its code generation model, is not well-fitted to modeling the 6502. I don’t think you’re going to get significant benefit from trying to bend it to work for the 6502, and you’ll spend a lot of time working around baked in assumptions that aren’t true for your target. —Owen On Jul 5, 2014, at 6:59 AM, Edwin Amsler <edwinguy at gmail.com> wrote:> I've mentioned my sneaky plans to target the MOS6502 here before. > > The big issue I think is that a lot of instructions don't really have a choice for output register. It all just goes into the accumulator, X index, or Y index based on the specific instruction. > > So, my question is, when I'm defining my ins, outs and registers for these instructions, is it going to be a problem that different instructions outputs are defined by the function that is called? And I guess, how do I tell LLVM it can't pick any old register for a given instruction? > > Thanks, > > Edwin > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
David Chisnall
2014-Jul-05 18:07 UTC
[LLVMdev] Instructions on a target with no general purpose registers
Hi Edwin, I'm not sure how actively he follows the lists, but you should talk to Steve Montgomery. He has some experience in porting LLVM to 8-bit microcontrollers (and gave a talk about it at the Cambridge LLVM Day in 2012). David On 5 Jul 2014, at 14:59, Edwin Amsler <edwinguy at gmail.com> wrote:> I've mentioned my sneaky plans to target the MOS6502 here before. > > The big issue I think is that a lot of instructions don't really have a choice for output register. It all just goes into the accumulator, X index, or Y index based on the specific instruction. > > So, my question is, when I'm defining my ins, outs and registers for these instructions, is it going to be a problem that different instructions outputs are defined by the function that is called? And I guess, how do I tell LLVM it can't pick any old register for a given instruction? > > Thanks, > > Edwin > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Owen Anderson
2014-Jul-05 19:51 UTC
[LLVMdev] Instructions on a target with no general purpose registers
On Jul 5, 2014, at 11:47 AM, Bruce Hoult <bruce at hoult.org> wrote:> Although I'm new to LLVM, I don't think that can really be true. Sure, LLVM has the assumption of a machine with "lots" of registers, but it works with the x86, which doesn't have many registers, and has instructions that work only with fixed input or output registers.Actually, LLVM is *also* poorly suited to machines with lots of registers. A lot of the register allocation infrastructure is built around small-ish register files with relatively sparse aliasing relationships. From experience, modeling a machine with 256 GPRs and extensive use of register sequences results in a lot of pain in various parts of the code generator.> To a first approximation, you can treat the 6502 as a machine with 256 8 bit registers, and register to register (and register to memory) pseudo-instructions that are expanded at a late stage to short sequences using A, X, Y as temps. That will certainly *work*. Not optimally for 8 bit values, but it's pretty much the best you can do anyway if you're dealing with 16 or 32 bit values.This will definitely work, but you’re not really compiling for 6502 at that point. You’re compiling for a bytecode VM that has 256 GPRs, where each bytecode opcode is implemented as a sequence of 6502 instructions. If your goal is just to get something to run on a 6502, this will definitely work, but it’s not actually solving the problem of making LLVM work as a proper 6502 compiler. Fundamentally, you have to ask what you’re looking to get out of this. If it’s just for fun, or if you just want to get (say) C++11 to run on your 6502 without caring about the low-level performance, that’s definitely doable. But if you’re trying to actually build a production-grade 6502 compiler, you’ll find that you’re unlikely to benefit from most of the facilities that LLVM provides in the SelectionDAG and Machine layers. They’re simply built for machines that work very, very differently. I’m not claiming it’s impossible to bend them to your will, but I am questioning the value in doing so. I think you’d get a better result with less work if you translated out of LLVM IR into a code generation pipeline more specialized for the needs of a 6502. —Owen> On Sun, Jul 6, 2014 at 5:55 AM, Owen Anderson <resistor at mac.com> wrote: > Hi Edwin > > I’ve refrained from raining on your other thread, but LLVM’s instruction selection model, and in fact a large portion of its code generation model, is not well-fitted to modeling the 6502. I don’t think you’re going to get significant benefit from trying to bend it to work for the 6502, and you’ll spend a lot of time working around baked in assumptions that aren’t true for your target. > > —Owen > > > On Jul 5, 2014, at 6:59 AM, Edwin Amsler <edwinguy at gmail.com> wrote: > > > I've mentioned my sneaky plans to target the MOS6502 here before. > > > > The big issue I think is that a lot of instructions don't really have a choice for output register. It all just goes into the accumulator, X index, or Y index based on the specific instruction. > > > > So, my question is, when I'm defining my ins, outs and registers for these instructions, is it going to be a problem that different instructions outputs are defined by the function that is called? And I guess, how do I tell LLVM it can't pick any old register for a given instruction? > > > > Thanks, > > > > Edwin > > _______________________________________________ > > LLVM Developers mailing list > > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev > > > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20140705/a43c02ae/attachment.html>
David Given
2014-Jul-05 21:09 UTC
[LLVMdev] Instructions on a target with no general purpose registers
On 7/5/14, 3:59 PM, Edwin Amsler wrote: [...]> So, my question is, when I'm defining my ins, outs and registers for these instructions, is it going to be a problem that different instructions outputs are defined by the function that is called? And I guess, how do I tell LLVM it can't pick any old register for a given instruction?You can define a register class which matches a single register only and then define an instruction which accepts parameters of only that class. Unfortunately I actually tried this for my Videocore code generator and the most likely outcome is that the register allocator will turn up its toes and die. It really doesn't like this. However, if you squint hard enough the 6502 becomes a processor with 256 byte-sized registers --- zero page. You may have better luck with a processor model which uses a subset of these as your registers, and possibly not telling LLVM about A, X and Y at all. Plus that gets you useful free stuff like indirect indexed, which you can use to implement *(addr+offset) for 8-bit offsets, and sneak a 16-bit add for free out of the processor. ... In another life I help look ater the ACK compiler suite, which is ancient and crufty and produces terrible code, but dates from about the era when the 6502 was still a thing and I've played with its 6502 code generator (and the 8080 code generator); these processors will basically not run modern languages acceptably. It's because modern languages are all stack-frame based, and these processors don't have any support for stack relative accesses. On the Z80 you can just about fake it by using ix or iy as a user stack pointer, but it's painfully slow, painfully bloaty (8 bytes of code to read or write a 16-bit value from the stack frame!) and basically just painful. I believe later Z80s eventually gained a stack-relative load instruction, but I don't know if there was a corresponding store. The 6502 isn't quite as bad. If your user stack pointer is in zero page, you can use indirect indexed to give access to a 256-byte stack frame, but it's not pretty: ldy #offset lda (sp), y # load high(?) byte iny ldx (sp), y # load low byte ...seven bytes, I think. At least it's not as bad as the 8080. Being able to forbid recursion and therefore map every stack frame statically into RAM would improve things no end, as now every stack slot has a compile-time known address; I remember asking here about this ages back and was told that there wasn't a built-in pass to do this, but it would be fairly easy to do. *shrug* Or you could just start using FORTRAN. -- ┌─── dg@cowlark.com ───── http://www.cowlark.com ───── │ "Feminism encourages women to leave their husbands, kill their │ children, practice withcraft, destroy capitalism and become lesbians." │ --- Rev. Pat Robertson -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 876 bytes Desc: OpenPGP digital signature URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20140705/89a5d5a4/attachment.sig>
Edwin Amsler
2014-Jul-06 18:39 UTC
[LLVMdev] Instructions on a target with no general purpose registers
So far, this is mostly a for-fun project as opposed to a production-grade compiler. If it somehow magically gets there, hurray. If I lose interest in it, no worries. For an end goal, if I can write in C and produce native code, I’ll be happy (ideally no interpreter). Even if I need to inline literally everything to make my programs work. Current goal is to produce an assembler so I can hand-code stuff using the instructions in my handy C64 Programmer’s Reference Guide. I have to admit I’m somewhat worried about function calls, the stack, and performance. Despite that, I’ll just keep plowing ahead until I bang into a wall somewhere. I enjoy learning something new, and since the whole Target backend is way different than anything I’ve worked with, I’m already pretty happy with where I’ve gotten and where this is going. Now to specifics (inline): On Jul 5, 2014, at 4:09 PM, David Given <dg at cowlark.com> wrote:> ... > However, if you squint hard enough the 6502 becomes a processor with 256 > byte-sized registers --- zero page. You may have better luck with a > processor model which uses a subset of these as your registers, and > possibly not telling LLVM about A, X and Y at all. Plus that gets you > useful free stuff like indirect indexed, which you can use to implement > *(addr+offset) for 8-bit offsets, and sneak a 16-bit add for free out of > the processor. >I had planed on implementing a register bank in zero page already (register definitions are already in place!), but hadn’t thought about outright hiding A, X, and Y. That’s a great plan. It shouldn’t be much work to see what I can try to optimize after that fact.> ... > > In another life I help look ater the ACK compiler suite, which is > ancient and crufty and produces terrible code, but dates from about the > era when the 6502 was still a thing and I've played with its 6502 code > generator (and the 8080 code generator); these processors will basically > not run modern languages acceptably. It's because modern languages are > all stack-frame based, and these processors don't have any support for > stack relative accesses.Disappointing.> > On the Z80 you can just about fake it by using ix or iy as a user stack > pointer, but it's painfully slow, painfully bloaty (8 bytes of code to > read or write a 16-bit value from the stack frame!) and basically just > painful. I believe later Z80s eventually gained a stack-relative load > instruction, but I don't know if there was a corresponding store. > > The 6502 isn't quite as bad. If your user stack pointer is in zero page, > you can use indirect indexed to give access to a 256-byte stack frame, > but it's not pretty: > > ldy #offset > lda (sp), y # load high(?) byte > iny > ldx (sp), y # load low byte > > ...seven bytes, I think. > > At least it's not as bad as the 8080.That’s not a bad plan. I’m assuming though that this would only be needed when a 16 bit value is passed in by reference? Sorry I don’t have any experience with compilers, so I may be completely off here. As for efficiency, the 6502 is pretty much garbage at handling anything given its age, so I can’t really see a way even with pure assembly to get it to do anything efficiently. Doing a 32bit add requires just a tonne of work, no matter what you’re doing. 40 years of progress can iron out a heck of a lot efficiency problems. I probably need to go and read up on Sweet-16 to see how it handles things like large integers and the like.> > Being able to forbid recursion and therefore map every stack frame > statically into RAM would improve things no end, as now every stack slot > has a compile-time known address; I remember asking here about this ages > back and was told that there wasn't a built-in pass to do this, but it > would be fairly easy to do. *shrug*Getting ahead of me here, but it could be good times.> Or you could just start using FORTRAN.Cobol’s more my jam. :P> > -- > ┌─── dg@cowlark.com ───── http://www.cowlark.com ───── > │ "Feminism encourages women to leave their husbands, kill their > │ children, practice withcraft, destroy capitalism and become lesbians.” > │ --- Rev. Pat Robertson > > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Steve Montgomery
2014-Jul-08 07:48 UTC
[LLVMdev] Instructions on a target with no general purpose registers
Hi Edwin, as David says, I've implemented a back-end for the Freescale 8 and 16-bit microcontrollers, HC08, HC12, HCS12, S12, S12X, ... It's far from perfect and I've not finished but, as indicated in my off-list e-mail to you, I'd be happy to share tips and tricks. I didn't have a particular problem with the small number of registers. I wrote instruction patterns in which the operand register classes often contained a single register. I found the default greedy register allocator handled this pretty well provided you remember to allow register classes to be inflated (implement MOS6502RegisterInfo::getLargestLegalSuperClass). If you don't them the register allocator has to reload into the same class that a vreg was spilled from which means it can't break circular dependencies sometimes. If a dependency can't be broken then the allocator will report running out of registers. If you go this route, you will also need to implement MOS6502InstrInfo::foldMemoryOperandImpl so you can fold the reloads into other instructions. LLVM assumes that each binary operation has a corresponding instruction that can operate on register operands. This isn't the case for the microcontrollers I've been working on and isn't the case for the 6502 either. I think the best solution is to implement pseudo instructions and let the register allocator's spill/reload deal with them. Much of the time, an operand will be reloaded, so can be folded. Any pseudos that remain after register allocation can be handled by spilling to memory, e.g. a dedicated fixed offset in the stack frame. If you decide to use zero-page as registers, and not tell LLVM about A, X and Y, then you've got more registers and they can all be the same register class. I imagine it might be best in this situation to lower the IR mainly to library calls. Hope that helps. Let me know if you want any further info about how I implemented the CPU12 backend. Regards, Steve On 5 Jul 2014, at 19:07, David Chisnall <David.Chisnall at cl.cam.ac.uk> wrote:> Hi Edwin, > > I'm not sure how actively he follows the lists, but you should talk to Steve Montgomery. He has some experience in porting LLVM to 8-bit microcontrollers (and gave a talk about it at the Cambridge LLVM Day in 2012). > > David > > > On 5 Jul 2014, at 14:59, Edwin Amsler <edwinguy at gmail.com> wrote: > >> I've mentioned my sneaky plans to target the MOS6502 here before. >> >> The big issue I think is that a lot of instructions don't really have a choice for output register. It all just goes into the accumulator, X index, or Y index based on the specific instruction. >> >> So, my question is, when I'm defining my ins, outs and registers for these instructions, is it going to be a problem that different instructions outputs are defined by the function that is called? And I guess, how do I tell LLVM it can't pick any old register for a given instruction? >> >> Thanks, >> >> Edwin >> _______________________________________________ >> LLVM Developers mailing list >> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu >> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev > > > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev