John Byrd via llvm-dev
2021-Mar-21 19:45 UTC
[llvm-dev] Tablegen backend for emulator core?
> > And I realized that, although I could write an emulator in the
> > traditional manner, tablegen already has most of the information it needs
> > to automatically generate the guts of an emulator.
>
> Simon Cook (CCed) previously used LLVM MC to help write a simulator
> <https://llvm.org/devmtg/2016-01/slides/fosdem16-aapsim.pdf>, which might
> be worth taking a look at. Though I understood from your email that you're
> imagining relying more heavily on TableGen for generating the execution
> loop.

Thank you. I think Cook understood what I was hinting at, towards the end of his presentation. You could build such a simulator by creating a large switch statement based on MCInsts, the way that Cook has done, or you could theoretically let tablegen create that switch for you... llvm-tblgen -gen-simulator was the way he put this idea. At the least, the concept maintains tablegen's DRY approach to representing machine instructions.

jwb
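For readers who have not seen the pattern, the "large switch statement" approach can be sketched roughly as below. This is only an illustration under stated assumptions: the toy ISA, `Inst`, `CpuState`, `step` and `run` are all hypothetical names, and `Inst` merely stands in for the role a decoded MCInst plays. It is not Cook's simulator and uses no LLVM API. The body of the switch is the part a TableGen backend would, in principle, emit.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical toy ISA; none of these names come from LLVM.
enum class Opcode : std::uint8_t { AddRR, AddRI, Jump, Halt };

// Stand-in for a decoded instruction (the role MCInst plays in the thread).
struct Inst {
  Opcode Op;
  std::uint8_t Dst;  // destination register index
  std::uint8_t Src;  // source register index
  std::int32_t Imm;  // immediate or absolute jump target
};

struct CpuState {
  std::array<std::int32_t, 4> Regs{};
  std::uint32_t PC = 0;  // index into the program, not a byte address
  bool Halted = false;
};

// One case per opcode: this switch is what a generator would produce.
inline void step(CpuState &S, const Inst &I) {
  std::uint32_t NextPC = S.PC + 1;
  switch (I.Op) {
  case Opcode::AddRR:
    S.Regs[I.Dst] += S.Regs[I.Src];
    break;
  case Opcode::AddRI:
    S.Regs[I.Dst] += I.Imm;
    break;
  case Opcode::Jump:
    NextPC = static_cast<std::uint32_t>(I.Imm);
    break;
  case Opcode::Halt:
    S.Halted = true;
    break;
  }
  S.PC = NextPC;
}

// Execution loop: stop on Halt, on running off the program, or after MaxSteps.
inline CpuState run(const std::vector<Inst> &Program, unsigned MaxSteps) {
  CpuState S;
  for (unsigned N = 0; N < MaxSteps && !S.Halted && S.PC < Program.size(); ++N)
    step(S, Program[S.PC]);
  return S;
}
```

In a real simulator the decode step would come from the disassembler rather than a pre-decoded vector; the vector just keeps the sketch self-contained.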
Simon Cook via llvm-dev
2021-Mar-23 11:41 UTC
[llvm-dev] Tablegen backend for emulator core?
Hi John,

This is an area I'm still greatly interested in since doing the work leading up to that talk. I have since worked on a second simulator using MCDisassembler as the decoder, but sadly haven't had the time to explore using TableGen for the semantics. It has, however, still been ticking over in my brain, so I have some more thoughts on what would need to be considered. I think it would be a good addition to LLVM and would be happy to see it added; in my mind it compares to what CGEN adds to the GNU toolchain.

The first question is what kind of simulation we want LLVM to model. If it's a simple instruction set simulator, then in some regards I don't imagine this being too hard. But if we want to stretch as far as modelling full pipelines (unlikely to be automatic), it would be good to generate the components, even if we have to build the pipeline by hand (I think the scheduling info in TableGen is probably insufficient for this).

The other large task would be identifying the semantics we can't get from TableGen patterns and working out a nice way to describe these in Instruction definitions. For architectures that have status bits modified by instructions and then used by later branch instructions, these are typically modelled as registers that are implicitly written, so definitions would need to be extended to describe these. I'm thinking you might have a second field that describes extra semantics that don't make sense for code generation, but I'm not 100% sure on that one. If we can auto-generate that, it would IMO dramatically reduce how much needs to be written by hand.

One thing to keep in mind, though, is that at some scale a single switch on Insn.getOpcode() might not be the best model for the more complex and varied architectures.
If you have multiple generations of an architecture, say a couple of generations of cores in one backend, you might want to split the simulator up and generate a different loop for each (maybe reusing ProcessorModels or something similar here?). That would help identify which instructions still need manually written semantics in a more mentally scalable way.

Practically speaking, if something like this were to be written, I think there's a good model to follow in how GlobalISel's reuse of TableGen patterns has gone: have a TableGen generator that generates what it can for some instructions, then ask someone to write the missing parts, and over time more things can move from hand-written to auto-generated.

As for driving the simulator, I've found the "forked objdump" approach works well, but if this were in-tree I'd expect something more specialised (and likely written from scratch) to fit in. It may also be possible to repurpose parts of LLDB's lldb-server, pulling in some "MCSimulator" library to have something that talks RSP (for people using LLVM + GDB), but I'm not familiar enough with all its components to know how feasible this would be.

Overall I still think this would be a great addition to LLVM, and I think the raw pieces are there. As with most things, it lives and dies by having people who would use and maintain it. I'm not sure how many others have interest or thoughts in this area, but I would certainly welcome such an addition.

Thanks,
Simon
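To make the point about status bits concrete, here is a rough sketch of what "a second field that describes extra semantics" could look like once it reaches generated C++. Everything here is an assumption for illustration: `InstrSemantics`, `ImplicitDefs`, `SemanticsTable` and the three-instruction toy ISA are hypothetical and do not correspond to any existing TableGen output or LLVM API. The idea is simply that the flags an instruction implicitly writes are data in a per-opcode table, so the flag update is generic rather than repeated in every case of the switch.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical names throughout; this is not an LLVM or TableGen API.
enum SimOpcode : unsigned { SUB, CMP, BEQ, NumOpcodes };

// Status bits, modelled as one implicitly written register.
enum StatusBit : std::uint8_t { FlagZ = 1 << 0, FlagN = 1 << 1 };

// Per-opcode record: the kind of data a TableGen backend could emit.
struct InstrSemantics {
  bool WritesResult;          // result is stored to the destination register
  std::uint8_t ImplicitDefs;  // status bits written as a side effect
};

constexpr InstrSemantics SemanticsTable[NumOpcodes] = {
    /*SUB*/ {true, FlagZ | FlagN},
    /*CMP*/ {false, FlagZ | FlagN},
    /*BEQ*/ {false, 0},
};

struct Cpu {
  std::int32_t Regs[4] = {};
  std::uint8_t Status = 0;
  std::uint32_t PC = 0;
};

// Update only the bits the instruction declares as implicit defs.
inline void writeImplicitFlags(Cpu &C, std::uint8_t Defs, std::int32_t Result) {
  std::uint8_t New = 0;
  if (Result == 0)
    New |= FlagZ;
  if (Result < 0)
    New |= FlagN;
  C.Status = static_cast<std::uint8_t>((C.Status & ~Defs) | (New & Defs));
}

// SUB and CMP compute Regs[A] - Regs[B]; BEQ jumps to Target if Z is set.
inline void execute(Cpu &C, SimOpcode Op, unsigned A, unsigned B,
                    std::uint32_t Target) {
  const InstrSemantics &Sem = SemanticsTable[Op];
  std::uint32_t NextPC = C.PC + 1;
  if (Op == SUB || Op == CMP) {
    std::int32_t Result = C.Regs[A] - C.Regs[B];
    if (Sem.WritesResult)
      C.Regs[A] = Result;
    writeImplicitFlags(C, Sem.ImplicitDefs, Result);
  } else if (Op == BEQ && (C.Status & FlagZ)) {
    NextPC = Target;
  }
  C.PC = NextPC;
}
```

Note that CMP and SUB share the same arithmetic and differ only in their table entries, which is the kind of sharing that would reduce how much has to be written by hand.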