Short version: If the integrated assembler accepted assembly strings as input, more targets could take advantage of integrated assembly.

The longer version:

For a given assembly statement, my out-of-tree target has complex instruction selection logic -- more so than the in-tree targets. This target uses variable length instructions and a laborious hierarchy of tblgen AsmOperands to do the job. Assembly and disassembly with llvm-mc and llvm-objdump work fine.

As a simplification, the compiler deals almost exclusively in pseudo instructions. By x86 analogy, using pseudos to unfold a TEST32rm into MOV32rm + TEST32rr means I can skip the complex operand fitting effort needed to pick specific machine instructions. There are many such examples where handling real instructions would become a gross overload.

One drawback of this approach is that the integrated assembler receives only unexpanded pseudos as input, which fails. So, my target must generate .s files and assemble as a separate step.

If a target could pass an assembly string to the integrated assembler, the pseudo problem goes away. The string passed to the integrated assembler could come wrapped in a pseudo MCInst or whatever. Support for asm string parsing already exists in every target, so this doesn't seem like much of a stretch.

Thanks for any comments on this idea.

Regards,
-steve
> As a simplification, the compiler deals almost exclusively in pseudo
> instructions. By x86 analogy, using pseudos to unfold a TEST32rm into
> MOV32rm + TEST32rr means I can skip the complex operand fitting effort
> needed to pick specific machine instructions. There are many such
> examples where handling real instructions would become a gross
> overload.
>
> One drawback of this approach is that the integrated assembler
> receives only unexpanded pseudos as input, which fails. So, my target
> must generate .s files and assemble as a separate step.

It sounds like you're doing the expansion directly in the InstPrinter. Most targets using fused pseudo-instructions like this do it either in a separate pass (e.g. AArch64ExpandPseudoInsts.cpp) or during the lowering from MachineInstr to MCInst(s).

Skipping this for a stringly typed "yeah, whatever" interface sounds really hacky to me. Though you might be able to get something working by following what inline-asm calls do (I've no idea where they're lowered off the top of my head, but then I've no idea what your code is doing in the specifics either so it probably cancels out).

Cheers.

Tim.
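For readers who have not seen the lowering approach Tim mentions, the following is a toy model of it, written in Python rather than against LLVM's C++ API. The `Inst` type, the `TEST32rm_PSEUDO` opcode name, and the scratch register `r_tmp` are hypothetical; only the TEST32rm into MOV32rm + TEST32rr unfolding comes from the thread.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Inst:
    """Toy stand-in for both MachineInstr and MCInst: an opcode plus operands."""
    opcode: str
    operands: Tuple[str, ...]


def lower_to_mc(mi: Inst, scratch: str = "r_tmp") -> List[Inst]:
    """Lower one compiler-level instruction to one or more encodable ones."""
    if mi.opcode == "TEST32rm_PSEUDO":  # hypothetical fused pseudo
        reg, mem = mi.operands
        return [
            Inst("MOV32rm", (scratch, mem)),   # load the memory operand
            Inst("TEST32rr", (reg, scratch)),  # test against the loaded value
        ]
    # Anything else is assumed to be directly encodable already.
    return [mi]


def lower_function(body: List[Inst]) -> List[Inst]:
    """Apply the per-instruction lowering to a whole instruction list."""
    out: List[Inst] = []
    for mi in body:
        out.extend(lower_to_mc(mi))
    return out
```

The point of the sketch is only that the expansion happens on structured instructions before anything is printed, so the same expanded list can feed either an assembly printer or an object emitter.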
Joerg Sonnenberger
2015-Mar-18 01:22 UTC
[LLVMdev] string input for the integrated assembler
On Tue, Mar 17, 2015 at 05:47:31PM -0700, Steve King wrote:
> As a simplification, the compiler deals almost exclusively in pseudo
> instructions. By x86 analogy, using pseudos to unfold a TEST32rm into
> MOV32rm + TEST32rr means I can skip the complex operand fitting effort
> needed to pick specific machine instructions. There are many such
> examples where handling real instructions would become a gross
> overload.

You can use various forms of aliases and pseudo instructions for pure assembler use too. Consider the simplified immediate load syntax of ARM, which handles the constant pool by itself. See test/MC/ARM/ltorg.s.

Joerg
> It sounds like you're doing the expansion directly in the InstPrinter.
> Most targets using fused pseudo-instructions like this do it either in
> a separate pass (e.g. AArch64ExpandPseudoInsts.cpp) or during the
> lowering from MachineInstr to MCInst(s).

Another option that occurred to me is enhancing the PseudoInstExpansion handling to cover multiple instructions. Currently it can only produce one output, but I don't think there's any fundamental reason the output couldn't be a list of instructions instead.

Cheers.

Tim.
On Tue, Mar 17, 2015 at 6:14 PM, Tim Northover <t.p.northover at gmail.com> wrote:
>> As a simplification, the compiler deals almost exclusively in pseudo
>> instructions. By x86 analogy, using pseudos to unfold a TEST32rm into
>> MOV32rm + TEST32rr means I can skip the complex operand fitting effort
>> needed to pick specific machine instructions. There are many such
>> examples where handling real instructions would become a gross
>> overload.
>>
>> One drawback of this approach is that the integrated assembler
>> receives only unexpanded pseudos as input, which fails. So, my target
>> must generate .s files and assemble as a separate step.
>
> It sounds like you're doing the expansion directly in the InstPrinter.

I probably used the term 'expansion' incorrectly. Pseudos go 1:1 into .s files, then MCTargetAsmParser does its job. This class nicely consolidates tblgen's auto-generated operand fitting logic, which for me is quite a blob of code.

Should use of the integrated assembler require targets to pick all machine instructions some other way? If the answer should be no, then handling pseudos via their AsmString feels like a tidy answer.
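To make "operand fitting" concrete, here is a toy sketch of the idea: several encodings share one mnemonic, and the matcher picks the shortest one whose operand class accepts the operand. This is not how the tblgen-generated matcher works internally. The `add` mnemonic, the `ADDri8`/`ADDri16`/`ADDri32` opcode names, and their sizes are invented for illustration.

```python
from typing import List, Optional, Tuple

# Hypothetical encodings for one mnemonic, ordered from shortest to longest.
# Each entry: (opcode name, immediate width in bits, encoded size in bytes).
ADD_CANDIDATES: List[Tuple[str, int, int]] = [
    ("ADDri8", 8, 2),
    ("ADDri16", 16, 3),
    ("ADDri32", 32, 5),
]


def fits_signed(value: int, bits: int) -> bool:
    """True if value is representable as a signed integer of the given width."""
    return -(1 << (bits - 1)) <= value < (1 << (bits - 1))


def match_add(imm: int) -> Optional[str]:
    """Pick the shortest candidate whose immediate class accepts imm."""
    for opcode, bits, _size in ADD_CANDIDATES:
        if fits_signed(imm, bits):
            return opcode
    return None


def match_from_text(statement: str) -> Optional[str]:
    """Parse a statement such as 'add r1, 300' and reuse the same matcher."""
    mnemonic, rest = statement.split(None, 1)
    if mnemonic != "add":
        return None
    _reg, imm_text = [part.strip() for part in rest.split(",")]
    return match_add(int(imm_text, 0))
```

In this toy, `match_from_text` is a thin wrapper: the selection itself happens in `match_add`, which takes an already-structured operand. Whether a real target's matcher can be reached without going through text is exactly the question under discussion in the thread.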
To try to rephrase, these are the transformations to and from MC in LLVM:

assembly -> MCInsts
MachineInstrs -> MCInsts
MCInsts -> object file
MCInsts -> assembly

It sounds like you're having trouble translating from MachineInstrs to MCInsts, but printing assembly strings from MachineInstrs is easy for some reason.

I don't really see why we'd want to go back to the old way of printing assembly strings from the compiler, and then reparsing them, even if it were in-process. We did a lot of work to avoid that whole re-parsing step to save on compile time.

I don't fully understand the challenges your target presents, but I'm not convinced that they can't be overcome with some good engineering. Printing asm strings and building MCInsts seem like equivalently difficult problems that would leverage the same underlying logic.
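The four transformations in Reid's message can be modelled with a toy pipeline. The sketch below compares the direct path (MachineInstrs -> MCInsts -> object file) with the print-and-reparse path. Everything in it is hypothetical: the `MCInst` tuple type, the `MOVrr`/`ADDrr` opcodes, and the one-byte encodings stand in for a made-up target and do not use any LLVM API.

```python
from typing import List, Tuple

MCInst = Tuple[str, Tuple[int, ...]]  # (opcode, register numbers); toy type

# Hypothetical one-byte opcodes for a made-up target.
ENCODING = {"MOVrr": 0x10, "ADDrr": 0x20}


def lower(machine_instrs: List[MCInst]) -> List[MCInst]:
    """MachineInstrs -> MCInsts (identity here; real lowering does more)."""
    return list(machine_instrs)


def print_asm(insts: List[MCInst]) -> str:
    """MCInsts -> assembly text."""
    return "\n".join(
        f"{op} " + ", ".join(f"r{r}" for r in regs) for op, regs in insts
    )


def parse_asm(text: str) -> List[MCInst]:
    """Assembly text -> MCInsts."""
    insts = []
    for line in text.splitlines():
        op, rest = line.split(None, 1)
        regs = tuple(int(tok.strip()[1:]) for tok in rest.split(","))
        insts.append((op, regs))
    return insts


def emit_object(insts: List[MCInst]) -> bytes:
    """MCInsts -> bytes: an opcode byte followed by one byte per register."""
    out = bytearray()
    for op, regs in insts:
        out.append(ENCODING[op])
        out.extend(regs)
    return bytes(out)


prog = [("MOVrr", (1, 2)), ("ADDrr", (1, 3))]
direct = emit_object(lower(prog))
roundtrip = emit_object(parse_asm(print_asm(lower(prog))))
```

Both paths produce the same bytes in this model; the round trip only adds a print and a parse, which is the extra compile-time step the message refers to.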
On Wed, Mar 18, 2015 at 11:00 AM, Reid Kleckner <rnk at google.com> wrote:
> To try to rephrase, these are the transformations to and from MC in LLVM:
> assembly -> MCInsts
> MachineInstrs -> MCInsts
> MCInsts -> object file
> MCInsts -> assembly
>
> It sounds like you're having trouble translating from MachineInstrs to MCInsts,
> but printing assembly strings from MachineInstrs is easy for some reason.
>

Thanks, that's probably another way to put it. The "for some reason" is because the task of physical instruction selection is especially complex in my target. The tblgen'd code in MCTargetAsmParser was a significant investment and the only way I have to pick an instruction with a real encoding.