On Tue, Mar 17, 2015 at 6:14 PM, Tim Northover <t.p.northover at gmail.com> wrote:
>> As a simplification, the compiler deals almost exclusively in pseudo
>> instructions. By x86 analogy, using pseudos to unfold a TEST32rm into
>> MOV32rm + TEST32rr means I can skip the complex operand fitting effort
>> needed to pick specific machine instructions. There are many such
>> examples where handling real instructions would become a gross
>> overload.
>>
>> One drawback of this approach is that the integrated assembler
>> receives only unexpanded pseudos as input, which fails. So, my target
>> must generate .s files and assemble as a separate step.
>
> It sounds like you're doing the expansion directly in the InstPrinter.

I probably used the term 'expansion' incorrectly. Pseudos go 1:1 into
.s files, then MCTargetAsmParser does its job. This class nicely
consolidates tblgen's auto-generated operand fitting logic, which for
me is quite a blob of code. Should use of the integrated assembler
require targets to pick all machine instructions some other way? If
the answer should be no, then handling pseudos via their AsmString
feels like a tidy answer.
Joerg Sonnenberger
2015-Mar-18 17:50 UTC
[LLVMdev] string input for the integrated assembler
On Wed, Mar 18, 2015 at 10:06:38AM -0700, Steve King wrote:
> On Tue, Mar 17, 2015 at 6:14 PM, Tim Northover <t.p.northover at gmail.com> wrote:
> >> As a simplification, the compiler deals almost exclusively in pseudo
> >> instructions. By x86 analogy, using pseudos to unfold a TEST32rm into
> >> MOV32rm + TEST32rr means I can skip the complex operand fitting effort
> >> needed to pick specific machine instructions. There are many such
> >> examples where handling real instructions would become a gross
> >> overload.
> >>
> >> One drawback of this approach is that the integrated assembler
> >> receives only unexpanded pseudos as input, which fails. So, my target
> >> must generate .s files and assemble as a separate step.
> >
> > It sounds like you're doing the expansion directly in the InstPrinter.
>
> I probably used the term 'expansion' incorrectly. Pseudos go 1:1 into
> .s files, then MCTargetAsmParser does its job. This class nicely
> consolidates tblgen's auto-generated operand fitting logic, which for
> me is quite a blob of code. Should use of the integrated assembler
> require targets to pick all machine instructions some other way? If
> the answer should be no, then handling pseudos via their AsmString
> feels like a tidy answer.

Please stop talking about "AsmString", that really makes no sense. It
is really hard to help you if you go back to an approach that is
difficult to understand and, from what we can understand, completely
wrong. It also doesn't help that the questions so far are pretty much
without meat. E.g. no example of what your "high-level" assembler
mnemonic looks like and what the resulting instructions look like.

Joerg
On Wed, Mar 18, 2015 at 10:50 AM, Joerg Sonnenberger <joerg at britannica.bec.de> wrote:
>>
>> I probably used the term 'expansion' incorrectly. Pseudos go 1:1 into
>> .s files, then MCTargetAsmParser does its job. This class nicely
>> consolidates tblgen's auto-generated operand fitting logic, which for
>> me is quite a blob of code. Should use of the integrated assembler
>> require targets to pick all machine instructions some other way? If
>> the answer should be no, then handling pseudos via their AsmString
>> feels like a tidy answer.
>
> Please stop talking about "AsmString", that really makes no sense. It
> is really hard to help you if you go back to an approach that is
> difficult to understand and from what we can understand, completely
> wrong. It also doesn't help that the questions so far are pretty much
> without meat. E.g. no example of what your "high-level" assembler
> mnemonic looks like and how the resulting instructions look like.

OK. AsmString is the tblgen variable holding the string
representation of the instruction. This doesn't exist in an MCInst
and I conflated the two. Sorry about that.

We can use X86 for a meaty example that is close to my target. In
X86InstrInfo::optimizeCompareInstr() we see code like this:

    case X86::SUB64ri32: NewOpcode = X86::CMP64ri32; break;
    case X86::SUB64ri8:  NewOpcode = X86::CMP64ri8;  break;
    case X86::SUB32ri:   NewOpcode = X86::CMP32ri;   break;
    case X86::SUB32ri8:  NewOpcode = X86::CMP32ri8;  break;
    case X86::SUB16ri:   NewOpcode = X86::CMP16ri;   break;
    case X86::SUB16ri8:  NewOpcode = X86::CMP16ri8;  break;
    case X86::SUB8ri:    NewOpcode = X86::CMP8ri;    break;

Here, the compiler must distinguish SUB "ri" from the more compact,
but logically redundant SUB "ri8". Multiply that by the number of
operand widths, and again by the number of compare-like instructions.
My target has *many* more choices than just "ri" and "ri8", and checks
like this switch statement would explode.

At the time of optimizeCompareInstr(), nitpicking all the various ways
to encode an immediate value for an instruction would be very painful.
It's enough to know an immediate form exists as represented by a
pseudo. I get to use something like this:

    case FOO::SUB64ri: NewOpcode = FOO::CMP64ri; break;
    case FOO::SUB32ri: NewOpcode = FOO::CMP32ri; break;
    case FOO::SUB16ri: NewOpcode = FOO::CMP16ri; break;
    case FOO::SUB8ri:  NewOpcode = FOO::CMP8ri;  break;

Into the assembly output file, the pseudo emits something like:

    cmpl $0x80,%eax

with no clue of encoding. Given the .s file, only the assembly parser
worries whether $0x80 is best represented as an unsigned imm8, a signed
imm16, imm32, power-of-2, shifted nibble, special choices for given
register destinations, and so on.

Hopefully this is more clear? I will back off from suggesting how
this use of pseudos could be accommodated with the integrated
assembler.

Regards,
-steve
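
For readers following along, the operand-fitting step described above can be
sketched roughly as below. This is only an illustration under stated
assumptions, not the FOO target's or LLVM's actual code: the ImmForm enum,
the helper names, and the preference order (unsigned imm8, then power-of-2,
then signed imm16, then imm32) are all hypothetical, and the shifted-nibble
and register-specific forms mentioned in the message are left out.

```cpp
#include <cstdint>

// Hypothetical immediate encodings for an imaginary FOO target.
// None of these names come from LLVM or from the original message.
enum class ImmForm { UImm8, Pow2, SImm16, Imm32 };

// True if v fits in an unsigned 8-bit field.
inline bool fitsUnsigned8(int64_t v) { return v >= 0 && v <= 0xFF; }

// True if v fits in a signed 16-bit field.
inline bool fitsSigned16(int64_t v) { return v >= -32768 && v <= 32767; }

// True if v is a positive power of two (assumed to be encodable as a
// small shift count on this imaginary target).
inline bool isPowerOf2(int64_t v) { return v > 0 && (v & (v - 1)) == 0; }

// Pick an encoding by trying forms in an assumed most-compact-first
// order. The compiler only says "an immediate form exists"; this choice
// is deferred to the assembler, which sees the literal value.
inline ImmForm pickImmForm(int64_t v) {
  if (fitsUnsigned8(v)) return ImmForm::UImm8;
  if (isPowerOf2(v))    return ImmForm::Pow2;
  if (fitsSigned16(v))  return ImmForm::SImm16;
  return ImmForm::Imm32;
}
```

With this split, the compiler-side switch only needs one pseudo per operand
width (FOO::CMP32ri and friends), and adding another immediate form means
changing pickImmForm() rather than every opcode switch.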