thr3ads.net - llvm dev - [LLVMdev] string input for the integrated assembler [Mar 2015]

Steve King

2015-Mar-18 17:06 UTC

[LLVMdev] string input for the integrated assembler

On Tue, Mar 17, 2015 at 6:14 PM, Tim Northover <t.p.northover at gmail.com> wrote:
>> As a simplification, the compiler deals almost exclusively in pseudo
>> instructions.  By x86 analogy, using pseudos to unfold a TEST32rm into
>> MOV32rm + TEST32rr means I can skip the complex operand fitting effort
>> needed to pick specific machine instructions.  There are many such
>> examples where handling real instructions would become a gross
>> overload.
>>
>> One drawback of this approach is that the integrated assembler
>> receives only unexpanded pseudos as input, which fails.  So, my target
>> must generate .s files and assemble as a separate step.
>
> It sounds like you're doing the expansion directly in the InstPrinter.

I probably used the term 'expansion' incorrectly.  Pseudos go 1:1 into
.s files, then MCTargetAsmParser does its job.  This class nicely
consolidates tblgen's auto-generated operand fitting logic, which for
me is quite a blob of code.  Should use of the integrated assembler
require targets to pick all machine instructions some other way?  If
the answer should be no, then handling pseudos via their AsmString
feels like a tidy answer.

Joerg Sonnenberger

2015-Mar-18 17:50 UTC

On Wed, Mar 18, 2015 at 10:06:38AM -0700, Steve King wrote:
> On Tue, Mar 17, 2015 at 6:14 PM, Tim Northover <t.p.northover at gmail.com> wrote:
> >> As a simplification, the compiler deals almost exclusively in pseudo
> >> instructions.  By x86 analogy, using pseudos to unfold a TEST32rm into
> >> MOV32rm + TEST32rr means I can skip the complex operand fitting effort
> >> needed to pick specific machine instructions.  There are many such
> >> examples where handling real instructions would become a gross
> >> overload.
> >>
> >> One drawback of this approach is that the integrated assembler
> >> receives only unexpanded pseudos as input, which fails.  So, my target
> >> must generate .s files and assemble as a separate step.
> >
> > It sounds like you're doing the expansion directly in the InstPrinter.
>
> I probably used the term 'expansion' incorrectly.  Pseudos go 1:1 into
> .s files, then MCTargetAsmParser does its job.  This class nicely
> consolidates tblgen's auto-generated operand fitting logic, which for
> me is quite a blob of code.  Should use of the integrated assembler
> require targets to pick all machine instructions some other way?  If
> the answer should be no, then handling pseudos via their AsmString
> feels like a tidy answer.

Please stop talking about "AsmString", that really makes no sense. It
is really hard to help you if you go back to an approach that is
difficult to understand and, from what we can understand, completely
wrong. It also doesn't help that the questions so far are pretty much
without meat. E.g. no example of what your "high-level" assembler
mnemonic looks like and what the resulting instructions look like.

Joerg

Steve King

2015-Mar-18 18:57 UTC

On Wed, Mar 18, 2015 at 10:50 AM, Joerg Sonnenberger <joerg at britannica.bec.de> wrote:
>> I probably used the term 'expansion' incorrectly.  Pseudos go 1:1 into
>> .s files, then MCTargetAsmParser does its job.  This class nicely
>> consolidates tblgen's auto-generated operand fitting logic, which for
>> me is quite a blob of code.  Should use of the integrated assembler
>> require targets to pick all machine instructions some other way?  If
>> the answer should be no, then handling pseudos via their AsmString
>> feels like a tidy answer.
>
> Please stop talking about "AsmString", that really makes no sense. It
> is really hard to help you if you go back to an approach that is
> difficult to understand and, from what we can understand, completely
> wrong. It also doesn't help that the questions so far are pretty much
> without meat. E.g. no example of what your "high-level" assembler
> mnemonic looks like and what the resulting instructions look like.

OK.  AsmString is the tblgen variable holding the string
representation of the instruction.  This doesn't exist in an MCInst
and I conflated the two.  Sorry about that.

We can use X86 for a meaty example that is close to my target.  In
X86InstrInfo::optimizeCompareInstr() we see code like this:

    case X86::SUB64ri32: NewOpcode = X86::CMP64ri32; break;
    case X86::SUB64ri8:  NewOpcode = X86::CMP64ri8;  break;
    case X86::SUB32ri:   NewOpcode = X86::CMP32ri;   break;
    case X86::SUB32ri8:  NewOpcode = X86::CMP32ri8;  break;
    case X86::SUB16ri:   NewOpcode = X86::CMP16ri;   break;
    case X86::SUB16ri8:  NewOpcode = X86::CMP16ri8;  break;
    case X86::SUB8ri:    NewOpcode = X86::CMP8ri;    break;

Here, the compiler must distinguish SUB "ri" from the more compact,
but logically redundant SUB "ri8".  Multiply that by the number of
operand widths, and again by the number of compare-like instructions.

My target has *many* more choices than just "ri" and "ri8" and checks
like this switch statement would explode.  At the time of
optimizeCompareInstr(), nitpicking all the various ways to encode an
immediate value for an instruction would be very painful.  It's enough
to know an immediate form exists as represented by a pseudo.  I get to
use something like this:

    case FOO::SUB64ri: NewOpcode = FOO::CMP64ri; break;
    case FOO::SUB32ri: NewOpcode = FOO::CMP32ri; break;
    case FOO::SUB16ri: NewOpcode = FOO::CMP16ri; break;
    case FOO::SUB8ri:  NewOpcode = FOO::CMP8ri;  break;

In the assembly output file, the pseudo emits something like
"cmpl $0x80,%eax", with no hint of the encoding.

Given the .s file, only the assembly parser worries if $0x80 is best
represented as an unsigned imm8, a signed imm16, imm32, power-of-2,
shifted nibble, special choices for given register destinations, and
so on.
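
As an illustration only, the kind of choice the assembly parser makes
here might be sketched as below.  This is not code from LLVM or from
any real target: the encoding names, the set of forms, and the order in
which they are tried are all hypothetical, and the fallback assumes the
value fits in 32 bits.

```cpp
#include <cstdint>

// Hypothetical immediate encodings; a real target defines its own set.
enum class ImmEncoding { UImm8, SImm16, Pow2, Imm32 };

// Return the first form, in an assumed most-compact-first order, that can
// represent Value.  Anything that matches no compact form falls through to
// Imm32, which is assumed (not checked) to be wide enough.
constexpr ImmEncoding pickImmEncoding(int64_t Value) {
  if (Value >= 0 && Value <= 0xFF)
    return ImmEncoding::UImm8;   // unsigned 8-bit
  if (Value >= INT16_MIN && Value <= INT16_MAX)
    return ImmEncoding::SImm16;  // signed 16-bit
  if (Value > 0 && (Value & (Value - 1)) == 0)
    return ImmEncoding::Pow2;    // exactly one bit set
  return ImmEncoding::Imm32;     // generic fallback
}

// The $0x80 operand from the example above lands in the unsigned 8-bit form.
static_assert(pickImmEncoding(0x80) == ImmEncoding::UImm8, "0x80 is a uimm8");
static_assert(pickImmEncoding(-1) == ImmEncoding::SImm16, "-1 is a simm16");
static_assert(pickImmEncoding(0x10000) == ImmEncoding::Pow2, "0x10000 is 2^16");
```

With a check like this living only in the parser, the compiler-side
switch never needs a case per encoding.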

Hopefully this is more clear.  I will back off from suggesting
how this use of pseudos could be accommodated with the integrated
assembler.

Regards,
-steve
