thr3ads.net - llvm dev - [LLVMdev] Be Careful with Positionally-Encoded Operands (AArch64, Mips, AMDGPU, etc.) [Mar 2014]

If this information is useful, please help other people find it:
Share via:

Hal Finkel

2014-Mar-13 08:21 UTC

[LLVMdev] Be Careful with Positionally-Encoded Operands (AArch64, Mips, AMDGPU, etc.)

Hello,

Some of the backends seem to be combining positional and named operands when
defining some instructions such that some of the positional operands overlap
with some of the named operands. I suspect this is not intentional; here's
an example:

AArch64 has the following instruction definition:

SMULHxxx {
  field bits<32> Inst = { 1, 0, 0, 1, 1, 0, 1, 1, 0, 1, 0, Rm{4}, Rm{3},
Rm{2}, Rm{1}, Rm{0}, 0, Ra{4}, Ra{3}, Ra{2}, Ra{1}, Ra{0}, Rn{4}, Rn{3}, Rn{2},
Rn{1}, Rn{0}, Rd{4}, Rd{3}, Rd{2}, Rd{1}, Rd{0} };
  ...
  dag OutOperandList = (outs GPR64:$Rd);
  dag InOperandList = (ins GPR64:$Rn, GPR64:$Rm);
  string AsmString = "smulh	$Rd, $Rn, $Rm";
  list<dag> Pattern = [(set i64:$Rd, (mulhs i64:$Rn, i64:$Rm))];
  ...
  bits<5> Rd = { ?, ?, ?, ?, ? };
  bits<5> Rn = { ?, ?, ?, ?, ? };
  bits<5> Rm = { ?, ?, ?, ?, ? };
  bits<5> Ra = { ?, ?, ?, ?, ? };
...
}

So how do the operands in OutOperandList and InOperandList get mapped to the
variables used to encode the instruction (Rd,Rn,Rm,Ra)? The first three match by
name (as they appear explicitly in OutOperandList and InOperandList), but what
about Ra? Ra contributes to defining the bits in Inst, and because there is, by
default, no overlap checking, it also gets mapped to the first operand:
GPR64:$Rd. The result, from AArch64GenMCCodeEmitter.inc is:

    case AArch64::SMULHxxx:
    case AArch64::UMULHxxx: {
      // op: Rd
      op = getMachineOpValue(MI, MI.getOperand(0), Fixups, STI);
      Value |= op & UINT64_C(31);
      // op: Rn
      op = getMachineOpValue(MI, MI.getOperand(1), Fixups, STI);
      Value |= (op & UINT64_C(31)) << 5;
      // op: Rm
      op = getMachineOpValue(MI, MI.getOperand(2), Fixups, STI);
      Value |= (op & UINT64_C(31)) << 16;
      // op: Ra
      op = getMachineOpValue(MI, MI.getOperand(0), Fixups, STI);
      Value |= (op & UINT64_C(31)) << 10;
      Value = fixMulHigh(MI, Value, STI);
      break;
    }

This may be correct (I have no idea), but even if it is, it seems like an odd
way to get this behavior: depending on the fact that, after operands are matched
by name, the remaining operands are matched by position such that they might
overlap with those operands matched by name.

In any case, I believe that the only in-tree target that still really needs the
positional operand support is PowerPC (because the named-operand matching scheme
still can't deal with the complex operands that PPC uses for handling memory
operands), and PowerPC needs the behavior that named and positional operands are
disjoint so that we can start transitioning toward using named mapping, so
I've added a new TableGen target bit to disable this problematic overlapping
operand behavior (noNamedPositionallyEncodedOperands) in r203767.

This means that in lib/Target/PowerPC/PPC.td, we have:

def PPCInstrInfo : InstrInfo {
...

  let noNamedPositionallyEncodedOperands = 1;

...
}

I'd like those maintaining the current backends (especially AArch64, Mips,
AMDGPU, which I know to be problematic in this regard) to try setting this and:
a) fix those definitions that are problematic or b) explain to me that the
current behavior is useful.

Thanks again,
Hal

-- 
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory

Tim Northover

2014-Mar-13 09:06 UTC

head link

[LLVMdev] Be Careful with Positionally-Encoded Operands (AArch64, Mips, AMDGPU, etc.)

Hi Hal,

This sounds like a really good idea in general, but...
>  let noNamedPositionallyEncodedOperands = 1;
The diagnostics produced are pretty atrocious (an assertion failure
without even mentioning which instruction is the problem). I couldn't
in good conscience recommend anyone try it in its present state if
there are more than 1-2 problems.
> I'd like those maintaining the current backends (especially AArch64,
> Mips, AMDGPU, which I know to be problematic in this regard) to try
> setting this and: a) fix those definitions that are problematic or b)
explain
> to me that the current behavior is useful.
I've looked into the SMULH/UMULH case (which appears to be the only
issue in AArch64). It wasn't intended, but was also harmless.
Fortunately it can be removed fairly easily so I've done that and
enabled the flag in r203772.

Cheers.

Tim.

Hal Finkel

2014-Mar-13 13:28 UTC

head link

[LLVMdev] Be Careful with Positionally-Encoded Operands (AArch64, Mips, AMDGPU, etc.)

----- Original Message -----> From: "Tim Northover" <t.p.northover at gmail.com>
> To: "Hal Finkel" <hfinkel at anl.gov>
> Cc: "LLVM Developers Mailing List" <llvmdev at cs.uiuc.edu>
> Sent: Thursday, March 13, 2014 4:06:49 AM
> Subject: Re: [LLVMdev] Be Careful with Positionally-Encoded Operands
(AArch64, Mips, AMDGPU, etc.)
> 
> Hi Hal,
> 
> This sounds like a really good idea in general, but...
> 
> >  let noNamedPositionallyEncodedOperands = 1;
> 
> The diagnostics produced are pretty atrocious (an assertion failure
> without even mentioning which instruction is the problem). I couldn't
> in good conscience recommend anyone try it in its present state if
> there are more than 1-2 problems.
Good point. I can add a better diagnostic for this case (where enabling
noNamedPositionallyEncodedOperands causes the position-based mapping to run off
the end of the operand list); I may not have time until Tuesday, however.
> 
> > I'd like those maintaining the current backends (especially
> > AArch64,
> > Mips, AMDGPU, which I know to be problematic in this regard) to try
> > setting this and: a) fix those definitions that are problematic or
> > b) explain
> > to me that the current behavior is useful.
> 
> I've looked into the SMULH/UMULH case (which appears to be the only
> issue in AArch64). It wasn't intended, but was also harmless.
> Fortunately it can be removed fairly easily so I've done that and
> enabled the flag in r203772.
Thanks!

 -Hal
> 
> Cheers.
> 
> Tim.
> 
-- 
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory

Tom Stellard

2014-Mar-13 14:46 UTC

head link

[LLVMdev] Be Careful with Positionally-Encoded Operands (AArch64, Mips, AMDGPU, etc.)

On Thu, Mar 13, 2014 at 03:21:47AM -0500, Hal Finkel
wrote:> Hello,
> 
> Some of the backends seem to be combining positional and named operands
when defining some instructions such that some of the positional operands
overlap with some of the named operands. I suspect this is not intentional;
here's an example:
> 
> AArch64 has the following instruction definition:
> 
> SMULHxxx {
>   field bits<32> Inst = { 1, 0, 0, 1, 1, 0, 1, 1, 0, 1, 0, Rm{4},
Rm{3}, Rm{2}, Rm{1}, Rm{0}, 0, Ra{4}, Ra{3}, Ra{2}, Ra{1}, Ra{0}, Rn{4}, Rn{3},
Rn{2}, Rn{1}, Rn{0}, Rd{4}, Rd{3}, Rd{2}, Rd{1}, Rd{0} };
>   ...
>   dag OutOperandList = (outs GPR64:$Rd);
>   dag InOperandList = (ins GPR64:$Rn, GPR64:$Rm);
>   string AsmString = "smulh	$Rd, $Rn, $Rm";
>   list<dag> Pattern = [(set i64:$Rd, (mulhs i64:$Rn, i64:$Rm))];
>   ...
>   bits<5> Rd = { ?, ?, ?, ?, ? };
>   bits<5> Rn = { ?, ?, ?, ?, ? };
>   bits<5> Rm = { ?, ?, ?, ?, ? };
>   bits<5> Ra = { ?, ?, ?, ?, ? };
> ...
> }
> 
> So how do the operands in OutOperandList and InOperandList get mapped to
the variables used to encode the instruction (Rd,Rn,Rm,Ra)? The first three
match by name (as they appear explicitly in OutOperandList and InOperandList),
but what about Ra? Ra contributes to defining the bits in Inst, and because
there is, by default, no overlap checking, it also gets mapped to the first
operand: GPR64:$Rd. The result, from AArch64GenMCCodeEmitter.inc is:
> 
>     case AArch64::SMULHxxx:
>     case AArch64::UMULHxxx: {
>       // op: Rd
>       op = getMachineOpValue(MI, MI.getOperand(0), Fixups, STI);
>       Value |= op & UINT64_C(31);
>       // op: Rn
>       op = getMachineOpValue(MI, MI.getOperand(1), Fixups, STI);
>       Value |= (op & UINT64_C(31)) << 5;
>       // op: Rm
>       op = getMachineOpValue(MI, MI.getOperand(2), Fixups, STI);
>       Value |= (op & UINT64_C(31)) << 16;
>       // op: Ra
>       op = getMachineOpValue(MI, MI.getOperand(0), Fixups, STI);
>       Value |= (op & UINT64_C(31)) << 10;
>       Value = fixMulHigh(MI, Value, STI);
>       break;
>     }
> 
> This may be correct (I have no idea), but even if it is, it seems like an
odd way to get this behavior: depending on the fact that, after operands are
matched by name, the remaining operands are matched by position such that they
might overlap with those operands matched by name.
> 
> In any case, I believe that the only in-tree target that still really needs
the positional operand support is PowerPC (because the named-operand matching
scheme still can't deal with the complex operands that PPC uses for handling
memory operands), and PowerPC needs the behavior that named and positional
operands are disjoint so that we can start transitioning toward using named
mapping, so I've added a new TableGen target bit to disable this problematic
overlapping operand behavior (noNamedPositionallyEncodedOperands) in r203767.
> 
> This means that in lib/Target/PowerPC/PPC.td, we have:
> 
> def PPCInstrInfo : InstrInfo {
> ...
> 
>   let noNamedPositionallyEncodedOperands = 1;
> 
> ...
> }
> 
> I'd like those maintaining the current backends (especially AArch64,
Mips, AMDGPU, which I know to be problematic in this regard) to try setting this
and: a) fix those definitions that are problematic or b) explain to me that the
current behavior is useful.
>
Hi Hal,

Thanks for working on this, I have run into several problems with the
positional encodings and I would be happy to see it go away.  However,
I was under the impression that positional encoding was the only way to
encode custom operand types that map to multiple machine operands.
See for example the MEMxi Operand sub-class in R600/R600Instructions.td.

Thanks,
Tom
> Thanks again,
> Hal
> 
> -- 
> Hal Finkel
> Assistant Computational Scientist
> Leadership Computing Facility
> Argonne National Laboratory
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev

Hal Finkel

2014-Mar-13 15:01 UTC

head link

[LLVMdev] Be Careful with Positionally-Encoded Operands (AArch64, Mips, AMDGPU, etc.)

----- Original Message -----> From: "Tom Stellard" <tom at stellard.net>
> To: "Hal Finkel" <hfinkel at anl.gov>
> Cc: "LLVM Developers Mailing List" <llvmdev at cs.uiuc.edu>
> Sent: Thursday, March 13, 2014 9:46:22 AM
> Subject: Re: [LLVMdev] Be Careful with Positionally-Encoded Operands
(AArch64, Mips, AMDGPU, etc.)
> 
> On Thu, Mar 13, 2014 at 03:21:47AM -0500, Hal Finkel wrote:
> > Hello,
> > 
> > Some of the backends seem to be combining positional and named
> > operands when defining some instructions such that some of the
> > positional operands overlap with some of the named operands. I
> > suspect this is not intentional; here's an example:
> > 
> > AArch64 has the following instruction definition:
> > 
> > SMULHxxx {
> >   field bits<32> Inst = { 1, 0, 0, 1, 1, 0, 1, 1, 0, 1, 0,
Rm{4},
> >   Rm{3}, Rm{2}, Rm{1}, Rm{0}, 0, Ra{4}, Ra{3}, Ra{2}, Ra{1},
> >   Ra{0}, Rn{4}, Rn{3}, Rn{2}, Rn{1}, Rn{0}, Rd{4}, Rd{3}, Rd{2},
> >   Rd{1}, Rd{0} };
> >   ...
> >   dag OutOperandList = (outs GPR64:$Rd);
> >   dag InOperandList = (ins GPR64:$Rn, GPR64:$Rm);
> >   string AsmString = "smulh	$Rd, $Rn, $Rm";
> >   list<dag> Pattern = [(set i64:$Rd, (mulhs i64:$Rn, i64:$Rm))];
> >   ...
> >   bits<5> Rd = { ?, ?, ?, ?, ? };
> >   bits<5> Rn = { ?, ?, ?, ?, ? };
> >   bits<5> Rm = { ?, ?, ?, ?, ? };
> >   bits<5> Ra = { ?, ?, ?, ?, ? };
> > ...
> > }
> > 
> > So how do the operands in OutOperandList and InOperandList get
> > mapped to the variables used to encode the instruction
> > (Rd,Rn,Rm,Ra)? The first three match by name (as they appear
> > explicitly in OutOperandList and InOperandList), but what about
> > Ra? Ra contributes to defining the bits in Inst, and because there
> > is, by default, no overlap checking, it also gets mapped to the
> > first operand: GPR64:$Rd. The result, from
> > AArch64GenMCCodeEmitter.inc is:
> > 
> >     case AArch64::SMULHxxx:
> >     case AArch64::UMULHxxx: {
> >       // op: Rd
> >       op = getMachineOpValue(MI, MI.getOperand(0), Fixups, STI);
> >       Value |= op & UINT64_C(31);
> >       // op: Rn
> >       op = getMachineOpValue(MI, MI.getOperand(1), Fixups, STI);
> >       Value |= (op & UINT64_C(31)) << 5;
> >       // op: Rm
> >       op = getMachineOpValue(MI, MI.getOperand(2), Fixups, STI);
> >       Value |= (op & UINT64_C(31)) << 16;
> >       // op: Ra
> >       op = getMachineOpValue(MI, MI.getOperand(0), Fixups, STI);
> >       Value |= (op & UINT64_C(31)) << 10;
> >       Value = fixMulHigh(MI, Value, STI);
> >       break;
> >     }
> > 
> > This may be correct (I have no idea), but even if it is, it seems
> > like an odd way to get this behavior: depending on the fact that,
> > after operands are matched by name, the remaining operands are
> > matched by position such that they might overlap with those
> > operands matched by name.
> > 
> > In any case, I believe that the only in-tree target that still
> > really needs the positional operand support is PowerPC (because
> > the named-operand matching scheme still can't deal with the
> > complex operands that PPC uses for handling memory operands), and
> > PowerPC needs the behavior that named and positional operands are
> > disjoint so that we can start transitioning toward using named
> > mapping, so I've added a new TableGen target bit to disable this
> > problematic overlapping operand behavior
> > (noNamedPositionallyEncodedOperands) in r203767.
> > 
> > This means that in lib/Target/PowerPC/PPC.td, we have:
> > 
> > def PPCInstrInfo : InstrInfo {
> > ...
> > 
> >   let noNamedPositionallyEncodedOperands = 1;
> > 
> > ...
> > }
> > 
> > I'd like those maintaining the current backends (especially
> > AArch64, Mips, AMDGPU, which I know to be problematic in this
> > regard) to try setting this and: a) fix those definitions that are
> > problematic or b) explain to me that the current behavior is
> > useful.
> >
> 
> Hi Hal,
> 
> Thanks for working on this, I have run into several problems with the
> positional encodings and I would be happy to see it go away.
>  However,
> I was under the impression that positional encoding was the only way
> to
> encode custom operand types that map to multiple machine operands.
> See for example the MEMxi Operand sub-class in
> R600/R600Instructions.td.
Yes, this is the remaining use case for the positional scheme. We need to invent
something to replace it (maybe this should be something as simple as mapping
'.' -> '_'). Thoughts?

 -Hal
> 
> Thanks,
> Tom
> 
> > Thanks again,
> > Hal
> > 
> > --
> > Hal Finkel
> > Assistant Computational Scientist
> > Leadership Computing Facility
> > Argonne National Laboratory
> > _______________________________________________
> > LLVM Developers mailing list
> > LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
> > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
> 
-- 
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory

Hal Finkel

2014-Mar-22 11:46 UTC

head link

[LLVMdev] Be Careful with Positionally-Encoded Operands (AArch64, Mips, AMDGPU, etc.)

----- Original Message -----> From: "Tim Northover" <t.p.northover at gmail.com>
> To: "Hal Finkel" <hfinkel at anl.gov>
> Cc: "LLVM Developers Mailing List" <llvmdev at cs.uiuc.edu>
> Sent: Thursday, March 13, 2014 4:06:49 AM
> Subject: Re: [LLVMdev] Be Careful with Positionally-Encoded Operands
(AArch64, Mips, AMDGPU, etc.)
> 
> Hi Hal,
> 
> This sounds like a really good idea in general, but...
> 
> >  let noNamedPositionallyEncodedOperands = 1;
> 
> The diagnostics produced are pretty atrocious (an assertion failure
> without even mentioning which instruction is the problem). I couldn't
> in good conscience recommend anyone try it in its present state if
> there are more than 1-2 problems.
I've fixed this in r204542; TableGen now produces a usable error instead of
asserting.

 -Hal
> 
> > I'd like those maintaining the current backends (especially
> > AArch64,
> > Mips, AMDGPU, which I know to be problematic in this regard) to try
> > setting this and: a) fix those definitions that are problematic or
> > b) explain
> > to me that the current behavior is useful.
> 
> I've looked into the SMULH/UMULH case (which appears to be the only
> issue in AArch64). It wasn't intended, but was also harmless.
> Fortunately it can be removed fairly easily so I've done that and
> enabled the flag in r203772.
> 
> Cheers.
> 
> Tim.
> 
-- 
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory

Possibly Parallel Threads

Search for more apparently analagous threads

llvm dev - Mar 2014 - [LLVMdev] Be Careful with Positionally-Encoded Operands (AArch64, Mips, AMDGPU, etc.)

[LLVMdev] Be Careful with Positionally-Encoded Operands (AArch64, Mips, AMDGPU, etc.)

[LLVMdev] Be Careful with Positionally-Encoded Operands (AArch64, Mips, AMDGPU, etc.)

[LLVMdev] Be Careful with Positionally-Encoded Operands (AArch64, Mips, AMDGPU, etc.)

[LLVMdev] Be Careful with Positionally-Encoded Operands (AArch64, Mips, AMDGPU, etc.)

[LLVMdev] Be Careful with Positionally-Encoded Operands (AArch64, Mips, AMDGPU, etc.)

[LLVMdev] Be Careful with Positionally-Encoded Operands (AArch64, Mips, AMDGPU, etc.)

Possibly Parallel Threads