Jakob Stoklund Olesen

2012-Jul-26 17:04 UTC

[LLVMdev] X86 sub_ss and sub_sd sub-register indexes

On Jul 26, 2012, at 9:43 AM, dag at cray.com wrote:
> Jakob Stoklund Olesen <jolesen at apple.com> writes:
> 
>> As far as I can tell, all sub-register operations involving sub_ss and
>> sub_sd can simply be replaced with COPY_TO_REGCLASS:
>> 
>>  def : Pat<(v4i32 (X86Movsd VR128:$src1, VR128:$src2)),
>>            (VMOVSDrr VR128:$src1, (EXTRACT_SUBREG (v4i32 VR128:$src2),
>>                                                   sub_sd))>;
>> 
>> Becomes:
>> 
>>  def : Pat<(v4i32 (X86Movsd VR128:$src1, VR128:$src2)),
>>            (VMOVSDrr VR128:$src1, (COPY_TO_REGCLASS VR128:$src2,
>>                                                     FR64))>;
> 
> A few questions:
> 
> Will COPY_TO_REGCLASS actually generate a copy instruction or can
> TableGen/isel fold it away?

Both EXTRACT_SUBREG and COPY_TO_REGCLASS are emitted as COPY instructions by
InstrEmitter. One as a sub-register copy, one as a full register copy. Both are
handled by the register coalescer.

It would actually be possible to have EmitCopyToRegClassNode() try to call
MRI->constrainRegClass() first, just like AddRegisterOperand() does. That
could avoid the copy in some cases, and you would simply get a VR128 register as
the second VMOVSDrr operand. I am not proposing we do that for now. Let the
register coalescer deal with that.

> What happens if the result of the above pattern using COPY_TO_REGCLASS
> is spilled?  Will we get a 64-bit store or a 128-bit store?

This behavior isn't affected by the change. FR64 registers are spilled with
64-bit stores, and VR128 registers are spilled with 128-bit stores.

When the register coalescer removes a copy between VR128 and FR64 registers, it
chooses the larger spill size for the result. This is the same for sub-register
copies and full register copies.

The important point here is that VR128 is a sub-class of FR64, so
getCommonSubClass(VR128, FR64) -> VR128. This is the Liskov substitution
principle for register classes.
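
To make the spill-size consequence concrete, here is a minimal toy model in C++. It is not LLVM code: the RegClass struct, the helper names, and the four-register XMM set are all hypothetical, and the sub-class rule used (registers are a subset, spill size at least as large) is an assumption of the sketch. It only illustrates why the common sub-class of VR128 and FR64 comes out as VR128 and therefore carries the 128-bit spill size.

```cpp
#include <algorithm>
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Toy register class (hypothetical, not LLVM's TargetRegisterClass).
struct RegClass {
  std::string name;
  unsigned spillBits;
  std::set<std::string> regs;
};

// Assumed rule for this sketch: a is a sub-class of b when every register
// in a is also in b and a spills at least as many bits as b.
bool isSubClassOf(const RegClass &a, const RegClass &b) {
  return a.spillBits >= b.spillBits &&
         std::includes(b.regs.begin(), b.regs.end(),
                       a.regs.begin(), a.regs.end());
}

// Return the least constrained class in `all` that is a sub-class of both
// a and b, or nullptr if there is none.
const RegClass *commonSubClass(const std::vector<RegClass> &all,
                               const RegClass &a, const RegClass &b) {
  const RegClass *best = nullptr;
  for (const RegClass &c : all) {
    if (!isSubClassOf(c, a) || !isSubClassOf(c, b))
      continue;
    // Prefer c if the current best is itself a sub-class of c.
    if (!best || isSubClassOf(*best, c))
      best = &c;
  }
  return best;
}

// Three toy classes over the same (shortened) XMM register set.
std::vector<RegClass> toyX86Classes() {
  const std::set<std::string> xmm = {"xmm0", "xmm1", "xmm2", "xmm3"};
  return {{"FR32", 32, xmm}, {"FR64", 64, xmm}, {"VR128", 128, xmm}};
}

// Spill size of the register that remains after a copy between a and b
// has been coalesced away, in this toy model.
unsigned spillBitsAfterCoalescing(const std::vector<RegClass> &all,
                                  const RegClass &a, const RegClass &b) {
  const RegClass *rc = commonSubClass(all, a, b);
  assert(rc && "the two classes share no sub-class");
  return rc->spillBits;
}
```

In this model, coalescing a VR128 register with an FR64 register yields a VR128 register, so the merged register spills 128 bits regardless of whether the original copy was a sub-register copy or a full register copy.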

/jakob

dag at cray.com

2012-Jul-26 17:28 UTC

Jakob Stoklund Olesen <jolesen at apple.com> writes:
>> What happens if the result of the above pattern using COPY_TO_REGCLASS
>> is spilled?  Will we get a 64-bit store or a 128-bit store?
>
> This behavior isn't affected by the change. FR64 registers are spilled
> with 64-bit stores, and VR128 registers are spilled with 128-bit
> stores.
>
> When the register coalescer removes a copy between VR128 and FR64
> registers, it chooses the larger spill size for the result. This is
> the same for sub-register copies and full register copies.

So if I understand this correctly, a pattern like this:

  def : Pat<(f64 (vector_extract (v2f64 VR128:$src), (iPTR 0))),
            (f64 (EXTRACT_SUBREG (v2f64 VR128:$src), sub_sd))>;

will currently use a 128-bit store if it is spilled?

That's really not good.

If the 128-bit register is not ever used as a 128-bit register,
shouldn't the coalescer pick the 64- or 32-bit register?

                                   -Dave

Jakob Stoklund Olesen

2012-Jul-26 17:43 UTC

On Jul 26, 2012, at 10:28 AM, dag at cray.com wrote:
> Jakob Stoklund Olesen <jolesen at apple.com> writes:
> 
>>> What happens if the result of the above pattern using COPY_TO_REGCLASS
>>> is spilled?  Will we get a 64-bit store or a 128-bit store?
>> 
>> This behavior isn't affected by the change. FR64 registers are spilled
>> with 64-bit stores, and VR128 registers are spilled with 128-bit
>> stores.
>> 
>> When the register coalescer removes a copy between VR128 and FR64
>> registers, it chooses the larger spill size for the result. This is
>> the same for sub-register copies and full register copies.
> 
> So if I understand this correctly, a pattern like this:
> 
>  def : Pat<(f64 (vector_extract (v2f64 VR128:$src), (iPTR 0))),
>            (f64 (EXTRACT_SUBREG (v2f64 VR128:$src), sub_sd))>;
> 
> will currently use a 128-bit store if it is spilled?

It will if we coalesce the COPY away, yes.

None of this is dependent on our using sub-registers, though. The coalescer
treats sub-register copies and full register copies equally.

> If the 128-bit register is not ever used as a 128-bit register,
> shouldn't the coalescer pick the 64- or 32-bit register?

That optimization is not currently implemented for sub-registers. For example,
if you create a GR64 virtual register and only ever use the sub_32bit
sub-register, it would be possible to replace the virtual register with a GR32
register. It's not impossible to do, but it doesn't come up a lot.

When not using sub-registers, the optimization does exist. For example, if you
have a VR128 virtual register, but all the instructions using it only require
FR32, MRI->recomputeRegClass() will figure it out, and downgrade to FR32.

It gets permission to do this because
X86RegisterInfo::getLargestLegalSuperClass(VR128) returns FR32.
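
The downgrade described above can be sketched with a small toy model in C++. This is not the real MRI->recomputeRegClass() implementation: the ClassId enum and every function below are hypothetical stand-ins, and the sketch assumes the three classes form a simple chain in which VR128 is a sub-class of FR64 and FR64 is a sub-class of FR32. The idea is to start from the largest legal super-class and narrow it by the constraint of each instruction that uses the register.

```cpp
#include <algorithm>
#include <vector>

// Toy chain of classes, ordered from least to most constrained.
// Assumption of this sketch: a higher value is a sub-class of every
// lower value.
enum ClassId { FR32 = 0, FR64 = 1, VR128 = 2 };

// Spill size in bits for each toy class.
unsigned spillBits(ClassId c) {
  static const unsigned bits[] = {32, 64, 128};
  return bits[c];
}

// In a chain, the common sub-class is simply the more constrained class.
ClassId commonSubClass(ClassId a, ClassId b) { return std::max(a, b); }

// Stand-in for the target hook: in this toy, every class may be widened
// all the way to FR32.
ClassId largestLegalSuperClass(ClassId) { return FR32; }

// Start from the largest legal super-class and narrow it by what each
// using instruction requires. Returns the class the register can have.
ClassId recomputeRegClass(ClassId current,
                          const std::vector<ClassId> &useConstraints) {
  ClassId rc = largestLegalSuperClass(current);
  for (ClassId need : useConstraints) {
    rc = commonSubClass(rc, need);
    if (rc == current)
      return current; // Some use needs the current class: nothing to gain.
  }
  return rc;
}
```

With this model, a VR128 register whose uses all require only FR32 ends up as FR32 and would spill 32 bits, while a single use that requires VR128 keeps the register in VR128.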

/jakob
