thr3ads.net - llvm dev - [llvm-dev] Problems with subreg-liveness and Greedy RA [Jun 2021]

Quentin Colombet via llvm-dev
2021-Jun-23 19:03 UTC
[llvm-dev] Problems with subreg-liveness and Greedy RA

> On Jun 23, 2021, at 11:23 AM, Nemanja Ivanovic <nemanja.i.ibm at gmail.com> wrote:
> 
> Thank you so much for taking the time to answer, Quentin.
> 
> The bad copies are definitely added by live range splitting. The issue seems to be the LaneBitmasks for the various subregisters. Honestly, I don't really know what the bits of the LaneBitmasks produced by TableGen are meant to mean, and I can't make any sense of them. They seem to lead the register allocator astray.
> Here are the LaneBitmasks from the register include file:
> static const LaneBitmask SubRegIndexLaneMaskTable[] = {
>   LaneBitmask::getAll(),
>   LaneBitmask(0x0000000000000001), // sub_32
>   LaneBitmask(0x0000000000000002), // sub_64
>   LaneBitmask(0x0000000000000004), // sub_eq
>   LaneBitmask(0x0000000000000001), // sub_gp8_x0
>   LaneBitmask(0x0000000000000200), // sub_gp8_x1
>   LaneBitmask(0x0000000000000008), // sub_gt
>   LaneBitmask(0x0000000000000010), // sub_lt
>   LaneBitmask(0x0000000000000042), // sub_pair0
>   LaneBitmask(0x0000000000000180), // sub_pair1
>   LaneBitmask(0x0000000000000020), // sub_un
>   LaneBitmask(0x0000000000000002), // sub_vsx0
>   LaneBitmask(0x0000000000000040), // sub_vsx1
>   LaneBitmask(0x0000000000000040), // sub_vsx1_then_sub_64
>   LaneBitmask(0x0000000000000080), // sub_pair1_then_sub_64
>   LaneBitmask(0x0000000000000080), // sub_pair1_then_sub_vsx0
>   LaneBitmask(0x0000000000000100), // sub_pair1_then_sub_vsx1
>   LaneBitmask(0x0000000000000100), // sub_pair1_then_sub_vsx1_then_sub_64
>   LaneBitmask(0x0000000000000200), // sub_gp8_x1_then_sub_32
>  };
> 
> For example, what does it mean that the masks for sub_64 and sub_vsx0 are the same?
That just means they overlap. That’s fine (I think!)

From LaneBitmask.h:
/// Iff the target has a register such that:
///
///   getSubReg(Reg, A) overlaps getSubReg(Reg, B)
///
/// then:
///
///   (getSubRegIndexLaneMask(A) & getSubRegIndexLaneMask(B)) != 0
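
As a minimal sketch of the rule quoted above, the overlap test can be replayed on a few masks copied from the table in Nemanja's message. The `Lanes` alias and the `overlaps` helper are hypothetical stand-ins for illustration, not LLVM's `LaneBitmask` API.

```cpp
#include <cstdint>

// Hypothetical stand-in for LaneBitmask: one bit per lane.
using Lanes = std::uint64_t;

// Values copied from the SubRegIndexLaneMaskTable quoted in this thread.
constexpr Lanes sub_64    = 0x002;
constexpr Lanes sub_vsx0  = 0x002;
constexpr Lanes sub_vsx1  = 0x040;
constexpr Lanes sub_pair0 = 0x042;
constexpr Lanes sub_pair1 = 0x180;

// Per the quoted comment, overlapping subregisters have masks that share a bit.
constexpr bool overlaps(Lanes A, Lanes B) { return (A & B) != 0; }

// sub_64 and sub_vsx0 share mask 0x2, so they are reported as overlapping.
static_assert(overlaps(sub_64, sub_vsx0), "sub_64 overlaps sub_vsx0");
// The two halves of a VSX pair use disjoint bits.
static_assert(!overlaps(sub_vsx0, sub_vsx1), "sub_vsx0 and sub_vsx1 are disjoint");
// sub_pair0 (0x42) includes the bit used by sub_vsx1 (0x40).
static_assert(overlaps(sub_pair0, sub_vsx1), "sub_pair0 overlaps sub_vsx1");
static_assert(!overlaps(sub_pair0, sub_pair1), "sub_pair0 and sub_pair1 are disjoint");
```

Note that a shared mask only says the two indices overlap; it does not say they name the same bits of the register.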

> The two subregisters certainly do not represent the same lanes in their respective registers. The sub_vsx0 subregister is the first VSX register in a VSX register pair, and each of the two subregisters of a VSX register pair (sub_vsx0, sub_vsx1) has its own scalar subregister (sub_64).
> 
> I have also attached the output of RA, but it is huge :(
> It is the result of specifying the options -debug-only=regalloc -print-before=greedy -print-after=greedy on the command line.
Thanks, I’ll try to take a look this week.
Looking at these lines, I wonder whether the issue is simply that we didn’t pass the right subregister index. That is, the following code would have been fine with sub_vsx0 instead of sub_64:

80988B    undef %7526.sub_64:vsrprc = COPY %7527.sub_64:vsrprc
84324B    undef %7501.sub_64:vsrprc = COPY %7526.sub_64:vsrprc
84328B    %5546:vsrc = contract nofpexcept XVMADDADP %5546:vsrc(tied-def 0), %7501.sub_vsx0:vsrprc
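
One way to see why the choice of subregister index matters here is to compare bit ranges. The sketch below assumes the `SubRegIndex<size, offset>` definitions quoted further down in this thread (with the offset defaulting to 0) and uses a hypothetical `BitRange` type. It is not how the register allocator tracks liveness, only an illustration that a copy through sub_64 moves fewer bits than a use through sub_vsx0 reads.

```cpp
// Hypothetical model of a SubRegIndex<Size, Offset> inside a 256-bit vsrprc register.
struct BitRange {
  unsigned Size;   // width in bits
  unsigned Offset; // starting bit within the containing register
};

// Sizes and offsets taken from the PPC definitions quoted in this thread.
constexpr BitRange sub_64{64, 0};
constexpr BitRange sub_vsx0{128, 0};
constexpr BitRange sub_vsx1{128, 128};

// True if every bit of Inner lies within Outer.
constexpr bool covers(BitRange Outer, BitRange Inner) {
  return Outer.Offset <= Inner.Offset &&
         Inner.Offset + Inner.Size <= Outer.Offset + Outer.Size;
}

// A sub_vsx0 copy would define everything a sub_64 use reads...
static_assert(covers(sub_vsx0, sub_64), "sub_vsx0 contains sub_64");
// ...but a sub_64 copy defines only half of what a sub_vsx0 use reads.
static_assert(!covers(sub_64, sub_vsx0), "sub_64 does not contain sub_vsx0");
// The second VSX register of the pair does not contain this sub_64 at all.
static_assert(!covers(sub_vsx1, sub_64), "sub_vsx1 does not contain sub_64");
```

Under this model, the COPY of sub_64 defines 64 of the 128 bits that XVMADDADP reads through sub_vsx0, even though both indices carry the same lane mask (0x2) in the generated table.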
> 
> On Tue, Jun 22, 2021 at 3:21 PM Quentin Colombet <qcolombet at apple.com> wrote:
> 
> 
>> On Jun 21, 2021, at 10:05 AM, Nemanja Ivanovic via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>> 
>> I am having a really difficult time with subregister-related issues when I turn on subregister liveness tracking.
>> 
>> Before RA:
>> 79760B    %2216:vsrc = LXVDSX %5551:g8rc_and_g8rc_nox0, %2215:g8rc :: (load 8 from %ir.scevgep1857.cast, !alias.scope !92, !noalias !93)
>> 79872B    %2225:vsrprc = LXVP 352, %661:g8rc_and_g8rc_nox0
>> 84328B    %5540:vsrc = contract nofpexcept XVMADDADP %5540:vsrc(tied-def 0), %2225.sub_vsx0:vsrprc, %2216:vsrc, implicit $rm
>> 
>> After RA (greedy):
>> 79744B    %2214:vsrc = LXVDSX %5551:g8rc_and_g8rc_nox0, %6477:g8rc :: (load 8 from %ir.scevgep1860.cast, !alias.scope !92, !noalias !93)
>> 79872B    %7503:vsrprc = LXVP 352, %661:g8rc_and_g8rc_nox0
>> 80248B    %7527:vsrprc = COPY %7503:vsrprc
>> 80988B    undef %7526.sub_64:vsrprc = COPY %7527.sub_64:vsrprc
>> 84324B    undef %7501.sub_64:vsrprc = COPY %7526.sub_64:vsrprc
>> 84328B    %5546:vsrc = contract nofpexcept XVMADDADP %5546:vsrc(tied-def 0), %7501.sub_vsx0:vsrprc, %2214:vsrc, implicit $rm
>> 
>> Subregister definitions for PPC:
>> def sub_64 : SubRegIndex<64>;
>> def sub_vsx0 : SubRegIndex<128>;
>> def sub_vsx1 : SubRegIndex<128, 128>;
>> def sub_pair0 : SubRegIndex<256>;
>> def sub_pair1 : SubRegIndex<256, 256>;
>> 
>> So the instruction at 84328B uses the full register %2216 and the high-order 128 bits of the (256-bit) register %2225. However, the register allocator splits the live range and introduces a copy of the high-order 64 bits of that 256-bit register, then another copy of that copy, and rewrites the use in instruction 84328B to that copy. The copy is marked undef, so the register allocator assigns just some random register to the use of that copy in 84328B.
>> 
>> Or maybe I am completely misinterpreting the meaning of the debug dumps
>> from the register allocator.
>> 
>> This appears to be related to lane masks and dead lane detection, although I don't see dead lane detection marking anything unexpected as undef (seems to just be INSERT_SUBREG and PHI).
> 
> Are the copies added by dead lane detection or by live-range splitting?
> 
> The undef flag on the definition of %7501 is suspicious and, depending on how you look at it, so is the one on %7526. Essentially, we are losing the full copy in this chain of copies, and I wonder what is at fault here.
> 
> Could you share the debug output of regalloc?
> 
>> 
>> If anyone has suggestions on what might be the issue and/or how to go about figuring this out and fixing it, I would really appreciate it.
>> 
>> Nemanja
> 
> <ra-before-after-debug.txt>