Dominique Torette via llvm-dev
2018-Apr-23 08:50 UTC
[llvm-dev] pre-RA scheduling/live register analysis optimization (handle move) forcing spill of registers
Hi,

I have a question about pre-RA scheduling and register spilling.
I'm writing a backend for a two-operand instruction set, so the results of FPU
operations have an implicit destination.
For example, the result of FMUL_A_oo is implicitly written to the register FA_ROUTMUL.
I have defined FPUaROUTMULRegisterClass, which contains only FA_ROUTMUL.
During instruction lowering, to avoid frequent spills of FA_ROUTMUL, I
systematically copy the result of FMUL_A_oo to a virtual register
through a COPY_TO_REGCLASS.

// Copy the implicit result of FDIV_A_oo into an FPUaOffsetClass virtual register.
def : Pat<(fdiv f32:$OffsetA, f32:$OffsetB),
          (COPY_TO_REGCLASS
              (FDIV_A_oo FPUaOffsetOperand:$OffsetA, FPUaOffsetOperand:$OffsetB),
              FPUaOffsetClass)>;

Instruction lowering goes as expected: every FMUL_A_oo is followed by a COPY,
which frees FPUaROUTMULRegisterClass.
These COPYs are at positions 64B and 112B in the example below. So far, so good.

My problem arises in a pre-RA instruction scheduling optimization that moves the
COPY at 64B down to 104B, so the two COPYs end up at positions 104B and 112B.
In the new sequence the result of the first FMUL_A_oo is still live when the
second FMUL_A_oo executes, so two registers from FPUaROUTMULRegisterClass are
needed at the same time, but the class only contains FA_ROUTMUL.
So a spill has to be inserted, which is exactly what I tried to avoid by
inserting the COPY. :-/

The 'handleMove' message comes from LiveIntervalAnalysis, but I don't
understand why the move happens or how to avoid this counterproductive
optimization.

TIA, Dominique Torette.

# *** IR Dump After MachineFunction Printer ***:
# Machine code for function addproddivConst: Post SSA
Function Live Ins: %FA_ROFF1 in %vreg0

0B      BB#0: derived from LLVM BB %entry
            Live Ins: %FA_ROFF1
16B         %vreg0<def> = COPY %FA_ROFF1; FPUaOffsetClass:%vreg0
32B         %vreg2<def> = MOVSUTO_A_iSLo 1077936128; FPUaOffsetClass:%vreg2
48B         %vreg3<def> = FMUL_A_oo %vreg0, %vreg2, %RFLAGA<imp-def,dead>; FPUaROUTMULRegisterClass:%vreg3 FPUaOffsetClass:%vreg0,%vreg2
64B         %vreg4<def> = COPY %vreg3; FPUaOffsetClass:%vreg4 FPUaROUTMULRegisterClass:%vreg3
80B         %vreg5<def> = MOVSUTO_A_iSLo 1056964608; FPUaOffsetClass:%vreg5
96B         %vreg6<def> = FMUL_A_oo %vreg0, %vreg5, %RFLAGA<imp-def,dead>; FPUaROUTMULRegisterClass:%vreg6 FPUaOffsetClass:%vreg0,%vreg5
112B        %vreg7<def> = COPY %vreg6; FPUaOffsetClass:%vreg7 FPUaROUTMULRegisterClass:%vreg6
128B        %vreg8<def> = FADD_A_oo %vreg4, %vreg7, %RFLAGA<imp-def,dead>; FPUaROUTADDRegisterClass:%vreg8 FPUaOffsetClass:%vreg4,%vreg7
144B        %FA_ROFF0<def> = COPY %vreg8; FPUaROUTADDRegisterClass:%vreg8
176B        MOVSUTO_SU_os_rpc %SU_ROFF0<kill>, %RPC<imp-def,dead>
192B        NOP

# End machine code for function addproddivConst.

handleMove 64B -> 104B: %vreg4<def> = COPY %vreg3; FPUaOffsetClass:%vreg4 FPUaROUTMULRegisterClass:%vreg3
     %vreg4:    [64r,128r:0)  0@64r
        -->     [104r,128r:0)  0@104r
     %vreg3:    [48r,64r:0)  0@48r
        -->     [48r,104r:0)  0@48r

# *** IR Dump After Machine Instruction Scheduler ***:
# Machine code for function addproddivConst: Post SSA
Function Live Ins: %FA_ROFF1 in %vreg0

0B      BB#0: derived from LLVM BB %entry
            Live Ins: %FA_ROFF1
16B         %vreg0<def> = COPY %FA_ROFF1; FPUaOffsetClass:%vreg0
32B         %vreg2<def> = MOVSUTO_A_iSLo 1077936128; FPUaOffsetClass:%vreg2
48B         %vreg3<def> = FMUL_A_oo %vreg0, %vreg2, %RFLAGA<imp-def,dead>; FPUaROUTMULRegisterClass:%vreg3 FPUaOffsetClass:%vreg0,%vreg2
80B         %vreg5<def> = MOVSUTO_A_iSLo 1056964608; FPUaOffsetClass:%vreg5
96B         %vreg6<def> = FMUL_A_oo %vreg0, %vreg5, %RFLAGA<imp-def,dead>; FPUaROUTMULRegisterClass:%vreg6 FPUaOffsetClass:%vreg0,%vreg5
104B        %vreg4<def> = COPY %vreg3; FPUaOffsetClass:%vreg4 FPUaROUTMULRegisterClass:%vreg3
112B        %vreg7<def> = COPY %vreg6; FPUaOffsetClass:%vreg7 FPUaROUTMULRegisterClass:%vreg6
128B        %vreg8<def> = FADD_A_oo %vreg4, %vreg7, %RFLAGA<imp-def,dead>; FPUaROUTADDRegisterClass:%vreg8 FPUaOffsetClass:%vreg4,%vreg7
144B        %FA_ROFF0<def> = COPY %vreg8; FPUaROUTADDRegisterClass:%vreg8
176B        MOVSUTO_SU_os_rpc %SU_ROFF0<kill>, %RPC<imp-def,dead>
192B        NOP

# End machine code for function addproddivConst.

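The effect of the move on FPUaROUTMULRegisterClass can be checked with a small toy model. This is only a sketch, not LLVM code: it treats each virtual register as a half-open interval from its def slot to its last use, and counts how many values of the class are live at once. The intervals for %vreg3 come from the handleMove log; the interval for %vreg6 is an assumption read off the instruction positions (def at 96B, last use at 112B), since the log does not print it. The names `max_live` and `CLASS_SIZE` are hypothetical.

```python
# Toy model (not LLVM code): peak number of simultaneously live values
# in a register class. Intervals are half-open [def_slot, last_use_slot).

def max_live(intervals):
    """Return the peak number of simultaneously live intervals."""
    events = []
    for start, end in intervals.values():
        events.append((start, 1))    # value becomes live at its def
        events.append((end, -1))     # value dies at its last use
    # At the same slot, process deaths before births: [a, b) and [b, c)
    # do not overlap.
    events.sort(key=lambda e: (e[0], e[1]))
    live = peak = 0
    for _, delta in events:
        live += delta
        peak = max(peak, live)
    return peak

# FPUaROUTMULRegisterClass values before the scheduler ran.
before = {"vreg3": (48, 64), "vreg6": (96, 112)}
# The same values after the COPY of %vreg3 moved from 64B to 104B.
after = {"vreg3": (48, 104), "vreg6": (96, 112)}

CLASS_SIZE = 1  # assumption from the thread: the class holds only FA_ROUTMUL

print(max_live(before), max_live(before) <= CLASS_SIZE)
print(max_live(after), max_live(after) <= CLASS_SIZE)
```

Under these assumptions the first line prints `1 True` and the second `2 False`: after the move, two values compete for the single physical register between 96B and 104B.
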
Quentin Colombet via llvm-dev
2018-Apr-23 22:44 UTC
[llvm-dev] pre-RA scheduling/live register analysis optimization (handle move) forcing spill of registers
Hi Dominique,

Not commenting on the scheduling part, as I don't know how the register
pressure tracking is done there.

Unless you constrain your copies to stay next to the MUL_A (e.g., using a
bundle), there is a non-zero chance that something is going to mess with them.

That said, the splitting mechanism should just insert the desired copies to
avoid spilling for you. Could you check why this is not happening? (Use
-debug-only=regalloc and check why it spills or why the splitting failed.)

One possible problem is that you don't have a bigger regclass that contains
FPUaROUTMULRegisterClass and that could be used when relaxing the constraint
with splitting.

Cheers,
-Quentin

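The bundling suggestion can be illustrated with a toy model. This is a sketch under simplifying assumptions, not LLVM's scheduler: instructions are plain tuples, the MOVSUTO_A_iSLo instructions are left out (their results are treated as already defined), and a "unit" is a list of instructions that must stay contiguous. The function names `respects_deps` and `reachable` are hypothetical. The model only asks whether the order seen after the Machine Instruction Scheduler is a legal reordering of whole units.

```python
from itertools import permutations

# (opcode, defined vreg, read vregs), taken from the dump in this thread.
MUL3 = ("FMUL_A_oo", "vreg3", ("vreg0", "vreg2"))
COPY4 = ("COPY", "vreg4", ("vreg3",))
MUL6 = ("FMUL_A_oo", "vreg6", ("vreg0", "vreg5"))
COPY7 = ("COPY", "vreg7", ("vreg6",))
ADD8 = ("FADD_A_oo", "vreg8", ("vreg4", "vreg7"))

# Assumption: these are defined before the region being reordered.
LIVE_IN = {"vreg0", "vreg2", "vreg5"}

def respects_deps(instrs):
    """True if every vreg is defined before it is read."""
    defined = set(LIVE_IN)
    for _, dest, srcs in instrs:
        if not set(srcs) <= defined:
            return False
        defined.add(dest)
    return True

def reachable(units, target):
    """True if some dependency-respecting order of whole units equals target."""
    for perm in permutations(units):
        flat = [ins for unit in perm for ins in unit]
        if flat == target and respects_deps(flat):
            return True
    return False

# Order seen after the Machine Instruction Scheduler.
scheduled = [MUL3, MUL6, COPY4, COPY7, ADD8]

# Every instruction is its own unit: the COPY may drift away from its FMUL.
unbundled = [[MUL3], [COPY4], [MUL6], [COPY7], [ADD8]]
# Each FMUL is glued to its COPY: the pair moves as one unit or not at all.
bundled = [[MUL3, COPY4], [MUL6, COPY7], [ADD8]]

print(reachable(unbundled, scheduled))
print(reachable(bundled, scheduled))
```

This prints `True` and then `False`: with independent instructions the problematic order is a legal schedule, while with each FMUL_A_oo glued to its COPY no ordering of units can separate them.
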
Andrew Trick via llvm-dev
2018-Apr-26 22:42 UTC
[llvm-dev] pre-RA scheduling/live register analysis optimization (handle move) forcing spill of registers
> On Apr 23, 2018, at 1:50 AM, Dominique Torette via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>
> This 'handleMove' is generated by LiveIntervalAnalysis, but I don't understand why it is generated and how to avoid this counterproductive optimization.

'handleMove' updates LiveIntervals when a virtual register read/write is moved.
The scheduler has a heuristic called biasPhysRegCopy that tries to avoid
creating any interference on physregs. You might check
-debug-only=machine-scheduler to see why a copy was moved, or just step through
the scheduler for a very small test case.

-Andy

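The interval bookkeeping described above can be mimicked with a toy function. This is a minimal sketch, not LLVM's implementation: it assumes each virtual register has a single def and is stored as a (def_slot, last_use_slot) pair, and it only covers the case in this thread, where the moved instruction is both the def of one register and the last use of another. The name `handle_move` and its parameters are hypothetical.

```python
# Toy model of the live-range update after an instruction moves.
# NOT LLVM's code: one def per vreg, ranges stored as (def_slot, last_use_slot).

def handle_move(ranges, defs, uses, old_slot, new_slot):
    """Return updated ranges after the instruction at old_slot moves to new_slot.

    defs and uses name the vregs that the moved instruction writes and reads.
    """
    updated = dict(ranges)
    for reg in defs:
        start, end = updated[reg]
        if start == old_slot:
            updated[reg] = (new_slot, end)      # the def moves with the instruction
    for reg in uses:
        start, end = updated[reg]
        if end == old_slot:                      # moved instruction was the last use
            updated[reg] = (start, new_slot)     # the range now ends at the new slot
    return updated

# Ranges printed in the handleMove log before the move.
ranges = {"vreg4": (64, 128), "vreg3": (48, 64)}

moved = handle_move(ranges, defs=["vreg4"], uses=["vreg3"],
                    old_slot=64, new_slot=104)
print(moved)
```

With the ranges from the log this prints `{'vreg4': (104, 128), 'vreg3': (48, 104)}`, matching the `-->` lines: %vreg4 starts later, and %vreg3 is stretched to 104B, which is what makes it overlap the second FMUL_A_oo result.
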