Dominique Torette via llvm-dev
2018-Apr-23 08:50 UTC
[llvm-dev] pre-RA scheduling/live register analysis optimization (handle move) forcing spill of registers
Hi,

I have a question about pre-RA scheduling and register spilling.

I'm writing a backend for a two-operand instruction set, so the results of FPU operations have an implicit destination. For example, the result of FMUL_A_oo is implicitly the register FA_ROUTMUL. I have defined FPUaROUTMULRegisterClass, which contains only FA_ROUTMUL.

During instruction lowering, to avoid frequent spills of FA_ROUTMUL, I systematically copy the result of FMUL_A_oo to a virtual register through a COPY_TO_REGCLASS:

    def : Pat<(fdiv f32:$OffsetA, f32:$OffsetB),
              (COPY_TO_REGCLASS (FDIV_A_oo FPUaOffsetOperand:$OffsetA, FPUaOffsetOperand:$OffsetB), FPUaOffsetClass)>;

Instruction lowering goes as expected: every FMUL_A_oo is followed by a COPY, which frees FPUaROUTMULRegisterClass. These COPYs are at positions 64B and 112B in the example below. So far, so good.

My problem arises in a pre-RA instruction scheduling optimization, after which the COPYs sit at positions 104B and 112B (the first one is moved from 64B to 104B). The new code sequence leaves two FMUL_A_oo results live without an intervening COPY, so it requires two registers from FPUaROUTMULRegisterClass (which only includes FA_ROUTMUL). A spill therefore has to be inserted exactly where I tried to avoid one by inserting the COPY. :-/

This 'handleMove' is generated by LiveIntervalAnalysis, but I don't understand why it is generated or how to avoid this counterproductive optimization.

TIA, Dominique Torette.

# *** IR Dump After MachineFunction Printer ***:
# Machine code for function addproddivConst: Post SSA
Function Live Ins: %FA_ROFF1 in %vreg0

0B    BB#0: derived from LLVM BB %entry
          Live Ins: %FA_ROFF1
16B       %vreg0<def> = COPY %FA_ROFF1; FPUaOffsetClass:%vreg0
32B       %vreg2<def> = MOVSUTO_A_iSLo 1077936128; FPUaOffsetClass:%vreg2
48B       %vreg3<def> = FMUL_A_oo %vreg0, %vreg2, %RFLAGA<imp-def,dead>; FPUaROUTMULRegisterClass:%vreg3 FPUaOffsetClass:%vreg0,%vreg2
64B       %vreg4<def> = COPY %vreg3; FPUaOffsetClass:%vreg4 FPUaROUTMULRegisterClass:%vreg3
80B       %vreg5<def> = MOVSUTO_A_iSLo 1056964608; FPUaOffsetClass:%vreg5
96B       %vreg6<def> = FMUL_A_oo %vreg0, %vreg5, %RFLAGA<imp-def,dead>; FPUaROUTMULRegisterClass:%vreg6 FPUaOffsetClass:%vreg0,%vreg5
112B      %vreg7<def> = COPY %vreg6; FPUaOffsetClass:%vreg7 FPUaROUTMULRegisterClass:%vreg6
128B      %vreg8<def> = FADD_A_oo %vreg4, %vreg7, %RFLAGA<imp-def,dead>; FPUaROUTADDRegisterClass:%vreg8 FPUaOffsetClass:%vreg4,%vreg7
144B      %FA_ROFF0<def> = COPY %vreg8; FPUaROUTADDRegisterClass:%vreg8
176B      MOVSUTO_SU_os_rpc %SU_ROFF0<kill>, %RPC<imp-def,dead>
192B      NOP

# End machine code for function addproddivConst.

handleMove 64B -> 104B: %vreg4<def> = COPY %vreg3; FPUaOffsetClass:%vreg4 FPUaROUTMULRegisterClass:%vreg3
     %vreg4: [64r,128r:0) 0 at 64r
        --> [104r,128r:0) 0 at 104r
     %vreg3: [48r,64r:0) 0 at 48r
        --> [48r,104r:0) 0 at 48r

# *** IR Dump After Machine Instruction Scheduler ***:
# Machine code for function addproddivConst: Post SSA
Function Live Ins: %FA_ROFF1 in %vreg0

0B    BB#0: derived from LLVM BB %entry
          Live Ins: %FA_ROFF1
16B       %vreg0<def> = COPY %FA_ROFF1; FPUaOffsetClass:%vreg0
32B       %vreg2<def> = MOVSUTO_A_iSLo 1077936128; FPUaOffsetClass:%vreg2
48B       %vreg3<def> = FMUL_A_oo %vreg0, %vreg2, %RFLAGA<imp-def,dead>; FPUaROUTMULRegisterClass:%vreg3 FPUaOffsetClass:%vreg0,%vreg2
80B       %vreg5<def> = MOVSUTO_A_iSLo 1056964608; FPUaOffsetClass:%vreg5
96B       %vreg6<def> = FMUL_A_oo %vreg0, %vreg5, %RFLAGA<imp-def,dead>; FPUaROUTMULRegisterClass:%vreg6 FPUaOffsetClass:%vreg0,%vreg5
104B      %vreg4<def> = COPY %vreg3; FPUaOffsetClass:%vreg4 FPUaROUTMULRegisterClass:%vreg3
112B      %vreg7<def> = COPY %vreg6; FPUaOffsetClass:%vreg7 FPUaROUTMULRegisterClass:%vreg6
128B      %vreg8<def> = FADD_A_oo %vreg4, %vreg7, %RFLAGA<imp-def,dead>; FPUaROUTADDRegisterClass:%vreg8 FPUaOffsetClass:%vreg4,%vreg7
144B      %FA_ROFF0<def> = COPY %vreg8; FPUaROUTADDRegisterClass:%vreg8
176B      MOVSUTO_SU_os_rpc %SU_ROFF0<kill>, %RPC<imp-def,dead>
192B      NOP

# End machine code for function addproddivConst.

Dominique Torette
System Architect
Rue des Chasseurs Ardennais - Liège Science Park - B-4031 Angleur
Tel: +32 (0) 4 361 81 11 - Fax: +32 (0) 4 361 81 20
www.spacebel.be <http://www.spacebel.be/>
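
To make the pressure problem described in this message concrete, here is a small toy model (plain Python, not LLVM code). It only counts how many values of FPUaROUTMULRegisterClass are live at once, using the slot numbers from the two dumps as half-open ranges. The helper name max_live and the constant CLASS_SIZE are made up for this illustration.

```python
# Toy model (not LLVM code): count how many virtual registers of one class
# are live at the same time, using half-open [start, end) slot ranges
# copied from the dumps in the message.

def max_live(intervals):
    """Return the peak number of simultaneously live intervals."""
    events = []
    for start, end in intervals.values():
        events.append((start, 1))    # value becomes live
        events.append((end, -1))     # value dies
    # At equal positions, process a death (-1) before a birth (+1).
    events.sort(key=lambda e: (e[0], e[1]))
    live = peak = 0
    for _, delta in events:
        live += delta
        peak = max(peak, live)
    return peak

# FPUaROUTMULRegisterClass holds a single physical register (FA_ROUTMUL).
CLASS_SIZE = 1

before = {"vreg3": (48, 64), "vreg6": (96, 112)}   # COPY of vreg3 at 64B
after = {"vreg3": (48, 104), "vreg6": (96, 112)}   # COPY of vreg3 moved to 104B

for name, ranges in (("before", before), ("after", after)):
    peak = max_live(ranges)
    verdict = "fits" if peak <= CLASS_SIZE else "needs a spill or a split"
    print(name, "scheduling: peak =", peak, "->", verdict)
```

Before scheduling the two ranges never overlap, so one register is enough; after the move, %vreg3 and %vreg6 are both live between 96B and 104B, which is where the allocator has to spill or split.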
Quentin Colombet via llvm-dev
2018-Apr-23 22:44 UTC
[llvm-dev] pre-RA scheduling/live register analysis optimization (handle move) forcing spill of registers
Hi Dominique,

I am not commenting on the scheduling part, as I don't know how the register pressure tracking is done there.

Unless you constrain your copies to stay next to the FMUL_A_oo instructions (e.g., using a bundle), there is a non-zero chance that something is going to mess with them.

That said, the splitting mechanism should just insert the desired copies to avoid spilling for you. Could you check why this is not happening? (Use -debug-only=regalloc and check why it spills or why the splitting failed.)

One possible problem is that you don't have a bigger regclass that contains FPUaROUTMULRegisterClass and that could be used when relaxing the constraint with splitting.

Cheers,
-Quentin
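
The effect of the splitting mentioned in this reply can be sketched with another toy model. This is plain Python and not the register allocator's actual algorithm: it only shows that cutting %vreg3's range shortly after its definition removes the interference with %vreg6. The split point (slot 56) and the helper names overlaps and split_after_def are hypothetical, and the remainder of the range is assumed to be assignable to some larger register class, which is exactly the condition the reply raises.

```python
# Toy model of live-range splitting (not the LLVM register allocator).

def overlaps(a, b):
    """True when two half-open [start, end) ranges share a slot."""
    return a[0] < b[1] and b[0] < a[1]

def split_after_def(interval, copy_idx):
    """Split one range at copy_idx into a short piece that must stay in the
    constrained class and a remainder that may use a larger class."""
    start, end = interval
    assert start < copy_idx < end, "split point must lie inside the range"
    return (start, copy_idx), (copy_idx, end)

# Ranges in FPUaROUTMULRegisterClass after scheduling (from the handleMove log).
vreg3 = (48, 104)
vreg6 = (96, 112)
print("interfere before split:", overlaps(vreg3, vreg6))

# Hypothetical split: a copy placed at slot 56, right after the FMUL at 48B.
constrained, relaxed = split_after_def(vreg3, 56)
print("interfere after split: ", overlaps(constrained, vreg6))
```

In this model the constrained piece (48, 56) no longer overlaps %vreg6, so FA_ROUTMUL can serve both multiplications, provided the relaxed piece (56, 104) has a register class to live in.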
Andrew Trick via llvm-dev
2018-Apr-26 22:42 UTC
[llvm-dev] pre-RA scheduling/live register analysis optimization (handle move) forcing spill of registers
> On Apr 23, 2018, at 1:50 AM, Dominique Torette via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>
> This 'handleMove' is generated by LiveIntervalAnalysis, but I don't understand why it is generated and how to avoid this counterproductive optimization.

'handleMove' updates LiveIntervals when a virtual register read/write is moved.

The scheduler has a heuristic called biasPhysRegCopy that tries to avoid creating any interference on physregs. You might check -debug-only=machine-scheduler to see why a copy was moved, or just step through the scheduler for a very small test case.

-Andy
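
As an illustration of the bookkeeping described in this reply, the following toy model reproduces the arithmetic visible in the handleMove log. It is plain Python and not the LiveIntervals implementation; move_instruction and its parameters are invented for this sketch, and it only handles the simple case in the log (the moved instruction is the sole def of one register and the last use of another).

```python
# Toy model of the bookkeeping that handleMove performs; this is not the
# LiveIntervals implementation, only the arithmetic visible in the log.

def move_instruction(intervals, defs, uses, old_idx, new_idx):
    """Return updated [start, end) ranges after moving one instruction.

    defs: registers defined by the moved instruction (range start follows it)
    uses: registers whose last use is the moved instruction (range end follows it)
    """
    updated = dict(intervals)  # leave the caller's mapping untouched
    for reg in defs:
        start, end = updated[reg]
        assert start == old_idx, "moved instruction must be the def"
        updated[reg] = (new_idx, end)
    for reg in uses:
        start, end = updated[reg]
        assert end == old_idx, "moved instruction must be the last use"
        updated[reg] = (start, new_idx)
    return updated

intervals = {"vreg3": (48, 64), "vreg4": (64, 128)}

# %vreg4<def> = COPY %vreg3 moves from 64B to 104B.
moved = move_instruction(intervals, defs=["vreg4"], uses=["vreg3"],
                         old_idx=64, new_idx=104)
print(moved)
```

The result matches the log: %vreg4 shrinks to start at 104, while %vreg3 is stretched to end at 104. The move itself is decided by the scheduler; the interval update only records its consequence.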