Evgeny Astigeevich via llvm-dev
2017-Apr-04 11:06 UTC
[llvm-dev] [Perf Regressions] r299379 - [CodeGenPrep] move aarch64-type-promotion to CGP
Hi Jun,

Your commit caused performance regressions on AArch64 Cortex-A53:

http://llvm.org/perf/db_default/v4/nts/110757

MultiSource/Benchmarks/TSVC/Packing-flt/Packing-flt: -16.61%
MultiSource/Benchmarks/TSVC/Packing-dbl/Packing-dbl: -14.02%

Other regressions on the page are noise.

I see a difference in the generated code for the hot loop:

===== r299379 ====
14.19%  40f910  ldr s0, [x9]
21.37%  40f914  fcmp s0, #0.0
 7.14%  40f918  b.le 40f930 <s342+0xd0>
        40f91c  sxtw x10, w10
 7.05%  40f920  add x10, x10, #0x1
14.02%  40f924  add x11, x19, x10, lsl #2
21.16%  40f928  ldr w11, [x11,x26]
 7.79%  40f92c  str w11, [x9]
        40f930  sub x8, x8, #0x1
 7.25%  40f934  add x9, x9, #0x4
===============

===== r299314 ====
16.54%  40f918  ldr s0, [x9]
25.29%  40f91c  fcmp s0, #0.0
 8.08%  40f920  b.le 40f934 <s342+0xcc>
        40f924  add w10, w10, #0x1
 8.23%  40f928  add x11, x19, w10, sxtw #2
24.63%  40f92c  ldr w11, [x11,x26]
 9.03%  40f930  str w11, [x9]
        40f934  sub x8, x8, #0x1
 8.17%  40f938  add x9, x9, #0x4
===============

I see that 'add x11, x19, w10, sxtw #2' is transformed into:

        40f91c  sxtw x10, w10
        40f924  add x11, x19, x10, lsl #2

You can get the code of the benchmarks from here: https://llvm.org/svn/llvm-project/test-suite/trunk/

Could you please have a look at them?

Thanks,
Evgeny Astigeevich
Senior Compiler Engineer
Compilation Tools
ARM

-----Original Message-----
From: llvm-commits [mailto:llvm-commits-bounces at lists.llvm.org] On Behalf Of Jun Bum Lim via llvm-commits
Sent: Monday, April 03, 2017 8:20 PM
To: llvm-commits at lists.llvm.org
Subject: [llvm] r299379 - [CodeGenPrep] move aarch64-type-promotion to CGP

Author: junbuml
Date: Mon Apr 3 14:20:07 2017
New Revision: 299379

URL: http://llvm.org/viewvc/llvm-project?rev=299379&view=rev

Log:
[CodeGenPrep] move aarch64-type-promotion to CGP

Summary:
Move the aarch64-type-promotion pass within the existing type promotion framework in CGP. This change also support forking sexts when a new sext is required for promotion. Note that change is based on D27853 and I am submitting this out early to provide a better idea on D27853.

Reviewers: jmolloy, mcrosier, javed.absar, qcolombet
Reviewed By: qcolombet
Subscribers: llvm-commits, aemerson, rengolin, mcrosier
Differential Revision: https://reviews.llvm.org/D28680
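
For readers without the test-suite checked out, here is a minimal C sketch of the kind of loop that produces the pattern above. It is an illustration only: the function name pack_positive and its parameters are hypothetical and are not copied from the TSVC sources. The point is that a 32-bit signed index is incremented and then used to address an array, so the index must be sign-extended to 64 bits for the address computation. In the r299314 listing that extension is folded into the add (sxtw #2); in the r299379 listing it is a separate sxtw instruction.

```c
#include <stddef.h>

/* Hypothetical packing loop: every positive slot of a is overwritten with
   the next unread element of b. Returns how many elements of b were read. */
int pack_positive(float *a, const float *b, size_t n) {
    int j = -1;                  /* 32-bit signed index into b */
    for (size_t i = 0; i < n; i++) {
        if (a[i] > 0.0f) {
            j++;                 /* 32-bit increment of the index */
            a[i] = b[j];         /* index is sign-extended for addressing */
        }
    }
    return j + 1;
}
```

Compiling a loop of this shape for AArch64 before and after the commit would be one way to check whether the extra sxtw appears, assuming the reduced loop still triggers the same promotion.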
Jun Lim via llvm-dev
2017-Apr-04 17:46 UTC
[llvm-dev] [Perf Regressions] r299379 - [CodeGenPrep] move aarch64-type-promotion to CGP
Hi Evgeny,

Let me take a closer look at this and get back to you soon.

Thanks,
Jun
Jun Lim via llvm-dev
2017-Apr-05 17:09 UTC
[llvm-dev] [Perf Regressions] r299379 - [CodeGenPrep] move aarch64-type-promotion to CGP
The IR below should show the issue with my recent commit (r299379) in CodeGenPrepare:

  %struct.GlobalData = type { [32000 x float], [3 x i32] }
  @global_data = common global %struct.GlobalData zeroinitializer, align 16

  define i32 @s341(i8 %c, i32 %j, i32 %j2, float %s) {
  if.then:                                          ; preds = %for.body4
    %inc = add nsw i32 %j, 1
    %sext1 = sext i32 %inc to i64
    %arrayidx9 = getelementptr inbounds %struct.GlobalData, %struct.GlobalData* @global_data, i64 0, i32 0, i64 %sext1
    store float %s, float* %arrayidx9, align 4
    %j3 = sdiv i32 %inc, %j2
    br label %return

  return:
    ret i32 %j3
  }

In the original AArch64 address type promotion pass, %sext1 was not promoted because its operand %inc is also used by %j3. With my commit (r299379), %sext1 is now promoted, and a trunc instruction is inserted to feed %j3 from the promoted %inc:

    %sext1 = sext i32 %j to i64
    %inc = add nsw i64 %sext1, 1
    %promoted = trunc i64 %inc to i32
    %arrayidx9 = getelementptr inbounds %struct.GlobalData, %struct.GlobalData* @global_data, i64 0, i32 0, i64 %inc
    store float %s, float* %arrayidx9, align 4
    %j3 = sdiv i32 %promoted, %j2
    ret i32 %j3

This transformation prevents ISel from folding the sext into the store.

Let me first try to fold this in ISel. If this is unreasonable, I will change CodeGenPrepare not to allow sext promotion when the operand has multiple users.

Please let me know if you have any comments.

Thanks,
Jun
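
The promotion described in this message moves the sign extension above the add and recovers the 32-bit value with a trunc. A small C sketch of that arithmetic is given below, under the assumption that the 32-bit add does not overflow (which is what the nsw flag in the IR asserts). The function names index_before, index_after and truncated are hypothetical and exist only for this illustration.

```c
#include <stdint.h>

/* Original form: 32-bit add, then sign-extend, like sext(add nsw %j, 1). */
int64_t index_before(int32_t j) {
    int32_t inc = j + 1;         /* assumed not to overflow, mirroring nsw */
    return (int64_t)inc;
}

/* Promoted form: sign-extend first, then 64-bit add, like add(sext %j, 1). */
int64_t index_after(int32_t j) {
    return (int64_t)j + 1;
}

/* The remaining 32-bit user (the sdiv in the IR) reads a truncated copy. */
int32_t truncated(int32_t j) {
    return (int32_t)index_after(j);
}
```

Both forms yield the same 64-bit index and the same 32-bit value for the other user, so the rewrite does not change results. What changes is the shape of the code seen by instruction selection: the sign extension is no longer adjacent to the address computation, which matches the separate sxtw in the r299379 listing.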