Lori Yao Yu via llvm-dev
2020-Apr-26 03:37 UTC
[llvm-dev] assembly code for array iteration generated by llvm is much slower than gcc
Hi all developers,

I'm switching compilers from GCC to LLVM on a RISC-V target. In some cases LLVM generates considerably more assembly code than GCC, which reduces my program's performance by about 40%.

The following simple test code shows the problem. GCC prefers to iterate over the array with a pointer, while LLVM prefers to iterate with an index, so LLVM generates more code to compute the memory address of each array element from the index.

My question: is there an LLVM option that makes it iterate over arrays by pointer? Or is this an LLVM bug that has not been resolved yet?

Test C code:

    int func(int w1, int w2, int *b, int *c)
    {
        int wstart = 0;
        int i = 0;
        int j = 0;
        int sum = 0;
        int wend = 0;
        int dst_idx = 0;
        int dst_idx2 = 0;
        for (i = 0; i < w2; i++) {
            wstart = i * w1;
            wend = i / w1;
            sum = c[wstart];
            for (j = wstart + 1; j < wend; j++) {
                sum += c[j * w2];
                sum += c[j * w1];
            }
            dst_idx = w1 * i + w2;
            dst_idx2 = w2 * i + w1;
            b[dst_idx] = sum;
            b[dst_idx2] = sum/2;
        }
    }

Compile commands:

    riscv32-unkown-elf-g++ -nostartfiles -nostdlib -O2 -march=rv32imf -mabi=ilp32f -fno-builtin -S perf.c -o perf.g++
    clang++ -O2 --target=riscv32 -march=rv32img -mabi=ilp32f -nostdlib -fno-builtin -S perf.c -o perf.lang

The GCC version is 7.2.0.
The LLVM version is 10.0.0.

Assembly code of the loop generated by GCC and LLVM:

[image: image002.jpg, scrubbed from the archive; see the attachment link below]

Thank you all for your time and any help you can provide.

Lori

-------------- next part --------------
A non-text attachment was scrubbed...
Name: image002.jpg
Type: image/jpeg
Size: 117391 bytes
Desc: image002.jpg
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20200426/e8d2c45f/attachment-0001.jpg>
Sam Elliott via llvm-dev
2020-Apr-27 13:36 UTC
[llvm-dev] assembly code for array iteration generated by llvm is much slower than gcc
Hi,

Am I right in thinking that this is RISC-V assembly?

Please can you provide a testcase (a C file, or LLVM IR) that we can use to diagnose this issue further? It would also be useful to know what architecture (including extensions) and other compiler flags you are using.

We know that the assembly that LLVM generates for RISC-V is not always the most efficient, and we're working on this issue at the moment. We would welcome more testcases.

Sam

> On 26 Apr 2020, at 4:37 am, Lori Yao Yu via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>
> <image002.jpg>

--
Sam Elliott
Software Team Lead
Senior Software Developer - LLVM and OpenTitan
lowRISC CIC
Lori Yao Yu via llvm-dev
2020-Apr-28 03:00 UTC
[llvm-dev] Re: assembly code for array iteration generated by llvm is much slower than gcc
Hi Sam,

Yes, it is RISC-V assembly code. The test code is shown below. You can copy it into a C file named perf.c and compile it with the commands below.

GCC prefers to iterate over the array with a pointer, while LLVM prefers to iterate with an index, so LLVM generates more code to compute the memory address of each array element from the index.

Test C code:

    //perf.c
    int func(int w1, int w2, int *b, int *c)
    {
        int wstart = 0;
        int i = 0;
        int j = 0;
        int sum = 0;
        int wend = 0;
        int dst_idx = 0;
        int dst_idx2 = 0;
        for (i = 0; i < w2; i++) {
            wstart = i * w1;
            wend = i / w1;
            sum = c[wstart];
            for (j = wstart + 1; j < wend; j++) {
                sum += c[j * w2];
                sum += c[j * w1];
            }
            dst_idx = w1 * i + w2;
            dst_idx2 = w2 * i + w1;
            b[dst_idx] = sum;
            b[dst_idx2] = sum/2;
        }
    }

Compile commands:

    riscv32-unkown-elf-g++ -nostartfiles -nostdlib -O2 -march=rv32imf -mabi=ilp32f -fno-builtin -S perf.c -o perf.g++
    clang++ -O2 --target=riscv32 -march=rv32img -mabi=ilp32f -nostdlib -fno-builtin -S perf.c -o perf.lang

The GCC version is 7.2.0.
The LLVM version is 10.0.0.

Thanks!
Lori

-----Original Message-----
From: Sam Elliott <selliott at lowrisc.org>
Sent: 27 April 2020 21:36
To: Lori Yao Yu <loriyu at panyi.ai>
Cc: LLVM Developers Mailing List <llvm-dev at lists.llvm.org>
Subject: Re: [llvm-dev] assembly code for array iteration generated by llvm is much slower than gcc

> Hi,
>
> Am I right in thinking that this is RISC-V assembly?
>
> Please can you provide a testcase (a C file, or LLVM IR) that we can use to diagnose this issue further? It would also be useful to know what architecture (including extensions) and other compiler flags you are using.
>
> We know that the assembly that LLVM generates for RISC-V is not always the most efficient, and we're working on this issue at the moment. We would welcome more testcases.
>
> Sam
>
>> On 26 Apr 2020, at 4:37 am, Lori Yao Yu via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>>
>> <image002.jpg>
>
> --
> Sam Elliott
> Software Team Lead
> Senior Software Developer - LLVM and OpenTitan
> lowRISC CIC
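
For readers without access to the scrubbed image, the index-versus-pointer difference described in the report can be sketched in C on the inner loop of the test case. This is only an illustration of the general technique (strength reduction applied by hand), not the actual output of either compiler. The function names inner_sum_indexed, inner_sum_pointer and check_equivalent are hypothetical, and the sketch assumes positive strides w1 and w2 and an array c holding at least end * max(w1, w2) elements, so that the final pointer increment stays within one past the end of the array.

```c
#include <assert.h>

/* Index form, as in the test case: every access computes j * stride and
 * then scales the result to an address. */
static int inner_sum_indexed(const int *c, int w1, int w2, int start, int end)
{
    int sum = 0;
    for (int j = start; j < end; j++) {
        sum += c[j * w2];
        sum += c[j * w1];
    }
    return sum;
}

/* Pointer form (hand-applied strength reduction): the multiplications are
 * hoisted out of the loop and each pointer advances by a fixed stride.
 * Assumption: c has at least end * max(w1, w2) elements, w1 > 0, w2 > 0. */
static int inner_sum_pointer(const int *c, int w1, int w2, int start, int end)
{
    int sum = 0;
    if (start >= end)
        return 0;                    /* empty range: form no pointers */
    const int *p2 = c + start * w2;  /* address of c[start * w2] */
    const int *p1 = c + start * w1;  /* address of c[start * w1] */
    for (int n = end - start; n > 0; n--) {
        sum += *p2;
        sum += *p1;
        p2 += w2;                    /* next element with stride w2 */
        p1 += w1;                    /* next element with stride w1 */
    }
    return sum;
}

/* Hypothetical self-check: both forms must agree on the same input. */
static void check_equivalent(void)
{
    int c[64];
    for (int k = 0; k < 64; k++)
        c[k] = k * 3 + 1;
    assert(inner_sum_indexed(c, 3, 5, 1, 8) == inner_sum_pointer(c, 3, 5, 1, 8));
}
```

In func, the inner loop corresponds to calling either helper with start = wstart + 1 and end = wend. Whether a compiler performs this transformation on its own depends on the target, the version, and the optimization level, so comparing the two forms in the generated assembly is a reasonable first step when narrowing down a report like this one.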