Vasiliy Korchagin
2011-Feb-09 13:02 UTC
[LLVMdev] Lowering "memcpy" intrinsic function on ARM using LDMIA/STMIA
Hi,

LLVM emits code for "memcpy" on ARM as consecutive ldr/str instructions, and a dedicated pass after register allocation then combines them into ldm/stm. But ldm/stm require the registers in the list to be in ascending order, which is often not the case after regalloc, so some ldr/str instructions are left uncombined. For example, this code:

    struct Foo { int a, b, c, d; };
    void CopyStruct(struct Foo *a, struct Foo *b) { *a = *b; }

is compiled to:

    ldmia r1, {r2, r3, r12}
    ldr   r1, [r1, #12]
    stmia r0, {r2, r3, r12}
    str   r1, [r0, #12]
    bx    lr

In every test I ran, regalloc assigned at least one register out of ascending order.

What are your ideas for overcoming this issue? Maybe LLVM should lower "memcpy" straight to ldm/stm, or exchange registers before combining ldr/str so that they are in ascending order, or somehow fix the register allocator?

Best regards,
Vasiliy.
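Editorial note: the ascending-order constraint comes from the instruction encoding. The register list of ldm/stm is a bitmask, and the lowest-numbered register always maps to the lowest address. A run of ldr/str at consecutive offsets can therefore be merged only while the register numbers keep increasing. The sketch below illustrates that check in C; `ldm_prefix_len` is a hypothetical helper written for this note and is not taken from LLVM's load/store optimizer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not LLVM code.
 * regs[i] is the number of the register that holds the word at byte
 * offset 4*i from the base address.
 * Returns the length of the longest prefix whose register numbers are
 * strictly ascending, i.e. the part a single ldm/stm could cover. */
size_t ldm_prefix_len(const unsigned *regs, size_t n)
{
    assert(regs != NULL || n == 0);
    if (n == 0)
        return 0;

    size_t len = 1;
    while (len < n && regs[len] > regs[len - 1])
        ++len;
    return len;
}
```

For the assignment in the example above (r2, r3, r12, r1) the function returns 3: the first three words fit one ldmia, and the fourth, held in r1, has to stay a separate ldr.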
Jason Kim
2011-Feb-09 15:57 UTC
[LLVMdev] Lowering "memcpy" intrinsic function on ARM using LDMIA/STMIA
On Wed, Feb 9, 2011 at 5:02 AM, Vasiliy Korchagin <vasiliy.korchagin at gmail.com> wrote:
> Hi,
>
> llvm emits code for "memcpy" on ARM as consecutive ldr/str commands, and

Hmm, this happens elsewhere as well (x86?). Perhaps what we need is a switch to disable memset/memcpy lowering?

> further combines them into ldm/stm with special pass after register
> allocation. But ldm/stm commands require registers to go in ascending order,
> what is often not so after regalloc, therefore some str/ldr commands. For
> example such code:
>
> struct Foo {int a, b, c, d; }
> void CopyStruct(struct Foo *a, struct Foo *b) { *a = *b; }
>
> compiled to:
>
> ldmia r1, {r2, r3, r12}
> ldr r1, [r1, #12]
> stmia r0, {r2, r3, r12}
> str r1, [r0, #12]
> bx lr
>
> I ran different tests and always regalloc allocates at least one register
> not in ascending order.
>
> What is your ideas to overcome this issue? Maybe llvm should emit code for
> "memcpy" straight into ldm/stm or exchange registers before combining
> ldr/str to make them go in ascending order or fix somehow register
> allocator?
>
> Best regards, Vasiliy.
Корчагин Василий
2011-Feb-09 22:18 UTC
[LLVMdev] Lowering "memcpy" intrinsic function on ARM using LDMIA/STMIA
On 09.02.2011 18:57, Jason Kim wrote:
> On Wed, Feb 9, 2011 at 5:02 AM, Vasiliy Korchagin
> <vasiliy.korchagin at gmail.com> wrote:
>> Hi,
>>
>> llvm emits code for "memcpy" on ARM as consecutive ldr/str commands, and
>
> Hmm, this happens elsewhere as well (x86?). Perhaps what we need is a
> switch to disable memset/memcpy lowering?

Are you suggesting that we always call the libc memset/memcpy functions instead of lowering the intrinsics? That does not seem like a good idea, because consecutive ldm/stm instructions are often more efficient than a memcpy call, especially for small chunks of memory.

>> further combines them into ldm/stm with special pass after register
>> allocation. But ldm/stm commands require registers to go in ascending order,
>> what is often not so after regalloc, therefore some str/ldr commands. For
>> example such code:
>>
>> struct Foo {int a, b, c, d; }
>> void CopyStruct(struct Foo *a, struct Foo *b) { *a = *b; }
>>
>> compiled to:
>>
>> ldmia r1, {r2, r3, r12}
>> ldr r1, [r1, #12]
>> stmia r0, {r2, r3, r12}
>> str r1, [r0, #12]
>> bx lr
>>
>> I ran different tests and always regalloc allocates at least one register
>> not in ascending order.
>>
>> What is your ideas to overcome this issue? Maybe llvm should emit code for
>> "memcpy" straight into ldm/stm or exchange registers before combining
>> ldr/str to make them go in ascending order or fix somehow register
>> allocator?
>>
>> Best regards, Vasiliy.
Andrew Trick
2011-Feb-11 06:07 UTC
[LLVMdev] Lowering "memcpy" intrinsic function on ARM using LDMIA/STMIA
On Feb 9, 2011, at 5:02 AM, Vasiliy Korchagin wrote:
> llvm emits code for "memcpy" on ARM as consecutive ldr/str commands, and further combines them into ldm/stm with special pass after register allocation. But ldm/stm commands require registers to go in ascending order, what is often not so after regalloc, therefore some str/ldr commands. For example such code:
>
> struct Foo {int a, b, c, d; }
> void CopyStruct(struct Foo *a, struct Foo *b) { *a = *b; }
>
> compiled to:
>
> ldmia r1, {r2, r3, r12}
> ldr r1, [r1, #12]
> stmia r0, {r2, r3, r12}
> str r1, [r0, #12]
> bx lr
>
> I ran different tests and always regalloc allocates at least one register not in ascending order.
>
> What is your ideas to overcome this issue? Maybe llvm should emit code for "memcpy" straight into ldm/stm or exchange registers before combining ldr/str to make them go in ascending order or fix somehow register allocator?

Hi Vasiliy,

We should handle this better. I'm not sure how to guarantee that we can generate ldm/stm without regalloc support. Our only idea is to teach the new register allocator to do a much better job satisfying register hints.

If you'd like to track this, feel free to file a bug.

Thanks,
-Andy
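Editorial note: to make the idea of ordering-aware register assignment concrete, here is a minimal sketch in C. It assumes the values being copied are live only inside the copy sequence, so any free register will do. It hands out free registers from lowest to highest, which is the order an ldm/stm register list needs. The name `pick_ascending_regs` and the bitmask representation are assumptions made for this illustration; this is not how LLVM's register allocator or its hint mechanism is implemented.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sketch, not LLVM's allocator.
 * free_mask: bit r is set when register r (0..15) is free.
 * out:       receives up to n register numbers in ascending order, so
 *            out[i] can hold the word at byte offset 4*i.
 * Returns 1 if n registers were found, 0 otherwise (out is then only
 * partially filled and the caller must fall back to another strategy). */
int pick_ascending_regs(uint16_t free_mask, size_t n, unsigned *out)
{
    assert(out != NULL || n == 0);

    size_t found = 0;
    for (unsigned r = 0; r < 16 && found < n; ++r) {
        if (free_mask & (1u << r))
            out[found++] = r;
    }
    return found == n;
}
```

With only r2, r3 and r12 free, a request for three registers yields r2, r3, r12 in that order, while a request for four fails. In that case a real allocator would have to free up another register or leave some of the copy as separate ldr/str instructions.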