Корчагин Василий
2011-Feb-09 22:18 UTC
[LLVMdev] Lowering "memcpy" intrinsic function on ARM using LDMIA/STMIA
09.02.2011 18:57, Jason Kim wrote:
> On Wed, Feb 9, 2011 at 5:02 AM, Vasiliy Korchagin
> <vasiliy.korchagin at gmail.com> wrote:
>> Hi,
>>
>> LLVM emits code for "memcpy" on ARM as consecutive ldr/str instructions, and
>
> Hmm, this happens elsewhere as well (x86?). Perhaps what we need is a
> switch to disable memset/memcpy lowering?
>

Are you suggesting always calling the libc memset/memcpy functions instead of
lowering the intrinsic? That does not seem like a good idea, because often
(especially for small chunks of memory) consecutive ldm/stm instructions are
more efficient than a memcpy call.

>> then combines them into ldm/stm in a special pass after register
>> allocation. But ldm/stm instructions require the registers to be in
>> ascending order, which is often not the case after regalloc, so some
>> ldr/str instructions are left over. For example, this code:
>>
>>   struct Foo { int a, b, c, d; };
>>   void CopyStruct(struct Foo *a, struct Foo *b) { *a = *b; }
>>
>> is compiled to:
>>
>>   ldmia r1, {r2, r3, r12}
>>   ldr   r1, [r1, #12]
>>   stmia r0, {r2, r3, r12}
>>   str   r1, [r0, #12]
>>   bx    lr
>>
>> I ran various tests, and regalloc always allocates at least one register
>> out of ascending order.
>>
>> What are your ideas for overcoming this issue? Maybe LLVM should lower
>> "memcpy" straight into ldm/stm, or swap registers before combining
>> ldr/str so that they are in ascending order, or somehow fix the register
>> allocator?
>>
>> Best regards, Vasiliy.
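The ordering constraint described in the quoted message can be illustrated with a small model. This is only a sketch, not the logic of LLVM's ldr/str combining pass: the helper name `mergeable_prefix` and the encoding of registers as plain integers are hypothetical and introduced here for illustration. It assumes that ldm/stm transfers the lowest-numbered register to the lowest address, so a run of consecutive word accesses can be folded only while the allocated register numbers strictly increase.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not part of LLVM): regs[i] is the number of the
 * register allocated for the word at offset 4*i from the base address.
 * Returns how many leading accesses could be folded into one ldm/stm,
 * assuming register numbers must strictly increase with the address. */
static size_t mergeable_prefix(const int *regs, size_t n)
{
    assert(regs != NULL || n == 0);
    if (n == 0)
        return 0;

    size_t len = 1;
    /* Extend the run while each register is higher than the previous one. */
    while (len < n && regs[len] > regs[len - 1])
        len++;
    return len;
}
```

With the allocation from the example in the thread (r2, r3, r12, then r1, passed as `{2, 3, 12, 1}`), the helper returns 3: the first three words fit in one ldmia/stmia and the fourth needs its own ldr/str. An allocation such as `{2, 3, 4, 12}` would return 4.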
Sandeep Patel
2011-Feb-09 22:22 UTC
[LLVMdev] Lowering "memcpy" intrinsic function on ARM using LDMIA/STMIA
-fno-builtin is the flag you want.

deep
Корчагин Василий
2011-Feb-09 23:00 UTC
[LLVMdev] Lowering "memcpy" intrinsic function on ARM using LDMIA/STMIA
llc has no such flag and, as I mentioned, transforming memcpy into ldm/stm
instructions is often more efficient than calling memcpy from libc.

10.02.2011 01:22, Sandeep Patel wrote:
> -fno-builtin is the flag you want.
>
> deep
Корчагин Василий
2011-Feb-10 08:40 UTC
[LLVMdev] Lowering "memcpy" intrinsic function on ARM using LDMIA/STMIA
There seems to be a small misunderstanding. I was writing about the bitcode
memcpy intrinsic, not memcpy from libc. It is exactly this intrinsic that is
used in the IR for copying structures, as in my example, and it is the
lowering of the memcpy intrinsic that has the issue I described on ARM.

10.02.2011 01:22, Sandeep Patel wrote:
> -fno-builtin is the flag you want.
>
> deep
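To make the distinction concrete, here is a minimal sketch in C with two ways of copying the structure from the example. The function names `copy_by_assignment` and `copy_by_libc_call` are hypothetical. The assumption, which matches the message above but should be checked against the IR your compiler actually emits, is that the aggregate assignment is represented by the memcpy intrinsic whether or not -fno-builtin is given, while that flag only concerns the explicit call to the library function.

```c
#include <assert.h>
#include <string.h>

struct Foo { int a, b, c, d; };

/* Aggregate assignment: there is no libc call in the source. The front end
 * is assumed to represent this copy (16 bytes when int is 4 bytes) with the
 * memcpy intrinsic, which the ARM backend then lowers to loads and stores. */
static void copy_by_assignment(struct Foo *dst, const struct Foo *src)
{
    assert(dst != NULL && src != NULL);
    *dst = *src;
}

/* Explicit library call: this is the case that -fno-builtin is about. */
static void copy_by_libc_call(struct Foo *dst, const struct Foo *src)
{
    assert(dst != NULL && src != NULL);
    memcpy(dst, src, sizeof *dst);
}
```

Both functions produce the same result. The difference lies only in how the copy reaches the backend, which is why a front-end flag does not address the register-ordering issue in the lowering of the intrinsic.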