edA-qa mort-ora-y via llvm-dev
2018-Apr-18 17:50 UTC
[llvm-dev] Why does clang do a memcpy? Is the cast not enough? (ABI function args)
Yes, but why is it even copying the memory? It already has a pointer
which it can cast and load from -- and does so in other scenarios.

I'm wondering whether this copying is somehow required and I'm missing
something, or it's just an artifact of the clang emitter. That is, could
it not omit the memcpy and cast the original variable?

On 18/04/18 19:43, Krzysztof Parzyszek via llvm-dev wrote:
> This is the standard way of copying memory in the IR. Backends can
> expand the memcpy into loads/stores if they want.
>
> -Krzysztof
>
> On 4/18/2018 12:38 PM, edA-qa mort-ora-y via llvm-dev wrote:
>> Yes, I understand that as well (it's what I'm trying to recreate in my
>> language now).
>>
>> I'm really wondering why it does the copy, since from what I can tell it
>> could just as easily cast the original value and do the load without the
>> memcpy operation.
>>
>> That is, the question is about the memcpy and extra alloca -- I
>> understand what it's doing, just not why it's doing it this way.
>>
>> On 18/04/18 19:33, Krzysztof Parzyszek via llvm-dev wrote:
>>> It is a matter of the calling convention. It would specify what
>>> structs are passed in registers, and which are passed through stack.
>>>
>>> -Krzysztof
>>>
>>> On 4/18/2018 12:28 PM, edA-qa mort-ora-y via llvm-dev wrote:
>>>> I understand it's passing by value, that's what I'm testing here. The
>>>> question is why does it copy the data rather than just casting and
>>>> loading values from the original variable (%v)? It seems like the
>>>> copying is unnecessary.
>>>>
>>>> Not all structs result in the copy, only certain forms -- others are
>>>> just cast directly as I was expecting. I'm just not clear on what the
>>>> differences are, and whether I need to do the same thing.
>>>>
>>>> On 18/04/18 19:13, Dimitry Andric wrote:
>>>>> On 18 Apr 2018, at 18:40, edA-qa mort-ora-y via llvm-dev
>>>>> <llvm-dev at lists.llvm.org> wrote:
>>>>>> I'm implementing function arguments and tested this code in C:
>>>>>>
>>>>>> // clang -emit-llvm ll_struct_arg.c -S -o /dev/tty
>>>>>> typedef struct vpt_data {
>>>>>>     char a;
>>>>>>     int b;
>>>>>>     float c;
>>>>>> } vpt_data;
>>>>>>
>>>>>> void vpt_test( vpt_data vd ) {
>>>>>> }
>>>>>>
>>>>>> int main() {
>>>>>>     vpt_data v;
>>>>>>     vpt_test(v);
>>>>>> }
>>>>>>
>>>>>> This emits an odd LLVM structure that casts to the desired struct
>>>>>> type, but also memcpy's to a temporary structure. I'm unsure of why
>>>>>> the memcpy is done as opposed to just casting directly?
>>>>> Because you are passing the parameter by value? It *should* copy the
>>>>> data. In this particular case it will probably be elided if you turn
>>>>> on optimization, but it is more logical to pass structs via a const
>>>>> reference or pointer.
>>>>>
>>>>> -Dimitry

--
edA-qa mort-ora-y
http://mortoray.com/

Creator of the Leaf language
http://leaflang.org/

Streaming algorithms, AI, and design on Twitch
https://www.twitch.tv/mortoray

Twitter
edaqa
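For readers following along, here is a rough C sketch of the two strategies being contrasted in this message: reading the register-shaped pieces straight out of the original variable, versus copying into a temporary first and reading from that. It is not taken from clang. The `coerced` type and both function names are hypothetical, and the layout it relies on (a 12-byte `vpt_data` and a 16-byte two-slot view) is an assumption that holds on common 64-bit targets but is not guaranteed by C. One thing the sketch makes visible is that the wider view is larger than the struct, so reading it in place would touch bytes past the end of `v`. That size mismatch may be part of why some structs get the copy and others do not, but nobody in the thread confirms this.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

typedef struct vpt_data {
    char a;
    int b;
    float c;
} vpt_data;

/* Hypothetical register-shaped view of vpt_data: one integer slot and
   one float slot. Assumed to be 16 bytes, while vpt_data is 12. */
typedef struct coerced {
    uint64_t lo; /* carries a, its padding, and b */
    float hi;    /* carries c */
} coerced;

/* The copy below is only safe if the temporary is at least as large as
   the source struct. */
static_assert(sizeof(coerced) >= sizeof(vpt_data),
              "temporary must be able to hold the whole struct");

/* Copy through a zeroed temporary that is big enough for the wider
   view, then hand back the slots. Only sizeof(vpt_data) bytes of the
   source are read, so nothing past the end of *src is touched. */
coerced load_via_temp(const vpt_data *src) {
    coerced tmp;
    memset(&tmp, 0, sizeof tmp);
    memcpy(&tmp, src, sizeof *src);
    return tmp;
}

/* Reverse direction, as a callee might do to rebuild the struct from
   the two slots it received. */
vpt_data store_from_slots(coerced in) {
    vpt_data out;
    memcpy(&out, &in, sizeof out);
    return out;
}
```

The "just cast and load" alternative would amount to `*(const coerced *)src`, which reads `sizeof(coerced)` bytes from an object that only has `sizeof(vpt_data)` of them.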
mats petersson via llvm-dev
2018-Apr-18 18:04 UTC
It needs to LOAD the data. If the data is large enough, it is FASTER to do a memcpy than to do a "load". If you actually convince the compiler to do a load, it will produce enough 32- or 64-bit LOAD/STORE pairs to copy the data. Not only does this bloat the code, it is also likely slower than running memcpy as a loop.

For SMALL copies, memcpy gets replaced by simple load/store instructions anyway in the memcpy optimisation pass, so there is no overhead.

I know this because I had to implement a similar thing in my Pascal compiler, to stop it exploding when trying to use a "record" (Pascal's "struct") containing an array of 16000 ints: it generated several thousand LOAD and STORE instructions for each function call. That made the whole thing take almost forever, and the generated code was terrible. Calling memcpy instead solved the problem.

--
Mats
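To make the trade-off described above concrete, here is a small C sketch. The `big_record` type is a hypothetical stand-in for the Pascal record with a 16000-element array, and the function names are made up for illustration. Both functions produce the same bytes in the destination. The difference is in what a code generator has to emit if it expands the copy itself rather than emitting one bulk-copy call.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for the record described above. */
typedef struct big_record {
    int values[16000];
} big_record;

/* The record is assumed to be exactly the array, with no padding. */
static_assert(sizeof(big_record) == 16000 * sizeof(int),
              "big_record should contain only the array");

/* One bulk copy: a front end can emit a single memcpy for this,
   whatever the size of the record. */
void copy_bulk(big_record *dst, const big_record *src) {
    memcpy(dst, src, sizeof *dst);
}

/* Element-by-element copy: same result, but a code generator that
   emitted this fully unrolled would need one load/store pair per
   element, which is 16000 pairs for this record. */
void copy_elementwise(big_record *dst, const big_record *src) {
    size_t count = sizeof src->values / sizeof src->values[0];
    for (size_t i = 0; i < count; ++i) {
        dst->values[i] = src->values[i];
    }
}
```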
David Blaikie via llvm-dev
2018-Apr-19 17:26 UTC
I believe the memcpy is there just as a consequence of Clang's design: different parts of the compiler own different pieces of this, so in some sense one hand doesn't see what the other is doing. One part is "create an argument" (memcpying the local variable into an unnamed value), and the next part is "oh, but that argument gets passed in registers, so decompose it into registers again".

Clang doesn't need to produce perfectly optimal IR, because the optimization pipeline of LLVM will clean things up. So in many cases it's just easier (and not a significant impediment to performance) to leave some of these redundancies and oddities in the output, and let the LLVM optimization pipeline clean them up later.
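The layering described above can be pictured in C, with the caveat that this is only an analogy and not Clang's actual code path. The `pair` type and both function names are hypothetical. `sum_via_copy` plays the role of the two independent steps (materialise a private copy, then take it apart), while `sum_direct` is what a single all-knowing emitter might write. The two are observably equivalent, which is what leaves an optimizer free to turn one into the other.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical two-field struct used only for this illustration. */
typedef struct pair {
    int x;
    int y;
} pair;

/* What a hand-tuned emitter might produce: read the fields in place. */
int sum_direct(const pair *p) {
    assert(p != NULL);
    return p->x + p->y;
}

/* What a simpler, layered emitter might produce: first materialise a
   private copy of the argument, then pull the pieces out of the copy. */
int sum_via_copy(const pair *p) {
    assert(p != NULL);
    pair tmp;
    memcpy(&tmp, p, sizeof tmp);
    return tmp.x + tmp.y;
}
```

Comparing the output of `clang -O0 -S -emit-llvm` and `clang -O2 -S -emit-llvm` on a file like this is one way to check whether the temporary survives optimization on a given compiler version.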
edA-qa mort-ora-y via llvm-dev
2018-Apr-20 05:59 UTC
Thanks. That kind of makes sense. I see that a lot in my code as well: poor IR structures that aren't worth the effort to clean up since the LLVM passes do such a fine job of it.

Turns out I now have the same copying structure in my ABI support code, though I use Store instead. :)