Mehdi Amini via llvm-dev
2015-Aug-19  19:54 UTC
[llvm-dev] [RFC] Generalize llvm.memcpy / llvm.memmove intrinsics.
> On Aug 19, 2015, at 12:01 PM, Hal Finkel via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>
> ----- Original Message -----
>> From: "Philip Reames via llvm-dev" <llvm-dev at lists.llvm.org>
>> To: "Pete Cooper" <peter_cooper at apple.com>, "Lang Hames" <lhames at gmail.com>
>> Cc: "LLVM Developers Mailing List" <llvm-dev at lists.llvm.org>
>> Sent: Wednesday, August 19, 2015 12:14:19 PM
>> Subject: Re: [llvm-dev] [RFC] Generalize llvm.memcpy / llvm.memmove intrinsics.
>>
>> On 08/19/2015 09:35 AM, Pete Cooper via llvm-dev wrote:
>>> Hey Lang
>>>> On Aug 18, 2015, at 6:04 PM, Lang Hames via llvm-dev
>>>> <llvm-dev at lists.llvm.org> wrote:
>>>>
>>>> Hi All,
>>>>
>>>> I'd like to float two changes to the llvm.memcpy / llvm.memmove
>>>> intrinsics.
>>>>
>>>>
>>>> (1) Add an i1 <mayPerfectlyAlias> argument to the llvm.memcpy
>>>> intrinsic.
>>>>
>>>> When set to '1' (the auto-upgrade default), this argument would
>>>> indicate that the source and destination arguments may perfectly
>>>> alias (otherwise they must not alias at all - memcpy prohibits
>>>> partial overlap). While the C standard says that memcpy's
>>>> arguments can't alias at all, perfect aliasing works in practice,
>>>> and clang currently relies on this behavior: it emits
>>>> llvm.memcpys for aggregate copies, despite the possibility of
>>>> self-assignment.
>>>>
>>>> Going forward, llvm.memcpy calls emitted for aggregate copies
>>>> would have mayPerfectlyAlias set to '1'. Other uses of
>>>> llvm.memcpy (including lowerings from memcpy calls) would have
>>>> mayPerfectlyAlias set to '0'.
>>>>
>>>> This change is motivated by poor optimization for small memcpys on
>>>> targets with strict alignment requirements. When a user writes a
>>>> small, unaligned memcpy we may transform it into an unaligned
>>>> load/store pair in instcombine (See
>>>> InstCombine::SimplifyMemTransfer), which is then broken up into
>>>> an unwieldy series of smaller loads and stores during
>>>> legalization. I have a fix for this issue which tags the pointers
>>>> for unaligned load/store pairs with noalias metadata allowing
>>>> CodeGen to produce better code during legalization, but it's not
>>>> safe to apply while clang is emitting memcpys with pointers that
>>>> may perfectly alias. If the 'mayPerfectlyAlias' flag were
>>>> introduced, I could inspect that and add the noalias tag only if
>>>> mayPerfectlyAlias is '0'.
>>>>
>>>> Note: We could also achieve the desired effect by adding a new
>>>> intrinsic (llvm.structcpy?) with semantics that match the current
>>>> llvm.memcpy ones (i.e. perfect-aliasing or non-aliasing, but no
>>>> partial), and then reclaim llvm.memcpy for non-aliasing pointers
>>>> only. I floated this idea with David Majnemer on IRC and he
>>>> suggested that adding a flag to llvm.memcpy might be less
>>>> disruptive and easier to maintain - thanks for the suggestion
>>>> David!
>>
>> Given there's a semantically conservative interpretation and a more
>> optimistic one, this really sounds like a case for metadata, not
>> another argument to the function. Our memcpy could keep its current
>> semantics, and we could add a piece of metadata which says none of
>> the arguments to the call alias.
>
> We could add some "memcpy-allows-self-copies" metadata, and have
> Clang tag its associated aggregate copies with it. That would also
> work.

Isn’t this introducing instruction-wise “correctness”-related metadata?
Shouldn’t it be the opposite for correctness, i.e.
“memcpy-disallows-self-copies”?
(correctness in the sense that dropping the metadata does not break
anything).

>> Actually, can't we already get this interpretation by marking both
>> argument pointers as noalias? Doesn't that require that they don't
>> overlap at all? I think we just need the ability to specify noalias
>> at the callsite for each argument. I don't know if that's been
>> tried, but it should work in theory. There are some issues with
>> control dependence of call site attributes though that we'd need to
>> watch out for/fix.
>
> But that's not quite what we want. We want to say: These can't
> alias, unless they're exactly equal. noalias means that it does not
> alias at all, nor do any derived pointers, and obviously the lack of
> it says nothing.
>
> This way we can still make aliasing assumptions if we can prove that
> src != destination, which is often easier than proving things
> accounting for overlaps.

Is this limited to the memcpy case, or are there other use cases, so
that it would be worth having an attribute other than noalias that
would carry this semantic (“nooverlap”)?

--
Mehdi

>>>> (2) Allow different source and destination alignments on both
>>>> llvm.memcpy / llvm.memmove.
>>>>
>>>> Since I'm talking about changes to llvm.memcpy anyway, a few
>>>> people asked me to float this one. Having separate alignments for
>>>> the source and destination pointers may allow us to generate
>>>> better code when one of the pointers has a higher alignment.
>>>>
>>>> The auto-upgrade for this would be to set both source and
>>>> destination alignment to the original 'align' value.
>>> FWIW, I have a patch for this lying around. I can dig it up. I
>>> use alignment attributes to do it as there’s no need for alignment
>>> to be its own argument any more.
>> This would be a nice cleanup in general. +1
>
> I agree, this sounds useful.
>
> -Hal
>
>>>
>>> Cheers,
>>> Pete
>>>>
>>>> Any thoughts?
>>>>
>>>> Cheers,
>>>> Lang.
>>>>
>>>> _______________________________________________
>>>> LLVM Developers mailing list
>>>> llvm-dev at lists.llvm.org
>>>> https://urldefense.proofpoint.com/v2/url?u=http-3A__lists.llvm.org_cgi-2Dbin_mailman_listinfo_llvm-2Ddev&d=BQIGaQ&c=eEvniauFctOgLOKGJOplqw&r=03tkj3107244TlY4t3_hEgkDY-UG6gKwwK0wOUS3qjM&m=Js9_JWwnnCSoMnHhNlCr8sySTkjrVAbkaLqUP-49_x8&s=fAOxwvp7OA1L-OJfpwmZClRuD_eqxcJWA9p2bZ2-zz0&e=
>
> --
> Hal Finkel
> Assistant Computational Scientist
> Leadership Computing Facility
> Argonne National Laboratory
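For readers following the aliasing discussion, here is a minimal C sketch of the three cases being distinguished: disjoint regions, exact ("perfect") aliasing, and partial overlap. The names `classify_overlap`, `overlap_kind` and `assign` are hypothetical and exist only for this illustration; they are not LLVM or clang APIs. The sketch assumes a flat address space in which converting pointers to `uintptr_t` preserves their ordering.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
  unsigned X[8];
} S;

/* Hypothetical classification of how two n-byte regions relate. */
typedef enum { DISJOINT, PERFECT_ALIAS, PARTIAL_OVERLAP } overlap_kind;

/* Assumes a flat address space where uintptr_t ordering is meaningful. */
overlap_kind classify_overlap(const void *dst, const void *src, size_t n) {
  uintptr_t d = (uintptr_t)dst;
  uintptr_t s = (uintptr_t)src;
  if (n == 0)
    return DISJOINT; /* empty regions cannot overlap */
  if (d == s)
    return PERFECT_ALIAS; /* same start, same length */
  uintptr_t gap = d > s ? d - s : s - d;
  return gap < n ? PARTIAL_OVERLAP : DISJOINT;
}

/* Aggregate assignment. C permits dst == src here (exact overlap),
   and a frontend may lower the copy to a memcpy-like operation. */
void assign(S *dst, const S *src) { *dst = *src; }

/* Small usage example covering all three cases. */
void example_overlap(void) {
  S a = {{0, 1, 2, 3, 4, 5, 6, 7}};
  S b = {{0}};
  unsigned char buf[16] = {0};

  assign(&b, &a); /* distinct objects */
  assert(classify_overlap(&b, &a, sizeof(S)) == DISJOINT);

  assign(&a, &a); /* self-assignment: the "perfect alias" case */
  assert(classify_overlap(&a, &a, sizeof(S)) == PERFECT_ALIAS);

  /* The case that neither memcpy nor aggregate assignment allows. */
  assert(classify_overlap(buf + 4, buf, 8) == PARTIAL_OVERLAP);
}
```

Only the first case is safe to tag as noalias; the second is the one aggregate copies can produce.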
Hal Finkel via llvm-dev
2015-Aug-19  20:17 UTC
[llvm-dev] [RFC] Generalize llvm.memcpy / llvm.memmove intrinsics.
----- Original Message -----
> From: "Mehdi Amini" <mehdi.amini at apple.com>
> To: "Hal Finkel" <hfinkel at anl.gov>
> Cc: "Philip Reames" <listmail at philipreames.com>, "LLVM Developers Mailing List" <llvm-dev at lists.llvm.org>
> Sent: Wednesday, August 19, 2015 2:54:56 PM
> Subject: Re: [llvm-dev] [RFC] Generalize llvm.memcpy / llvm.memmove intrinsics.
>
> > On Aug 19, 2015, at 12:01 PM, Hal Finkel via llvm-dev
> > <llvm-dev at lists.llvm.org> wrote:
> >
> > We could add some "memcpy-allows-self-copies" metadata, and have
> > Clang tag its associated aggregate copies with it. That would also
> > work.
>
> Isn’t this introducing instruction-wise “correctness”-related metadata?
> Shouldn’t it be the opposite for correctness, i.e.
> “memcpy-disallows-self-copies”?
> (correctness in the sense that dropping the metadata does not break
> anything).

Indeed, you're correct.

 -Hal

--
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory
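The polarity argument above can be modelled in a few lines of C. This is only a toy model under stated assumptions: `hint_state` and both functions are hypothetical names, not LLVM APIs, and "dropped" stands for any pass discarding the metadata.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model: a hint is either still attached to the call
   or has been dropped by some intermediate pass. */
typedef enum { HINT_DROPPED, HINT_PRESENT } hint_state;

/* Polarity A: "memcpy-allows-self-copies". Absence of the hint is
   read as "operands never alias", so dropping it changes meaning. */
bool assume_noalias_if_allows_polarity(hint_state allows_self_copies) {
  return allows_self_copies == HINT_DROPPED;
}

/* Polarity B: "memcpy-disallows-self-copies". Absence of the hint
   is read as "assume nothing", so dropping it only loses precision. */
bool assume_noalias_if_disallows_polarity(hint_state disallows_self_copies) {
  return disallows_self_copies == HINT_PRESENT;
}

void example_polarity(void) {
  /* Dropping the hint under polarity A silently enables the
     optimistic assumption, which is unsafe for aggregate copies. */
  assert(assume_noalias_if_allows_polarity(HINT_DROPPED) == true);
  /* Dropping it under polarity B falls back to the conservative
     answer, which is always safe. */
  assert(assume_noalias_if_disallows_polarity(HINT_DROPPED) == false);
}
```

Under polarity B, a pass that does not understand the hint can discard it and lose only optimization opportunities, which is the property asked for.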
Gerolf Hoflehner via llvm-dev
2015-Aug-19  20:56 UTC
[llvm-dev] [RFC] Generalize llvm.memcpy / llvm.memmove intrinsics.
> On Aug 19, 2015, at 12:54 PM, Mehdi Amini via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>
>> On Aug 19, 2015, at 12:01 PM, Hal Finkel via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>>
>> But that's not quite what we want. We want to say: These can't
>> alias, unless they're exactly equal. noalias means that it does not
>> alias at all, nor do any derived pointers, and obviously the lack of
>> it says nothing.
>>
>> This way we can still make aliasing assumptions if we can prove that
>> src != destination, which is often easier than proving things
>> accounting for overlaps.
>
> Is this limited to the memcpy case, or are there other use cases, so
> that it would be worth having an attribute other than noalias that
> would carry this semantic (“nooverlap”)?

I was wondering about that, too. This looks like information either the
user has or the compiler could derive. Would it be best to condense the
properties into *alias* and *align* attributes that are also user
visible?
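As a point of comparison, C already lets a user state both kinds of property at the source level, which may be roughly what user-visible *alias* and *align* information could look like. The sketch below is an assumption-laden illustration, not a proposal from the thread: `copy_words` is a hypothetical helper, `restrict` stands in for the "no overlap" promise, and the destination's alignment is implied by its `uint32_t *` type while the source is a byte pointer with no alignment guarantee.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper. 'restrict' promises that the regions do not
   overlap; dst is suitably aligned for uint32_t by its type, while
   src may have any alignment. */
void copy_words(uint32_t *restrict dst, const unsigned char *restrict src,
                size_t nwords) {
  for (size_t i = 0; i < nwords; ++i) {
    uint32_t w;
    memcpy(&w, src + i * sizeof w, sizeof w); /* alignment-agnostic load */
    dst[i] = w;                               /* naturally aligned store */
  }
}

/* Usage: copy two words out of a deliberately misaligned source. */
void example_copy_words(void) {
  unsigned char raw[1 + 2 * sizeof(uint32_t)];
  uint32_t out[2];
  for (size_t i = 0; i < sizeof raw; ++i)
    raw[i] = (unsigned char)i;
  copy_words(out, raw + 1, 2);
  assert(memcmp(out, raw + 1, sizeof out) == 0);
}
```

Here only the loads have to be treated as unaligned; the stores can use the destination's full alignment, which is the asymmetry that separate source and destination alignments would expose.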
Lang Hames via llvm-dev
2015-Aug-20  21:26 UTC
[llvm-dev] [RFC] Generalize llvm.memcpy / llvm.memmove intrinsics.
Pete - That patch sounds great!

Philip, Hal, Mehdi, Gerolf - Thanks very much for the feedback.

So how about this:

(1) We drop llvm.memcpy's alignment argument and use Pete's
alignment-via-metadata patch (whatever version of it passes review).

(2) llvm.memcpy retains its current semantics, but we teach clang,
SimplifyLibCalls, etc. to add noalias metadata where we know it's safe.

Dropping the alignment argument will still change the signature of
llvm.memcpy / llvm.memmove, so I guess there's one other issue worth
discussing: Should we also split 'isVolatile' into 'isSrcVolatile' and
'isDstVolatile'? Nobody has asked for this as far as I know, but I
believe it would improve codegen in some cases. E.g.:
typedef struct {
  unsigned X[8];
} S;
unsigned foo(volatile S* s) {
  S t = *s;
  return t.X[4];
}
If the frontend lowers the struct copy to a volatile memcpy, we'll have
to copy the whole struct before reading part of 't'. If we could mark
only the source as volatile, then we could discard the stores to 't'.

Again - nobody has asked for this, but if there's interest, now would be
a good time to look at it.
Cheers,
Lang.
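The source-only-volatile idea in the message above can be approximated in plain C. This is a hedged sketch rather than what any frontend emits: `copy_from_volatile` is a hypothetical helper in which every read of the source is a volatile access while the destination is ordinary memory.

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
  unsigned X[8];
} S;

/* Hypothetical helper: each source element is read through a
   volatile lvalue, so all eight loads must be performed; the stores
   go to ordinary, non-volatile memory. */
void copy_from_volatile(S *dst, const volatile S *src) {
  for (size_t i = 0; i < sizeof dst->X / sizeof dst->X[0]; ++i)
    dst->X[i] = src->X[i];
}

unsigned foo(volatile S *s) {
  S t;
  copy_from_volatile(&t, s);
  /* Only t.X[4] is used afterwards, so an optimizer is free to drop
     the other seven stores to 't' while keeping every volatile load. */
  return t.X[4];
}

/* Usage example with a volatile object as the source. */
void example_foo(void) {
  volatile S v = {{10, 11, 12, 13, 14, 15, 16, 17}};
  assert(foo(&v) == 14);
}
```

With a single 'isVolatile' flag covering both operands, the stores to 't' would also have to be treated as volatile, which is the cost the proposed split is meant to avoid.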