Johannes Doerfert via llvm-dev
2021-Jan-18 21:30 UTC
[llvm-dev] GVN removing loads that are affected by call
A function that returns a pointer argument "captures" it for now. The Attributor uses "nocapute_maybe_returned" internally but it is not an enum attribute (yet). On 1/15/21 7:32 PM, Ryan Taylor wrote:> So this seems to be an issue with the intrinsic setting NoCapture<0> (it > returns the pointer it takes). We have a similar intrinsic that does the > exact same thing but doesn't return the pointer. It also has NoCapture<1> > and doesn't cause this issue. Also, I managed to substitute memcpy and > there were no issues, it also has nocapture. > > Are there some papers/citations which talk more about Pointer > Capture/Capture Tracking? > > I saw this blog: > https://jonasdevlieghere.com/escape-analysis-capture-tracking-in-llvm/ > And I read the CaptureTracking.h short description but I'd like a clearer > grasp. > > Thanks, > > Ryan > > > > On Thu, Jan 14, 2021 at 2:03 PM Ryan Taylor <ryta1203 at gmail.com> wrote: > >> Ok, thanks. I'll try to come up with a more generic test case. >> >> On Thu, Jan 14, 2021 at 2:00 PM Johannes Doerfert < >> johannesdoerfert at gmail.com> wrote: >> >>> Nothing comes to mind given the information. >>> >>> ~ Johannes >>> >>> >>> On 1/14/21 12:56 PM, Ryan Taylor wrote: >>>> argmemonly, nounwind, writeonly >>>> >>>> -fno-strict-aliasing did not help. >>>> >>>> On Thu, Jan 14, 2021 at 1:29 PM Johannes Doerfert < >>>> johannesdoerfert at gmail.com> wrote: >>>> >>>>> Can you share the attributes of the intrinsic declaration, >>>>> assuming removing `!tbaa` didn't help. >>>>> >>>>> >>>>> On 1/14/21 11:43 AM, Ryan Taylor wrote: >>>>>> Yes, this is for downstream/out of tree target so I'm not sure how you >>>>>> could reproduce it either but I thought the IR might help a bit. >>>>>> >>>>>> My guess is it's not a bug in GVN as much as an issue with the >>> intrinsic >>>>>> properties, or lack thereof. I put this in the first post but the >>> Alias >>>>>> Sets show: >>>>>> >>>>>> AliasSet[0x75fbd0, 9] may alias, Mod/Ref Pointers: (i8* %1, 8), >>> (i16* >>>>>> %arrayidx, 2), (i16* %arrayidx1, 2), (i16* %arrayidx5, 2), (i16* %19, >>> 2), >>>>>> (i16* %arrayidx6, 2), (i16* %arrayidx14, 2), (i16* %arrayidx19, 2) >>>>>> 4 Unknown instructions: void <badref>, <4 x i16>* %6, <4 x i16>* >>>>> %22, >>>>>> void <badref> >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> On Thu, Jan 14, 2021 at 12:37 PM Johannes Doerfert < >>>>>> johannesdoerfert at gmail.com> wrote: >>>>>> >>>>>>> There is still not enough information here. >>>>>>> >>>>>>> My first guess. The `!tbaa` annotation on the `XXX.intrinsic` and the >>>>>>> `load` >>>>>>> basically encode there is no alias. Easy to verify, remove the ones >>> on >>>>> the >>>>>>> intrinsic. >>>>>>> >>>>>>> ~ Johannes >>>>>>> >>>>>>> >>>>>>> P.S. If this was a bug in GVN, and I assume it is not, a reproducer >>>>>>> would help >>>>>>> a lot. So a small IR sample that shows the problem and which >>> we >>>>>>> can run. >>>>>>> This is a "redacted?" IR fragment in which I don't know what >>>>>>> transformation >>>>>>> is problematic. I also can not run it through GVN, which >>> makes it >>>>>>> impossible >>>>>>> to reproduce. >>>>>>> >>>>>>> >>>>>>> On 1/14/21 11:27 AM, Ryan Taylor via llvm-dev wrote: >>>>>>>> This is right before GVN: >>>>>>>> >>>>>>>> define i32 @foo(<4 x i16> %p, <4 x i16> %p1, i16* nocapture %res) >>>>>>>> local_unnamed_addr #0 !dbg !6 { >>>>>>>> entry: >>>>>>>> %temp = alloca i64, align 8 >>>>>>>> %tmpcast = bitcast i64* %temp to [4 x i16]* >>>>>>>> %0 = bitcast i64* %temp to i8*, !dbg !8 >>>>>>>> call void @llvm.lifetime.start.p0i8(i64 8, i8* nonnull %0) #3, >>>>> !dbg !8 >>>>>>>> store i64 0, i64* %temp, align 8, !dbg !9 >>>>>>>> %1 = bitcast i64* %temp to <4 x i16>*, !dbg !10 >>>>>>>> %2 = call <4 x i16>* @llvm.XXX.intrinsic(<4 x i16>* nonnull >>> %1, <4 >>>>> x >>>>>>> i16> >>>>>>>> %p, i32 0), !dbg !11, !tbaa !12 >>>>>>>> %arrayidx = bitcast i64* %temp to i16*, !dbg !16 >>>>>>>> %3 = load i16, i16* %arrayidx, align 8, !dbg !16, !tbaa !17 >>>>>>>> br label %for.body, !dbg !19 >>>>>>>> >>>>>>>> for.body: ; preds = %entry >>>>>>>> %arrayidx1 = getelementptr inbounds [4 x i16], [4 x i16]* >>> %tmpcast, >>>>>>> i32 >>>>>>>> 0, i32 1, !dbg !20 >>>>>>>> %4 = load i16, i16* %arrayidx1, align 2, !dbg !20, !tbaa !17 >>>>>>>> %cmp3 = icmp sgt i16 %3, %4, !dbg !21 >>>>>>>> %spec.select = select i1 %cmp3, i16 %4, i16 %3, !dbg !22 >>>>>>>> %arrayidx1.1 = getelementptr inbounds [4 x i16], [4 x i16]* >>>>> %tmpcast, >>>>>>> i32 >>>>>>>> 0, i32 2, !dbg !20 >>>>>>>> %5 = load i16, i16* %arrayidx1.1, align 2, !dbg !20, !tbaa !17 >>>>>>>> %cmp3.1 = icmp sgt i16 %spec.select, %5, !dbg !21 >>>>>>>> %spec.select.1 = select i1 %cmp3.1, i16 %5, i16 %spec.select, >>> !dbg >>>>> !22 >>>>>>>> %arrayidx1.2 = getelementptr inbounds [4 x i16], [4 x i16]* >>>>> %tmpcast, >>>>>>> i32 >>>>>>>> 0, i32 3, !dbg !20 >>>>>>>> %6 = load i16, i16* %arrayidx1.2, align 2, !dbg !20, !tbaa !17 >>>>>>>> %cmp3.2 = icmp sgt i16 %spec.select.1, %6, !dbg !21 >>>>>>>> %spec.select.2 = select i1 %cmp3.2, i16 %6, i16 %spec.select.1, >>>>> !dbg >>>>>>> !22 >>>>>>>> store i16 %spec.select.2, i16* %res, align 2, !dbg !23, !tbaa >>> !17 >>>>>>>> %7 = tail call <4 x i16>* @llvm.XXX.intrinsic(<4 x i16>* %2, >>> <4 x >>>>> i16> >>>>>>>> %p1, i32 0), !dbg !24, !tbaa !12 >>>>>>>> %8 = load i16, i16* %arrayidx, align 8, !dbg !25, !tbaa !17 >>>>>>>> br label %for.body12, !dbg !26 >>>>>>>> >>>>>>>> for.body12: ; preds >>> %for.body >>>>>>>> %arrayidx14 = getelementptr inbounds [4 x i16], [4 x i16]* >>>>> %tmpcast, >>>>>>> i32 >>>>>>>> 0, i32 1, !dbg !27 >>>>>>>> %9 = load i16, i16* %arrayidx14, align 2, !dbg !27, !tbaa !17 >>>>>>>> %cmp16 = icmp sgt i16 %8, %9, !dbg !28 >>>>>>>> %spec.select39 = select i1 %cmp16, i16 %9, i16 %8, !dbg !29 >>>>>>>> %arrayidx14.1 = getelementptr inbounds [4 x i16], [4 x i16]* >>>>> %tmpcast, >>>>>>>> i32 0, i32 2, !dbg !27 >>>>>>>> %10 = load i16, i16* %arrayidx14.1, align 2, !dbg !27, !tbaa >>> !17 >>>>>>>> %cmp16.1 = icmp sgt i16 %spec.select39, %10, !dbg !28 >>>>>>>> %spec.select39.1 = select i1 %cmp16.1, i16 %10, i16 >>> %spec.select39, >>>>>>> !dbg >>>>>>>> !29 >>>>>>>> %arrayidx14.2 = getelementptr inbounds [4 x i16], [4 x i16]* >>>>> %tmpcast, >>>>>>>> i32 0, i32 3, !dbg !27 >>>>>>>> %11 = load i16, i16* %arrayidx14.2, align 2, !dbg !27, !tbaa >>> !17 >>>>>>>> %cmp16.2 = icmp sgt i16 %spec.select39.1, %11, !dbg !28 >>>>>>>> %spec.select39.2 = select i1 %cmp16.2, i16 %11, i16 >>>>> %spec.select39.1, >>>>>>>> !dbg !29 >>>>>>>> %conv24 = sext i16 %spec.select39.2 to i32, !dbg !30 >>>>>>>> call void @llvm.lifetime.end.p0i8(i64 8, i8* nonnull %0) #3, >>> !dbg >>>>> !31 >>>>>>>> ret i32 %conv24, !dbg !32 >>>>>>>> >>>>>>>> On Thu, Jan 14, 2021 at 11:54 AM Roman Lebedev < >>> lebedev.ri at gmail.com> >>>>>>> wrote: >>>>>>>>> It would be good to have an actual IR reproducer here. >>>>>>>>> >>>>>>>>> On Thu, Jan 14, 2021 at 7:51 PM Ryan Taylor via llvm-dev >>>>>>>>> <llvm-dev at lists.llvm.org> wrote: >>>>>>>>>> So given an intrinsic that has a pointer as in/out and >>> IntrWriteMem >>>>>>>>> property. >>>>>>>>>> call intrinsic(address a, ....); >>>>>>>>>> loop over address a >>>>>>>>>> load from address a + offset >>>>>>>>>> call intrinsic (address a, ...); >>>>>>>>>> loop over address a >>>>>>>>>> load from address a + offset >>>>>>>>>> >>>>>>>>>> GVN is removing the second loads, despite the second call >>> overwriting >>>>>>>>> the memory starting at address a. AA has the intrinsics marked as >>>>>>> unknown >>>>>>>>> instructions but has all of these as mayAlias in a set. I'm not >>> seeing >>>>>>> this >>>>>>>>> issue with -fno-unroll-loops. >>>>>>>>>> Thanks. >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> _______________________________________________ >>>>>>>>>> LLVM Developers mailing list >>>>>>>>>> llvm-dev at lists.llvm.org >>>>>>>>>> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev >>>>>>>> _______________________________________________ >>>>>>>> LLVM Developers mailing list >>>>>>>> llvm-dev at lists.llvm.org >>>>>>>> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
Ryan Taylor via llvm-dev
2021-Jan-18 22:55 UTC
[llvm-dev] GVN removing loads that are affected by call
I'm still a bit fuzzy on the definition but that doesn't seem to completely agree with pointer capture? " A pointer value *is captured* if the function makes a copy of any part of the pointer that outlives the call. " I can replace memcpy in our example and see no issues. memcpy is no capture. #define SIZE 5 #include<string> int foo(short a[SIZE], short b[SIZE], short *res) { short temp[SIZE] = {0, 0, 0, 0, 0}; short *ptr; ptr = temp; memcpy(ptr, b, SIZE*2); short t = temp[0]; for (int i = 1; i < 4; i++) { if (t > temp[i]) t = temp[i]; } *res = t; memcpy(ptr, b, SIZE*2); t = temp[0]; for (int i = 1; i < 4; i++) { if (t > temp[i]) t = temp[i]; } return t; } So you're saying that if the memcpy intrinsic, in this case, returned a pointer (even if not directly used, ie "ptr" in this case) and had NoCapture attribute, the second set of loads would be removed? -Ryan On Mon, Jan 18, 2021 at 4:30 PM Johannes Doerfert < johannesdoerfert at gmail.com> wrote:> A function that returns a pointer argument "captures" it for now. > > The Attributor uses "nocapute_maybe_returned" internally but it is not > an enum attribute (yet). > > > On 1/15/21 7:32 PM, Ryan Taylor wrote: > > So this seems to be an issue with the intrinsic setting NoCapture<0> (it > > returns the pointer it takes). We have a similar intrinsic that does the > > exact same thing but doesn't return the pointer. It also has NoCapture<1> > > and doesn't cause this issue. Also, I managed to substitute memcpy and > > there were no issues, it also has nocapture. > > > > Are there some papers/citations which talk more about Pointer > > Capture/Capture Tracking? > > > > I saw this blog: > > https://jonasdevlieghere.com/escape-analysis-capture-tracking-in-llvm/ > > And I read the CaptureTracking.h short description but I'd like a clearer > > grasp. > > > > Thanks, > > > > Ryan > > > > > > > > On Thu, Jan 14, 2021 at 2:03 PM Ryan Taylor <ryta1203 at gmail.com> wrote: > > > >> Ok, thanks. I'll try to come up with a more generic test case. > >> > >> On Thu, Jan 14, 2021 at 2:00 PM Johannes Doerfert < > >> johannesdoerfert at gmail.com> wrote: > >> > >>> Nothing comes to mind given the information. > >>> > >>> ~ Johannes > >>> > >>> > >>> On 1/14/21 12:56 PM, Ryan Taylor wrote: > >>>> argmemonly, nounwind, writeonly > >>>> > >>>> -fno-strict-aliasing did not help. > >>>> > >>>> On Thu, Jan 14, 2021 at 1:29 PM Johannes Doerfert < > >>>> johannesdoerfert at gmail.com> wrote: > >>>> > >>>>> Can you share the attributes of the intrinsic declaration, > >>>>> assuming removing `!tbaa` didn't help. > >>>>> > >>>>> > >>>>> On 1/14/21 11:43 AM, Ryan Taylor wrote: > >>>>>> Yes, this is for downstream/out of tree target so I'm not sure how > you > >>>>>> could reproduce it either but I thought the IR might help a bit. > >>>>>> > >>>>>> My guess is it's not a bug in GVN as much as an issue with the > >>> intrinsic > >>>>>> properties, or lack thereof. I put this in the first post but the > >>> Alias > >>>>>> Sets show: > >>>>>> > >>>>>> AliasSet[0x75fbd0, 9] may alias, Mod/Ref Pointers: (i8* %1, 8), > >>> (i16* > >>>>>> %arrayidx, 2), (i16* %arrayidx1, 2), (i16* %arrayidx5, 2), (i16* > %19, > >>> 2), > >>>>>> (i16* %arrayidx6, 2), (i16* %arrayidx14, 2), (i16* %arrayidx19, 2) > >>>>>> 4 Unknown instructions: void <badref>, <4 x i16>* %6, <4 x > i16>* > >>>>> %22, > >>>>>> void <badref> > >>>>>> > >>>>>> > >>>>>> > >>>>>> > >>>>>> > >>>>>> On Thu, Jan 14, 2021 at 12:37 PM Johannes Doerfert < > >>>>>> johannesdoerfert at gmail.com> wrote: > >>>>>> > >>>>>>> There is still not enough information here. > >>>>>>> > >>>>>>> My first guess. The `!tbaa` annotation on the `XXX.intrinsic` and > the > >>>>>>> `load` > >>>>>>> basically encode there is no alias. Easy to verify, remove the ones > >>> on > >>>>> the > >>>>>>> intrinsic. > >>>>>>> > >>>>>>> ~ Johannes > >>>>>>> > >>>>>>> > >>>>>>> P.S. If this was a bug in GVN, and I assume it is not, a reproducer > >>>>>>> would help > >>>>>>> a lot. So a small IR sample that shows the problem and > which > >>> we > >>>>>>> can run. > >>>>>>> This is a "redacted?" IR fragment in which I don't know > what > >>>>>>> transformation > >>>>>>> is problematic. I also can not run it through GVN, which > >>> makes it > >>>>>>> impossible > >>>>>>> to reproduce. > >>>>>>> > >>>>>>> > >>>>>>> On 1/14/21 11:27 AM, Ryan Taylor via llvm-dev wrote: > >>>>>>>> This is right before GVN: > >>>>>>>> > >>>>>>>> define i32 @foo(<4 x i16> %p, <4 x i16> %p1, i16* nocapture %res) > >>>>>>>> local_unnamed_addr #0 !dbg !6 { > >>>>>>>> entry: > >>>>>>>> %temp = alloca i64, align 8 > >>>>>>>> %tmpcast = bitcast i64* %temp to [4 x i16]* > >>>>>>>> %0 = bitcast i64* %temp to i8*, !dbg !8 > >>>>>>>> call void @llvm.lifetime.start.p0i8(i64 8, i8* nonnull %0) > #3, > >>>>> !dbg !8 > >>>>>>>> store i64 0, i64* %temp, align 8, !dbg !9 > >>>>>>>> %1 = bitcast i64* %temp to <4 x i16>*, !dbg !10 > >>>>>>>> %2 = call <4 x i16>* @llvm.XXX.intrinsic(<4 x i16>* nonnull > >>> %1, <4 > >>>>> x > >>>>>>> i16> > >>>>>>>> %p, i32 0), !dbg !11, !tbaa !12 > >>>>>>>> %arrayidx = bitcast i64* %temp to i16*, !dbg !16 > >>>>>>>> %3 = load i16, i16* %arrayidx, align 8, !dbg !16, !tbaa !17 > >>>>>>>> br label %for.body, !dbg !19 > >>>>>>>> > >>>>>>>> for.body: ; preds = %entry > >>>>>>>> %arrayidx1 = getelementptr inbounds [4 x i16], [4 x i16]* > >>> %tmpcast, > >>>>>>> i32 > >>>>>>>> 0, i32 1, !dbg !20 > >>>>>>>> %4 = load i16, i16* %arrayidx1, align 2, !dbg !20, !tbaa !17 > >>>>>>>> %cmp3 = icmp sgt i16 %3, %4, !dbg !21 > >>>>>>>> %spec.select = select i1 %cmp3, i16 %4, i16 %3, !dbg !22 > >>>>>>>> %arrayidx1.1 = getelementptr inbounds [4 x i16], [4 x i16]* > >>>>> %tmpcast, > >>>>>>> i32 > >>>>>>>> 0, i32 2, !dbg !20 > >>>>>>>> %5 = load i16, i16* %arrayidx1.1, align 2, !dbg !20, !tbaa > !17 > >>>>>>>> %cmp3.1 = icmp sgt i16 %spec.select, %5, !dbg !21 > >>>>>>>> %spec.select.1 = select i1 %cmp3.1, i16 %5, i16 > %spec.select, > >>> !dbg > >>>>> !22 > >>>>>>>> %arrayidx1.2 = getelementptr inbounds [4 x i16], [4 x i16]* > >>>>> %tmpcast, > >>>>>>> i32 > >>>>>>>> 0, i32 3, !dbg !20 > >>>>>>>> %6 = load i16, i16* %arrayidx1.2, align 2, !dbg !20, !tbaa > !17 > >>>>>>>> %cmp3.2 = icmp sgt i16 %spec.select.1, %6, !dbg !21 > >>>>>>>> %spec.select.2 = select i1 %cmp3.2, i16 %6, i16 > %spec.select.1, > >>>>> !dbg > >>>>>>> !22 > >>>>>>>> store i16 %spec.select.2, i16* %res, align 2, !dbg !23, > !tbaa > >>> !17 > >>>>>>>> %7 = tail call <4 x i16>* @llvm.XXX.intrinsic(<4 x i16>* %2, > >>> <4 x > >>>>> i16> > >>>>>>>> %p1, i32 0), !dbg !24, !tbaa !12 > >>>>>>>> %8 = load i16, i16* %arrayidx, align 8, !dbg !25, !tbaa !17 > >>>>>>>> br label %for.body12, !dbg !26 > >>>>>>>> > >>>>>>>> for.body12: ; preds > >>> %for.body > >>>>>>>> %arrayidx14 = getelementptr inbounds [4 x i16], [4 x i16]* > >>>>> %tmpcast, > >>>>>>> i32 > >>>>>>>> 0, i32 1, !dbg !27 > >>>>>>>> %9 = load i16, i16* %arrayidx14, align 2, !dbg !27, !tbaa > !17 > >>>>>>>> %cmp16 = icmp sgt i16 %8, %9, !dbg !28 > >>>>>>>> %spec.select39 = select i1 %cmp16, i16 %9, i16 %8, !dbg !29 > >>>>>>>> %arrayidx14.1 = getelementptr inbounds [4 x i16], [4 x i16]* > >>>>> %tmpcast, > >>>>>>>> i32 0, i32 2, !dbg !27 > >>>>>>>> %10 = load i16, i16* %arrayidx14.1, align 2, !dbg !27, !tbaa > >>> !17 > >>>>>>>> %cmp16.1 = icmp sgt i16 %spec.select39, %10, !dbg !28 > >>>>>>>> %spec.select39.1 = select i1 %cmp16.1, i16 %10, i16 > >>> %spec.select39, > >>>>>>> !dbg > >>>>>>>> !29 > >>>>>>>> %arrayidx14.2 = getelementptr inbounds [4 x i16], [4 x i16]* > >>>>> %tmpcast, > >>>>>>>> i32 0, i32 3, !dbg !27 > >>>>>>>> %11 = load i16, i16* %arrayidx14.2, align 2, !dbg !27, !tbaa > >>> !17 > >>>>>>>> %cmp16.2 = icmp sgt i16 %spec.select39.1, %11, !dbg !28 > >>>>>>>> %spec.select39.2 = select i1 %cmp16.2, i16 %11, i16 > >>>>> %spec.select39.1, > >>>>>>>> !dbg !29 > >>>>>>>> %conv24 = sext i16 %spec.select39.2 to i32, !dbg !30 > >>>>>>>> call void @llvm.lifetime.end.p0i8(i64 8, i8* nonnull %0) #3, > >>> !dbg > >>>>> !31 > >>>>>>>> ret i32 %conv24, !dbg !32 > >>>>>>>> > >>>>>>>> On Thu, Jan 14, 2021 at 11:54 AM Roman Lebedev < > >>> lebedev.ri at gmail.com> > >>>>>>> wrote: > >>>>>>>>> It would be good to have an actual IR reproducer here. > >>>>>>>>> > >>>>>>>>> On Thu, Jan 14, 2021 at 7:51 PM Ryan Taylor via llvm-dev > >>>>>>>>> <llvm-dev at lists.llvm.org> wrote: > >>>>>>>>>> So given an intrinsic that has a pointer as in/out and > >>> IntrWriteMem > >>>>>>>>> property. > >>>>>>>>>> call intrinsic(address a, ....); > >>>>>>>>>> loop over address a > >>>>>>>>>> load from address a + offset > >>>>>>>>>> call intrinsic (address a, ...); > >>>>>>>>>> loop over address a > >>>>>>>>>> load from address a + offset > >>>>>>>>>> > >>>>>>>>>> GVN is removing the second loads, despite the second call > >>> overwriting > >>>>>>>>> the memory starting at address a. AA has the intrinsics marked as > >>>>>>> unknown > >>>>>>>>> instructions but has all of these as mayAlias in a set. I'm not > >>> seeing > >>>>>>> this > >>>>>>>>> issue with -fno-unroll-loops. > >>>>>>>>>> Thanks. > >>>>>>>>>> > >>>>>>>>>> > >>>>>>>>>> > >>>>>>>>>> > >>>>>>>>>> _______________________________________________ > >>>>>>>>>> LLVM Developers mailing list > >>>>>>>>>> llvm-dev at lists.llvm.org > >>>>>>>>>> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev > >>>>>>>> _______________________________________________ > >>>>>>>> LLVM Developers mailing list > >>>>>>>> llvm-dev at lists.llvm.org > >>>>>>>> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20210118/508c7f14/attachment.html>