Ryan Taylor via llvm-dev
2021-Jan-16 01:32 UTC
[llvm-dev] GVN removing loads that are affected by call
So this seems to be an issue with the intrinsic setting NoCapture<0> (it returns the pointer it takes). We have a similar intrinsic that does the exact same thing but doesn't return the pointer. It also has NoCapture<1> and doesn't cause this issue. Also, I managed to substitute memcpy and there were no issues, it also has nocapture. Are there some papers/citations which talk more about Pointer Capture/Capture Tracking? I saw this blog: https://jonasdevlieghere.com/escape-analysis-capture-tracking-in-llvm/ And I read the CaptureTracking.h short description but I'd like a clearer grasp. Thanks, Ryan On Thu, Jan 14, 2021 at 2:03 PM Ryan Taylor <ryta1203 at gmail.com> wrote:> Ok, thanks. I'll try to come up with a more generic test case. > > On Thu, Jan 14, 2021 at 2:00 PM Johannes Doerfert < > johannesdoerfert at gmail.com> wrote: > >> Nothing comes to mind given the information. >> >> ~ Johannes >> >> >> On 1/14/21 12:56 PM, Ryan Taylor wrote: >> > argmemonly, nounwind, writeonly >> > >> > -fno-strict-aliasing did not help. >> > >> > On Thu, Jan 14, 2021 at 1:29 PM Johannes Doerfert < >> > johannesdoerfert at gmail.com> wrote: >> > >> >> Can you share the attributes of the intrinsic declaration, >> >> assuming removing `!tbaa` didn't help. >> >> >> >> >> >> On 1/14/21 11:43 AM, Ryan Taylor wrote: >> >>> Yes, this is for downstream/out of tree target so I'm not sure how you >> >>> could reproduce it either but I thought the IR might help a bit. >> >>> >> >>> My guess is it's not a bug in GVN as much as an issue with the >> intrinsic >> >>> properties, or lack thereof. I put this in the first post but the >> Alias >> >>> Sets show: >> >>> >> >>> AliasSet[0x75fbd0, 9] may alias, Mod/Ref Pointers: (i8* %1, 8), >> (i16* >> >>> %arrayidx, 2), (i16* %arrayidx1, 2), (i16* %arrayidx5, 2), (i16* %19, >> 2), >> >>> (i16* %arrayidx6, 2), (i16* %arrayidx14, 2), (i16* %arrayidx19, 2) >> >>> 4 Unknown instructions: void <badref>, <4 x i16>* %6, <4 x i16>* >> >> %22, >> >>> void <badref> >> >>> >> >>> >> >>> >> >>> >> >>> >> >>> On Thu, Jan 14, 2021 at 12:37 PM Johannes Doerfert < >> >>> johannesdoerfert at gmail.com> wrote: >> >>> >> >>>> There is still not enough information here. >> >>>> >> >>>> My first guess. The `!tbaa` annotation on the `XXX.intrinsic` and the >> >>>> `load` >> >>>> basically encode there is no alias. Easy to verify, remove the ones >> on >> >> the >> >>>> intrinsic. >> >>>> >> >>>> ~ Johannes >> >>>> >> >>>> >> >>>> P.S. If this was a bug in GVN, and I assume it is not, a reproducer >> >>>> would help >> >>>> a lot. So a small IR sample that shows the problem and which >> we >> >>>> can run. >> >>>> This is a "redacted?" IR fragment in which I don't know what >> >>>> transformation >> >>>> is problematic. I also can not run it through GVN, which >> makes it >> >>>> impossible >> >>>> to reproduce. >> >>>> >> >>>> >> >>>> On 1/14/21 11:27 AM, Ryan Taylor via llvm-dev wrote: >> >>>>> This is right before GVN: >> >>>>> >> >>>>> define i32 @foo(<4 x i16> %p, <4 x i16> %p1, i16* nocapture %res) >> >>>>> local_unnamed_addr #0 !dbg !6 { >> >>>>> entry: >> >>>>> %temp = alloca i64, align 8 >> >>>>> %tmpcast = bitcast i64* %temp to [4 x i16]* >> >>>>> %0 = bitcast i64* %temp to i8*, !dbg !8 >> >>>>> call void @llvm.lifetime.start.p0i8(i64 8, i8* nonnull %0) #3, >> >> !dbg !8 >> >>>>> store i64 0, i64* %temp, align 8, !dbg !9 >> >>>>> %1 = bitcast i64* %temp to <4 x i16>*, !dbg !10 >> >>>>> %2 = call <4 x i16>* @llvm.XXX.intrinsic(<4 x i16>* nonnull >> %1, <4 >> >> x >> >>>> i16> >> >>>>> %p, i32 0), !dbg !11, !tbaa !12 >> >>>>> %arrayidx = bitcast i64* %temp to i16*, !dbg !16 >> >>>>> %3 = load i16, i16* %arrayidx, align 8, !dbg !16, !tbaa !17 >> >>>>> br label %for.body, !dbg !19 >> >>>>> >> >>>>> for.body: ; preds = %entry >> >>>>> %arrayidx1 = getelementptr inbounds [4 x i16], [4 x i16]* >> %tmpcast, >> >>>> i32 >> >>>>> 0, i32 1, !dbg !20 >> >>>>> %4 = load i16, i16* %arrayidx1, align 2, !dbg !20, !tbaa !17 >> >>>>> %cmp3 = icmp sgt i16 %3, %4, !dbg !21 >> >>>>> %spec.select = select i1 %cmp3, i16 %4, i16 %3, !dbg !22 >> >>>>> %arrayidx1.1 = getelementptr inbounds [4 x i16], [4 x i16]* >> >> %tmpcast, >> >>>> i32 >> >>>>> 0, i32 2, !dbg !20 >> >>>>> %5 = load i16, i16* %arrayidx1.1, align 2, !dbg !20, !tbaa !17 >> >>>>> %cmp3.1 = icmp sgt i16 %spec.select, %5, !dbg !21 >> >>>>> %spec.select.1 = select i1 %cmp3.1, i16 %5, i16 %spec.select, >> !dbg >> >> !22 >> >>>>> %arrayidx1.2 = getelementptr inbounds [4 x i16], [4 x i16]* >> >> %tmpcast, >> >>>> i32 >> >>>>> 0, i32 3, !dbg !20 >> >>>>> %6 = load i16, i16* %arrayidx1.2, align 2, !dbg !20, !tbaa !17 >> >>>>> %cmp3.2 = icmp sgt i16 %spec.select.1, %6, !dbg !21 >> >>>>> %spec.select.2 = select i1 %cmp3.2, i16 %6, i16 %spec.select.1, >> >> !dbg >> >>>> !22 >> >>>>> store i16 %spec.select.2, i16* %res, align 2, !dbg !23, !tbaa >> !17 >> >>>>> %7 = tail call <4 x i16>* @llvm.XXX.intrinsic(<4 x i16>* %2, >> <4 x >> >> i16> >> >>>>> %p1, i32 0), !dbg !24, !tbaa !12 >> >>>>> %8 = load i16, i16* %arrayidx, align 8, !dbg !25, !tbaa !17 >> >>>>> br label %for.body12, !dbg !26 >> >>>>> >> >>>>> for.body12: ; preds >> %for.body >> >>>>> %arrayidx14 = getelementptr inbounds [4 x i16], [4 x i16]* >> >> %tmpcast, >> >>>> i32 >> >>>>> 0, i32 1, !dbg !27 >> >>>>> %9 = load i16, i16* %arrayidx14, align 2, !dbg !27, !tbaa !17 >> >>>>> %cmp16 = icmp sgt i16 %8, %9, !dbg !28 >> >>>>> %spec.select39 = select i1 %cmp16, i16 %9, i16 %8, !dbg !29 >> >>>>> %arrayidx14.1 = getelementptr inbounds [4 x i16], [4 x i16]* >> >> %tmpcast, >> >>>>> i32 0, i32 2, !dbg !27 >> >>>>> %10 = load i16, i16* %arrayidx14.1, align 2, !dbg !27, !tbaa >> !17 >> >>>>> %cmp16.1 = icmp sgt i16 %spec.select39, %10, !dbg !28 >> >>>>> %spec.select39.1 = select i1 %cmp16.1, i16 %10, i16 >> %spec.select39, >> >>>> !dbg >> >>>>> !29 >> >>>>> %arrayidx14.2 = getelementptr inbounds [4 x i16], [4 x i16]* >> >> %tmpcast, >> >>>>> i32 0, i32 3, !dbg !27 >> >>>>> %11 = load i16, i16* %arrayidx14.2, align 2, !dbg !27, !tbaa >> !17 >> >>>>> %cmp16.2 = icmp sgt i16 %spec.select39.1, %11, !dbg !28 >> >>>>> %spec.select39.2 = select i1 %cmp16.2, i16 %11, i16 >> >> %spec.select39.1, >> >>>>> !dbg !29 >> >>>>> %conv24 = sext i16 %spec.select39.2 to i32, !dbg !30 >> >>>>> call void @llvm.lifetime.end.p0i8(i64 8, i8* nonnull %0) #3, >> !dbg >> >> !31 >> >>>>> ret i32 %conv24, !dbg !32 >> >>>>> >> >>>>> On Thu, Jan 14, 2021 at 11:54 AM Roman Lebedev < >> lebedev.ri at gmail.com> >> >>>> wrote: >> >>>>>> It would be good to have an actual IR reproducer here. >> >>>>>> >> >>>>>> On Thu, Jan 14, 2021 at 7:51 PM Ryan Taylor via llvm-dev >> >>>>>> <llvm-dev at lists.llvm.org> wrote: >> >>>>>>> So given an intrinsic that has a pointer as in/out and >> IntrWriteMem >> >>>>>> property. >> >>>>>>> call intrinsic(address a, ....); >> >>>>>>> loop over address a >> >>>>>>> load from address a + offset >> >>>>>>> call intrinsic (address a, ...); >> >>>>>>> loop over address a >> >>>>>>> load from address a + offset >> >>>>>>> >> >>>>>>> GVN is removing the second loads, despite the second call >> overwriting >> >>>>>> the memory starting at address a. AA has the intrinsics marked as >> >>>> unknown >> >>>>>> instructions but has all of these as mayAlias in a set. I'm not >> seeing >> >>>> this >> >>>>>> issue with -fno-unroll-loops. >> >>>>>>> Thanks. >> >>>>>>> >> >>>>>>> >> >>>>>>> >> >>>>>>> >> >>>>>>> _______________________________________________ >> >>>>>>> LLVM Developers mailing list >> >>>>>>> llvm-dev at lists.llvm.org >> >>>>>>> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev >> >>>>> _______________________________________________ >> >>>>> LLVM Developers mailing list >> >>>>> llvm-dev at lists.llvm.org >> >>>>> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev >> >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20210115/1f62079e/attachment.html>
Johannes Doerfert via llvm-dev
2021-Jan-18 21:30 UTC
[llvm-dev] GVN removing loads that are affected by call
A function that returns a pointer argument "captures" it for now. The Attributor uses "nocapute_maybe_returned" internally but it is not an enum attribute (yet). On 1/15/21 7:32 PM, Ryan Taylor wrote:> So this seems to be an issue with the intrinsic setting NoCapture<0> (it > returns the pointer it takes). We have a similar intrinsic that does the > exact same thing but doesn't return the pointer. It also has NoCapture<1> > and doesn't cause this issue. Also, I managed to substitute memcpy and > there were no issues, it also has nocapture. > > Are there some papers/citations which talk more about Pointer > Capture/Capture Tracking? > > I saw this blog: > https://jonasdevlieghere.com/escape-analysis-capture-tracking-in-llvm/ > And I read the CaptureTracking.h short description but I'd like a clearer > grasp. > > Thanks, > > Ryan > > > > On Thu, Jan 14, 2021 at 2:03 PM Ryan Taylor <ryta1203 at gmail.com> wrote: > >> Ok, thanks. I'll try to come up with a more generic test case. >> >> On Thu, Jan 14, 2021 at 2:00 PM Johannes Doerfert < >> johannesdoerfert at gmail.com> wrote: >> >>> Nothing comes to mind given the information. >>> >>> ~ Johannes >>> >>> >>> On 1/14/21 12:56 PM, Ryan Taylor wrote: >>>> argmemonly, nounwind, writeonly >>>> >>>> -fno-strict-aliasing did not help. >>>> >>>> On Thu, Jan 14, 2021 at 1:29 PM Johannes Doerfert < >>>> johannesdoerfert at gmail.com> wrote: >>>> >>>>> Can you share the attributes of the intrinsic declaration, >>>>> assuming removing `!tbaa` didn't help. >>>>> >>>>> >>>>> On 1/14/21 11:43 AM, Ryan Taylor wrote: >>>>>> Yes, this is for downstream/out of tree target so I'm not sure how you >>>>>> could reproduce it either but I thought the IR might help a bit. >>>>>> >>>>>> My guess is it's not a bug in GVN as much as an issue with the >>> intrinsic >>>>>> properties, or lack thereof. I put this in the first post but the >>> Alias >>>>>> Sets show: >>>>>> >>>>>> AliasSet[0x75fbd0, 9] may alias, Mod/Ref Pointers: (i8* %1, 8), >>> (i16* >>>>>> %arrayidx, 2), (i16* %arrayidx1, 2), (i16* %arrayidx5, 2), (i16* %19, >>> 2), >>>>>> (i16* %arrayidx6, 2), (i16* %arrayidx14, 2), (i16* %arrayidx19, 2) >>>>>> 4 Unknown instructions: void <badref>, <4 x i16>* %6, <4 x i16>* >>>>> %22, >>>>>> void <badref> >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> On Thu, Jan 14, 2021 at 12:37 PM Johannes Doerfert < >>>>>> johannesdoerfert at gmail.com> wrote: >>>>>> >>>>>>> There is still not enough information here. >>>>>>> >>>>>>> My first guess. The `!tbaa` annotation on the `XXX.intrinsic` and the >>>>>>> `load` >>>>>>> basically encode there is no alias. Easy to verify, remove the ones >>> on >>>>> the >>>>>>> intrinsic. >>>>>>> >>>>>>> ~ Johannes >>>>>>> >>>>>>> >>>>>>> P.S. If this was a bug in GVN, and I assume it is not, a reproducer >>>>>>> would help >>>>>>> a lot. So a small IR sample that shows the problem and which >>> we >>>>>>> can run. >>>>>>> This is a "redacted?" IR fragment in which I don't know what >>>>>>> transformation >>>>>>> is problematic. I also can not run it through GVN, which >>> makes it >>>>>>> impossible >>>>>>> to reproduce. >>>>>>> >>>>>>> >>>>>>> On 1/14/21 11:27 AM, Ryan Taylor via llvm-dev wrote: >>>>>>>> This is right before GVN: >>>>>>>> >>>>>>>> define i32 @foo(<4 x i16> %p, <4 x i16> %p1, i16* nocapture %res) >>>>>>>> local_unnamed_addr #0 !dbg !6 { >>>>>>>> entry: >>>>>>>> %temp = alloca i64, align 8 >>>>>>>> %tmpcast = bitcast i64* %temp to [4 x i16]* >>>>>>>> %0 = bitcast i64* %temp to i8*, !dbg !8 >>>>>>>> call void @llvm.lifetime.start.p0i8(i64 8, i8* nonnull %0) #3, >>>>> !dbg !8 >>>>>>>> store i64 0, i64* %temp, align 8, !dbg !9 >>>>>>>> %1 = bitcast i64* %temp to <4 x i16>*, !dbg !10 >>>>>>>> %2 = call <4 x i16>* @llvm.XXX.intrinsic(<4 x i16>* nonnull >>> %1, <4 >>>>> x >>>>>>> i16> >>>>>>>> %p, i32 0), !dbg !11, !tbaa !12 >>>>>>>> %arrayidx = bitcast i64* %temp to i16*, !dbg !16 >>>>>>>> %3 = load i16, i16* %arrayidx, align 8, !dbg !16, !tbaa !17 >>>>>>>> br label %for.body, !dbg !19 >>>>>>>> >>>>>>>> for.body: ; preds = %entry >>>>>>>> %arrayidx1 = getelementptr inbounds [4 x i16], [4 x i16]* >>> %tmpcast, >>>>>>> i32 >>>>>>>> 0, i32 1, !dbg !20 >>>>>>>> %4 = load i16, i16* %arrayidx1, align 2, !dbg !20, !tbaa !17 >>>>>>>> %cmp3 = icmp sgt i16 %3, %4, !dbg !21 >>>>>>>> %spec.select = select i1 %cmp3, i16 %4, i16 %3, !dbg !22 >>>>>>>> %arrayidx1.1 = getelementptr inbounds [4 x i16], [4 x i16]* >>>>> %tmpcast, >>>>>>> i32 >>>>>>>> 0, i32 2, !dbg !20 >>>>>>>> %5 = load i16, i16* %arrayidx1.1, align 2, !dbg !20, !tbaa !17 >>>>>>>> %cmp3.1 = icmp sgt i16 %spec.select, %5, !dbg !21 >>>>>>>> %spec.select.1 = select i1 %cmp3.1, i16 %5, i16 %spec.select, >>> !dbg >>>>> !22 >>>>>>>> %arrayidx1.2 = getelementptr inbounds [4 x i16], [4 x i16]* >>>>> %tmpcast, >>>>>>> i32 >>>>>>>> 0, i32 3, !dbg !20 >>>>>>>> %6 = load i16, i16* %arrayidx1.2, align 2, !dbg !20, !tbaa !17 >>>>>>>> %cmp3.2 = icmp sgt i16 %spec.select.1, %6, !dbg !21 >>>>>>>> %spec.select.2 = select i1 %cmp3.2, i16 %6, i16 %spec.select.1, >>>>> !dbg >>>>>>> !22 >>>>>>>> store i16 %spec.select.2, i16* %res, align 2, !dbg !23, !tbaa >>> !17 >>>>>>>> %7 = tail call <4 x i16>* @llvm.XXX.intrinsic(<4 x i16>* %2, >>> <4 x >>>>> i16> >>>>>>>> %p1, i32 0), !dbg !24, !tbaa !12 >>>>>>>> %8 = load i16, i16* %arrayidx, align 8, !dbg !25, !tbaa !17 >>>>>>>> br label %for.body12, !dbg !26 >>>>>>>> >>>>>>>> for.body12: ; preds >>> %for.body >>>>>>>> %arrayidx14 = getelementptr inbounds [4 x i16], [4 x i16]* >>>>> %tmpcast, >>>>>>> i32 >>>>>>>> 0, i32 1, !dbg !27 >>>>>>>> %9 = load i16, i16* %arrayidx14, align 2, !dbg !27, !tbaa !17 >>>>>>>> %cmp16 = icmp sgt i16 %8, %9, !dbg !28 >>>>>>>> %spec.select39 = select i1 %cmp16, i16 %9, i16 %8, !dbg !29 >>>>>>>> %arrayidx14.1 = getelementptr inbounds [4 x i16], [4 x i16]* >>>>> %tmpcast, >>>>>>>> i32 0, i32 2, !dbg !27 >>>>>>>> %10 = load i16, i16* %arrayidx14.1, align 2, !dbg !27, !tbaa >>> !17 >>>>>>>> %cmp16.1 = icmp sgt i16 %spec.select39, %10, !dbg !28 >>>>>>>> %spec.select39.1 = select i1 %cmp16.1, i16 %10, i16 >>> %spec.select39, >>>>>>> !dbg >>>>>>>> !29 >>>>>>>> %arrayidx14.2 = getelementptr inbounds [4 x i16], [4 x i16]* >>>>> %tmpcast, >>>>>>>> i32 0, i32 3, !dbg !27 >>>>>>>> %11 = load i16, i16* %arrayidx14.2, align 2, !dbg !27, !tbaa >>> !17 >>>>>>>> %cmp16.2 = icmp sgt i16 %spec.select39.1, %11, !dbg !28 >>>>>>>> %spec.select39.2 = select i1 %cmp16.2, i16 %11, i16 >>>>> %spec.select39.1, >>>>>>>> !dbg !29 >>>>>>>> %conv24 = sext i16 %spec.select39.2 to i32, !dbg !30 >>>>>>>> call void @llvm.lifetime.end.p0i8(i64 8, i8* nonnull %0) #3, >>> !dbg >>>>> !31 >>>>>>>> ret i32 %conv24, !dbg !32 >>>>>>>> >>>>>>>> On Thu, Jan 14, 2021 at 11:54 AM Roman Lebedev < >>> lebedev.ri at gmail.com> >>>>>>> wrote: >>>>>>>>> It would be good to have an actual IR reproducer here. >>>>>>>>> >>>>>>>>> On Thu, Jan 14, 2021 at 7:51 PM Ryan Taylor via llvm-dev >>>>>>>>> <llvm-dev at lists.llvm.org> wrote: >>>>>>>>>> So given an intrinsic that has a pointer as in/out and >>> IntrWriteMem >>>>>>>>> property. >>>>>>>>>> call intrinsic(address a, ....); >>>>>>>>>> loop over address a >>>>>>>>>> load from address a + offset >>>>>>>>>> call intrinsic (address a, ...); >>>>>>>>>> loop over address a >>>>>>>>>> load from address a + offset >>>>>>>>>> >>>>>>>>>> GVN is removing the second loads, despite the second call >>> overwriting >>>>>>>>> the memory starting at address a. AA has the intrinsics marked as >>>>>>> unknown >>>>>>>>> instructions but has all of these as mayAlias in a set. I'm not >>> seeing >>>>>>> this >>>>>>>>> issue with -fno-unroll-loops. >>>>>>>>>> Thanks. >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> _______________________________________________ >>>>>>>>>> LLVM Developers mailing list >>>>>>>>>> llvm-dev at lists.llvm.org >>>>>>>>>> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev >>>>>>>> _______________________________________________ >>>>>>>> LLVM Developers mailing list >>>>>>>> llvm-dev at lists.llvm.org >>>>>>>> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev