Johannes Doerfert via llvm-dev
2021-Jan-14 19:00 UTC
[llvm-dev] GVN removing loads that are affected by call
Nothing comes to mind given the information. ~ Johannes On 1/14/21 12:56 PM, Ryan Taylor wrote:> argmemonly, nounwind, writeonly > > -fno-strict-aliasing did not help. > > On Thu, Jan 14, 2021 at 1:29 PM Johannes Doerfert < > johannesdoerfert at gmail.com> wrote: > >> Can you share the attributes of the intrinsic declaration, >> assuming removing `!tbaa` didn't help. >> >> >> On 1/14/21 11:43 AM, Ryan Taylor wrote: >>> Yes, this is for downstream/out of tree target so I'm not sure how you >>> could reproduce it either but I thought the IR might help a bit. >>> >>> My guess is it's not a bug in GVN as much as an issue with the intrinsic >>> properties, or lack thereof. I put this in the first post but the Alias >>> Sets show: >>> >>> AliasSet[0x75fbd0, 9] may alias, Mod/Ref Pointers: (i8* %1, 8), (i16* >>> %arrayidx, 2), (i16* %arrayidx1, 2), (i16* %arrayidx5, 2), (i16* %19, 2), >>> (i16* %arrayidx6, 2), (i16* %arrayidx14, 2), (i16* %arrayidx19, 2) >>> 4 Unknown instructions: void <badref>, <4 x i16>* %6, <4 x i16>* >> %22, >>> void <badref> >>> >>> >>> >>> >>> >>> On Thu, Jan 14, 2021 at 12:37 PM Johannes Doerfert < >>> johannesdoerfert at gmail.com> wrote: >>> >>>> There is still not enough information here. >>>> >>>> My first guess. The `!tbaa` annotation on the `XXX.intrinsic` and the >>>> `load` >>>> basically encode there is no alias. Easy to verify, remove the ones on >> the >>>> intrinsic. >>>> >>>> ~ Johannes >>>> >>>> >>>> P.S. If this was a bug in GVN, and I assume it is not, a reproducer >>>> would help >>>> a lot. So a small IR sample that shows the problem and which we >>>> can run. >>>> This is a "redacted?" IR fragment in which I don't know what >>>> transformation >>>> is problematic. I also can not run it through GVN, which makes it >>>> impossible >>>> to reproduce. >>>> >>>> >>>> On 1/14/21 11:27 AM, Ryan Taylor via llvm-dev wrote: >>>>> This is right before GVN: >>>>> >>>>> define i32 @foo(<4 x i16> %p, <4 x i16> %p1, i16* nocapture %res) >>>>> local_unnamed_addr #0 !dbg !6 { >>>>> entry: >>>>> %temp = alloca i64, align 8 >>>>> %tmpcast = bitcast i64* %temp to [4 x i16]* >>>>> %0 = bitcast i64* %temp to i8*, !dbg !8 >>>>> call void @llvm.lifetime.start.p0i8(i64 8, i8* nonnull %0) #3, >> !dbg !8 >>>>> store i64 0, i64* %temp, align 8, !dbg !9 >>>>> %1 = bitcast i64* %temp to <4 x i16>*, !dbg !10 >>>>> %2 = call <4 x i16>* @llvm.XXX.intrinsic(<4 x i16>* nonnull %1, <4 >> x >>>> i16> >>>>> %p, i32 0), !dbg !11, !tbaa !12 >>>>> %arrayidx = bitcast i64* %temp to i16*, !dbg !16 >>>>> %3 = load i16, i16* %arrayidx, align 8, !dbg !16, !tbaa !17 >>>>> br label %for.body, !dbg !19 >>>>> >>>>> for.body: ; preds = %entry >>>>> %arrayidx1 = getelementptr inbounds [4 x i16], [4 x i16]* %tmpcast, >>>> i32 >>>>> 0, i32 1, !dbg !20 >>>>> %4 = load i16, i16* %arrayidx1, align 2, !dbg !20, !tbaa !17 >>>>> %cmp3 = icmp sgt i16 %3, %4, !dbg !21 >>>>> %spec.select = select i1 %cmp3, i16 %4, i16 %3, !dbg !22 >>>>> %arrayidx1.1 = getelementptr inbounds [4 x i16], [4 x i16]* >> %tmpcast, >>>> i32 >>>>> 0, i32 2, !dbg !20 >>>>> %5 = load i16, i16* %arrayidx1.1, align 2, !dbg !20, !tbaa !17 >>>>> %cmp3.1 = icmp sgt i16 %spec.select, %5, !dbg !21 >>>>> %spec.select.1 = select i1 %cmp3.1, i16 %5, i16 %spec.select, !dbg >> !22 >>>>> %arrayidx1.2 = getelementptr inbounds [4 x i16], [4 x i16]* >> %tmpcast, >>>> i32 >>>>> 0, i32 3, !dbg !20 >>>>> %6 = load i16, i16* %arrayidx1.2, align 2, !dbg !20, !tbaa !17 >>>>> %cmp3.2 = icmp sgt i16 %spec.select.1, %6, !dbg !21 >>>>> %spec.select.2 = select i1 %cmp3.2, i16 %6, i16 %spec.select.1, >> !dbg >>>> !22 >>>>> store i16 %spec.select.2, i16* %res, align 2, !dbg !23, !tbaa !17 >>>>> %7 = tail call <4 x i16>* @llvm.XXX.intrinsic(<4 x i16>* %2, <4 x >> i16> >>>>> %p1, i32 0), !dbg !24, !tbaa !12 >>>>> %8 = load i16, i16* %arrayidx, align 8, !dbg !25, !tbaa !17 >>>>> br label %for.body12, !dbg !26 >>>>> >>>>> for.body12: ; preds = %for.body >>>>> %arrayidx14 = getelementptr inbounds [4 x i16], [4 x i16]* >> %tmpcast, >>>> i32 >>>>> 0, i32 1, !dbg !27 >>>>> %9 = load i16, i16* %arrayidx14, align 2, !dbg !27, !tbaa !17 >>>>> %cmp16 = icmp sgt i16 %8, %9, !dbg !28 >>>>> %spec.select39 = select i1 %cmp16, i16 %9, i16 %8, !dbg !29 >>>>> %arrayidx14.1 = getelementptr inbounds [4 x i16], [4 x i16]* >> %tmpcast, >>>>> i32 0, i32 2, !dbg !27 >>>>> %10 = load i16, i16* %arrayidx14.1, align 2, !dbg !27, !tbaa !17 >>>>> %cmp16.1 = icmp sgt i16 %spec.select39, %10, !dbg !28 >>>>> %spec.select39.1 = select i1 %cmp16.1, i16 %10, i16 %spec.select39, >>>> !dbg >>>>> !29 >>>>> %arrayidx14.2 = getelementptr inbounds [4 x i16], [4 x i16]* >> %tmpcast, >>>>> i32 0, i32 3, !dbg !27 >>>>> %11 = load i16, i16* %arrayidx14.2, align 2, !dbg !27, !tbaa !17 >>>>> %cmp16.2 = icmp sgt i16 %spec.select39.1, %11, !dbg !28 >>>>> %spec.select39.2 = select i1 %cmp16.2, i16 %11, i16 >> %spec.select39.1, >>>>> !dbg !29 >>>>> %conv24 = sext i16 %spec.select39.2 to i32, !dbg !30 >>>>> call void @llvm.lifetime.end.p0i8(i64 8, i8* nonnull %0) #3, !dbg >> !31 >>>>> ret i32 %conv24, !dbg !32 >>>>> >>>>> On Thu, Jan 14, 2021 at 11:54 AM Roman Lebedev <lebedev.ri at gmail.com> >>>> wrote: >>>>>> It would be good to have an actual IR reproducer here. >>>>>> >>>>>> On Thu, Jan 14, 2021 at 7:51 PM Ryan Taylor via llvm-dev >>>>>> <llvm-dev at lists.llvm.org> wrote: >>>>>>> So given an intrinsic that has a pointer as in/out and IntrWriteMem >>>>>> property. >>>>>>> call intrinsic(address a, ....); >>>>>>> loop over address a >>>>>>> load from address a + offset >>>>>>> call intrinsic (address a, ...); >>>>>>> loop over address a >>>>>>> load from address a + offset >>>>>>> >>>>>>> GVN is removing the second loads, despite the second call overwriting >>>>>> the memory starting at address a. AA has the intrinsics marked as >>>> unknown >>>>>> instructions but has all of these as mayAlias in a set. I'm not seeing >>>> this >>>>>> issue with -fno-unroll-loops. >>>>>>> Thanks. >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> _______________________________________________ >>>>>>> LLVM Developers mailing list >>>>>>> llvm-dev at lists.llvm.org >>>>>>> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev >>>>> _______________________________________________ >>>>> LLVM Developers mailing list >>>>> llvm-dev at lists.llvm.org >>>>> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
Ryan Taylor via llvm-dev
2021-Jan-14 19:03 UTC
[llvm-dev] GVN removing loads that are affected by call
Ok, thanks. I'll try to come up with a more generic test case. On Thu, Jan 14, 2021 at 2:00 PM Johannes Doerfert < johannesdoerfert at gmail.com> wrote:> Nothing comes to mind given the information. > > ~ Johannes > > > On 1/14/21 12:56 PM, Ryan Taylor wrote: > > argmemonly, nounwind, writeonly > > > > -fno-strict-aliasing did not help. > > > > On Thu, Jan 14, 2021 at 1:29 PM Johannes Doerfert < > > johannesdoerfert at gmail.com> wrote: > > > >> Can you share the attributes of the intrinsic declaration, > >> assuming removing `!tbaa` didn't help. > >> > >> > >> On 1/14/21 11:43 AM, Ryan Taylor wrote: > >>> Yes, this is for downstream/out of tree target so I'm not sure how you > >>> could reproduce it either but I thought the IR might help a bit. > >>> > >>> My guess is it's not a bug in GVN as much as an issue with the > intrinsic > >>> properties, or lack thereof. I put this in the first post but the Alias > >>> Sets show: > >>> > >>> AliasSet[0x75fbd0, 9] may alias, Mod/Ref Pointers: (i8* %1, 8), (i16* > >>> %arrayidx, 2), (i16* %arrayidx1, 2), (i16* %arrayidx5, 2), (i16* %19, > 2), > >>> (i16* %arrayidx6, 2), (i16* %arrayidx14, 2), (i16* %arrayidx19, 2) > >>> 4 Unknown instructions: void <badref>, <4 x i16>* %6, <4 x i16>* > >> %22, > >>> void <badref> > >>> > >>> > >>> > >>> > >>> > >>> On Thu, Jan 14, 2021 at 12:37 PM Johannes Doerfert < > >>> johannesdoerfert at gmail.com> wrote: > >>> > >>>> There is still not enough information here. > >>>> > >>>> My first guess. The `!tbaa` annotation on the `XXX.intrinsic` and the > >>>> `load` > >>>> basically encode there is no alias. Easy to verify, remove the ones on > >> the > >>>> intrinsic. > >>>> > >>>> ~ Johannes > >>>> > >>>> > >>>> P.S. If this was a bug in GVN, and I assume it is not, a reproducer > >>>> would help > >>>> a lot. So a small IR sample that shows the problem and which > we > >>>> can run. > >>>> This is a "redacted?" IR fragment in which I don't know what > >>>> transformation > >>>> is problematic. I also can not run it through GVN, which > makes it > >>>> impossible > >>>> to reproduce. > >>>> > >>>> > >>>> On 1/14/21 11:27 AM, Ryan Taylor via llvm-dev wrote: > >>>>> This is right before GVN: > >>>>> > >>>>> define i32 @foo(<4 x i16> %p, <4 x i16> %p1, i16* nocapture %res) > >>>>> local_unnamed_addr #0 !dbg !6 { > >>>>> entry: > >>>>> %temp = alloca i64, align 8 > >>>>> %tmpcast = bitcast i64* %temp to [4 x i16]* > >>>>> %0 = bitcast i64* %temp to i8*, !dbg !8 > >>>>> call void @llvm.lifetime.start.p0i8(i64 8, i8* nonnull %0) #3, > >> !dbg !8 > >>>>> store i64 0, i64* %temp, align 8, !dbg !9 > >>>>> %1 = bitcast i64* %temp to <4 x i16>*, !dbg !10 > >>>>> %2 = call <4 x i16>* @llvm.XXX.intrinsic(<4 x i16>* nonnull %1, > <4 > >> x > >>>> i16> > >>>>> %p, i32 0), !dbg !11, !tbaa !12 > >>>>> %arrayidx = bitcast i64* %temp to i16*, !dbg !16 > >>>>> %3 = load i16, i16* %arrayidx, align 8, !dbg !16, !tbaa !17 > >>>>> br label %for.body, !dbg !19 > >>>>> > >>>>> for.body: ; preds = %entry > >>>>> %arrayidx1 = getelementptr inbounds [4 x i16], [4 x i16]* > %tmpcast, > >>>> i32 > >>>>> 0, i32 1, !dbg !20 > >>>>> %4 = load i16, i16* %arrayidx1, align 2, !dbg !20, !tbaa !17 > >>>>> %cmp3 = icmp sgt i16 %3, %4, !dbg !21 > >>>>> %spec.select = select i1 %cmp3, i16 %4, i16 %3, !dbg !22 > >>>>> %arrayidx1.1 = getelementptr inbounds [4 x i16], [4 x i16]* > >> %tmpcast, > >>>> i32 > >>>>> 0, i32 2, !dbg !20 > >>>>> %5 = load i16, i16* %arrayidx1.1, align 2, !dbg !20, !tbaa !17 > >>>>> %cmp3.1 = icmp sgt i16 %spec.select, %5, !dbg !21 > >>>>> %spec.select.1 = select i1 %cmp3.1, i16 %5, i16 %spec.select, > !dbg > >> !22 > >>>>> %arrayidx1.2 = getelementptr inbounds [4 x i16], [4 x i16]* > >> %tmpcast, > >>>> i32 > >>>>> 0, i32 3, !dbg !20 > >>>>> %6 = load i16, i16* %arrayidx1.2, align 2, !dbg !20, !tbaa !17 > >>>>> %cmp3.2 = icmp sgt i16 %spec.select.1, %6, !dbg !21 > >>>>> %spec.select.2 = select i1 %cmp3.2, i16 %6, i16 %spec.select.1, > >> !dbg > >>>> !22 > >>>>> store i16 %spec.select.2, i16* %res, align 2, !dbg !23, !tbaa > !17 > >>>>> %7 = tail call <4 x i16>* @llvm.XXX.intrinsic(<4 x i16>* %2, <4 > x > >> i16> > >>>>> %p1, i32 0), !dbg !24, !tbaa !12 > >>>>> %8 = load i16, i16* %arrayidx, align 8, !dbg !25, !tbaa !17 > >>>>> br label %for.body12, !dbg !26 > >>>>> > >>>>> for.body12: ; preds = %for.body > >>>>> %arrayidx14 = getelementptr inbounds [4 x i16], [4 x i16]* > >> %tmpcast, > >>>> i32 > >>>>> 0, i32 1, !dbg !27 > >>>>> %9 = load i16, i16* %arrayidx14, align 2, !dbg !27, !tbaa !17 > >>>>> %cmp16 = icmp sgt i16 %8, %9, !dbg !28 > >>>>> %spec.select39 = select i1 %cmp16, i16 %9, i16 %8, !dbg !29 > >>>>> %arrayidx14.1 = getelementptr inbounds [4 x i16], [4 x i16]* > >> %tmpcast, > >>>>> i32 0, i32 2, !dbg !27 > >>>>> %10 = load i16, i16* %arrayidx14.1, align 2, !dbg !27, !tbaa !17 > >>>>> %cmp16.1 = icmp sgt i16 %spec.select39, %10, !dbg !28 > >>>>> %spec.select39.1 = select i1 %cmp16.1, i16 %10, i16 > %spec.select39, > >>>> !dbg > >>>>> !29 > >>>>> %arrayidx14.2 = getelementptr inbounds [4 x i16], [4 x i16]* > >> %tmpcast, > >>>>> i32 0, i32 3, !dbg !27 > >>>>> %11 = load i16, i16* %arrayidx14.2, align 2, !dbg !27, !tbaa !17 > >>>>> %cmp16.2 = icmp sgt i16 %spec.select39.1, %11, !dbg !28 > >>>>> %spec.select39.2 = select i1 %cmp16.2, i16 %11, i16 > >> %spec.select39.1, > >>>>> !dbg !29 > >>>>> %conv24 = sext i16 %spec.select39.2 to i32, !dbg !30 > >>>>> call void @llvm.lifetime.end.p0i8(i64 8, i8* nonnull %0) #3, > !dbg > >> !31 > >>>>> ret i32 %conv24, !dbg !32 > >>>>> > >>>>> On Thu, Jan 14, 2021 at 11:54 AM Roman Lebedev <lebedev.ri at gmail.com > > > >>>> wrote: > >>>>>> It would be good to have an actual IR reproducer here. > >>>>>> > >>>>>> On Thu, Jan 14, 2021 at 7:51 PM Ryan Taylor via llvm-dev > >>>>>> <llvm-dev at lists.llvm.org> wrote: > >>>>>>> So given an intrinsic that has a pointer as in/out and IntrWriteMem > >>>>>> property. > >>>>>>> call intrinsic(address a, ....); > >>>>>>> loop over address a > >>>>>>> load from address a + offset > >>>>>>> call intrinsic (address a, ...); > >>>>>>> loop over address a > >>>>>>> load from address a + offset > >>>>>>> > >>>>>>> GVN is removing the second loads, despite the second call > overwriting > >>>>>> the memory starting at address a. AA has the intrinsics marked as > >>>> unknown > >>>>>> instructions but has all of these as mayAlias in a set. I'm not > seeing > >>>> this > >>>>>> issue with -fno-unroll-loops. > >>>>>>> Thanks. > >>>>>>> > >>>>>>> > >>>>>>> > >>>>>>> > >>>>>>> _______________________________________________ > >>>>>>> LLVM Developers mailing list > >>>>>>> llvm-dev at lists.llvm.org > >>>>>>> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev > >>>>> _______________________________________________ > >>>>> LLVM Developers mailing list > >>>>> llvm-dev at lists.llvm.org > >>>>> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20210114/f700149c/attachment.html>