Johannes Doerfert via llvm-dev
2021-Jan-18 23:13 UTC
[llvm-dev] GVN removing loads that are affected by call
Your example/trouble starts with an "identified function local" (for short, IFL) pointer (see llvm::isIdentifiedFunctionLocal), i.a., as it fallsĀ out of `malloc`, `alloca`, or a `__restrict` qualified argument. Since no access on a pointer not derived from the IFL one can cause the underlying memory to be altered, we can ignore other accesses *iff* we can show they don't use a pointer that is "based-on" the IFL one. The easiest way to prove it is not based on the IFL one is to show the IFL pointer does not escape since then every based-on use can be identified by def-use chains. A call argument that is nocapture guarantees no copy of the pointer is made that outlives the call. If you return the pointer, a copy is made and you broke the contract. We consequently "ignore" accesses through that pointer when we argue about the IFL one as they cannot alias. Since we often return a pointer argument, the Attributor uses "no-capture-maybe-returned" which we recently discussed to propose as a new enum attribute. You could then mark the argument as such and capture + use traversals could continue with the intrinsic value to determine all accesses to the IFL. I hope this makes it somewhat clearer. WRT. memcpy, it does not copy the pointers you put it but the memory they point to. The address of the memory did not escape/was not captured. ~ Johannes On 1/18/21 4:55 PM, Ryan Taylor wrote:> I'm still a bit fuzzy on the definition but that doesn't seem to completely > agree with pointer capture? > > " A pointer value *is captured* if the function makes a copy of any part of > the pointer that outlives the call. " > > I can replace memcpy in our example and see no issues. memcpy is no > capture. > > #define SIZE 5 > #include<string> > int foo(short a[SIZE], short b[SIZE], short *res) { > short temp[SIZE] = {0, 0, 0, 0, 0}; > short *ptr; > ptr = temp; > memcpy(ptr, b, SIZE*2); > short t = temp[0]; > for (int i = 1; i < 4; i++) { > if (t > temp[i]) t = temp[i]; > } > *res = t; > memcpy(ptr, b, SIZE*2); > t = temp[0]; > for (int i = 1; i < 4; i++) { > if (t > temp[i]) t = temp[i]; > } > return t; > } > > So you're saying that if the memcpy intrinsic, in this case, returned a > pointer (even if not directly used, ie "ptr" in this case) and had > NoCapture attribute, the second set of loads would be removed? > > -Ryan > > On Mon, Jan 18, 2021 at 4:30 PM Johannes Doerfert < > johannesdoerfert at gmail.com> wrote: > >> A function that returns a pointer argument "captures" it for now. >> >> The Attributor uses "nocapute_maybe_returned" internally but it is not >> an enum attribute (yet). >> >> >> On 1/15/21 7:32 PM, Ryan Taylor wrote: >>> So this seems to be an issue with the intrinsic setting NoCapture<0> (it >>> returns the pointer it takes). We have a similar intrinsic that does the >>> exact same thing but doesn't return the pointer. It also has NoCapture<1> >>> and doesn't cause this issue. Also, I managed to substitute memcpy and >>> there were no issues, it also has nocapture. >>> >>> Are there some papers/citations which talk more about Pointer >>> Capture/Capture Tracking? >>> >>> I saw this blog: >>> https://jonasdevlieghere.com/escape-analysis-capture-tracking-in-llvm/ >>> And I read the CaptureTracking.h short description but I'd like a clearer >>> grasp. >>> >>> Thanks, >>> >>> Ryan >>> >>> >>> >>> On Thu, Jan 14, 2021 at 2:03 PM Ryan Taylor <ryta1203 at gmail.com> wrote: >>> >>>> Ok, thanks. I'll try to come up with a more generic test case. >>>> >>>> On Thu, Jan 14, 2021 at 2:00 PM Johannes Doerfert < >>>> johannesdoerfert at gmail.com> wrote: >>>> >>>>> Nothing comes to mind given the information. >>>>> >>>>> ~ Johannes >>>>> >>>>> >>>>> On 1/14/21 12:56 PM, Ryan Taylor wrote: >>>>>> argmemonly, nounwind, writeonly >>>>>> >>>>>> -fno-strict-aliasing did not help. >>>>>> >>>>>> On Thu, Jan 14, 2021 at 1:29 PM Johannes Doerfert < >>>>>> johannesdoerfert at gmail.com> wrote: >>>>>> >>>>>>> Can you share the attributes of the intrinsic declaration, >>>>>>> assuming removing `!tbaa` didn't help. >>>>>>> >>>>>>> >>>>>>> On 1/14/21 11:43 AM, Ryan Taylor wrote: >>>>>>>> Yes, this is for downstream/out of tree target so I'm not sure how >> you >>>>>>>> could reproduce it either but I thought the IR might help a bit. >>>>>>>> >>>>>>>> My guess is it's not a bug in GVN as much as an issue with the >>>>> intrinsic >>>>>>>> properties, or lack thereof. I put this in the first post but the >>>>> Alias >>>>>>>> Sets show: >>>>>>>> >>>>>>>> AliasSet[0x75fbd0, 9] may alias, Mod/Ref Pointers: (i8* %1, 8), >>>>> (i16* >>>>>>>> %arrayidx, 2), (i16* %arrayidx1, 2), (i16* %arrayidx5, 2), (i16* >> %19, >>>>> 2), >>>>>>>> (i16* %arrayidx6, 2), (i16* %arrayidx14, 2), (i16* %arrayidx19, 2) >>>>>>>> 4 Unknown instructions: void <badref>, <4 x i16>* %6, <4 x >> i16>* >>>>>>> %22, >>>>>>>> void <badref> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> On Thu, Jan 14, 2021 at 12:37 PM Johannes Doerfert < >>>>>>>> johannesdoerfert at gmail.com> wrote: >>>>>>>> >>>>>>>>> There is still not enough information here. >>>>>>>>> >>>>>>>>> My first guess. The `!tbaa` annotation on the `XXX.intrinsic` and >> the >>>>>>>>> `load` >>>>>>>>> basically encode there is no alias. Easy to verify, remove the ones >>>>> on >>>>>>> the >>>>>>>>> intrinsic. >>>>>>>>> >>>>>>>>> ~ Johannes >>>>>>>>> >>>>>>>>> >>>>>>>>> P.S. If this was a bug in GVN, and I assume it is not, a reproducer >>>>>>>>> would help >>>>>>>>> a lot. So a small IR sample that shows the problem and >> which >>>>> we >>>>>>>>> can run. >>>>>>>>> This is a "redacted?" IR fragment in which I don't know >> what >>>>>>>>> transformation >>>>>>>>> is problematic. I also can not run it through GVN, which >>>>> makes it >>>>>>>>> impossible >>>>>>>>> to reproduce. >>>>>>>>> >>>>>>>>> >>>>>>>>> On 1/14/21 11:27 AM, Ryan Taylor via llvm-dev wrote: >>>>>>>>>> This is right before GVN: >>>>>>>>>> >>>>>>>>>> define i32 @foo(<4 x i16> %p, <4 x i16> %p1, i16* nocapture %res) >>>>>>>>>> local_unnamed_addr #0 !dbg !6 { >>>>>>>>>> entry: >>>>>>>>>> %temp = alloca i64, align 8 >>>>>>>>>> %tmpcast = bitcast i64* %temp to [4 x i16]* >>>>>>>>>> %0 = bitcast i64* %temp to i8*, !dbg !8 >>>>>>>>>> call void @llvm.lifetime.start.p0i8(i64 8, i8* nonnull %0) >> #3, >>>>>>> !dbg !8 >>>>>>>>>> store i64 0, i64* %temp, align 8, !dbg !9 >>>>>>>>>> %1 = bitcast i64* %temp to <4 x i16>*, !dbg !10 >>>>>>>>>> %2 = call <4 x i16>* @llvm.XXX.intrinsic(<4 x i16>* nonnull >>>>> %1, <4 >>>>>>> x >>>>>>>>> i16> >>>>>>>>>> %p, i32 0), !dbg !11, !tbaa !12 >>>>>>>>>> %arrayidx = bitcast i64* %temp to i16*, !dbg !16 >>>>>>>>>> %3 = load i16, i16* %arrayidx, align 8, !dbg !16, !tbaa !17 >>>>>>>>>> br label %for.body, !dbg !19 >>>>>>>>>> >>>>>>>>>> for.body: ; preds = %entry >>>>>>>>>> %arrayidx1 = getelementptr inbounds [4 x i16], [4 x i16]* >>>>> %tmpcast, >>>>>>>>> i32 >>>>>>>>>> 0, i32 1, !dbg !20 >>>>>>>>>> %4 = load i16, i16* %arrayidx1, align 2, !dbg !20, !tbaa !17 >>>>>>>>>> %cmp3 = icmp sgt i16 %3, %4, !dbg !21 >>>>>>>>>> %spec.select = select i1 %cmp3, i16 %4, i16 %3, !dbg !22 >>>>>>>>>> %arrayidx1.1 = getelementptr inbounds [4 x i16], [4 x i16]* >>>>>>> %tmpcast, >>>>>>>>> i32 >>>>>>>>>> 0, i32 2, !dbg !20 >>>>>>>>>> %5 = load i16, i16* %arrayidx1.1, align 2, !dbg !20, !tbaa >> !17 >>>>>>>>>> %cmp3.1 = icmp sgt i16 %spec.select, %5, !dbg !21 >>>>>>>>>> %spec.select.1 = select i1 %cmp3.1, i16 %5, i16 >> %spec.select, >>>>> !dbg >>>>>>> !22 >>>>>>>>>> %arrayidx1.2 = getelementptr inbounds [4 x i16], [4 x i16]* >>>>>>> %tmpcast, >>>>>>>>> i32 >>>>>>>>>> 0, i32 3, !dbg !20 >>>>>>>>>> %6 = load i16, i16* %arrayidx1.2, align 2, !dbg !20, !tbaa >> !17 >>>>>>>>>> %cmp3.2 = icmp sgt i16 %spec.select.1, %6, !dbg !21 >>>>>>>>>> %spec.select.2 = select i1 %cmp3.2, i16 %6, i16 >> %spec.select.1, >>>>>>> !dbg >>>>>>>>> !22 >>>>>>>>>> store i16 %spec.select.2, i16* %res, align 2, !dbg !23, >> !tbaa >>>>> !17 >>>>>>>>>> %7 = tail call <4 x i16>* @llvm.XXX.intrinsic(<4 x i16>* %2, >>>>> <4 x >>>>>>> i16> >>>>>>>>>> %p1, i32 0), !dbg !24, !tbaa !12 >>>>>>>>>> %8 = load i16, i16* %arrayidx, align 8, !dbg !25, !tbaa !17 >>>>>>>>>> br label %for.body12, !dbg !26 >>>>>>>>>> >>>>>>>>>> for.body12: ; preds >>>>> %for.body >>>>>>>>>> %arrayidx14 = getelementptr inbounds [4 x i16], [4 x i16]* >>>>>>> %tmpcast, >>>>>>>>> i32 >>>>>>>>>> 0, i32 1, !dbg !27 >>>>>>>>>> %9 = load i16, i16* %arrayidx14, align 2, !dbg !27, !tbaa >> !17 >>>>>>>>>> %cmp16 = icmp sgt i16 %8, %9, !dbg !28 >>>>>>>>>> %spec.select39 = select i1 %cmp16, i16 %9, i16 %8, !dbg !29 >>>>>>>>>> %arrayidx14.1 = getelementptr inbounds [4 x i16], [4 x i16]* >>>>>>> %tmpcast, >>>>>>>>>> i32 0, i32 2, !dbg !27 >>>>>>>>>> %10 = load i16, i16* %arrayidx14.1, align 2, !dbg !27, !tbaa >>>>> !17 >>>>>>>>>> %cmp16.1 = icmp sgt i16 %spec.select39, %10, !dbg !28 >>>>>>>>>> %spec.select39.1 = select i1 %cmp16.1, i16 %10, i16 >>>>> %spec.select39, >>>>>>>>> !dbg >>>>>>>>>> !29 >>>>>>>>>> %arrayidx14.2 = getelementptr inbounds [4 x i16], [4 x i16]* >>>>>>> %tmpcast, >>>>>>>>>> i32 0, i32 3, !dbg !27 >>>>>>>>>> %11 = load i16, i16* %arrayidx14.2, align 2, !dbg !27, !tbaa >>>>> !17 >>>>>>>>>> %cmp16.2 = icmp sgt i16 %spec.select39.1, %11, !dbg !28 >>>>>>>>>> %spec.select39.2 = select i1 %cmp16.2, i16 %11, i16 >>>>>>> %spec.select39.1, >>>>>>>>>> !dbg !29 >>>>>>>>>> %conv24 = sext i16 %spec.select39.2 to i32, !dbg !30 >>>>>>>>>> call void @llvm.lifetime.end.p0i8(i64 8, i8* nonnull %0) #3, >>>>> !dbg >>>>>>> !31 >>>>>>>>>> ret i32 %conv24, !dbg !32 >>>>>>>>>> >>>>>>>>>> On Thu, Jan 14, 2021 at 11:54 AM Roman Lebedev < >>>>> lebedev.ri at gmail.com> >>>>>>>>> wrote: >>>>>>>>>>> It would be good to have an actual IR reproducer here. >>>>>>>>>>> >>>>>>>>>>> On Thu, Jan 14, 2021 at 7:51 PM Ryan Taylor via llvm-dev >>>>>>>>>>> <llvm-dev at lists.llvm.org> wrote: >>>>>>>>>>>> So given an intrinsic that has a pointer as in/out and >>>>> IntrWriteMem >>>>>>>>>>> property. >>>>>>>>>>>> call intrinsic(address a, ....); >>>>>>>>>>>> loop over address a >>>>>>>>>>>> load from address a + offset >>>>>>>>>>>> call intrinsic (address a, ...); >>>>>>>>>>>> loop over address a >>>>>>>>>>>> load from address a + offset >>>>>>>>>>>> >>>>>>>>>>>> GVN is removing the second loads, despite the second call >>>>> overwriting >>>>>>>>>>> the memory starting at address a. AA has the intrinsics marked as >>>>>>>>> unknown >>>>>>>>>>> instructions but has all of these as mayAlias in a set. I'm not >>>>> seeing >>>>>>>>> this >>>>>>>>>>> issue with -fno-unroll-loops. >>>>>>>>>>>> Thanks. >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> _______________________________________________ >>>>>>>>>>>> LLVM Developers mailing list >>>>>>>>>>>> llvm-dev at lists.llvm.org >>>>>>>>>>>> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev >>>>>>>>>> _______________________________________________ >>>>>>>>>> LLVM Developers mailing list >>>>>>>>>> llvm-dev at lists.llvm.org >>>>>>>>>> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
Ryan Taylor via llvm-dev
2021-Jan-18 23:38 UTC
[llvm-dev] GVN removing loads that are affected by call
Johannes, Thanks, this makes it a bit more clear, yes. I was a bit confused if it was copying the pointer or the data being pointed to. It looks like you are proposing the new attribute to fill this hole that the compiler is not currently able to tell that this pointer, though "escaped" is not used again and is therefore functionally the same as not captured? Do you have a phab for the recently proposed no-capture-maybe-returned attribute? Thanks. On Mon, Jan 18, 2021 at 6:13 PM Johannes Doerfert < johannesdoerfert at gmail.com> wrote:> Your example/trouble starts with an "identified function local" (for > short, IFL) pointer > (see llvm::isIdentifiedFunctionLocal), i.a., as it falls out of > `malloc`, `alloca`, or > a `__restrict` qualified argument. > > Since no access on a pointer not derived from the IFL one can cause the > underlying memory to be altered, > we can ignore other accesses *iff* we can show they don't use a pointer > that is "based-on" the IFL one. > The easiest way to prove it is not based on the IFL one is to show the > IFL pointer does not escape since > then every based-on use can be identified by def-use chains. A call > argument that is nocapture guarantees > no copy of the pointer is made that outlives the call. If you return the > pointer, a copy is made and you > broke the contract. We consequently "ignore" accesses through that > pointer when we argue about the IFL one > as they cannot alias. > > Since we often return a pointer argument, the Attributor uses > "no-capture-maybe-returned" > which we recently discussed to propose as a new enum attribute. You > could then mark the argument as such > and capture + use traversals could continue with the intrinsic value to > determine all accesses to the IFL. > > I hope this makes it somewhat clearer. > > WRT. memcpy, it does not copy the pointers you put it but the memory > they point to. The address of the memory > did not escape/was not captured. > > ~ Johannes > > > On 1/18/21 4:55 PM, Ryan Taylor wrote: > > I'm still a bit fuzzy on the definition but that doesn't seem to > completely > > agree with pointer capture? > > > > " A pointer value *is captured* if the function makes a copy of any part > of > > the pointer that outlives the call. " > > > > I can replace memcpy in our example and see no issues. memcpy is no > > capture. > > > > #define SIZE 5 > > #include<string> > > int foo(short a[SIZE], short b[SIZE], short *res) { > > short temp[SIZE] = {0, 0, 0, 0, 0}; > > short *ptr; > > ptr = temp; > > memcpy(ptr, b, SIZE*2); > > short t = temp[0]; > > for (int i = 1; i < 4; i++) { > > if (t > temp[i]) t = temp[i]; > > } > > *res = t; > > memcpy(ptr, b, SIZE*2); > > t = temp[0]; > > for (int i = 1; i < 4; i++) { > > if (t > temp[i]) t = temp[i]; > > } > > return t; > > } > > > > So you're saying that if the memcpy intrinsic, in this case, returned a > > pointer (even if not directly used, ie "ptr" in this case) and had > > NoCapture attribute, the second set of loads would be removed? > > > > -Ryan > > > > On Mon, Jan 18, 2021 at 4:30 PM Johannes Doerfert < > > johannesdoerfert at gmail.com> wrote: > > > >> A function that returns a pointer argument "captures" it for now. > >> > >> The Attributor uses "nocapute_maybe_returned" internally but it is not > >> an enum attribute (yet). > >> > >> > >> On 1/15/21 7:32 PM, Ryan Taylor wrote: > >>> So this seems to be an issue with the intrinsic setting NoCapture<0> > (it > >>> returns the pointer it takes). We have a similar intrinsic that does > the > >>> exact same thing but doesn't return the pointer. It also has > NoCapture<1> > >>> and doesn't cause this issue. Also, I managed to substitute memcpy and > >>> there were no issues, it also has nocapture. > >>> > >>> Are there some papers/citations which talk more about Pointer > >>> Capture/Capture Tracking? > >>> > >>> I saw this blog: > >>> https://jonasdevlieghere.com/escape-analysis-capture-tracking-in-llvm/ > >>> And I read the CaptureTracking.h short description but I'd like a > clearer > >>> grasp. > >>> > >>> Thanks, > >>> > >>> Ryan > >>> > >>> > >>> > >>> On Thu, Jan 14, 2021 at 2:03 PM Ryan Taylor <ryta1203 at gmail.com> > wrote: > >>> > >>>> Ok, thanks. I'll try to come up with a more generic test case. > >>>> > >>>> On Thu, Jan 14, 2021 at 2:00 PM Johannes Doerfert < > >>>> johannesdoerfert at gmail.com> wrote: > >>>> > >>>>> Nothing comes to mind given the information. > >>>>> > >>>>> ~ Johannes > >>>>> > >>>>> > >>>>> On 1/14/21 12:56 PM, Ryan Taylor wrote: > >>>>>> argmemonly, nounwind, writeonly > >>>>>> > >>>>>> -fno-strict-aliasing did not help. > >>>>>> > >>>>>> On Thu, Jan 14, 2021 at 1:29 PM Johannes Doerfert < > >>>>>> johannesdoerfert at gmail.com> wrote: > >>>>>> > >>>>>>> Can you share the attributes of the intrinsic declaration, > >>>>>>> assuming removing `!tbaa` didn't help. > >>>>>>> > >>>>>>> > >>>>>>> On 1/14/21 11:43 AM, Ryan Taylor wrote: > >>>>>>>> Yes, this is for downstream/out of tree target so I'm not sure how > >> you > >>>>>>>> could reproduce it either but I thought the IR might help a bit. > >>>>>>>> > >>>>>>>> My guess is it's not a bug in GVN as much as an issue with the > >>>>> intrinsic > >>>>>>>> properties, or lack thereof. I put this in the first post but the > >>>>> Alias > >>>>>>>> Sets show: > >>>>>>>> > >>>>>>>> AliasSet[0x75fbd0, 9] may alias, Mod/Ref Pointers: (i8* %1, 8), > >>>>> (i16* > >>>>>>>> %arrayidx, 2), (i16* %arrayidx1, 2), (i16* %arrayidx5, 2), (i16* > >> %19, > >>>>> 2), > >>>>>>>> (i16* %arrayidx6, 2), (i16* %arrayidx14, 2), (i16* %arrayidx19, 2) > >>>>>>>> 4 Unknown instructions: void <badref>, <4 x i16>* %6, <4 x > >> i16>* > >>>>>>> %22, > >>>>>>>> void <badref> > >>>>>>>> > >>>>>>>> > >>>>>>>> > >>>>>>>> > >>>>>>>> > >>>>>>>> On Thu, Jan 14, 2021 at 12:37 PM Johannes Doerfert < > >>>>>>>> johannesdoerfert at gmail.com> wrote: > >>>>>>>> > >>>>>>>>> There is still not enough information here. > >>>>>>>>> > >>>>>>>>> My first guess. The `!tbaa` annotation on the `XXX.intrinsic` and > >> the > >>>>>>>>> `load` > >>>>>>>>> basically encode there is no alias. Easy to verify, remove the > ones > >>>>> on > >>>>>>> the > >>>>>>>>> intrinsic. > >>>>>>>>> > >>>>>>>>> ~ Johannes > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> P.S. If this was a bug in GVN, and I assume it is not, a > reproducer > >>>>>>>>> would help > >>>>>>>>> a lot. So a small IR sample that shows the problem and > >> which > >>>>> we > >>>>>>>>> can run. > >>>>>>>>> This is a "redacted?" IR fragment in which I don't know > >> what > >>>>>>>>> transformation > >>>>>>>>> is problematic. I also can not run it through GVN, > which > >>>>> makes it > >>>>>>>>> impossible > >>>>>>>>> to reproduce. > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> On 1/14/21 11:27 AM, Ryan Taylor via llvm-dev wrote: > >>>>>>>>>> This is right before GVN: > >>>>>>>>>> > >>>>>>>>>> define i32 @foo(<4 x i16> %p, <4 x i16> %p1, i16* nocapture > %res) > >>>>>>>>>> local_unnamed_addr #0 !dbg !6 { > >>>>>>>>>> entry: > >>>>>>>>>> %temp = alloca i64, align 8 > >>>>>>>>>> %tmpcast = bitcast i64* %temp to [4 x i16]* > >>>>>>>>>> %0 = bitcast i64* %temp to i8*, !dbg !8 > >>>>>>>>>> call void @llvm.lifetime.start.p0i8(i64 8, i8* nonnull > %0) > >> #3, > >>>>>>> !dbg !8 > >>>>>>>>>> store i64 0, i64* %temp, align 8, !dbg !9 > >>>>>>>>>> %1 = bitcast i64* %temp to <4 x i16>*, !dbg !10 > >>>>>>>>>> %2 = call <4 x i16>* @llvm.XXX.intrinsic(<4 x i16>* > nonnull > >>>>> %1, <4 > >>>>>>> x > >>>>>>>>> i16> > >>>>>>>>>> %p, i32 0), !dbg !11, !tbaa !12 > >>>>>>>>>> %arrayidx = bitcast i64* %temp to i16*, !dbg !16 > >>>>>>>>>> %3 = load i16, i16* %arrayidx, align 8, !dbg !16, !tbaa > !17 > >>>>>>>>>> br label %for.body, !dbg !19 > >>>>>>>>>> > >>>>>>>>>> for.body: ; preds > %entry > >>>>>>>>>> %arrayidx1 = getelementptr inbounds [4 x i16], [4 x i16]* > >>>>> %tmpcast, > >>>>>>>>> i32 > >>>>>>>>>> 0, i32 1, !dbg !20 > >>>>>>>>>> %4 = load i16, i16* %arrayidx1, align 2, !dbg !20, !tbaa > !17 > >>>>>>>>>> %cmp3 = icmp sgt i16 %3, %4, !dbg !21 > >>>>>>>>>> %spec.select = select i1 %cmp3, i16 %4, i16 %3, !dbg !22 > >>>>>>>>>> %arrayidx1.1 = getelementptr inbounds [4 x i16], [4 x > i16]* > >>>>>>> %tmpcast, > >>>>>>>>> i32 > >>>>>>>>>> 0, i32 2, !dbg !20 > >>>>>>>>>> %5 = load i16, i16* %arrayidx1.1, align 2, !dbg !20, > !tbaa > >> !17 > >>>>>>>>>> %cmp3.1 = icmp sgt i16 %spec.select, %5, !dbg !21 > >>>>>>>>>> %spec.select.1 = select i1 %cmp3.1, i16 %5, i16 > >> %spec.select, > >>>>> !dbg > >>>>>>> !22 > >>>>>>>>>> %arrayidx1.2 = getelementptr inbounds [4 x i16], [4 x > i16]* > >>>>>>> %tmpcast, > >>>>>>>>> i32 > >>>>>>>>>> 0, i32 3, !dbg !20 > >>>>>>>>>> %6 = load i16, i16* %arrayidx1.2, align 2, !dbg !20, > !tbaa > >> !17 > >>>>>>>>>> %cmp3.2 = icmp sgt i16 %spec.select.1, %6, !dbg !21 > >>>>>>>>>> %spec.select.2 = select i1 %cmp3.2, i16 %6, i16 > >> %spec.select.1, > >>>>>>> !dbg > >>>>>>>>> !22 > >>>>>>>>>> store i16 %spec.select.2, i16* %res, align 2, !dbg !23, > >> !tbaa > >>>>> !17 > >>>>>>>>>> %7 = tail call <4 x i16>* @llvm.XXX.intrinsic(<4 x i16>* > %2, > >>>>> <4 x > >>>>>>> i16> > >>>>>>>>>> %p1, i32 0), !dbg !24, !tbaa !12 > >>>>>>>>>> %8 = load i16, i16* %arrayidx, align 8, !dbg !25, !tbaa > !17 > >>>>>>>>>> br label %for.body12, !dbg !26 > >>>>>>>>>> > >>>>>>>>>> for.body12: ; preds > >>>>> %for.body > >>>>>>>>>> %arrayidx14 = getelementptr inbounds [4 x i16], [4 x > i16]* > >>>>>>> %tmpcast, > >>>>>>>>> i32 > >>>>>>>>>> 0, i32 1, !dbg !27 > >>>>>>>>>> %9 = load i16, i16* %arrayidx14, align 2, !dbg !27, !tbaa > >> !17 > >>>>>>>>>> %cmp16 = icmp sgt i16 %8, %9, !dbg !28 > >>>>>>>>>> %spec.select39 = select i1 %cmp16, i16 %9, i16 %8, !dbg > !29 > >>>>>>>>>> %arrayidx14.1 = getelementptr inbounds [4 x i16], [4 x > i16]* > >>>>>>> %tmpcast, > >>>>>>>>>> i32 0, i32 2, !dbg !27 > >>>>>>>>>> %10 = load i16, i16* %arrayidx14.1, align 2, !dbg !27, > !tbaa > >>>>> !17 > >>>>>>>>>> %cmp16.1 = icmp sgt i16 %spec.select39, %10, !dbg !28 > >>>>>>>>>> %spec.select39.1 = select i1 %cmp16.1, i16 %10, i16 > >>>>> %spec.select39, > >>>>>>>>> !dbg > >>>>>>>>>> !29 > >>>>>>>>>> %arrayidx14.2 = getelementptr inbounds [4 x i16], [4 x > i16]* > >>>>>>> %tmpcast, > >>>>>>>>>> i32 0, i32 3, !dbg !27 > >>>>>>>>>> %11 = load i16, i16* %arrayidx14.2, align 2, !dbg !27, > !tbaa > >>>>> !17 > >>>>>>>>>> %cmp16.2 = icmp sgt i16 %spec.select39.1, %11, !dbg !28 > >>>>>>>>>> %spec.select39.2 = select i1 %cmp16.2, i16 %11, i16 > >>>>>>> %spec.select39.1, > >>>>>>>>>> !dbg !29 > >>>>>>>>>> %conv24 = sext i16 %spec.select39.2 to i32, !dbg !30 > >>>>>>>>>> call void @llvm.lifetime.end.p0i8(i64 8, i8* nonnull %0) > #3, > >>>>> !dbg > >>>>>>> !31 > >>>>>>>>>> ret i32 %conv24, !dbg !32 > >>>>>>>>>> > >>>>>>>>>> On Thu, Jan 14, 2021 at 11:54 AM Roman Lebedev < > >>>>> lebedev.ri at gmail.com> > >>>>>>>>> wrote: > >>>>>>>>>>> It would be good to have an actual IR reproducer here. > >>>>>>>>>>> > >>>>>>>>>>> On Thu, Jan 14, 2021 at 7:51 PM Ryan Taylor via llvm-dev > >>>>>>>>>>> <llvm-dev at lists.llvm.org> wrote: > >>>>>>>>>>>> So given an intrinsic that has a pointer as in/out and > >>>>> IntrWriteMem > >>>>>>>>>>> property. > >>>>>>>>>>>> call intrinsic(address a, ....); > >>>>>>>>>>>> loop over address a > >>>>>>>>>>>> load from address a + offset > >>>>>>>>>>>> call intrinsic (address a, ...); > >>>>>>>>>>>> loop over address a > >>>>>>>>>>>> load from address a + offset > >>>>>>>>>>>> > >>>>>>>>>>>> GVN is removing the second loads, despite the second call > >>>>> overwriting > >>>>>>>>>>> the memory starting at address a. AA has the intrinsics marked > as > >>>>>>>>> unknown > >>>>>>>>>>> instructions but has all of these as mayAlias in a set. I'm not > >>>>> seeing > >>>>>>>>> this > >>>>>>>>>>> issue with -fno-unroll-loops. > >>>>>>>>>>>> Thanks. > >>>>>>>>>>>> > >>>>>>>>>>>> > >>>>>>>>>>>> > >>>>>>>>>>>> > >>>>>>>>>>>> _______________________________________________ > >>>>>>>>>>>> LLVM Developers mailing list > >>>>>>>>>>>> llvm-dev at lists.llvm.org > >>>>>>>>>>>> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev > >>>>>>>>>> _______________________________________________ > >>>>>>>>>> LLVM Developers mailing list > >>>>>>>>>> llvm-dev at lists.llvm.org > >>>>>>>>>> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20210118/83fc398b/attachment-0001.html>