Daniel Berlin via llvm-dev
2016-Aug-29 20:59 UTC
[llvm-dev] GVN / Alias Analysis issue with llvm.masked.scatter/gather intrinsics
Okay, so then it sounds like, for now, the right fix is to stop marking masked.gather and masked.scatter with intrarg* options. On Mon, Aug 29, 2016, 1:26 PM Philip Reames <listmail at philipreames.com> wrote:> We might have specification bug here, but we appear to implement what we > specified. argmemonly is specified as only considering pointer typed > arguments. It's behavior on vector-of-pointers is unspecified, but would > seem to fall into the same case as inttoptr of an integer argument (i.e. > explicitly undefined). We could consider changing that. > > > > On 08/29/2016 11:59 AM, Daniel Berlin wrote: > > + a few others. > > After following this rabbit hole a bit, there are a lot of mutually > recursive calls, etc, that may or may not do the right thing with vectors > of pointers. > I can fix *this* particular bug with the attached patch. > > However, it's mostly papering over stuff. Nothing seems to know what to > do with a memorylocation that is a vector of pointers. They all expect > memorylocation to be a single pointer location. > > I would chalk it up to "luck" that this patch fixes the bug. > > It's pretty clear that MemoryLocation doesn't fit the needs of a lot of > stuff anymore (we hacked AA nodes into it, and lots of stuff now tries to > figure out the invariantess of the locations, blah blah blah), but it seems > like a big job to figure out what to replace it with that will work for > these cases. > > (I'm pretty positive if we just make it MemoryLocations, and have > everything loop over the locations, the compiler will get a lot larger and > a lot slower) > > On Mon, Aug 29, 2016 at 9:10 AM, Daniel Berlin <dberlin at dberlin.org> > wrote: > >> it would also, for that matter, say the same about an array of pointers. >> >> It's not clear to me what will break if you change this to >> isPtrOrPtrVectorTy. >> >> In fact, i know it doesn't fix this bug. >> It's a pretty deep rabbit hole of things not quite prepared to understand >> vectors of pointers. >> >> (we prepare memorylocations of them, but memory locations expect to be >> one thing, not a group of things, etc). >> >> >> >> >> On Mon, Aug 29, 2016 at 8:58 AM, Daniel Berlin <dberlin at dberlin.org> >> wrote: >> >>> this is definitely a bug in AA. >>> >>> 225 for (auto I = CS2.arg_begin(), E = CS2.arg_end(); I != E; >>> ++I) { >>> 226 const Value *Arg = *I; >>> 227 if (!Arg->getType()->isPointerTy()) >>> -> 228 continue; >>> 229 unsigned CS2ArgIdx = std::distance(CS2.arg_begin(), I); >>> 230 auto CS2ArgLoc = MemoryLocation::getForArgument(CS2, >>> CS2ArgIdx, TLI); >>> >>> AliasAnalysis.cpp:228 >>> >>> It ignores every argument because they are vectors of pointers, not >>> pointers. >>> >>> >>> I'm surprised this has not broken anything before. It will never say two >>> intrinsics with vectors of pointers mod/ref each other. >>> >>> >>> On Mon, Aug 29, 2016 at 7:36 AM, Davide Italiano <davide at freebsd.org> >>> wrote: >>> >>>> + Daniel Berlin >>>> >>>> On Mon, Aug 29, 2016 at 3:42 PM, Chris Sakalis via llvm-dev >>>> <llvm-dev at lists.llvm.org> wrote: >>>> > Hello everyone, >>>> > >>>> > I think I have found an gvn / alias analysis related bug, but before >>>> opening >>>> > an issue on the tracker I wanted to see if I am missing something. I >>>> have >>>> > the following testcase: >>>> > >>>> >> define spir_kernel void @test(<2 x i32*> %in1, <2 x i32*> %in2, i32* >>>> %out) >>>> >> { >>>> >> entry: >>>> >> ; Just some temporary storage >>>> >> %tmp.0 = alloca i32 >>>> >> %tmp.1 = alloca i32 >>>> >> %tmp.i = insertelement <2 x i32*> undef, i32* %tmp.0, i32 0 >>>> >> %tmp = insertelement <2 x i32*> %tmp.i, i32* %tmp.1, i32 1 >>>> >> ; Read from in1 and in2 >>>> >> %in1.v = call <2 x i32> @llvm.masked.gather.v2i32(<2 x i32*> %in1, >>>> i32 >>>> >> 1, <2 x i1> <i1 true, i1 true>, <2 x i32> undef) #1 >>>> >> %in2.v = call <2 x i32> @llvm.masked.gather.v2i32(<2 x i32*> %in2, >>>> i32 >>>> >> 1, <2 x i1> <i1 true, i1 true>, <2 x i32> undef) #1 >>>> >> ; Store in1 to the allocas >>>> >> call void @llvm.masked.scatter.v2i32(<2 x i32> %in1.v, <2 x i32*> >>>> %tmp, >>>> >> i32 1, <2 x i1> <i1 true, i1 true>); >>>> >> ; Read in1 from the allocas >>>> >> %tmp.v.0 = call <2 x i32> @llvm.masked.gather.v2i32(<2 x i32*> >>>> %tmp, i32 >>>> >> 1, <2 x i1> <i1 true, i1 true>, <2 x i32> undef) #1 >>>> >> ; Store in2 to the allocas >>>> >> call void @llvm.masked.scatter.v2i32(<2 x i32> %in2.v, <2 x i32*> >>>> %tmp, >>>> >> i32 1, <2 x i1> <i1 true, i1 true>); >>>> >> ; Read in2 from the allocas >>>> >> %tmp.v.1 = call <2 x i32> @llvm.masked.gather.v2i32(<2 x i32*> >>>> %tmp, i32 >>>> >> 1, <2 x i1> <i1 true, i1 true>, <2 x i32> undef) #1 >>>> >> ; Store in2 to out for good measure >>>> >> %tmp.v.1.0 = extractelement <2 x i32> %tmp.v.1, i32 0 >>>> >> %tmp.v.1.1 = extractelement <2 x i32> %tmp.v.1, i32 1 >>>> >> store i32 %tmp.v.1.0, i32* %out >>>> >> %out.1 = getelementptr i32, i32* %out, i32 1 >>>> >> store i32 %tmp.v.1.1, i32* %out.1 >>>> >> ret void >>>> >> } >>>> > >>>> > >>>> > It uses a masked scatter operation to store a value to the two >>>> allocas and >>>> > then uses a masked gather operation to read that same value. This is >>>> done >>>> > twice in a row, with two different values. If I run this code through >>>> the >>>> > GVN pass, the second gather (%tmp.v.1) will be deemed to be the same >>>> as the >>>> > first gather (%tmp.v.0) and it will be removed. After some debugging, >>>> I >>>> > realized that this is happening because the Memory Dependence Analysis >>>> > returns %tmp.v.0 as the Def dependency for %tmp.v.1, even though the >>>> scatter >>>> > call in between changes the value stored at %tmp. This, in turn, is >>>> > happening because the alias analysis is returning NoModRef for the >>>> %tmp.v.1 >>>> > and second scatter callsites. The resulting IR produces the wrong >>>> result: >>>> > >>>> >> define spir_kernel void @test(<2 x i32*> %in1, <2 x i32*> %in2, i32* >>>> %out) >>>> >> { >>>> >> entry: >>>> >> %tmp.0 = alloca i32 >>>> >> %tmp.1 = alloca i32 >>>> >> %tmp.i = insertelement <2 x i32*> undef, i32* %tmp.0, i32 0 >>>> >> %tmp = insertelement <2 x i32*> %tmp.i, i32* %tmp.1, i32 1 >>>> >> %in1.v = call <2 x i32> @llvm.masked.gather.v2i32(<2 x i32*> %in1, >>>> i32 >>>> >> 1, <2 x i1> <i1 true, i1 true>, <2 x i32> undef) #1 >>>> >> %in2.v = call <2 x i32> @llvm.masked.gather.v2i32(<2 x i32*> %in2, >>>> i32 >>>> >> 1, <2 x i1> <i1 true, i1 true>, <2 x i32> undef) #1 >>>> >> call void @llvm.masked.scatter.v2i32(<2 x i32> %in1.v, <2 x i32*> >>>> %tmp, >>>> >> i32 1, <2 x i1> <i1 true, i1 true>) >>>> >> %tmp.v.0 = call <2 x i32> @llvm.masked.gather.v2i32(<2 x i32*> >>>> %tmp, i32 >>>> >> 1, <2 x i1> <i1 true, i1 true>, <2 x i32> undef) #1 >>>> >> call void @llvm.masked.scatter.v2i32(<2 x i32> %in2.v, <2 x i32*> >>>> %tmp, >>>> >> i32 1, <2 x i1> <i1 true, i1 true>) >>>> >> ; The call to masked.gather is gone and now we are using the old >>>> value >>>> >> (%tmp.v.0) >>>> >> %tmp.v.1.0 = extractelement <2 x i32> %tmp.v.0, i32 0 >>>> >> %tmp.v.1.1 = extractelement <2 x i32> %tmp.v.0, i32 1 >>>> >> store i32 %tmp.v.1.0, i32* %out >>>> >> %out.1 = getelementptr i32, i32* %out, i32 1 >>>> >> store i32 %tmp.v.1.1, i32* %out.1 >>>> >> ret void >>>> >> } >>>> > >>>> > >>>> > The old value read from %tmp is used, instead of the new one. I >>>> tested this >>>> > code using `opt -gvn`, with LLVM 3.8.1. I also tried tip (84cb7f4) >>>> with the >>>> > same result. >>>> > >>>> > I should mention that if I replace the second scatter with stores, >>>> the issue >>>> > persists. The only way to avoid it is to replace all scatter/gather >>>> > intrinsics with equivalent sets of store/load, in which case the >>>> MemDep >>>> > returns the correct dependencies and the GVN pass will not remove the >>>> second >>>> > set of loads. >>>> > >>>> > So, my question is, is this a bug or am I doing something that I >>>> shouldn't >>>> > be in the IR? And if it is a bug, is it the AA analyses that return >>>> the >>>> > wrong result (I presume so) or should GVN handle such cases >>>> differently? >>>> > >>>> > Best regards, >>>> > >>>> > Chris >>>> > >>>> > _______________________________________________ >>>> > LLVM Developers mailing list >>>> > llvm-dev at lists.llvm.org >>>> > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev >>>> > >>>> >>>> >>>> >>>> -- >>>> Davide >>>> >>>> "There are no solved problems; there are only problems that are more >>>> or less solved" -- Henri Poincare >>>> >>> >>> >> > >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20160829/0db527f5/attachment.html>
Daniel Berlin via llvm-dev
2016-Aug-30 20:07 UTC
[llvm-dev] GVN / Alias Analysis issue with llvm.masked.scatter/gather intrinsics
This is now committed and a test added to GVN. On Mon, Aug 29, 2016 at 1:59 PM, Daniel Berlin <dberlin at dberlin.org> wrote:> Okay, so then it sounds like, for now, the right fix is to stop marking > masked.gather and masked.scatter with intrarg* options. > > On Mon, Aug 29, 2016, 1:26 PM Philip Reames <listmail at philipreames.com> > wrote: > >> We might have specification bug here, but we appear to implement what we >> specified. argmemonly is specified as only considering pointer typed >> arguments. It's behavior on vector-of-pointers is unspecified, but would >> seem to fall into the same case as inttoptr of an integer argument (i.e. >> explicitly undefined). We could consider changing that. >> >> >> >> On 08/29/2016 11:59 AM, Daniel Berlin wrote: >> >> + a few others. >> >> After following this rabbit hole a bit, there are a lot of mutually >> recursive calls, etc, that may or may not do the right thing with vectors >> of pointers. >> I can fix *this* particular bug with the attached patch. >> >> However, it's mostly papering over stuff. Nothing seems to know what to >> do with a memorylocation that is a vector of pointers. They all expect >> memorylocation to be a single pointer location. >> >> I would chalk it up to "luck" that this patch fixes the bug. >> >> It's pretty clear that MemoryLocation doesn't fit the needs of a lot of >> stuff anymore (we hacked AA nodes into it, and lots of stuff now tries to >> figure out the invariantess of the locations, blah blah blah), but it seems >> like a big job to figure out what to replace it with that will work for >> these cases. >> >> (I'm pretty positive if we just make it MemoryLocations, and have >> everything loop over the locations, the compiler will get a lot larger and >> a lot slower) >> >> On Mon, Aug 29, 2016 at 9:10 AM, Daniel Berlin <dberlin at dberlin.org> >> wrote: >> >>> it would also, for that matter, say the same about an array of pointers. >>> >>> It's not clear to me what will break if you change this to >>> isPtrOrPtrVectorTy. >>> >>> In fact, i know it doesn't fix this bug. >>> It's a pretty deep rabbit hole of things not quite prepared to >>> understand vectors of pointers. >>> >>> (we prepare memorylocations of them, but memory locations expect to be >>> one thing, not a group of things, etc). >>> >>> >>> >>> >>> On Mon, Aug 29, 2016 at 8:58 AM, Daniel Berlin <dberlin at dberlin.org> >>> wrote: >>> >>>> this is definitely a bug in AA. >>>> >>>> 225 for (auto I = CS2.arg_begin(), E = CS2.arg_end(); I != E; >>>> ++I) { >>>> 226 const Value *Arg = *I; >>>> 227 if (!Arg->getType()->isPointerTy()) >>>> -> 228 continue; >>>> 229 unsigned CS2ArgIdx = std::distance(CS2.arg_begin(), I); >>>> 230 auto CS2ArgLoc = MemoryLocation::getForArgument(CS2, >>>> CS2ArgIdx, TLI); >>>> >>>> AliasAnalysis.cpp:228 >>>> >>>> It ignores every argument because they are vectors of pointers, not >>>> pointers. >>>> >>>> >>>> I'm surprised this has not broken anything before. It will never say >>>> two intrinsics with vectors of pointers mod/ref each other. >>>> >>>> >>>> On Mon, Aug 29, 2016 at 7:36 AM, Davide Italiano <davide at freebsd.org> >>>> wrote: >>>> >>>>> + Daniel Berlin >>>>> >>>>> On Mon, Aug 29, 2016 at 3:42 PM, Chris Sakalis via llvm-dev >>>>> <llvm-dev at lists.llvm.org> wrote: >>>>> > Hello everyone, >>>>> > >>>>> > I think I have found an gvn / alias analysis related bug, but before >>>>> opening >>>>> > an issue on the tracker I wanted to see if I am missing something. I >>>>> have >>>>> > the following testcase: >>>>> > >>>>> >> define spir_kernel void @test(<2 x i32*> %in1, <2 x i32*> %in2, >>>>> i32* %out) >>>>> >> { >>>>> >> entry: >>>>> >> ; Just some temporary storage >>>>> >> %tmp.0 = alloca i32 >>>>> >> %tmp.1 = alloca i32 >>>>> >> %tmp.i = insertelement <2 x i32*> undef, i32* %tmp.0, i32 0 >>>>> >> %tmp = insertelement <2 x i32*> %tmp.i, i32* %tmp.1, i32 1 >>>>> >> ; Read from in1 and in2 >>>>> >> %in1.v = call <2 x i32> @llvm.masked.gather.v2i32(<2 x i32*> >>>>> %in1, i32 >>>>> >> 1, <2 x i1> <i1 true, i1 true>, <2 x i32> undef) #1 >>>>> >> %in2.v = call <2 x i32> @llvm.masked.gather.v2i32(<2 x i32*> >>>>> %in2, i32 >>>>> >> 1, <2 x i1> <i1 true, i1 true>, <2 x i32> undef) #1 >>>>> >> ; Store in1 to the allocas >>>>> >> call void @llvm.masked.scatter.v2i32(<2 x i32> %in1.v, <2 x i32*> >>>>> %tmp, >>>>> >> i32 1, <2 x i1> <i1 true, i1 true>); >>>>> >> ; Read in1 from the allocas >>>>> >> %tmp.v.0 = call <2 x i32> @llvm.masked.gather.v2i32(<2 x i32*> >>>>> %tmp, i32 >>>>> >> 1, <2 x i1> <i1 true, i1 true>, <2 x i32> undef) #1 >>>>> >> ; Store in2 to the allocas >>>>> >> call void @llvm.masked.scatter.v2i32(<2 x i32> %in2.v, <2 x i32*> >>>>> %tmp, >>>>> >> i32 1, <2 x i1> <i1 true, i1 true>); >>>>> >> ; Read in2 from the allocas >>>>> >> %tmp.v.1 = call <2 x i32> @llvm.masked.gather.v2i32(<2 x i32*> >>>>> %tmp, i32 >>>>> >> 1, <2 x i1> <i1 true, i1 true>, <2 x i32> undef) #1 >>>>> >> ; Store in2 to out for good measure >>>>> >> %tmp.v.1.0 = extractelement <2 x i32> %tmp.v.1, i32 0 >>>>> >> %tmp.v.1.1 = extractelement <2 x i32> %tmp.v.1, i32 1 >>>>> >> store i32 %tmp.v.1.0, i32* %out >>>>> >> %out.1 = getelementptr i32, i32* %out, i32 1 >>>>> >> store i32 %tmp.v.1.1, i32* %out.1 >>>>> >> ret void >>>>> >> } >>>>> > >>>>> > >>>>> > It uses a masked scatter operation to store a value to the two >>>>> allocas and >>>>> > then uses a masked gather operation to read that same value. This is >>>>> done >>>>> > twice in a row, with two different values. If I run this code >>>>> through the >>>>> > GVN pass, the second gather (%tmp.v.1) will be deemed to be the same >>>>> as the >>>>> > first gather (%tmp.v.0) and it will be removed. After some >>>>> debugging, I >>>>> > realized that this is happening because the Memory Dependence >>>>> Analysis >>>>> > returns %tmp.v.0 as the Def dependency for %tmp.v.1, even though the >>>>> scatter >>>>> > call in between changes the value stored at %tmp. This, in turn, is >>>>> > happening because the alias analysis is returning NoModRef for the >>>>> %tmp.v.1 >>>>> > and second scatter callsites. The resulting IR produces the wrong >>>>> result: >>>>> > >>>>> >> define spir_kernel void @test(<2 x i32*> %in1, <2 x i32*> %in2, >>>>> i32* %out) >>>>> >> { >>>>> >> entry: >>>>> >> %tmp.0 = alloca i32 >>>>> >> %tmp.1 = alloca i32 >>>>> >> %tmp.i = insertelement <2 x i32*> undef, i32* %tmp.0, i32 0 >>>>> >> %tmp = insertelement <2 x i32*> %tmp.i, i32* %tmp.1, i32 1 >>>>> >> %in1.v = call <2 x i32> @llvm.masked.gather.v2i32(<2 x i32*> >>>>> %in1, i32 >>>>> >> 1, <2 x i1> <i1 true, i1 true>, <2 x i32> undef) #1 >>>>> >> %in2.v = call <2 x i32> @llvm.masked.gather.v2i32(<2 x i32*> >>>>> %in2, i32 >>>>> >> 1, <2 x i1> <i1 true, i1 true>, <2 x i32> undef) #1 >>>>> >> call void @llvm.masked.scatter.v2i32(<2 x i32> %in1.v, <2 x i32*> >>>>> %tmp, >>>>> >> i32 1, <2 x i1> <i1 true, i1 true>) >>>>> >> %tmp.v.0 = call <2 x i32> @llvm.masked.gather.v2i32(<2 x i32*> >>>>> %tmp, i32 >>>>> >> 1, <2 x i1> <i1 true, i1 true>, <2 x i32> undef) #1 >>>>> >> call void @llvm.masked.scatter.v2i32(<2 x i32> %in2.v, <2 x i32*> >>>>> %tmp, >>>>> >> i32 1, <2 x i1> <i1 true, i1 true>) >>>>> >> ; The call to masked.gather is gone and now we are using the old >>>>> value >>>>> >> (%tmp.v.0) >>>>> >> %tmp.v.1.0 = extractelement <2 x i32> %tmp.v.0, i32 0 >>>>> >> %tmp.v.1.1 = extractelement <2 x i32> %tmp.v.0, i32 1 >>>>> >> store i32 %tmp.v.1.0, i32* %out >>>>> >> %out.1 = getelementptr i32, i32* %out, i32 1 >>>>> >> store i32 %tmp.v.1.1, i32* %out.1 >>>>> >> ret void >>>>> >> } >>>>> > >>>>> > >>>>> > The old value read from %tmp is used, instead of the new one. I >>>>> tested this >>>>> > code using `opt -gvn`, with LLVM 3.8.1. I also tried tip (84cb7f4) >>>>> with the >>>>> > same result. >>>>> > >>>>> > I should mention that if I replace the second scatter with stores, >>>>> the issue >>>>> > persists. The only way to avoid it is to replace all scatter/gather >>>>> > intrinsics with equivalent sets of store/load, in which case the >>>>> MemDep >>>>> > returns the correct dependencies and the GVN pass will not remove >>>>> the second >>>>> > set of loads. >>>>> > >>>>> > So, my question is, is this a bug or am I doing something that I >>>>> shouldn't >>>>> > be in the IR? And if it is a bug, is it the AA analyses that return >>>>> the >>>>> > wrong result (I presume so) or should GVN handle such cases >>>>> differently? >>>>> > >>>>> > Best regards, >>>>> > >>>>> > Chris >>>>> > >>>>> > _______________________________________________ >>>>> > LLVM Developers mailing list >>>>> > llvm-dev at lists.llvm.org >>>>> > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev >>>>> > >>>>> >>>>> >>>>> >>>>> -- >>>>> Davide >>>>> >>>>> "There are no solved problems; there are only problems that are more >>>>> or less solved" -- Henri Poincare >>>>> >>>> >>>> >>> >> >>-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20160830/d5557e8b/attachment.html>
Hal Finkel via llvm-dev
2016-Aug-30 20:11 UTC
[llvm-dev] GVN / Alias Analysis issue with llvm.masked.scatter/gather intrinsics
----- Original Message -----> From: "Daniel Berlin" <dberlin at dberlin.org> > To: "Philip Reames" <listmail at philipreames.com>, "Davide Italiano" > <davide at freebsd.org>, "Chandler Carruth" <chandlerc at gmail.com> > Cc: "Chris Sakalis" <chrissakalis at gmail.com>, "David Majnemer" > <david.majnemer at gmail.com>, "Hal Finkel" <hfinkel at anl.gov>, > "llvm-dev" <llvm-dev at lists.llvm.org> > Sent: Tuesday, August 30, 2016 3:07:01 PM > Subject: Re: [llvm-dev] GVN / Alias Analysis issue with > llvm.masked.scatter/gather intrinsics> This is now committed and a test added to GVN.Thanks! I suspect that, in practice, we'll get little benefit from handling this until our AA passes learn how to deal with (i.e. look back through) pointer vectors. -Hal> On Mon, Aug 29, 2016 at 1:59 PM, Daniel Berlin < dberlin at dberlin.org > > wrote:> > Okay, so then it sounds like, for now, the right fix is to stop > > marking masked.gather and masked.scatter with intrarg* options. >> > On Mon, Aug 29, 2016, 1:26 PM Philip Reames < > > listmail at philipreames.com > wrote: >> > > We might have specification bug here, but we appear to implement > > > what > > > we specified. argmemonly is specified as only considering pointer > > > typed arguments. It's behavior on vector-of-pointers is > > > unspecified, > > > but would seem to fall into the same case as inttoptr of an > > > integer > > > argument (i.e. explicitly undefined). We could consider changing > > > that. > > >> > > On 08/29/2016 11:59 AM, Daniel Berlin wrote: > > >> > > > + a few others. > > > > > >> > > > After following this rabbit hole a bit, there are a lot of > > > > mutually > > > > recursive calls, etc, that may or may not do the right thing > > > > with > > > > vectors of pointers. > > > > > > > > > > I can fix *this* particular bug with the attached patch. > > > > > >> > > > However, it's mostly papering over stuff. Nothing seems to know > > > > what > > > > to do with a memorylocation that is a vector of pointers. They > > > > all > > > > expect memorylocation to be a single pointer location. > > > > > >> > > > I would chalk it up to "luck" that this patch fixes the bug. > > > > > >> > > > It's pretty clear that MemoryLocation doesn't fit the needs of > > > > a > > > > lot > > > > of stuff anymore (we hacked AA nodes into it, and lots of stuff > > > > now > > > > tries to figure out the invariantess of the locations, blah > > > > blah > > > > blah), but it seems like a big job to figure out what to > > > > replace > > > > it > > > > with that will work for these cases. > > > > > >> > > > (I'm pretty positive if we just make it MemoryLocations, and > > > > have > > > > everything loop over the locations, the compiler will get a lot > > > > larger and a lot slower) > > > > > >> > > > On Mon, Aug 29, 2016 at 9:10 AM, Daniel Berlin < > > > > dberlin at dberlin.org > > > > > wrote: > > > > > >> > > > > it would also, for that matter, say the same about an array > > > > > of > > > > > pointers. > > > > > > > > > >> > > > > It's not clear to me what will break if you change this to > > > > > isPtrOrPtrVectorTy. > > > > > > > > > >> > > > > In fact, i know it doesn't fix this bug. > > > > > > > > > > > > > > > It's a pretty deep rabbit hole of things not quite prepared > > > > > to > > > > > understand vectors of pointers. > > > > > > > > > >> > > > > (we prepare memorylocations of them, but memory locations > > > > > expect > > > > > to > > > > > be one thing, not a group of things, etc). > > > > > > > > > >> > > > > On Mon, Aug 29, 2016 at 8:58 AM, Daniel Berlin < > > > > > dberlin at dberlin.org > > > > > > wrote: > > > > > > > > > >> > > > > > this is definitely a bug in AA. > > > > > > > > > > > > > > >> > > > > > 225 for (auto I = CS2.arg_begin(), E = CS2.arg_end(); I !> > > > > > E; > > > > > > ++I) > > > > > > { > > > > > > > > > > > > > > > > > > > > > 226 const Value *Arg = *I; > > > > > > > > > > > > > > > > > > > > > 227 if (!Arg->getType()->isPointerTy()) > > > > > > > > > > > > > > > > > > > > > -> 228 continue; > > > > > > > > > > > > > > > > > > > > > 229 unsigned CS2ArgIdx = std::distance(CS2.arg_begin(), I); > > > > > > > > > > > > > > > > > > > > > 230 auto CS2ArgLoc = MemoryLocation::getForArgument(CS2, > > > > > > CS2ArgIdx, > > > > > > TLI); > > > > > > > > > > > > > > >> > > > > > AliasAnalysis.cpp:228 > > > > > > > > > > > > > > >> > > > > > It ignores every argument because they are vectors of > > > > > > pointers, > > > > > > not > > > > > > pointers. > > > > > > > > > > > > > > >> > > > > > I'm surprised this has not broken anything before. It will > > > > > > never > > > > > > say > > > > > > two intrinsics with vectors of pointers mod/ref each other. > > > > > > > > > > > > > > >> > > > > > On Mon, Aug 29, 2016 at 7:36 AM, Davide Italiano < > > > > > > davide at freebsd.org > > > > > > > wrote: > > > > > > > > > > > > > > >> > > > > > > + Daniel Berlin > > > > > > > > > > > > > > > > > > > > >> > > > > > > On Mon, Aug 29, 2016 at 3:42 PM, Chris Sakalis via > > > > > > > llvm-dev > > > > > > > > > > > > > > > > > > > > > > > > > > > > < llvm-dev at lists.llvm.org > wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Hello everyone, > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > I think I have found an gvn / alias analysis related > > > > > > > > bug, > > > > > > > > but > > > > > > > > before opening > > > > > > > > > > > > > > > > > > > > > > > > > > > > > an issue on the tracker I wanted to see if I am missing > > > > > > > > something. > > > > > > > > I have > > > > > > > > > > > > > > > > > > > > > > > > > > > > > the following testcase: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > >> define spir_kernel void @test(<2 x i32*> %in1, <2 x > > > > > > > >> i32*> > > > > > > > >> %in2, > > > > > > > >> i32* %out) > > > > > > > > > > > > > > > > > > > > > > > > > > > > >> { > > > > > > > > > > > > > > > > > > > > > > > > > > > > >> entry: > > > > > > > > > > > > > > > > > > > > > > > > > > > > >> ; Just some temporary storage > > > > > > > > > > > > > > > > > > > > > > > > > > > > >> %tmp.0 = alloca i32 > > > > > > > > > > > > > > > > > > > > > > > > > > > > >> %tmp.1 = alloca i32 > > > > > > > > > > > > > > > > > > > > > > > > > > > > >> %tmp.i = insertelement <2 x i32*> undef, i32* %tmp.0, > > > > > > > >> i32 > > > > > > > >> 0 > > > > > > > > > > > > > > > > > > > > > > > > > > > > >> %tmp = insertelement <2 x i32*> %tmp.i, i32* %tmp.1, > > > > > > > >> i32 > > > > > > > >> 1 > > > > > > > > > > > > > > > > > > > > > > > > > > > > >> ; Read from in1 and in2 > > > > > > > > > > > > > > > > > > > > > > > > > > > > >> %in1.v = call <2 x i32> @llvm.masked.gather.v2i32(<2 x > > > > > > > >> i32*> > > > > > > > >> %in1, > > > > > > > >> i32 > > > > > > > > > > > > > > > > > > > > > > > > > > > > >> 1, <2 x i1> <i1 true, i1 true>, <2 x i32> undef) #1 > > > > > > > > > > > > > > > > > > > > > > > > > > > > >> %in2.v = call <2 x i32> @llvm.masked.gather.v2i32(<2 x > > > > > > > >> i32*> > > > > > > > >> %in2, > > > > > > > >> i32 > > > > > > > > > > > > > > > > > > > > > > > > > > > > >> 1, <2 x i1> <i1 true, i1 true>, <2 x i32> undef) #1 > > > > > > > > > > > > > > > > > > > > > > > > > > > > >> ; Store in1 to the allocas > > > > > > > > > > > > > > > > > > > > > > > > > > > > >> call void @llvm.masked.scatter.v2i32(<2 x i32> %in1.v, > > > > > > > >> <2 > > > > > > > >> x > > > > > > > >> i32*> > > > > > > > >> %tmp, > > > > > > > > > > > > > > > > > > > > > > > > > > > > >> i32 1, <2 x i1> <i1 true, i1 true>); > > > > > > > > > > > > > > > > > > > > > > > > > > > > >> ; Read in1 from the allocas > > > > > > > > > > > > > > > > > > > > > > > > > > > > >> %tmp.v.0 = call <2 x i32> @llvm.masked.gather.v2i32(<2 > > > > > > > >> x > > > > > > > >> i32*> > > > > > > > >> %tmp, i32 > > > > > > > > > > > > > > > > > > > > > > > > > > > > >> 1, <2 x i1> <i1 true, i1 true>, <2 x i32> undef) #1 > > > > > > > > > > > > > > > > > > > > > > > > > > > > >> ; Store in2 to the allocas > > > > > > > > > > > > > > > > > > > > > > > > > > > > >> call void @llvm.masked.scatter.v2i32(<2 x i32> %in2.v, > > > > > > > >> <2 > > > > > > > >> x > > > > > > > >> i32*> > > > > > > > >> %tmp, > > > > > > > > > > > > > > > > > > > > > > > > > > > > >> i32 1, <2 x i1> <i1 true, i1 true>); > > > > > > > > > > > > > > > > > > > > > > > > > > > > >> ; Read in2 from the allocas > > > > > > > > > > > > > > > > > > > > > > > > > > > > >> %tmp.v.1 = call <2 x i32> @llvm.masked.gather.v2i32(<2 > > > > > > > >> x > > > > > > > >> i32*> > > > > > > > >> %tmp, i32 > > > > > > > > > > > > > > > > > > > > > > > > > > > > >> 1, <2 x i1> <i1 true, i1 true>, <2 x i32> undef) #1 > > > > > > > > > > > > > > > > > > > > > > > > > > > > >> ; Store in2 to out for good measure > > > > > > > > > > > > > > > > > > > > > > > > > > > > >> %tmp.v.1.0 = extractelement <2 x i32> %tmp.v.1, i32 0 > > > > > > > > > > > > > > > > > > > > > > > > > > > > >> %tmp.v.1.1 = extractelement <2 x i32> %tmp.v.1, i32 1 > > > > > > > > > > > > > > > > > > > > > > > > > > > > >> store i32 %tmp.v.1.0, i32* %out > > > > > > > > > > > > > > > > > > > > > > > > > > > > >> %out.1 = getelementptr i32, i32* %out, i32 1 > > > > > > > > > > > > > > > > > > > > > > > > > > > > >> store i32 %tmp.v.1.1, i32* %out.1 > > > > > > > > > > > > > > > > > > > > > > > > > > > > >> ret void > > > > > > > > > > > > > > > > > > > > > > > > > > > > >> } > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > It uses a masked scatter operation to store a value to > > > > > > > > the > > > > > > > > two > > > > > > > > allocas and > > > > > > > > > > > > > > > > > > > > > > > > > > > > > then uses a masked gather operation to read that same > > > > > > > > value. > > > > > > > > This > > > > > > > > is done > > > > > > > > > > > > > > > > > > > > > > > > > > > > > twice in a row, with two different values. If I run > > > > > > > > this > > > > > > > > code > > > > > > > > through the > > > > > > > > > > > > > > > > > > > > > > > > > > > > > GVN pass, the second gather (%tmp.v.1) will be deemed > > > > > > > > to > > > > > > > > be > > > > > > > > the > > > > > > > > same as the > > > > > > > > > > > > > > > > > > > > > > > > > > > > > first gather (%tmp.v.0) and it will be removed. After > > > > > > > > some > > > > > > > > debugging, I > > > > > > > > > > > > > > > > > > > > > > > > > > > > > realized that this is happening because the Memory > > > > > > > > Dependence > > > > > > > > Analysis > > > > > > > > > > > > > > > > > > > > > > > > > > > > > returns %tmp.v.0 as the Def dependency for %tmp.v.1, > > > > > > > > even > > > > > > > > though > > > > > > > > the scatter > > > > > > > > > > > > > > > > > > > > > > > > > > > > > call in between changes the value stored at %tmp. This, > > > > > > > > in > > > > > > > > turn, > > > > > > > > is > > > > > > > > > > > > > > > > > > > > > > > > > > > > > happening because the alias analysis is returning > > > > > > > > NoModRef > > > > > > > > for > > > > > > > > the > > > > > > > > %tmp.v.1 > > > > > > > > > > > > > > > > > > > > > > > > > > > > > and second scatter callsites. The resulting IR produces > > > > > > > > the > > > > > > > > wrong > > > > > > > > result: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > >> define spir_kernel void @test(<2 x i32*> %in1, <2 x > > > > > > > >> i32*> > > > > > > > >> %in2, > > > > > > > >> i32* %out) > > > > > > > > > > > > > > > > > > > > > > > > > > > > >> { > > > > > > > > > > > > > > > > > > > > > > > > > > > > >> entry: > > > > > > > > > > > > > > > > > > > > > > > > > > > > >> %tmp.0 = alloca i32 > > > > > > > > > > > > > > > > > > > > > > > > > > > > >> %tmp.1 = alloca i32 > > > > > > > > > > > > > > > > > > > > > > > > > > > > >> %tmp.i = insertelement <2 x i32*> undef, i32* %tmp.0, > > > > > > > >> i32 > > > > > > > >> 0 > > > > > > > > > > > > > > > > > > > > > > > > > > > > >> %tmp = insertelement <2 x i32*> %tmp.i, i32* %tmp.1, > > > > > > > >> i32 > > > > > > > >> 1 > > > > > > > > > > > > > > > > > > > > > > > > > > > > >> %in1.v = call <2 x i32> @llvm.masked.gather.v2i32(<2 x > > > > > > > >> i32*> > > > > > > > >> %in1, > > > > > > > >> i32 > > > > > > > > > > > > > > > > > > > > > > > > > > > > >> 1, <2 x i1> <i1 true, i1 true>, <2 x i32> undef) #1 > > > > > > > > > > > > > > > > > > > > > > > > > > > > >> %in2.v = call <2 x i32> @llvm.masked.gather.v2i32(<2 x > > > > > > > >> i32*> > > > > > > > >> %in2, > > > > > > > >> i32 > > > > > > > > > > > > > > > > > > > > > > > > > > > > >> 1, <2 x i1> <i1 true, i1 true>, <2 x i32> undef) #1 > > > > > > > > > > > > > > > > > > > > > > > > > > > > >> call void @llvm.masked.scatter.v2i32(<2 x i32> %in1.v, > > > > > > > >> <2 > > > > > > > >> x > > > > > > > >> i32*> > > > > > > > >> %tmp, > > > > > > > > > > > > > > > > > > > > > > > > > > > > >> i32 1, <2 x i1> <i1 true, i1 true>) > > > > > > > > > > > > > > > > > > > > > > > > > > > > >> %tmp.v.0 = call <2 x i32> @llvm.masked.gather.v2i32(<2 > > > > > > > >> x > > > > > > > >> i32*> > > > > > > > >> %tmp, i32 > > > > > > > > > > > > > > > > > > > > > > > > > > > > >> 1, <2 x i1> <i1 true, i1 true>, <2 x i32> undef) #1 > > > > > > > > > > > > > > > > > > > > > > > > > > > > >> call void @llvm.masked.scatter.v2i32(<2 x i32> %in2.v, > > > > > > > >> <2 > > > > > > > >> x > > > > > > > >> i32*> > > > > > > > >> %tmp, > > > > > > > > > > > > > > > > > > > > > > > > > > > > >> i32 1, <2 x i1> <i1 true, i1 true>) > > > > > > > > > > > > > > > > > > > > > > > > > > > > >> ; The call to masked.gather is gone and now we are > > > > > > > >> using > > > > > > > >> the > > > > > > > >> old > > > > > > > >> value > > > > > > > > > > > > > > > > > > > > > > > > > > > > >> (%tmp.v.0) > > > > > > > > > > > > > > > > > > > > > > > > > > > > >> %tmp.v.1.0 = extractelement <2 x i32> %tmp.v.0, i32 0 > > > > > > > > > > > > > > > > > > > > > > > > > > > > >> %tmp.v.1.1 = extractelement <2 x i32> %tmp.v.0, i32 1 > > > > > > > > > > > > > > > > > > > > > > > > > > > > >> store i32 %tmp.v.1.0, i32* %out > > > > > > > > > > > > > > > > > > > > > > > > > > > > >> %out.1 = getelementptr i32, i32* %out, i32 1 > > > > > > > > > > > > > > > > > > > > > > > > > > > > >> store i32 %tmp.v.1.1, i32* %out.1 > > > > > > > > > > > > > > > > > > > > > > > > > > > > >> ret void > > > > > > > > > > > > > > > > > > > > > > > > > > > > >> } > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > The old value read from %tmp is used, instead of the > > > > > > > > new > > > > > > > > one. > > > > > > > > I > > > > > > > > tested this > > > > > > > > > > > > > > > > > > > > > > > > > > > > > code using `opt -gvn`, with LLVM 3.8.1. I also tried > > > > > > > > tip > > > > > > > > (84cb7f4) > > > > > > > > with the > > > > > > > > > > > > > > > > > > > > > > > > > > > > > same result. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > I should mention that if I replace the second scatter > > > > > > > > with > > > > > > > > stores, > > > > > > > > the issue > > > > > > > > > > > > > > > > > > > > > > > > > > > > > persists. The only way to avoid it is to replace all > > > > > > > > scatter/gather > > > > > > > > > > > > > > > > > > > > > > > > > > > > > intrinsics with equivalent sets of store/load, in which > > > > > > > > case > > > > > > > > the > > > > > > > > MemDep > > > > > > > > > > > > > > > > > > > > > > > > > > > > > returns the correct dependencies and the GVN pass will > > > > > > > > not > > > > > > > > remove > > > > > > > > the second > > > > > > > > > > > > > > > > > > > > > > > > > > > > > set of loads. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > So, my question is, is this a bug or am I doing > > > > > > > > something > > > > > > > > that > > > > > > > > I > > > > > > > > shouldn't > > > > > > > > > > > > > > > > > > > > > > > > > > > > > be in the IR? And if it is a bug, is it the AA analyses > > > > > > > > that > > > > > > > > return > > > > > > > > the > > > > > > > > > > > > > > > > > > > > > > > > > > > > > wrong result (I presume so) or should GVN handle such > > > > > > > > cases > > > > > > > > differently? > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Best regards, > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Chris > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > _______________________________________________ > > > > > > > > > > > > > > > > > > > > > > > > > > > > > LLVM Developers mailing list > > > > > > > > > > > > > > > > > > > > > > > > > > > > > llvm-dev at lists.llvm.org > > > > > > > > > > > > > > > > > > > > > > > > > > > > > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > >> > > > > > > -- > > > > > > > > > > > > > > > > > > > > > > > > > > > > Davide > > > > > > > > > > > > > > > > > > > > >> > > > > > > "There are no solved problems; there are only problems > > > > > > > that > > > > > > > are > > > > > > > more > > > > > > > > > > > > > > > > > > > > > > > > > > > > or less solved" -- Henri Poincare > > > > > > > > > > > > > > > > > > > > >-- Hal Finkel Assistant Computational Scientist Leadership Computing Facility Argonne National Laboratory -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20160830/8ed9f221/attachment-0001.html>
Possibly Parallel Threads
- GVN / Alias Analysis issue with llvm.masked.scatter/gather intrinsics
- GVN / Alias Analysis issue with llvm.masked.scatter/gather intrinsics
- GVN / Alias Analysis issue with llvm.masked.scatter/gather intrinsics
- GVN / Alias Analysis issue with llvm.masked.scatter/gather intrinsics
- GVN / Alias Analysis issue with llvm.masked.scatter/gather intrinsics