Daniel Berlin via llvm-dev
2016-Aug-29 18:59 UTC
[llvm-dev] GVN / Alias Analysis issue with llvm.masked.scatter/gather intrinsics
+ a few others. After following this rabbit hole a bit, there are a lot of mutually recursive calls, etc, that may or may not do the right thing with vectors of pointers. I can fix *this* particular bug with the attached patch. However, it's mostly papering over stuff. Nothing seems to know what to do with a memorylocation that is a vector of pointers. They all expect memorylocation to be a single pointer location. I would chalk it up to "luck" that this patch fixes the bug. It's pretty clear that MemoryLocation doesn't fit the needs of a lot of stuff anymore (we hacked AA nodes into it, and lots of stuff now tries to figure out the invariantess of the locations, blah blah blah), but it seems like a big job to figure out what to replace it with that will work for these cases. (I'm pretty positive if we just make it MemoryLocations, and have everything loop over the locations, the compiler will get a lot larger and a lot slower) On Mon, Aug 29, 2016 at 9:10 AM, Daniel Berlin <dberlin at dberlin.org> wrote:> it would also, for that matter, say the same about an array of pointers. > > It's not clear to me what will break if you change this to > isPtrOrPtrVectorTy. > > In fact, i know it doesn't fix this bug. > It's a pretty deep rabbit hole of things not quite prepared to understand > vectors of pointers. > > (we prepare memorylocations of them, but memory locations expect to be one > thing, not a group of things, etc). > > > > > On Mon, Aug 29, 2016 at 8:58 AM, Daniel Berlin <dberlin at dberlin.org> > wrote: > >> this is definitely a bug in AA. >> >> 225 for (auto I = CS2.arg_begin(), E = CS2.arg_end(); I != E; >> ++I) { >> 226 const Value *Arg = *I; >> 227 if (!Arg->getType()->isPointerTy()) >> -> 228 continue; >> 229 unsigned CS2ArgIdx = std::distance(CS2.arg_begin(), I); >> 230 auto CS2ArgLoc = MemoryLocation::getForArgument(CS2, >> CS2ArgIdx, TLI); >> >> AliasAnalysis.cpp:228 >> >> It ignores every argument because they are vectors of pointers, not >> pointers. >> >> >> I'm surprised this has not broken anything before. It will never say two >> intrinsics with vectors of pointers mod/ref each other. >> >> >> On Mon, Aug 29, 2016 at 7:36 AM, Davide Italiano <davide at freebsd.org> >> wrote: >> >>> + Daniel Berlin >>> >>> On Mon, Aug 29, 2016 at 3:42 PM, Chris Sakalis via llvm-dev >>> <llvm-dev at lists.llvm.org> wrote: >>> > Hello everyone, >>> > >>> > I think I have found an gvn / alias analysis related bug, but before >>> opening >>> > an issue on the tracker I wanted to see if I am missing something. I >>> have >>> > the following testcase: >>> > >>> >> define spir_kernel void @test(<2 x i32*> %in1, <2 x i32*> %in2, i32* >>> %out) >>> >> { >>> >> entry: >>> >> ; Just some temporary storage >>> >> %tmp.0 = alloca i32 >>> >> %tmp.1 = alloca i32 >>> >> %tmp.i = insertelement <2 x i32*> undef, i32* %tmp.0, i32 0 >>> >> %tmp = insertelement <2 x i32*> %tmp.i, i32* %tmp.1, i32 1 >>> >> ; Read from in1 and in2 >>> >> %in1.v = call <2 x i32> @llvm.masked.gather.v2i32(<2 x i32*> %in1, >>> i32 >>> >> 1, <2 x i1> <i1 true, i1 true>, <2 x i32> undef) #1 >>> >> %in2.v = call <2 x i32> @llvm.masked.gather.v2i32(<2 x i32*> %in2, >>> i32 >>> >> 1, <2 x i1> <i1 true, i1 true>, <2 x i32> undef) #1 >>> >> ; Store in1 to the allocas >>> >> call void @llvm.masked.scatter.v2i32(<2 x i32> %in1.v, <2 x i32*> >>> %tmp, >>> >> i32 1, <2 x i1> <i1 true, i1 true>); >>> >> ; Read in1 from the allocas >>> >> %tmp.v.0 = call <2 x i32> @llvm.masked.gather.v2i32(<2 x i32*> >>> %tmp, i32 >>> >> 1, <2 x i1> <i1 true, i1 true>, <2 x i32> undef) #1 >>> >> ; Store in2 to the allocas >>> >> call void @llvm.masked.scatter.v2i32(<2 x i32> %in2.v, <2 x i32*> >>> %tmp, >>> >> i32 1, <2 x i1> <i1 true, i1 true>); >>> >> ; Read in2 from the allocas >>> >> %tmp.v.1 = call <2 x i32> @llvm.masked.gather.v2i32(<2 x i32*> >>> %tmp, i32 >>> >> 1, <2 x i1> <i1 true, i1 true>, <2 x i32> undef) #1 >>> >> ; Store in2 to out for good measure >>> >> %tmp.v.1.0 = extractelement <2 x i32> %tmp.v.1, i32 0 >>> >> %tmp.v.1.1 = extractelement <2 x i32> %tmp.v.1, i32 1 >>> >> store i32 %tmp.v.1.0, i32* %out >>> >> %out.1 = getelementptr i32, i32* %out, i32 1 >>> >> store i32 %tmp.v.1.1, i32* %out.1 >>> >> ret void >>> >> } >>> > >>> > >>> > It uses a masked scatter operation to store a value to the two allocas >>> and >>> > then uses a masked gather operation to read that same value. This is >>> done >>> > twice in a row, with two different values. If I run this code through >>> the >>> > GVN pass, the second gather (%tmp.v.1) will be deemed to be the same >>> as the >>> > first gather (%tmp.v.0) and it will be removed. After some debugging, I >>> > realized that this is happening because the Memory Dependence Analysis >>> > returns %tmp.v.0 as the Def dependency for %tmp.v.1, even though the >>> scatter >>> > call in between changes the value stored at %tmp. This, in turn, is >>> > happening because the alias analysis is returning NoModRef for the >>> %tmp.v.1 >>> > and second scatter callsites. The resulting IR produces the wrong >>> result: >>> > >>> >> define spir_kernel void @test(<2 x i32*> %in1, <2 x i32*> %in2, i32* >>> %out) >>> >> { >>> >> entry: >>> >> %tmp.0 = alloca i32 >>> >> %tmp.1 = alloca i32 >>> >> %tmp.i = insertelement <2 x i32*> undef, i32* %tmp.0, i32 0 >>> >> %tmp = insertelement <2 x i32*> %tmp.i, i32* %tmp.1, i32 1 >>> >> %in1.v = call <2 x i32> @llvm.masked.gather.v2i32(<2 x i32*> %in1, >>> i32 >>> >> 1, <2 x i1> <i1 true, i1 true>, <2 x i32> undef) #1 >>> >> %in2.v = call <2 x i32> @llvm.masked.gather.v2i32(<2 x i32*> %in2, >>> i32 >>> >> 1, <2 x i1> <i1 true, i1 true>, <2 x i32> undef) #1 >>> >> call void @llvm.masked.scatter.v2i32(<2 x i32> %in1.v, <2 x i32*> >>> %tmp, >>> >> i32 1, <2 x i1> <i1 true, i1 true>) >>> >> %tmp.v.0 = call <2 x i32> @llvm.masked.gather.v2i32(<2 x i32*> >>> %tmp, i32 >>> >> 1, <2 x i1> <i1 true, i1 true>, <2 x i32> undef) #1 >>> >> call void @llvm.masked.scatter.v2i32(<2 x i32> %in2.v, <2 x i32*> >>> %tmp, >>> >> i32 1, <2 x i1> <i1 true, i1 true>) >>> >> ; The call to masked.gather is gone and now we are using the old >>> value >>> >> (%tmp.v.0) >>> >> %tmp.v.1.0 = extractelement <2 x i32> %tmp.v.0, i32 0 >>> >> %tmp.v.1.1 = extractelement <2 x i32> %tmp.v.0, i32 1 >>> >> store i32 %tmp.v.1.0, i32* %out >>> >> %out.1 = getelementptr i32, i32* %out, i32 1 >>> >> store i32 %tmp.v.1.1, i32* %out.1 >>> >> ret void >>> >> } >>> > >>> > >>> > The old value read from %tmp is used, instead of the new one. I tested >>> this >>> > code using `opt -gvn`, with LLVM 3.8.1. I also tried tip (84cb7f4) >>> with the >>> > same result. >>> > >>> > I should mention that if I replace the second scatter with stores, the >>> issue >>> > persists. The only way to avoid it is to replace all scatter/gather >>> > intrinsics with equivalent sets of store/load, in which case the MemDep >>> > returns the correct dependencies and the GVN pass will not remove the >>> second >>> > set of loads. >>> > >>> > So, my question is, is this a bug or am I doing something that I >>> shouldn't >>> > be in the IR? And if it is a bug, is it the AA analyses that return the >>> > wrong result (I presume so) or should GVN handle such cases >>> differently? >>> > >>> > Best regards, >>> > >>> > Chris >>> > >>> > _______________________________________________ >>> > LLVM Developers mailing list >>> > llvm-dev at lists.llvm.org >>> > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev >>> > >>> >>> >>> >>> -- >>> Davide >>> >>> "There are no solved problems; there are only problems that are more >>> or less solved" -- Henri Poincare >>> >> >> >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20160829/db3748fd/attachment.html> -------------- next part -------------- diff --git a/lib/Analysis/AliasAnalysis.cpp b/lib/Analysis/AliasAnalysis.cpp index f931b6f..ee51067 100644 --- a/lib/Analysis/AliasAnalysis.cpp +++ b/lib/Analysis/AliasAnalysis.cpp @@ -155,7 +155,7 @@ ModRefInfo AAResults::getModRefInfo(ImmutableCallSite CS, if (doesAccessArgPointees(MRB)) { for (auto AI = CS.arg_begin(), AE = CS.arg_end(); AI != AE; ++AI) { const Value *Arg = *AI; - if (!Arg->getType()->isPointerTy()) + if (!Arg->getType()->isPtrOrPtrVectorTy()) continue; unsigned ArgIdx = std::distance(CS.arg_begin(), AI); MemoryLocation ArgLoc = MemoryLocation::getForArgument(CS, ArgIdx, TLI); @@ -224,7 +224,7 @@ ModRefInfo AAResults::getModRefInfo(ImmutableCallSite CS1, if (doesAccessArgPointees(CS2B)) { for (auto I = CS2.arg_begin(), E = CS2.arg_end(); I != E; ++I) { const Value *Arg = *I; - if (!Arg->getType()->isPointerTy()) + if (!Arg->getType()->isPtrOrPtrVectorTy()) continue; unsigned CS2ArgIdx = std::distance(CS2.arg_begin(), I); auto CS2ArgLoc = MemoryLocation::getForArgument(CS2, CS2ArgIdx, TLI); @@ -254,7 +254,7 @@ ModRefInfo AAResults::getModRefInfo(ImmutableCallSite CS1, if (doesAccessArgPointees(CS1B)) { for (auto I = CS1.arg_begin(), E = CS1.arg_end(); I != E; ++I) { const Value *Arg = *I; - if (!Arg->getType()->isPointerTy()) + if (!Arg->getType()->isPtrOrPtrVectorTy()) continue; unsigned CS1ArgIdx = std::distance(CS1.arg_begin(), I); auto CS1ArgLoc = MemoryLocation::getForArgument(CS1, CS1ArgIdx, TLI); @@ -458,7 +458,7 @@ ModRefInfo AAResults::callCapturesBefore(const Instruction *I, // Only look at the no-capture or byval pointer arguments. If this // pointer were passed to arguments that were neither of these, then it // couldn't be no-capture. - if (!(*CI)->getType()->isPointerTy() || + if (!(*CI)->getType()->isPtrOrPtrVectorTy() || (!CS.doesNotCapture(ArgNo) && !CS.isByValArgument(ArgNo))) continue;
Philip Reames via llvm-dev
2016-Aug-29 20:26 UTC
[llvm-dev] GVN / Alias Analysis issue with llvm.masked.scatter/gather intrinsics
We might have specification bug here, but we appear to implement what we specified. argmemonly is specified as only considering pointer typed arguments. It's behavior on vector-of-pointers is unspecified, but would seem to fall into the same case as inttoptr of an integer argument (i.e. explicitly undefined). We could consider changing that. On 08/29/2016 11:59 AM, Daniel Berlin wrote:> + a few others. > > After following this rabbit hole a bit, there are a lot of mutually > recursive calls, etc, that may or may not do the right thing with > vectors of pointers. > I can fix *this* particular bug with the attached patch. > > However, it's mostly papering over stuff. Nothing seems to know what > to do with a memorylocation that is a vector of pointers. They all > expect memorylocation to be a single pointer location. > > I would chalk it up to "luck" that this patch fixes the bug. > > It's pretty clear that MemoryLocation doesn't fit the needs of a lot > of stuff anymore (we hacked AA nodes into it, and lots of stuff now > tries to figure out the invariantess of the locations, blah blah > blah), but it seems like a big job to figure out what to replace it > with that will work for these cases. > > (I'm pretty positive if we just make it MemoryLocations, and have > everything loop over the locations, the compiler will get a lot larger > and a lot slower) > > On Mon, Aug 29, 2016 at 9:10 AM, Daniel Berlin <dberlin at dberlin.org > <mailto:dberlin at dberlin.org>> wrote: > > it would also, for that matter, say the same about an array of > pointers. > > It's not clear to me what will break if you change this to > isPtrOrPtrVectorTy. > > In fact, i know it doesn't fix this bug. > It's a pretty deep rabbit hole of things not quite prepared to > understand vectors of pointers. > > (we prepare memorylocations of them, but memory locations expect > to be one thing, not a group of things, etc). > > > > > On Mon, Aug 29, 2016 at 8:58 AM, Daniel Berlin > <dberlin at dberlin.org <mailto:dberlin at dberlin.org>> wrote: > > this is definitely a bug in AA. > > 225 for (auto I = CS2.arg_begin(), E = CS2.arg_end(); I > != E; ++I) { > 226 const Value *Arg = *I; > 227 if (!Arg->getType()->isPointerTy()) > -> 228 continue; > 229 unsigned CS2ArgIdx > std::distance(CS2.arg_begin(), I); > 230 auto CS2ArgLoc > MemoryLocation::getForArgument(CS2, CS2ArgIdx, TLI); > > AliasAnalysis.cpp:228 > > It ignores every argument because they are vectors of > pointers, not pointers. > > > I'm surprised this has not broken anything before. It will > never say two intrinsics with vectors of pointers mod/ref each > other. > > > On Mon, Aug 29, 2016 at 7:36 AM, Davide Italiano > <davide at freebsd.org <mailto:davide at freebsd.org>> wrote: > > + Daniel Berlin > > On Mon, Aug 29, 2016 at 3:42 PM, Chris Sakalis via llvm-dev > <llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>> > wrote: > > Hello everyone, > > > > I think I have found an gvn / alias analysis related > bug, but before opening > > an issue on the tracker I wanted to see if I am missing > something. I have > > the following testcase: > > > >> define spir_kernel void @test(<2 x i32*> %in1, <2 x > i32*> %in2, i32* %out) > >> { > >> entry: > >> ; Just some temporary storage > >> %tmp.0 = alloca i32 > >> %tmp.1 = alloca i32 > >> %tmp.i = insertelement <2 x i32*> undef, i32* %tmp.0, > i32 0 > >> %tmp = insertelement <2 x i32*> %tmp.i, i32* %tmp.1, > i32 1 > >> ; Read from in1 and in2 > >> %in1.v = call <2 x i32> @llvm.masked.gather.v2i32(<2 > x i32*> %in1, i32 > >> 1, <2 x i1> <i1 true, i1 true>, <2 x i32> undef) #1 > >> %in2.v = call <2 x i32> @llvm.masked.gather.v2i32(<2 > x i32*> %in2, i32 > >> 1, <2 x i1> <i1 true, i1 true>, <2 x i32> undef) #1 > >> ; Store in1 to the allocas > >> call void @llvm.masked.scatter.v2i32(<2 x i32> > %in1.v, <2 x i32*> %tmp, > >> i32 1, <2 x i1> <i1 true, i1 true>); > >> ; Read in1 from the allocas > >> %tmp.v.0 = call <2 x i32> > @llvm.masked.gather.v2i32(<2 x i32*> %tmp, i32 > >> 1, <2 x i1> <i1 true, i1 true>, <2 x i32> undef) #1 > >> ; Store in2 to the allocas > >> call void @llvm.masked.scatter.v2i32(<2 x i32> > %in2.v, <2 x i32*> %tmp, > >> i32 1, <2 x i1> <i1 true, i1 true>); > >> ; Read in2 from the allocas > >> %tmp.v.1 = call <2 x i32> > @llvm.masked.gather.v2i32(<2 x i32*> %tmp, i32 > >> 1, <2 x i1> <i1 true, i1 true>, <2 x i32> undef) #1 > >> ; Store in2 to out for good measure > >> %tmp.v.1.0 = extractelement <2 x i32> %tmp.v.1, i32 0 > >> %tmp.v.1.1 = extractelement <2 x i32> %tmp.v.1, i32 1 > >> store i32 %tmp.v.1.0, i32* %out > >> %out.1 = getelementptr i32, i32* %out, i32 1 > >> store i32 %tmp.v.1.1, i32* %out.1 > >> ret void > >> } > > > > > > It uses a masked scatter operation to store a value to > the two allocas and > > then uses a masked gather operation to read that same > value. This is done > > twice in a row, with two different values. If I run this > code through the > > GVN pass, the second gather (%tmp.v.1) will be deemed to > be the same as the > > first gather (%tmp.v.0) and it will be removed. After > some debugging, I > > realized that this is happening because the Memory > Dependence Analysis > > returns %tmp.v.0 as the Def dependency for %tmp.v.1, > even though the scatter > > call in between changes the value stored at %tmp. This, > in turn, is > > happening because the alias analysis is returning > NoModRef for the %tmp.v.1 > > and second scatter callsites. The resulting IR produces > the wrong result: > > > >> define spir_kernel void @test(<2 x i32*> %in1, <2 x > i32*> %in2, i32* %out) > >> { > >> entry: > >> %tmp.0 = alloca i32 > >> %tmp.1 = alloca i32 > >> %tmp.i = insertelement <2 x i32*> undef, i32* %tmp.0, > i32 0 > >> %tmp = insertelement <2 x i32*> %tmp.i, i32* %tmp.1, > i32 1 > >> %in1.v = call <2 x i32> @llvm.masked.gather.v2i32(<2 > x i32*> %in1, i32 > >> 1, <2 x i1> <i1 true, i1 true>, <2 x i32> undef) #1 > >> %in2.v = call <2 x i32> @llvm.masked.gather.v2i32(<2 > x i32*> %in2, i32 > >> 1, <2 x i1> <i1 true, i1 true>, <2 x i32> undef) #1 > >> call void @llvm.masked.scatter.v2i32(<2 x i32> > %in1.v, <2 x i32*> %tmp, > >> i32 1, <2 x i1> <i1 true, i1 true>) > >> %tmp.v.0 = call <2 x i32> > @llvm.masked.gather.v2i32(<2 x i32*> %tmp, i32 > >> 1, <2 x i1> <i1 true, i1 true>, <2 x i32> undef) #1 > >> call void @llvm.masked.scatter.v2i32(<2 x i32> > %in2.v, <2 x i32*> %tmp, > >> i32 1, <2 x i1> <i1 true, i1 true>) > >> ; The call to masked.gather is gone and now we are > using the old value > >> (%tmp.v.0) > >> %tmp.v.1.0 = extractelement <2 x i32> %tmp.v.0, i32 0 > >> %tmp.v.1.1 = extractelement <2 x i32> %tmp.v.0, i32 1 > >> store i32 %tmp.v.1.0, i32* %out > >> %out.1 = getelementptr i32, i32* %out, i32 1 > >> store i32 %tmp.v.1.1, i32* %out.1 > >> ret void > >> } > > > > > > The old value read from %tmp is used, instead of the new > one. I tested this > > code using `opt -gvn`, with LLVM 3.8.1. I also tried tip > (84cb7f4) with the > > same result. > > > > I should mention that if I replace the second scatter > with stores, the issue > > persists. The only way to avoid it is to replace all > scatter/gather > > intrinsics with equivalent sets of store/load, in which > case the MemDep > > returns the correct dependencies and the GVN pass will > not remove the second > > set of loads. > > > > So, my question is, is this a bug or am I doing > something that I shouldn't > > be in the IR? And if it is a bug, is it the AA analyses > that return the > > wrong result (I presume so) or should GVN handle such > cases differently? > > > > Best regards, > > > > Chris > > > > _______________________________________________ > > LLVM Developers mailing list > > llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org> > > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev > <http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev> > > > > > > -- > Davide > > "There are no solved problems; there are only problems > that are more > or less solved" -- Henri Poincare > > > >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20160829/6d72b852/attachment-0001.html>
Daniel Berlin via llvm-dev
2016-Aug-29 20:59 UTC
[llvm-dev] GVN / Alias Analysis issue with llvm.masked.scatter/gather intrinsics
Okay, so then it sounds like, for now, the right fix is to stop marking masked.gather and masked.scatter with intrarg* options. On Mon, Aug 29, 2016, 1:26 PM Philip Reames <listmail at philipreames.com> wrote:> We might have specification bug here, but we appear to implement what we > specified. argmemonly is specified as only considering pointer typed > arguments. It's behavior on vector-of-pointers is unspecified, but would > seem to fall into the same case as inttoptr of an integer argument (i.e. > explicitly undefined). We could consider changing that. > > > > On 08/29/2016 11:59 AM, Daniel Berlin wrote: > > + a few others. > > After following this rabbit hole a bit, there are a lot of mutually > recursive calls, etc, that may or may not do the right thing with vectors > of pointers. > I can fix *this* particular bug with the attached patch. > > However, it's mostly papering over stuff. Nothing seems to know what to > do with a memorylocation that is a vector of pointers. They all expect > memorylocation to be a single pointer location. > > I would chalk it up to "luck" that this patch fixes the bug. > > It's pretty clear that MemoryLocation doesn't fit the needs of a lot of > stuff anymore (we hacked AA nodes into it, and lots of stuff now tries to > figure out the invariantess of the locations, blah blah blah), but it seems > like a big job to figure out what to replace it with that will work for > these cases. > > (I'm pretty positive if we just make it MemoryLocations, and have > everything loop over the locations, the compiler will get a lot larger and > a lot slower) > > On Mon, Aug 29, 2016 at 9:10 AM, Daniel Berlin <dberlin at dberlin.org> > wrote: > >> it would also, for that matter, say the same about an array of pointers. >> >> It's not clear to me what will break if you change this to >> isPtrOrPtrVectorTy. >> >> In fact, i know it doesn't fix this bug. >> It's a pretty deep rabbit hole of things not quite prepared to understand >> vectors of pointers. >> >> (we prepare memorylocations of them, but memory locations expect to be >> one thing, not a group of things, etc). >> >> >> >> >> On Mon, Aug 29, 2016 at 8:58 AM, Daniel Berlin <dberlin at dberlin.org> >> wrote: >> >>> this is definitely a bug in AA. >>> >>> 225 for (auto I = CS2.arg_begin(), E = CS2.arg_end(); I != E; >>> ++I) { >>> 226 const Value *Arg = *I; >>> 227 if (!Arg->getType()->isPointerTy()) >>> -> 228 continue; >>> 229 unsigned CS2ArgIdx = std::distance(CS2.arg_begin(), I); >>> 230 auto CS2ArgLoc = MemoryLocation::getForArgument(CS2, >>> CS2ArgIdx, TLI); >>> >>> AliasAnalysis.cpp:228 >>> >>> It ignores every argument because they are vectors of pointers, not >>> pointers. >>> >>> >>> I'm surprised this has not broken anything before. It will never say two >>> intrinsics with vectors of pointers mod/ref each other. >>> >>> >>> On Mon, Aug 29, 2016 at 7:36 AM, Davide Italiano <davide at freebsd.org> >>> wrote: >>> >>>> + Daniel Berlin >>>> >>>> On Mon, Aug 29, 2016 at 3:42 PM, Chris Sakalis via llvm-dev >>>> <llvm-dev at lists.llvm.org> wrote: >>>> > Hello everyone, >>>> > >>>> > I think I have found an gvn / alias analysis related bug, but before >>>> opening >>>> > an issue on the tracker I wanted to see if I am missing something. I >>>> have >>>> > the following testcase: >>>> > >>>> >> define spir_kernel void @test(<2 x i32*> %in1, <2 x i32*> %in2, i32* >>>> %out) >>>> >> { >>>> >> entry: >>>> >> ; Just some temporary storage >>>> >> %tmp.0 = alloca i32 >>>> >> %tmp.1 = alloca i32 >>>> >> %tmp.i = insertelement <2 x i32*> undef, i32* %tmp.0, i32 0 >>>> >> %tmp = insertelement <2 x i32*> %tmp.i, i32* %tmp.1, i32 1 >>>> >> ; Read from in1 and in2 >>>> >> %in1.v = call <2 x i32> @llvm.masked.gather.v2i32(<2 x i32*> %in1, >>>> i32 >>>> >> 1, <2 x i1> <i1 true, i1 true>, <2 x i32> undef) #1 >>>> >> %in2.v = call <2 x i32> @llvm.masked.gather.v2i32(<2 x i32*> %in2, >>>> i32 >>>> >> 1, <2 x i1> <i1 true, i1 true>, <2 x i32> undef) #1 >>>> >> ; Store in1 to the allocas >>>> >> call void @llvm.masked.scatter.v2i32(<2 x i32> %in1.v, <2 x i32*> >>>> %tmp, >>>> >> i32 1, <2 x i1> <i1 true, i1 true>); >>>> >> ; Read in1 from the allocas >>>> >> %tmp.v.0 = call <2 x i32> @llvm.masked.gather.v2i32(<2 x i32*> >>>> %tmp, i32 >>>> >> 1, <2 x i1> <i1 true, i1 true>, <2 x i32> undef) #1 >>>> >> ; Store in2 to the allocas >>>> >> call void @llvm.masked.scatter.v2i32(<2 x i32> %in2.v, <2 x i32*> >>>> %tmp, >>>> >> i32 1, <2 x i1> <i1 true, i1 true>); >>>> >> ; Read in2 from the allocas >>>> >> %tmp.v.1 = call <2 x i32> @llvm.masked.gather.v2i32(<2 x i32*> >>>> %tmp, i32 >>>> >> 1, <2 x i1> <i1 true, i1 true>, <2 x i32> undef) #1 >>>> >> ; Store in2 to out for good measure >>>> >> %tmp.v.1.0 = extractelement <2 x i32> %tmp.v.1, i32 0 >>>> >> %tmp.v.1.1 = extractelement <2 x i32> %tmp.v.1, i32 1 >>>> >> store i32 %tmp.v.1.0, i32* %out >>>> >> %out.1 = getelementptr i32, i32* %out, i32 1 >>>> >> store i32 %tmp.v.1.1, i32* %out.1 >>>> >> ret void >>>> >> } >>>> > >>>> > >>>> > It uses a masked scatter operation to store a value to the two >>>> allocas and >>>> > then uses a masked gather operation to read that same value. This is >>>> done >>>> > twice in a row, with two different values. If I run this code through >>>> the >>>> > GVN pass, the second gather (%tmp.v.1) will be deemed to be the same >>>> as the >>>> > first gather (%tmp.v.0) and it will be removed. After some debugging, >>>> I >>>> > realized that this is happening because the Memory Dependence Analysis >>>> > returns %tmp.v.0 as the Def dependency for %tmp.v.1, even though the >>>> scatter >>>> > call in between changes the value stored at %tmp. This, in turn, is >>>> > happening because the alias analysis is returning NoModRef for the >>>> %tmp.v.1 >>>> > and second scatter callsites. The resulting IR produces the wrong >>>> result: >>>> > >>>> >> define spir_kernel void @test(<2 x i32*> %in1, <2 x i32*> %in2, i32* >>>> %out) >>>> >> { >>>> >> entry: >>>> >> %tmp.0 = alloca i32 >>>> >> %tmp.1 = alloca i32 >>>> >> %tmp.i = insertelement <2 x i32*> undef, i32* %tmp.0, i32 0 >>>> >> %tmp = insertelement <2 x i32*> %tmp.i, i32* %tmp.1, i32 1 >>>> >> %in1.v = call <2 x i32> @llvm.masked.gather.v2i32(<2 x i32*> %in1, >>>> i32 >>>> >> 1, <2 x i1> <i1 true, i1 true>, <2 x i32> undef) #1 >>>> >> %in2.v = call <2 x i32> @llvm.masked.gather.v2i32(<2 x i32*> %in2, >>>> i32 >>>> >> 1, <2 x i1> <i1 true, i1 true>, <2 x i32> undef) #1 >>>> >> call void @llvm.masked.scatter.v2i32(<2 x i32> %in1.v, <2 x i32*> >>>> %tmp, >>>> >> i32 1, <2 x i1> <i1 true, i1 true>) >>>> >> %tmp.v.0 = call <2 x i32> @llvm.masked.gather.v2i32(<2 x i32*> >>>> %tmp, i32 >>>> >> 1, <2 x i1> <i1 true, i1 true>, <2 x i32> undef) #1 >>>> >> call void @llvm.masked.scatter.v2i32(<2 x i32> %in2.v, <2 x i32*> >>>> %tmp, >>>> >> i32 1, <2 x i1> <i1 true, i1 true>) >>>> >> ; The call to masked.gather is gone and now we are using the old >>>> value >>>> >> (%tmp.v.0) >>>> >> %tmp.v.1.0 = extractelement <2 x i32> %tmp.v.0, i32 0 >>>> >> %tmp.v.1.1 = extractelement <2 x i32> %tmp.v.0, i32 1 >>>> >> store i32 %tmp.v.1.0, i32* %out >>>> >> %out.1 = getelementptr i32, i32* %out, i32 1 >>>> >> store i32 %tmp.v.1.1, i32* %out.1 >>>> >> ret void >>>> >> } >>>> > >>>> > >>>> > The old value read from %tmp is used, instead of the new one. I >>>> tested this >>>> > code using `opt -gvn`, with LLVM 3.8.1. I also tried tip (84cb7f4) >>>> with the >>>> > same result. >>>> > >>>> > I should mention that if I replace the second scatter with stores, >>>> the issue >>>> > persists. The only way to avoid it is to replace all scatter/gather >>>> > intrinsics with equivalent sets of store/load, in which case the >>>> MemDep >>>> > returns the correct dependencies and the GVN pass will not remove the >>>> second >>>> > set of loads. >>>> > >>>> > So, my question is, is this a bug or am I doing something that I >>>> shouldn't >>>> > be in the IR? And if it is a bug, is it the AA analyses that return >>>> the >>>> > wrong result (I presume so) or should GVN handle such cases >>>> differently? >>>> > >>>> > Best regards, >>>> > >>>> > Chris >>>> > >>>> > _______________________________________________ >>>> > LLVM Developers mailing list >>>> > llvm-dev at lists.llvm.org >>>> > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev >>>> > >>>> >>>> >>>> >>>> -- >>>> Davide >>>> >>>> "There are no solved problems; there are only problems that are more >>>> or less solved" -- Henri Poincare >>>> >>> >>> >> > >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20160829/0db527f5/attachment.html>
Maybe Matching Threads
- GVN / Alias Analysis issue with llvm.masked.scatter/gather intrinsics
- GVN / Alias Analysis issue with llvm.masked.scatter/gather intrinsics
- GVN / Alias Analysis issue with llvm.masked.scatter/gather intrinsics
- GVN / Alias Analysis issue with llvm.masked.scatter/gather intrinsics
- GVN / Alias Analysis issue with llvm.masked.scatter/gather intrinsics