Arnold Schwaighofer
2015-Jan-08 15:31 UTC
[LLVMdev] Crash in SLP for vector data type as function argument.
I am fine with this going in without a test case. This code is currently not enabled. Just mention in the commit message that the assumption that "VectorizedValue" is always an instruction is not correct because we could get a vectorized tree consisting of just a Constant or an Argument vector. Thanks On 01/08/15, Suyog Kamal Sarda wrote:> Hi Arnold, Shahid, > > Thanks for reply and suggestions. > > I would go with Arnold's suggestion not to cast VectorizedValue to Instruction and use it as it is. > > I can commit patch on this (trivial change in assignment to TempVec). > > However, this issue hits only when we have http://reviews.llvm.org/D6818 properly committed. > It will take some time for me to improve that patch. > > This crash issue doesn't have anything to do with above patch code, so I am not getting how to put a test case for this. > > Can I simply commit it without test case? If yes, should I go ahead with the commit after check-all without putting it for approval? > Please suggest a proper way to do this. > > Regards, > Suyog > > ------- Original Message ------- > Sender : Arnold Schwaighofer<aschwaighofer at apple.com> > Date : Jan 08, 2015 02:05 (GMT+09:00) > Title : Re: [LLVMdev] Crash in SLP for vector data type as function argument. > > The code in emitReduction has to be fixed. As your example shows it is not safe to assume we will always have an instruction as a result of vectorizeTree(). It seems to me that we can just remove the line that performs the cast. All subsequent uses of the value ‘ValToReduce’ actually are uses of “Value *TmpVec”. The IRBuilder in the variable “Builder” carries the insertion point for all operations in this function (inserting after the instruction “ValToReduce” would be a reason why we need an “Instruction”). > > /// \brief Emit a horizontal reduction of the vectorized value. > Value *emitReduction(Value *VectorizedValue, IRBuilder<> &Builder) { > assert(VectorizedValue && "Need to have a vectorized tree node"); > Instruction *ValToReduce = dyn_cast(VectorizedValue); > assert(isPowerOf2_32(ReduxWidth) && > "We only handle power-of-two reductions for now"); > > Value *TmpVec = ValToReduce; > for (unsigned i = ReduxWidth / 2; i != 0; i >>= 1) { > if (IsPairwiseReduction) { > Value *LeftMask > createRdxShuffleMask(ReduxWidth, i, true, true, Builder); > Value *RightMask > createRdxShuffleMask(ReduxWidth, i, true, false, Builder); > > Value *LeftShuf = Builder.CreateShuffleVector( > TmpVec, UndefValue::get(TmpVec->getType()), LeftMask, "rdx.shuf.l"); > Value *RightShuf = Builder.CreateShuffleVector( > TmpVec, UndefValue::get(TmpVec->getType()), (RightMask), > "rdx.shuf.r"); > TmpVec = createBinOp(Builder, ReductionOpcode, LeftShuf, RightShuf, > "bin.rdx"); > } else { > Value *UpperHalf > createRdxShuffleMask(ReduxWidth, i, false, false, Builder); > Value *Shuf = Builder.CreateShuffleVector( > TmpVec, UndefValue::get(TmpVec->getType()), UpperHalf, "rdx.shuf"); > TmpVec = createBinOp(Builder, ReductionOpcode, TmpVec, Shuf, "bin.rdx"); > } > } > > Thanks, > Arnold > > > On Jan 7, 2015, at 3:34 AM, Suyog Kamal Sarda wrote: > > > > Hi Shahid, > > > > Thanks for the reply. > > > > Actually, yes, the emitreduction() takes vectorizedvalue which is leaf of the tree. ' > > I got confused by the name of the argument passed while calling emitReduction(). > > > > Value *ReducedSubTree = emitReduction(VectorizedRoot, Builder) > > > > Anyways, that should hardly matter. > > > > I had mentioned the test case : > > > > int foo(uint32x4_t a) { > > return a[0] + a[1] + a[2] + a[3]; > > } > > > > LLVM IR : > > > > define i32 @hadd(<4 x i32> %a) { > > entry: > > %vecext = extractelement <4 x i32> %a, i32 0 > > %vecext1 = extractelement <4 x i32> %a, i32 1 > > %add = add i32 %vecext, %vecext1 > > %vecext2 = extractelement <4 x i32> %a, i32 2 > > %add3 = add i32 %add, %vecext2 > > %vecext4 = extractelement <4 x i32> %a, i32 3 > > %add5 = add i32 %add3, %vecext4 > > ret i32 %add5 > > } > > > > Now, when leaf %vecext is reached, the vectorizeTree() function call sets the VectorizedValue to 0th operand of extractelement instruction. > > > > case Instruction::ExtractElelement: { > > if(CanReuseExtract(E->Scalars)) { > > Value *V = VL0->getOperand(0); > > E->VectorizedValue = V; > > return V; > > } > > return Gather(E->Scalars, VecTy); > > } > > > > Now in emitReduction(), the VectorizedValue is dyn_cast to Instruction. > > In above IR, %a is not an instruction (function argument), hence while referring the casted value which is null, > > crash occurs. > > > > Instruction *ValToReduce = dyn_cast(VectorizedValue); > > > > Note : The above test case won't crash with current svn version, since code for parsing the tree for above IR is yet > > to be included in svn. Initial patch was submitted in http://reviews.llvm.org/D6818. > > I am working on refining it, however, the above code flow is not disturbed at all in my patch of parsing. > > You can try to reproduce the problem by importing above patch in local code. > > > > When the vector data type 'a' is in global scope, a 'load' instruction is generated in basic block of the function: > > > > test case 2: > > > > unint32x4_t a; > > int foo() { > > return a[0] + a[1] + a[2] + a[3]; > > } > > > > IR for above test case : > > > > @a = common global <4 x i32> zeroinitializer, align 16 > > > > define i32 @hadd() #0 { > > entry: > > %0 = load <4 x i32>* @a, align 16, !tbaa !1 > > %vecext = extractelement <4 x i32> %0, i32 0 > > %vecext1 = extractelement <4 x i32> %0, i32 1 > > %add = add i32 %vecext, %vecext1 > > %vecext2 = extractelement <4 x i32> %0, i32 2 > > %add3 = add i32 %add, %vecext2 > > %vecext4 = extractelement <4 x i32> %0, i32 3 > > %add5 = add i32 %add3, %vecext4 > > ret i32 %add5 > > } > > > > Now, since here, 0th operand of leaf %vecext is a load instruction, > > the dyn_casting into an instruction will succeed here and reduction will be emitted properly. > > > > How can we solve this problem? What type of casting should a function argument belong to? > > > > Regards, > > Suyog > > > > > > > > ------- Original Message ------- > > Sender : Shahid, Asghar-ahmad > > Date : Jan 07, 2015 20:05 (GMT+09:00) > > Title : RE: [LLVMdev] Crash in SLP for vector data type as function argument. > > > > Hi Suyog, > > > > IMO emitReduction() takes a vectorized value which is the leafs of the matched pattern/tree. > > So what you are thinking as root is actually the leaf of the tree. > > Root should actually be the value which is being feed to the "return" statement. > > > > It would be of great help if you could, share the sample test? > > > > Regards, > > Shahid > > > >> -----Original Message----- > >> From: llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at cs.uiuc.edu](javascript:main.compose() > >> On Behalf Of Suyog Kamal Sarda > >> Sent: Monday, January 05, 2015 5:40 PM > >> To: nrotem at apple.com; aschwaighofer at apple.com; > >> mzolotukhin at apple.com; james.molloy at arm.com > >> Cc: llvmdev at cs.uiuc.edu > >> Subject: [LLVMdev] Crash in SLP for vector data type as function argument. > >> > >> Hi all, > >> > >> Came across a crash in SLP vectorization while testing following code for > >> AArch64 : > >> > >> int foo(uint32x4_t a) { > >> return a[0] + a[1] + a[2] + a[3]; > >> } > >> > >> The LLVM IR for above code will be: > >> > >> define i32 @hadd(<4 x i32> %a) { > >> entry: > >> %vecext = extractelement <4 x i32> %a, i32 0 > >> %vecext1 = extractelement <4 x i32> %a, i32 1 > >> %add = add i32 %vecext, %vecext1 > >> %vecext2 = extractelement <4 x i32> %a, i32 2 > >> %add3 = add i32 %add, %vecext2 > >> %vecext4 = extractelement <4 x i32> %a, i32 3 > >> %add5 = add i32 %add3, %vecext4 > >> ret i32 %add5 > >> } > >> > >> I somehow try to recognize this pattern and try to vectorize it using existing > >> code for horizontal reductions (I just recognize the pattern and fill up the > >> data, rest is done by already existing code. > >> I do pattern matching very badly though, but that's a different story). > >> > >> > >> Please note that whatever follows is with existing code, I haven't modified > >> any bit of it. > >> > >> Now, once the pattern is recognized, we call "trytoReduce()" where we try > >> to vectorize tree by function call "vectorizeTree()" which returns root of the > >> vectorized tree. Then we emit the reduction using call "emitRedcution()" > >> which takes the root of the vector tree as argument. Inside > >> "emitReduction()", we cast root of the tree into an instruction. > >> > >> Now, for above case, while setting the root of the vectorized tree, > >> extractelement instruction is encountered, and its 0th operand is set as the > >> root of the tree, which in above case is "%a". However, this is not an > >> instruction and hence, when we cast it into an instruction in > >> "emitReduction()" code, it returns nullptr which causes a crash ahead when > >> referencing it. > >> > >> Take a second case where the vector data type is in global scope. > >> > >> unint32x4_t a; > >> int foo() { > >> return a[0] + a[1] + a[2] + a[3]; > >> } > >> > >> The IR for above code is: > >> > >> @a = common global <4 x i32> zeroinitializer, align 16 > >> > >> define i32 @hadd() #0 { > >> entry: > >> %0 = load <4 x i32>* @a, align 16, !tbaa !1 > >> %vecext = extractelement <4 x i32> %0, i32 0 > >> %vecext1 = extractelement <4 x i32> %0, i32 1 > >> %add = add i32 %vecext, %vecext1 > >> %vecext2 = extractelement <4 x i32> %0, i32 2 > >> %add3 = add i32 %add, %vecext2 > >> %vecext4 = extractelement <4 x i32> %0, i32 3 > >> %add5 = add i32 %add3, %vecext4 > >> ret i32 %add5 > >> } > >> > >> Now in above case, 0th operand of extractelement %0 is a load instruction, > >> and hence it doesn't crash while casting into an instruction and runs smoothly > >> further. > >> > >> Can someone please suggest how to resolve this? Is there something I am > >> missing or is it a basic problem with IR itself ? > >> > >> Regards, > >> Suyog > >> > >> > >> _______________________________________________ > >> LLVM Developers mailing list > >> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > >> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev