Suyog Kamal Sarda
2015-Jan-05 12:09 UTC
[LLVMdev] Crash in SLP for vector data type as function argument.
Hi all,

I came across a crash in SLP vectorization while testing the following code for AArch64:

    int foo(uint32x4_t a) {
      return a[0] + a[1] + a[2] + a[3];
    }

The LLVM IR for the above code is:

    define i32 @hadd(<4 x i32> %a) {
    entry:
      %vecext = extractelement <4 x i32> %a, i32 0
      %vecext1 = extractelement <4 x i32> %a, i32 1
      %add = add i32 %vecext, %vecext1
      %vecext2 = extractelement <4 x i32> %a, i32 2
      %add3 = add i32 %add, %vecext2
      %vecext4 = extractelement <4 x i32> %a, i32 3
      %add5 = add i32 %add3, %vecext4
      ret i32 %add5
    }

I try to recognize this pattern and vectorize it using the existing code for horizontal reductions (I just recognize the pattern and fill in the data; the rest is done by the already existing code. I do the pattern matching very badly, but that's a different story).

Please note that everything that follows happens in existing code; I haven't modified any of it.

Once the pattern is recognized, we call "tryToReduce()", where we try to vectorize the tree by calling "vectorizeTree()", which returns the root of the vectorized tree. Then we emit the reduction by calling "emitReduction()", which takes the root of the vector tree as its argument. Inside "emitReduction()", we cast the root of the tree to an instruction.

In the above case, while setting the root of the vectorized tree, an extractelement instruction is encountered and its 0th operand is set as the root of the tree, which here is "%a". However, "%a" is a function argument, not an instruction, so when we cast it to an instruction in "emitReduction()" the cast returns nullptr, which causes a crash later when it is dereferenced.

Take a second case, where the vector is in global scope:

    uint32x4_t a;
    int foo() {
      return a[0] + a[1] + a[2] + a[3];
    }

The IR for the above code is:

    @a = common global <4 x i32> zeroinitializer, align 16

    define i32 @hadd() #0 {
    entry:
      %0 = load <4 x i32>* @a, align 16, !tbaa !1
      %vecext = extractelement <4 x i32> %0, i32 0
      %vecext1 = extractelement <4 x i32> %0, i32 1
      %add = add i32 %vecext, %vecext1
      %vecext2 = extractelement <4 x i32> %0, i32 2
      %add3 = add i32 %add, %vecext2
      %vecext4 = extractelement <4 x i32> %0, i32 3
      %add5 = add i32 %add3, %vecext4
      ret i32 %add5
    }

In this case the 0th operand of the extractelement instructions, %0, is a load instruction, so the cast to an instruction succeeds and everything runs smoothly afterwards.

Can someone please suggest how to resolve this? Is there something I am missing, or is it a basic problem with the IR itself?

Regards,
Suyog
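The failure mode described in this message can be illustrated outside LLVM with a toy class hierarchy. The sketch below is not LLVM code: `Value`, `Argument`, `Instruction`, `asInstruction`, and `insertionPointFor` are hypothetical stand-ins, and `dynamic_cast` plays the role of a checked cast that yields a null pointer when the value is not an instruction. It only shows why a function argument such as `%a` needs a null check (or a fallback) before it is treated as an instruction.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Toy stand-ins for an IR class hierarchy (hypothetical, not LLVM's classes).
struct Value {
  std::string name;
  explicit Value(std::string n) : name(std::move(n)) {}
  virtual ~Value() = default;
};
struct Argument : Value { using Value::Value; };    // e.g. %a in @hadd(<4 x i32> %a)
struct Instruction : Value { using Value::Value; }; // e.g. %0 = load ...

// Returns the value as an Instruction, or nullptr if it is something else.
const Instruction *asInstruction(const Value *v) {
  return dynamic_cast<const Instruction *>(v);
}

// Describes where follow-up code would be placed for a given tree root.
std::string insertionPointFor(const Value *root) {
  if (const Instruction *inst = asInstruction(root))
    return "after instruction " + inst->name;
  // Not an instruction (for example a function argument): fall back
  // instead of dereferencing a null pointer.
  return "at function entry (root " + root->name + " is not an instruction)";
}
```

With an `Argument` named `%a`, `asInstruction` returns `nullptr` and `insertionPointFor` takes the fallback branch; with an `Instruction` named `%0` it takes the first branch. Whether a fallback like this is the right fix in the vectorizer itself is a separate question that this sketch does not answer.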
Shahid, Asghar-ahmad
2015-Jan-07 11:05 UTC
[LLVMdev] Crash in SLP for vector data type as function argument.
Hi Suyog,

IMO emitReduction() takes a vectorized value that corresponds to the leaves of the matched pattern/tree. So what you are thinking of as the root is actually a leaf of the tree. The root should be the value that is fed to the "return" statement.

It would be a great help if you could share the sample test.

Regards,
Shahid

> -----Original Message-----
> From: llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at cs.uiuc.edu] On Behalf Of Suyog Kamal Sarda
> Sent: Monday, January 05, 2015 5:40 PM
> To: nrotem at apple.com; aschwaighofer at apple.com; mzolotukhin at apple.com; james.molloy at arm.com
> Cc: llvmdev at cs.uiuc.edu
> Subject: [LLVMdev] Crash in SLP for vector data type as function argument.
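To make the root/leaf distinction concrete: in the scalar IR, the `extractelement` values are the leaves of the add chain and `%add5`, the value fed to `ret`, is its root. A horizontal reduction replaces that chain with operations on the whole vector. The sketch below, in plain C++ rather than LLVM IR, shows one common way to do that, a log-step shuffle-and-add. `Vec4`, `scalarChain`, and `horizontalAdd` are hypothetical names, and the log-step scheme is an assumption about the general technique, not a copy of what `emitReduction()` generates.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

using Vec4 = std::array<std::uint32_t, 4>; // stand-in for <4 x i32>

// The scalar chain from the IR: the extracted lanes are the leaves,
// the final sum (%add5) is the root.
std::uint32_t scalarChain(const Vec4 &a) {
  std::uint32_t add = a[0] + a[1];
  std::uint32_t add3 = add + a[2];
  std::uint32_t add5 = add3 + a[3];
  return add5;
}

// Log-step reduction: each step adds the upper half of the live lanes
// onto the lower half, halving the width until lane 0 holds the sum.
std::uint32_t horizontalAdd(Vec4 v) {
  for (std::size_t width = v.size() / 2; width != 0; width /= 2) {
    for (std::size_t lane = 0; lane < width; ++lane)
      v[lane] += v[lane + width]; // models a shuffle followed by a vector add
  }
  return v[0]; // models extracting element 0 of the reduced vector
}
```

Both functions compute the same sum for any input, because unsigned addition is associative and commutative even when it wraps; the vector form simply takes the whole vector as its single input instead of four extracted scalars.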