Martin J. O'Riordan
2015-Feb-16 15:25 UTC
[LLVMdev] ISD::VAARG and DAGTypeLegalizer::WidenVectorResult
I am trying to debug a crash (llvm_unreachable) where 'DAGTypeLegalizer::WidenVectorResult' is being called with an SDNode of type 'ISD::VAARG' for a vector of type 'v2i32'. This node type is not handled in this function, so I am assuming something is broken even earlier.

The target does not support 'v2i32' as a native vector type, but it does support a native 'v4i32'. I have configured 'MYTargetLowering::getPreferredVectorAction' to return 'TypeWidenVector', but this does not seem to help. I have also tried various approaches in 'MYTargetLowering::ReplaceNodeResults', but this doesn't help either.

It looks like the type should have already been widened by the time it gets to 'DAGTypeLegalizer::WidenVectorResult', but I cannot see why this is not happening. It is also strange that a VAARG node is getting this far.

The caller side pushes the 'v2i32' on the stack as a 'v4i32' as expected (with two elements of rubbish), but when 'va_arg' is used in the callee for the type 'v2i32' it fails in this way. This happens for all vectors that do not match a native vector, so it applies both to vectors that should be widened and to vectors that should be split.

Passing vectors to variadic functions is probably a fairly strange thing to do, but it occurs in the OpenCL variant of 'printf'.

Has anybody seen this kind of problem before, and does anybody have recommendations I could use to identify the cause so that I can figure out where best to fix it?

Thanks,
Martin O'Riordan (Movidius Ltd.)
Martin J. O'Riordan
2015-Feb-20 13:37 UTC
[LLVMdev] ISD::VAARG and DAGTypeLegalizer::WidenVectorResult
I have been having trouble getting 'va_arg' to work with vector parameters. It seems to work fine for each of the native vector types in our processor (128-bit vectors of 8, 16 and 32 bit elements). But if I throw something like 'v2i32' into the mix I get a crash in the type legalizer. This is all with respect to the v3.5 release, but the same seems to be the case in the pending v3.6 release.

Examining how other targets do this doesn't seem to show that I am handling VAARG any differently, and I have registered a custom lowering action for ISD::VAARG. However, the crash occurs a lot earlier than lowering. What I can't tell for sure is whether this is just absent functionality in LLVM, or whether I have missed some hook or TD description necessary to make this work.

In C an example that fails is:

    typedef int __attribute__((ext_vector_type(2))) int2;
    ...
    int2 vt = va_arg(vl, int2);

This results in LLVM IR that dumps as:

    %1 = va_arg i8** %vl, <2 x i32>

But if I rewrite the C as:

    typedef int __attribute__((ext_vector_type(2))) int2; // v2i32 is not natively supported
    typedef int __attribute__((ext_vector_type(4))) int4; // v4i32 is natively supported
    ...
    int2 vt = (int2)va_arg(vl, int4).s01; // Or '.xy'

it all works exactly as expected. This form results in the following IR:

    %1 = va_arg i8** %vl, <4 x i32>
    %2 = shufflevector <4 x i32> %1, <4 x i32> undef, <2 x i32> <i32 0, i32 1>

The problem is harder when it is a 'v3i32' because there is no 'MVT::v3i32' declaration.

Looking at how the scalar 'va_arg's are legalized for inspiration, integers use 'ExpandRes_VAARG' and 'PromoteIntRes_VAARG', while floating-point uses 'ExpandRes_VAARG' and 'SoftenFloatRes_VAARG', but there is no equivalent handling for vectors. I thought I'd add handlers for vectors to see if I could achieve the IR transformation above, but still haven't had any luck.
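To illustrate why the rewritten form (read a full 'int4', keep '.s01') behaves as expected, here is a minimal standalone C++ sketch. It does not use LLVM or a real 'va_list'; it only models the argument area as a byte buffer. The names 'Int2', 'Int4', 'pushWidened' and 'readNarrowed' are hypothetical, and the assumption that each vector argument occupies one full 'v4i32'-sized slot is taken from the caller behaviour described in the earlier message, not from any ABI document.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical stand-ins for the target's vector types.
using Int2 = std::array<std::int32_t, 2>;  // models v2i32 (not native)
using Int4 = std::array<std::int32_t, 4>;  // models v4i32 (native)

// Caller side (assumed behaviour): widen the v2i32 to a full v4i32 slot and
// append it to the argument area. The upper two lanes are don't-care values.
void pushWidened(std::vector<unsigned char> &area, const Int2 &v) {
  Int4 slot = {v[0], v[1], -1, -1};  // -1 stands in for "rubbish" lanes
  const std::size_t off = area.size();
  area.resize(off + sizeof(Int4));
  std::memcpy(area.data() + off, slot.data(), sizeof(Int4));
}

// Callee side: consume one full v4i32 slot, advance the cursor past it, and
// keep only lanes 0 and 1, mirroring 'va_arg(vl, int4).s01'.
Int2 readNarrowed(const std::vector<unsigned char> &area, std::size_t &cursor) {
  assert(cursor + sizeof(Int4) <= area.size() && "read past argument area");
  Int4 slot;
  std::memcpy(slot.data(), area.data() + cursor, sizeof(Int4));
  cursor += sizeof(Int4);
  return {slot[0], slot[1]};
}
```

The point of the model is that the callee always advances by the size of the widened slot, so a following argument is still found at the right offset even though only two lanes are kept.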
My attempted resolution for widening follows this message, but I have not yet attempted a solution for the splitting variant 'DAGTypeLegalizer::SplitVecRes_VAARG'.

So am I just not seeing the wood for the trees, or is this scenario just not yet implemented in LLVM?

Thanks in advance for any insights,

Martin O'Riordan - Movidius Ltd.

====== My Attempt at Resolving This =====

To 'class DAGTypeLegalizer' in 'LegalizeTypes.h' I added the declaration:

| SDValue WidenVecRes_VAARG(SDNode *N);

and to 'DAGTypeLegalizer::WidenVectorResult()' in 'LegalizeVectorTypes.cpp' I added a case:

| case ISD::VAARG: Res = WidenVecRes_VAARG(N); break;

Finally I implemented 'DAGTypeLegalizer::WidenVecRes_VAARG()' in 'LegalizeVectorTypes.cpp' as follows:

| SDValue DAGTypeLegalizer::WidenVecRes_VAARG(SDNode *N) {
|   assert(N->getValueType(0).isVector() && "Operand must be a vector");
| #ifndef NDEBUG
|   dbgs() << "DAGTypeLegalizer::WidenVecRes_VAARG: Before widening:\n>>>> ";
|   N->dump(&DAG);
|   dbgs() << "\n";
| #endif
|   SDValue Chain = N->getOperand(0); // Get the chain
|   SDValue Ptr = N->getOperand(1);   // Get the pointer
|   EVT VT = N->getValueType(0);      // Get the requested type
|   EVT WidenVT = TLI.getTypeToTransformTo(*DAG.getContext(), N->getValueType(0));
|   SDLoc dl(N);
|   MVT RegVT = TLI.getRegisterType(*DAG.getContext(), VT);
|
|   // Construct a VAARG chain with the replacement type
|   Chain = DAG.getVAArg(RegVT, dl, Chain, Ptr, N->getOperand(2),
|                        N->getConstantOperandVal(3));
|
|   // Now use a vector shuffle to extract the elements required from the
|   // widened vector. Create a mask of the elements to select from the vector
|   const unsigned NumElts = VT.getVectorNumElements();
|   const unsigned WidenNumElts = WidenVT.getVectorNumElements();
|
|   // Adjust mask based on new input vector length
|   SmallVector<int, 16> NewMask;
|   for (unsigned i = 0; i != NumElts; ++i)
|     NewMask.push_back(i);  // Select this element
|   for (unsigned i = NumElts; i != WidenNumElts; ++i)
|     NewMask.push_back(-1); // Set this element to undefined
|
|   SDValue Res = DAG.getVectorShuffle(WidenVT, dl, Chain, DAG.getUNDEF(WidenVT),
|                                      NewMask.data());
|
|   // Modified the chain result - switch anything that used the old chain to
|   // use the new one
|   ReplaceValueWith(SDValue(N, 1), Chain);
|
| #ifndef NDEBUG
|   dbgs() << "DAGTypeLegalizer::WidenVecRes_VAARG: After widening:\n >>>> ";
|   N->dump(&DAG);
|   dbgs() << "\n";
| #endif
|
| #ifndef MARTINO
|   llvm_unreachable("vector 'va_arg' widening is not yet implemented");
| #endif
|
|   return Res;
| }
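
The mask construction in the attempt above can be exercised on its own, outside LLVM. The following is a minimal standalone sketch of just those two loops, using 'std::vector<int>' in place of 'SmallVector<int, 16>'. The function name 'buildWidenMask' is hypothetical, and nothing here touches the SelectionDAG; it only shows which lane indices the mask would contain.

```cpp
#include <cassert>
#include <vector>

// Build a shuffle mask that keeps the first NumElts lanes of a wider vector
// and marks the remaining lanes as undefined (-1), following the two loops
// in the attempt above.
std::vector<int> buildWidenMask(unsigned NumElts, unsigned WidenNumElts) {
  assert(NumElts <= WidenNumElts && "widened type must not be narrower");
  std::vector<int> Mask;
  Mask.reserve(WidenNumElts);
  for (unsigned i = 0; i != NumElts; ++i)
    Mask.push_back(static_cast<int>(i));  // Select this element
  for (unsigned i = NumElts; i != WidenNumElts; ++i)
    Mask.push_back(-1);                   // Leave this lane undefined
  return Mask;
}
```

For the 'v2i32' to 'v4i32' case this gives the lanes 0, 1, undefined, undefined; for a three-element vector widened to four lanes only the last lane is undefined. In LLVM shuffle masks a negative index denotes an undefined lane, which is why -1 is used for the padding.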