Michael Kuperstein via llvm-dev
2019-Oct-31 21:46 UTC
[llvm-dev] Questions about type based "createBuildVecShuffle()" in DAG Combiner
Hi Wei, I haven't worked on LLVM codegen for several years now, so you'll probably want to talk to someone who's been involved with it more recently. Sending to llvm-dev@ was the right call. +llvm-dev@ back in. Having said that: Those are the semantics of the SHUFFLE_VECTOR node, so the code is doing the right thing. See ISDOpcodes.h: /// VECTOR_SHUFFLE(VEC1, VEC2) - Returns a vector, of the same type as /// VEC1/VEC2. A VECTOR_SHUFFLE node also contains an array of constant int /// values that indicate which value (or undef) each result element will /// get. These constant ints are accessible through the /// ShuffleVectorSDNode class. This is quite similar to the Altivec /// 'vperm' instruction, except that the indices must be constants and are /// in terms of the element size of VEC1/VEC2, not in terms of bytes. If you're asking why SHUFFLE_VECTOR was defined this way - I don't know. This goes back to the very early days of LLVM and SelectionDAG (see https://reviews.llvm.org/rL26879). The reference to Altivec in the above comment is a good hint of how it ended up like this. In any case, this does not necessarily make the generated code less efficient - this is just an intermediate representation. Every target supports its own (possibly very restricted, or very wide) set of vector operations, and targets try to pattern-match complex DAG patterns to their instructions. So you can get a target shuffle-like instruction even if the target-independent DAG does not have a SHUFFLE_VECTOR. Michael On Thu, Oct 31, 2019 at 2:14 PM Wei Zhao <wxz at marvell.com> wrote:> Hi Michael, > > > > Found your email from LLVM git log. Here is the message I posted to > llvm-dev. Thought this is a quicker way to get an answer. > > > > Thanks, > > > > Wei > > =============================> > In LLVM DAG Combiner, DAGCombiner::createBuildVecShuffle() is type based. > > > > DAGCombiner.cpp, > > │17184 // We can't generate a shuffle node with mismatched input > and output types. > > │17185 // Try to make the types match the type of the output. > > > > 1. The codes following the above comment are trying to do a matching > job between the input vectors and the output vector. Why the code is based > on the assumption that only matched type can be allowed to do a vector > shuffle? > > A shuffle takes some fields of data from the input vector and > reassembles them in the output vector. It is purely a data movement > operation. The input vector is the container for the source data, and the > output vector is the container for the resulting data. Why these two > containers have to have the same vector type? > > For example, > > VT’s type: v2i16 > > VecIn1 and VecIn2’s type: v4i16 > > We take two i16 elements, each from VecIn1 and VecIn2 separately. With > the current code, because of their type difference, there will be no > shuffle generated > > > > The requirement to create a shuffle operation should be: the capacity > (SizeInBits) of the output vector can hold all the extracted data from the > input vector container > > So as long as the total SizeInBits of the input data extracted from the > input vectors does not exceed the total SizeInBits of the out vector, the > shuffle should be allowed to create. Sure there are some other checks > needed like indexes cannot be the same to avoid two data being placed in > the same position. > > > > 1. Another inconsistence is that the split of the vector right before > the createBuildVecShuffle() > > > > DAGCombiner.cpp, > > │17436 // If all the Operands of BUILD_VECTOR extract from > same > > │17437 // vector, then split the vector efficiently based on the > maximum > > │17438 // vector access index and adjust the VectorMask and > > > │17439 // VecIn accordingly. > > > > This split will create a new vector type which most likely will not be the > same as the output vector type. For example, if the previous vector input > container and output container both have a type v8i16, after splitting, the > input vector will have type v4i16, again this will cause no shuffle being > created later by the type based createBuildVecShuffle(), missing some > shuffle operations. This type based shuffle node creation makes many > optimization error-prone. > > > > Looks like the input/output container type based approach to create a > shuffle node will miss some shuffle operations which makes the generated > code less efficient. > > > > Any comment why it was first designed like this? > > >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20191031/216bd061/attachment.html>