Rob Cameron
2014-May-18 03:26 UTC
[LLVMdev] Legalizing v32i1, v64i1 for Haswell pext/pdep instructions
I have a group of students working with me on some LLVM projects related to our Parabix research. One interesting issue that has come up for us is code generation support for the Haswell new instructions pext and pdep. These instructions shuffle bits within a 64-bit word, either gathering all selected bits to the beginning (pext) or scattering some initial bits throughout (pdep). A natural model for this is to use shufflevector on v32i1 and v64i1 vectors. We've got some preliminary notes here: http://parabix.costar.sfu.ca/wiki/BitShuffle Since we're quite new at this, I have some questions about strategy. (1) First, it seems that legalizing v32i1 and v64i1 types for x86 would make sense. This will allow us to retain the shufflemasks involving these vector types into code generation. (2) To legalize these types, we need to support all the vector operations. This involves implementing all of the IR operations on these vector types. For most of these it seems that simple substitutions suffice. Adding two 32vi1 vectors just involves bitwise xor, while mul of such vectors just involves bitwise and. My questions are these. (1) Is this strategy basically reasonable? (2) Are there alternative mechanisms for generating pext/pdep from v32i1 and v64i1 shufflevector invocations? (3) The legalization method for most operations (other than shufflevector) is generic, i.e, not processor-specific. Is there a way to make types such as v64i1 legal for any processor that supports i64?
Nadav Rotem
2014-May-19 20:29 UTC
[LLVMdev] Legalizing v32i1, v64i1 for Haswell pext/pdep instructions
Hi Rob,> > One interesting issue that has come up for us is > code generation support for the Haswell new instructions > pext and pdep. These instructions shuffle bits within > a 64-bit word, either gathering all selected bits to > the beginning (pext) or scattering some initial bits > throughout (pdep). > > A natural model for this is to use shufflevector > on v32i1 and v64i1 vectors. We've got some preliminary > notes here: > http://parabix.costar.sfu.ca/wiki/BitShuffleI agree that one way to model these instructions is as shuffles but like you mentioned in your email I expect that implementing this would be non trivial. Are you expecting the pext/pdep instructions to interact with other instructions? If not, then going with intrinsics is probably the best option. Thanks, Nadav
Rob Cameron
2014-May-20 00:30 UTC
[LLVMdev] Legalizing v32i1, v64i1 for Haswell pext/pdep instructions
Hi, Nadav. We are expecting to do a lot with bit vector manipulation at the IR level and then to target multiple back ends. For example, one of our core operations is the Parabix Transform - this can be implemented using bit shuffling among other operations. http://parabix.costar.sfu.ca/wiki/ParabixTransform We're also trying to move away from the use of intrinsics in our libraries and target LLVM IR instead. If we can add capabilities to LLVM that give us the performance we currently achieve with intrinsics-based libraries, we will be happy. This is part of an ongoing research effort; we realize that we have much to learn and work to do. ----- Original Message -----> Hi Rob, > > > > > One interesting issue that has come up for us is > > code generation support for the Haswell new instructions > > pext and pdep. These instructions shuffle bits within > > a 64-bit word, either gathering all selected bits to > > the beginning (pext) or scattering some initial bits > > throughout (pdep). > > > > A natural model for this is to use shufflevector > > on v32i1 and v64i1 vectors. We've got some preliminary > > notes here: > > http://parabix.costar.sfu.ca/wiki/BitShuffle > > I agree that one way to model these instructions is as shuffles but > like you mentioned in your email I expect that implementing this > would be non trivial. Are you expecting the pext/pdep instructions > to interact with other instructions? If not, then going with > intrinsics is probably the best option. > > Thanks, > Nadav >