On Thursday 17 December 2009 17:16, Nate Begeman wrote:

> David, this is probably the wrong approach, based on the accreted awfulness
> of the X86 shuffle lowering code,

Ha! I have no issue believing this statement. :)

> The correct approach is probably a rewrite based around what
> AltiVec does: Canonicalize to byte ops, and write all the patterns once
> rather than having to look for 6 different variants of the same pattern.

Can you expand on this with an example? There seems to be an awful lot of
shuffle patterns and predicates in PPCInstrAltivec.td. What do you mean by,
"Canonicalize to byte ops?" Can you walk me through how that works with
Altivec?

Since I'm rewriting all of the SSE patterns to clean them up and incorporate
AVX functionality anyway, a complete rewrite of shuffles is not additional
work. :)

Thanks.

-Dave

On Thursday 17 December 2009 17:30, David Greene wrote:

> Can you expand on this with an example? There seems to be an awful lot of
> shuffle patterns and predicates in PPCInstrAltivec.td. What do you mean
> by, "Canonicalize to byte ops?" Can you walk me through how that works
> with Altivec?

Ah wait, I think I know what you mean. For x86, you mean rewrite the shuffle
operations in X86ISelLowering to operate on 32-bit elements always (the
shuffle instructions with immediate masks don't have 16-bit or 8-bit
variants), bitcasting the vector operands as appropriate and reformulating
the index vector to account for the bitcasts. Is that right?

I think I can see how to do this. Before I start, I will try to check in as
much AVX work as I can so that we all see the changes as they happen. I have
quite an extensive set backlogged but I think it's stable enough to start
releasing now.

-Dave
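
[Illustration] "Reformulating the index vector to account for the bitcasts" can be pictured with a small standalone sketch. This is plain C++, not code from X86ISelLowering; the name `widenShuffleMask`, the use of -1 for an undefined lane, and the rule that a mask which splits a wide element is simply rejected are all assumptions made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <vector>

// Hypothetical helper, not an LLVM API. Rewrites a shuffle mask over narrow
// elements as a mask over elements `Scale` times wider (Scale = 2 for
// i16 -> i32, Scale = 4 for i8 -> i32). Mask entries index into the
// concatenation of both shuffle operands; -1 marks an undefined lane
// (an assumed convention for this sketch).
// Returns std::nullopt when some wide lane would be assembled from pieces
// of different wide source elements, i.e. the shuffle cannot be expressed
// on the wider type.
std::optional<std::vector<int>> widenShuffleMask(const std::vector<int> &Mask,
                                                 int Scale) {
  assert(Scale > 0 && Mask.size() % static_cast<std::size_t>(Scale) == 0 &&
         "mask must split into whole wide lanes");
  std::vector<int> Wide;
  for (std::size_t I = 0; I < Mask.size(); I += Scale) {
    int WideIdx = -1; // stays -1 if every narrow lane in the group is undef
    for (int J = 0; J < Scale; ++J) {
      int M = Mask[I + J];
      if (M < 0)
        continue; // an undef lane is compatible with any choice
      // Narrow lane J of this group must read sub-element J of a wide element.
      if (M % Scale != J)
        return std::nullopt;
      int Candidate = M / Scale;
      // All defined lanes of the group must agree on the wide source element.
      if (WideIdx >= 0 && WideIdx != Candidate)
        return std::nullopt;
      WideIdx = Candidate;
    }
    Wide.push_back(WideIdx);
  }
  return Wide;
}
```

Under these assumptions, a v8i16 mask that swaps adjacent pairs of 16-bit lanes, `{2,3,0,1,6,7,4,5}`, becomes the v4i32 mask `{1,0,3,2}` once both operands are bitcast to v4i32, while a mask such as `{1,0,2,3,4,5,6,7}` is rejected because it reorders the two halves of one 32-bit element.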

Hello, David

> Can you expand on this with an example? There seems to be an awful lot of
> shuffle patterns and predicates in PPCInstrAltivec.td. What do you mean by,
> "Canonicalize to byte ops?" Can you walk me through how that works with
> Altivec?

The basic idea is quite simple - lower everything to vNi8 and write
all the patterns using only these types.

-- 
With best regards, Anton Korobeynikov
Faculty of Mathematics and Mechanics, Saint Petersburg State University
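
[Illustration] A minimal sketch of what "lower everything to vNi8" means for the shuffle mask itself: every element index is expanded into the indices of the bytes that make up that element. This is standalone C++ rather than the PowerPC backend's code; `expandMaskToBytes` and the -1 undef convention are hypothetical names and conventions chosen for the example.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper, not an LLVM API. Expands a shuffle mask over elements
// that are EltBytes bytes wide into the equivalent mask over single bytes.
// Element index M covers bytes M*EltBytes .. M*EltBytes + EltBytes - 1.
// A negative entry (undef, by this sketch's convention) expands to
// EltBytes undef byte lanes.
std::vector<int> expandMaskToBytes(const std::vector<int> &Mask, int EltBytes) {
  assert(EltBytes > 0 && "element size must be positive");
  std::vector<int> Bytes;
  Bytes.reserve(Mask.size() * static_cast<std::size_t>(EltBytes));
  for (int M : Mask)
    for (int B = 0; B < EltBytes; ++B)
      Bytes.push_back(M < 0 ? -1 : M * EltBytes + B);
  return Bytes;
}
```

With this expansion, the v4i32 mask `{1,0,3,2}` and the v8i16 mask `{2,3,0,1,6,7,4,5}` produce the same 16-entry byte mask, which is why a pattern written once against the byte form can cover every element type.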

On Thursday 17 December 2009 18:04, Anton Korobeynikov wrote:

> Hello, David
>
> > Can you expand on this with an example? There seems to be an awful lot
> > of shuffle patterns and predicates in PPCInstrAltivec.td. What do you
> > mean by, "Canonicalize to byte ops?" Can you walk me through how that
> > works with Altivec?
>
> The basic idea is quite simple - lower everything to vNi8 and write
> all the patterns using only these types.

Yeah, I figured that out after thinking a bit more. However, I think in this
case we only want to lower to vNi32 since there are no immediate-mask
shuffles in X86 that operate on smaller element types. Doing it at the byte
level would just be more confusing, I think. PSHUFB is really a completely
different instruction than PSHUFD, for example.

-Dave
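
[Illustration] The difference between the two instructions shows up in how the mask is supplied. PSHUFD takes its selection as an 8-bit immediate with two bits per 32-bit destination lane, whereas PSHUFB reads a full 16-byte control vector from a register or memory. The sketch below packs a four-lane, single-source mask into the PSHUFD immediate; `pshufdImmediate` is a hypothetical standalone helper, not a function from the X86 backend.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical helper, not an LLVM API. Packs a single-source mask over four
// 32-bit lanes into the imm8 operand of PSHUFD: bits [2*I+1 : 2*I] select
// the source lane for destination lane I.
std::uint8_t pshufdImmediate(const std::array<int, 4> &Mask) {
  unsigned Imm = 0;
  for (int I = 0; I < 4; ++I) {
    // This sketch requires every lane to be defined and to read the one source.
    assert(Mask[I] >= 0 && Mask[I] < 4 && "lane index out of range");
    Imm |= static_cast<unsigned>(Mask[I]) << (2 * I);
  }
  return static_cast<std::uint8_t>(Imm);
}
```

For example, the identity mask `{0,1,2,3}` packs to 0xE4 and the pair swap `{1,0,3,2}` packs to 0xB1. No comparable immediate form exists for 8-bit or 16-bit lanes across the whole register, which is the motivation given above for canonicalizing to vNi32 rather than vNi8 on x86.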