I've run into a problem with X86ShuffleDecode.hpp while implementing AVX
shuffles.
It turns out that to decode AVX shuffles properly, I need to pass types
to the X86ShuffleDecode logic. NumElements is not enough because four
elements could mean four 32-bit or four 64-bit elements, and the shuffle
decode differs based on the element type.
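To make the ambiguity concrete, here is a rough sketch of an in-lane permute decoder. The function name and signature are hypothetical and are not the X86ShuffleDecode interface. It assumes 32-bit elements take a 2-bit selector per element and 64-bit elements take a 1-bit selector per element, in the style of the AVX VPERMILPS/VPERMILPD immediates.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper for illustration only; not the X86ShuffleDecode API.
// Decodes an in-lane permute immediate into source element indices.
// Assumed encoding: 32-bit elements use a 2-bit selector per element
// (the immediate is reused in every 128-bit lane); 64-bit elements use a
// 1-bit selector per element.
std::vector<unsigned> decodeInLanePermute(unsigned EltBits, unsigned NumElts,
                                          uint8_t Imm) {
  assert((EltBits == 32 || EltBits == 64) && "sketch handles two widths only");
  const unsigned EltsPerLane = 128 / EltBits;
  std::vector<unsigned> Mask;
  for (unsigned i = 0; i != NumElts; ++i) {
    // Selected elements never cross a 128-bit lane boundary.
    const unsigned LaneBase = (i / EltsPerLane) * EltsPerLane;
    unsigned Sel;
    if (EltBits == 32)
      Sel = (Imm >> ((i % EltsPerLane) * 2)) & 0x3;
    else
      Sel = (Imm >> i) & 0x1;
    Mask.push_back(LaneBase + Sel);
  }
  return Mask;
}
```

Under these assumptions, the same immediate 0x1B with NumElts == 4 decodes to {3, 2, 1, 0} for 32-bit elements but to {1, 1, 2, 3} for 64-bit elements, so NumElements alone cannot pick the right mask.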
As things stand right now, X86ShuffleDecode.hpp gets included into
X86ISelLowering.cpp and X86InstComments.cpp (at the very least; there may
be other clients I'm not aware of). All functions in
X86ShuffleDecode.hpp are static in the header with their full
implementation visible.
The #include strategy breaks down because some static functions defined
in X86ShuffleDecode.hpp are not used by every file that includes it.
This happens because some clients have type information readily
available and others don't, so those clients must synthesize it from
the available opcode and NumElements data. For convenience I put that
synthesizing code into X86ShuffleDecode.hpp at new entry points. The
resulting unused definitions cause compiler warnings, which become
build errors under -Werror.
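A stripped-down sketch of the two kinds of entry points, as they might sit in a header. All names here are invented for illustration and the opcodes are placeholders, not real X86 opcodes.

```cpp
#include <cassert>

// Hypothetical opcodes, invented for this sketch.
enum SketchOpcode { PermuteSingle, PermuteDouble };

// Entry point for clients that already know the element width.
static unsigned selectorBitsForType(unsigned EltBits) {
  assert((EltBits == 32 || EltBits == 64) && "sketch handles two widths only");
  return EltBits == 64 ? 1 : 2;
}

// Entry point for clients that only have an opcode: synthesize the
// element width, then defer to the typed entry point.
static unsigned selectorBitsForOpcode(SketchOpcode Opc) {
  const unsigned EltBits = (Opc == PermuteDouble) ? 64 : 32;
  return selectorBitsForType(EltBits);
}
```

A file that includes this header and calls only selectorBitsForType leaves selectorBitsForOpcode defined but never used. GCC, for example, reports that as "defined but not used" under -Wunused-function.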
Naturally, I found the statically-defined functions in a header file
odd, and figured the solution would be to move them to their own .cpp
file, removing the static qualifier. That would eliminate the unused
static function problems.
Unfortunately, I tripped over the reason why they were defined static in
the header in the first place. Moving them to their own .cpp causes a
circular dependence between libLLVMX86AsmPrinter.a and
libLLVMX86CodeGen.a. Putting them in the header seems to be a big hack
to get around this problem.
I need a way out. One option would be to create another X86 target
library (libX86ShuffleDecode.a? We would need a better name). Another
option is to use #ifdef code in X86ShuffleDecode.hpp but I really don't
want to go that route. A third option might be to make some client code
uglier by requiring it to compute types before calling functions in
X86ShuffleDecode.hpp. I haven't tried this option yet so I don't know
how ugly things will get or if it is even feasible. It would require
more intrusive changes throughout the X86 codegen to implement AVX
properly. I had hoped to avoid these kinds of changes as much as
possible.
I would prefer to create a new library. Is that a reasonable solution?
If so, is there a good name for it? I can imagine more code than just
shuffle decode logic might go in there eventually. Maybe something like
X86Utils.a would be a good name.
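Roughly, the split I have in mind would look like the sketch below. The namespace and function name are made up for illustration, and both halves are shown in one block only so that it compiles on its own. In practice the first half would be the header and the second half a single .cpp built into the new library.

```cpp
#include <cassert>
#include <vector>

// ---- would go in the shared header (declarations only) ----
// No 'static' and no body here, so a client that never calls an entry
// point gets no unused-function warning.
namespace sketch {
void decodeUnpackLowMask(unsigned NumElts, std::vector<unsigned> &Mask);
} // namespace sketch

// ---- would go in a single .cpp built into the new library ----
namespace sketch {
// Interleave the low halves of two sources; indices NumElts and up
// refer to the second source. Models a single 128-bit lane only.
void decodeUnpackLowMask(unsigned NumElts, std::vector<unsigned> &Mask) {
  assert(NumElts % 2 == 0 && "expects an even element count");
  for (unsigned i = 0; i != NumElts / 2; ++i) {
    Mask.push_back(i);
    Mask.push_back(i + NumElts);
  }
}
} // namespace sketch
```

Both the asm printer and the codegen library would then link against the one library that holds the definition, instead of depending on each other.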
-Dave