Ok, so I've been chugging away at AVX and added some new
features in TableGen to facilitate writing generic patterns.
Here's an example:
//===----------------------------------------------------------------------===//
// Dummy defs for writing generic patterns
//===----------------------------------------------------------------------===//
def SRCREGCLASS;
def DSTREGCLASS;
def MEMCLASS;
def SRC1CLASS;
def SRC2CLASS;
def ADDRCLASS;
def INTRINSIC;
def TYPE;
def INTTYPE;
def MEMOP;
// TYPE - The data type (f32 for SS, f64 for SD, etc.)
// SRCREGCLASS - The source register class (VR128, FR32, etc.)
// DSTREGCLASS - The destination register class
// MEMCLASS - The memory class (f32mem, f64mem, etc.)
// SRC1CLASS - The first source object class (register or memory, depending)
// SRC2CLASS - The second source object class (register or memory, depending)
// DSTCLASS - The destination object class (register or memory, depending)
// ADDRCLASS - Either 'addr' or REGCLASS, depending
// MEMOP - Either 'memop' or 'srcvalue', depending
// Scalar
defm FsANDN : sse1_sse2_avx_binary_scalar_xs_xd_node_pattern_rm_rrm<
  0x55,
  "andn",
  [[(set DSTREGCLASS:$dst,
     (INTTYPE (and (not (INTTYPE (bitconvert (TYPE SRCREGCLASS:$src1)))),
                   (INTTYPE (MEMOP ADDRCLASS:$src2)))))]]>;
// Vector
defm ANDN : sse1_sse2_avx_binary_vector_tb_ostb_node_pattern_rm_rrm<
  0x55,
  "andn",
  [[(set DSTREGCLASS:$dst,
     (INTTYPE (and (vnot (INTTYPE (bitconvert (TYPE SRCREGCLASS:$src1)))),
                   (INTTYPE (MEMOP ADDRCLASS:$src2)))))]]>;
The "not" vs. "vnot" is unfortunate. I could add another class argument that
says "instantiate with members of this list of operators" but see below about
arguments and the combinatorial explosion problem. That and the fact that we
have no "foreach for subclass specification" make this difficult to do.
(Thinking about this some more, a "cross product" operator [list x list] ->
[list] could work.)
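To make the cross-product idea concrete, here is a minimal sketch in Python rather than TableGen, since no such operator exists in TableGen. The function name `cross` and the `combine` parameter are hypothetical, and the inputs just borrow the [V]PS / [V]PD naming mentioned in this message; it models the semantics only.

```python
from itertools import product

def cross(left, right, combine=lambda a, b: a + b):
    """Model of a [list x list] -> [list] operator.

    Pairs every element of `left` with every element of `right`
    (left-major order) and flattens each pair with `combine`.
    """
    return [combine(a, b) for a, b in product(left, right)]

# Hypothetical inputs: an optional AVX prefix crossed with packed suffixes.
variants = cross(["", "V"], ["PS", "PD"])
print(variants)  # ['PS', 'PD', 'VPS', 'VPD']
```

One list in, one list out means the result could feed another cross, which is what would let it stand in for nested "foreach" constructs.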
In any case, the lower classes take care of substituting the appropriate
symbols based on the specific instruction generated ([v]PS, [V]PD, etc.).
I still don't know how to capture the hierarchy under
sse1_sse2_avx_binary_scalar_xs_xd_node_pattern_rm_rrm and other such
higher-level classes. Right now it's generated by a Perl script but Chris
isn't enamored of that solution. I think it can be better as well.
One thought I had was to implement a "copy arguments" feature in TableGen so
we could do something like this:
defm FsANDN : sse1_binary_scalar_xs_node_pattern_rm<
                0x55,
                "andn",
                [[(set DSTREGCLASS:$dst,
                   (INTTYPE (and (not (INTTYPE (bitconvert (TYPE SRCREGCLASS:$src1)))),
                                 (INTTYPE (MEMOP ADDRCLASS:$src2)))))]]>,
              sse2_binary_scalar_xd_node_pattern_rm<''>,
              avx_binary_scalar_xs_node_pattern_rm<''>,
              avx_binary_scalar_xd_node_pattern_rm<''>;
where "''" (two apostrophes) is a mnemonic for the "ditto" mark used in
English (and other languages?).
This way we could define fewer base classes because we wouldn't have to
define intermediate base classes that just serve to aggregate other classes
in order to get us down to one class and thus one argument specification.
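As a rough model of what "copy arguments" would mean, here is a small Python sketch, again not TableGen, and not an implementation of anything that exists. `DITTO` and `expand_ditto` are hypothetical names, and the pattern is reduced to a placeholder string; the assumption is that a ditto entry simply reuses the arguments of the entry before it.

```python
DITTO = "''"  # stand-in for the proposed ditto marker

def expand_ditto(instantiations):
    """Replace each DITTO argument list with the preceding entry's arguments.

    `instantiations` is a list of (class_name, args) pairs, where args is
    either a tuple of arguments or the DITTO marker.
    """
    expanded = []
    previous = None
    for name, args in instantiations:
        if args == DITTO:
            if previous is None:
                raise ValueError(f"{name}: ditto with no earlier arguments to copy")
            args = previous
        expanded.append((name, args))
        previous = args
    return expanded

# The four parent classes from the example, with the pattern abbreviated.
parents = [
    ("sse1_binary_scalar_xs_node_pattern_rm", (0x55, "andn", "<pattern>")),
    ("sse2_binary_scalar_xd_node_pattern_rm", DITTO),
    ("avx_binary_scalar_xs_node_pattern_rm", DITTO),
    ("avx_binary_scalar_xd_node_pattern_rm", DITTO),
]
resolved = expand_ditto(parents)
```

After expansion every parent carries the same argument tuple, which is the job the aggregating intermediate classes do today.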
But there would still be a lot of classes to manually define. Here's an
incomplete list:
sse1_unary_scalar_xs_node_rm; // For generic unary
sse1_unary_scalar_xs_node_pattern_rm; // To use custom patterns
sse1_unary_scalar_xd_node_intrinsic_rm; // With an intrinsic
sse1_unary_scalar_xd_node_pattern_intrinsic_ipattern_rm; // Custom patterns
sse1_binary_scalar_xs_node_rm; // Binary
plus the rest of the sse1 "xs rm" classes, the mr encodings, all the binary
operations, all the sse2 classes (which look like the sse1 classes except they
use "xd"), all the vector classes, all the AVX classes, LRBni, etc. We still
have a combinatorial explosion problem.
Of course, we only have to define the ones we actually use and that cuts
down significantly on the numbers, but it's still large.
So I'm still looking for a complete solution. Ideas welcome.
-Dave