2008 Sep 30
[LLVMdev] Generalizing shuffle vector
...ttribute__(( ext_vector_type(4) )) float float4;
typedef __attribute__(( ext_vector_type(8) )) float float8;
float8 f8;
float4 f4a, f4b, f4c;
f4a = f8.hi;
f8.hi = f4b; f8.lo = f4c;
where hi and lo represent the high half and low half of the vector.
The outgoing IR is
%f4a = shufflevector <8xf32>%f8, undef, <4xi32> <0, 1, 2, 3>
%f8 = shufflevector <4xf32>%f4b, <4xf32>%f4c, <8xi32> <0, 1, 2, 3, 4, 5, 6, 7>
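The two shuffles above can be modelled in plain C to make the proposed semantics concrete: the result length comes from the mask, not from the input width. This is only an illustrative sketch. The function name `shuffle_f32`, the use of a negative mask index for undef, and the choice to read undef as 0.0f are assumptions of this example, not part of LLVM.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative model of a generalized shuffle (hypothetical helper, not LLVM code).
 * a, b     : the two input vectors, each n elements wide (b may be NULL for undef)
 * mask     : mask_len indices; i in [0, n) selects a[i], i in [n, 2n) selects b[i - n]
 * out      : receives mask_len elements, so the result width may differ from n
 * Assumption: a negative index or a read from an undef operand yields 0.0f. */
static void shuffle_f32(const float *a, const float *b, size_t n,
                        const int *mask, size_t mask_len, float *out)
{
    assert(a != NULL && mask != NULL && out != NULL);
    for (size_t i = 0; i < mask_len; ++i) {
        int m = mask[i];
        if (m < 0) {
            out[i] = 0.0f;                      /* undef lane */
        } else if ((size_t)m < n) {
            out[i] = a[m];                      /* lane taken from the first input */
        } else {
            assert((size_t)m < 2 * n);          /* index must stay inside both inputs */
            out[i] = (b != NULL) ? b[(size_t)m - n] : 0.0f;
        }
    }
}
```

With n = 8 and the mask <0, 1, 2, 3> this extracts four elements of an 8-wide input; with n = 4 and the mask <0, ..., 7> it concatenates two 4-wide inputs into one 8-wide result.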
The problem with generating inserts and extracts is that we can
generate poor code
%tmp16 = extractelement <4 x float> %f4b, i3...
2008 Sep 30
[LLVMdev] Generalizing shuffle vector
...> shuffle. We are very careful in combining vector shuffles because we don't
> want to produce a vector shuffle whose mask is illegal or hard to code gen
> so we end up in this code to generate a sequence of unpcks and movhlps for
> this. With the new form, Legalize will divide the 8xf32 vector into two
> 4xf32 and since the two sides are the same, it will generate quad word moves
> to copy the values.
I think this specific issue can be fixed without extending the
IL-level syntax; DAGCombiner could easily be made a lot more clever
about cases like this. For example, before...
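The quoted point about splitting the 8xf32 result into two 4xf32 halves can be sketched as a check on each half of the mask: if a half selects one whole input in order, that half needs only a register copy rather than a real shuffle. This is a rough illustration under stated assumptions. The helpers `is_contiguous` and `half_copy_source` are hypothetical names and do not correspond to any code in Legalize or DAGCombiner.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* True when mask[0..len) holds len consecutive, non-negative indices. */
static bool is_contiguous(const int *mask, size_t len)
{
    assert(mask != NULL && len > 0);
    if (mask[0] < 0)
        return false;
    for (size_t i = 1; i < len; ++i)
        if (mask[i] != mask[0] + (int)i)
            return false;
    return true;
}

/* Hypothetical helper: inspects one half of a split mask, where each input is
 * `width` elements wide. Returns 0 or 1 when the half is a whole, in-order
 * copy of the first or second input, and -1 when a real shuffle is needed. */
static int half_copy_source(const int *half_mask, size_t width)
{
    if (!is_contiguous(half_mask, width))
        return -1;
    if (half_mask[0] % (int)width != 0)   /* must start at an input boundary */
        return -1;
    int src = half_mask[0] / (int)width;
    return (src < 2) ? src : -1;
}
```

For the concatenation mask <0, 1, 2, 3, 4, 5, 6, 7> over two 4-wide inputs, the low half maps to input 0 and the high half to input 1, so both halves are plain copies. An interleaving mask such as <0, 4, 1, 5> is reported as needing a real shuffle.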