thr3ads.net - search: "8xf32"

Displaying 2 results from an estimated 2 matches for "8xf32".


2008 Sep 30

[LLVMdev] Generalizing shuffle vector

...ttribute__(( ext_vector_type(4) )) float float4;
typedef __attribute__(( ext_vector_type(8) )) float float8;
float8 f8;
float4 f4a, f4b, f4c;
f4a = f8.hi;
f8.hi = f4b;
f8.lo = f4c;

where hi and lo represent the high half and low half of the vector. The outgoing IR is

%f4a = shufflevector <8xf32>%f8, undef, <4xi32> <0, 1, 2, 3>
%f8 = shufflevector <4xf32>%f4b, <4xf32>%f4c, <8xi32> <0, 1, 2, 3, 4, 5, 6, 7>

The problem with generating insert and extracts is that we can generate poor code

%tmp16 = extractelement <4 x float> %f4b, i3...
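The generalized shuffle in the excerpt above can be modelled in plain C to make the mask semantics concrete. This is only a sketch under stated assumptions: `shuffle_f32` is a hypothetical helper, not an LLVM or Clang API, and it assumes the result takes the length of the mask while each mask index selects from the concatenation of the two inputs. An unused second operand (the `undef` case) is passed as `NULL`.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical model of a generalized shufflevector on float arrays.
 * a, b  : the two input vectors, each with n_in elements
 *         (b may be NULL if the mask never selects from it)
 * mask  : n_out indices into the concatenation of a and b
 * out   : receives n_out elements, out[i] = concat(a, b)[mask[i]]
 */
static void shuffle_f32(const float *a, const float *b, size_t n_in,
                        const int *mask, size_t n_out, float *out)
{
    for (size_t i = 0; i < n_out; ++i) {
        int m = mask[i];
        /* Every index must fall inside the 2 * n_in concatenated lanes. */
        assert(m >= 0 && (size_t)m < 2 * n_in);
        if ((size_t)m < n_in) {
            out[i] = a[m];
        } else {
            /* Selecting from the second operand requires it to exist. */
            assert(b != NULL);
            out[i] = b[(size_t)m - n_in];
        }
    }
}
```

With this model, the first IR line in the excerpt corresponds to calling `shuffle_f32(f8, NULL, 8, mask, 4, f4a)` with mask `{0, 1, 2, 3}`, and the second to `shuffle_f32(f4b, f4c, 4, mask, 8, f8)` with mask `{0, 1, 2, 3, 4, 5, 6, 7}`. Each is a single operation rather than a chain of per-element extracts and inserts.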


2008 Sep 30

[LLVMdev] Generalizing shuffle vector

...> shuffle. We are very careful in combining vector shuffles because we don't
> want to produce a vector shuffle whose mask is illegal or hard to code gen
> so we end up in this code to generate a sequence of unpcks and movhlps for
> this. With the new form, Legalize will divide the 8xf32 vector into two
> 4xf32 and since the two sides are the same, it will generate quad word moves
> to copy the values.

I think this specific issue can be fixed without extending the IL-level syntax; DAGCombiner could easily be made a lot more clever about cases like this. For example, before...
