Displaying 5 results from an estimated 5 matches for "4xf32".
2008 Sep 30
[LLVMdev] Generalizing shuffle vector
...or_type(8) )) float float8;
float8 f8;
float4 f4a, f4b, f4c;
f4a = f8.hi;
f8.hi = f4b; f8.lo = f4c;
where hi and lo represent the high half and low half of the vector.
The outgoing IR is
%f4a = shufflevector <8xf32>%f8, undef, <4xi32> <0, 1, 2, 3>
%f8 = shufflevector <4xf32>%f4b, <4xf32>%f4c, <8xi32> <0, 1, 2, 3, 4, 5, 6, 7>
The problem with generating insert and extracts is that we can
generate poor code
%tmp16 = extractelement <4 x float> %f4b, i32 0
%f8a = insertelement <8 x float> %f8a, float %tmp16, i32 0...
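For readers following the masks in the snippet above, here is a minimal sketch of how a shufflevector mask selects elements. It is a plain Python model, not LLVM code: the helper name shufflevector, the UNDEF placeholder, and the sample values are assumptions made for illustration. Mask indices count across the first operand and then the second.

```python
# Illustrative model only: a Python stand-in for the shuffle semantics,
# not an LLVM API. Names and sample values are hypothetical.
UNDEF = None  # stands in for an undef operand lane


def shufflevector(v1, v2, mask):
    """Pick elements from the concatenation of v1 and v2 by mask index."""
    combined = list(v1) + list(v2)
    return [combined[i] for i in mask]


f8 = [float(i) for i in range(8)]
f4b = [10.0, 11.0, 12.0, 13.0]
f4c = [20.0, 21.0, 22.0, 23.0]

# Mask <0, 1, 2, 3> from the quoted IR: a 4-wide result taken from an 8-wide input.
f4a = shufflevector(f8, [UNDEF] * 8, [0, 1, 2, 3])

# Mask <0, ..., 7> from the quoted IR: two 4-wide inputs joined into one 8-wide result.
f8_new = shufflevector(f4b, f4c, [0, 1, 2, 3, 4, 5, 6, 7])

print(f4a)
print(f8_new)
```

Each result is produced by one operation, whereas the extractelement/insertelement form shown in the snippet needs one pair of instructions per lane.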
2008 Sep 30
[LLVMdev] Generalizing shuffle vector
...ented using it (not that I necessarily
think they should be, it's just a nice side effect).
If this is feasible, it would be nice to extend it all the way. This
lets you do things like:
float3 x;
float4 y;
// ...
y.xyz = x;
as a single shufflevector, e.g.:
%y2 = shufflevector <4xf32> %y1, <3xf32> %x, <4, 5, 6, 3>
I assume my proposed generalization can't hurt codegen, since it could
always be turned into a sequence of insert and extracts which would
provide the same behaviour as today.
--
Stefanus Du Toit <stefanus.dutoit at rapidmind.com>
Rap...
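The proposal in this snippet, a shuffle whose operands have different lengths, can be modelled in a few lines. This is a sketch under stated assumptions: Python lists stand in for vectors, the function names and sample values are hypothetical, and the indexing rule (indices 0..3 name %y1, indices 4..6 name %x) is inferred from the quoted mask <4, 5, 6, 3>. It models the proposed generalization, not shufflevector as it existed at the time, which required both operands to have the same type.

```python
# Illustrative model of the proposed generalized shuffle; not LLVM code.
def shufflevector(v1, v2, mask):
    """Pick elements from the concatenation of v1 and v2 by mask index."""
    combined = list(v1) + list(v2)
    return [combined[i] for i in mask]


def insert_extract_lowering(v1, v2, mask):
    """Same result, built one lane at a time as an extract followed by an insert."""
    result = [None] * len(mask)
    for lane, idx in enumerate(mask):
        # "extractelement" from whichever operand the index falls in
        elem = v1[idx] if idx < len(v1) else v2[idx - len(v1)]
        # "insertelement" into the result lane
        result[lane] = elem
    return result


y1 = [1.0, 2.0, 3.0, 4.0]   # float4 y
x = [7.0, 8.0, 9.0]         # float3 x
mask = [4, 5, 6, 3]         # y.xyz = x, keeping y.w

y2 = shufflevector(y1, x, mask)
print(y2)
print(insert_extract_lowering(y1, x, mask) == y2)
```

The second function reflects the argument that the generalized form can always fall back to a per-lane sequence of extracts and inserts.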
2008 Sep 30
[LLVMdev] Generalizing shuffle vector
...e, it's just a nice side effect).
>
> If this is feasible, it would be nice to extend it all the way. This
> lets you do things like:
>
> float3 x;
> float4 y;
>
> // ...
>
> y.xyz = x;
>
> as a single shufflevector, e.g.:
>
> %y2 = shufflevector <4xf32> %y1, <3xf32> %x, <4, 5, 6, 3>
>
> I assume my proposed generalization can't hurt codegen, since it could
> always be turned into a sequence of insert and extracts which would
> provide the same behaviour as today.
>
> --
> Stefanus Du Toit <stefanus.dutoi...
2008 Sep 30
[LLVMdev] Generalizing shuffle vector
...areful in combining vector shuffles because we don't
> want to produce a vector shuffle whose mask is illegal or hard to code gen
> so we end up in this code to generate a sequence of unpcks and movhlps for
> this. With the new form, Legalize will divide the 8xf32 vector into two
> 4xf32 and since the two sides are the same, it will generate quad word moves
> to copy the values.
I think this specific issue can be fixed without extending the
IL-level syntax; DAGCombiner could easily be made a lot more clever
about cases like this. For example, before legalization, we can
transf...
2008 Sep 30
[LLVMdev] Generalizing shuffle vector
...>> If this is feasible, it would be nice to extend it all the way. This
>> lets you do things like:
>>
>> float3 x;
>> float4 y;
>>
>> // ...
>>
>> y.xyz = x;
>>
>> as a single shufflevector, e.g.:
>>
>> %y2 = shufflevector <4xf32> %y1, <3xf32> %x, <4, 5, 6, 3>
>>
>> I assume my proposed generalization can't hurt codegen, since it
>> could
>> always be turned into a sequence of insert and extracts which would
>> provide the same behaviour as today.
>>
>> --
>>...