2009 Feb 16
[LLVMdev] Modeling GPU vector registers, again (with my implementation)
...that
would encode the swizzle mask in 32bits. The correct swizzles can then
be generated in the asm printer by decoding the integer constant. This
does require having extra moves, but your example below would end up
being something like the following:
dp4 r100, r1, r2
mov r0.x, r100 (float4 => float1 extract_vector_elt)
dp4 r101, r4, r5
mov r3.x, r101 (float4 => float1 extract_vector_elt)
iadd r6.xy__, r0.x000, r3.0x00 (float1 + float1 => float2 build_vector)
dp4 r7.x, r8, r9
<as above>
dp4 r10.x, r11, r12
<as above>
iadd r13.xy__, r7.x000, r10.0x00 (float1 + float1 => float2...
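
To make the idea of packing a swizzle into a 32-bit constant and decoding it again at print time concrete, here is a minimal sketch. The layout is an assumption, not something the thread specifies: 8 bits per lane, with hypothetical selector codes for x, y, z, w, the constants 0 and 1, and "_" for a masked-out lane. The names encode_swizzle and decode_swizzle are made up for illustration and are not LLVM APIs.

```python
# Hypothetical selector alphabet; the index of each character is its code.
# x, y, z, w select a source component, 0 and 1 are constants,
# and _ marks a lane that is masked out.
SELECTORS = "xyzw01_"


def encode_swizzle(swizzle: str) -> int:
    """Pack a 4-character swizzle such as 'x000' into one integer.

    Assumed layout: lane 0 in bits 0-7, lane 1 in bits 8-15, and so on,
    so the result always fits in 32 bits.
    """
    if len(swizzle) != 4:
        raise ValueError("a swizzle must have exactly 4 components")
    mask = 0
    for lane, ch in enumerate(swizzle):
        # str.index raises ValueError for an unknown selector character.
        mask |= SELECTORS.index(ch) << (8 * lane)
    return mask


def decode_swizzle(mask: int) -> str:
    """Recover the textual swizzle, as an asm printer would before emitting it."""
    return "".join(SELECTORS[(mask >> (8 * lane)) & 0xFF] for lane in range(4))
```

With this layout, the operands in the listing above (x000, 0x00, xy__) each become a single immediate that round-trips through decode_swizzle. A tighter packing (3 bits per lane would be enough for seven selectors) would work just as well; 8 bits per lane is chosen here only because it is easy to read in hex.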
2009 Feb 16
[LLVMdev] Modeling GPU vector registers, again (with my implementation)
Evan Cheng-2 wrote:
>
> Well, how many possible permutations are there? Is it possible to
> model each case as a separate physical register?
>
> Evan
>
I don't think so. There are 4x4x4x4 = 256 permutations. For example:
* xyzw: default
* zxyw
* yyyy: splat
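
The 256 figure can be checked by enumerating every 4-lane swizzle over x, y, z, w. This is only an illustrative sketch; the variable names are hypothetical. It also shows that most of the 256 cases repeat a component (as the splat does), so only a small subset are permutations in the strict sense.

```python
from itertools import product

COMPONENTS = "xyzw"

# Every lane independently picks one of four source components: 4**4 cases.
all_swizzles = ["".join(p) for p in product(COMPONENTS, repeat=4)]

# Swizzles that use each component exactly once (strict permutations).
strict_permutations = [s for s in all_swizzles if len(set(s)) == 4]

print(len(all_swizzles))         # 256
print(len(strict_permutations))  # 24
```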
Even if I can model each of these 256 cases as a separate physical register,
how can I model the use of r0.xyzw in