Displaying 5 results from an estimated 5 matches for "pshuf".
2005 May 10
1
[LLVMdev] avoid live range overlap of "vector" registers
...is...
Actually, I think it would be better to define the registers as a machine value type for packed float x4, and providing
some 'extract' and 'inject' instructions to access individual components... There should also be a 'shuffle' instruction
(corresponding to the SSE PSHUF instruction) to change the individual components around.
m.
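
As a rough illustration of the three operations proposed above, here is a minimal Python sketch that models a packed float x4 value as a 4-tuple. The names `extract`, `inject` and `shuffle` follow the wording of the message; their signatures and the index-based shuffle order are assumptions made for this example and are not LLVM's actual API.

```python
from typing import Sequence, Tuple

# Hypothetical model of a "packed float x4" value: an immutable 4-tuple.
Vec4 = Tuple[float, float, float, float]


def _check_lane(i: int) -> None:
    # Reject anything outside lanes 0..3 (Python would otherwise accept -1).
    if not 0 <= i < 4:
        raise IndexError(f"lane index out of range: {i}")


def extract(v: Vec4, i: int) -> float:
    """Read a single component of the vector."""
    _check_lane(i)
    return v[i]


def inject(v: Vec4, i: int, x: float) -> Vec4:
    """Return a copy of the vector with component i replaced by x."""
    _check_lane(i)
    out = list(v)
    out[i] = x
    return tuple(out)


def shuffle(v: Vec4, order: Sequence[int]) -> Vec4:
    """Return a vector whose lane k holds v[order[k]]."""
    if len(order) != 4:
        raise ValueError("order must name exactly four source lanes")
    for i in order:
        _check_lane(i)
    return tuple(v[i] for i in order)


v = (1.0, 2.0, 3.0, 4.0)
second = extract(v, 1)             # read lane 1
patched = inject(v, 0, 9.0)        # overwrite lane 0 in a copy
swapped = shuffle(v, (0, 2, 1, 3)) # exchange the two middle lanes
```

Treating the vector as a single value, with every component access going through an explicit operation, is the point of the suggestion: the four components no longer have to be tracked as separate registers.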
2005 May 10
0
[LLVMdev] avoid live range overlap of "vector" registers
On Fri, 6 May 2005, Tzu-Chien Chiu wrote:
> a "vector" register r0 is composed of four 32-bit floating scalar
> registers, r0.x, r0.y, r0.z, r0.w.
>
> each scalar reg can be assigned individually, e.g.
>
> mov r0.x, r1.y
> add r0.y, r1.x, r2.z
>
> or assigned simultaneously with vector instructions, e.g.
>
> add r0.xyzw, r1.xzyw, r2.xyzw
>
> My
2005 May 11
2
[LLVMdev] avoid live range overlap of "vector" registers
...would be better to define the registers as a machine
>> value type for packed float x4, and providing some 'extract' and 'inject'
>> instructions to access individual components... There should also be a
>> 'shuffle' instruction (corresponding to the SSE PSHUF instruction) to change
>> the individual components around.
>
>You're right, that would be a better way to go. To start, I would suggest
>adding extract/inject intrinsics (not instructions) because it is easier.
>If you're interested in doing this, there is documentati...
2005 May 06
3
[LLVMdev] avoid live range overlap of "vector" registers
a "vector" register r0 is composed of four 32-bit floating scalar
registers, r0.x, r0.y, r0.z, r0.w.
each scalar reg can be assigned individually, e.g.
mov r0.x, r1.y
add r0.y, r1.x, r2.z
or assigned simultaneously with vector instructions, e.g.
add r0.xyzw, r1.xzyw, r2.xyzw
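
The register model described above can be sketched as a small Python simulation. This is only an illustration under assumptions: the helper names `swizzle`, `write_masked` and `add` are hypothetical, and the semantics (source operands are swizzled first, then the result is written only to the lanes named in the destination mask) are the usual shader-style reading of this syntax, not something the message spells out.

```python
LANES = "xyzw"  # lane names in register order


def swizzle(reg, pattern):
    """Select source lanes by name, e.g. swizzle(r1, "xzyw")."""
    return [reg[LANES.index(c)] for c in pattern]


def write_masked(dst, mask, values):
    """Return a copy of dst with only the lanes named in mask overwritten."""
    if len(mask) != len(values):
        raise ValueError("mask and values must have the same length")
    out = list(dst)
    for lane, value in zip(mask, values):
        out[LANES.index(lane)] = value
    return out


def add(dst, mask, a, a_swz, b, b_swz):
    """Model of: add dst.mask, a.a_swz, b.b_swz (assumed semantics)."""
    sums = [p + q for p, q in zip(swizzle(a, a_swz), swizzle(b, b_swz))]
    return write_masked(dst, mask, sums)


r0 = [0.0, 0.0, 0.0, 0.0]
r1 = [1.0, 2.0, 3.0, 4.0]
r2 = [10.0, 20.0, 30.0, 40.0]

# mov r0.x, r1.y  (scalar assignment: only r0.x changes)
after_mov = write_masked(r0, "x", swizzle(r1, "y"))

# add r0.y, r1.x, r2.z  (scalar add into r0.y)
after_scalar_add = add(r0, "y", r1, "x", r2, "z")

# add r0.xyzw, r1.xzyw, r2.xyzw  (full vector add with a swizzled source)
after_vector_add = add(r0, "xyzw", r1, "xzyw", r2, "xyzw")
```

The sketch makes the allocation problem visible: a scalar write touches one lane and leaves the other three live, so the four components share one register but have separate lifetimes.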
My question is how to define the register in the .td file so that the
code generator does not overlap the
2005 May 11
0
[LLVMdev] avoid live range overlap of "vector" registers
...better to define the registers as a machine
>>> value type for packed float x4, and providing some 'extract' and 'inject'
>>> instructions to access individual components... There should also be a
>>> 'shuffle' instruction (corresponding to the SSE PSHUF instruction) to change
>>> the individual components around.
>>
>> You're right, that would be a better way to go. To start, I would suggest
>> adding extract/inject intrinsics (not instructions) because it is easier.
>> If you're interested in doing this,...