2009 Feb 16
2
[LLVMdev] Modeling GPU vector registers, again (with my implementation)
...I model the use of r0.xyzw in the following example:
// dp4 = 4-element dot product
dp4 r0.x, r1, r2
dp4 r0.y, r3, r4
dp4 r0.z, r5, r6
dp4 r0.w, r7, r8
sub r5, r0.xyzw, r6
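
To make the data flow in this example concrete, here is a minimal Python sketch. It is only an illustration, not LLVM code and not the poster's implementation: the register model (a dict of four-float lists), the helper names dp4 and sub, and the initial register values are all assumptions made up for the demo.

```python
# Hypothetical model: every register holds four floats, one per lane (x, y, z, w).
LANES = "xyzw"

def dp4(regs, dst, lane, a, b):
    """4-element dot product of regs[a] and regs[b], written to ONE lane of regs[dst].
    The other three lanes of regs[dst] keep their previous values."""
    regs[dst][LANES.index(lane)] = sum(p * q for p, q in zip(regs[a], regs[b]))

def sub(regs, dst, a, b):
    """Lane-wise subtraction that reads and writes all four lanes."""
    regs[dst] = [p - q for p, q in zip(regs[a], regs[b])]

# Made-up initial values: register rN holds N in every lane.
regs = {f"r{i}": [float(i)] * 4 for i in range(9)}

dp4(regs, "r0", "x", "r1", "r2")   # dp4 r0.x, r1, r2
dp4(regs, "r0", "y", "r3", "r4")   # dp4 r0.y, r3, r4
dp4(regs, "r0", "z", "r5", "r6")   # dp4 r0.z, r5, r6
dp4(regs, "r0", "w", "r7", "r8")   # dp4 r0.w, r7, r8
sub(regs, "r5", "r0", "r6")        # sub r5, r0.xyzw, r6

print(regs["r0"])  # each lane was produced by a different dp4
print(regs["r5"])
```

The point of the sketch is that the final sub depends on all four dp4 instructions, because each one defines a different lane of the same register r0.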
--
View this message in context: http://www.nabble.com/Modeling-GPU-vector-registers%2C-again-%28with-my-implementation%29-tp22001613p22034856.html
Sent from the LLVM - Dev mailing list archive at Nabble.com.
2009 Feb 13
0
[LLVMdev] Modeling GPU vector registers, again (with my implementation)
On Feb 13, 2009, at 9:47 AM, Alex wrote:
> It seems to me that LLVM's sub-register support is not designed for
> the following hardware architecture.
>
> All instructions of this hardware are vector instructions. All
> registers contain 4 32-bit FP sub-registers. They are called
> r0.x, r0.y, r0.z, r0.w.
>
> Most instructions write more than one element in this way:
>
> mul
2009 Feb 13
3
[LLVMdev] Modeling GPU vector registers, again (with my implementation)
It seems to me that LLVM's sub-register support is not designed for the
following hardware architecture.
All instructions of this hardware are vector instructions. All registers
contain 4 32-bit FP sub-registers. They are called r0.x, r0.y, r0.z, r0.w.
Most instructions write more than one element in this way:
mul r0.xyw, r1, r2
add r0.z, r3, r4
sub r5, r0, r1
Notice that the four elements of r0 are written...
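
The write-mask behaviour of the mul/add/sub sequence above can be traced with a small Python sketch. This is a hedged illustration only: masked_op, the written bookkeeping dict, and the initial register values are hypothetical names and data invented here, and an instruction with no mask (as in "sub r5, r0, r1") is assumed to write all four lanes.

```python
import operator

LANES = "xyzw"

def masked_op(regs, written, op, dst, mask, a, b):
    """Compute op lane-wise on regs[a] and regs[b], but store only the lanes
    named in mask; lanes outside the mask are left untouched."""
    result = [op(p, q) for p, q in zip(regs[a], regs[b])]
    for lane in mask:
        i = LANES.index(lane)
        regs[dst][i] = result[i]
    # Record which lanes of dst have been defined so far.
    written.setdefault(dst, set()).update(mask)

# Made-up initial values: register rN holds N in every lane.
regs = {f"r{i}": [float(i)] * 4 for i in range(6)}
written = {}

masked_op(regs, written, operator.mul, "r0", "xyw", "r1", "r2")   # mul r0.xyw, r1, r2
after_mul = set(written["r0"])                                    # lanes defined so far
masked_op(regs, written, operator.add, "r0", "z", "r3", "r4")     # add r0.z, r3, r4
masked_op(regs, written, operator.sub, "r5", "xyzw", "r0", "r1")  # sub r5, r0, r1

print(sorted(after_mul))      # lanes of r0 defined by the mul alone
print(sorted(written["r0"]))  # lanes of r0 defined after mul and add
print(regs["r0"], regs["r5"])
```

In this model the mul defines three lanes of r0 and the add defines the remaining one, so the sub that reads the whole of r0 depends on both earlier instructions.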