Displaying 3 results from an estimated 3 matches for "r0_1".
2012 Jul 06
2
[LLVMdev] Excessive register spilling in large automatically generated functions, such as is found in FFTW
...t6 = ADD(t2, t3);
t7 = SUB(t2, t3);
*r2 = UNPACK2HI(t4, t5);
*r3 = UNPACK2HI(t6, t7);
t7 = MULI(t7);
t0 = ADD(t4, t6);
t2 = SUB(t4, t6);
t1 = SUB(t5, t7);
t3 = ADD(t5, t7);
*r0 = UNPACK2LO(t0, t1);
*r1 = UNPACK2LO(t2, t3);
}
void fft32(const float *in, float *out) {
__m128 r0_1,r2_3,r4_5,r6_7,r8_9,r10_11,r12_13,r14_15,r16_17,r18_19,r20_21,r22_23,r24_25,r26_27,r28_29,r30_31;
L_4_4(in+0,in+32,in+16,in+48,&r0_1,&r2_3,&r16_17,&r18_19);
L_2_2(in+8,in+40,in+56,in+24,&r4_5,&r6_7,&r20_21,&r22_23);
K_N(VLIT4(0.7071,0.7071,1,1),VLIT4(0.7071,-0...
2005 Jul 27
3
[LLVMdev] How to define complicated instruction in TableGen (Direct3D shader instruction)
...WM:$wm, IM:$im,
GPR:$src0, SW:$sw0, SM:$sm0,
GPR:$src1, SW:$sw1, SM:$sm1 ), ... >
2. add llvm intrinsics:
; add_sat r0.a, r1_bias.xxyy, r3_x2.zzzz
r1_1 = llvm.bias( r1_0 )
r1_2 = llvm.shuffle( xxyy )
r3_1 = llvm.x2( r3_0 )
r3_2 = llvm.shuffle( zzzz )
r0_0 = add r1_2, r3_2
r0_1 = llvm.sature( r0_0 )
r0_2 = llvm.select( a )
but it makes implementing the instruction selector very difficult.
In this example, llvm.select() and llvm.sature() are encountered first
(bottom-up), but they must be 'remembered' and the instruction cannot
be generated (BuildMI) until th...
2005 Jul 29
0
[LLVMdev] How to define complicated instruction in TableGen (Direct3D shader instruction)
...GPR:$src1, SW:$sw1, SM:$sm1 ), ... >
>
> 2. add llvm intrinsics:
>
> ; add_sat r0.a, r1_bias.xxyy, r3_x2.zzzz
> r1_1 = llvm.bias( r1_0 )
> r1_2 = llvm.shuffle( xxyy )
> r3_1 = llvm.x2( r3_0 )
> r3_2 = llvm.shuffle( zzzz )
> r0_0 = add r1_2, r3_2
> r0_1 = llvm.sature( r0_0 )
> r0_2 = llvm.select( a )
>
> but it makes implementing the instruction selector very difficult.
> In this example, llvm.select() and llvm.sature() are encountered first
> (bottom-up), but they must be 'remembered' and the instruction cannot
> b...