thr3ads.net - search: "r0

Displaying 3 results from an estimated 3 matches for "r0_1".

[LLVMdev] Excessive register spilling in large automatically generated functions, such as is found in FFTW

2012 Jul 06

...t6 = ADD(t2, t3);
  t7 = SUB(t2, t3);
  *r2 = UNPACK2HI(t4, t5);
  *r3 = UNPACK2HI(t6, t7);
  t7 = MULI(t7);
  t0 = ADD(t4, t6);
  t2 = SUB(t4, t6);
  t1 = SUB(t5, t7);
  t3 = ADD(t5, t7);
  *r0 = UNPACK2LO(t0, t1);
  *r1 = UNPACK2LO(t2, t3);
}

void fft32(const float *in, float *out) {
  __m128 r0_1,r2_3,r4_5,r6_7,r8_9,r10_11,r12_13,r14_15,r16_17,r18_19,r20_21,r22_23,r24_25,r26_27,r28_29,r30_31;
  L_4_4(in+0,in+32,in+16,in+48,&r0_1,&r2_3,&r16_17,&r18_19);
  L_2_2(in+8,in+40,in+56,in+24,&r4_5,&r6_7,&r20_21,&r22_23);
  K_N(VLIT4(0.7071,0.7071,1,1),VLIT4(0.7071,-0...

[LLVMdev] How to define complicated instruction in TableGen (Direct3D shader instruction)

2005 Jul 27

...WM:$wm, IM:$im, GPR:$src0, SW:$sw0, SM:$sm0, GPR:$src1, SW:$sw1 SM:$sm1 ), ...

> 2. add llvm intrinsics:

; add_sat r0.a, r1_bias.xxyy, r3_x2.zzzz
r1_1 = llvm.bias( r1_0 )
r1_2 = llvm.shuffle( xxyy )
r3_1 = llvm.x2( r3_0 )
r3_2 = llvm.shuffle( zzzz )
r0_0 = add r1_2, r3_2
r0_1 = llvm.sature( r0_0 )
r0_2 = llvm.select( a )

but it makes implementing the instruction selector very difficult. In this example, llvm.select() and llvm.sature() are encountered first (bottom-up), but they must be 'remembered' and the instruction cannot be generated (BuildMI) until th...
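The difficulty described in this excerpt can be illustrated with a small sketch. It is not LLVM code and does not use any LLVM API: the opcode names are taken from the post, while the `DEFS` table, the explicit input operand given to each `shuffle` and `select`, and the helpers `base_register`, `fold_source` and `fold_root` are all assumptions made for illustration. Starting from the last value (`r0_2`), the walk has to remember the write mask and the saturate flag until it reaches the `add`, and only then can one combined instruction be emitted.

```python
# Hypothetical mini-IR, NOT an LLVM data structure: value name -> (opcode, operands).
# Opcode names mirror the intrinsics quoted in the post. The first operand of
# "shuffle" and "select" (the input value) is an assumption; the post omits it.
DEFS = {
    "r1_1": ("bias", ["r1_0"]),
    "r1_2": ("shuffle", ["r1_1", "xxyy"]),
    "r3_1": ("x2", ["r3_0"]),
    "r3_2": ("shuffle", ["r3_1", "zzzz"]),
    "r0_0": ("add", ["r1_2", "r3_2"]),
    "r0_1": ("sature", ["r0_0"]),
    "r0_2": ("select", ["r0_1", "a"]),
}


def base_register(name):
    """Strip the version suffix: 'r1_0' -> 'r1'."""
    return name.split("_")[0]


def fold_source(name):
    """Walk a source operand's definitions, collecting its modifier and swizzle."""
    modifier, swizzle = "", ""
    while name in DEFS:
        op, args = DEFS[name]
        if op == "shuffle":
            swizzle = "." + args[1]
        elif op in ("bias", "x2"):
            modifier = "_" + op
        else:
            break
        name = args[0]
    return base_register(name) + modifier + swizzle


def fold_root(name):
    """Start at the final value and remember mask/saturate until the add is reached."""
    mask, saturate = "", False
    op, args = DEFS[name]
    while op in ("select", "sature"):
        if op == "select":
            mask = "." + args[1]      # remembered write mask
        else:
            saturate = True           # remembered saturate modifier
        name = args[0]
        op, args = DEFS[name]
    # Only now, at the arithmetic op, can the whole instruction be emitted.
    mnemonic = op + ("_sat" if saturate else "")
    sources = ", ".join(fold_source(a) for a in args)
    return f"{mnemonic} {base_register(name)}{mask}, {sources}"


print(fold_root("r0_2"))
```

Under these assumptions the walk rebuilds the single shader instruction shown in the comment line of the excerpt, which is the bookkeeping the poster says a bottom-up selector would have to carry.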

[LLVMdev] How to define complicated instruction in TableGen (Direct3D shader instruction)

2005 Jul 29

...GPR:$src1, SW:$sw1 SM:$sm1 ), ...
>
> > 2. add llvm intrinsics:
>
> ; add_sat r0.a, r1_bias.xxyy, r3_x2.zzzz
> r1_1 = llvm.bias( r1_0 )
> r1_2 = llvm.shuffle( xxyy )
> r3_1 = llvm.x2( r3_0 )
> r3_2 = llvm.shuffle( zzzz )
> r0_0 = add r1_2, r3_2
> r0_1 = llvm.sature( r0_0 )
> r0_2 = llvm.select( a )
>
> but it makes implementing the instruction selector very difficult.
> In this example, llvm.select() and llvm.sature() are encountered first
> (bottom-up), but they must be 'remembered' and the instruction cannot
> b...
