Displaying 5 results from an estimated 5 matches for "vec_ld".
2005 Dec 02 · 0 replies · run time assembler patch for altivec, sse + bug fixes
...__vector float vec_a, vec_b;
__vector float vec_result;
vec_result = (__vector float)vec_splat_u8(0);
if ((!a_aligned) && (!b_aligned))
{
    // This (unfortunately) is the common case.
    maska = vec_lvsl(0, a);
    maskb = vec_lvsl(0, b);
    MSQa = vec_ld(0, a);
    MSQb = vec_ld(0, b);
    for (i = 0; i < len; i += 8)
    {
        a += 4;
        LSQa = vec_ld(0, a);
        vec_a = vec_perm(MSQa, LSQa, maska);
        b += 4;
        LSQb = vec_ld(0, b);
        vec_b = vec_perm(MSQb, LSQb, maskb);...
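The lvsl/perm idiom in that excerpt can be sketched in portable C — an illustration of the technique, not the patch's code. Two aligned 16-byte loads bracket the unaligned address, and a byte select (what vec_perm does with the vec_lvsl mask) concatenates the right bytes:

```c
#include <stdint.h>
#include <string.h>

/* Portable sketch of the AltiVec unaligned-load idiom:
 * fetch the two 16-byte aligned blocks that straddle an arbitrary
 * address (vec_ld), then pick out the 16 bytes starting at that
 * address (vec_perm driven by a vec_lvsl mask). */
static void load_unaligned16(const uint8_t *p, uint8_t out[16])
{
    uintptr_t addr = (uintptr_t)p;
    const uint8_t *lo = (const uint8_t *)(addr & ~(uintptr_t)15);        /* like vec_ld(0, p)  */
    const uint8_t *hi = (const uint8_t *)((addr + 15) & ~(uintptr_t)15); /* like vec_ld(15, p) */
    unsigned shift = (unsigned)(addr & 15);   /* the offset vec_lvsl(0, p) encodes */
    uint8_t msq[16], lsq[16];
    memcpy(msq, lo, 16);
    memcpy(lsq, hi, 16);
    for (int i = 0; i < 16; i++)              /* the vec_perm step */
        out[i] = (shift + i < 16) ? msq[shift + i] : lsq[shift + i - 16];
}
```

Note that when p is already 16-byte aligned, hi equals lo, so the sketch (like vec_ld itself) never reads past the block actually containing the data.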
2004 Aug 06 · 2 replies · [PATCH] Make SSE Run Time option. Add Win32 SSE code
Jean-Marc,
>I'm still not sure I get it. On an Athlon XP, I can do something like
>"mulps xmm0, xmm1", which means that the xmm registers are indeed
>supported. Besides, without the xmm registers, you can't use much of
>SSE.
On the Athlon XP 2400+ that we have in our QA lab (Win2000), running
that code generates an Illegal Instruction error. In addition,
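The "run time option" being discussed is a dispatch pattern: probe the CPU once at startup, then route calls through a function pointer so no unsupported instruction is ever executed. A minimal sketch under stated assumptions — cpu_has_sse() is a placeholder (a real build would query CPUID), and the SSE implementation itself is elided:

```c
/* Sketch of run-time SIMD dispatch: choose an implementation once,
 * based on a CPU-feature probe, instead of branching on every call. */
typedef float (*inner_product_fn)(const float *, const float *, int);

static float inner_product_scalar(const float *a, const float *b, int len)
{
    float sum = 0.0f;
    for (int i = 0; i < len; i++)
        sum += a[i] * b[i];
    return sum;
}

/* Placeholder probe; a real build would execute CPUID here. */
static int cpu_has_sse(void) { return 0; }

static inner_product_fn select_inner_product(void)
{
    if (cpu_has_sse())
        return inner_product_scalar; /* would return the SSE version */
    return inner_product_scalar;     /* safe fallback */
}
```

The point of the pattern is exactly the Athlon/Win2000 report above: the SSE code path is never reached on a CPU whose probe fails, so it cannot fault.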
2004 Aug 06 · 6 replies · [PATCH] Make SSE Run Time option.
...;
__vector float vec_result;
vec_result = (__vector float)vec_splat_u8(0);
if ((!a_aligned) && (!b_aligned)) {
    // This (unfortunately) is the common case.
    maska = vec_lvsl(0, a);
    maskb = vec_lvsl(0, b);
    MSQa = vec_ld(0, a);
    MSQb = vec_ld(0, b);
    for (i = 0; i < len; i += 8) {
        a += 4;
        LSQa = vec_ld(0, a);
        vec_a = vec_perm(MSQa, LSQa, maska);
        b += 4;
        LSQb = vec_ld(0, b);
        vec_b = vec_p...
2005 Jan 29 · 4 replies · A couple of points about flac 1.1.1 on ppc/linux/altivec
On Thu, 27 Jan 2005, John Steele Scott wrote:
> That looks fine to me as well. However, the best solution is something which
> Luca suggested a few months ago, which is to use the functions defined in
> altivec.h. These are C functions which map directly to Altivec machine
> instructions. I am willing to help out, but I don't find the current lpc_asm.s
> very easy to follow, and
2004 Aug 06 · 0 replies · [PATCH] Make SSE Run Time option.
...el SSE2 instructions
> #define CPU_MODE_ALTIVEC 64 // PowerPC Altivec support.
You may wish to save space for PNI.
http://cedar.intel.com/media/pdf/PNI_LEGAL3.pdf
Likewise, all that branching is probably going to cause more trouble than
it saves. Try this:
vector float a0 = vec_ld( 0, a );
vector float a1 = vec_ld( 15, a );
vector float b0 = vec_ld( 0, b );
vector float b1 = vec_ld( 15, b );
a0 = vec_perm( a0, a1, vec_lvsl( 0, a ) );
b0 = vec_perm( b0, b1, vec_lvsl( 0, b ) );
a0 = vec_madd( a0, b0, (vector float) vec_splat_u3...
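The vec_ld(15, a) in that branchless version works because vec_ld truncates (address + offset) down to a 16-byte boundary: offset 15 fetches the block holding the last byte of the unaligned 16-byte span, and when a is already aligned it simply re-reads the same block instead of over-reading the next one. A small sketch of that address arithmetic (illustrative scalar C, not AltiVec code):

```c
#include <stdint.h>

/* Which 16-byte-aligned block does vec_ld(off, p) fetch?
 * vec_ld truncates (p + off) down to a 16-byte boundary. */
static uintptr_t vec_ld_block(uintptr_t p, int off)
{
    return (p + (uintptr_t)off) & ~(uintptr_t)15;
}
```

This is why the unconditional perm is safe for every alignment, which is what lets the suggestion above drop the four-way a_aligned/b_aligned branching entirely.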