thr3ads.net - search: "vec_splat

Displaying 2 results from an estimated 2 matches for "vec_splat_u32".

2004 Aug 06

[PATCH] Make SSE Run Time option.

...a0 = vec_ld( 0, a );
vector float a1 = vec_ld( 15, a );
vector float b0 = vec_ld( 0, b );
vector float b1 = vec_ld( 15, b );
a0 = vec_perm( a0, a1, vec_lvsl( 0, a ) );
b0 = vec_perm( b0, b1, vec_lvsl( 0, b ) );
a0 = vec_madd( a0, b0, (vector float) vec_splat_u32(0) );
a0 = vec_add( a0, vec_sld( a0, a0, 8 ) );
a0 = vec_add( a0, vec_sld( a0, a0, 4 ) );
vec_ste( a0, 0, &sum );
return sum;

Please note that dot products of simple vector floats are usually faster in the scalar units. The add across and transfer to scalar is...
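For readers without an AltiVec machine, the reduction in the snippet above can be mimicked in portable C. This is only a sketch under assumptions: the names dot4_scalar and dot4_lanes are hypothetical and do not come from the patch, and the lane rotation stands in for vec_sld applied to a vector concatenated with itself (8 bytes is two float lanes, 4 bytes is one).

```c
/* Plain scalar dot product of two 4-element float arrays.
 * Hypothetical helper, not part of the original patch. */
static float dot4_scalar(const float *a, const float *b)
{
    float sum = 0.0f;
    for (int i = 0; i < 4; ++i)
        sum += a[i] * b[i];
    return sum;
}

/* Lane-by-lane emulation of the vector sequence in the snippet:
 * multiply-add with a zero addend, then two rotate-and-add steps.
 * Hypothetical helper, not part of the original patch. */
static float dot4_lanes(const float *a, const float *b)
{
    float v[4], t[4];

    /* vec_madd( a0, b0, zero ): per-lane product plus zero */
    for (int i = 0; i < 4; ++i)
        v[i] = a[i] * b[i] + 0.0f;

    /* vec_add( v, vec_sld( v, v, 8 ) ): add the vector rotated by two lanes */
    for (int i = 0; i < 4; ++i)
        t[i] = v[i] + v[(i + 2) % 4];

    /* vec_add( t, vec_sld( t, t, 4 ) ): add the vector rotated by one lane */
    for (int i = 0; i < 4; ++i)
        v[i] = t[i] + t[(i + 1) % 4];

    /* Every lane now holds the full sum, so any single element can be stored */
    return v[0];
}
```

Both helpers return the same value for small integer-valued inputs. The second one makes it visible that the two rotate-and-add steps leave the total in all four lanes, which is why a single-element store is enough to move the result to a scalar.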

2004 Aug 06

[PATCH] Make SSE Run Time option.

So we ran the code on a Windows XP-based Athlon XP system, and the xmm registers work just fine, so it appears that Windows 2000 and below do not support them. We agree on not supporting the non-FP version; however, the run-time flags need to be settable with a non-FP SSE mode so that exceptions are avoided. I thus propose a set of defines like this instead of the ones in our initial patch:
