Displaying 2 results from an estimated 2 matches for "vec_splat_u32".
2004 Aug 06
0
[PATCH] Make SSE Run Time option.
...a0 = vec_ld( 0, a );
vector float a1 = vec_ld( 15, a );
vector float b0 = vec_ld( 0, b );
vector float b1 = vec_ld( 15, b );
a0 = vec_perm( a0, a1, vec_lvsl( 0, a ) );
b0 = vec_perm( b0, b1, vec_lvsl( 0, b ) );
a0 = vec_madd( a0, b0, (vector float) vec_splat_u32(0) ) ;
a0 = vec_add( a0, vec_sld( a0, a0, 8 ) );
a0 = vec_add( a0, vec_sld( a0, a0, 4 ) );
vec_ste( a0, 0, &sum );
return sum;
Please note that dot products of simple vector floats are usually faster
in the scalar units. The add across and transfer to scalar is...
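
For readers unfamiliar with the add-across step in the snippet above, the following is a minimal scalar sketch in plain C of what the two vec_sld/vec_add pairs compute. It is a model only, not the poster's code: the names rotate_left and dot4_model are hypothetical, and it assumes the big-endian lane numbering AltiVec uses, where vec_sld( v, v, n ) rotates the vector left by n bytes (n/4 float lanes).

```c
/* Scalar model of vec_sld(v, v, 4 * lanes) on a 4-float vector:
 * with both inputs equal, the shift is a rotate left by `lanes` elements.
 * (Hypothetical helper for illustration only.) */
static void rotate_left(const float in[4], int lanes, float out[4])
{
    for (int i = 0; i < 4; i++)
        out[i] = in[(i + lanes) % 4];
}

/* Scalar model of the 4-element dot product from the quoted snippet. */
static float dot4_model(const float a[4], const float b[4])
{
    float v[4], r[4];

    /* vec_madd( a0, b0, zero ): per-lane multiply with a zero addend */
    for (int i = 0; i < 4; i++)
        v[i] = a[i] * b[i] + 0.0f;

    /* vec_add( a0, vec_sld( a0, a0, 8 ) ): add the vector rotated by 2 lanes */
    rotate_left(v, 2, r);
    for (int i = 0; i < 4; i++)
        v[i] += r[i];

    /* vec_add( a0, vec_sld( a0, a0, 4 ) ): add the vector rotated by 1 lane */
    rotate_left(v, 1, r);
    for (int i = 0; i < 4; i++)
        v[i] += r[i];

    /* Every lane now holds the full sum, so it does not matter which
     * single element vec_ste writes out. */
    return v[0];
}
```

Calling dot4_model with a = {1, 2, 3, 4} and b = {5, 6, 7, 8} walks through the lane values {5, 12, 21, 32}, then {26, 44, 26, 44}, then {70, 70, 70, 70}. The model says nothing about performance, which is the point of the poster's remark about the scalar units.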
2004 Aug 06
6
[PATCH] Make SSE Run Time option.
So we ran the code on a Windows XP based Athlon XP system and the xmm
registers work just fine, so it appears that Windows 2000 and below do not
support them.
We agree on not supporting the non-FP version; however, the run-time flags
need to be settable with a non-FP SSE mode so that exceptions are avoided.
I thus propose a set of defines like this instead of the ones in our
initial patch: