Displaying 5 results from an estimated 5 matches for "_use_sse2".
2008 Nov 26
1
SSE2 code won't compile in VC
Jean-Marc,
At least VS2005 (what I'm using) won't compile resample_sse.h with
_USE_SSE2 defined because it refuses to cast __m128 to __m128d and vice
versa. While there are intrinsics to do the casts, I thought it would be
simpler to just use an intrinsic that accomplishes the same thing
without all the casting. Thanks,
--John
@@ -91,7 +91,7 @@ static inline double inner_product...
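The diff above is cut off, so the actual change is not visible here. As a rough illustration of the idea John describes, the sketch below accumulates a float inner product in double precision without ever casting between __m128 and __m128d: the upper two floats are moved into the lower half with _mm_movehl_ps (which stays in __m128) and then widened with _mm_cvtps_pd. The function name inner_product_double_sketch, the requirement that len be a multiple of 4, and the scalar fallback are assumptions made for this example, not part of resample_sse.h.

```c
#if defined(__SSE2__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 2)
#include <emmintrin.h>
#define SKETCH_HAVE_SSE2 1
#endif

/* Hypothetical helper for illustration only; not the function from the patch.
 * Assumes len is a multiple of 4. */
static double inner_product_double_sketch(const float *a, const float *b,
                                          unsigned int len)
{
#ifdef SKETCH_HAVE_SSE2
    unsigned int i;
    double ret;
    __m128d sum = _mm_setzero_pd();
    for (i = 0; i < len; i += 4) {
        /* Four single-precision products. */
        __m128 t = _mm_mul_ps(_mm_loadu_ps(a + i), _mm_loadu_ps(b + i));
        /* Widen the low two floats to doubles and accumulate. */
        sum = _mm_add_pd(sum, _mm_cvtps_pd(t));
        /* Bring the high two floats down (still __m128, so no cast),
         * then widen and accumulate them as well. */
        sum = _mm_add_pd(sum, _mm_cvtps_pd(_mm_movehl_ps(t, t)));
    }
    /* Horizontal add of the two double lanes. */
    sum = _mm_add_sd(sum, _mm_unpackhi_pd(sum, sum));
    _mm_store_sd(&ret, sum);
    return ret;
#else
    /* Scalar fallback so the sketch still builds without SSE2. */
    unsigned int i;
    double ret = 0.0;
    for (i = 0; i < len; i++)
        ret += (double)a[i] * (double)b[i];
    return ret;
#endif
}
```

Since every intermediate value is either a genuine __m128 or a genuine __m128d, a compiler that rejects C-style casts between the two vector types has nothing to complain about.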
2009 Oct 26
1
[PATCH] Fix miscompile of SSE resampler
...e float interpolate_product_single(const float *a, const float *b, u
sum = _mm_mul_ps(f, sum);
sum = _mm_add_ps(sum, _mm_movehl_ps(sum, sum));
sum = _mm_add_ss(sum, _mm_shuffle_ps(sum, sum, 0x55));
- _mm_store_ss(&ret, sum);
- return ret;
+ _mm_store_ss(ret, sum);
}
#ifdef _USE_SSE2
#include <emmintrin.h>
#define OVERRIDE_INNER_PRODUCT_DOUBLE
-static inline double inner_product_double(const float *a, const float *b, unsigned int len)
+static inline void inner_product_double(double *ret, const float *a, const float *b, unsigned int len)
{
int i;
- double ret;...
2008 Apr 26
2
Updated resampler patch
Hi,
Here's an updated resampler patch against current SVN. It includes SSE
and SSE2 optimizations (the latter are compiled in only when _USE_SSE2 is defined).
Best regards,
Thorvald
Name: speex-resampler-update.diff
Url: http://lists.xiph.org/pipermail/speex-dev/attachments/20080426/e055077f/attachment-0001.txt
2008 May 03
2
Resampler (no api)
...sum = _mm_add_ps(sum, _mm_mul_ps(_mm_load1_ps(a+i+1), _mm_loadu_ps(b+(i+1)*oversample)));
+ }
+ sum = _mm_mul_ps(f, sum);
+ sum = _mm_add_ps(sum, _mm_movehl_ps(sum, sum));
+ sum = _mm_add_ss(sum, _mm_shuffle_ps(sum, sum, 0x55));
+ _mm_store_ss(&ret, sum);
+ return ret;
+}
+
+#ifdef _USE_SSE2
+#include <emmintrin.h>
+#define OVERRIDE_INNER_PRODUCT_DOUBLE
+
+static inline double inner_product_double(const float *a, const float *b, unsigned int len)
+{
+ int i;
+ double ret;
+ __m128d sum = _mm_setzero_pd();
+ __m128 t;
+ for (i=0;i<len;i+=8)
+ {
+ t = _mm_mul_ps...
2008 May 03
0
Resampler, memory only variant
...sum = _mm_add_ps(sum, _mm_mul_ps(_mm_load1_ps(a+i+1), _mm_loadu_ps(b+(i+1)*oversample)));
+ }
+ sum = _mm_mul_ps(f, sum);
+ sum = _mm_add_ps(sum, _mm_movehl_ps(sum, sum));
+ sum = _mm_add_ss(sum, _mm_shuffle_ps(sum, sum, 0x55));
+ _mm_store_ss(&ret, sum);
+ return ret;
+}
+
+#ifdef _USE_SSE2
+#include <emmintrin.h>
+#define OVERRIDE_INNER_PRODUCT_DOUBLE
+
+static inline double inner_product_double(const float *a, const float *b, unsigned int len)
+{
+ int i;
+ double ret;
+ __m128d sum = _mm_setzero_pd();
+ __m128 t;
+ for (i=0;i<len;i+=8)
+ {
+ t = _mm_mul_ps...