Tom Harper
2005-Dec-02 14:26 UTC
[Speex-dev] run time assembler patch for altivec, sse + bug fixes
Hi Folks,

Attached is a patch against the latest svn, plus new source files. This patch allows the C or assembler versions of various functions to be selected at run time if _USE_SSE or _USE_ALTIVEC is defined at compile time. The basic concept is to use function pointers and preprocessor trickery to allow run-time selection without changing how the other platforms work, especially the platform function overrides.

I also included two small fixes to svn, as well as the project file changes needed to get things working.

If anyone has any feedback, that would be great. I have tested this on Windows using VC2003 and on Darwin using Xcode. I mainly want to make sure I didn't break any of the ARM/Blackfin stuff, as I don't have the setup(s) to test that.

Thanks!
Tom

-------------- next part --------------
A non-text attachment was scrubbed...
Name: filters_altivec.c
Type: application/octet-stream
Size: 4970 bytes
Desc: not available
Url : http://lists.xiph.org/pipermail/speex-dev/attachments/20051202/46c05f55/filters_altivec-0002.obj
-------------- next part --------------
A non-text attachment was scrubbed...
Name: cb_search_sse.c
Type: application/octet-stream
Size: 3214 bytes
Desc: not available
Url : http://lists.xiph.org/pipermail/speex-dev/attachments/20051202/46c05f55/cb_search_sse-0001.obj
-------------- next part --------------
A non-text attachment was scrubbed...
Name: vq_sse.c
Type: application/octet-stream
Size: 4006 bytes
Desc: not available
Url : http://lists.xiph.org/pipermail/speex-dev/attachments/20051202/46c05f55/vq_sse-0001.obj
-------------- next part --------------
A non-text attachment was scrubbed...
Name: filters_sse.c
Type: application/octet-stream
Size: 9943 bytes
Desc: not available
Url : http://lists.xiph.org/pipermail/speex-dev/attachments/20051202/46c05f55/filters_sse-0001.obj
-------------- next part --------------
A non-text attachment was scrubbed...
Name: ltp_sse.c
Type: application/octet-stream
Size: 3211 bytes
Desc: not available
Url : http://lists.xiph.org/pipermail/speex-dev/attachments/20051202/46c05f55/ltp_sse-0001.obj
-------------- next part --------------
/* Copyright (C) 2002 Jean-Marc Valin */
/**
   @file ltp_altivec.c
   @brief Long-Term Prediction functions (altivec version)
*/
/*
   Redistribution and use in source and binary forms, with or without
   modification, are permitted provided that the following conditions
   are met:

   - Redistributions of source code must retain the above copyright
   notice, this list of conditions and the following disclaimer.

   - Redistributions in binary form must reproduce the above copyright
   notice, this list of conditions and the following disclaimer in the
   documentation and/or other materials provided with the distribution.

   - Neither the name of the Xiph.org Foundation nor the names of its
   contributors may be used to endorse or promote products derived from
   this software without specific prior written permission.

   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
   ``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE FOUNDATION OR
   CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
   EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
   PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
   PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF
   LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING
   NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
   SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
*/

#ifdef _USE_ALTIVEC

#include "ltp_altivec.h"

/* Inner product of a and b. Every loop below consumes 8 floats per
   iteration, so len is effectively rounded up to a multiple of 8. */
spx_word32_t inner_prod_altivec(const float *a, const float *b, int len)
{
   int i;
   float sum;

   /* vec_ld ignores the low 4 address bits, so 16-byte alignment matters. */
   int a_aligned = (((unsigned long)a) & 15) ? 0 : 1;
   int b_aligned = (((unsigned long)b) & 15) ? 0 : 1;

   __vector float MSQa, LSQa, MSQb, LSQb;
   __vector unsigned char maska, maskb;
   __vector float vec_a, vec_b;
   __vector float vec_result;

   vec_result = (__vector float)vec_splat_u8(0);

   if ((!a_aligned) && (!b_aligned)) {
      /* This (unfortunately) is the common case. */
      /* Unaligned data: load two neighbouring aligned quadwords and
         merge them with vec_perm using the vec_lvsl mask. */
      maska = vec_lvsl(0, a);
      maskb = vec_lvsl(0, b);
      MSQa = vec_ld(0, a);
      MSQb = vec_ld(0, b);
      for (i = 0; i < len; i+=8) {
         a += 4;
         LSQa = vec_ld(0, a);
         vec_a = vec_perm(MSQa, LSQa, maska);
         b += 4;
         LSQb = vec_ld(0, b);
         vec_b = vec_perm(MSQb, LSQb, maskb);
         vec_result = vec_madd(vec_a, vec_b, vec_result);
         a += 4;
         MSQa = vec_ld(0, a);
         vec_a = vec_perm(LSQa, MSQa, maska);
         b += 4;
         MSQb = vec_ld(0, b);
         vec_b = vec_perm(LSQb, MSQb, maskb);
         vec_result = vec_madd(vec_a, vec_b, vec_result);
      }
   } else if (a_aligned && b_aligned) {
      for (i = 0; i < len; i+=8) {
         vec_a = vec_ld(0, a);
         vec_b = vec_ld(0, b);
         vec_result = vec_madd(vec_a, vec_b, vec_result);
         a += 4;
         b += 4;
         vec_a = vec_ld(0, a);
         vec_b = vec_ld(0, b);
         vec_result = vec_madd(vec_a, vec_b, vec_result);
         a += 4;
         b += 4;
      }
   } else if (a_aligned) {
      /* Only b needs the load-and-permute treatment. */
      maskb = vec_lvsl(0, b);
      MSQb = vec_ld(0, b);
      for (i = 0; i < len; i+=8) {
         vec_a = vec_ld(0, a);
         a += 4;
         b += 4;
         LSQb = vec_ld(0, b);
         vec_b = vec_perm(MSQb, LSQb, maskb);
         vec_result = vec_madd(vec_a, vec_b, vec_result);
         vec_a = vec_ld(0, a);
         a += 4;
         b += 4;
         MSQb = vec_ld(0, b);
         vec_b = vec_perm(LSQb, MSQb, maskb);
         vec_result = vec_madd(vec_a, vec_b, vec_result);
      }
   } else if (b_aligned) {
      /* Only a needs the load-and-permute treatment. */
      maska = vec_lvsl(0, a);
      MSQa = vec_ld(0, a);
      for (i = 0; i < len; i+=8) {
         a += 4;
         LSQa = vec_ld(0, a);
         vec_a = vec_perm(MSQa, LSQa, maska);
         vec_b = vec_ld(0, b);
         b += 4;
         vec_result = vec_madd(vec_a, vec_b, vec_result);
         a += 4;
         MSQa = vec_ld(0, a);
         vec_a = vec_perm(LSQa, MSQa, maska);
         vec_b = vec_ld(0, b);
         b += 4;
         vec_result = vec_madd(vec_a, vec_b, vec_result);
      }
   }

   /* Horizontal sum: after these two rotate-and-add steps every element
      of vec_result holds the total, so vec_ste may store any of them. */
   vec_result = vec_add(vec_result, vec_sld(vec_result, vec_result, 8));
   vec_result = vec_add(vec_result, vec_sld(vec_result, vec_result, 4));
   vec_ste(vec_result, 0, &sum);

   return sum;
}

#endif
-------------- next part --------------
A non-text attachment was scrubbed...
Name: ltp_altivec.h
Type: application/octet-stream
Size: 1786 bytes
Desc: not available
Url : http://lists.xiph.org/pipermail/speex-dev/attachments/20051202/46c05f55/ltp_altivec-0001.obj
-------------- next part --------------
A non-text attachment was scrubbed...
Name: filters_altivec.h
Type: application/octet-stream
Size: 1865 bytes
Desc: not available
Url : http://lists.xiph.org/pipermail/speex-dev/attachments/20051202/46c05f55/filters_altivec-0003.obj
-------------- next part --------------
A non-text attachment was scrubbed...
Name: asm_flag_patch_12_2_05.patch
Type: application/octet-stream
Size: 58437 bytes
Desc: not available
Url : http://lists.xiph.org/pipermail/speex-dev/attachments/20051202/46c05f55/asm_flag_patch_12_2_05-0001.obj
-------------- next part --------------
______________________________________________
Tom Harper
Lead Software Engineer
SightSpeed - http://www.sightspeed.com/
918 Parker St, Suite A14
Berkeley, CA 94710
Email: tharper@sightspeed.com
Phone: 510-665-2920
Fax: 510-649-9569
My SightSpeed Video Link: http://tom.sightspeed.com
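
-------------- next part --------------
The message describes selecting C or assembler versions at run time through function pointers and preprocessor switches, but the patch itself was scrubbed from the archive. The following is only a rough sketch of how such a scheme could look, not the contents of the patch. Every name in it (SKETCH_RUNTIME_DISPATCH, INNER_PROD, inner_prod_ptr, sketch_select_impl, inner_prod_unrolled) is hypothetical. The "fast" version is plain C standing in for a SIMD routine so that the sketch builds anywhere; it mirrors the 8-floats-per-iteration loop and the final horizontal sum of inner_prod_altivec.

```c
#include <assert.h>

/* Hypothetical compile-time switch; a real build would set this from
   the build system in the way _USE_SSE / _USE_ALTIVEC are set. */
#define SKETCH_RUNTIME_DISPATCH 1

/* Signature shared by every implementation of the inner product. */
typedef float (*inner_prod_fn)(const float *a, const float *b, int len);

/* Portable reference version. */
static float inner_prod_c(const float *a, const float *b, int len)
{
   float sum = 0.0f;
   int i;
   for (i = 0; i < len; i++)
      sum += a[i] * b[i];
   return sum;
}

/* Stand-in for a SIMD version: four partial sums (one per vector lane),
   eight floats per iteration. Requires len to be a multiple of 8. */
static float inner_prod_unrolled(const float *a, const float *b, int len)
{
   float acc[4] = {0.0f, 0.0f, 0.0f, 0.0f};
   int i, j;
   assert(len % 8 == 0);
   for (i = 0; i < len; i += 8) {
      for (j = 0; j < 4; j++) {
         acc[j] += a[i + j] * b[i + j];
         acc[j] += a[i + 4 + j] * b[i + 4 + j];
      }
   }
   /* Same pairing as the two vec_sld/vec_add steps: (0+2) + (1+3). */
   return (acc[0] + acc[2]) + (acc[1] + acc[3]);
}

#ifdef SKETCH_RUNTIME_DISPATCH
/* Callers go through a pointer that defaults to the C version. */
static inner_prod_fn inner_prod_ptr = inner_prod_c;
#define INNER_PROD(a, b, len) (inner_prod_ptr((a), (b), (len)))

/* Choose the implementation at run time (nonzero selects the fast one). */
static void sketch_select_impl(int use_fast)
{
   inner_prod_ptr = use_fast ? inner_prod_unrolled : inner_prod_c;
}
#else
/* Without the switch, callers bind directly to the C function, so
   platforms that do not opt in see no change. */
#define INNER_PROD(a, b, len) (inner_prod_c((a), (b), (len)))
#endif
```

A caller would write INNER_PROD(a, b, len) everywhere and call sketch_select_impl(1) once at start-up, for example after a CPU feature check. Keeping the indirection behind a macro is one way to leave existing per-platform overrides untouched when the switch is not defined.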