Displaying 3 results from an estimated 3 matches for "stvewx".

2004 Sep 10
4
Altivec Optimizations
Hi, I have been playing with Altivec, and I rewrote a couple of the routines in assembly. Looking at the archives, I noticed that there may already be some effort on this. Anyways... Right now, I have two routines working. They need to be cleaned up, made relocatable, and documented; otherwise, they seem to work fairly well. I see an overall ~27% speed improvement when encoding with the
2004 Oct 06
3
flac-1.1.1 completely broken on linux/ppc and on macosx if built with the standard toolchain (not xcode)
Sadly, the latest optimization completely broke everything. The asm code isn't gas-compliant, the libFLAC linker script has a typo, and disabling the asm optimization and/or AltiVec won't give a correct build anyway. Instant fixes for the asm stuff: run sed -i -e"s:;:\#:" on lpc_asm.s; to load an address, instead of addis+ori you could use lis and la, and PLEASE use the @l(register)
2004 Sep 10
1
altivec lpc_restore_signal
...lvewx v21,0,r3 ; v21[n]: *residual
    vperm v21,v21,v21,v18 ; v21[3]: *residual
    vaddsws v20,v21,v20 ; v20[3]: *residual + (sum >> lp_quantization)
    vsldoi v18,v18,v18,4 ; increment shift vector
    vperm v21,v20,v20,v17 ; v21[n]: shift for storage
    vsldoi v17,v17,v17,12 ; increment shift vector
    stvewx v21,0,r8
    vsldoi v20,v20,v20,12
    vsldoi v8,v8,v20,4 ; insert value onto history
    addi r3,r3,4
    addi r8,r8,4
    cmplw cr0,r8,r4 ; i<data_len
    bc 12,0,L1200
L1400:
    mtspr 256,r0 ; restore old vrsave
    lmw r31,-4(r1)
    blr
_FLAC__lpc_restore_signal_asm_ppc_altivec_16_order8:
    ; r3: residual[]
    ; r4:...
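
For readers following the comments in the snippet above, here is a minimal scalar sketch in C of the recurrence they describe: each output sample is the residual plus the predictor sum shifted right by lp_quantization. The function name, parameter list, and buffer layout are assumptions made for illustration, not the posted routine or the library's API, and the sketch uses a plain add where the vector code uses a saturating one (vaddsws).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar reference (illustrative name and signature).
 * Assumes data[-order .. -1] already holds the warm-up history and that
 * the products fit in 32 bits. Right-shifting a negative int32_t is
 * implementation-defined in C; common compilers shift arithmetically. */
static void restore_signal_scalar(const int32_t *residual, size_t data_len,
                                  const int32_t *qlp_coeff, size_t order,
                                  int lp_quantization, int32_t *data)
{
    assert(lp_quantization >= 0);
    for (size_t i = 0; i < data_len; i++) {
        int32_t sum = 0;
        /* Predictor: weighted sum of the previous `order` samples. */
        for (size_t j = 0; j < order; j++)
            sum += qlp_coeff[j] * data[(ptrdiff_t)i - (ptrdiff_t)j - 1];
        /* Matches the comment: *residual + (sum >> lp_quantization) */
        data[i] = residual[i] + (sum >> lp_quantization);
    }
}
```

A caller would pass a pointer just past the warm-up samples, for example buf + order, so that the negative indices land on the history.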