Displaying 3 results from an estimated 3 matches for "stvewx".
2004 Sep 10
4
Altivec Optimizations
Hi,
I have been playing with Altivec, and I rewrote a couple of the routines
in assembly. Looking at the archives, I noticed that there may already
be some effort on this. Anyways...
Right now, I have two routines working. They need to be cleaned up, made
relocatable, and documented; otherwise, they seem to work fairly well. I
see an overall ~27% speed improvement when encoding with the
2004 Oct 06
3
flac-1.1.1 completely broken on linux/ppc and on macosx if built with the standard toolchain (not xcode)
Sadly, the latest optimization completely broke everything.
The asm code isn't gas-compliant, the libFLAC linker script has a typo, and
disabling the asm optimization and/or AltiVec still won't give a correct
build.
Instant fixes for the asm stuff:
sed -i -e"s:;:\#:" on the lpc_asm.s
To load an address, instead of addis+ori you could use
lis and la, and PLEASE use the @l(register)
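
For readers unfamiliar with the lis/la idiom mentioned above, here is a small C model of the arithmetic behind it. It is only a sketch under assumptions: the helper names (ppc_ha, ppc_l, lis_la) are made up for illustration, and the pairing of lis with the high-adjusted @ha half is the usual gas convention rather than something stated in the post. Because la is an add with a sign-extended 16-bit immediate, the upper half has to be rounded up whenever bit 15 of the address is set.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: the "high adjusted" half, as usually written sym@ha. */
static uint16_t ppc_ha(uint32_t addr)
{
    return (uint16_t)((addr + 0x8000u) >> 16);
}

/* Hypothetical helper: the low 16 bits, as usually written sym@l. */
static uint16_t ppc_l(uint32_t addr)
{
    return (uint16_t)(addr & 0xFFFFu);
}

/* Models `lis rD,sym@ha` followed by `la rD,sym@l(rD)` on a 32-bit target. */
static uint32_t lis_la(uint32_t addr)
{
    /* lis: place the 16-bit immediate in the upper half of the register. */
    uint32_t reg = (uint32_t)ppc_ha(addr) << 16;

    /* la (addi): add the low half, sign-extended to 32 bits.
     * The int16_t conversion assumes a two's-complement compiler. */
    reg += (uint32_t)(int32_t)(int16_t)ppc_l(addr);

    assert(reg == addr); /* the two halves always recombine to the address */
    return reg;
}
```

For example, for the address 0x12348000 the low half sign-extends to -0x8000, so the upper half must be 0x1235 rather than 0x1234 for the sum to come out right.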
2004 Sep 10
1
altivec lpc_restore_signal
...lvewx v21,0,r3 ; v21[n]: *residual
vperm v21,v21,v21,v18 ; v21[3]: *residual
vaddsws v20,v21,v20 ; v20[3]: *residual + (sum >> lp_quantization)
vsldoi v18,v18,v18,4 ; increment shift vector
vperm v21,v20,v20,v17 ; v21[n]: shift for storage
vsldoi v17,v17,v17,12 ; increment shift vector
stvewx v21,0,r8
vsldoi v20,v20,v20,12
vsldoi v8,v8,v20,4 ; insert value onto history
addi r3,r3,4
addi r8,r8,4
cmplw cr0,r8,r4 ; i<data_len
bc 12,0,L1200
L1400:
mtspr 256,r0 ; restore old vrsave
lmw r31,-4(r1)
blr
_FLAC__lpc_restore_signal_asm_ppc_altivec_16_order8:
; r3: residual[]
; r4:...
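
As a reading aid for the loop above, here is a scalar C sketch of what the comments describe: each output sample is the residual plus the prediction sum shifted right by lp_quantization. It is an assumption-laden reference, not the routine from the post: the function name and argument order are hypothetical, it uses a plain wrapping add where the vector code uses the saturating vaddsws, and it relies on the compiler performing an arithmetic right shift on negative values.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar reference for LPC signal restoration.
 * `data` must be preceded by `order` history samples, i.e. data[-order]
 * through data[-1] must be valid and already decoded. */
static void lpc_restore_signal_ref(const int32_t *residual, size_t data_len,
                                   const int32_t *qlp_coeff, unsigned order,
                                   int lp_quantization, int32_t *data)
{
    assert(lp_quantization >= 0 && lp_quantization < 32);

    for (size_t i = 0; i < data_len; i++) {
        int32_t sum = 0;

        /* Prediction: weighted sum of the previous `order` samples. */
        for (unsigned j = 0; j < order; j++)
            sum += qlp_coeff[j] * data[(ptrdiff_t)i - (ptrdiff_t)j - 1];

        /* residual + (sum >> lp_quantization), as in the asm comments. */
        data[i] = residual[i] + (sum >> lp_quantization);
    }
}
```

A scalar version like this can serve as a baseline when checking the output of a vectorized routine sample by sample.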