similar to: Optimization and doubles vs. floats

Displaying 20 results from an estimated 6000 matches similar to: "Optimization and doubles vs. floats"

optimization progress

2000 Aug 22

1

Hi all, the decoder is down 30% in execution time, with identical bit output. Didn't get the mdct yet; a 1024-point mdct is a bit much to brute-force, and I'm not going to hand-unroll the whole thing either (the machine-unrolled version produced a 1.5M executable; understandably, it wasn't very fast). Still waiting for processors with 1.5M L1 code caches ;-) Slowest parts now are: -- mdct --

New LSP code committed

2000 Aug 19

3

So, it turns out (and another implementation actually explicitly mentions it) that LSP->LPC computation using the FIR algorithm is very sensitive to noise (iterative algorithm) and really really requires doubles [we're not kidding]. This was complicating things for folks pursuing fixed point implementations, and also was a potential source for bugs if FP optimizations got out of hand. This

Thought for the new year

2000 Dec 26

4

Some thoughts for the new year: 1) MDCT is good for image coding 2) image coding and audio coding are two very different things 3) combine 1 and 2 4) if a psycho model is good, after leaving out what it tells you you can without hurting quality, applying the same model should yield the same results as you got before 5) from 4: decode -> encode -> decode should result in (almost) the

2001 May 23

3

optimisation

What are the main areas where optimisation will take place to reduce CPU use when decoding Ogg Vorbis files? -- Venlig hilsen/Kind regards Thomas Kirk ARKENA thomas@arkena.com http://www.arkena.com "I was drunk last night, crawled home across the lawn. By accident I put the car key in the door lock. The house started up. So I figured what the hell, and drove it around the block a

More mdct questions

2000 Oct 23

4

Sorry for starting another topic, this is actually a reply to Segher's post on Sun Oct 22 on the 'mdct question' topic. I wasn't subscribed properly and so I didn't get email confirmation and thus can't add to that thread. So Segher, if the equation is indeed what you say it is, then replacing mdct_backward with this version should work, but it doesn't. Am I applying

2003 Apr 08

6

bitpeeler

No offense, Segher, but the output quality of this thing is awful. =) I'll disregard the fact that, at least with *my* compiler, the source tarball I downloaded reduces every packet to zero bytes, which isn't terribly interesting. I decided to set the byte reduction to something constant: I started by dividing each packet's size by 2 just to see what would happen. The resulting ogg

[PATCH] Make SSE Run Time option.

2004 Aug 06

6

So we ran the code on a Windows XP based Athlon XP system and the xmm registers work just fine, so it appears that Windows 2000 and below do not support them. We agree on not supporting the non-FP version; however, the run time flags need to be settable with a non FP SSE mode so that exceptions are avoided. I thus propose a set of defines like this instead of the ones in our initial patch:

SIMD instructions

2003 Jan 23

4

Vorbis does not appear to use any SIMD instructions. A short look around in the source code indicates that it would be possible and might even yield big performance improvements. Why has nobody done it yet? I am currently trying to learn to use these instructions and would be willing to rewrite a few functions with SIMD instructions, if I understand how to vectorize them and if they make a

Transient coding: AAC vs. Vorbis

2004 Jun 02

4

Thread-split from the vorbis-mailing list ("Vorbis determined to be as good as MPC at 128 kbps!") On Sun, 30 May 2004, Segher Boessenkool wrote: [Steven So] >> If iTunes AAC can encode castanets with much less pre-echo at ABR 128 kbps, then hopefully there will be an imaginative (and non-patented) way of doing this in Vorbis without the

mdct_backward with fused muladd?

2003 May 20

2

Can anybody point me at any resources that would explain how to optimize mdct_backward for a cpu with a fused multiply-accumulate unit? From what I understand from responses to my older postings, Tremor's mdct_backward could be rewritten to take advantage of a muladd. My target machine can do either two-wide 32x32 + Accum(64) -> Accum(64) integer muladd or eight-wide 16x16 + Accum(32)

Vorbis optimizations...

2000 Aug 11

2

I was wondering what the near and longer term future directions for optimization of Vorbis decompression are. I am interested in using it as a replacement for MP3. But it seems at this point (not certain about this) that Vorbis doesn't really have much optimization implemented yet. By this I mean, for example, implementing something like MMX x86 optimizations in order to speed up the

2000 Oct 20

2

mdct question

Hi, can someone tell me which MDCT and invMDCT equations Vorbis uses? I implemented the invMDCT one given in the eusipco.corrected.ps file (handed out by Monty way back) and it produces different time domain samples. I tried both the FFT method and the slow way directly from the equation and couldn't reproduce the results from the original code. This leads me to believe that the forward MDCT used in

several questions

2005 Sep 25

3

Hi, any help is appreciated. I have two questions. First, I looked into the vorbis-?-specific and was confused by the floor1 algorithm. I think in the end the aim is to derive the piecewise curve from the lists X and Y, so when encoding, why not use the selected points in order to get the curve? In other words, the spec uses the list of [0,128,64,32,96,16,48,80], why can it not use

mdct as hardware

2002 Mar 19

1

Hi vorbis-dev! I'm working with Pattara on the oggonachip project, and wondering about the implementation of mdct.c as hardware. According to your recommendations about using the floating point version, I would say we have to implement the integerized version of mdct as a core, and use the fpu only to round the input values. By doing that, do you think the result would still be acceptable? How

Spectral phase information in residue vectors

2002 Oct 22

3

I found this sentence in the Ogg format specs: http://www.xiph.org/ogg/vorbis/doc/vorbis-spec-res.html "A residue vector may represent spectral lines, spectral magnitude, spectral phase or hybrids as mixed by channel coupling." But where does the spectral phase information come from? AFAIK MDCT doesn't provide any phase information. And in OGG-encoding, MDCT is taking place a few

2000 Nov 15

8

Optimisations

Looking through the archives I have seen talk of making CPU specific optimisations for Vorbis, a la MMX/3DNow!/SSE. The feeling I gather is to wait until something is working well in C before committing to any kind of specific optimisation. What if oft used and needed DSP functions were identified and standardised DSP functionality were written for Vorbis? This would separate the basically

here's the test case, possible solution

2000 Nov 21

2

Hello all, Finally I succeeded in uploading the test case I promised. It's at http://home.wanadoo.nl/segher/test1.wav.bz2 (It is a wav, the headers are a bit inconsistent, but encoder_example will be ok with it, as it just skips them). I did some thinking, and a possible solution is decreasing the ATH_Bark_dB[] for the lower frequencies. As the comments say, it's not really an ATH, but

Altivec-enabled libvorbis...

2003 Oct 12

1

Hey guys, I just released my new MacOSX-based OpenAL implementation...part of it is an Ogg Vorbis decoder based on the 1.0 reference libraries. I spent some time optimizing them and found that many of the hotspots in libvorbis are perfect candidates for vectorization, so I wrote Altivec versions of them. The end result? Decoding of a .ogg file is between 30 and 50% faster on a Mac with an

2004 Aug 06

5

SIMD interest

Greetings, my apologies for putting this trash in the mailing list but the topic about the SSE run-time option interested me quite a bit. It looks like some people are really experienced on the topic. I would really appreciate it if somebody could point me to good resources about SSE and Altivec (not necessarily on the net, I'm ready to invest some money if necessary). I already have Intel

[LLVMdev] Getting command line options to affect subtarget features

2013 Jan 31

0

On Thu, 2013-01-31 at 11:29 -0600, Bill Schmidt wrote: > On Thu, 2013-01-31 at 11:23 -0600, Bill Schmidt wrote: > > On Thu, 2013-01-31 at 10:17 -0600, Bill Schmidt wrote: > > > > > > On Thu, 2013-01-31 at 09:42 -0600, Hal Finkel wrote: > > > > ----- Original Message ----- > > > > > From: "Bill Schmidt" <wschmidt at