Displaying 20 results from an estimated 1000 matches similar to: "Opus ARM optimizations"
2013 Jun 11
0
Bug fix in celt_lpc.c and some xcorr_kernel optimizations
Although I've never used ARM's compiler, I admit I'm very surprised that
it's not compatible with the NEON intrinsics. Given that and M.
Zanelli's speed tests, it seems clear that M. Zanelli's code is the way
to go. I look forward to its inclusion in the opus GIT.
--John
On 6/10/2013 1:00 PM, opus-request at xiph.org wrote:
> Date: Mon, 10 Jun 2013 10:36:34 +0100
2013 May 17
1
[Patch]01-Add ARM5E macros
Hello,
This is a first patch, which adds macros for ARMv5E.
I also copied the headers from other files and added the company name; tell me
if that is wrong.
If you have any questions or comments about it, feel free to contact me.
Best regards,
--
Aurélien Zanelli
Parrot SA
174, quai de Jemmapes
75010 Paris
France
-------------- next part --------------
diff --git a/celt/fixed_arm5e.h
2013 May 27
0
[Patch] Check if opus_compare is executable in run_vectors.sh
If opus_compare doesn't exist or isn't executable, the tests fail in the
usual way, which can be misleading.
So test for existence and mode to avoid this ambiguity.
--
Aurélien Zanelli
Parrot SA
174, quai de Jemmapes
75010 Paris
France
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-Check-if-opus_compare-is-executable-in-run_vectors.s.patch
Type:
2013 May 23
2
ASM runtime detection and optimizations
I wrote a proof of concept for runtime detection of CPU capabilities and
selection of the optimized functions. I followed the design that was
discussed on IRC.
I also noticed a small drawback: we must propagate the arch index
through functions which don't take the codec state as an argument.
However, if it looks good, I will continue to implement it.
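
The dispatch idea described above can be sketched in a few lines of C. This is only an illustration under stated assumptions, not the code from the proof of concept: the arch index names, the `dot` example function, and the `select_arch` helper are all hypothetical, and the "optimized" entry is a stub that forwards to the portable version. It does show why the arch index has to be passed down to every caller that has no codec state to read it from.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical arch indices; the names are illustrative only. */
enum { ARCH_C = 0, ARCH_ARMV5E = 1, ARCH_NEON = 2, ARCH_COUNT = 3 };

typedef int (*dot_fn)(const short *x, const short *y, int len);

/* Portable reference implementation. */
static int dot_c(const short *x, const short *y, int len)
{
    int sum = 0;
    for (int i = 0; i < len; i++)
        sum += x[i] * y[i];
    return sum;
}

/* Stand-in for an optimized variant. A real build would provide
   assembly or intrinsics here; this stub just forwards. */
static int dot_opt_stub(const short *x, const short *y, int len)
{
    return dot_c(x, y, len);
}

/* One table per dispatched function, indexed by arch. */
static const dot_fn DOT_IMPL[ARCH_COUNT] = { dot_c, dot_opt_stub, dot_opt_stub };

/* Placeholder for real capability detection: the two flags are
   supplied by the caller instead of being probed from the CPU. */
static int select_arch(int has_edsp, int has_neon)
{
    if (has_neon) return ARCH_NEON;
    if (has_edsp) return ARCH_ARMV5E;
    return ARCH_C;
}

/* The arch index is an explicit argument because this function has
   no codec state from which to fetch it. */
static int dot(const short *x, const short *y, int len, int arch)
{
    assert(arch >= 0 && arch < ARCH_COUNT);
    return DOT_IMPL[arch](x, y, len);
}
```

In this sketch the detection would run once at initialization, and the resulting index would be stored in the codec state and handed to helpers such as `dot` on each call.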
Best regards,
--
Aurélien Zanelli
2013 May 21
2
[PATCH] 02-Add CELT filter optimizations
Please ignore my previous mail and patch, there is a new version :).
Patch changes are:
- Use MAC16_16 macros instead of (sum += a*b) and unroll a loop by 2. This
increases performance when using optimized macros (e.g. ARMv5E). A
possible side effect of the loop unrolling is that I don't check for odd
lengths here.
- Add NEON version of FIR filter and autocorr
- Add a section in autoconf in order to
2013 Jun 07
1
Bug fix in celt_lpc.c and some xcorr_kernel optimizations
Unfortunately I don't have a setup that lets me easily profile ARM code,
so I really can't tell which method is faster (though I suspect Mr.
Zanelli's code is). Let me offer up another intrinsic version of the
NEON xcorr_kernel that is almost identical to the SSE version, and more
in line with Mr. Zanelli's code:
static inline void xcorr_kernel_neon(const opus_val16 *x, const
2013 Jun 07
2
Bug fix in celt_lpc.c and some xcorr_kernel optimizations
Hi JM,
I have no doubt that Mr. Zanelli's NEON code is faster, since hand tuned
assembly is bound to be faster than using intrinsics. However I notice
that his code can also read past the y buffer.
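
To make the buffer-bounds concern concrete, here is a scalar sketch of a four-lag cross-correlation kernel. It assumes the kernel accumulates four lags at once; `xcorr_kernel_ref` is a hypothetical name and plain `short`/`int` types stand in for the codec's own. With this definition the legitimate reads are x[0..len-1] and y[0..len+2], so any vectorized version that loads beyond those ranges is reading past the buffers.

```c
#include <assert.h>

/* Scalar sketch: sum[k] += x[j] * y[j + k] for k = 0..3.
   Reads x[0..len-1] and y[0..len+2], and nothing beyond. */
static void xcorr_kernel_ref(const short *x, const short *y, int sum[4], int len)
{
    assert(len > 0);
    for (int j = 0; j < len; j++)
        for (int k = 0; k < 4; k++)
            sum[k] += x[j] * y[j + k];
}
```

A reference like this can be run under a memory checker next to an optimized kernel, with buffers allocated at exactly len and len+3 elements, to expose over-reads.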
Cheers,
--John
On 6/6/2013 9:22 PM, Jean-Marc Valin wrote:
> Hi John,
>
> Thanks for the two fixes. They're in git now. Your SSE version seems to
> also be slightly faster than
2013 May 21
0
[PATCH] 02-
- Use MAC16_16 macros instead of (sum += a*b) and unroll a loop by 2. This
increases performance when using optimized macros (e.g. ARMv5E). A
possible side effect of the loop unrolling is that I don't check for odd
lengths here.
- Add NEON version of FIR filter and autocorr
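
As a rough sketch of the first item, the loop below uses a multiply-accumulate macro and unrolls by 2, and adds the tail step that an unrolled loop needs when the length is odd. The macro body is a generic stand-in written for this example, and `inner_prod_unrolled` is a hypothetical name; neither is taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Generic multiply-accumulate stand-in (assumption). An optimized
   build could replace it with a single instruction on targets that
   have a 16x16+32 multiply-accumulate. */
#define MAC16_16(c, a, b) ((c) + (int32_t)(a) * (int32_t)(b))

static int32_t inner_prod_unrolled(const int16_t *x, const int16_t *y, int len)
{
    int32_t sum = 0;
    int i;
    assert(len >= 0);
    /* Main loop: two multiply-accumulates per iteration. */
    for (i = 0; i + 1 < len; i += 2) {
        sum = MAC16_16(sum, x[i], y[i]);
        sum = MAC16_16(sum, x[i + 1], y[i + 1]);
    }
    /* Tail: handle the last element when len is odd. */
    if (i < len)
        sum = MAC16_16(sum, x[i], y[i]);
    return sum;
}
```

Without the tail step, an odd length would either drop the last sample or read one element past the end, depending on how the loop bound is written.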
--
Aurélien Zanelli
Parrot SA
174, quai de Jemmapes
75010 Paris
France
-------------- next part --------------
diff --git
2012 Mar 27
0
[LLVMdev] PBQP & CalcSpillWeights
Hi Arnaud,
Thanks for attaching those files. I'll take a look at them.
Commit r153483 adds an option to the PBQP allocator,
"-pbqp-dump-graphs", to dump the PBQP graph for each round of each
function in a compilation unit. The generated files are named "<module
id>.<function>.<round>.pbqpgraph", and contain a simple text
representation of the PBQP graph.
2012 Mar 26
2
[LLVMdev] PBQP & CalcSpillWeights
Hi Lang,
> From memory your target is not public, so I won't be able to reproduce
> the crash myself. Is that correct?
Correct.
> If that's the case, I could add functionality to dump the PBQP graphs
> during allocation. I think they should give me enough information to
> debug the issue. Would you be able to share the PBQP graphs?
I can share the pbqp graph if you send
2012 Apr 03
0
[LLVMdev] PBQP & CalcSpillWeights
Hi Arnaud,
Apologies for the delayed reply.
Thank you for the excellent test case - it exposed a subtle bug in the
colorability heuristic. This has been fixed in r153958.
In case you are curious, the bug was as follows: the PBQP solver applies
a simplification step to each matrix. When all elements of a matrix
row or column are equal, the value for those elements is "pushed
2012 Apr 19
1
[LLVMdev] PBQP & CalcSpillWeights
Hi Arnaud,
I'm glad to hear that your test case is working.
> I however still get my wrong allocation in some non trivial cases : the
> pairing constraint is not fulfilled.
>
> I have tried to modify the 'ensure pairable' pass (the pass undoing some
> of the coalescer's work) to always insert register copies for
> instructions with the pairable constraint, instead of
2012 Mar 27
2
[LLVMdev] PBQP & CalcSpillWeights
Hi Lang,
I have reduced the testcase as much as possible. The log of the run and the
dumped graphs are attached.
Cheers,
--
Arnaud de Grandmaison
On Tuesday, March 27, 2012 01:20:35 Lang Hames wrote:
> Hi Arnaud,
>
> Thanks for attaching those files. I'll take a look at them.
>
> Commit r153483 adds an option to the PBQP allocator,
> "-pbqp-dump-graphs", to
2012 Apr 11
0
[LLVMdev] PBQP & CalcSpillWeights
Hi Lang,
The assert is not triggered any longer on my testcases :)
I however still get my wrong allocation in some non trivial cases : the
pairing constraint is not fulfilled.
I have tried to modify the 'ensure pairable' pass (the pass undoing some
of the coalescer's work) to always insert register copies for
instructions with the pairable constraint, instead of being smart and
2012 Apr 05
2
[LLVMdev] PBQP & CalcSpillWeights
Hi Lang,
Thanks a lot for taking time to look into this. I will test the fix soon and
let you know the results.
Cheers,
--
Arnaud de Grandmaison
On Tuesday, April 03, 2012 17:30:33 Lang Hames wrote:
> Hi Arnaud,
>
> Apologies for the delayed reply.
>
> Thank you for the excellent test case - it exposed a subtle bug in the
> colorability heuristic. This has been fixed in
2013 Jun 07
2
Bug fix in celt_lpc.c and some xcorr_kernel optimizations
Hi JM,
At line 221 in celt_lpc.c (the celt_iir function) I think you really
want the RESTORE_STACK statement to be before the #endif instead of
after it. Also, I couldn't help notice that your SSE code for
xcorr_kernel reads more than "len" elements of "_x". I don't know if
that's really a problem when running the codec, but a tool like valgrind
will have a
2013 Jan 20
0
[LLVMdev] codegen of volatile aggregate copies (was "Weird volatile propagation" on llvm-dev)
As a result of my investigations, this thread is also being sent to cfe-dev.
The context : while porting my company code from the LLVM/Clang releases
3.1 to 3.2, I stumbled on a code size and performance regression. The
testcase is :
$ cat test.c
#include <stdint.h>
struct R {
uint16_t a;
uint16_t b;
};
volatile struct R * const addr = (volatile struct R *) 416;
void test(uint16_t a)
{
2008 Aug 01
3
Xen Networking problem!
Hi,
I've got a CentOS 5.2 server running xen 3.0 with 2 DomUs also running
CentOS 5.2.
All my boxes are up-to date.
I'm experiencing trouble with networking.
Dom0 can reach the outside world when no DomU are started. It can also
reach the outside world when only one DomU is running.
The troubles begin when I start the second DomU. At first, this new
DomU, called DomU2,
2008 May 15
7
Unable to run Watchtower Library 2007 Japanese : font ?
Hello,
I was using both versions of WTLib 2007 (French and Japanese) under Ubuntu 7.10 and Wine 0.9.47, and they worked fine with just an installation of VCRedist and some Japanese fonts in the windows folder. But recently I moved my system to the new Ubuntu 8.04 with the latest version of Wine: 1.0 RC1.
I read that i don't need to install vcredist and font anymore. It's supposed to be a platinum
2006 Jun 01
2
Problem to join ADS domain.
Hi,
I post my message here because I can't debug my problem, I hope you have
time to help me to find the problem.
I'm trying to join my Samba machine to an ADS domain, but my "net ads join"
doesn't work :(
Here are my logs; if you need more detail, ask me.
~# klist
Ticket cache: FILE:/tmp/krb5cc_0
Default principal: administrateur@TEST.LAN
Valid starting Expires