Displaying 5 results from an estimated 5 matches for "hboehm".
2015 Jan 20
1
[RFC PATCH v1 1/2] Optimize repeated calls to opus_select_arch
...requires introducing dependencies on threading libraries,
which are platform-specific (and may not even be available on some
platforms) and will make you lose performance.
So really I think the better solution is to modify the function
signatures, even if it looks like more typing.
[1] http://hboehm.info/boehm-hotpar11.pdf
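The snippet above argues for passing the detected architecture through function signatures rather than caching it behind a thread-safe global. A minimal sketch of that idea follows. All names here (detect_arch, sum_samples, encode_frames) are hypothetical and are not taken from the opus sources; detect_arch merely stands in for a probe such as opus_select_arch, and the "unrolled" routine stands in for an architecture-specific variant.

```c
#include <stddef.h>

/* Counts how often the (hypothetical) CPU probe runs. */
static int probe_calls = 0;

/* Hypothetical stand-in for a costly probe such as opus_select_arch(). */
static int detect_arch(void) {
    probe_calls++;
    return 1; /* placeholder architecture id */
}

/* Portable reference implementation. */
static long sum_generic(const short *x, size_t n) {
    long acc = 0;
    for (size_t i = 0; i < n; i++) acc += x[i];
    return acc;
}

/* Stand-in for an architecture-specific variant (unrolled by two). */
static long sum_unrolled(const short *x, size_t n) {
    long acc = 0;
    size_t i = 0;
    for (; i + 1 < n; i += 2) acc += x[i] + x[i + 1];
    if (i < n) acc += x[i]; /* odd-length tail */
    return acc;
}

/* The arch value is an explicit parameter: no global cache, no locking. */
static long sum_samples(const short *x, size_t n, int arch) {
    return arch == 1 ? sum_unrolled(x, n) : sum_generic(x, n);
}

/* The caller probes once and threads the result through every call. */
static long encode_frames(const short *x, size_t frame_len, size_t frames) {
    int arch = detect_arch();
    long total = 0;
    for (size_t f = 0; f < frames; f++)
        total += sum_samples(x + f * frame_len, frame_len, arch);
    return total;
}
```

With this shape the probe runs once per top-level call regardless of the number of frames, and no threading library is involved. The cost is the extra parameter on every inner function, which is the "more typing" the message mentions.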
2016 Jan 14
2
RFC: non-temporal fencing in LLVM IR
...change?
Thanks again,
Hal
----- Original Message -----
> From: "Philip Reames via llvm-dev" <llvm-dev at lists.llvm.org>
> To: "JF Bastien" <jfb at google.com>, "llvm-dev"
> <llvm-dev at lists.llvm.org>
> Cc: "Hans Boehm" <hboehm at google.com>
> Sent: Wednesday, January 13, 2016 11:45:35 AM
> Subject: Re: [llvm-dev] RFC: non-temporal fencing in LLVM IR
> On 01/12/2016 11:16 PM, JF Bastien wrote:
> > Hello, fencing enthusiasts!
>
> > TL;DR: We'd like to propose an addition to the LLVM memo...
2016 Jan 14
4
RFC: non-temporal fencing in LLVM IR
I agree with Tim's assessment for ARM. That's interesting; I wasn't
previously aware of that instruction.
My understanding is that Alpha would have the same problem for normal loads.
I'm all in favor of more systematic handling of the fences associated with
x86 non-temporal accesses.
AFAICT, nontemporal loads and stores seem to have different fencing rules
on x86, none of them
2016 Jan 13
4
RFC: non-temporal fencing in LLVM IR
Hello, fencing enthusiasts!
*TL;DR:* We'd like to propose an addition to the LLVM memory model
requiring non-temporal accesses be surrounded by non-temporal load barriers
and non-temporal store barriers, and we'd like to add such orderings to the
fence IR opcode.
We are open to different approaches, hence this email instead of a patch.
*Who's "we"?*
Philip Reames brought
2015 Jan 20
6
[RFC PATCH v1 0/2] Encode optimize using libNE10
Hello opus-dev,
I've been cooking up this patchset to integrate the NE10 library into opus.
The current patchset focuses on the encode use case, mainly affecting the performance of
clt_mdct_forward() and opus_fft() (for float only)
Glad to report the following on Encode use case:
(Measured on my Beaglebone Black Cortex-A8 board)
- Performance improvement for encode use case ~= 12.34% (Based on time -p