Displaying 20 results from an estimated 4000 matches similar to: "High CPU usage"
2009 Sep 23
2
High CPU usage
Hi Jean-Marc,
I recompiled with FIXED_POINT and CPU utilization stays below 4%. This is a great improvement.
So how can I fix this to work with floating point?
Thanks.
Mark
-----Original Message-----
From: Jean-Marc Valin [mailto:jean-marc.valin at usherbrooke.ca]
Subject: Re: [Speex-dev] High CPU usage
Hi,
Sounds like it could be the good old denormalised float problem on the Intel
2009 Sep 23
0
High CPU usage
Mark Schilling wrote:
> I recompiled with FIXED_POINT and CPU utilization stays below 4%. This is a great improvement.
> So how can I fix this to work with floating point?
OK, so it looks a lot like a denorm problem. The issue is basically that
there are filters that decay exponentially, so when the input suddenly
goes to zero, then the filter's output value becomes smaller and
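
To illustrate the decay described here: a minimal sketch, assuming a hypothetical one-pole recursion y[n] = decay * y[n-1] fed with zero input. The function names, the parameters, and the threshold are illustrative assumptions and are not taken from Speex.

```cpp
#include <cmath>

// Hypothetical helper (not part of Speex): feeds zero input to the
// recursion y[n] = decay * y[n-1] and reports the first step at which
// the state becomes a subnormal (denormal) float.
// Returns -1 if that does not happen within max_steps.
int steps_until_subnormal(float start, float decay, int max_steps) {
    volatile float y = start;  // volatile keeps each step rounded to float
    for (int n = 1; n <= max_steps; ++n) {
        y = y * decay;
        if (std::fpclassify(y) == FP_SUBNORMAL) {
            return n;
        }
    }
    return -1;
}

// Hypothetical workaround: snap very small states to exactly zero so the
// recursion never lingers in the subnormal range. The threshold is an
// arbitrary illustrative value.
float flush_if_tiny(float y, float threshold = 1e-20f) {
    return std::fabs(y) < threshold ? 0.0f : y;
}
```

With start = 1.0f and decay = 0.5f the state is 2^-n after n steps, so on an IEEE 754 platform with no flush-to-zero mode active the first subnormal appears at step 127 (the smallest normal float is 2^-126). A slower decay, closer to what a real filter would use, takes many more steps to get there, after which every further update operates on subnormal values.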
2019 Sep 16
3
Handling of FP denormal values
Hi all,
While reviewing a recent clang documentation change, I became aware of an issue with the way that clang is handling FP denormals. There is currently some support for variations in the way denormals are handled, but it isn't consistent across architectures and generally feels kind of half-baked. I'd like to discuss possible solutions to this problem.
First, there is a clang
2012 Jun 14
1
High CPU usage
Hi Mark,
Code below:
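int16_t* samples;       // PCM input frame
int16_t* silenceFrame;  // silence frame, same length as one input frame
void* speexState;       // encoder state passed to speex_encoder_ctl
float eng(0.f);         // accumulated frame energy
int speexFrameSize(0);
speex_encoder_ctl(speexState, SPEEX_GET_FRAME_SIZE, &speexFrameSize);
for (int i = 0; i < speexFrameSize; i++)
{
    eng += samples[i] * samples[i];
}
// Replace near-silent frames with the silence frame.
if (eng / speexFrameSize < 3.f)
{
    memcpy(samples, silenceFrame, speexFrameSize * sizeof(int16_t));
}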
where
2009 Sep 24
0
High CPU usage
Hi Jean-Marc,
I tried to add VERY_SMALL at the input of the encoder, but that did not change much.
Here's a list of source code locations where denormals appear for the first time as calculation results.
This list is based on a 4-minute recording of ambient sound that is passed to speexenc 1.2rc1 with the command line
--narrowband --denoise --agc --abr 15000
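
The idea of adding a tiny constant at the encoder input can be sketched as follows. This is only an illustration: kTinyBias is a hypothetical stand-in rather than the actual definition of Speex's VERY_SMALL, and both helper functions are invented for this example. As the message above reports, biasing the input alone did not change much in this case, which is consistent with denormals first appearing as results of later calculations rather than in the input itself.

```cpp
#include <cmath>
#include <cstddef>

// Hypothetical bias value; small enough to be inaudible, large enough
// to be a normal (non-subnormal) float.
constexpr float kTinyBias = 1e-15f;

// Adds the bias to every sample so that exact zeros and subnormal
// inputs become small normal values.
void add_bias(float* frame, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) {
        frame[i] += kTinyBias;
    }
}

// Returns true if any sample in the frame is a subnormal float.
bool has_subnormal(const float* frame, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) {
        if (std::fpclassify(frame[i]) == FP_SUBNORMAL) {
            return true;
        }
    }
    return false;
}
```

A check like has_subnormal can also be placed after individual processing stages to narrow down where subnormal values first show up.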
2009 Sep 22
1
High CPU usage
Hi,
I have a curious problem with speex. As long as I'm talking, it takes about 2-5% of my CPU. This seems ok.
But as soon as I stop talking, CPU utilization rises to about 30-45% and stays there until I start talking again.
I compiled speex from source and use it with these settings:
- Preprocessor: Denoiser = ON, AGC = ON
- Encoder: ABR = 15000, DTX = 1, Mode = narrowband, Rate = 8000 Hz.
2019 Sep 17
2
[cfe-dev] Handling of FP denormal values
On Mon, Sep 16, 2019 at 9:43 PM Matt Arsenault via cfe-dev <
cfe-dev at lists.llvm.org> wrote:
>
>
> On Sep 16, 2019, at 19:57, Kaylor, Andrew via llvm-dev <
> llvm-dev at lists.llvm.org> wrote:
>
>
> Do we need an ftz fast-math flag?
>
>
> This would be useful for matching a handful of AMDGPU instructions (a fmad
> that only always flushes being the
2012 Jun 13
0
High CPU usage
Hi Tanmay,
>Does compiling speex API with DISABLE_FLOAT_API and DISABLE_VBR solve the
>problem?
I remember that this fixed the problem. But at that time I also needed VBR so this was not an option.
As far as I know, it is related to some calculations that involve float denormals that cause the high CPU usage.
Today I'm still using the following code before speex_encoder_init and
2017 Jun 06
4
Antw: Re: celt_inner_prod() and dual_inner_prod() NEON intrinsics
>>> Linfeng Zhang <linfengz at google.com> wrote on 06.06.2017 at 06:46 in message
<CAKoqLCAfj+fDUMLfN4dLNSZ4NNAZpaSt_BWZRp+7XBqfhiSqiQ at mail.gmail.com>:
> Hi Jean-Marc,
>
> I tried "==" before, and it failed when both results are 0.0. Maybe the
> exponent or sign differs because of the different 0.0 representation
> in NEON. If anybody
2019 Mar 18
3
[RFC] Making space for a flush-to-zero flag in FastMathFlags
We knew the day when we needed another FMF bit was coming back in:
https://reviews.llvm.org/D39304
...it was just a question of 'when'. :)
I'm guessing that an FTZ bit won't be the last new bit needed if we
consider permutations between strict FP and fast-math. Even without that,
denormals-as-zero (DAZ) might also be useful?
So rather than continuing to carve these out bit-by-bit,
2016 Feb 11
2
Vectorization with fast-math on irregular ISA sub-sets
Our processor also has some issues regarding the handling of denormals - scalar and vector - and we ran into a related problem only a few days ago.
The v3.8 compiler has done a lot of good work on optimisations for floating-point math, but ironically one of them broke our implementation of 'nextafterf'. The desired code fragment (FP32) is:
float xAbs = fabsf(x);
since we know our
2019 Mar 16
3
[RFC] Making space for a flush-to-zero flag in FastMathFlags
Hi,
I need to add a flush-denormals-to-zero (FTZ) flag to FastMathFlags,
but we've already used up the 7 bits available in
Value::SubclassOptionalData (the "backing storage" for
FPMathOperator::getFastMathFlags()). These are the possibilities I
can think of:
1. Increase the size of FPMathOperator. This gives us some additional
bits for FTZ and other fastmath flags we'd want
2019 Mar 18
2
[RFC] Making space for a flush-to-zero flag in FastMathFlags
On Sun, Mar 17, 2019 at 1:47 PM Craig Topper <craig.topper at gmail.com> wrote:
> Can we move HasValueHandle out of the byte used for SubClassOptionalData and move it to the flags at the bottom of value by shrinking NumUserOperands to 27?
I like this approach because it is less work for me. :)
But I agree with Sanjay below that this only kicks the can slightly
further down the road
2013 Jun 07
2
[LLVMdev] NEON vector instructions and the fast math IR flags
On 06/07/2013 06:49 AM, Arnold Schwaighofer wrote:
>
> On Jun 7, 2013, at 3:14 AM, Renato Golin <renato.golin at linaro.org> wrote:
>
>> On 7 June 2013 08:48, Tobias Grosser <tobias at grosser.es> wrote:
>> When to set which subtarget feature is a policy decision, where I honestly don't have any opinion on for clang. The best is probably to mirror the gcc
2013 Jun 10
0
[LLVMdev] NEON vector instructions and the fast math IR flags
| For programs that have mixed precision requirements for floating point
| operations we probably need to do this according to the fast math flags.
| Until we get there, a good first step would probably be to provide a
| global option similar to -enable-no-infs-fp-math that specifies if
| denormals should be allowed or not. This would allow the user to specify
| the precision requirements, without
2013 Jun 10
1
[LLVMdev] NEON vector instructions and the fast math IR flags
On 06/10/2013 01:56 AM, David Tweed wrote:
> | For programs that have mixed precision requirements for floating point
> | operations we probably need to do this according to the fast math flags.
> | Until we get there, a good first step would probably be to provide a
> | global option similar to -enable-no-infs-fp-math that specifies if
> | denormals should be allowed or not. This
2013 Jun 07
0
[LLVMdev] NEON vector instructions and the fast math IR flags
On Jun 7, 2013, at 3:14 AM, Renato Golin <renato.golin at linaro.org> wrote:
> On 7 June 2013 08:48, Tobias Grosser <tobias at grosser.es> wrote:
> When to set which subtarget feature is a policy decision, where I honestly don't have any opinion on for clang. The best is probably to mirror the gcc behavior on linux targets.
>
> Not really, since GCC has no special
2016 Mar 22
2
NEON FP flags
Hal, James,
My plan to disable vectorization on NEON FP had two steps:
1. Create the infrastructure to detect unsafe FP maths and force NEON
FP via fast-math.
2. Use -mfpmath=neon/sse to fine-tune the flags even further, but this
needs a lot of work in IR.
The expected behaviour is to get the most performance with the fewest options,
but with correctness in mind. So, we can't vectorize FP loops
2013 Mar 20
0
[LLVMdev] ARM NEON VMUL.f32 issue
Hi,
| The question is:
| * is this a problem with the test, that shouldn't be expecting values below FLT_MIN, or
| * is it a bug in the lowering, that should only be lowering to NEON's VMUL when unsafe-math is on, or
| * neither, and people should disable that when they want correctness?
Note that if you go for the second option, IMO unsafe-math is _far_ too "aggressive" an
2016 Apr 07
7
Implementing a proposed InstCombine optimization
I am not entirely sure this is safe. Transforming this to an fsub could change the value stored on platforms that implement negates using arithmetic instead of with bitmath (such as ours) and either canonicalize NaNs or don’t support denormals. This is actually important because this kind of bitmath on floats is very commonly used as part of algorithms for complex math functions that need to get