thr3ads.net - similar to: "High CPU usage"

Displaying 20 results from an estimated 4000 matches similar to: "High CPU usage"

2009 Sep 23

High CPU usage

Hi Jean-Marc, I recompiled with FIXED_POINT and CPU utilization stays below 4%. This is a great improvement. So how can I fix this to work with floating point? Thanks. Mark -----Original Message----- From: Jean-Marc Valin [mailto:jean-marc.valin at usherbrooke.ca] Subject: Re: [Speex-dev] High CPU usage Hi, Sounds like it could be the good old denormalised float problem on the Intel

2009 Sep 23

High CPU usage

Mark Schilling wrote: > I recompiled with FIXED_POINT and CPU utilization stays below 4%. This is a great improvement. > So how can I fix this to work with floating point? OK, so it looks a lot like a denorm problem. The issue is basically that there are filters that decay exponentially, so when the input suddenly goes to zero, the filter's output value becomes smaller and
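
The decay described in this snippet can be reproduced in isolation. The following is a minimal sketch, not code from Speex: the function name `steps_until_subnormal` and its parameters are made up for illustration, and it assumes an IEEE 754 platform where denormals are not already flushed to zero.

```c
#include <assert.h>
#include <math.h>

/* Counts how many multiplications by `decay` it takes for `start` to
 * leave the normal float range. Returns -1 if the value stops being
 * normal without ever being classified as subnormal (for example when
 * flush-to-zero is active). Hypothetical helper, for illustration only. */
int steps_until_subnormal(float start, float decay)
{
    assert(start > 0.0f && decay > 0.0f && decay < 1.0f);

    /* volatile forces every product to be rounded to a real float */
    volatile float y = start;
    int steps = 0;

    while (fpclassify(y) == FP_NORMAL) {
        y = y * decay;   /* one step of a filter whose input is zero */
        steps++;
    }
    return fpclassify(y) == FP_SUBNORMAL ? steps : -1;
}
```

With `start = 1.0f` and `decay = 0.5f` the value halves on every step, so under the assumptions above it falls below `FLT_MIN` (2^-126) on the 127th multiplication. From that point on, every further operation on the filter state works on denormal operands until the state reaches zero or new input arrives.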

2019 Sep 16

Handling of FP denormal values

Hi all, While reviewing a recent clang documentation change, I became aware of an issue with the way that clang is handling FP denormals. There is currently some support for variations in the way denormals are handled, but it isn't consistent across architectures and generally feels kind of half-baked. I'd like to discuss possible solutions to this problem. First, there is a clang

2012 Jun 14

High CPU usage

Hi Mark, Code below:

    int16_t* samples;
    int16_t* fbSilenceFrame;
    void *fSpeexState;
    float eng(0.f);
    int speexFrameSize(0);
    speex_encoder_ctl(speexState, SPEEX_GET_FRAME_SIZE, &speexFrameSize);
    for (int i = 0; i < speexFrameSize; i++) {
        eng += samples[i] * samples[i];
    }
    if (eng / speexFrameSize < 3.f) {
        memcpy(samples, silenceFrame, speexFrameSize * sizeof(int16_t));
    }

where
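
The snippet above is cut off, and its declarations do not quite match the names it uses. As a rough standalone sketch of the same idea (replace a frame with a prepared silence frame when its mean energy is below a threshold), the following may help. The function names and the `threshold` parameter are hypothetical; only the threshold value 3 and the energy formula come from the snippet.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Mean energy per sample of one frame of 16-bit PCM. */
float frame_mean_energy(const int16_t *samples, int frame_size)
{
    assert(samples != NULL && frame_size > 0);

    float energy = 0.0f;
    for (int i = 0; i < frame_size; i++) {
        energy += (float)samples[i] * (float)samples[i];
    }
    return energy / (float)frame_size;
}

/* Overwrites a quiet frame with the caller's silence frame.
 * Returns 1 if the frame was replaced, 0 otherwise. */
int replace_if_quiet(int16_t *samples, const int16_t *silence_frame,
                     int frame_size, float threshold)
{
    if (frame_mean_energy(samples, frame_size) < threshold) {
        memcpy(samples, silence_frame,
               (size_t)frame_size * sizeof(int16_t));
        return 1;
    }
    return 0;
}
```

In the snippet the frame size is obtained from `speex_encoder_ctl` with `SPEEX_GET_FRAME_SIZE`; here it is simply passed in so the sketch has no library dependency.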

2009 Sep 24

High CPU usage

Hi Jean-Marc, I tried to add VERY_SMALL at the input of the encoder, but that did not change much. Here's a list of source code locations where denormals appear for the first time as calculation results. This list is based on a 4-minute recording of ambient sound that is passed to speexenc 1.2rc1 with the command line --narrowband --denoise --agc --abr 15000

2009 Sep 22

High CPU usage

Hi, I have a curious problem with speex. As long as I'm talking, it takes about 2-5% of my CPU. This seems ok. But as soon as I stop talking, CPU utilization rises to about 30-45% and stays there until I start talking again. I compiled speex from source and use it with these settings: - Preprocessor: Denoiser = ON, AGC = ON - Encoder: ABR = 15000, DTX = 1, Mode = narrowband, Rate = 8000 Hz.

2019 Sep 17

[cfe-dev] Handling of FP denormal values

On Mon, Sep 16, 2019 at 9:43 PM Matt Arsenault via cfe-dev < cfe-dev at lists.llvm.org> wrote: > > > On Sep 16, 2019, at 19:57, Kaylor, Andrew via llvm-dev < > llvm-dev at lists.llvm.org> wrote: > > > Do we need an ftz fast-math flag? > > > This would be useful for matching a handful of AMDGPU instructions (a fmad > that only always flushes being the

2012 Jun 13

High CPU usage

Hi Tanmay, >Does compiling speex API with DISABLE_FLOAT_API and DISABLE_VBR solve the >problem? I remember that this fixed the problem. But at that time I also needed VBR so this was not an option. As far as I know, it is related to some calculations that involve float denormals that cause the high CPU usage. Today I'm still using the following code before speex_encoder_init and
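
The message is cut off before the code it refers to, so the author's actual snippet is not shown here. Purely as an illustration of one common approach on x86, the sketch below sets the flush-to-zero (FTZ) bit in the SSE control register using the `_MM_SET_FLUSH_ZERO_MODE` macro from `<xmmintrin.h>`. The function names and the `HAVE_SSE_FTZ` macro are hypothetical, and the code assumes float arithmetic is done with SSE instructions.

```c
#include <assert.h>
#include <float.h>

#if defined(__SSE_MATH__) || defined(_M_X64)
#include <xmmintrin.h>
#define HAVE_SSE_FTZ 1
#else
#define HAVE_SSE_FTZ 0
#endif

/* Asks the CPU to flush denormal results to zero for the calling thread.
 * Returns 1 if the mode was set, 0 if this build has no SSE float math. */
int enable_flush_to_zero(void)
{
#if HAVE_SSE_FTZ
    _MM_SET_FLUSH_ZERO_MODE(_MM_FLUSH_ZERO_ON);
    assert(_MM_GET_FLUSH_ZERO_MODE() == _MM_FLUSH_ZERO_ON);
    return 1;
#else
    return 0;
#endif
}

/* Halving FLT_MIN gives a denormal normally, or zero when FTZ is on. */
int half_of_flt_min_is_zero(void)
{
    volatile float x = FLT_MIN;
    volatile float y = x * 0.5f;
    return y == 0.0f;
}
```

The control register is per thread, so a call like this would have to run on the thread that does the encoding. Flushing changes numerical results slightly, which is usually acceptable for audio but is a trade-off rather than a free fix.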

2017 Jun 06

Antw: Re: celt_inner_prod() and dual_inner_prod() NEON intrinsics

>>> Linfeng Zhang <linfengz at google.com> wrote on 06.06.2017 at 06:46 in message <CAKoqLCAfj+fDUMLfN4dLNSZ4NNAZpaSt_BWZRp+7XBqfhiSqiQ at mail.gmail.com>: > Hi Jean-Marc, > > I tried "==" before, and it failed when both results are 0.0. Maybe the > exponent or sign differs because of the different 0.0 representation > in NEON. If anybody

2019 Mar 18

[RFC] Making space for a flush-to-zero flag in FastMathFlags

We knew the day when we needed another FMF bit was coming back in: https://reviews.llvm.org/D39304 ...it was just a question of 'when'. :) I'm guessing that an FTZ bit won't be the last new bit needed if we consider permutations between strict FP and fast-math. Even without that, denormals-as-zero (DAZ) might also be useful? So rather than continuing to carve these out bit-by-bit,

2016 Feb 11

Vectorization with fast-math on irregular ISA sub-sets

Our processor also has some issues regarding the handling of denormals - scalar and vector - and we ran into a related problem only a few days ago. The v3.8 compiler has done a lot of good work on optimisations for floating-point math, but ironically one of them broke our implementation of 'nextafterf'. The desired code fragment (FP32) is: float xAbs = fabsf(x); since we know our

2019 Mar 16

[RFC] Making space for a flush-to-zero flag in FastMathFlags

Hi, I need to add a flush-denormals-to-zero (FTZ) flag to FastMathFlags, but we've already used up the 7 bits available in Value::SubclassOptionalData (the "backing storage" for FPMathOperator::getFastMathFlags()). These are the possibilities I can think of: 1. Increase the size of FPMathOperator. This gives us some additional bits for FTZ and other fastmath flags we'd want
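
The storage constraint described in this message can be pictured with a small bit-mask sketch. This is not LLVM code: the constant names mirror the seven existing fast-math flags, but the bit positions, the `FMF_HYPOTHETICAL_FTZ` flag, and both helper functions are assumptions made only to show why an eighth flag does not fit in seven bits.

```c
#include <assert.h>
#include <stdint.h>

/* Seven flags, one bit each (bit positions chosen for illustration). */
enum {
    FMF_ALLOW_REASSOC    = 1u << 0,
    FMF_NO_NANS          = 1u << 1,
    FMF_NO_INFS          = 1u << 2,
    FMF_NO_SIGNED_ZEROS  = 1u << 3,
    FMF_ALLOW_RECIPROCAL = 1u << 4,
    FMF_ALLOW_CONTRACT   = 1u << 5,
    FMF_APPROX_FUNC      = 1u << 6,
    FMF_HYPOTHETICAL_FTZ = 1u << 7   /* the flag that needs a new home */
};

#define FMF_STORAGE_BITS 7
#define FMF_STORAGE_MASK ((1u << FMF_STORAGE_BITS) - 1u)

/* True if every set flag fits in the 7-bit backing storage. */
int fits_in_storage(unsigned flags)
{
    return (flags & ~FMF_STORAGE_MASK) == 0;
}

/* Stores flags in 7 bits; anything above bit 6 is silently lost. */
uint8_t store_flags(unsigned flags)
{
    assert(FMF_STORAGE_BITS < 8);
    return (uint8_t)(flags & FMF_STORAGE_MASK);
}
```

In this model, storing `FMF_NO_NANS | FMF_HYPOTHETICAL_FTZ` keeps only `FMF_NO_NANS`, which is why the options listed in the message all involve finding more bits somewhere.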

2019 Mar 18

[RFC] Making space for a flush-to-zero flag in FastMathFlags

On Sun, Mar 17, 2019 at 1:47 PM Craig Topper <craig.topper at gmail.com> wrote: > Can we move HasValueHandle out of the byte used for SubClassOptionalData and move it to the flags at the bottom of value by shrinking NumUserOperands to 27? I like this approach because it is less work for me. :) But I agree with Sanjay below that this only kicks the can slightly further down the road

2013 Jun 07

[LLVMdev] NEON vector instructions and the fast math IR flags

On 06/07/2013 06:49 AM, Arnold Schwaighofer wrote: > > On Jun 7, 2013, at 3:14 AM, Renato Golin <renato.golin at linaro.org> wrote: > >> On 7 June 2013 08:48, Tobias Grosser <tobias at grosser.es> wrote: >> When to set which subtarget feature is a policy decision, where I honestly don't have any opinion on for clang. The best is probably to mirror the gcc

2013 Jun 10

[LLVMdev] NEON vector instructions and the fast math IR flags

| For programs that have mixed precision requirements for floating point | operations we probably need to do this according to the fast math flags. | Until we get there, a good first step would probably be to provide a | global option similar to -enable-no-infs-fp-math that specifies if | denormals should be allowed or not. This would allow the user to specify | the precision requirements, without

2013 Jun 10

[LLVMdev] NEON vector instructions and the fast math IR flags

On 06/10/2013 01:56 AM, David Tweed wrote: > | For programs that have mixed precision requirements for floating point > | operations we probably need to do this according to the fast math flags. > | Until we get there, a good first step would probably be to provide a > | global option similar to -enable-no-infs-fp-math that specifies if > | denormals should be allowed or not. This

2013 Jun 07

[LLVMdev] NEON vector instructions and the fast math IR flags

On Jun 7, 2013, at 3:14 AM, Renato Golin <renato.golin at linaro.org> wrote: > On 7 June 2013 08:48, Tobias Grosser <tobias at grosser.es> wrote: > When to set which subtarget feature is a policy decision, where I honestly don't have any opinion on for clang. The best is probably to mirror the gcc behavior on linux targets. > > Not really, since GCC has no special

2016 Mar 22

NEON FP flags

Hal, James, My plan to disable vectorization on NEON FP had two steps: 1. Create the infrastructure to detect unsafe FP maths and force NEON FP via fast-math. 2. Use -mfpmath=neon/sse to fine-tune the flags even further, but this needs a lot of work in IR. The expected behaviour is to have most performance with least options, but with correctness in mind. So, we can't vectorize FP loops

2013 Mar 20

[LLVMdev] ARM NEON VMUL.f32 issue

Hi, | The question is: | * is this a problem with the test, that shouldn't be expecting values below FLT_MIN, or | * is it a bug in the lowering, that should only be lowering to NEON's VMUL when unsafe-math is on, or | * neither, and people should disable that when they want correctness? Note that if you go for the second option, IMO unsafe-math is _far_ too "aggressive" an

2016 Apr 07

Implementing a proposed InstCombine optimization

I am not entirely sure this is safe. Transforming this to an fsub could change the value stored on platforms that implement negates using arithmetic instead of with bitmath (such as ours) and either canonicalize NaNs or don’t support denormals. This is actually important because this kind of bitmath on floats is very commonly used as part of algorithms for complex math functions that need to get
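
The distinction drawn here, between negating a float with bit operations and negating it with an arithmetic subtraction, can be sketched as follows. The helper names are hypothetical, and the sketch assumes a 32-bit IEEE 754 `float`. On a host with full denormal support both forms agree for the values tested; the concern in the message is about targets where the arithmetic form would flush a denormal or canonicalize a NaN, while the bitwise form never touches anything except the sign bit.

```c
#include <assert.h>
#include <math.h>
#include <stdint.h>
#include <string.h>

/* Reinterpret a float as its raw 32-bit pattern and back. */
uint32_t float_bits(float x)
{
    uint32_t bits;
    memcpy(&bits, &x, sizeof bits);
    return bits;
}

float bits_float(uint32_t bits)
{
    float x;
    memcpy(&x, &bits, sizeof x);
    return x;
}

/* Negate by flipping only the sign bit; no arithmetic is performed. */
float negate_bitwise(float x)
{
    assert(sizeof(float) == sizeof(uint32_t));
    return bits_float(float_bits(x) ^ UINT32_C(0x80000000));
}

/* Negate with a floating-point subtraction from negative zero. */
float negate_by_fsub(float x)
{
    return -0.0f - x;
}
```

For example, `negate_bitwise(bits_float(1u))` takes the smallest positive denormal and returns the pattern `0x80000001` exactly, regardless of how the target's arithmetic unit treats denormals.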
