thr3ads.net - similar to: "[LLVMdev] Vectorization Cost Models and Multi-Instruction Patterns?"

Displaying 20 results from an estimated 5000 matches similar to: "[LLVMdev] Vectorization Cost Models and Multi-Instruction Patterns?"

[LLVMdev] [RFC] Integer Saturation Intrinsics

2015 Jan 14

Hi all, The patches linked below introduce a new family of intrinsics, for integer saturation: @llvm.usat, and @llvm.ssat (unsigned/signed). Quoting the added documentation: %r = call i32 @llvm.ssat.i32(i32 %x, i32 %n) is equivalent to the expression min(max(x, -2^(n-1)), 2^(n-1)-1), itself implementable as the following IR: %min_sint_n = i32 ... ; the min. signed integer of
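To illustrate the clamp expression quoted in this snippet, here is a minimal Python sketch. It is not the proposed LLVM implementation: the function name `ssat` merely mirrors the intrinsic's name, and Python's arbitrary-precision integers stand in for `i32`.

```python
def ssat(x: int, n: int) -> int:
    """Clamp x to the signed n-bit range [-2^(n-1), 2^(n-1) - 1].

    Illustrative only: mirrors min(max(x, -2^(n-1)), 2^(n-1)-1)
    from the quoted documentation.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    lo = -(1 << (n - 1))        # minimum signed n-bit integer
    hi = (1 << (n - 1)) - 1     # maximum signed n-bit integer
    return min(max(x, lo), hi)


if __name__ == "__main__":
    for value in (200, -200, 5):
        print(value, "->", ssat(value, 8))
```

With n = 8 the bounds are -128 and 127, so values already in range pass through unchanged and everything else pins to the nearest bound.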

[LLVMdev] [RFC] Integer Saturation Intrinsics

2015 Jan 15

On Thu, Jan 15, 2015 at 12:42 AM, Philip Reames <listmail at philipreames.com> wrote: > At a very high level, why do we need these intrinsics? In short, to catch sequences you can't catch in the SelectionDAG. > What is the use case? What are typical values for N? Typically, you get this from (a little overlapping) compression, DSP, or pixel-handling code. Off the top of my

[LLVMdev] [RFC] Integer Saturation Intrinsics

2015 Jan 15

On Thu, Jan 15, 2015 at 2:33 AM, David Chisnall <David.Chisnall at cl.cam.ac.uk > wrote: > A couple of questions: > > 1) Should this really be an intrinsic and not a flag on add? The add > instruction already allows overflow to be either undefined or defined to > wrap. Making it defined to saturate seems a natural extension. > I don't think this should be a flag on

Resampler saturation

2009 Jun 13

> Quoting Stephane Lesage <stephane.lesage at ateis-international.com>: > > Is this a bug? Is it possible to fix it? > > (I use version speex 1.2beta2, because newer versions just don't work on my platform) > > This is probably the cause. 1.2beta2 was the first release to include the resampler and it had many bugs. I suggest trying

speex on TI C5x fixed-point DSP

2004 Nov 03

> One thing I've noticed so far in the filter_mem2 code is the calls to > SATURATE(x, 805306368). 805306368 is 0x30000000. I was expecting that > to be on a bit boundary, say 0x3fffffff? In which case the arithmetic > saturation logic could be used. I don't think it would make that big of a difference, since the saturation is outside of the inner loop. If it's that
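The point about the bound not sitting on a bit boundary can be checked with a short Python sketch. It assumes that `SATURATE(x, a)` clamps `x` symmetrically to [-a, a], since the macro body is not shown in this snippet. The helper `is_bit_boundary` is hypothetical and only tests whether a bound has the form 2^k - 1.

```python
def saturate(x: int, a: int) -> int:
    """Assumed behaviour of SATURATE(x, a): clamp x to [-a, a]."""
    return max(-a, min(x, a))


def is_bit_boundary(a: int) -> bool:
    """Hypothetical helper: True when a == 2^k - 1 for some k >= 1."""
    return a > 0 and (a & (a + 1)) == 0


if __name__ == "__main__":
    bound = 805306368  # the constant quoted in the snippet
    print(hex(bound), is_bit_boundary(bound))
    print(hex(0x3FFFFFFF), is_bit_boundary(0x3FFFFFFF))
    print(saturate(0x3FFFFFFF, bound))
```

A bound of the form 2^k - 1 is the kind that a DSP's built-in arithmetic saturation would typically produce; 0x30000000 is not of that form, which matches the question raised in the quoted message.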

[LLVMdev] [RFC] Integer Saturation Intrinsics

2015 Jan 15

On 01/14/2015 04:16 PM, Ahmed Bougacha wrote: > On Thu, Jan 15, 2015 at 12:42 AM, Philip Reames > <listmail at philipreames.com> wrote: >> At a very high level, why do we need these intrinsics? > In short, to catch sequences you can't catch in the SelectionDAG. > >> What is the use case? What are typical values for N? > Typically, you get this from (a little

Safe fptoui/fptosi casts

2018 Nov 05

I would be interested in learning what the set of used semantics for float-to-int conversion is. If the only two used are 1) undefined behavior if unrepresentable and 2) saturate to int_{min,max} with NaN going to zero, then I think it makes sense to expose both of those natively in the IR. If the set is much larger, I think separate intrinsics for each behavior would make sense. It would be nice
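The second semantics mentioned here (saturate to int_{min,max}, with NaN going to zero) can be sketched in Python as follows. This is only an illustration of that behaviour under the assumption of a signed target type and round-toward-zero; the name `fptosi_sat` is made up for the example and is not an LLVM API.

```python
import math


def fptosi_sat(x: float, bits: int = 32) -> int:
    """Saturating float-to-signed-int conversion (illustrative sketch).

    NaN maps to 0, out-of-range values (including infinities) clamp to
    the minimum or maximum of the signed `bits`-wide integer type, and
    in-range values are truncated toward zero.
    """
    lo = -(1 << (bits - 1))
    hi = (1 << (bits - 1)) - 1
    if math.isnan(x):
        return 0
    if x <= lo:          # also covers -inf
        return lo
    if x >= hi:          # also covers +inf
        return hi
    return math.trunc(x)


if __name__ == "__main__":
    for value in (float("nan"), 1e20, -1e20, -3.7, 3.7):
        print(value, "->", fptosi_sat(value))
```

By contrast, the first semantics in the snippet leaves the out-of-range and NaN cases undefined, so there is no result to model for those inputs.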

Safe fptoui/fptosi casts

2018 Nov 05

Hi everyone! The fptoui/fptosi instructions are currently specified to return a poison value if the rounded-towards-zero floating point number cannot be represented by the target integer type. The motivation for this behavior is that overflowing float to int casts in C are undefined behavior. However, many newer languages prefer to have a float to integer cast that is well-defined for all input

Re: Problem with Blackfin assembly optimizations -- bug in fixed_bfin.h / resampler saturation???

2008 Feb 05

Hi, I just started to examine the DIV32_16 function (Blackfin ASM version), and wondered why the return value of the function inside 'fixed_bfin.h' is of type 'spx_word16_t', but the local variable 'res' which is returned by this function is of type 'spx_word32_t'. Is this a trick of optimization or a bug? (Same question for PDIV32_16 and MAX16, too!) best

FW: Re: Problem with Blackfin assembly optimizations -- bug in fixed_bfin.h / resampler saturation???

2008 Feb 01

Hi Jean-Marc, didn't get a reply to my last post (see below) -- do you have no idea what happens here? After some more tests, I disabled the DIV32_16 Blackfin optimizations and now get good quality on the Blackfin. But when I have overdrive on the input, things become very bad -- I'm not sure if this is really a filter stability issue like I wrote some weeks ago. I use the speex

Re: Problem with Blackfin assembly optimizations -- bug in fixed_bfin.h / resampler saturation???

2008 Feb 08

Hi, I tried to figure out what the problem is -- but it seems to be totally different from what I expected. My status at the moment is: - computing results for "generic" and "Blackfin ASM" versions of the DIV32_16 function are the same, there is no "algorithmic bug" - Instead, there seems some sort of memory corruption: When I comment out the DIV32_16 function

[LLVMdev] Anyone is building a DSP-C frontend?

2005 Aug 31

Fixed-point numbers could be stored in LLVM first-class integer types; I cannot see the problem now. But to be type-safe, there should be a first-class 'fixed' type. Some LLVM extensions required to map DSP-C languages could be implemented as qualifiers. 1. _sat qualifier: saturate the result within [0.0, +1.0> or [-1.0, +1.0> (unsigned/signed). sat signed fixed a; sat signed fixed

Saturating float-to-int casts

2020 Aug 07

I have encountered a need for float-to-int casts that saturate to min/max when the value is out of the range of the target type. It seems that there is no intrinsic to do this, currently, but on IRC it was pointed out that a patch [1] has been proposed to implement this functionality in exactly the way that I was looking for. It looks like the discussion has died out but I was hoping maybe to

[PATCH v2] nv50/ir: Handle OP_CVT when folding constant expressions

2015 Jan 11

On 11.01.2015 20:19, Ilia Mirkin wrote: > On Sun, Jan 11, 2015 at 12:27 PM, Tobias Klausmann > <tobias.johannes.klausmann at mni.thm.de> wrote: >> >> On 11.01.2015 01:58, Ilia Mirkin wrote: >>> On Fri, Jan 9, 2015 at 8:24 PM, Tobias Klausmann >>> <tobias.johannes.klausmann at mni.thm.de> wrote: >>>> Folding for conversions:

Getting CELT to work under Windows

2010 Jun 24

Hi, My name is Riccardo Micci i downloaded the CELT source code and I compiled it under Windows. This is meant to be a preliminary study for my company's project. When i run CELT it encodes and decodes the file back saying "Encoder matches decoder!!". When i try to play the output though the result is just noise and clicks. The only changes I've applied are some #defines to

[PATCH v2] nv50/ir: Handle OP_CVT when folding constant expressions

2015 Jan 11

On 11.01.2015 01:58, Ilia Mirkin wrote: > On Fri, Jan 9, 2015 at 8:24 PM, Tobias Klausmann > <tobias.johannes.klausmann at mni.thm.de> wrote: >> Folding for conversions: F32->(U{16/32}, S{16/32}) and (U{16/32}, {S16/32})->F32 >> >> Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann at mni.thm.de> >> --- >> V2: beat me, whip me, split

[PATCH v2] nv50/ir: Handle OP_CVT when folding constant expressions

2015 Jan 11

On 11.01.2015 20:57, Ilia Mirkin wrote: > On Sun, Jan 11, 2015 at 2:56 PM, Tobias Klausmann > <tobias.johannes.klausmann at mni.thm.de> wrote: >> >> On 11.01.2015 20:19, Ilia Mirkin wrote: >>> On Sun, Jan 11, 2015 at 12:27 PM, Tobias Klausmann >>> <tobias.johannes.klausmann at mni.thm.de> wrote: >>>> >>>> On 11.01.2015 01:58,

Problem with Blackfin assembly optimizations -- bug in fixed_bfin.h / resampler saturation???

2008 Mar 05

Jean-Marc, Frank, I have stumbled across a similar situation regarding optimization. I seem to have a similar setup as Frank does with a fixed 48 kHz in and out. The wideband mode and ultra-wideband modes are really what I'm looking for. I have a test application that reads audio, downsample to 16kHz (or 32kHz), speex encode, speex decode, upsample back to 48kHz, and playback. If I remove

[RESEND/PATCH] nv50/ir: Handle OP_CVT when folding constant expressions

2015 Jan 09

Folding for conversions: F32->(U{16/32}, S{16/32}) and (U{16/32}, {S16/32})->F32 Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann at mni.thm.de> --- .../drivers/nouveau/codegen/nv50_ir_peephole.cpp | 109 +++++++++++++++++++++ 1 file changed, 109 insertions(+) diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp

[LLVMdev] RFC: generation of PSAD instruction

2015 Jan 28

On Wed, Jan 28, 2015 at 7:50 AM, Hal Finkel <hfinkel at anl.gov> wrote: > Hi Vijender, > > Thanks for posting this, there is wide support here for improving our support for reductions of various kinds, both in flavor and robustness. I've cc'd some others who have previously discussed this. > > James has advocated in the past for an intrinsic for horizontal reductions,
