search for: saturating

Displaying 20 results from an estimated 955 matches for "saturating".

2015 Jan 11
2
[PATCH v2] nv50/ir: Handle OP_CVT when folding constant expressions
...loat doesn't fit in the >>> destination... whether it saturates or not. I don't hugely care >>> though. >> Actually i can't remember why that was added in the first place, i'll go >> ahead and follow your advice here. > Oh wait... this was to support saturating an array access into a u16... > > const int sat = (i->op == OP_TXF) ? 1 : 0; > DataType sTy = (i->op == OP_TXF) ? TYPE_U32 : TYPE_F32; > bld.mkCvt(OP_CVT, TYPE_U16, layer, sTy, src)->saturate = sat; > > So... basically if the source is a U32...
2015 Jan 14
5
[LLVMdev] [RFC] Integer Saturation Intrinsics
...Gen] Add legalization for Integer Saturation Intrinsics. From there, we can generate several new instructions, more efficient than their expanded counterpart. Locally, I have worked on: - ARM: the SSAT/USAT instructions (scalar) - AArch64: the SQ/UQ ADD/SUB AArch64 instructions (vector/scalar saturating arithmetic) - X86: PACK SS/US (vector, saturate+truncate) - X86: PADD/SUB S/US (vector, saturating arithmetic) Anyway, let's first agree on the intrinsics, so that further development is done on trunk. Thanks! -Ahmed
2004 Nov 03
2
speex on TI C5x fixed-point DSP
> One thing I've noticed so far in the filter_mem2 code is the calls to > SATURATE(x, 805306368). 805306368 is 0x30000000. I was expecting that > to be on a bit boundary, say 0x3fffffff? In which case the arithmetic > saturation logic could be used. I don't think it would make that big of a difference, since the saturation is outside of the inner loop. If it's that
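A minimal sketch of the kind of symmetric clamp this exchange is about, assuming SATURATE(x, a) limits x to the range [-a, a]. The function name saturate_sym is hypothetical and this is not the Speex source. It only illustrates why a limit such as 0x30000000, which is not of the form 2^n - 1, needs explicit comparisons and cannot rely on a DSP's built-in bit-boundary saturation.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: clamp x to [-limit, limit].
 * The limit may be any non-negative value, not only 2^n - 1. */
static int32_t saturate_sym(int32_t x, int32_t limit)
{
    assert(limit >= 0);              /* a negative limit makes no sense here */
    if (x > limit)  return limit;    /* clamp from above */
    if (x < -limit) return -limit;   /* clamp from below */
    return x;                        /* already in range */
}
```

With limit = 805306368 (0x30000000), an input of 0x3fffffff is still clamped down to the limit, which is the difference from saturating on the bit boundary.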
2015 Jan 11
2
[PATCH v2] nv50/ir: Handle OP_CVT when folding constant expressions
On 11.01.2015 01:58, Ilia Mirkin wrote: > On Fri, Jan 9, 2015 at 8:24 PM, Tobias Klausmann > <tobias.johannes.klausmann at mni.thm.de> wrote: >> Folding for conversions: F32->(U{16/32}, S{16/32}) and (U{16/32}, {S16/32})->F32 >> >> Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann at mni.thm.de> >> --- >> V2: beat me, whip me, split
2011 Jun 17
0
[LLVMdev] RFC: Integer saturation intrinsics
...semantics matches exactly. X86 doesn't have saturation instructions. However, SSE does have packed add, packed sub, and pack with saturation. So it's possible to instruction select patterns such as (int_{s|u}sat ({add|sub} x, y), c). The stated pattern simply doesn't work. A portable saturating add/subtract intrinsic might be nice given that most vector instruction sets have such an instruction, but this seems completely orthogonal. > The plan is to form calls to these intrinsics in InstCombine. Legalizer can expand these intrinsics if they are not legal. The expansion should be fairl...
2011 Jun 17
5
[LLVMdev] RFC: Integer saturation intrinsics
Hi all, I'm proposing integer saturation intrinsics. def int_ssat : Intrinsic<[llvm_anyint_ty], [LLVMMatchType<0>, llvm_i32_ty]>; def int_usat : Intrinsic<[llvm_anyint_ty], [LLVMMatchType<0>, llvm_i32_ty]>; The first operand is the integer value being saturated, and second is the saturation bit position. For scalar integer types, the semantics are: int_ssat: x <
2004 Mar 16
2
glm questions --- saturated model
> -----Original Message----- > From: r-help-bounces at stat.math.ethz.ch > [mailto:r-help-bounces at stat.math.ethz.ch] On Behalf Of David Firth > Sent: Tuesday, March 16, 2004 1:12 PM > To: Paul Johnson > Cc: r-help at r-project.org > Subject: Re: [R] glm questions > > > Dear Paul > > Here are some attempts at your questions. I hope it's of some help.
2015 Jan 19
2
[LLVMdev] Vectorization Cost Models and Multi-Instruction Patterns?
Hi all, While tinkering with saturation instructions, I hit problems with the cost model calculations. The loop vectorizer cost model accumulates the individual TTI cost model of each instruction. For saturating arithmetic, this is a gross overestimate, since you have 2 sexts (inputs), 2 icmps + 2 selects (for the saturation), and a truncate (output); these all fold away. With an intrinsic, you'd end up with a better estimate; however, I'm trying to see what problems we would encounter without int...
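The expanded pattern described in the snippet above (two sign extensions, two compare-and-select pairs, one truncate) can be sketched at the source level as follows. This is an illustration under assumptions: the 16-bit element width and the function name sat_add_i16 are chosen here for the example and do not come from the thread.

```c
#include <stdint.h>

/* Saturating signed 16-bit add written the "expanded" way:
 * widen both inputs, add, clamp twice, then narrow again. */
static int16_t sat_add_i16(int16_t a, int16_t b)
{
    int32_t wide = (int32_t)a + (int32_t)b;   /* two sign extensions, then add */
    if (wide > INT16_MAX) wide = INT16_MAX;   /* compare + select, upper bound */
    if (wide < INT16_MIN) wide = INT16_MIN;   /* compare + select, lower bound */
    return (int16_t)wide;                     /* truncate back to 16 bits */
}
```

On a target with a native saturating add, all of these steps would typically collapse into one instruction, which is why summing their individual costs overestimates the real cost.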
2011 Jun 17
2
[LLVMdev] RFC: Integer saturation intrinsics
...matches exactly. X86 doesn't have saturation instructions. However, SSE does have packed add, packed sub, and pack with saturation. So it's possible to instruction select patterns such as (int_{s|u}sat ({add|sub} x, y), c). > > The stated pattern simply doesn't work. A portable saturating > add/subtract intrinsic might be nice given that most vector > instruction sets have such an instruction, but this seems completely > orthogonal. Can you explain why you think the pattern (which?) would not work? > >> The plan is to form calls to these intrinsics in InstCombin...
2019 Oct 10
2
[RFC] Use of saturating intrinsics
Hello all again, take 2. Over in D68651 I would like to make code that attempts to saturate a value (using higher bitwidth integers) use a saturating intrinsic instead. Something like this: https://godbolt.org/z/9knBnP As can be seen, the unsigned cases are already being matched to llvm.uadd.sat intrinsics. I am hoping to extend that to the signed cases. This has numerous benefits including simpler vectorization, cost-modelling and matching in...
2008 Feb 05
1
Re: Problem with Blackfin assembly optimizations -- bug in fixed_bfin.h / resampler saturation???
Hi, I just started to examine the DIV32_16 function (Blackfin ASM version), and wondered why the return value of the function inside 'fixed_bfin.h' is of type 'spx_word16_t', but the local variable 'res' which is returned by this function is of type 'spx_word32_t'. Is this a trick of optimization or a bug? (Same question for PDIV32_16 and MAX16, too!) best
2020 Jul 08
4
[RFC] Saturating left shift intrinsics
Hello, This is an RFC for adding intrinsics which perform saturating signed/unsigned left shift. There is currently a patch on Phabricator here: https://reviews.llvm.org/D83216 The intrinsics are of the form i32 @llvm.sshl.sat.i32(i32, i32) i32 @llvm.ushl.sat.i32(i32, i32) <4 x i32> @llvm.sshl.sat.v4i32(<4 x i32>, <4 x i32>) <4...
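A rough C model of what a saturating left shift computes for 32-bit operands, given as a sketch and not as the patch's implementation. The function names are hypothetical. It assumes the shift amount is less than the bit width, that the unsigned form clamps to the maximum value when bits are shifted out, and that the signed form clamps toward the sign of the input.

```c
#include <assert.h>
#include <stdint.h>

/* Unsigned: if any set bit is shifted out, clamp to UINT32_MAX. */
static uint32_t ushl_sat_u32(uint32_t x, uint32_t shift)
{
    assert(shift < 32);                      /* assumption: in-range shift only */
    uint32_t shifted = x << shift;
    /* Shifting back recovers x only when no bits were lost. */
    return ((shifted >> shift) == x) ? shifted : UINT32_MAX;
}

/* Signed: compute the exact result in 64 bits, then clamp to the i32 range. */
static int32_t sshl_sat_i32(int32_t x, uint32_t shift)
{
    assert(shift < 32);                      /* assumption: in-range shift only */
    /* Multiply instead of shifting so negative inputs stay well defined in C. */
    int64_t wide = (int64_t)x * ((int64_t)1 << shift);
    if (wide > INT32_MAX) return INT32_MAX;  /* positive overflow */
    if (wide < INT32_MIN) return INT32_MIN;  /* negative overflow */
    return (int32_t)wide;
}
```

The signed version uses a 64-bit multiply because left-shifting a negative value is undefined behaviour in C; the product fits in 64 bits for every 32-bit input and every shift below 32.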
2015 Jan 15
3
[LLVMdev] [RFC] Integer Saturation Intrinsics
...on. > I don't think this should be a flag on add. Flags are designed such that the middle-end may be ignorant of them and nothing bad might happen, it is always safe to ignore or drop flags when doing so is convenient (for a concrete example, take a look at reassociate). In this case, the saturating nature of the operation does not seem like something that can be safely ignored. > > 2) How do you imagine this being used and what are the guarantees for > sequences of operations with respect to optimisation? If I do a+b-c (or +c > where c is negative), and a+b would saturate, but...
2015 Jan 11
2
[PATCH v2] nv50/ir: Handle OP_CVT when folding constant expressions
...>> destination... whether it saturates or not. I don't hugely care >>>>> though. >>>> Actually i can't remember why that was added in the first place, i'll go >>>> ahead and follow your advice here. >>> Oh wait... this was to support saturating an array access into a u16... >>> >>> const int sat = (i->op == OP_TXF) ? 1 : 0; >>> DataType sTy = (i->op == OP_TXF) ? TYPE_U32 : TYPE_F32; >>> bld.mkCvt(OP_CVT, TYPE_U16, layer, sTy, src)->saturate = sat; >>> &...
2018 Aug 21
4
Fixed Point Support in LLVM
...> LLVM that tries to do something different for every possible type instead of assuming > that > > If we did this, I would suggest separating types by representation differences, not the > semantics of the operations on them. For example, we'd have different operations for > saturating and non-saturating arithmetic, but saturating and non-saturating types would > get lowered to the same IR type. Unlike integers, though, I think maybe we wouldn't > want to unify signed and unsigned types because of the padded-representation issue; > or maybe we'd only unify types w...
2015 Jan 09
3
[RESEND/PATCH] nv50/ir: Handle OP_CVT when folding constant expressions
Folding for conversions: F32->(U{16/32}, S{16/32}) and (U{16/32}, {S16/32})->F32 Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann at mni.thm.de> --- .../drivers/nouveau/codegen/nv50_ir_peephole.cpp | 109 +++++++++++++++++++++ 1 file changed, 109 insertions(+) diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
2008 Feb 08
1
Re: Problem with Blackfin assembly optimizations -- bug in fixed_bfin.h / resampler saturation???
Hi, I tried to figure out what the problem is -- but it seems to be totally different from what I expected. My status at the moment is: - computing results for "generic" and "Blackfin ASM" versions of the DIV32_16 function are the same, there is no "algorithmic bug" - Instead, there seems some sort of memory corruption: When I comment out the DIV32_16 function
2009 Jun 13
1
Resampler saturation
> Quoting Stephane Lesage <stephane.lesage at ateis-international.com>: > > Is this a bug ? Is it possible to fix it ? > > (I use version speex 1.2beta2, because newer versions just > don't work > > on my > > platform) > > This is probably the cause. 1.2beta2 was the first release to > include the resampler and it had many bugs. I suggest trying
2008 Feb 01
0
FW: Re: Problem with Blackfin assembly optimizations -- bug in fixed_bfin.h / resampler saturation???
Frank Lorenz wrote: > And yes, the same "overflow" happens even when I disable Blackfin ASM > optimizations. Indeed, that shouldn't happen. Just to make sure I understand, so far there are two problems: 1) DIV32_16() in Blackfin assembly causes problems 2) The resampler overflows When you fix/workaround those two, is the encoder/decoder working correctly or are there
2015 Jan 10
2
[PATCH v2] nv50/ir: Handle OP_CVT when folding constant expressions
Folding for conversions: F32->(U{16/32}, S{16/32}) and (U{16/32}, {S16/32})->F32 Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann at mni.thm.de> --- V2: beat me, whip me, split out F64 .../drivers/nouveau/codegen/nv50_ir_peephole.cpp | 81 ++++++++++++++++++++++ 1 file changed, 81 insertions(+) diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp