thr3ads.net - search: "nonan"

Displaying 20 results from an estimated 20 matches for "nonan".

[RFC] Making space for a flush-to-zero flag in FastMathFlags

2019 Mar 16

...herwise we could steal 4 bits.

3. Allow only specific combinations in FastMathFlags. In practice, I don't think folks are equally interested in all the 2^N combinations present in FastMathFlags, so we could compromise and allow only the most "typical" 2^7 combinations (e.g. we could merge nonan and noinf into a single bit, under the assumption that users want to enable-disable them as a unit). I'm unsure if establishing the most typical 2^7 combinations will be straightforward though.

4. Function level attributes. Instead of wasting precious instruction-level space, we could move a...
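As a rough illustration of option 3, here is a minimal sketch of packing eight logical flags into seven bits by letting nnan and ninf share one bit. The `Flags` struct, the `ftz` field, the bit layout, and the `pack7`/`unpack7` helpers are all hypothetical names made up for this example; none of them come from the RFC or from LLVM.

```cpp
#include <cassert>
#include <cstdint>

// Eight logical flags: seven fast-math flags plus a hypothetical
// flush-to-zero flag (ftz). The bit layout below is invented for this sketch.
struct Flags {
  bool reassoc, nnan, ninf, nsz, arcp, contract, afn, ftz;
};

// Pack into 7 bits by storing nnan and ninf as one shared bit.
// Returns false for combinations this scheme cannot express (nnan != ninf).
bool pack7(const Flags &f, std::uint8_t &out) {
  if (f.nnan != f.ninf)
    return false;
  unsigned bits = (unsigned(f.reassoc) << 0) | (unsigned(f.nnan) << 1) |
                  (unsigned(f.nsz) << 2) | (unsigned(f.arcp) << 3) |
                  (unsigned(f.contract) << 4) | (unsigned(f.afn) << 5) |
                  (unsigned(f.ftz) << 6);
  out = static_cast<std::uint8_t>(bits);
  return true;
}

// Inverse of pack7: the shared bit sets both nnan and ninf.
Flags unpack7(std::uint8_t bits) {
  assert(bits < 128 && "only 7 bits are meaningful");
  auto bit = [bits](unsigned i) { return ((bits >> i) & 1u) != 0; };
  bool finiteOnly = bit(1);
  return Flags{bit(0), finiteOnly, finiteOnly, bit(2),
               bit(3), bit(4),     bit(5),     bit(6)};
}
```

The cost of this kind of scheme is visible in the `false` return path: a user who wants nnan without ninf has no encoding available.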

[LLVMdev] Bug 16257 - fmul of undef ConstantExpr not folded to undef

2014 Sep 22

...es for fadd:
>> (1) fadd %x, -0.0 -> %x
>> (2) fadd undef, undef -> undef
>> (3) fadd %x, undef -> NaN (undef is a NaN which is propagated)
>>
>> Looking through the code I found the "NoNaNs" flag accessed through an instance
>> of the FastMathFlags class.
>> (2) and (3) should probably depend on it.
>> If the flag is set, (2) and (3) cannot be folded as there are no NaNs and we are
>> not guaranteed to get an arbitrary bit pattern from...

[RFC] Making space for a flush-to-zero flag in FastMathFlags

2019 Mar 18

...> 3. Allow only specific combinations in FastMathFlags. In practice, I
>> don't think folks are equally interested in all the 2^N combinations
>> present in FastMathFlags, so we could compromise and allow only the
>> most "typical" 2^7 combinations (e.g. we could merge nonan and noinf into a
>> single bit, under the assumption that users want to enable-disable
>> them as a unit). I'm unsure if establishing the most typical 2^7
>> combinations will be straightforward though.
>>
>> 4. Function level attributes. Instead of wasting preci...

[LLVMdev] Bug 16257 - fmul of undef ConstantExpr not folded to undef

2014 Sep 17

...o sum up, below is the list of correct folding examples for fadd:

(1) fadd %x, -0.0 -> %x
(2) fadd undef, undef -> undef
(3) fadd %x, undef -> NaN (undef is a NaN which is propagated)

Looking through the code I found the "NoNaNs" flag accessed through an instance of the FastMathFlags class. (2) and (3) should probably depend on it. If the flag is set, (2) and (3) cannot be folded as there are no NaNs and we are not guaranteed to get an arbitrary bit pattern from fadd, right?

Other arithmetic FP operations (fsub, f...
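Rule (1) above uses -0.0 rather than +0.0, and the difference can be checked on ordinary doubles. The sketch below compares raw bit patterns so that +0.0 and -0.0 count as different values. It assumes IEEE 754 doubles, the default round-to-nearest mode, and a build without fast-math options; the helper names are made up for this example.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

static_assert(sizeof(double) == sizeof(std::uint64_t), "assumes 64-bit double");

// Raw bit pattern of a double, so that +0.0 and -0.0 compare as different.
std::uint64_t bitsOf(double v) {
  std::uint64_t u;
  std::memcpy(&u, &v, sizeof u);
  return u;
}

// True when x + (-0.0) has exactly the bits of x.
bool addNegZeroKeepsBits(double x) { return bitsOf(x + (-0.0)) == bitsOf(x); }

// True when x + (+0.0) has exactly the bits of x.
bool addPosZeroKeepsBits(double x) { return bitsOf(x + 0.0) == bitsOf(x); }

// Usage: -0.0 is an identity for every non-NaN input tried here,
// while +0.0 turns -0.0 into +0.0.
void checkFaddZeroIdentity() {
  assert(addNegZeroKeepsBits(0.0));
  assert(addNegZeroKeepsBits(-0.0));
  assert(addNegZeroKeepsBits(1.5));
  assert(addPosZeroKeepsBits(0.0));
  assert(!addPosZeroKeepsBits(-0.0));
}
```

NaN inputs are left out on purpose, since payload propagation is the part the thread treats as unsettled.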

[RFC] Making space for a flush-to-zero flag in FastMathFlags

2019 Mar 18

RFC: Generic IR reductions

2017 Jan 31

...on in a single instruction can pattern match the promote->reduce sequence.

And for minmax recurrence types, we have:

  int_vector_reduce_smax(vector_src)
  int_vector_reduce_smin(vector_src)
  int_vector_reduce_umax(vector_src)
  int_vector_reduce_umin(vector_src)
  int_vector_reduce_fmin(vector_src, i32 NoNaNs)
  int_vector_reduce_fmax(vector_src, i32 NoNaNs)

These reduction operations can be mapped from the recurrence kinds defined in LoopUtils, however any front-end or pass in LLVM may use them.

Predication
===========

We have multiple options for expressing vector predication in reductions:

1. The...
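To make the role of a NoNaNs operand on the fmin/fmax reductions more concrete, here is a scalar reference sketch. It is only one plausible reading: the snippet does not spell out the semantics, so the assumption here is that NoNaNs is a promise from the caller that no element is NaN, which permits a plain comparison. `reduceFMax` is a hypothetical helper, not an LLVM API.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Scalar model of a floating-point max reduction over a non-empty vector.
// noNaNs == true is treated as a promise that no element is NaN (assumption).
float reduceFMax(const std::vector<float> &v, bool noNaNs) {
  assert(!v.empty() && "reduction of an empty vector is not defined here");
  float acc = v[0];
  for (std::size_t i = 1; i < v.size(); ++i) {
    if (noNaNs)
      acc = (v[i] > acc) ? v[i] : acc; // plain compare, valid without NaNs
    else
      acc = std::fmax(acc, v[i]); // returns the non-NaN operand if one is NaN
  }
  return acc;
}
```

With the flag set, the loop body is a single compare and select, which is the kind of simplification such a promise is meant to allow.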

[LLVMdev] [llvm-commits] [PATCH] fast-math patches!

2012 Nov 16

Michael,

Overall the code looks good.

80-cols:

  FMF.UnsafeAlgebra = 0 != (Record[OpNum] & (1 << bitc::FMF_UNSAFE_ALGEBRA));
  FMF.NoNaNs = 0 != (Record[OpNum] & (1 << bitc::FMF_NO_NANS));
  FMF.NoInfs = 0 != (Record[OpNum] & (1 << bitc::FMF_NO_INFS));
  FMF.NoSignedZeros = 0 != (Record[OpNum] & (1 << bitc::FMF_NO_SIGNED_ZEROS));
  FMF.AllowReci...
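The decoding pattern in the reviewed lines (mask one bit of a record word per flag) can be sketched in a standalone form. The bit positions in `FlagBit` are hypothetical, since the snippet does not show the real `bitc::FMF_*` values, and `FastFlags`, `decodeFlags`, and `encodeFlags` are illustrative names rather than the patch's own.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical bit positions; the real bitc::FMF_* values are not shown above.
enum FlagBit : unsigned {
  UnsafeAlgebraBit = 0,
  NoNaNsBit = 1,
  NoInfsBit = 2,
  NoSignedZerosBit = 3,
  AllowReciprocalBit = 4
};

struct FastFlags {
  bool UnsafeAlgebra = false, NoNaNs = false, NoInfs = false,
       NoSignedZeros = false, AllowReciprocal = false;
};

// Test a single bit of the record word.
bool testBit(std::uint64_t word, unsigned bit) {
  assert(bit < 64);
  return (word & (std::uint64_t{1} << bit)) != 0;
}

// One mask-and-compare per flag, as in the reviewed lines.
FastFlags decodeFlags(std::uint64_t word) {
  FastFlags f;
  f.UnsafeAlgebra = testBit(word, UnsafeAlgebraBit);
  f.NoNaNs = testBit(word, NoNaNsBit);
  f.NoInfs = testBit(word, NoInfsBit);
  f.NoSignedZeros = testBit(word, NoSignedZerosBit);
  f.AllowReciprocal = testBit(word, AllowReciprocalBit);
  return f;
}

// Inverse direction: shift each flag back into its bit position.
std::uint64_t encodeFlags(const FastFlags &f) {
  return (static_cast<std::uint64_t>(f.UnsafeAlgebra) << UnsafeAlgebraBit) |
         (static_cast<std::uint64_t>(f.NoNaNs) << NoNaNsBit) |
         (static_cast<std::uint64_t>(f.NoInfs) << NoInfsBit) |
         (static_cast<std::uint64_t>(f.NoSignedZeros) << NoSignedZerosBit) |
         (static_cast<std::uint64_t>(f.AllowReciprocal) << AllowReciprocalBit);
}
```

Pulling the mask test into a helper such as `testBit` is also one way to keep each assignment within 80 columns.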

[LLVMdev] [llvm-commits] [PATCH] fast-math patches!

2012 Nov 16

...le.com> wrote:
>
> On Nov 15, 2012, at 3:23 PM, Joe Abbey <joe.abbey at gmail.com> wrote:
>
>> Though semantically equivalent in this case, I think you should use logical ORs here, not bitwise.
>>
>> + bool any() {
>> +   return UnsafeAlgebra | NoNaNs | NoInfs | NoSignedZeros |
>> +     AllowReciprocal;
>> + }
>>
>
> Will do.
>
>> Gripe: This pattern is probably super fast and has precedent… but the code is non-obvious:
>>
>> SubclassOptionalData =
>> (SubclassOptionalData & ~BitTo...

[LLVMdev] [llvm-commits] [PATCH] fast-math patches!

2012 Nov 15

Though semantically equivalent in this case, I think you should use logical ORs here, not bitwise.

+ bool any() {
+   return UnsafeAlgebra | NoNaNs | NoInfs | NoSignedZeros |
+     AllowReciprocal;
+ }

Gripe: This pattern is probably super fast and has precedent… but the code is non-obvious:

  SubclassOptionalData =
    (SubclassOptionalData & ~BitToSet) | (B * BitToSet);

This is likely one iota slower.. but it sure is easier to get the...
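Both review points can be tried out in isolation. The sketch below puts the branch-free bit update from the patch next to a more explicit conditional form, and writes `any()` with logical ORs. The explicit form is an assumed alternative, since the reviewer's own suggestion is cut off in the snippet; `Flags`, `setBitBranchFree`, and `setBitExplicit` are names chosen for this example.

```cpp
#include <cassert>
#include <cstdint>

// Branch-free form quoted in the review: clear the bit, then OR it back in
// only when b is true (b converts to 0 or 1, so b * bitToSet is 0 or bitToSet).
std::uint32_t setBitBranchFree(std::uint32_t data, std::uint32_t bitToSet, bool b) {
  assert(bitToSet != 0);
  return (data & ~bitToSet) | (b * bitToSet);
}

// Explicit form with the same result; an assumed alternative, not from the patch.
std::uint32_t setBitExplicit(std::uint32_t data, std::uint32_t bitToSet, bool b) {
  assert(bitToSet != 0);
  return b ? (data | bitToSet) : (data & ~bitToSet);
}

// Flag holder mirroring the field names in the quoted any().
struct Flags {
  bool UnsafeAlgebra = false, NoNaNs = false, NoInfs = false,
       NoSignedZeros = false, AllowReciprocal = false;

  // Logical OR reads as a boolean test and short-circuits; for bool operands
  // the truth value matches the bitwise version.
  bool any() const {
    return UnsafeAlgebra || NoNaNs || NoInfs || NoSignedZeros || AllowReciprocal;
  }
};
```

For plain `bool` members the two spellings of `any()` agree on every input, so the choice is about readability rather than behavior.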

[LLVMdev] [llvm-commits] [PATCH] fast-math patches!

2012 Nov 15

On Nov 15, 2012, at 3:23 PM, Joe Abbey <joe.abbey at gmail.com> wrote:

> Though semantically equivalent in this case, I think you should use logical ORs here, not bitwise.
>
> + bool any() {
> +   return UnsafeAlgebra | NoNaNs | NoInfs | NoSignedZeros |
> +     AllowReciprocal;
> + }
>

Will do.

> Gripe: This pattern is probably super fast and has precedent… but the code is non-obvious:
>
> SubclassOptionalData =
>   (SubclassOptionalData & ~BitToSet) | (B * BitToSet);
>

This is an ex...

RFC: Generic IR reductions

2017 Jan 31

...ector.reduce.add(...) ?

> These intrinsics do not do any type promotion of the scalar result. Architectures like SVE which can do type promotion and reduction in a single instruction can pattern match the promote->reduce sequence.

Yup.

> ...
> int_vector_reduce_fmax(vector_src, i32 NoNaNs)

A large list, and probably doesn't even cover all SVE can do, let alone other reductions. Why not simplify this into something like:

  %sum = add <N x float>, <N x float> %a, <N x float> %b
  %red = @llvm.reduce(%sum, float %acc)

or

  %fast_red = @llvm.reduce(%sum)

For a...

what does -ffp-contract=fast allow?

2016 Nov 18

...as storing all FMF in
> metadata) was discussed here:
> https://llvm.org/bugs/show_bug.cgi?id=13118
> 3. The backend needs a thread of its own. We have at least these
> mechanisms to handle FMA codegen:
> a. TargetOptions for LessPreciseFPMADOption, UnsafeFPMath,
> NoInfsFPMath, NoNaNsFPMath, AllowFPOpFusion (Fast, Standard, Strict)
> b. SDNodeFlags for UnsafeAlgebra, NoNaNs, NoInfs, NoSignedZeros (but
> nothing for FMA since IR FMF has nothing for FMA)
> c. SelectionDAGTargetInfo::generateFMAsInMachineCombiner()
> d. TargetLoweringBase::isFMAFasterThanFMulAndFAdd...

what does -ffp-contract=fast allow?

2016 Nov 18

On Nov 17, 2016 5:53 PM, Mehdi Amini <mehdi.amini at apple.com> wrote:
>
>> On Nov 17, 2016, at 4:33 PM, Hal Finkel <hfinkel at anl.gov> wrote:
>>
>>> From: "Warren

RFC: Matrix math support

2019 Oct 28

...trix with M rows and N columns, using a stride of %Stride between columns. This allows for convenient storing of sub-matrices. The floating point versions of the intrinsics also take fast-math flags, which can be used to opt in to FMA generation and/or constant folding opportunities via NoInfs and NoNaNs. We plan to add them to the lowered instructions and rely on InstCombine & Co for related optimisations. The intrinsics will be lowered to regular LLVM vector operations in an IR lowering pass. This means that, by default, we can lower the builtins on all targets. Additionally, targets can implemen...
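The remark about a stride between columns making sub-matrix stores convenient can be illustrated with a scalar sketch. It assumes a column-major layout in which consecutive columns start `stride` elements apart; `storeColumns` is a hypothetical helper written for this example and is not the proposed intrinsic.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Store a column-major rows x cols matrix into dst, where consecutive
// destination columns begin stride elements apart (stride >= rows).
void storeColumns(const std::vector<float> &m, std::size_t rows,
                  std::size_t cols, std::size_t stride, float *dst) {
  assert(stride >= rows && m.size() == rows * cols);
  for (std::size_t c = 0; c < cols; ++c)
    for (std::size_t r = 0; r < rows; ++r)
      dst[c * stride + r] = m[c * rows + r];
}
```

Writing a 2x2 block into the interior of a 4x4 column-major buffer then amounts to passing a pointer to the block's top-left element and a stride of 4, for example `storeColumns(block, 2, 2, 4, buf.data() + 1 * 4 + 1)`.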

[LLVMdev] [llvm-commits] [PATCH] fast-math patches!

2012 Nov 15

Trying to apply patches. What's your base revision?

Joe

On Nov 15, 2012, at 5:44 PM, Michael Ilseman <milseman at apple.com> wrote:

> New patches with review feedback incorporated:
> * Changed single letter flags to short abbreviations ('S' ==> 'nsz')
> * Indentation fixes
> * Comments don't state function names

[RFC] Supporting ARM's SVE in LLVM

2016 Nov 04

...r for active lanes so we introduce predicated versions of the most common routines (e.g. `*llvm.pow`). This is mostly transparent to the loop vectorizer beyond the requirement to pass an extra parameter.

### TTI Interface:

```cpp
bool canReduceInVector(const RecurrenceDescriptor &Desc, bool NoNaN) const;
Value* getReductionIntrinsic(IRBuilder<> &Builder, const RecurrenceDescriptor& Desc, bool NoNaN, Value* Src) const;
```

# SelectionDAG Nodes

## *ISD::BUILD_VECTOR* {#isdbuildvector}

`BUILD_VECTOR(ELT0, ELT1, EL...

[LLVMdev] Bug 16257 - fmul of undef ConstantExpr not folded to undef

2014 Sep 16

As far as I know, LLVM does not try very hard to guarantee constant folded NaN payloads that match exactly what the target would generate.

-Owen

> On Sep 16, 2014, at 10:30 AM, Oleg Ranevskyy <llvm.mail.list at gmail.com> wrote:
>
> Hi Duncan,
>
> I reread everything we've discussed so far and would like to pay closer attention to the ARM's FPSCR register

[LLVMdev] [PATCH] fast-math patches!

2012 Nov 15

New patches with review feedback incorporated:
* Changed single letter flags to short abbreviations ('S' ==> 'nsz')
* Indentation fixes
* Comments don't state function names

-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0002-Fast-math-flags-added-to-FPMathOperator.patch
Type: application/octet-stream
Size: 4937 bytes
Desc: not

RFC: Generic IR reductions

2017 Jan 31

...>> These intrinsics do not do any type promotion of the scalar result. Architectures like SVE which can do type promotion and reduction in a single instruction can pattern match the promote->reduce sequence.
>
> Yup.
>
>
>> ...
>> int_vector_reduce_fmax(vector_src, i32 NoNaNs)
>
> A large list, and probably doesn't even cover all SVE can do, let
> alone other reductions.
>
> Why not simplify this into something like:
>
> %sum = add <N x float>, <N x float> %a, <N x float> %b
> %red = @llvm.reduce(%sum, float %acc)
>...

Trouble when suppressing a portion of fast-math-transformations

2017 Oct 04

> It might be clearer, instead of using 'libm', to use something like 'trans' (for transcendental functions).

That does seem clearer. ‘trans’ is definitely good with me.

-Warren

From: Hal Finkel [mailto:hfinkel at anl.gov]
Sent: Tuesday, October 3, 2017 5:13 PM
To: Ristow, Warren; Bruce Hoult
Cc: llvm-dev at lists.llvm.org
Subject: Re: [llvm-dev] Trouble when suppressing a
