thr3ads.net - search: "nonans"

Displaying 20 results from an estimated 20 matches for "nonans".

[RFC] Making space for a flush-to-zero flag in FastMathFlags

2019 Mar 16

Hi, I need to add a flush-denormals-to-zero (FTZ) flag to FastMathFlags, but we've already used up the 7 bits available in Value::SubclassOptionalData (the "backing storage" for FPMathOperator::getFastMathFlags()). These are the possibilities I can think of: 1. Increase the size of FPMathOperator. This gives us some additional bits for FTZ and other fastmath flags we'd want
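To make the space problem concrete, here is a minimal sketch of a 7-bit flag field with no room for an eighth flag. The names kFieldBits, kFieldMask, Flag, fitsInField and setFlag are hypothetical, and the bit positions are illustrative assumptions, not LLVM's actual layout; only the 7-bit width comes from the message above.

```cpp
#include <cassert>
#include <cstdint>

// Width of the storage field, taken from the message above.
constexpr unsigned kFieldBits = 7;
constexpr std::uint8_t kFieldMask = (1u << kFieldBits) - 1;  // 0x7F

// Illustrative flag layout; the bit positions are assumptions, not LLVM's.
enum Flag : std::uint8_t {
  Reassoc         = 1u << 0,
  NoNaNs          = 1u << 1,
  NoInfs          = 1u << 2,
  NoSignedZeros   = 1u << 3,
  AllowReciprocal = 1u << 4,
  AllowContract   = 1u << 5,
  ApproxFunc      = 1u << 6,
  FlushToZero     = 1u << 7,  // hypothetical eighth flag: one bit too many
};

// True when every set bit lies inside the 7-bit field.
constexpr bool fitsInField(std::uint8_t flags) {
  return (flags & ~kFieldMask) == 0;
}

// Sets a flag in the field; rejects flags that would not be stored.
inline std::uint8_t setFlag(std::uint8_t field, std::uint8_t flag) {
  assert(fitsInField(flag) && "flag does not fit in the 7-bit field");
  return static_cast<std::uint8_t>((field | flag) & kFieldMask);
}
```

With this layout fitsInField(FlushToZero) is false, which is the situation the RFC sets out to resolve.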

[LLVMdev] Bug 16257 - fmul of undef ConstantExpr not folded to undef

2014 Sep 22

...es for fadd:
>> (1) fadd %x, -0.0 -> %x
>> (2) fadd undef, undef -> undef
>> (3) fadd %x, undef -> NaN (undef is a NaN which is propagated)
>>
>> Looking through the code I found the "NoNaNs" flag accessed through an instance of the FastMathFlags class. (2) and (3) should probably depend on it. If the flag is set, (2) and (3) cannot be folded as there are no NaNs and we are not guaranteed to get an arbitrary bit pattern from...

[RFC] Making space for a flush-to-zero flag in FastMathFlags

2019 Mar 18

On Sun, Mar 17, 2019 at 1:47 PM Craig Topper <craig.topper at gmail.com> wrote:
> Can we move HasValueHandle out of the byte used for SubClassOptionalData and move it to the flags at the bottom of value by shrinking NumUserOperands to 27?

I like this approach because it is less work for me. :) But I agree with Sanjay below that this only kicks the can slightly further down the road

[LLVMdev] Bug 16257 - fmul of undef ConstantExpr not folded to undef

2014 Sep 17

...o sum up, below is the list of correct folding examples for fadd:
(1) fadd %x, -0.0 -> %x
(2) fadd undef, undef -> undef
(3) fadd %x, undef -> NaN (undef is a NaN which is propagated)

Looking through the code I found the "NoNaNs" flag accessed through an instance of the FastMathFlags class. (2) and (3) should probably depend on it. If the flag is set, (2) and (3) cannot be folded as there are no NaNs and we are not guaranteed to get an arbitrary bit pattern from fadd, right? Other arithmetic FP operations (fsub, fm...
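The arithmetic facts behind folds (1) and (3) can be checked with ordinary IEEE-754 doubles. The sketch below assumes the default round-to-nearest mode and a build without fast-math options; the helper names (sameValue, addNegZeroIsIdentity and so on) are hypothetical and are not part of LLVM. It shows that adding -0.0 preserves every input including the sign of zero, that adding +0.0 does not, and that a NaN operand propagates.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Equality that also distinguishes +0.0 from -0.0 and treats NaN as equal to NaN.
inline bool sameValue(double a, double b) {
  if (std::isnan(a) || std::isnan(b)) return std::isnan(a) && std::isnan(b);
  return a == b && std::signbit(a) == std::signbit(b);
}

// Fold (1): x + (-0.0) gives back x, sign of zero included.
inline bool addNegZeroIsIdentity(double x) { return sameValue(x + (-0.0), x); }

// Counterpart: x + (+0.0) is not an identity, since -0.0 + 0.0 is +0.0.
inline bool addPosZeroIsIdentity(double x) { return sameValue(x + 0.0, x); }

// Fold (3) relies on NaN propagation: x + NaN is NaN.
inline bool addNaNGivesNaN(double x) {
  return std::isnan(x + std::numeric_limits<double>::quiet_NaN());
}

// Small self-check over a few representative inputs.
inline void checkFaddFolds() {
  const double samples[] = {1.5, -2.0, 0.0, -0.0};
  for (double x : samples) {
    assert(addNegZeroIsIdentity(x));
    assert(addNaNGivesNaN(x));
  }
  assert(!addPosZeroIsIdentity(-0.0));
}
```

This explains why the identity uses -0.0 rather than +0.0: only the former keeps the sign of a negative-zero input.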

[RFC] Making space for a flush-to-zero flag in FastMathFlags

2019 Mar 18

We knew the day when we needed another FMF bit was coming back in: https://reviews.llvm.org/D39304 ...it was just a question of 'when'. :) I'm guessing that an FTZ bit won't be the last new bit needed if we consider permutations between strict FP and fast-math. Even without that, denormals-as-zero (DAZ) might also be useful? So rather than continuing to carve these out bit-by-bit,

RFC: Generic IR reductions

2017 Jan 31

...on in a single instruction can pattern match the promote->reduce sequence.

And for minmax recurrence types, we have:

int_vector_reduce_smax(vector_src)
int_vector_reduce_smin(vector_src)
int_vector_reduce_umax(vector_src)
int_vector_reduce_umin(vector_src)
int_vector_reduce_fmin(vector_src, i32 NoNaNs)
int_vector_reduce_fmax(vector_src, i32 NoNaNs)

These reduction operations can be mapped from the recurrence kinds defined in LoopUtils, however any front-end or pass in LLVM may use them.

Predication
===========

We have multiple options for expressing vector predication in reductions:

1. The f...
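As a rough scalar model of what a NoNaNs operand on an fmax reduction could permit, the sketch below switches between a NaN-aware step and a plain comparison. The function reduceFMax is hypothetical, and using std::fmax (which returns the numeric operand when the other is NaN) for the NaN-aware path is an assumption made for illustration; the excerpt does not define the intrinsic's NaN semantics.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical scalar model of an fmax reduction with a NoNaNs flag.
inline float reduceFMax(const std::vector<float>& v, bool noNaNs) {
  assert(!v.empty() && "reduction of an empty vector is not modelled");
  float acc = v[0];
  for (std::size_t i = 1; i < v.size(); ++i) {
    if (noNaNs) {
      // Caller promises no NaNs, so a bare compare-and-select is enough.
      acc = v[i] > acc ? v[i] : acc;
    } else {
      // NaN-aware step (assumed semantics): std::fmax skips a NaN operand.
      acc = std::fmax(acc, v[i]);
    }
  }
  return acc;
}
```

The point of the flag in this model is that the compare-and-select path maps directly onto a simple max instruction, while the NaN-aware path may need extra handling.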

[LLVMdev] [llvm-commits] [PATCH] fast-math patches!

2012 Nov 16

Michael,

Overall the code looks good.

80-cols:

    FMF.UnsafeAlgebra = 0 != (Record[OpNum] & (1 << bitc::FMF_UNSAFE_ALGEBRA));
    FMF.NoNaNs = 0 != (Record[OpNum] & (1 << bitc::FMF_NO_NANS));
    FMF.NoInfs = 0 != (Record[OpNum] & (1 << bitc::FMF_NO_INFS));
    FMF.NoSignedZeros = 0 != (Record[OpNum] & (1 << bitc::FMF_NO_SIGNED_ZEROS));
    FMF.AllowRecip...

[LLVMdev] [llvm-commits] [PATCH] fast-math patches!

2012 Nov 16

...le.com> wrote:
>
> On Nov 15, 2012, at 3:23 PM, Joe Abbey <joe.abbey at gmail.com> wrote:
>
>> Though semantically equivalent in this case, however I think you should use logical ors here not bitwise.
>>
>> + bool any() {
>> +   return UnsafeAlgebra | NoNaNs | NoInfs | NoSignedZeros |
>> +     AllowReciprocal;
>> + }
>>
>
> Will do.
>
>> Gripe: This pattern is probably super fast and has precedence… but the code is non-obvious:
>>
>> SubclassOptionalData =
>> (SubclassOptionalData & ~BitToS...

[LLVMdev] [llvm-commits] [PATCH] fast-math patches!

2012 Nov 15

Though semantically equivalent in this case, however I think you should use logical ors here not bitwise.

+ bool any() {
+   return UnsafeAlgebra | NoNaNs | NoInfs | NoSignedZeros |
+     AllowReciprocal;
+ }

Gripe: This pattern is probably super fast and has precedence… but the code is non-obvious:

    SubclassOptionalData =
      (SubclassOptionalData & ~BitToSet) | (B * BitToSet);

This is likely one iota slower.. but it sure is easier to get the...
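Both review points can be checked in isolation. The sketch below uses hypothetical free functions, not the patch's actual members. It compares the quoted branchless update with an explicit if/else form, and bitwise with logical OR on bool operands. It assumes B is a bool, so that B * BitToSet is either 0 or BitToSet.

```cpp
#include <cassert>

// Branchless form quoted in the review: clear the bit, then OR it back in if b is true.
inline unsigned setBitBranchless(unsigned data, unsigned bitToSet, bool b) {
  return (data & ~bitToSet) | (b * bitToSet);
}

// The more explicit spelling of the same update.
inline unsigned setBitExplicit(unsigned data, unsigned bitToSet, bool b) {
  if (b)
    data |= bitToSet;
  else
    data &= ~bitToSet;
  return data;
}

// On bool operands, | and || yield the same truth value; || also short-circuits.
inline bool anyBitwise(bool a, bool b, bool c) { return a | b | c; }
inline bool anyLogical(bool a, bool b, bool c) { return a || b || c; }

// Exhaustive check over a 7-bit field and all bool combinations.
inline void checkEquivalence() {
  for (unsigned data = 0; data < 128; ++data)
    for (unsigned bit = 1; bit < 128; bit <<= 1)
      for (int b = 0; b < 2; ++b)
        assert(setBitBranchless(data, bit, b != 0) ==
               setBitExplicit(data, bit, b != 0));
  for (int m = 0; m < 8; ++m)
    assert(anyBitwise(m & 1, m & 2, m & 4) == anyLogical(m & 1, m & 2, m & 4));
}
```

Since the two forms agree on every input, the choice between them is a matter of readability, which is the point the reviewer raises.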

[LLVMdev] [llvm-commits] [PATCH] fast-math patches!

2012 Nov 15

On Nov 15, 2012, at 3:23 PM, Joe Abbey <joe.abbey at gmail.com> wrote:

> Though semantically equivalent in this case, however I think you should use logical ors here not bitwise.
>
> + bool any() {
> +   return UnsafeAlgebra | NoNaNs | NoInfs | NoSignedZeros |
> +     AllowReciprocal;
> + }
>

Will do.

> Gripe: This pattern is probably super fast and has precedence… but the code is non-obvious:
>
> SubclassOptionalData =
> (SubclassOptionalData & ~BitToSet) | (B * BitToSet);
>

This is an exi...

RFC: Generic IR reductions

2017 Jan 31

...ector.reduce.add(...) ?

> These intrinsics do not do any type promotion of the scalar result. Architectures like SVE which can do type promotion and reduction in a single instruction can pattern match the promote->reduce sequence.

Yup.

> ...
> int_vector_reduce_fmax(vector_src, i32 NoNaNs)

A large list, and probably doesn't even cover all SVE can do, let alone other reductions. Why not simplify this into something like:

    %sum = add <N x float>, <N x float> %a, <N x float> %b
    %red = @llvm.reduce(%sum, float %acc)

or

    %fast_red = @llvm.reduce(%sum)

For a...

what does -ffp-contract=fast allow?

2016 Nov 18

...as storing all FMF in metadata) was discussed here:
> https://llvm.org/bugs/show_bug.cgi?id=13118
> 3. The backend needs a thread of its own. We have at least these mechanisms to handle FMA codegen:
>    a. TargetOptions for LessPreciseFPMADOption, UnsafeFPMath, NoInfsFPMath, NoNaNsFPMath, AllowFPOpFusion (Fast, Standard, Strict)
>    b. SDNodeFlags for UnsafeAlgebra, NoNaNs, NoInfs, NoSignedZeros (but nothing for FMA since IR FMF has nothing for FMA)
>    c. SelectionDAGTargetInfo::generateFMAsInMachineCombiner()
>    d. TargetLoweringBase::isFMAFasterThanFMulAndFAdd(...

what does -ffp-contract=fast allow?

2016 Nov 18

On Nov 17, 2016 5:53 PM, Mehdi Amini <mehdi.amini at apple.com> wrote:
>
>> On Nov 17, 2016, at 4:33 PM, Hal Finkel <hfinkel at anl.gov> wrote:
>>
>>> From: "Warren

RFC: Matrix math support

2019 Oct 28

...trix with M rows and N columns, using a stride of %Stride between columns. This allows for convenient storing of submatrices. The floating point versions of the intrinsics also take fast-math flags, which can be used to opt in to FMA generation and/or constant folding opportunities via NoInfs and NoNaNs. We plan to add them to the lowered instructions and rely on InstCombine & Co for related optimisations. The intrinsics will be lowered to regular LLVM vector operations in an IR lowering pass. This means that by default, we can lower the builtins on all targets. Additionally, targets can implement...

[LLVMdev] [llvm-commits] [PATCH] fast-math patches!

2012 Nov 15

Trying to apply patches.. What's your base revision?

Joe

On Nov 15, 2012, at 5:44 PM, Michael Ilseman <milseman at apple.com> wrote:
> New patches with review feedback incorporated:
> * Changed single letter flags to short abbreviations ('S' ==> 'nsz')
> * Indentation fixes
> * Comments don't state function names

[RFC] Supporting ARM's SVE in LLVM

2016 Nov 04

Hi, We've been working for the last two years on support for ARM's Scalable Vector Extension in LLVM, and we'd like to upstream our work. We've had to make several design decisions without community input, and would like to discuss the major changes we've made. To help with the discussions, I've attached a technical document (also in plain text below) to describe the

[LLVMdev] Bug 16257 - fmul of undef ConstantExpr not folded to undef

2014 Sep 16

As far as I know, LLVM does not try very hard to guarantee constant folded NaN payloads that match exactly what the target would generate.

-Owen

> On Sep 16, 2014, at 10:30 AM, Oleg Ranevskyy <llvm.mail.list at gmail.com> wrote:
>
> Hi Duncan,
>
> I reread everything we've discussed so far and would like to pay closer attention to the ARM's FPSCR register

[LLVMdev] [PATCH] fast-math patches!

2012 Nov 15

New patches with review feedback incorporated:
* Changed single letter flags to short abbreviations ('S' ==> 'nsz')
* Indentation fixes
* Comments don't state function names

-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0002-Fast-math-flags-added-to-FPMathOperator.patch
Type: application/octet-stream
Size: 4937 bytes
Desc: not

RFC: Generic IR reductions

2017 Jan 31

...>> These intrinsics do not do any type promotion of the scalar result. Architectures like SVE which can do type promotion and reduction in a single instruction can pattern match the promote->reduce sequence.
>
> Yup.
>
>> ...
>> int_vector_reduce_fmax(vector_src, i32 NoNaNs)
>
> A large list, and probably doesn't even cover all SVE can do, let alone other reductions.
>
> Why not simplify this into something like:
>
> %sum = add <N x float>, <N x float> %a, <N x float> %b
> %red = @llvm.reduce(%sum, float %acc)
> o...

Trouble when suppressing a portion of fast-math-transformations

2017 Oct 04

> It might be clearer, instead of using 'libm', to use something like 'trans' (for transcendental functions).

That does seem clearer. ‘trans’ is definitely good with me.

-Warren

From: Hal Finkel [mailto:hfinkel at anl.gov]
Sent: Tuesday, October 3, 2017 5:13 PM
To: Ristow, Warren; Bruce Hoult
Cc: llvm-dev at lists.llvm.org
Subject: Re: [llvm-dev] Trouble when suppressing a
