Displaying 20 results from an estimated 20 matches for "nonans".
2019 Mar 16
3
[RFC] Making space for a flush-to-zero flag in FastMathFlags
Hi,
I need to add a flush-denormals-to-zero (FTZ) flag to FastMathFlags,
but we've already used up the 7 bits available in
Value::SubclassOptionalData (the "backing storage" for
FPMathOperator::getFastMathFlags()). These are the possibilities I
can think of:
1. Increase the size of FPMathOperator. This gives us some additional
bits for FTZ and other fastmath flags we'd want
2014 Sep 22
2
[LLVMdev] Bug 16257 - fmul of undef ConstantExpr not folded to undef
...es for fadd:
>> (1) fadd %x, -0.0 -> %x
>> (2) fadd undef, undef -> undef
>> (3) fadd %x, undef -> NaN (undef is a NaN which
>> is propagated)
>>
>> Looking through the code I found the "NoNaNs" flag accessed through
>> an instance
>> of the FastMathFlags class.
>> (2) and (3) should probably depend on it.
>> If the flag is set, (2) and (3) cannot be folded as there are no NaNs
>> and we are
>> not guaranteed to get an arbitrary bit pattern from...
2019 Mar 18
2
[RFC] Making space for a flush-to-zero flag in FastMathFlags
On Sun, Mar 17, 2019 at 1:47 PM Craig Topper <craig.topper at gmail.com> wrote:
> Can we move HasValueHandle out of the byte used for SubClassOptionalData and move it to the flags at the bottom of value by shrinking NumUserOperands to 27?
I like this approach because it is less work for me. :)
But I agree with Sanjay below that this only kicks the can slightly
further down the road
2014 Sep 17
3
[LLVMdev] Bug 16257 - fmul of undef ConstantExpr not folded to undef
...o sum up, below is the list of correct folding examples for fadd:
(1) fadd %x, -0.0 -> %x
(2) fadd undef, undef -> undef
(3) fadd %x, undef -> NaN (undef is a NaN which is
propagated)
Looking through the code I found the "NoNaNs" flag accessed through an
instance of the FastMathFlags class.
(2) and (3) should probably depend on it.
If the flag is set, (2) and (3) cannot be folded as there are no NaNs
and we are not guaranteed to get an arbitrary bit pattern from fadd, right?
Other arithmetic FP operations (fsub, fm...
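A minimal standalone sketch of why rule (1) is stated with -0.0 rather than +0.0, and of the NaN propagation that rule (3) relies on. It uses plain C++ doubles under default IEEE-754 round-to-nearest and assumes no fast-math compiler flags; the helper names are hypothetical and this is not LLVM code.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// True when a and b are the same value *including* the sign of zero.
// Plain == is not enough here, because +0.0 == -0.0 compares equal.
inline bool sameValueAndSign(double a, double b) {
  return a == b && std::signbit(a) == std::signbit(b);
}

// Models rule (1): adding -0.0 leaves x unchanged, even when x is a zero.
inline bool addNegZeroIsIdentity(double x) {
  return sameValueAndSign(x + (-0.0), x);
}

// The tempting variant "x + (+0.0) -> x" fails for x == -0.0,
// because -0.0 + +0.0 yields +0.0 in the default rounding mode.
inline bool addPosZeroIsIdentity(double x) {
  return sameValueAndSign(x + 0.0, x);
}

// Models the propagation part of rule (3): a NaN operand yields a NaN result.
inline bool addPropagatesNaN(double x) {
  const double nan = std::numeric_limits<double>::quiet_NaN();
  return std::isnan(x + nan);
}
```

Under these assumptions `addNegZeroIsIdentity` holds for both zeros, while `addPosZeroIsIdentity(-0.0)` is false, which is why the fold is only written for -0.0.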
2019 Mar 18
3
[RFC] Making space for a flush-to-zero flag in FastMathFlags
We knew the day when we needed another FMF bit was coming back in:
https://reviews.llvm.org/D39304
...it was just a question of 'when'. :)
I'm guessing that an FTZ bit won't be the last new bit needed if we
consider permutations between strict FP and fast-math. Even without that,
denormals-as-zero (DAZ) might also be useful?
So rather than continuing to carve these out bit-by-bit,
2017 Jan 31
2
RFC: Generic IR reductions
...on in a single instruction can pattern match the promote->reduce sequence.
And for minmax recurrence types, we have:
int_vector_reduce_smax(vector_src)
int_vector_reduce_smin(vector_src)
int_vector_reduce_umax(vector_src)
int_vector_reduce_umin(vector_src)
int_vector_reduce_fmin(vector_src, i32 NoNaNs)
int_vector_reduce_fmax(vector_src, i32 NoNaNs)
These reduction operations can be mapped from the recurrence kinds defined in LoopUtils, however any front-end or pass in LLVM may use them.
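The excerpt does not spell out what the NoNaNs operand permits, so the following is only a scalar model of why such a hint could matter for an fmax reduction. It contrasts a NaN-aware reduction built on std::fmax with a compare-and-select reduction that is only interchangeable with it when no input is NaN. Both function names are hypothetical, and both assume a non-empty input.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// NaN-aware reduction: std::fmax returns the non-NaN operand when exactly
// one operand is NaN, so a stray NaN does not disturb the result.
inline float reduceFMaxNaNAware(const std::vector<float> &v) {
  float acc = v[0];
  for (std::size_t i = 1; i < v.size(); ++i)
    acc = std::fmax(acc, v[i]);
  return acc;
}

// Compare-and-select reduction: every comparison against NaN is false, so
// the result depends on where a NaN sits. Only valid when inputs have no NaNs.
inline float reduceFMaxNoNaNs(const std::vector<float> &v) {
  float acc = v[0];
  for (std::size_t i = 1; i < v.size(); ++i)
    acc = (acc > v[i]) ? acc : v[i];
  return acc;
}
```

For NaN-free input the two agree. For {1.0, NaN, 0.5} the first returns 1.0 and the second returns 0.5, so the cheaper form needs a no-NaNs guarantee from somewhere.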
Predication
===========
We have multiple options for expressing vector predication in reductions:
1. The f...
2012 Nov 16
0
[LLVMdev] [llvm-commits] [PATCH] fast-math patches!
Michael,
Overall the code looks good.
80-cols:
FMF.UnsafeAlgebra = 0 != (Record[OpNum] & (1 << bitc::FMF_UNSAFE_ALGEBRA));
FMF.NoNaNs = 0 != (Record[OpNum] & (1 << bitc::FMF_NO_NANS));
FMF.NoInfs = 0 != (Record[OpNum] & (1 << bitc::FMF_NO_INFS));
FMF.NoSignedZeros = 0 != (Record[OpNum] & (1 << bitc::FMF_NO_SIGNED_ZEROS));
FMF.AllowRecip...
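One way the flagged lines could be brought under 80 columns is to factor the bit test into a helper. The sketch below is standalone C++, not the patch itself: the bit positions, the FMF_ALLOW_RECIPROCAL name, the FastMathFlagsModel struct, and both functions are assumptions made for illustration, and the real values live in LLVM's bitcode headers.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical bit positions, for illustration only.
enum FMFBit : unsigned {
  FMF_UNSAFE_ALGEBRA = 0,
  FMF_NO_NANS = 1,
  FMF_NO_INFS = 2,
  FMF_NO_SIGNED_ZEROS = 3,
  FMF_ALLOW_RECIPROCAL = 4 // name assumed; the quoted line is truncated
};

// Stand-in for the flag set being filled in by the reader code.
struct FastMathFlagsModel {
  bool UnsafeAlgebra = false;
  bool NoNaNs = false;
  bool NoInfs = false;
  bool NoSignedZeros = false;
  bool AllowReciprocal = false;
};

// True when the given bit position is set in the record word.
inline bool testBit(std::uint64_t record, unsigned bit) {
  return (record & (std::uint64_t{1} << bit)) != 0;
}

// Decodes one record word into individual flags, one short line per flag.
inline FastMathFlagsModel decodeFMF(std::uint64_t record) {
  FastMathFlagsModel fmf;
  fmf.UnsafeAlgebra = testBit(record, FMF_UNSAFE_ALGEBRA);
  fmf.NoNaNs = testBit(record, FMF_NO_NANS);
  fmf.NoInfs = testBit(record, FMF_NO_INFS);
  fmf.NoSignedZeros = testBit(record, FMF_NO_SIGNED_ZEROS);
  fmf.AllowReciprocal = testBit(record, FMF_ALLOW_RECIPROCAL);
  return fmf;
}
```

With these assumed positions, `decodeFMF(0x2)` sets only NoNaNs.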
2012 Nov 16
2
[LLVMdev] [llvm-commits] [PATCH] fast-math patches!
...le.com> wrote:
>
> On Nov 15, 2012, at 3:23 PM, Joe Abbey <joe.abbey at gmail.com> wrote:
>
>> Though semantically equivalent in this case, I think you should use logical ORs here, not bitwise.
>>
>> + bool any() {
>> + return UnsafeAlgebra | NoNaNs | NoInfs | NoSignedZeros |
>> + AllowReciprocal;
>> + }
>>
>
> Will do.
>
>> Gripe: This pattern is probably super fast and has precedent… but the code is non-obvious:
>>
>> SubclassOptionalData =
>> (SubclassOptionalData & ~BitToS...
2012 Nov 15
2
[LLVMdev] [llvm-commits] [PATCH] fast-math patches!
Though semantically equivalent in this case, I think you should use logical ORs here, not bitwise.
+ bool any() {
+ return UnsafeAlgebra | NoNaNs | NoInfs | NoSignedZeros |
+ AllowReciprocal;
+ }
Gripe: This pattern is probably super fast and has precedent… but the code is non-obvious:
SubclassOptionalData =
(SubclassOptionalData & ~BitToSet) | (B * BitToSet);
This is likely one iota slower, but it sure is easier to get the...
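The reviewer's alternative is cut off above, so the explicit version below is only a guess at the kind of rewrite being suggested, not the reviewer's code. The sketch puts the branchless pattern from the patch next to an if/else form with the same behaviour. Both function names are hypothetical, and an 8-bit word stands in for SubclassOptionalData.

```cpp
#include <cassert>
#include <cstdint>

// Branchless form quoted in the review: clear the bit, then OR it back in
// only when b is true (b converts to 0 or 1, so b * bit is 0 or bit).
inline std::uint8_t setBitBranchless(std::uint8_t data, std::uint8_t bit,
                                     bool b) {
  return static_cast<std::uint8_t>((data & ~bit) | (b * bit));
}

// Explicit form: same result, with the two cases spelled out.
inline std::uint8_t setBitExplicit(std::uint8_t data, std::uint8_t bit,
                                   bool b) {
  if (b)
    return static_cast<std::uint8_t>(data | bit);
  return static_cast<std::uint8_t>(data & ~bit);
}
```

The two agree for every input, so the choice between them is about readability rather than behaviour.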
2012 Nov 15
0
[LLVMdev] [llvm-commits] [PATCH] fast-math patches!
On Nov 15, 2012, at 3:23 PM, Joe Abbey <joe.abbey at gmail.com> wrote:
> Though semantically equivalent in this case, I think you should use logical ORs here, not bitwise.
>
> + bool any() {
> + return UnsafeAlgebra | NoNaNs | NoInfs | NoSignedZeros |
> + AllowReciprocal;
> + }
>
Will do.
> Gripe: This pattern is probably super fast and has precedent… but the code is non-obvious:
>
> SubclassOptionalData =
> (SubclassOptionalData & ~BitToSet) | (B * BitToSet);
>
This is an exi...
2017 Jan 31
0
RFC: Generic IR reductions
...ector.reduce.add(...) ?
> These intrinsics do not do any type promotion of the scalar result. Architectures like SVE which can do type promotion and reduction in a single instruction can pattern match the promote->reduce sequence.
Yup.
> ...
> int_vector_reduce_fmax(vector_src, i32 NoNaNs)
A large list, and probably doesn't even cover all SVE can do, let
alone other reductions.
Why not simplify this into something like:
%sum = add <N x float>, <N x float> %a, <N x float> %b
%red = @llvm.reduce(%sum, float %acc)
or
%fast_red = @llvm.reduce(%sum)
For a...
2016 Nov 18
2
what does -ffp-contract=fast allow?
...as storing all FMF in
> metadata) was discussed here:
> https://llvm.org/bugs/show_bug.cgi?id=13118
> 3. The backend needs a thread of its own. We have at least these
> mechanisms to handle FMA codegen:
> a. TargetOptions for LessPreciseFPMADOption, UnsafeFPMath,
> NoInfsFPMath, NoNaNsFPMath, AllowFPOpFusion (Fast, Standard, Strict)
> b. SDNodeFlags for UnsafeAlgebra, NoNaNs, NoInfs, NoSignedZeros (but
> nothing for FMA since IR FMF has nothing for FMA)
> c. SelectionDAGTargetInfo::generateFMAsInMachineCombiner()
> d. TargetLoweringBase::isFMAFasterThanFMulAndFAdd(...
2016 Nov 18
2
what does -ffp-contract=fast allow?
On Nov 17, 2016 5:53 PM, Mehdi Amini <mehdi.amini at apple.com> wrote:
>
>
>> On Nov 17, 2016, at 4:33 PM, Hal Finkel <hfinkel at anl.gov> wrote:
>>
>>
>>>
>>> From: "Warren
2019 Oct 28
6
RFC: Matrix math support
...trix with M rows and N columns, using a stride of %Stride between columns. This allows for convenient storing of submatrices.
The floating point versions of the intrinsics also take fast-math flags, which can be used to opt in to FMA generation and/or constant folding opportunities via NoInfs and NoNaNs. We plan to add them to the lowered instructions and rely on InstCombine & Co for related optimisations.
The intrinsics will be lowered to regular LLVM vector operations in an IR lowering pass. This means that by default, we can lower the builtins on all targets. Additionally, targets can implement...
2012 Nov 15
0
[LLVMdev] [llvm-commits] [PATCH] fast-math patches!
Trying to apply patches..
What's your base revision?
Joe
On Nov 15, 2012, at 5:44 PM, Michael Ilseman <milseman at apple.com> wrote:
> New patches with review feedback incorporated:
> * Changed single letter flags to short abbreviations ('S' ==> 'nsz')
> * Indentation fixes
> * Comments don't state function names
>
>
2016 Nov 04
2
[RFC] Supporting ARM's SVE in LLVM
Hi,
We've been working for the last two years on support for ARM's Scalable Vector Extension in LLVM, and we'd like to upstream our work. We've had to make several design decisions without community input, and would like to discuss the major changes we've made. To help with the discussions, I've attached a technical document (also in plain text below) to describe the
2014 Sep 16
2
[LLVMdev] Bug 16257 - fmul of undef ConstantExpr not folded to undef
As far as I know, LLVM does not try very hard to guarantee constant folded NaN payloads that match exactly what the target would generate.
—Owen
> On Sep 16, 2014, at 10:30 AM, Oleg Ranevskyy <llvm.mail.list at gmail.com> wrote:
>
> Hi Duncan,
>
> I reread everything we've discussed so far and would like to pay closer attention to the ARM's FPSCR register
2012 Nov 15
3
[LLVMdev] [PATCH] fast-math patches!
New patches with review feedback incorporated:
* Changed single letter flags to short abbreviations ('S' ==> 'nsz')
* Indentation fixes
* Comments don't state function names
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0002-Fast-math-flags-added-to-FPMathOperator.patch
Type: application/octet-stream
Size: 4937 bytes
Desc: not
2017 Jan 31
4
RFC: Generic IR reductions
...>> These intrinsics do not do any type promotion of the scalar result. Architectures like SVE which can do type promotion and reduction in a single instruction can pattern match the promote->reduce sequence.
>
> Yup.
>
>
>> ...
>> int_vector_reduce_fmax(vector_src, i32 NoNaNs)
>
> A large list, and probably doesn't even cover all SVE can do, let
> alone other reductions.
>
> Why not simplify this into something like:
>
> %sum = add <N x float>, <N x float> %a, <N x float> %b
> %red = @llvm.reduce(%sum, float %acc)
> o...
2017 Oct 04
2
Trouble when suppressing a portion of fast-math-transformations
> It might be clearer, instead of using 'libm', to use something like 'trans' (for transcendental functions).
That does seem clearer. ‘trans’ is definitely good with me.
-Warren
From: Hal Finkel [mailto:hfinkel at anl.gov]
Sent: Tuesday, October 3, 2017 5:13 PM
To: Ristow, Warren; Bruce Hoult
Cc: llvm-dev at lists.llvm.org
Subject: Re: [llvm-dev] Trouble when suppressing a