thr3ads.net - llvm dev - [LLVMdev] Representing -ffast-math at the IR level [Apr 2012]

Duncan Sands

2012-Apr-14 18:28 UTC

[LLVMdev] Representing -ffast-math at the IR level

The attached patch is a first attempt at representing "-ffast-math" at the IR
level, in fact on individual floating point instructions (fadd, fsub etc). It
is done using metadata. We already have a "fpmath" metadata type which can be
used to signal that reduced precision is OK for a floating point operation, eg

%z = fmul float %x, %y, !fpmath !0
...
!0 = metadata !{double 2.5}

indicates that the multiplication can be done in any way that doesn't introduce
more than 2.5 ULPs of error.
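
To make the ULP bound concrete, here is a minimal sketch in standard C++ (it is
not part of the patch, and the helper names ulpDistance and withinUlps are made
up for illustration). It assumes IEEE-754 binary32 floats and only handles
finite values of the same sign, where the distance in ULPs equals the
difference of the bit patterns.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>

// Hypothetical helper: number of representable floats between A and B.
// Assumes IEEE-754 binary32 and finite inputs with the same sign.
static std::int64_t ulpDistance(float A, float B) {
  std::int32_t IA, IB;
  std::memcpy(&IA, &A, sizeof IA); // reinterpret the bits without UB
  std::memcpy(&IB, &B, sizeof IB);
  std::int64_t D = static_cast<std::int64_t>(IA) - static_cast<std::int64_t>(IB);
  return D < 0 ? -D : D;
}

// Hypothetical check: is Approx within MaxUlps of Reference?
static bool withinUlps(float Approx, float Reference, double MaxUlps) {
  return static_cast<double>(ulpDistance(Approx, Reference)) <= MaxUlps;
}
```

With this, a result two floats away from the reference passes a 2.5 ULP bound,
while one three floats away does not.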

The first observation is that !fpmath can be extended with additional operands
in the future: operands that say things like whether it is OK to assume that
there are no NaNs and so forth.

This patch doesn't add additional operands though. It just allows the existing
accuracy operand to be the special keyword "fast" instead of a number:

%z = fmul float %x, %y, !fpmath !0
...
!0 = metadata !{metadata !"fast"}

This indicates that accuracy loss is acceptable (just how much is unspecified)
for the sake of speed. Thanks to Chandler for pushing me to do it this way!

It also creates a simple way of getting and setting this information: the
FPMathOperator class. You can cast appropriate instructions to this class
and then use the querying/mutating methods to get/set the accuracy, whether
2.5 or "fast". The attached clang patch uses this to set the OpenCL 2.5 ULPs
accuracy rather than doing it by hand, for example.

In addition it changes IRBuilder so that you can provide an accuracy when
creating floating point operations. I don't like this so much. It would
be more efficient to just create the metadata once and then splat it onto
each instruction. Also, if fpmath gets a bunch more options/operands in
the future then this interface will become more and more awkward. Opinions
welcome!

I didn't actually implement any optimizations that use this yet.

I took a look at the impact on aermod.f90, a reasonably floating point heavy
Fortran benchmark (4% of the human readable IR consists of floating point
operations). At -O3 (the worst), the size of the bitcode increases by 0.8%.
No idea if that's acceptable - hopefully it is!

Enjoy!

Duncan.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: fastm-llvm.diff
Type: text/x-patch
Size: 14251 bytes
Desc: not available
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20120414/95aa6cb6/attachment.bin>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: fastm-clang.diff
Type: text/x-patch
Size: 2240 bytes
Desc: not available
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20120414/95aa6cb6/attachment-0001.bin>

Renato Golin

2012-Apr-14 19:23 UTC

Hi Duncan,

I'm not sure about this:

+  if (!Accuracy)
+    // If it's not a floating point number then it must be 'fast'.
+    return getFastAccuracy();

Since you allow accuracies bigger than 1 in setFPAccuracy(), integers
should be treated as float. Or at least assert.

Also, I'm thinking you should carry the annotation forward on all uses
of an annotated result, or make sure the floating point library
searches recursively for annotations on any dependency of the value
being analysed.

About creating annotations every time, I think this could be a nice
idea for a metadata factory functionality. Something that would cache
metadata, and in case of repetition, point to the same metadata. This
could be used for other optimisations (if I recall correctly, the
debug metadata does that already).

The problem with this is that, if an optimisation pass changes one,
you must make sure the other can also be changed, or split-on-write,
and that can cause some bloated code in the optimiser, which is not
ideal.

I think, for now, it's acceptable. But should be on request basis
(aka, only present if -fmath options are explicitly specified).

The rest of the patch looks sane, though. I like the idea of using
metadata, since the target code can easily ignore if it doesn't
support FP optimisations or IEEE strictness.

cheers,
--renato

Duncan Sands

2012-Apr-14 19:34 UTC

Hi Renato,

> I'm not sure about this:
>
> +  if (!Accuracy)
> +    // If it's not a floating point number then it must be 'fast'.
> +    return getFastAccuracy();
>
> Since you allow accuracies bigger than 1 in setFPAccuracy(), integers
> should be treated as float. Or at least assert.

The verifier checks that the accuracy operand is either a floating point
number (ConstantFP) or the keyword "fast".  If "Accuracy" is zero here
then that means it wasn't ConstantFP.  Thus it must have been the keyword
"fast".

> Also, I'm thinking you should carry the annotation forward on all uses
> of an annotated result, or make sure the floating point library
> searches recursively for annotations on any dependency of the value
> being analysed.

Yes, this is a possible optimization (especially useful if functions from a
-ffast-math compiled module are inlined into functions from a non -ffast-math
compiled module or vice versa) but it is not needed for correctness.  I plan to
implement optimizations using the metadata later.

> About creating annotations every time, I think this could be a nice
> idea for a metadata factory functionality. Something that would cache
> metadata, and in case of repetition, point to the same metadata. This
> could be used for other optimisations (if I recall correctly, the
> debug metadata does that already).

Yes, Chandler suggested it already, and I think it is a good idea.

> The problem with this is that, if an optimisation pass changes one,
> you must make sure the other can also be changed, or split-on-write,
> and that can cause some bloated code in the optimiser, which is not
> ideal.

Optimizers don't (or shouldn't) change metadata because metadata is
uniqued: if you change it you change it for all users.  Instead new
metadata has to be created.  So I doubt that this is a problem in
practice.  Also, I think metadata is intrinsically a weak value handle,
so if someone changes the metadata underneath the builder then its
copy will become null.  When it sees that the cached metadata is null
then it can create it anew.  So I think it should be possible to ensure
that this works well.

> I think, for now, it's acceptable. But should be on request basis
> (aka, only present if -fmath options are explicitly specified).
>
> The rest of the patch looks sane, though. I like the idea of using
> metadata, since the target code can easily ignore if it doesn't
> support FP optimisations or IEEE strictness.

This kind of metadata must only relax IEEE strictness (and never tighten
it) because *metadata can always be discarded*.  Discarding it must never
result in wrong IR/transforms, thus metadata can only give additional
permissions.

Ciao, Duncan.

Dmitry Babokin

2012-Apr-14 19:35 UTC

Hi Duncan,

I'm not an expert on fp accuracy questions, but I have had quite a bit of
experience dealing with fp accuracy problems during compiler transformations.

I think this is a step in the right direction, walking away from ULPs,
which are pretty useless for the purpose of describing allowed
fp optimizations IMHO. But using just a "fast" keyword (or whatever else will
be added in the future) is not enough without a strict definition of this
keyword in terms of IR transformations. For example, a particular
transformation may be interested in whether reassociation is allowed or not
((a+b)+c => a+(b+c)), whether fp contraction is allowed or not (a*b+c =>
fma(a,b,c)), whether addition of zero may be canceled (x+0 => x), etc. If this
definition is not given at the infrastructure level, this may lead to disaster,
with each transformation interpreting "fast" in its own way.
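
As an illustration of why these permissions need to be spelled out separately,
here is a small sketch in standard C++ (not LLVM code; all function names are
invented for the example). It shows that reassociation and contraction each
change the result for ordinary double inputs. The volatile temporaries are only
there to stop the compiler from performing the very rewrites being compared.

```cpp
#include <cassert>
#include <cmath>

// (a + b) + c, with the inner sum rounded before the outer add.
static double sumLeft(double A, double B, double C) {
  volatile double T = A + B;
  return T + C;
}

// a + (b + c), i.e. the reassociated form.
static double sumRight(double A, double B, double C) {
  volatile double T = B + C;
  return A + T;
}

// a*b + c with the product rounded to double before the add.
static double mulThenAdd(double A, double B, double C) {
  volatile double P = A * B;
  return P + C;
}

// The contracted form: one rounding for the whole expression.
static double fused(double A, double B, double C) {
  return std::fma(A, B, C);
}
```

For example, sumLeft(1e20, -1e20, 1.0) is 1.0 while sumRight on the same
inputs is 0.0, and with a = 1 + 2^-27, b = 1 - 2^-27, c = -1 the unfused form
gives 0.0 while the fused one gives -2^-54.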

Dmitry.

Duncan Sands

2012-Apr-14 19:44 UTC

Hi Dmitry,

> I'm not an expert on fp accuracy questions, but I have had quite a bit of
> experience dealing with fp accuracy problems during compiler transformations.

I agree that it's a minefield, which is why I intend to proceed
conservatively.

> I think this is a step in the right direction, walking away from ULPs, which
> are pretty useless for the purpose of describing allowed fp optimizations
> IMHO. But using just a "fast" keyword (or whatever else will be added in the
> future) is not enough without a strict definition of this keyword in terms of
> IR transformations. For example, a particular transformation may be
> interested in whether reassociation is allowed or not ((a+b)+c => a+(b+c)),
> whether fp contraction is allowed or not (a*b+c => fma(a,b,c)), whether
> addition of zero may be canceled (x+0 => x), etc. If this definition is not
> given at the infrastructure level, this may lead to disaster, with each
> transformation interpreting "fast" in its own way.

This is actually the main reason for using metadata rather than a flag like the
"nsw" flag on integer operations: it is easily extendible with more info to say
whether reassociation is OK and so forth.

The kinds of transforms I think can reasonably be done with the current
information are things like: x + 0.0 -> x; x / constant -> x * (1 / constant) if
constant and 1 / constant are normal (and not denormal) numbers.
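
A rough sketch of what these two rewrites cost, in standard C++ rather than as
an LLVM transform (the function names are invented for illustration, and the
volatile temporaries only keep the compiler from folding the expressions):

```cpp
#include <cassert>
#include <cmath>

// x + 0.0 as strict IEEE arithmetic would evaluate it.
static double addZero(double X) {
  volatile double Z = 0.0;
  return X + Z;
}

// x / c, correctly rounded.
static double divideBy(double X, double C) {
  volatile double D = C;
  return X / D;
}

// x * (1 / c): two roundings instead of one.
static double mulByReciprocal(double X, double C) {
  volatile double R = 1.0 / C;
  return X * R;
}

// The side condition from the text: both c and 1/c must be normal numbers.
static bool reciprocalIsNormal(double C) {
  return std::isnormal(C) && std::isnormal(1.0 / C);
}
```

The only input for which x + 0.0 differs from x is x = -0.0, where the sum is
+0.0. The reciprocal form can be off by a rounding step: 3.0 / 10.0 is the
double nearest to 0.3, but 3.0 * (1.0 / 10.0) is the next double up.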

Ciao, Duncan.

Duncan Sands

2012-Apr-16 11:41 UTC

Here's a revised patch, plus patches showing how fpmath metadata could be
turned on in clang and dragonegg (it seemed safest for the moment to
condition on -ffast-math rather than on one of the flags implied by
-ffast-math).

Major changes:

- The FPMathOperator class can no longer be used to change math settings,
only to read them.  Currently it can be queried for accuracy info.  I split
the accuracy methods into two: one for 'fast' accuracy, one for a numerical
accuracy (which returns +infty when the accuracy is 'fast').

- MDBuilder got support for creating fpmath metadata, in particular there is
a function that returns the appropriate settings for -ffast-math.

- A default fpmath setting can be supplied to IRBuilder, which will then apply
it to all floating point operations.  It is also possible to specify specific
fpmath metadata when creating an operation.
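
The read-only query split described in the first item could look roughly like
the following toy model in plain C++. It is not the code from the patch:
ToyFPMath and ToyFPMathView are invented stand-ins for the metadata node and
for FPMathOperator, and returning 0 when no tag is attached is an assumption
made for the sketch.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Toy stand-in for an !fpmath node; not the LLVM MDNode class.
struct ToyFPMath {
  bool IsFast; // true when the operand is the "fast" keyword
  float Ulps;  // numeric accuracy, only meaningful when !IsFast
};

// Toy read-only view, loosely modelled on the description above.
class ToyFPMathView {
  const ToyFPMath *Tag; // null when the instruction carries no tag

public:
  explicit ToyFPMathView(const ToyFPMath *T) : Tag(T) {}

  // Query 1: is the accuracy the "fast" keyword?
  bool isFast() const { return Tag && Tag->IsFast; }

  // Query 2: numeric accuracy in ULPs. "fast" maps to +infinity.
  // Assumption for this sketch: no tag means 0, i.e. no extra error allowed.
  float accuracy() const {
    if (!Tag)
      return 0.0f;
    return Tag->IsFast ? std::numeric_limits<float>::infinity() : Tag->Ulps;
  }
};
```

Mapping "fast" to +infinity means a caller that only compares against a
numeric threshold treats "fast" as permitting everything, without a special
case.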

Ciao, Duncan.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: fastm.diff
Type: text/x-patch
Size: 18788 bytes
Desc: not available
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20120416/e7f4b1c8/attachment.bin>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: fastm-clang.diff
Type: text/x-patch
Size: 1497 bytes
Desc: not available
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20120416/e7f4b1c8/attachment-0001.bin>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: fastm-dragonegg.diff
Type: text/x-patch
Size: 563 bytes
Desc: not available
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20120416/e7f4b1c8/attachment-0002.bin>

Renato Golin

2012-Apr-16 12:16 UTC

Hi Duncan,

I like the changes to IRBuilder and how the operator can't change it.
Looks a lot safer (mistake-wise) and more convenient.

This function won't remove a previously set tag, which could be
used by optimisations or inlining.

+  Instruction *AddFPMathTag(Instruction *I, MDNode *FPMathTag) const {
+    if (!FPMathTag)
+      FPMathTag = DefaultFPMathTag;
+    if (FPMathTag)
+      I->setMetadata(LLVMContext::MD_fpmath, FPMathTag);
+    return I;
+  }

If you want to keep it as only Add, then make FPMathTag = 0 so that
you can easily add the default by just calling AddFPMathTag(instr);

But I'd add a ClearFPMathTag function for optimisations/inlining. Maybe later.

Also, would be good to make sure the instruction is, in fact, a
floating point operation. Either via restricting the type or asserting
on it.
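
Put together, the three suggestions (a defaulted tag argument, a clear
function, and an assert on the instruction kind) might look like this toy model
in plain C++. ToyTag, ToyInst and ToyBuilder are invented for illustration and
are not the LLVM classes.

```cpp
#include <cassert>

// Toy stand-ins; not LLVM's MDNode, Instruction or IRBuilder.
struct ToyTag {
  const char *Name;
};

struct ToyInst {
  bool IsFPOp;          // whether this is a floating point operation
  const ToyTag *FPMath; // attached fpmath tag, or null
};

class ToyBuilder {
  const ToyTag *DefaultFPMathTag = nullptr;

public:
  void setDefaultFPMathTag(const ToyTag *T) { DefaultFPMathTag = T; }

  // Tag defaults to null, so addFPMathTag(I) applies the builder default.
  ToyInst *addFPMathTag(ToyInst *I, const ToyTag *Tag = nullptr) const {
    assert(I->IsFPOp && "fpmath tag on a non floating point operation");
    if (!Tag)
      Tag = DefaultFPMathTag;
    if (Tag)
      I->FPMath = Tag;
    return I;
  }

  // Explicit removal, e.g. for use after inlining into stricter code.
  static ToyInst *clearFPMathTag(ToyInst *I) {
    I->FPMath = nullptr;
    return I;
  }
};
```

Keeping removal in a separate function leaves "no tag given" meaning "use the
default" rather than "remove the existing tag".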

-- 
cheers,
--renato

http://systemcall.org/

Chandler Carruth

2012-Apr-16 14:29 UTC

Thanks for the updates!

Minor comments:

+  if (!Accuracy)
+    // If it's not a floating point number then it must be 'fast'.
+    return HUGE_VALF;

Can we add an assert instead of a comment? It's just as documenting and
will catch any goofs.

+  // If it's not a floating point number then it must be 'fast'.
+  return !isa<ConstantFP>(MD->getOperand(0));

Here as well.

+    if (ConstantFP *CFP0 = dyn_cast_or_null<ConstantFP>(Op0)) {
+      APFloat Accuracy = CFP0->getValueAPF();
+      Assert1(Accuracy.isNormal() && !Accuracy.isNegative(),
+              "fpmath accuracy not a positive number!", &I);

To be pedantic for a moment, zero is not a positive number. What about
asserting these individually to give us more clear asserts if they fire?
That also makes the string easier to write: "fpmath accuracy is a negative
number!".

+  /// SetDefaultFPMathTag - Set the floating point math metadata to be used.
+  void SetDefaultFPMathTag(MDNode *FPMathTag) { DefaultFPMathTag = FPMathTag; }

This should be 'setDefault...' much like 'getDefault...' above.

+  Instruction *AddFPMathTag(Instruction *I, MDNode *FPMathTag) const {

Another bad case, but I think this instruction is gone...

+    MDString *GetFastString() const {
+      return CreateString("fast");
+    }

'getFastString'.

+    /// CreateFastFPMath - Return metadata with appropriate settings for 'fast
+    /// math'.

I would prefer the more modern doxygen style:

/// \brief Return metadata ...

+    MDNode *CreateFastFPMath() {

Capitalization.

The capitalization and doxygen style comments apply to the next function as
well.


Both the Clang and DragonEgg patches look good, but both need test cases. =]

Owen Anderson

2012-Apr-16 17:25 UTC

Duncan,

I have some issues with representing this as a single "fast" mode flag, which
mostly boil down to the fact that this is a very C-centric view of the world.
And, since C compilers are not generally known for their awesomeness on issues
of numerics, I'm not sure that's a good idea.

Having something called a "fast" or "relaxed" mode implies that it is less
precise than whatever the standard mode is.  However, C is notably sparse in
specifying what exactly the standard mode is.  The typical assumption is that
it is the strict one-to-one translation to IEEE754 semantics, but no optimizing
C compiler actually implements that.

Other languages are more interesting in this regard.  Fortran, for instance,
allows reassociation within parentheses.  (Can that even be represented with
instruction metadata?)  OpenCL has a very fairly baseline mode, but specifies a
number of specific options the user can enable to relax it (-cl-mad-enable,
-cl-no-signed-zeros, -cl-unsafe-math-optimizations (implies the previous two),
-cl-finite-math-only, -cl-fast-relaxed-math (implies all prior)).  GLSL has
distinct desktop and embedded specifications that place different levels of
constraint on implementations.

If we define the baseline behavior to be strict IEEE conformance, and then
don't provide a more nuanced method of relaxing it, we're not going to be in a
significantly better world than we are today.  No reasonable implementation of
these languages wants strict conformance (except maybe desktop-profile OpenCL)
as their default mode, nor is there any way a universal definition of "fast"
math can work for all of them.

--Owen

Duncan Sands

2012-Apr-16 17:40 UTC

Hi Owen,

> I have some issues with representing this as a single "fast" mode flag,

It isn't a single flag, that's the whole point of using metadata.  OK, right
now there is only one option (the "accuracy"), true, but the intent is that
others will be added, and the meaning of accuracy tightened, later.  MDBuilder
has a createFastFPMath method which is intended to produce settings that match
GCC's -ffast-math, however frontends will be able to specify whatever settings
they like if that doesn't suit them (i.e. createFPMath will get more arguments
as more settings become available).

Note that as the current option isn't actually connected to any optimizations,
there is nothing much to argue about for the moment.

My plan is to introduce a few simple optimizations (x + 0.0 -> x for example)
that introduce a finite number of ULPs of error, and hook them up.  Thus this
does not include things like x * 0.0 -> 0.0 (infinite ULPs of error),
reassociation (infinite ULPs of error) or any other scary things.
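
To show what "infinite ULPs of error" means for the two excluded rewrites, here
is a short sketch in standard C++ (not LLVM code; the function names are made
up, and the volatile temporaries only prevent the compiler from folding or
reordering the arithmetic):

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// x * 0.0 under strict IEEE rules.
static double mulZero(double X) {
  volatile double Z = 0.0;
  return X * Z;
}

// (a + b) + c, inner sum rounded first.
static double addLeftToRight(double A, double B, double C) {
  volatile double T = A + B;
  return T + C;
}

// a + (b + c), the reassociated form.
static double addRightToLeft(double A, double B, double C) {
  volatile double T = B + C;
  return A + T;
}
```

Folding x * 0.0 to 0.0 is wrong when x is infinite or NaN, since the strict
result is NaN, and it loses the sign when x is negative. For reassociation,
(1e308 + 1e308) + -1e308 overflows to infinity while 1e308 + (1e308 + -1e308)
is exactly 1e308, so no finite ULP bound covers the difference.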

> which mostly boil down to the fact that this is a very C-centric view of the
> world.  And, since C compilers are not generally known for their awesomeness
> on issues of numerics, I'm not sure that's a good idea.
>
> Having something called a "fast" or "relaxed" mode implies that it is less
> precise than whatever the standard mode is.  However, C is notably sparse in
> specifying what exactly the standard mode is.  The typical assumption is that
> it is the strict one-to-one translation to IEEE754 semantics, but no
> optimizing C compiler actually implements that.

I think this is a misunderstanding of where I'm going, see above.

> Other languages are more interesting in this regard.  Fortran, for instance,
> allows reassociation within parentheses.  (Can that even be represented with
> instruction metadata?)

I'm aware of Fortran parentheses (PAREN_EXPR in gcc).  If it can't be expressed
well then too bad: reassociation can just be turned off and we won't optimize
Fortran as well as we could.  (As mentioned above I have no intention of turning
on reassociation based on the current flag since it can introduce an unbounded
number of ULPs of error).

> OpenCL has a very fairly baseline mode, but specifies a number of specific
> options the user can enable to relax it (-cl-mad-enable, -cl-no-signed-zeros,
> -cl-unsafe-math-optimizations (implies the previous two),
> -cl-finite-math-only, -cl-fast-relaxed-math (implies all prior)).  GLSL has
> distinct desktop and embedded specifications that place different levels of
> constraint on implementations.

Yup.

> If we define the baseline behavior to be strict IEEE conformance,

Which we do.

> and then don't provide a more nuanced method of relaxing it,

Allowing more nuanced ways is the reason for using metadata as explained above.

> we're not going to be in a significantly better world than we are today.  No
> reasonable implementation of these languages wants strict conformance (except
> maybe desktop-profile OpenCL) as their default mode,

Strict conformance is what they get right now.

> nor is there any way a universal definition of "fast" math can work for all
> of them.

I agree, and I'm not trying to provide one.

Ciao, Duncan.