thr3ads.net - llvm dev - [LLVMdev] [RFC] Extend LLVM IR to express "fast-math" at a per-instruction level [Nov 2012]


Joe Abbey

2012-Nov-12 18:39 UTC

[LLVMdev] [RFC] Extend LLVM IR to express "fast-math" at a per-instruction level

Michael,

Since you won't be using metadata to store this information and are
augmenting the IR, I'd recommend incrementing the bitcode version number.
The current version is stored in a local variable in BitcodeWriter.cpp:1814*

I would suspect then you'll also need to provide additional logic for
reading:

      switch (module_version) {
        default: return Error("Unknown bitstream version!");
        case 2:
          EncodesFastMathIR = true;
          // Falls through, so version 2 also gets relative IDs.
        case 1:
          UseRelativeIDs = true;
          break;
        case 0:
          UseRelativeIDs = false;
          break;
      }

Joe

(*TODO: Put this somewhere else).

On Nov 9, 2012, at 5:34 PM, Michael Ilseman <milseman at apple.com> wrote:

Revision 2

Revision 2 changes:
* Add in separate Reciprocal flag
* Clarified wording of flags, specified undefined values, not behavior
* Removed some confusing language
* Mentioned optimizations/analyses adding in flags due to inferred knowledge

Revision 1 changes:
* Removed Fusion flag from all sections
* Clarified and changed descriptions of remaining flags:
  * Make 'N' and 'I' flags be explicitly concerning values of operands, and
    producing undef values if a NaN/Inf is provided.
  * 'S' is now only about distinguishing between +/-0.
  * LangRef changes updated to reflect flags changes
  * Updated Question section given the now simpler set of flags
  * Optimizations changed to reflect 'N' and 'I' describing operands and not
    results
* Be explicit on what LLVM's default behavior is (no signaling NaNs, etc)
* Mention that this could be alternatively solved with metadata, and open the
  debate


Introduction
---

LLVM IR currently does not have any support for specifying fine-grained control
over relaxing floating point requirements for the optimizer. The below is a
proposal to extend floating point IR instructions to support a number of flags
that a creator of IR can use to allow for greater optimizations when
desired. Such changes are sometimes referred to as fast-math, but this proposal
is about finer-grained specifications at a per-instruction level.


What this doesn't address
---

Default behavior is retained, and this proposal is only addressing relaxing
restrictions. LLVM currently by default:
- ignores signaling NaNs
- assumes default rounding mode
- assumes FENV_ACCESS is off

Discussion on changing the default behavior of LLVM or allowing for more
restrictive behavior is outside the scope of this proposal. This proposal does
not address behavior of denormals, which is more of a backend concern.

Specifying exact precision control or requirements is outside the scope of this
proposal, and can probably be handled with the existing metadata implementation.

This proposal covers changes to and optimizations over LLVM IR, and changes to
codegen are outside the scope of this proposal. The flags described in the next
section exist only at the IR level, and will not be propagated into codegen or
the SelectionDAG.


Flags
---

LLVM IR instructions will have the following flags that can be set by the
creator of the IR.

no NaNs (N)
- Allow optimizations that assume the arguments and result are not NaN. Such
  optimizations are required to retain defined behavior over NaNs, but the
  value of the result is undefined.

no Infs (I)
- Allow optimizations that assume the arguments and result are not
  +/-Inf. Such optimizations are required to retain defined behavior over
  +/-Inf, but the value of the result is undefined.

no signed zeros (S)
- Allow optimizations to treat the sign of a zero argument or result as
  insignificant.

allow reciprocal (R)
- Allow optimizations to use the reciprocal of an argument instead of dividing

unsafe algebra (A)
- The optimizer is allowed to perform algebraically equivalent transformations
   that may dramatically change results in floating point. (e.g.
   reassociation).

Throughout I'll refer to these options in their short-hand, e.g. 'A'.
Internally, these flags are to reside in SubclassData.

Setting the 'A' flag implies the setting of all the others ('N', 'I', 'S', 'R').
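
As a rough illustration of the flag set above, the sketch below packs the five flags into a small bit field and expands 'A' into all the others. The enum values and the names `FMFlag`, `normalize`, and `has` are hypothetical; the RFC does not fix an encoding, and this is not LLVM's actual API.

```cpp
#include <cstdint>

// Hypothetical bit assignments for illustration only.
enum FMFlag : std::uint8_t {
  NoNaNs          = 1u << 0, // N
  NoInfs          = 1u << 1, // I
  NoSignedZeros   = 1u << 2, // S
  AllowReciprocal = 1u << 3, // R
  UnsafeAlgebra   = 1u << 4, // A
};

constexpr std::uint8_t AllFlags =
    NoNaNs | NoInfs | NoSignedZeros | AllowReciprocal | UnsafeAlgebra;

// Setting 'A' implies all of the other flags, so expand it before querying.
constexpr std::uint8_t normalize(std::uint8_t Flags) {
  return (Flags & UnsafeAlgebra) ? AllFlags : Flags;
}

// True if flag F is set, either directly or through 'A'.
constexpr bool has(std::uint8_t Flags, FMFlag F) {
  return (normalize(Flags) & F) != 0;
}

static_assert(has(UnsafeAlgebra, NoNaNs), "A implies N");
static_assert(!has(NoInfs, NoNaNs), "I alone does not imply N");
```

Expanding 'A' at query time rather than at storage time is only one possible choice; an implementation could equally set all bits when 'A' is first applied.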


Changes to LangRef
---

Change the definitions of floating point arithmetic operations, below is how
fadd will change:

'fadd' Instruction
Syntax:

<result> = fadd {flag}* <ty> <op1>, <op2>   ; yields {ty}:result
...
Semantics:
...
flag can be one of the following optimizer hints to enable otherwise unsafe
floating point optimizations:
N: no NaNs - Allow optimizations that assume the arguments and result are not
  NaN. Such optimizations are required to retain defined behavior over NaNs,
  but the value of the result is undefined.
I: no infs - Allow optimizations that assume the arguments and result are not
  +/-Inf. Such optimizations are required to retain defined behavior over
  +/-Inf, but the value of the result is undefined.
S: no signed zeros - Allow optimizations to treat the sign of a zero argument
  or result as insignificant.
A: unsafe algebra - The optimizer is allowed to perform algebraically
   equivalent transformations that may dramatically change results in floating
   point. (e.g.  reassociation).

fdiv will also mention that 'R' allows the fdiv to be replaced by a
multiply-by-reciprocal.


Changes to optimizations
---

Optimizations should be allowed to perform unsafe optimizations provided the
instructions involved have the corresponding restrictions relaxed. When
combining instructions, optimizations should do what makes sense to not remove
restrictions that previously existed (commonly, a bitwise-AND of the flags).
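
The bitwise-AND rule mentioned above can be sketched as follows. The constants `kN`, `kI`, `kS`, `kR` and the function `mergeFlags` are hypothetical names chosen for this example, not part of the proposal.

```cpp
#include <cstdint>

// Hypothetical single-bit encodings of the N, I, S and R flags.
constexpr std::uint8_t kN = 1, kI = 2, kS = 4, kR = 8;

// A relaxation survives a merge only if both source instructions carried it,
// so no restriction that previously existed is silently dropped.
constexpr std::uint8_t mergeFlags(std::uint8_t LHS, std::uint8_t RHS) {
  return LHS & RHS;
}

static_assert(mergeFlags(kN | kS, kN | kI) == kN,
              "only the shared flag survives");
static_assert(mergeFlags(kN | kI | kS | kR, 0) == 0,
              "merging with a strict instruction yields a strict result");
```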

Below are some example optimizations that could be allowed with the given
relaxations.

N - no NaNs
x == x ==> true

S - no signed zeros
x - 0 ==> x
0 - (x - y) ==> y - x

NIS - no signed zeros AND no NaNs AND no Infs
x * 0 ==> 0

NI - no infs AND no NaNs
x - x ==> 0

R - reciprocal
 x / y ==> x * (1/y)

A - unsafe-algebra
Reassociation
  (x + y) + z ==> x + (y + z)
  (x + C1) + C2 ==> x + (C1 + C2)
Redistribution
  (x * C) + x ==> x * (C+1)
  (x * C) + (x + x) ==> x * (C + 2)
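
To see why each rewrite above needs its flag, the following sketch evaluates the left-hand sides under ordinary IEEE-754 double arithmetic. The function names are made up for this example, and it assumes the code itself is compiled without any fast-math option.

```cpp
#include <cmath>
#include <limits>

// x == x is false when x is NaN, so folding it to true needs 'N'.
inline bool selfEqual(double X) { return X == X; }

// x - x is NaN (not 0) when x is +/-Inf or NaN, so folding needs 'N' and 'I'.
inline double selfSub(double X) { return X - X; }

// x * 0 is -0 for negative x and NaN for +/-Inf or NaN,
// so folding to 0 needs 'N', 'I' and 'S'.
inline double timesZero(double X) { return X * 0.0; }

// The two groupings can round differently, so reassociation needs 'A'.
inline double leftAssoc(double X, double Y, double Z) { return (X + Y) + Z; }
inline double rightAssoc(double X, double Y, double Z) { return X + (Y + Z); }
```

For instance, `leftAssoc(1e20, -1e20, 1.0)` gives 1.0 while `rightAssoc(1e20, -1e20, 1.0)` gives 0.0, because adding 1.0 to -1e20 is lost to rounding.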

I propose to expand -instsimplify and -instcombine to perform these kinds of
optimizations. -reassociate will be expanded to reassociate floating point
operations when allowed. Similar to existing behavior regarding integer
wrapping, -early-cse will not CSE FP operations with mismatched flags, while
-gvn will (conservatively). This allows later optimizations to optimize the
expressions independently between runs of -early-cse and -gvn.

Optimizations and analyses that are able to infer certain properties of
instructions are allowed to set relevant flags. For example, if some analysis
has determined that the arguments and result of an instruction are not NaNs or
Infs, then it may set the 'N' and 'I' flags, allowing every other optimization
and analysis to benefit from this inferred knowledge.

Changes to frontends
---

Frontends are free to generate code with flags set as they desire. Frontends
should continue to call llc with their desired options, as the flags apply only
at the IR level and not at codegen or the SelectionDAGs.

The intention behind the flags is to allow the IR creator to say something
along the lines of:
"If this operation is given a NaN, or the result is a NaN, then I don't care
what answer I get back. However, I expect my program to otherwise behave
properly."

Below is a suggested change to clang's command-line options.

-ffast-math
Currently described as:
Enable the *frontend*'s 'fast-math' mode. This has no effect on optimizations,
but provides a preprocessor macro __FAST_MATH__ the same as GCC's -ffast-math
flag

I propose to change the description and behavior to:

Enable 'fast-math' mode. This allows for optimizations that may produce
incorrect and unsafe results, and thus should only be used with care. This
also provides a preprocessor macro __FAST_MATH__ the same as GCC's -ffast-math
flag

I propose that this turn on all flags for all floating point instructions. If
this flag doesn't already cause clang to run llc with -enable-unsafe-fp-math,
then I propose that it does so as well.

(Optional)
I propose adding the below flags:

-ffinite-math-only
Allow optimizations to assume that floating point arguments and results are
not NaNs or +/-Inf. This may produce incorrect results, and so should be used
with care.

This would set the 'I' and 'N' bits on all generated floating point instructions.

-fno-signed-zeros
Allow optimizations to ignore the signedness of zero. This may produce
incorrect results, and so should be used with care.

This would set the 'S' bit on all FP instructions.

-freciprocal-math
Allow optimizations to use the reciprocal of an argument instead of using
division. This may produce less precise results, and so should be used with
care.

This would set the 'R' bit on all relevant FP instructions
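
The mapping from the command-line options above to flag bits could be sketched like this. The bit constants and the function `flagsForOption` are hypothetical and only mirror the text of the proposal; they are not clang's implementation.

```cpp
#include <cstdint>
#include <string>

// Hypothetical single-bit encodings of the N, I, S, R and A flags.
constexpr std::uint8_t kN = 1, kI = 2, kS = 4, kR = 8, kA = 16;

// Returns the flag bits the proposal associates with a given option.
inline std::uint8_t flagsForOption(const std::string &Opt) {
  if (Opt == "-ffast-math")        return kN | kI | kS | kR | kA;
  if (Opt == "-ffinite-math-only") return kN | kI;
  if (Opt == "-fno-signed-zeros")  return kS;
  if (Opt == "-freciprocal-math")  return kR;
  return 0; // any other option leaves every flag clear
}
```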

Changes to llvm cli tools
---
opt and llc already have the command line options
-enable-unsafe-fp-math: Enable optimizations that may decrease FP precision
-enable-no-infs-fp-math: Enable FP math optimizations that assume no +-Infs
-enable-no-nans-fp-math: Enable FP math optimizations that assume no NaNs
However, opt makes no use of them as they are currently only considered to be
TargetOptions. llc will remain unchanged, as these options apply to DAG
optimizations while this proposal deals with IR optimizations.

(Optional)
Have an opt pass that adds the desired flags to floating point instructions.


Miscellaneous explanations in the form of Q&A
---

Why not just have "fast-math" rather than individual flags?

Having the individual flags gives the granularity to choose the levels of
optimizations. For example, unsafe-algebra can lead to dramatically different
results in corner cases, and may not be desired when a user just wants to ensure
that x*0 folds to 0.


Why have these flags attached to the instruction itself, rather than be a
compiler mode?

Being attached to the instruction itself allows much greater flexibility both
for other optimizations and for the concerns of the source and target. For
example, a frontend may desire that x - x be folded to 0. This would require
no-NaNs for the subtract. However, the frontend may want to keep NaNs for its
comparisons.

Additionally, these properties can be set internally in the optimizer when the
property has been proven. For example, if x has been found to be positive, then
operations involving x and a constant can be marked to ignore signed zero.

Finally, having these flags allows for greater safety and optimization when code
with different flags is mixed. For example, a function author may set the
unsafe-algebra flag knowing that such transformations will not meaningfully
alter its result. If that function gets inlined into a caller, however, we don't
want to always assume that the function's expressions can be reassociated with
the caller's expressions. These properties allow us to preserve the
optimizations of the inlined function without affecting the caller.


Why not use metadata rather than flags?

There is existing metadata to denote precisions, and this proposal is orthogonal
to those efforts. While these properties could still be expressed as metadata,
the proposed flags are analogous to nsw/nuw and are inherent properties of the
IR instructions themselves that all transformations should respect.

_______________________________________________
LLVM Developers mailing list
LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev


Chris Lattner

2012-Nov-13 01:42 UTC

[LLVMdev] [RFC] Extend LLVM IR to express "fast-math" at a per-instruction level

On Nov 12, 2012, at 10:39 AM, Joe Abbey <jabbey at arxan.com> wrote:
> Michael,
> 
> Since you won't be using metadata to store this information and are
> augmenting the IR, I'd recommend incrementing the bitcode version number.
> The current version is stored in a local variable in BitcodeWriter.cpp:1814*
>
> I would suspect then you'll also need to provide additional logic for
> reading:
>
>       switch (module_version) {
>         default: return Error("Unknown bitstream version!");
>         case 2:
>           EncodesFastMathIR = true;
>         case 1:
>           UseRelativeIDs = true;
>           break;
>         case 0:
>           UseRelativeIDs = false;
>           break;
>       }

Couldn't this be handled by adding an extra operand to the binary operators?

-Chris

Michael Ilseman

2012-Nov-14 20:28 UTC

[LLVMdev] [RFC] Extend LLVM IR to express "fast-math" at a per-instruction level

I think I missed what problem we're trying to solve here.

I'm looking at implementing the bitcode now. I have code to successfully
read and write out the LLVM IR textual format (LLParser, etc) and set the
corresponding SubclassOptionalData bits. Looking at LLVMBitCodes.h, I'm
seeing where these bits reside in the bitcode, so I believe that things should
be pretty straight-forward from here.

Joe, what are the reasons for me to increment the IR version number? My
understanding is that I'll just be using existing bits that were previously
ignored. Ignoring these bits is still valid, just conservative. I believe these
flags would be zero-ed out in old IR (correct me if I'm wrong), which is the
intended default.
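
The compatibility argument above can be illustrated with a small sketch: if a record written by an older writer lacks the flags field, the reader treats it as having no flags set, which is the conservative default. The record layout and the name `readOptionalFlags` are assumptions made for this example, not the actual bitcode reader.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical record layout: [lhs, rhs, opcode] plus an optional
// trailing flags word.
inline std::uint64_t
readOptionalFlags(const std::vector<std::uint64_t> &Record) {
  const std::size_t FlagsIdx = 3;
  // Records written before the flags existed simply lack the field;
  // reading that as "no flags" keeps the strict default semantics.
  return Record.size() > FlagsIdx ? Record[FlagsIdx] : 0;
}
```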

Chris, what problem could be solved by adding extra operands to binary ops?
I'm trying to avoid those sorts of modifications, as the fast-math flags
could make sense applied to a variety of operations, e.g. comparisons and casts.


On Nov 12, 2012, at 5:42 PM, Chris Lattner <clattner at apple.com> wrote:
> 
> On Nov 12, 2012, at 10:39 AM, Joe Abbey <jabbey at arxan.com> wrote:
> 
>> Michael,
>> 
>> Since you won't be using metadata to store this information and are
augmenting the IR, I'd recommend incrementing the bitcode version number. 
The current version stored in a local variable in BitcodeWriter.cpp:1814*
>> 
>> I would suspect then you'll also need to provide additional logic
for reading:
>> 
>>       switch (module_version) {
>>         default: return Error("Unknown bitstream version!");
>>         case 2:
>> 	  EncodesFastMathIR = true;
>>         case 1:
>>           UseRelativeIDs = true;
>>           break;
>>  	case 0:
>>           UseRelativeIDs = false;
>>           break;
>>        
>>       }
> 
> Couldn't this be handled by adding an extra operand to the binary
operators?
> 
> -Chris
> 
>> 
>> Joe
>> 
>> (*TODO: Put this somewhere else).
>> 
>> On Nov 9, 2012, at 5:34 PM, Michael Ilseman <milseman at
apple.com> wrote:
>> 
>>> Revision 2
>>> 
>>> Revision 2 changes:
>>> * Add in separate Reciprocal flag
>>> * Clarified wording of flags, specified undefined values, not
behavior
>>> * Removed some confusing language
>>> * Mentioned optimizations/analyses adding in flags due to inferred
knowledge
>>> 
>>> Revision 1 changes:
>>> * Removed Fusion flag from all sections
>>> * Clarified and changed descriptions of remaining flags:
>>>   * Make 'N' and 'I' flags be explicitly concerning
values of operands, and
>>>     producing undef values if a NaN/Inf is provided.
>>>   * 'S' is now only about distinguishing between +/-0.
>>>   * LangRef changes updated to reflect flags changes
>>>   * Updated Quesiton section given the now simpler set of flags
>>>   * Optimizations changed to reflect 'N' and 'I'
describing operands and not
>>>     results
>>> * Be explicit on what LLVM's default behavior is (no signaling
NaNs, etc)
>>> * Mention that this could be alternatively solved with metadata,
and open the
>>>   debate
>>> 
>>> 
>>> Introduction
>>> ---
>>> 
>>> LLVM IR currently does not have any support for specifying
fine-grained control
>>> over relaxing floating point requirements for the optimizer. The
below is a
>>> proposal to extend floating point IR instructions to support a
number of flags
>>> that a creator of IR can use to allow for greater optimizations
when
>>> desired. Such changes are sometimes referred to as fast-math, but
this proposal
>>> is about finer-grained specifications at a per-instruction level.
>>> 
>>> 
>>> What this doesn't address
>>> ---
>>> 
>>> Default behavior is retained, and this proposal is only addressing
relaxing
>>> restrictions. LLVM currently by default:
>>> - ignores signaling NaNs
>>> - assumes default rounding mode
>>> - assumes FENV_ACCESS is off
>>> 
>>> Discussion on changing the default behavior of LLVM or allowing for
more
>>> restrictive behavior is outside the scope of this proposal. This
proposal does
>>> not address behavior of denormals, which is more of a backend
concern.
>>> 
>>> Specifying exact precision control or requirements is outside the
scope of this
>>> proposal, and can probably be handled with the existing metadata
implementation.
>>> 
>>> This proposal covers changes to and optimizations over LLVM IR, and
changes to
>>> codegen are outside the scope of this proposal. The flags described
in the next
>>> section exist only at the IR level, and will not be propagated into
codegen or
>>> the SelectionDAG.
>>> 
>>> 
>>> Flags
>>> ---
>>> 
>>> LLVM IR instructions will have the following flags that can be set
by the
>>> creator of the IR.
>>> 
>>> no NaNs (N)
>>> - Allow optimizations that assume the arguments and result are not
NaN. Such
>>>   optimizations are required to retain defined behavior over NaNs,
but the
>>>   value of the result is undefined.
>>> 
>>> no Infs (I)
>>> - Allow optimizations that assume the arguments and result are not
>>>   +/-Inf. Such optimizations are required to retain defined
behavior over
>>>   +/-Inf, but the value of the result is undefined.
>>> 
>>> no signed zeros (S)
>>> - Allow optimizations to treat the sign of a zero argument or
result as
>>>   insignificant.
>>> 
>>> allow reciprocal (R)
>>> - Allow optimizations to use the reciprocal of an argument instead
of dividing
>>> 
>>> unsafe algebra (A)
>>> - The optimizer is allowed to perform algebraically equivalent
transformations
>>>    that may dramatically change results in floating point. (e.g.
>>>    reassociation).
>>> 
>>> Throughout I'll refer to these options in their short-hand,
e.g. 'A'.
>>> Internally, these flags are to reside in SubclassData.
>>> 
>>> Setting the 'A' flag implies the setting of all the others
('N', 'I', 'S', 'R').
>>> 
>>> 
>>> Changes to LangRef
>>> ---
>>> 
>>> Change the definitions of floating point arithmetic operations,
below is how
>>> fadd will change:
>>> 
>>> 'fadd' Instruction
>>> Syntax:
>>> 
>>> <result> = fadd {flag}* <ty> <op1>, <op2>  
; yields {ty}:result
>>> ...
>>> Semantics:
>>> ...
>>> flag can be one of the following optimizer hints to enable otherwise unsafe
>>> floating point optimizations:
>>> N: no NaNs - Allow optimizations that assume the arguments and result are not
>>>   NaN. Such optimizations are required to retain defined behavior over NaNs,
>>>   but the value of the result is undefined.
>>> I: no infs - Allow optimizations that assume the arguments and result are not
>>>   +/-Inf. Such optimizations are required to retain defined behavior over
>>>   +/-Inf, but the value of the result is undefined.
>>> S: no signed zeros - Allow optimizations to treat the sign of a zero argument
>>>   or result as insignificant.
>>> A: unsafe algebra - The optimizer is allowed to perform algebraically
>>>   equivalent transformations that may dramatically change results in floating
>>>   point (e.g. reassociation).
>>> 
>>> fdiv will also mention that 'R' allows the fdiv to be replaced by a
>>> multiply-by-reciprocal.
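To see why the multiply-by-reciprocal rewrite needs its own flag, a small sketch may help. The helper names `divide` and `mulByReciprocal` are hypothetical; the point is only that the rewritten form rounds twice (once for 1/y, once for the product) where the division rounds once.

```cpp
#include <cassert>

// The division as written in the source program: a single rounding.
inline double divide(double x, double y) { return x / y; }

// What an 'R'-flagged fdiv may be rewritten to: 1/y is rounded first,
// then the product is rounded again.
inline double mulByReciprocal(double x, double y) { return x * (1.0 / y); }
```

With IEEE-754 doubles, `divide(49.0, 49.0)` is exactly 1.0, whereas `mulByReciprocal(49.0, 49.0)` comes out one ulp below 1.0, because 1/49 is not exactly representable.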
>>> 
>>> 
>>> Changes to optimizations
>>> ---
>>> 
>>> Optimizations should be allowed to perform unsafe optimizations provided the
>>> instructions involved have the corresponding restrictions relaxed. When
>>> combining instructions, optimizations should do what makes sense to not remove
>>> restrictions that previously existed (commonly, a bitwise-AND of the flags).
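The bitwise-AND rule can be sketched as below. This assumes a one-bit-per-flag encoding and a hypothetical helper `combineFlags`; neither is specified by the proposal.

```cpp
#include <cassert>

// Hypothetical one-bit-per-flag encoding for N, I, S, R and A.
constexpr unsigned FlagN = 1u, FlagI = 2u, FlagS = 4u, FlagR = 8u, FlagA = 16u;

// When two instructions are merged into one, keep only the relaxations
// that both of them carried, so no previously existing restriction is lost.
constexpr unsigned combineFlags(unsigned lhs, unsigned rhs) {
  return lhs & rhs;
}
```

For example, combining an instruction marked N, I and S with one marked N, S and R leaves only N and S, and combining anything with an unflagged instruction leaves no flags at all.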
>>> 
>>> Below are some example optimizations that could be allowed with the given
>>> relaxations.
>>> 
>>> N - no NaNs
>>> x == x ==> true
>>> 
>>> S - no signed zeros
>>> x - 0 ==> x
>>> 0 - (x - y) ==> y - x
>>> 
>>> NIS - no signed zeros AND no NaNs AND no Infs
>>> x * 0 ==> 0
>>> 
>>> NI - no infs AND no NaNs
>>> x - x ==> 0
>>> 
>>> R - reciprocal
>>>  x / y ==> x * (1/y)
>>> 
>>> A - unsafe-algebra
>>> Reassociation
>>>   (x + y) + z ==> x + (y + z)
>>>   (x + C1) + C2 ==> x + (C1 + C2)
>>> Redistribution
>>>   (x * C) + x ==> x * (C+1)
>>>   (x * C) + (x + x) ==> x * (C + 2)
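As a quick check of why the value-based folds above need their flags, the sketch below evaluates the left-hand sides under ordinary IEEE-754 rules. The helper names and the constants `kNaN` and `kInf` are made up for this example.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Each helper evaluates the left-hand side of one rewrite under strict
// IEEE-754 rules, so its result can be compared with the folded value.
inline bool equalsSelf(double x) { return x == x; }  // N:   x == x ==> true
inline double subSelf(double x) { return x - x; }    // NI:  x - x  ==> 0
inline double mulZero(double x) { return x * 0.0; }  // NIS: x * 0  ==> 0

constexpr double kNaN = std::numeric_limits<double>::quiet_NaN();
constexpr double kInf = std::numeric_limits<double>::infinity();
```

Here `equalsSelf(kNaN)` is false, `subSelf(kInf)` and `mulZero(kInf)` are both NaN, and `mulZero(-1.0)` is -0.0 rather than +0.0, so each fold is only valid once the corresponding inputs have been ruled out.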
>>> 
>>> I propose to expand -instsimplify and -instcombine to perform these kinds of
>>> optimizations. -reassociate will be expanded to reassociate floating point
>>> operations when allowed. Similar to existing behavior regarding integer
>>> wrapping, -early-cse will not CSE FP operations with mismatched flags, while
>>> -gvn will (conservatively). This allows later optimizations to optimize the
>>> expressions independently between runs of -early-cse and -gvn.
>>> 
>>> Optimizations and analyses that are able to infer certain properties of
>>> instructions are allowed to set relevant flags. For example, if some analysis
>>> has determined that the arguments and result of an instruction are not NaNs or
>>> Infs, then it may set the 'N' and 'I' flags, allowing every other optimization
>>> and analysis to benefit from this inferred knowledge.
>>> 
>>> Changes to frontends
>>> ---
>>> 
>>> Frontends are free to generate code with flags set as they desire. Frontends
>>> should continue to call llc with their desired options, as the flags apply only
>>> at the IR level and not at codegen or the SelectionDAGs.
>>> 
>>> The intention behind the flags is to allow the IR creator to say something
>>> along the lines of:
>>> "If this operation is given a NaN, or the result is a NaN, then I don't care
>>> what answer I get back. However, I expect my program to otherwise behave
>>> properly."
>>> 
>>> Below is a suggested change to clang's command-line options.
>>> 
>>> -ffast-math
>>> Currently described as:
>>> Enable the *frontend*'s 'fast-math' mode. This has no effect on optimizations,
>>> but provides a preprocessor macro __FAST_MATH__ the same as GCC's -ffast-math
>>> flag.
>>> 
>>> I propose to change the description and behavior to:
>>> 
>>> Enable 'fast-math' mode. This allows for optimizations that may produce
>>> incorrect and unsafe results, and thus should only be used with care. This
>>> also provides a preprocessor macro __FAST_MATH__ the same as GCC's -ffast-math
>>> flag.
>>> 
>>> I propose that this turn on all flags for all floating point instructions. If
>>> this flag doesn't already cause clang to run llc with -enable-unsafe-fp-math,
>>> then I propose that it does so as well.
>>> 
>>> (Optional)
>>> I propose adding the below flags:
>>> 
>>> -ffinite-math-only
>>> Allow optimizations to assume that floating point arguments and results are
>>> not NaNs or +/-Inf. This may produce incorrect results, and so should be used
>>> with care.
>>> 
>>> This would set the 'I' and 'N' bits on all generated floating point instructions.
>>> 
>>> -fno-signed-zeros
>>> Allow optimizations to ignore the signedness of zero. This may produce
>>> incorrect results, and so should be used with care.
>>> 
>>> This would set the 'S' bit on all FP instructions.
>>> 
>>> -freciprocal-math
>>> Allow optimizations to use the reciprocal of an argument instead of using
>>> division. This may produce less precise results, and so should be used with
>>> care.
>>> 
>>> This would set the 'R' bit on all relevant FP instructions.
>>> 
>>> Changes to llvm cli tools
>>> ---
>>> opt and llc already have the command line options
>>> -enable-unsafe-fp-math: Enable optimizations that may decrease FP precision
>>> -enable-no-infs-fp-math: Enable FP math optimizations that assume no +-Infs
>>> -enable-no-nans-fp-math: Enable FP math optimizations that assume no NaNs
>>> However, opt makes no use of them as they are currently only considered to be
>>> TargetOptions. llc will remain unchanged, as these options apply to DAG
>>> optimizations while this proposal deals with IR optimizations.
>>> 
>>> (Optional)
>>> Have an opt pass that adds the desired flags to floating point instructions.
>>> 
>>> 
>>> Miscellaneous explanations in the form of Q&A
>>> ---
>>> 
>>> Why not just have "fast-math" rather than individual flags?
>>> 
>>> Having the individual flags gives the granularity to choose the levels of
>>> optimizations. For example, unsafe-algebra can lead to dramatically different
>>> results in corner cases, and may not be desired when a user just wants to ensure
>>> that x*0 folds to 0.
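One such corner case, as a minimal sketch: the two helpers below (hypothetical names) compute the same sum with the two groupings that reassociation would treat as interchangeable.

```cpp
#include <cassert>

// The two groupings that 'A' (reassociation) treats as equivalent.
inline double sumLeft(double x, double y, double z) { return (x + y) + z; }
inline double sumRight(double x, double y, double z) { return x + (y + z); }
```

With x = 1.0, y = 1e100 and z = -1e100, `sumLeft` returns 0.0 because the 1.0 is absorbed when added to 1e100, while `sumRight` returns 1.0. A user who only wants x*0 folded may not want to accept that kind of change.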
>>> 
>>> 
>>> Why have these flags attached to the instruction itself, rather than be a
>>> compiler mode?
>>> 
>>> Being attached to the instruction itself allows much greater flexibility both
>>> for other optimizations and for the concerns of the source and target. For
>>> example, a frontend may desire that x - x be folded to 0. This would require
>>> no-NaNs for the subtract. However, the frontend may want to keep NaNs for its
>>> comparisons.
>>> 
>>> Additionally, these properties can be set internally in the optimizer when the
>>> property has been proven. For example, if x has been found to be positive, then
>>> operations involving x and a constant can be marked to ignore signed zero.
>>> 
>>> Finally, having these flags allows for greater safety and optimization when code
>>> with different flags is mixed. For example, a function author may set the
>>> unsafe-algebra flag knowing that such transformations will not meaningfully
>>> alter its result. If that function gets inlined into a caller, however, we don't
>>> want to always assume that the function's expressions can be reassociated with
>>> the caller's expressions. These properties allow us to preserve the
>>> optimizations of the inlined function without affecting the caller.
>>> 
>>> 
>>> Why not use metadata rather than flags?
>>> 
>>> There is existing metadata to denote precisions, and this proposal is orthogonal
>>> to those efforts. While these properties could still be expressed as metadata,
>>> the proposed flags are analogous to nsw/nuw and are inherent properties of the
>>> IR instructions themselves that all transformations should respect.
>>> 
>>> _______________________________________________
>>> LLVM Developers mailing list
>>> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev