thr3ads.net - llvm dev - [llvm-dev] [RFC] FP Environment and Rounding mode handling in LLVM [Feb 2016]


Pete Cooper via llvm-dev

2016-Feb-06 01:53 UTC

[llvm-dev] [RFC] FP Environment and Rounding mode handling in LLVM

FWIW, +1 from me.

Just one request on the implementation though.  However we model these
intrinsics and their properties (metadata, constants, etc), can we please
abstract away those details the same way we have MemCpyInst which just wraps an
IntrinsicInst?

I think this would be very beneficial if we ever need to add more state, or
change something about the underlying implementation, and not have to search all
the code for ‘bool traps = cast<ConstantInt>(I->getOperand(1))->getZExtValue()’
or whatever it happens to be.
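
A minimal sketch of the kind of wrapper described above, written as plain C++ rather than against the real LLVM headers. `FakeIntrinsicCall`, `FPEnvIntrinsicView`, the operand indices, and the `ExceptionBehavior` encoding are all hypothetical; the only point is that callers ask `mayTrap()` instead of decoding an operand themselves.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for an intrinsic call: just a list of integer operands.
struct FakeIntrinsicCall {
  std::vector<std::uint64_t> operands;
};

// Assumed encoding of the exception-behavior operand (not from the RFC).
enum class ExceptionBehavior : std::uint8_t { Ignore = 0, Return = 1, Trap = 2 };

// Hypothetical wrapper: the operand layout is known only to this class, so a
// change in representation touches one place instead of every caller.
class FPEnvIntrinsicView {
public:
  explicit FPEnvIntrinsicView(const FakeIntrinsicCall &call) : call_(call) {}

  std::uint8_t roundingMode() const {
    return static_cast<std::uint8_t>(call_.operands[kRoundingIdx]);
  }
  ExceptionBehavior exceptionBehavior() const {
    return static_cast<ExceptionBehavior>(call_.operands[kExceptIdx]);
  }
  bool mayTrap() const {
    return exceptionBehavior() == ExceptionBehavior::Trap;
  }

private:
  // Assumed layout: (lhs, rhs, rounding_mode, exception_behavior).
  static constexpr std::size_t kRoundingIdx = 2;
  static constexpr std::size_t kExceptIdx = 3;
  const FakeIntrinsicCall &call_;
};
```

If the representation later moved from constant operands to metadata, only the two accessors would need to change.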

Pete

> On Feb 5, 2016, at 4:36 PM, Stephen Canon via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> 
> Seems like everyone’s on board, but I want to mention that I also think
> this is very much the right approach.  In particular, it allows us to support
> both existing CPU designs with dynamic rounding modes as well as GPU designs and
> soft-float libraries with statically specified rounding.
> 
> Support for “I want the flags, but I really don’t care about when they
> happen specifically” is somewhat interesting; I assume this would take the form
> of “returning” the flag state and OR-ing it into an integer that represents the
> cumulative flags (much like common cpu hardware does, but this would also let us
> support soft-float implementations).  This wouldn’t impose ordering
> restrictions, but would prevent speculation.
> 
> – Steve
> 
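The cumulative-flags idea in the message above can be sketched in plain C++. This is only an illustration under stated assumptions: `FlaggedFloat`, `faddFlagged`, and the flag bit values are hypothetical names, and only the overflow and invalid conditions of an addition are modelled.

```cpp
#include <cmath>
#include <cstdint>

// Hypothetical flag bits; a real design would cover all IEEE exceptions.
constexpr std::uint8_t kFlagNone = 0;
constexpr std::uint8_t kFlagOverflow = 1;
constexpr std::uint8_t kFlagInvalid = 2;

// Each operation "returns" its flags next to its value.
struct FlaggedFloat {
  float value;
  std::uint8_t flags;
};

FlaggedFloat faddFlagged(float lhs, float rhs) {
  const float sum = lhs + rhs;
  std::uint8_t flags = kFlagNone;
  // Overflow: finite inputs produced an infinite result.
  if (std::isinf(sum) && std::isfinite(lhs) && std::isfinite(rhs))
    flags |= kFlagOverflow;
  // Invalid: non-NaN inputs produced a NaN (for example inf + -inf).
  if (std::isnan(sum) && !std::isnan(lhs) && !std::isnan(rhs))
    flags |= kFlagInvalid;
  return {sum, flags};
}
```

A caller keeps one integer and ORs each operation's flags into it. Because OR is commutative, the operations need not be ordered relative to each other, but each one still has to execute exactly when the source program would execute it, which is why speculation is ruled out.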
>> On Feb 5, 2016, at 4:25 PM, Hal Finkel via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>> 
>> ----- Original Message -----
>>> From: "Chandler Carruth" <chandlerc at gmail.com>
>>> To: "Hal Finkel" <hfinkel at anl.gov>, "Chandler Carruth" <chandlerc at gmail.com>
>>> Cc: "llvm-dev" <llvm-dev at lists.llvm.org>
>>> Sent: Friday, February 5, 2016 4:36:54 PM
>>> Subject: Re: [llvm-dev] [RFC] FP Environment and Rounding mode handling in LLVM
>>> 
>>> On Fri, Feb 5, 2016 at 2:10 PM Hal Finkel via llvm-dev <
>>> llvm-dev at lists.llvm.org > wrote:
>>> 
>>> 
>>> Hi Chandler,
>>> 
>>> This scheme has significant advantages over what was being pursued,
>>> but one question (or two)...
>>> 
>>> Under the proposed system, how would you represent the necessary
>>> dependency edges between the fp intrinsics and function calls? How
>>> is the state 'returned' to the caller? [I was thinking that our new
>>> operand bundles could help for the inputs, but the outputs? Plus
>>> what about the live-in state?]
>>> 
>>> This is important because any external subroutine call could
>>> (potentially) change the rounding mode or any other part of the
>>> floating-point environment.
>>> 
>>> 
>>> 
>>> So, one thing that was missing in my original email and that talking
>>> with Steve Canon offline clarified was that we need a way to
>>> directly query the current modes for systems where those can be set
>>> externally.
>>> 
>>> 
>>> My suggestion was to have an intrinsic that "loads" this state. This
>>> could then be used to load whatever the current state is, and pass
>>> that to the floating point intrinsics proposed in order to pick up
>>> whatever the "current" state happens to be on systems where this is
>>> truly a background stateful thing, while still allowing us to model
>>> operation-specific state for other systems. Naturally, there should
>>> be a complementing "store" of the state as well.
>>> 
>>> 
>>> Then, for code which really needs this degree of faithful FP
>>> environment handling, you would expect the #pragma to be present
>>> enabling that mode. While that pragma is in place, all floating
>>> point operations would be lowered using these intrinsics, and
>>> external function calls could be guarded by storing and reloading
>>> this state at the IR level. This would make the IR substantially
>>> more verbose when the pragma is enabled, but that seems like an
>>> acceptable tradeoff given that we expect this code to be rare (see
>>> my preconditions section). And naturally, on any system that
>>> actually manages FP environment in a state "register" or whatever,
>>> we'd want to do some work to try to optimize away state changes.
>>> Much like we have attributes that can be inferred about access to
>>> memory, we could infer attributes on functions about whether they
>>> change the FP environment state, and if not, propagate across the
>>> function call boundaries.
>>> 
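A rough C++ analogue of guarding an external call by storing and reloading the state, as described in the paragraph above. It uses the standard `<cfenv>` calls `std::fesetround` and `std::fegetround` as stand-ins for the proposed "store" and "load" intrinsics; `callWithFPEnvGuard` and the two callee functions are hypothetical names, and the sketch assumes a target that defines `FE_UPWARD`.

```cpp
#include <cfenv>

// "Store" the rounding mode the caller tracks explicitly into the ambient
// environment, run the external function, then "load" the mode back, since
// the callee may have changed it.
int callWithFPEnvGuard(int trackedMode, void (*externalFn)()) {
  std::fesetround(trackedMode);  // store: make the ambient state match
  externalFn();                  // opaque call, may modify the environment
  return std::fegetround();      // load: pick up whatever the state is now
}

// Hypothetical callees used only for illustration.
void externalThatChangesRounding() { std::fesetround(FE_UPWARD); }
void externalThatLeavesEnvAlone() {}
```

If an attribute could prove that a callee such as `externalThatLeavesEnvAlone` never touches the environment, both the store and the reload around it would be candidates for removal.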
>>> 
>>> But even though this would be some amount of work to optimize, the
>>> nice thing (IMO) is that it would be localized. We would have
>>> specific code that dealt with optimizing the FP environment
>>> concerns, while the rest of LLVM could remain oblivious and rely on
>>> simple common constructs to provide conservatively correct behavior.
>>> 
>>> What do you think?
>> 
>> SGTM.
>> 
>> -Hal
>> 
>>> -Chandler
>>> 
>>> 
>>> 
>>> 
>>> Thanks again,
>>> Hal
>>> 
>>> ----- Original Message -----
>>>> From: "Chandler Carruth" < chandlerc at gmail.com >
>>>> To: "Mehdi Amini" < mehdi.amini at apple.com >, "llvm-dev" <
>>>> llvm-dev at lists.llvm.org >
>>>> Cc: "Steve (Numerics) Canon" < scanon at apple.com >, "Sergey
>>>> Dmitrouk" < sdmitrouk at accesssoftek.com >, "David Majnemer"
>>>> < david.majnemer at gmail.com >, "Hal Finkel" < hfinkel at anl.gov >
>>>> Sent: Thursday, February 4, 2016 8:05:38 PM
>>>> Subject: Re: [RFC] FP Environment and Rounding mode handling in
>>>> LLVM
>>>> 
>>>> 
>>>> First, thanks Mehdi for putting something on llvm-dev and getting
>>>> wider awareness of this.
>>>> 
>>>> 
>>>> I am actually really interested in finding a way for LLVM to support
>>>> the interesting functionality we are missing from fenv-like
>>>> interfaces. Things like rounding modes, exceptions, etc. However, I
>>>> think the current design is going to be a really high burden for the
>>>> entire optimizer and I think there is a simpler model that we might
>>>> pursue instead.
>>>> 
>>>> 
>>>> I'll start off with some underlying principles that I'm operating
>>>> from:
>>>> a) Most code in the world will be very happy with the default
>>>> floating point environment, doesn't need to carefully model floating
>>>> point exceptions, etc. Essentially, I think that LLVM's behavior
>>>> today is probably right for most code. Now, the code which needs
>>>> support for the other features of floating point isn't bad or
>>>> unimportant! But it is relatively speaking rare, and so I think it
>>>> is reasonable to optimize the *representation* model for the common
>>>> case provided we don't lose support for functionality.
>>>> 
>>>> 
>>>> b) When outside the default floating point environment's rules, there
>>>> are few if any optimizations that we realistically expect from LLVM.
>>>> Certainly, any changes to the LLVM optimizer which impact code
>>>> outside the default need to be done *much* more carefully to avoid
>>>> introducing subtle bugs.
>>>> 
>>>> 
>>>> OK, based on that, consider the following model:
>>>> We provide intrinsics that mirror the instructions 'fadd', 'fsub',
>>>> 'fmul', 'fdiv', and 'frem' (so 5 total). From here on out, I'll
>>>> exclusively use 'fadd' as my examples. The intrinsics would look
>>>> like:
>>>> 
>>>> declare {float, i1} @llvm.fadd.with.environment.f32(float %lhs,
>>>>     float %rhs, i8 %rounding_mode, i8 %exception_behavior)
>>>> 
>>>> 
>>>> Then we define specific values to be used for the IEEE rounding
>>>> modes. And we define values to control exception behavior. I'm not
>>>> an expert on floating point exceptions in particular (my platforms
>>>> don't use them) but I'm imagining three states "ignore", "return",
>>>> and "trap". I've used a single 'i1', but I'm assuming it would need
>>>> to be several i1s or an iN in order to model the set of FP
>>>> exceptions. I'm using i1 here just to simplify the explanation, I
>>>> think it generalizes and I'll let the experts suggest the exact
>>>> formulation.
>>>> 
>>>> 
>>>> If the default rounding mode is provided to these intrinsics and the
>>>> "ignore" exception behavior is provided, they behave exactly as the
>>>> existing instructions do, and instcombine should canonicalize to the
>>>> existing instructions.
>>>> 
>>>> 
>>>> The semantics of non-default rounding modes are to perform the
>>>> operation with that rounding mode.
>>>> 
>>>> 
>>>> If "return" is provided for the exception behavior, then the i1
>>>> component of the result is true if an FP exception occurred and false
>>>> otherwise. If "ignore" is provided then any FP exceptions are
>>>> ignored and the i1 is always false. If "trap" is provided then the
>>>> i1 is always false, but the call to the intrinsic might trap. We
>>>> could either define a trap as precisely the same as a call to
>>>> @llvm.trap(), or we could introduce an @llvm.fp.trap() and define it
>>>> as a call to that.
>>>> 
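The three exception behaviors described above can be modelled in a few lines of C++. This is a sketch under assumptions, not the proposed intrinsic: `faddWithBehavior` is a hypothetical name, only overflow is treated as an exception, and a trap is modelled by throwing `std::runtime_error` because a real trap cannot be observed from a test.

```cpp
#include <cmath>
#include <stdexcept>
#include <utility>

enum class ExceptionBehavior { Ignore, Return, Trap };

// Returns {result, flag}, mirroring the {float, i1} aggregate in the proposal.
std::pair<float, bool> faddWithBehavior(float lhs, float rhs,
                                        ExceptionBehavior behavior) {
  const float sum = lhs + rhs;
  // Only overflow is modelled: finite inputs, infinite result.
  const bool raised =
      std::isinf(sum) && std::isfinite(lhs) && std::isfinite(rhs);
  switch (behavior) {
  case ExceptionBehavior::Ignore:
    return {sum, false};   // the flag is always false
  case ExceptionBehavior::Return:
    return {sum, raised};  // the flag reports the exception
  case ExceptionBehavior::Trap:
    if (raised)
      throw std::runtime_error("fp trap");  // stand-in for a trap
    return {sum, false};   // the flag is always false if we get here
  }
  return {sum, false};     // unreachable for valid enum values
}
```

With "ignore" the function computes exactly what a plain addition computes, which is the case the proposal wants canonicalized back to the existing instruction.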
>>>> 
>>>> The frontend would then be responsible for lowering floating point
>>>> arithmetic using these intrinsics. This may be somewhat challenging
>>>> because in the frontend behavior is controlled dynamically in some
>>>> languages. In those situations, we can either allow these intrinsics
>>>> to accept non-constant arguments for %rounding_mode and
>>>> %exception_behavior so that frontends can emit code that just
>>>> dynamically computes them, or we could follow the same model that
>>>> atomics use, and if the frontend cannot trivially compute a
>>>> constant, it can emit a switch over the possible states with a
>>>> specific intrinsic call in each case. I don't have strong opinions
>>>> about which would be best, I think either could be made to work.
>>>> 
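The "switch over the possible states" option above might look roughly like this in C++, with a template parameter playing the role of the constant rounding-mode argument. Everything here is hypothetical (`RoundingMode`, `faddStatic`, `faddDynamic`), and the directed rounding is emulated in software on the assumption that the sum of the two floats is exactly representable as a double, which does not hold when the operands' exponents are far apart.

```cpp
#include <cmath>
#include <limits>

enum class RoundingMode { ToNearest, Upward, Downward, TowardZero };

// One instantiation per constant rounding mode.
template <RoundingMode Mode> float faddStatic(float lhs, float rhs) {
  // Assumed exact; see the caveat in the text above.
  const double exact = static_cast<double>(lhs) + static_cast<double>(rhs);
  // Conversion uses the ambient mode, assumed to be round-to-nearest.
  const float nearest = static_cast<float>(exact);
  if (Mode == RoundingMode::ToNearest ||
      static_cast<double>(nearest) == exact)
    return nearest;
  const float inf = std::numeric_limits<float>::infinity();
  const bool roundedUp = static_cast<double>(nearest) > exact;
  // Toward zero means upward for negative values, downward for positive ones.
  const bool wantUp = Mode == RoundingMode::Upward ||
                      (Mode == RoundingMode::TowardZero && exact < 0.0);
  if (wantUp)
    return roundedUp ? nearest : std::nextafter(nearest, inf);
  return roundedUp ? std::nextafter(nearest, -inf) : nearest;
}

// What a frontend would emit when the mode is only known at run time.
float faddDynamic(float lhs, float rhs, RoundingMode mode) {
  switch (mode) {
  case RoundingMode::ToNearest:
    return faddStatic<RoundingMode::ToNearest>(lhs, rhs);
  case RoundingMode::Upward:
    return faddStatic<RoundingMode::Upward>(lhs, rhs);
  case RoundingMode::Downward:
    return faddStatic<RoundingMode::Downward>(lhs, rhs);
  case RoundingMode::TowardZero:
    return faddStatic<RoundingMode::TowardZero>(lhs, rhs);
  }
  return faddStatic<RoundingMode::ToNearest>(lhs, rhs);
}
```

The cost of this option is code size: every dynamically controlled operation expands into one call per possible state, in exchange for each call site carrying a constant the optimizer can reason about.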
>>>> 
>>>> If we go with constant arguments being required, we could use
>>>> "metadata arguments" which aren't actually metadata but just encoded
>>>> arguments for intrinsics.
>>>> 
>>>> 
>>>> When emitting constants and trying to respect floating point
>>>> environment settings, frontends will have to emit runtime calls
>>>> instead of actual constants. But this seems actually good because
>>>> that is what we'll need anyways -- we aren't able to with full
>>>> generality emulate all the environment options if I understand
>>>> things correctly (and let me know if I've misunderstood).
>>>> 
>>>> 
>>>> 
>>>> 
>>>> The two really big reasons why I like this model much more than
>>>> extending flags are:
>>>> 
>>>> 
>>>> 1) This avoids implicit state. The implicit state of the floating
>>>> point environment makes things like code motion extremely hard to
>>>> reason about. I think we will just get it wrong too often to make
>>>> this a good approach. By modeling all of this as actual SSA values I
>>>> think there is a much better chance we'll get this stuff right. For
>>>> example by or-ing all the i1s for floating point exceptions and
>>>> testing the result to implement fetestexcept. Then the backend can
>>>> spill the state when necessary and reload it when needed even if
>>>> other floating point math is introduced. I admit that first class
>>>> aggregate returns aren't a beautiful way to encapsulate this, but
>>>> they are an *effective* way that we know how to work with in the
>>>> LLVM IR. If we ever come up with a better multi-def model, we can
>>>> always switch these and all the other intrinsics which need this to
>>>> that model.
>>>> 
>>>> 
>>>> 2) Every pass will conservatively correctly model the operations.
>>>> This is most significant when modeling trapping on exceptions. We
>>>> need every pass to realize that control flow might not proceed past
>>>> such operations. We already have this logic for calls, and it seems
>>>> a really nice fit for allowing most of the optimizer to be unaware
>>>> of these constructs while respecting them and preserving behavior in
>>>> the face of them.
>>>> 
>>>> 
>>>> 
>>>> 
>>>> I suspect that there are things this model doesn't handle that I've
>>>> not thought of (as this is outside the area of FP that I'm deeply
>>>> familiar with), but I really think this model would be easier to
>>>> reason about and would be much less invasive within the IR and
>>>> optimizer. I wonder if folks think this could work and would be up
>>>> for moving their efforts in this direction?
>>>> 
>>>> 
>>>> -Chandler
>>>> 
>>>> 
>>>> On Wed, Feb 3, 2016 at 3:04 PM Mehdi Amini < mehdi.amini at apple.com > wrote:
>>>> 
>>>> 
>>>> Hi everyone,
>>>> 
>>>> Sergey (CC’ed) worked on a series of patches to add support for
>>>> floating-point environment and floating-point rounding modes in LLVM.
>>>> This started *in 2014* and the patches, after multiple rounds of
>>>> review in the last months (involving among others Steve Canon, Hal
>>>> Finkel, David Majnemer, and myself), are getting very close (IMO) to
>>>> being in a state where we can land them.
>>>> 
>>>> This is the thread that started this development: "[LLVMdev] More
>>>> careful treatment of floating point exceptions"
>>>> http://marc.info/?l=llvm-dev&m=141113983302113&w=2
>>>> And this is the thread where most of the discussion on the design
>>>> occurred: "[PATCH] Flag to enable IEEE-754 friendly FP
>>>> optimizations"
>>>> http://marc.info/?l=llvm-commits&m=141235814915999&w=2
>>>> 
>>>> Since Chandler raised some concerns on IRC today, I figured I
>>>> should send a heads-up on this topic to allow anyone to comment on
>>>> the current plan.
>>>> 
>>>> We plan on adding two new FP env flags to the existing FMF (fast-math
>>>> flags). Without these flags set, the optimizer has to assume that
>>>> the FP env can be observed, or the rounding mode can be changed. For
>>>> clang, these flags would be set unless a command line option
>>>> requires preserving the FP env.
>>>> 
>>>> Here is the list of patches:
>>>> 
>>>> [FPEnv Core 01/14] Add flags and command-line switches for FPEnv:
>>>> http://reviews.llvm.org/D14066
>>>> [FPEnv Core 02/14] Add FPEnv access flags to fast-math flags:
>>>> http://reviews.llvm.org/D14067
>>>> [FPEnv Core 03/14] Make SelectionDAG aware of FPEnv flags:
>>>> http://reviews.llvm.org/D14068
>>>> [FPEnv Core 04/14] Skip constant folding to preserve FPEnv:
>>>> http://reviews.llvm.org/D14069
>>>> [FPEnv Core 05/14] Teach IR builder and folders about new flags:
>>>> http://reviews.llvm.org/D14070
>>>> [FPEnv Core 06/14] Do not fold constants on reading in IR asm/bitcode:
>>>> http://reviews.llvm.org/D14071
>>>> [FPEnv Core 07/14] Prevent undesired folding by InstSimplify:
>>>> http://reviews.llvm.org/D14072
>>>> [FPEnv Core 08/14] Do not simplify expressions with FPEnv access:
>>>> http://reviews.llvm.org/D14073
>>>> [FPEnv Core 09/14] Make Strict flag available for more clients:
>>>> http://reviews.llvm.org/D14074
>>>> [FPEnv Core 10/14] Use Strict in IRBuilder:
>>>> http://reviews.llvm.org/D14075
>>>> [FPEnv Core 11/14] Don't convert fpops to constexprs in SCCP:
>>>> http://reviews.llvm.org/D14076
>>>> [FPEnv Core 13/14] Don't hoist FP-ops with side-effects in LICM:
>>>> http://reviews.llvm.org/D14078
>>>> [FPEnv Core 14/14] Introduce F*_W_CHAIN instrs to prevent reordering:
>>>> http://reviews.llvm.org/D14079
>>>> 
>>>> 
>>>> —
>>>> Mehdi
>>>> 
>>>> 
>>> 
>>> --
>>> Hal Finkel
>>> Assistant Computational Scientist
>>> Leadership Computing Facility
>>> Argonne National Laboratory
>>> _______________________________________________
>>> LLVM Developers mailing list
>>> llvm-dev at lists.llvm.org
>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>> 
>> 

Chandler Carruth via llvm-dev

2016-Feb-06 02:03 UTC


[llvm-dev] [RFC] FP Environment and Rounding mode handling in LLVM

Agreed.

On Fri, Feb 5, 2016 at 5:54 PM Pete Cooper via llvm-dev <
llvm-dev at lists.llvm.org> wrote:
> FWIW, +1 from me.
>
> Just one request on the implementation though.  However we model these
> intrinsics and their properties (metadata, constants, etc), can we please
> abstract away those details the same way we have MemCpyInst which just
> wraps an IntrinsicInst?
>
> I think this would be very beneficial if we ever need to add more state,
> or change something about the underlying implementation, and not have to
> search all the code for ‘bool traps =
> cast<ConstantInt>(I->getOperand(1))->getZExtValue()’ or whatever it happens
> to be.
>
> Pete
> > On Feb 5, 2016, at 4:36 PM, Stephen Canon via llvm-dev <
> llvm-dev at lists.llvm.org> wrote:
> >
> > Seems like everyone’s on board, but I want to mention that I also
think
> this is very much the right approach.  In particular, it allows us to
> support both existing CPU designs with dynamic rounding modes as well as
> GPU designs and soft-float libraries with statically specified rounding.
> >
> > Support for “I want the flags, but I really don’t care about when they
> happen specifically” is somewhat interesting; I assume this would take the
> form of “returning” the flag state and OR-ing it into an integer that
> represents the cumulative flags (much like common cpu hardware does, but
> this would also let us support soft-float implementations).  This wouldn’t
> impose ordering restrictions, but would prevent speculation.
> >
> > – Steve
> >
> >> On Feb 5, 2016, at 4:25 PM, Hal Finkel via llvm-dev <
> llvm-dev at lists.llvm.org> wrote:
> >>
> >> ----- Original Message -----
> >>> From: "Chandler Carruth" <chandlerc at
gmail.com>
> >>> To: "Hal Finkel" <hfinkel at anl.gov>,
"Chandler Carruth" <
> chandlerc at gmail.com>
> >>> Cc: "llvm-dev" <llvm-dev at lists.llvm.org>
> >>> Sent: Friday, February 5, 2016 4:36:54 PM
> >>> Subject: Re: [llvm-dev] [RFC] FP Environment and Rounding mode
> handling in LLVM
> >>>
> >>> On Fri, Feb 5, 2016 at 2:10 PM Hal Finkel via llvm-dev <
> >>> llvm-dev at lists.llvm.org > wrote:
> >>>
> >>>
> >>> Hi Chandler,
> >>>
> >>> This scheme has significant advantages over what was being
pursued,
> >>> but one question (or two)...
> >>>
> >>> Under the proposed system, how would you represent the
necessary
> >>> dependency edges between the fp intrinsics and function calls?
How
> >>> is the state 'returned' to the caller? [I was thinking
that our new
> >>> operand bundles could help for the inputs, but the outputs?
Plus
> >>> what about the live-in state?]
> >>>
> >>> This is important because any external subroutine call could
> >>> (potentially) change the rounding mode or any other part of
the
> >>> floating-point environment.
> >>>
> >>>
> >>>
> >>> So, one thing that was missing in my original email and that
talking
> >>> with Steve Canon offline clarified was that we need a way to
> >>> directly query the current modes for systems where those can
be set
> >>> externally.
> >>>
> >>>
> >>> My suggestion was to have an intrinsic that "loads"
this state. This
> >>> could then be used to load whatever the current state is, and
pass
> >>> that to the floating point intrinsics proposed in order to
pick up
> >>> whatever the "current" state happens to be on
systems where this is
> >>> truly a background stateful thing, while still allowing us to
model
> >>> operation-specific state for other systems. Naturally, there
should
> >>> be a complimenting "store" of the state as well.
> >>>
> >>>
> >>> Then, for code which really needs this degree of faithful FP
> >>> environment handling, you would expect the #pragma to be
present
> >>> enabling that mode. While that pragma is in place, all
floating
> >>> point operations would be lowered using these intrinsics, and
> >>> external function calls could be guarded by storing and
reloading
> >>> this state at the IR level. This would make the IR
substantially
> >>> more verbose when the pragma is enabled, but that seems like
an
> >>> acceptable tradeoff given that we expect this code to be rare
(see
> >>> my preconditions section). And naturally, on any system that
> >>> actually manages FP environment in a state
"register" or whatever,
> >>> we'd want to do some work to try to optimize away state
changes.
> >>> Much like we have attributes that can be inferred about access
to
> >>> memory, we could infer attributes on functions about whether
they
> >>> change the FP environment state, and if not, propagate across
the
> >>> function call boundaries.
> >>>
> >>>
> >>> But even though this would be some amount of work to optimize,
the
> >>> nice thing (IMO) is that it would be localized. We would have
> >>> specific code that dealt with optimizing the FP environment
> >>> concerns, while the rest of LLVM could remain oblivious and
rely on
> >>> simple common constructs to provide conservatively correct
behavior.
> >>>
> >>> What do you think?
> >>
> >> SGTM.
> >>
> >> -Hal
> >>
> >>> -Chandler
> >>>
> >>>
> >>>
> >>>
> >>> Thanks again,
> >>> Hal
> >>>
> >>> ----- Original Message -----
> >>>> From: "Chandler Carruth" < chandlerc at
gmail.com >
> >>>> To: "Mehdi Amini" < mehdi.amini at apple.com
>, "llvm-dev" <
> >>>> llvm-dev at lists.llvm.org >
> >>>> Cc: "Steve (Numerics) Canon" < scanon at
apple.com >, "Sergey
> >>>> Dmitrouk" < sdmitrouk at accesssoftek.com >,
"David Majnemer"
> >>>> < david.majnemer at gmail.com >, "Hal
Finkel" < hfinkel at anl.gov >
> >>>> Sent: Thursday, February 4, 2016 8:05:38 PM
> >>>> Subject: Re: [RFC] FP Environment and Rounding mode
handling in
> >>>> LLVM
> >>>>
> >>>>
> >>>> First, thanks Mehdi for putting something on llvm-dev and
getting
> >>>> wider awareness of this.
> >>>>
> >>>>
> >>>> I am actually really interested in finding a way for LLVM
to
> >>>> support
> >>>> the interesting functionality we are missing from
fenv-like
> >>>> interfaces. Things like rounding modes, exceptions, etc.
However, I
> >>>> think the current design is going to be a really high
burden for
> >>>> the
> >>>> entire optimizer and I think there is a simpler model that
we might
> >>>> pursue instead.
> >>>>
> >>>>
> >>>> I'll start off with some underlying principles that
I'm operating
> >>>> from:
> >>>> a) Most code in the world will be very happy with the
default
> >>>> floating point environment, doesn't need to carefully
model
> >>>> floating
> >>>> point exceptions, etc. Essentially, I think that
LLVM's behavior
> >>>> today is probably right for most code. Now, the code which
needs
> >>>> support for the other features of floating point isn't
bad or
> >>>> unimportant! But it is relatively speaking rare, and so I
think it
> >>>> is reasonable to optimize the *representation* model for
the common
> >>>> case provided we don't lose support for functionality.
> >>>>
> >>>>
> >>>> a) When outside the default floating point
environment's rules,
> >>>> there
> >>>> are few if any optimizations that we realistically expect
from
> >>>> LLVM.
> >>>> Certainly, any changes to the LLVM optimizer which impact
code
> >>>> outside the default needs to be done *much* more carefully
to avoid
> >>>> introducing subtle bugs.
> >>>>
> >>>>
> >>>> OK, based on that, consider the following model:
> >>>> We provide intrinsics that mirror the instructions
'fadd', 'fsub',
> >>>> 'fmul', 'fdiv', and 'frem' (so 5
total). From here on out, I'll
> >>>> exclusively use 'fadd' as my examples. The
intrinsics would look
> >>>> like:
> >>>>
> >>>> declare {f32, i1} @llvm.fadd.with.environment.f32(f32
%lhs, f32
> >>>> %rhs,
> >>>> i8 %rounding_mode, i8 %exception_behavior)
> >>>>
> >>>>
> >>>> Then we define specific values to be used for the IEEE
rounding
> >>>> modes. And we define values to control exception behavior.
I'm not
> >>>> an expert on floating point exceptions in particular (my
platforms
> >>>> don't use them) but I'm imagining three states
"ignore", "return",
> >>>> and "trap". I've used a single 'i1',
but I'm assuming it would need
> >>>> to be several i1s or an iN in order to model the set of FP
> >>>> exceptions. I'm using i1 here just to simplify the
explanation, I
> >>>> think it generalizes and I'll let the experts suggest
the exact
> >>>> formulation.
> >>>>
> >>>>
> >>>> If the default rounding mode is provided to these
intrinsics and
> >>>> the
> >>>> "ignore" exception behavior is provided, they
behave exactly as the
> >>>> existing instructions do, and instcombine should
canonicalize to
> >>>> the
> >>>> existing instructions.
> >>>>
> >>>>
> >>>> The semantics of non-default rounding modes are to perform
the
> >>>> operation with that rounding mode.
> >>>>
> >>>>
> >>>> If "return" is provided for the exception
behavior, then the i1
> >>>> component of the result is true if an FP exception occured
and
> >>>> false
> >>>> otherwise. If "ignore" is provided then any FP
exceptions are
> >>>> ignored and the i1 is always false. If "trap" is
provided then the
> >>>> i1 is always false, but the call to the intrinsic might
trap. We
> >>>> could either define a trap as precisely the same as a call
to
> >>>> @llvm.trap(), or we could introduce an @llvm.fp.trap() and
define
> >>>> it
> >>>> as a call to that.
> >>>>
> >>>>
> >>>> The frontend would then be responsible for lowering
floating point
> >>>> arithmetic using these intrinsics. This may be somewhat
challenging
> >>>> because in the frontend behavior is controlled dynamically
in some
> >>>> languages. In those situations, we can either allow these
> >>>> intrinsics
> >>>> to accept non-constant arguments for %rounding_mode and
> >>>> %exception_behavior so that frontends can emit code that
just
> >>>> dynamically computes them, or we could follow the same
model that
> >>>> atomics use, and if the frontend cannot trivially compute
a
> >>>> constant, it can emit a switch over the possible states
with a
> >>>> specific intrinsic call in each case. I don't have
strong opinions
> >>>> about which would be best, I think either could be made to
work.
> >>>>
> >>>>
> >>>> If we go with constant arguments being required, we could
use
> >>>> "metadata arguments" which aren't actually
metadata but just
> >>>> encoded
> >>>> arguments for intrinsics.
> >>>>
> >>>>
> >>>> When emitting constants and trying to respect floating
point
> >>>> environment settings, frontends will have to emit runtime
calls
> >>>> instead of actual constants. But this seems actually good
because
> >>>> that is what we'll need anyways -- we aren't able
to with full
> >>>> generality emulate all the environment options if I
understand
> >>>> things correctly (and let me know if I've
misunderstood).
> >>>>
> >>>>
> >>>>
> >>>>
> >>>> The two really big reasons why I like this model much more
than
> >>>> extending flags are:
> >>>>
> >>>>
> >>>> 1) This avoids implicit state. The implicit state of the
floating
> >>>> point environment makes things like code motion extremely
hard to
> >>>> reason about. I think we will just get it wrong too often
to make
> >>>> this a good approach. By modeling all of this as actual
SSA values
> >>>> I
> >>>> think there is a much better chance we'll get this
stuff right. For
> >>>> example by or-ing all the i1s for floating point
exceptions and
> >>>> testing the result to implement fetestexcept. Then the
backend can
> >>>> spill the state when necessary and reload it when needed
even if
> >>>> other floating point math is introduced. I admit that
first class
> >>>> aggregate returns aren't a beautiful way to
encapsulate this, but
> >>>> they are an *effective* way that we know how to work with
in the
> >>>> LLVM IR. If we ever come up with a better multi-def model,
we can
> >>>> always switch these and all the other intrinsics which
need this to
> >>>> that model.
> >>>>
> >>>>
> >>>> 2) Every pass will conservatively correctly model the
operations.
> >>>> This is most significant when modeling trapping on
exceptions. We
> >>>> need every pass to realize that control flow might not
proceed past
> >>>> such operations. We already have this logic for calls, and
it seems
> >>>> a really nice fit for allowing most of the optimizer to be
unaware
> >>>> of these constructs while respecting them and preserving
behavior
> >>>> in
> >>>> the face of them.
> >>>>
> >>>>
> >>>>
> >>>>
> >>>> I suspect that there are things this model doesn't
handle that I've
> >>>> not thought of (as this is outside the are of FP that
I'm deeply
> >>>> familiar with), but I really think this model would be
easier to
> >>>> reason about and would be much less invasive within the IR
and
> >>>> optimizer. I wonder if folks think this could work and
would be up
> >>>> for moving their efforts in this direction?
> >>>>
> >>>>
> >>>> -Chandler
> >>>>
> >>>>
> >>>> On Wed, Feb 3, 2016 at 3:04 PM Mehdi Amini <
mehdi.amini at apple.com
> >>>>>
> >>>> wrote:
> >>>>
> >>>>
> >>>> Hi everyone,
> >>>>
> >>>> Sergey (CC’ed) worked on a series of patches to add
support for
> >>>> floating-point environment and floating-point rounding
modes in
> >>>> LLVM.
> >>>> This started *in 2014* and the patches after multiple
rounds of
> >>>> review in the last months (involving amongst other Steve
Canon, Hal
> >>>> Finkel, David Majnemer, and myself) are getting very close
(IMO) to
> >>>> be in a state where we can land them.
> >>>>
> >>>> This is the thread that started this development: “
[LLVMdev] More
> >>>> careful treatment of floating point exceptions"
> >>>> http://marc.info/?l=llvm-dev&m=141113983302113&w=2
> >>>> And this is the thread where most of the discussion on the
design
> >>>> occurred: "[PATCH] Flag to enable IEEE-754 friendly
FP
> >>>> optimizations”
> >>>>
http://marc.info/?l=llvm-commits&m=141235814915999&w=2
> >>>>
> >>>> Since Chandler raised some concerns on IRC today, so I
figured I
> >>>> should send a heads-up on this topic to allow any one to
comment on
> >>>> the current plan.
> >>>>
> >>>> We plan on adding two new FP env flags to the existing FMF
> >>>> (fast-math
> >>>> flags). Without these flags set, the optimizer has to
assume that
> >>>> the FP env can be observed, or the rounding mode can be
changed.
> >>>> For
> >>>> clang, these flags would be set unless a command line
option would
> >>>> require to preserve the FP env.
> >>>>
> >>>> Here is the list of patches:
> >>>>
> >>>> [FPEnv Core 01/14] Add flags and command-line switches for
FPEnv:
> >>>> http://reviews.llvm.org/D14066
> >>>> [FPEnv Core 02/14] Add FPEnv access flags to fast-math
flags:
> >>>> http://reviews.llvm.org/D14067
> >>>> [FPEnv Core 03/14] Make SelectionDAG aware of FPEnv flags:
> >>>> http://reviews.llvm.org/D14068
> >>>> [FPEnv Core 04/14] Skip constant folding to preserve
FPEnv:
> >>>> http://reviews.llvm.org/D14069
> >>>> [FPEnv Core 05/14] Teach IR builder and folders about new
flags:
> >>>> http://reviews.llvm.org/D14070
> >>>> [FPEnv Core 06/14] Do not fold constants on reading in IR
> >>>> asm/bitcode: http://reviews.llvm.org/D14071
> >>>> [FPEnv Core 07/14] Prevent undesired folding by
InstSimplify:
> >>>> http://reviews.llvm.org/D14072
> >>>> [FPEnv Core 08/14] Do not simplify expressions with FPEnv
access:
> >>>> http://reviews.llvm.org/D14073
> >>>> [FPEnv Core 09/14] Make Strict flag available for more
clients:
> >>>> http://reviews.llvm.org/D14074
> >>>> [FPEnv Core 10/14] Use Strict in IRBuilder:
> >>>> http://reviews.llvm.org/D14075
> >>>> [FPEnv Core 11/14] Don't convert fpops to constexprs
in SCCP:
> >>>> http://reviews.llvm.org/D14076
> >>>> [FPEnv Core 13/14] Don't hoist FP-ops with
side-effects in LICM:
> >>>> http://reviews.llvm.org/D14078
> >>>> [FPEnv Core 14/14] Introduce F*_W_CHAIN instrs to prevent
> >>>> reordering:
> >>>> http://reviews.llvm.org/D14079
> >>>>
> >>>>
> >>>> —
> >>>> Mehdi
> >>>>
> >>>>
> >>>
> >>> --
> >>> Hal Finkel
> >>> Assistant Computational Scientist
> >>> Leadership Computing Facility
> >>> Argonne National Laboratory
> >>> _______________________________________________
> >>> LLVM Developers mailing list
> >>> llvm-dev at lists.llvm.org
> >>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
> >>>

Philip Reames via llvm-dev

2016-Feb-10 03:57 UTC

[llvm-dev] [RFC] FP Environment and Rounding mode handling in LLVM

+1 to this.  Having it structured this way would make things much easier 
if we someday decided to promote these intrinsics to instructions or 
merge them (via non-optional modifiers like "volatile") with the 
existing floating point instructions.

Philip

On 02/05/2016 06:03 PM, Chandler Carruth via llvm-dev wrote:
> Agreed.
>
> On Fri, Feb 5, 2016 at 5:54 PM Pete Cooper via llvm-dev
> <llvm-dev at lists.llvm.org> wrote:
>
>     FWIW, +1 from me.
>
>     Just one request on the implementation though.  However we model
>     these intrinsics and their properties (metadata, constants, etc),
>     can we please abstract away those details the same way we have
>     MemCpyInst which just wraps an IntrinsicInst?
>
>     I think this would be very beneficial if we ever need to add more
>     state, or change something about the underlying implementation,
>     and not have to search all the code for ‘bool traps =
>     cast<ConstantInt>(I->getOperand(1))->getZextValue()’ or whatever
>     it happens to be.
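A rough sketch of the kind of wrapper being requested here, written against a stand-in call type rather than the real LLVM classes so that it compiles on its own. `MockIntrinsicCall`, `FPEnvCallView`, the operand indices, and the numeric encoding of the exception behavior are all hypothetical; the operand order is assumed from the `@llvm.fadd.with.environment.f32` declaration quoted later in this thread.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Stand-in for a generic intrinsic call: just a list of constant operands.
// (Hypothetical; in LLVM proper this role is played by IntrinsicInst.)
struct MockIntrinsicCall {
  std::vector<std::int64_t> operands;
};

// Hypothetical encoding of the three exception behaviors from the proposal.
enum class ExceptionBehavior : std::int64_t { Ignore = 0, Return = 1, Trap = 2 };

// Thin view that hides operand positions from the rest of the code base.
class FPEnvCallView {
 public:
  explicit FPEnvCallView(const MockIntrinsicCall& call) : call_(call) {
    assert(call_.operands.size() == kNumOperands && "unexpected operand count");
  }

  std::int64_t roundingMode() const { return call_.operands[kRoundingModeIdx]; }

  ExceptionBehavior exceptionBehavior() const {
    return static_cast<ExceptionBehavior>(call_.operands[kExceptionIdx]);
  }

  bool mayTrap() const { return exceptionBehavior() == ExceptionBehavior::Trap; }

 private:
  // Operand layout assumed from the proposed declaration:
  // (lhs, rhs, rounding_mode, exception_behavior).
  static constexpr std::size_t kRoundingModeIdx = 2;
  static constexpr std::size_t kExceptionIdx = 3;
  static constexpr std::size_t kNumOperands = 4;

  const MockIntrinsicCall& call_;
};
```

With a view like this, callers ask `FPEnvCallView(call).mayTrap()` and the operand layout can change in exactly one place.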
>
>     Pete
>     > On Feb 5, 2016, at 4:36 PM, Stephen Canon via llvm-dev
>     > <llvm-dev at lists.llvm.org> wrote:
>     >
>     > Seems like everyone’s on board, but I want to mention that I
>     > also think this is very much the right approach.  In particular,
>     > it allows us to support both existing CPU designs with dynamic
>     > rounding modes as well as GPU designs and soft-float libraries
>     > with statically specified rounding.
>     >
>     > Support for “I want the flags, but I really don’t care about
>     > when they happen specifically” is somewhat interesting; I assume
>     > this would take the form of “returning” the flag state and OR-ing
>     > it into an integer that represents the cumulative flags (much like
>     > common CPU hardware does, but this would also let us support
>     > soft-float implementations).  This wouldn’t impose ordering
>     > restrictions, but would prevent speculation.
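A minimal sketch of the "return the flags and OR them together" idea, in the style of a soft-float routine that never touches a global status register. The flag bits, `AddResult`, `add_with_flags`, and `sum3` are hypothetical names, and only two of the IEEE exceptions are modeled to keep the example short.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Hypothetical flag bits; a real design would cover the full IEEE 754 set.
enum FPFlags : unsigned { kNone = 0u, kInvalid = 1u << 0, kOverflow = 1u << 1 };

struct AddResult {
  float value;
  unsigned flags;  // flags attached to this result
};

// Adds two floats and reports, as a plain value, which of the modeled
// exceptions the operation raised. No global state is read or written.
inline AddResult add_with_flags(float a, float b) {
  const float sum = a + b;
  unsigned flags = kNone;
  // NaN from non-NaN inputs (e.g. inf + -inf) is an invalid operation.
  if (std::isnan(sum) && !std::isnan(a) && !std::isnan(b)) flags |= kInvalid;
  // An infinite result from finite inputs is an overflow.
  if (std::isinf(sum) && std::isfinite(a) && std::isfinite(b)) flags |= kOverflow;
  return {sum, flags};
}

// Sums three values, OR-ing each operation's flags into one cumulative word.
inline AddResult sum3(float a, float b, float c) {
  unsigned cumulative = kNone;
  const AddResult ab = add_with_flags(a, b);
  cumulative |= ab.flags;
  const AddResult abc = add_with_flags(ab.value, c);
  cumulative |= abc.flags;
  return {abc.value, cumulative};
}
```

Because the flags are ordinary values combined with a commutative OR, the order in which independent operations contribute to the cumulative word does not matter, which is the "no ordering restrictions" property described above.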
>     >
>     > – Steve
>     >
>     >> On Feb 5, 2016, at 4:25 PM, Hal Finkel via llvm-dev
>     >> <llvm-dev at lists.llvm.org> wrote:
>     >>
>     >> ----- Original Message -----
>     >>> From: "Chandler Carruth" <chandlerc at gmail.com>
>     >>> To: "Hal Finkel" <hfinkel at anl.gov>, "Chandler Carruth"
>     >>> <chandlerc at gmail.com>
>     >>> Cc: "llvm-dev" <llvm-dev at lists.llvm.org>
>     >>> Sent: Friday, February 5, 2016 4:36:54 PM
>     >>> Subject: Re: [llvm-dev] [RFC] FP Environment and Rounding mode
>     >>> handling in LLVM
>     >>>
>     >>> On Fri, Feb 5, 2016 at 2:10 PM Hal Finkel via llvm-dev
>     >>> <llvm-dev at lists.llvm.org> wrote:
>     >>>
>     >>>
>     >>> Hi Chandler,
>     >>>
>     >>> This scheme has significant advantages over what was being
>     >>> pursued, but one question (or two)...
>     >>>
>     >>> Under the proposed system, how would you represent the
>     >>> necessary dependency edges between the fp intrinsics and
>     >>> function calls? How is the state 'returned' to the caller? [I
>     >>> was thinking that our new operand bundles could help for the
>     >>> inputs, but the outputs? Plus what about the live-in state?]
>     >>>
>     >>> This is important because any external subroutine call could
>     >>> (potentially) change the rounding mode or any other part of the
>     >>> floating-point environment.
>     >>>
>     >>>
>     >>>
>     >>> So, one thing that was missing in my original email and that
>     >>> talking with Steve Canon offline clarified was that we need a
>     >>> way to directly query the current modes for systems where those
>     >>> can be set externally.
>     >>>
>     >>>
>     >>> My suggestion was to have an intrinsic that "loads" this state.
>     >>> This could then be used to load whatever the current state is,
>     >>> and pass that to the floating point intrinsics proposed in
>     >>> order to pick up whatever the "current" state happens to be on
>     >>> systems where this is truly a background stateful thing, while
>     >>> still allowing us to model operation-specific state for other
>     >>> systems. Naturally, there should be a complementary "store" of
>     >>> the state as well.
>     >>>
>     >>>
>     >>> Then, for code which really needs this degree of faithful FP
>     >>> environment handling, you would expect the #pragma to be
>     >>> present enabling that mode. While that pragma is in place, all
>     >>> floating point operations would be lowered using these
>     >>> intrinsics, and external function calls could be guarded by
>     >>> storing and reloading this state at the IR level. This would
>     >>> make the IR substantially more verbose when the pragma is
>     >>> enabled, but that seems like an acceptable tradeoff given that
>     >>> we expect this code to be rare (see my preconditions section).
>     >>> And naturally, on any system that actually manages FP
>     >>> environment in a state "register" or whatever, we'd want to do
>     >>> some work to try to optimize away state changes. Much like we
>     >>> have attributes that can be inferred about access to memory, we
>     >>> could infer attributes on functions about whether they change
>     >>> the FP environment state, and if not, propagate across the
>     >>> function call boundaries.
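The store-before, reload-after guard around an external call can be sketched in plain C++ with `<cfenv>`. This is only an analogy for what the IR-level "load" and "store" intrinsics might do: `FPState`, `load_fp_state`, `store_fp_state`, and `call_guarded` are hypothetical names, and only the rounding mode is tracked.

```cpp
#include <cassert>
#include <cfenv>

// Explicit, value-like copy of the part of the FP environment we track.
// (Hypothetical; only the rounding mode is modeled here.)
struct FPState {
  int rounding_mode;
};

// "Load": read the ambient rounding mode into an explicit value.
inline FPState load_fp_state() { return FPState{std::fegetround()}; }

// "Store": write an explicit value back to the ambient environment.
inline void store_fp_state(const FPState& state) {
  const int rc = std::fesetround(state.rounding_mode);
  assert(rc == 0 && "unsupported rounding mode");
  (void)rc;
}

// Calls an opaque function that may read or change the FP environment.
template <typename Callee>
FPState call_guarded(const FPState& before, Callee&& callee) {
  store_fp_state(before);  // make the tracked state visible to the callee
  callee();                // may call fesetround() behind our back
  return load_fp_state();  // pick up whatever the callee left behind
}
```

If an analysis could prove that the callee never touches the environment, both the store and the reload would be redundant, which is the optimization opportunity mentioned in the paragraph above.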
>     >>>
>     >>>
>     >>> But even though this would be some amount of work to optimize,
>     >>> the nice thing (IMO) is that it would be localized. We would
>     >>> have specific code that dealt with optimizing the FP
>     >>> environment concerns, while the rest of LLVM could remain
>     >>> oblivious and rely on simple common constructs to provide
>     >>> conservatively correct behavior.
>     >>>
>     >>> What do you think?
>     >>
>     >> SGTM.
>     >>
>     >> -Hal
>     >>
>     >>> -Chandler
>     >>>
>     >>>
>     >>>
>     >>>
>     >>> Thanks again,
>     >>> Hal
>     >>>
>     >>> ----- Original Message -----
>     >>>> From: "Chandler Carruth" <chandlerc at gmail.com>
>     >>>> To: "Mehdi Amini" <mehdi.amini at apple.com>, "llvm-dev"
>     >>>> <llvm-dev at lists.llvm.org>
>     >>>> Cc: "Steve (Numerics) Canon" <scanon at apple.com>, "Sergey
>     >>>> Dmitrouk" <sdmitrouk at accesssoftek.com>, "David Majnemer"
>     >>>> <david.majnemer at gmail.com>, "Hal Finkel" <hfinkel at anl.gov>
>     >>>> Sent: Thursday, February 4, 2016 8:05:38 PM
>     >>>> Subject: Re: [RFC] FP Environment and Rounding mode handling in
>     >>>> LLVM
>     >>>>
>     >>>>
>     >>>> First, thanks Mehdi for putting something on llvm-dev and
>     >>>> getting wider awareness of this.
>     >>>>
>     >>>>
>     >>>> I am actually really interested in finding a way for LLVM to
>     >>>> support the interesting functionality we are missing from
>     >>>> fenv-like interfaces. Things like rounding modes, exceptions,
>     >>>> etc. However, I think the current design is going to be a
>     >>>> really high burden for the entire optimizer and I think there
>     >>>> is a simpler model that we might pursue instead.
>     >>>>
>     >>>>
>     >>>> I'll start off with some underlying principles that I'm
>     >>>> operating from:
>     >>>> a) Most code in the world will be very happy with the default
>     >>>> floating point environment, doesn't need to carefully model
>     >>>> floating point exceptions, etc. Essentially, I think that
>     >>>> LLVM's behavior today is probably right for most code. Now,
>     >>>> the code which needs support for the other features of
>     >>>> floating point isn't bad or unimportant! But it is relatively
>     >>>> speaking rare, and so I think it is reasonable to optimize the
>     >>>> *representation* model for the common case provided we don't
>     >>>> lose support for functionality.
>     >>>>
>     >>>>
>     >>>> b) When outside the default floating point environment's
>     >>>> rules, there are few if any optimizations that we
>     >>>> realistically expect from LLVM. Certainly, any changes to the
>     >>>> LLVM optimizer which impact code outside the default need to
>     >>>> be done *much* more carefully to avoid introducing subtle
>     >>>> bugs.
>     >>>>
>     >>>>
>     >>>> OK, based on that, consider the following model:
>     >>>> We provide intrinsics that mirror the instructions 'fadd',
>     >>>> 'fsub', 'fmul', 'fdiv', and 'frem' (so 5 total). From here on
>     >>>> out, I'll exclusively use 'fadd' as my examples. The
>     >>>> intrinsics would look like:
>     >>>>
>     >>>> declare {f32, i1} @llvm.fadd.with.environment.f32(f32 %lhs, f32 %rhs,
>     >>>>     i8 %rounding_mode, i8 %exception_behavior)
>     >>>>
>     >>>>
>     >>>> Then we define specific values to be used for the IEEE
>     >>>> rounding modes. And we define values to control exception
>     >>>> behavior. I'm not an expert on floating point exceptions in
>     >>>> particular (my platforms don't use them) but I'm imagining
>     >>>> three states "ignore", "return", and "trap". I've used a
>     >>>> single 'i1', but I'm assuming it would need to be several i1s
>     >>>> or an iN in order to model the set of FP exceptions. I'm using
>     >>>> i1 here just to simplify the explanation, I think it
>     >>>> generalizes and I'll let the experts suggest the exact
>     >>>> formulation.
>     >>>>
>     >>>>
>     >>>> If the default rounding mode is provided to these intrinsics
>     >>>> and the "ignore" exception behavior is provided, they behave
>     >>>> exactly as the existing instructions do, and instcombine
>     >>>> should canonicalize to the existing instructions.
>     >>>>
>     >>>>
>     >>>> The semantics of non-default rounding modes are to perform the
>     >>>> operation with that rounding mode.
>     >>>>
>     >>>>
>     >>>> If "return" is provided for the exception behavior, then the
>     >>>> i1 component of the result is true if an FP exception occurred
>     >>>> and false otherwise. If "ignore" is provided then any FP
>     >>>> exceptions are ignored and the i1 is always false. If "trap"
>     >>>> is provided then the i1 is always false, but the call to the
>     >>>> intrinsic might trap. We could either define a trap as
>     >>>> precisely the same as a call to @llvm.trap(), or we could
>     >>>> introduce an @llvm.fp.trap() and define it as a call to that.
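To make the three exception behaviors concrete, here is a small C++ model of the proposed `fadd` intrinsic built on `<cfenv>`. It is a sketch under several assumptions, not the proposed implementation: the `ExceptionBehavior` enum, `FAddResult`, and `kModeledExcepts` are hypothetical, the rounding mode is passed as one of the standard `FE_*` macros, "trap" is modeled with `std::abort()`, and inexact is deliberately excluded from the `i1`.

```cpp
#include <cassert>
#include <cfenv>
#include <cmath>
#include <cstdlib>
#include <limits>

// Hypothetical encoding of the three exception behaviors.
enum class ExceptionBehavior { Ignore, Return, Trap };

struct FAddResult {
  float value;
  bool exception;  // models the i1 component of the result
};

// Exceptions that count for the i1 in this sketch (inexact is left out on
// purpose, since almost every rounded operation raises it).
constexpr int kModeledExcepts =
    FE_INVALID | FE_DIVBYZERO | FE_OVERFLOW | FE_UNDERFLOW;

// Models the proposed fadd intrinsic on top of <cfenv>. The volatile
// temporaries keep the compiler from folding the addition or moving it
// across the environment calls.
inline FAddResult fadd_with_environment(float lhs, float rhs, int rounding_mode,
                                        ExceptionBehavior behavior) {
  std::fenv_t saved;
  std::feholdexcept(&saved);  // save environment, clear flags, do not trap
  std::fesetround(rounding_mode);

  volatile float a = lhs;
  volatile float b = rhs;
  volatile float sum = a + b;

  const bool raised = std::fetestexcept(kModeledExcepts) != 0;
  std::fesetenv(&saved);  // restore the prior rounding mode and flags

  if (behavior == ExceptionBehavior::Trap && raised) std::abort();
  return {sum, behavior == ExceptionBehavior::Return && raised};
}
```

Note that the ambient environment is restored before returning, so the only way the caller learns about an exception is through the returned `bool`, mirroring the "no implicit state" goal of the proposal.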
>     >>>>
>     >>>>
>     >>>> The frontend would then be responsible for lowering floating
>     >>>> point arithmetic using these intrinsics. This may be somewhat
>     >>>> challenging because in the frontend behavior is controlled
>     >>>> dynamically in some languages. In those situations, we can
>     >>>> either allow these intrinsics to accept non-constant arguments
>     >>>> for %rounding_mode and %exception_behavior so that frontends
>     >>>> can emit code that just dynamically computes them, or we could
>     >>>> follow the same model that atomics use, and if the frontend
>     >>>> cannot trivially compute a constant, it can emit a switch over
>     >>>> the possible states with a specific intrinsic call in each
>     >>>> case. I don't have strong opinions about which would be best,
>     >>>> I think either could be made to work.
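The "switch over the possible states" option can be illustrated in C++, with a template parameter standing in for a constant intrinsic argument. `fadd_const_mode` and `fadd_dynamic_mode` are hypothetical names, and the use of `<cfenv>` to actually perform the rounding is an assumption made only so that the sketch runs.

```cpp
#include <cassert>
#include <cfenv>
#include <cmath>
#include <cstdlib>

// Each instantiation stands in for "an intrinsic call with a constant mode".
template <int Mode>
float fadd_const_mode(float lhs, float rhs) {
  const int saved = std::fegetround();
  std::fesetround(Mode);
  // volatile keeps the addition from being folded or moved.
  volatile float a = lhs;
  volatile float b = rhs;
  volatile float sum = a + b;
  std::fesetround(saved);
  return sum;
}

// What a frontend might emit when the mode is only known at run time:
// one branch per possible state, each with a constant mode.
inline float fadd_dynamic_mode(float lhs, float rhs, int mode) {
  switch (mode) {
    case FE_TONEAREST:
      return fadd_const_mode<FE_TONEAREST>(lhs, rhs);
    case FE_UPWARD:
      return fadd_const_mode<FE_UPWARD>(lhs, rhs);
    case FE_DOWNWARD:
      return fadd_const_mode<FE_DOWNWARD>(lhs, rhs);
    case FE_TOWARDZERO:
      return fadd_const_mode<FE_TOWARDZERO>(lhs, rhs);
    default:
      std::abort();  // unknown mode: nothing sensible to dispatch to
  }
}
```

The cost of this approach is code size (one call per mode at every dynamic site); the benefit is that every call the optimizer sees carries a constant mode.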
>     >>>>
>     >>>>
>     >>>> If we go with constant arguments being required, we could use
>     >>>> "metadata arguments" which aren't actually metadata but just
>     >>>> encoded arguments for intrinsics.
>     >>>>
>     >>>>
>     >>>> When emitting constants and trying to respect floating point
>     >>>> environment settings, frontends will have to emit runtime
>     >>>> calls instead of actual constants. But this seems actually
>     >>>> good because that is what we'll need anyways -- we aren't able
>     >>>> to with full generality emulate all the environment options if
>     >>>> I understand things correctly (and let me know if I've
>     >>>> misunderstood).
>     >>>>
>     >>>>
>     >>>>
>     >>>>
>     >>>> The two really big reasons why I like this model much more
>     >>>> than extending flags are:
>     >>>>
>     >>>>
>     >>>> 1) This avoids implicit state. The implicit state of the
>     >>>> floating point environment makes things like code motion
>     >>>> extremely hard to reason about. I think we will just get it
>     >>>> wrong too often to make this a good approach. By modeling all
>     >>>> of this as actual SSA values I think there is a much better
>     >>>> chance we'll get this stuff right. For example by or-ing all
>     >>>> the i1s for floating point exceptions and testing the result
>     >>>> to implement fetestexcept. Then the backend can spill the
>     >>>> state when necessary and reload it when needed even if other
>     >>>> floating point math is introduced. I admit that first class
>     >>>> aggregate returns aren't a beautiful way to encapsulate this,
>     >>>> but they are an *effective* way that we know how to work with
>     >>>> in the LLVM IR. If we ever come up with a better multi-def
>     >>>> model, we can always switch these and all the other intrinsics
>     >>>> which need this to that model.
>     >>>>
>     >>>>
>     >>>> 2) Every pass will conservatively correctly model the
>     >>>> operations. This is most significant when modeling trapping on
>     >>>> exceptions. We need every pass to realize that control flow
>     >>>> might not proceed past such operations. We already have this
>     >>>> logic for calls, and it seems a really nice fit for allowing
>     >>>> most of the optimizer to be unaware of these constructs while
>     >>>> respecting them and preserving behavior in the face of them.
>     >>>>
>     >>>>
>     >>>>
>     >>>>
>     >>>> I suspect that there are things this model doesn't handle that
>     >>>> I've not thought of (as this is outside the area of FP that
>     >>>> I'm deeply familiar with), but I really think this model would
>     >>>> be easier to reason about and would be much less invasive
>     >>>> within the IR and optimizer. I wonder if folks think this
>     >>>> could work and would be up for moving their efforts in this
>     >>>> direction?
>     >>>>
>     >>>>
>     >>>> -Chandler
>     >>>>
>     >>>>
>     >>>> On Wed, Feb 3, 2016 at 3:04 PM Mehdi Amini
>     >>>> <mehdi.amini at apple.com> wrote:
>     >>>>
>     >>>>
>     >>>> Hi everyone,
>     >>>>
>     >>>> Sergey (CC’ed) worked on a series of patches to add support
>     >>>> for floating-point environment and floating-point rounding
>     >>>> modes in LLVM. This started *in 2014* and the patches, after
>     >>>> multiple rounds of review in the last months (involving
>     >>>> amongst others Steve Canon, Hal Finkel, David Majnemer, and
>     >>>> myself), are getting very close (IMO) to being in a state
>     >>>> where we can land them.
>     >>>>
>     >>>> This is the thread that started this development: "[LLVMdev]
>     >>>> More careful treatment of floating point exceptions"
>     >>>> http://marc.info/?l=llvm-dev&m=141113983302113&w=2
>     >>>> And this is the thread where most of the discussion on the
>     >>>> design occurred: "[PATCH] Flag to enable IEEE-754 friendly FP
>     >>>> optimizations"
>     >>>> http://marc.info/?l=llvm-commits&m=141235814915999&w=2
>     >>>>