thr3ads.net - llvm dev - [llvm-dev] RFC: Consider changing the semantics of 'fast' flag implying all fast-math-flags [Nov 2016]

If this information is useful, please help other people find it:
Share via:

Hal Finkel via llvm-dev

2016-Nov-16 07:59 UTC

[llvm-dev] RFC: Consider changing the semantics of 'fast' flag implying all fast-math-flags

----- Original Message -----
> From: "Mehdi Amini via llvm-dev" <llvm-dev at
lists.llvm.org>
> To: "Warren Ristow" <warren.ristow at sony.com>
> Cc: llvm-dev at lists.llvm.org
> Sent: Tuesday, November 15, 2016 11:10:48 PM
> Subject: Re: [llvm-dev] RFC: Consider changing the semantics of
> 'fast' flag implying all fast-math-flags
> Hi,
> > On Nov 15, 2016, at 5:15 PM, Ristow, Warren via llvm-dev <
> > llvm-dev at lists.llvm.org > wrote:
> 
> > Hi all,
> 
> > This is about https://reviews.llvm.org/D26708
> 
> > Currently when the command-line switch '-ffast-math' is
specified,
> > the
> 
> > IR-level fast-math-flag 'fast' gets attached to appropriate FP
math
> 
> > instructions. That flag acts as an "umbrella" to implicitly
turn on
> > all the
> 
> > other fast-math-flags ('nnan', 'ninf', 'nsz'
and 'arcp'):
> 
> > http://llvm.org/docs/LangRef.html#fast-math-flags
> 
> > This approach has the shortcoming that when there is a desire to
> > disable one
> 
> > of these fast-math-flags, if the 'fast' flag remains, it
implicitly
> 
> > re-enables the one being disabled. For example, compiling this
> > test-case:
> 
> > extern void use(float x, float y);
> 
> > void test(float a, float b, float c) {
> 
> > float q1 = a / c;
> 
> > float q2 = b / c;
> 
> > use(q1, q2);
> 
> > }
> 
> > at '-O2 -ffast-math' does a reciprocal-transformation, so only
one
> > division
> 
> > is done (as desired with fast-math). Compiling it with:
> 
> > -O2 -ffast-math -fno-reciprocal-math
> 
> > should disable the reciprocal transformations (the flag
'arcp'),
> > but
> > leave
> 
> > all the other fast-math transformations enabled. The current
> > implementation
> 
> > doesn't do that, since the 'fast' IR-level flag still gets
set.
> 
> > Motivation of this discussion:
> > https://llvm.org/bugs/show_bug.cgi?id=27372#c2
> 
> > As an aside, when '-ffast-math' is specified on the
command-line,
> > the
> 
> > following six switches are all passed to cc1:
> 
> > -menable-no-infs
> 
> > -menable-no-nans
> 
> > -fno-signed-zeros
> 
> > -freciprocal-math
> 
> > -fno-trapping-math
> 
> > -ffp-contract=fast
> 
> > and '-ffast-math' itself is also passed cc1 (the act of
passing
> > '-ffast-math'
> 
> > to cc1 results in the macro '__FAST_MATH__' being defined).
When
> > (for
> 
> > example) '-fno-reciprocal-math' is passed in addition to
> > '-ffast-math', then
> 
> > '-freciprocal-math' is no longer passed to cc1 (and the other
five
> > listed
> 
> > above still are passed, along with '-ffast-math' still being
> > passed).
> > It
> 
> > seems like the intention was that these individual switches were to
> > enable
> 
> > the individual floating-point transformations (and so the lack of
> > any
> > of
> 
> > those switches would suppress the relevant transformations), but
> > the
> 
> > '-ffast-math' "umbrella" is over-riding the
attempted suppression.
> 
> Sure, this looks like a bug, disable an individual fast-math flag on
> the command line should be possible and override a prior -ffast-math
> (usually the last one on the command line “wins”/override).
> The Cfe-dev mailing list would be more appropriate to discuss the
> behavior of clang command line flags though.
> > The change proposed at https://reviews.llvm.org/D26708 deals with
> > this issue
> 
> This patch seems to modify on LLVM, it does not deal at all with the
> issue you describe above.
> I don’t see why the issue with the clang command line flags need to
> be dealt with at the LLVM level.
> > just for the reciprocal-transformation case, but it changes the
> > semantics of
> 
> > the 'fast' IR-level flag so that it no longer implies all the
> > others.
> 
> The starting point for any change is:
> http://llvm.org/docs/LangRef.html#fast-math-flags
> You would need to write a new definition for what “fast” would mean.
> However I don’t need anything need to be changed here to address the
> use-case you want to fix.I suspect that we want to start by getting rid of 'fast' on the IR level
and replacing it with individual flags for the various optimization classes - Do
we have only allowing reassociation and libm optimizations? Then we can readjust
the Clang flags in a straightforward way.

-Hal 
> > With
> 
> > that proposed approach, rather than an "umbrella" flag such
as
> > 'fast'
> > being
> 
> > checked in the back-end (along with an individual flag like
> > 'arcp'),
> > just
> 
> > checking the individual flag ('arcp') would be done.
> 
> There is already no need to check the “fast” *and* arcp flag, if a
> transformation is about reciprocal, then you only need to check arcp
> (fast implies arcp, checking for fast would be redundant).
> Be careful also that the fast-math flags are mainly an IR level
> definition, the backend only inherited these per instruction flag
> very recently. It has been entirely converted to use these, and it
> still uses a global flag in some places.
> The line you’re touching in your patch for instance is about this
> legacy:
> if (!UnsafeMath && !Flags->hasAllowReciprocal())
> The first flag is the global “fast-math” mode on the backend, which
> is not as fine grain as the per-instruction model.
> The second flag is the “per instruction” flag, which is the model we
> aim at.
> We should get rid of the “global” UnsafeMath in the backend, but that
> does not require any change to the IR or the individual fast-math
> flags.
> > Any fast-math-related
> 
> > transformation that doesn't have an individual flag (e.g.,
> > re-association
> 
> > currently doesn't), should eventually have an individual flag
> > defined
> > for
> 
> > it, and then that individual flag should be checked.
> 
> > What do people think?
> 
> I think these are valuable problems to solve, but you should tackle
> them piece by piece:
> 1) the clang part of overriding the individual FMF and emitting the
> right IR is the first thing to fix.
> 2) the backend is still using the global UnsafeFPMath and it should
> be killed.
> Hope this makes sense.
> —
> Mehdi
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
-- 

Hal Finkel 
Lead, Compiler Technology and Programming Languages 
Leadership Computing Facility 
Argonne National Laboratory 
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20161116/d3a077b2/attachment.html>

Sanjay Patel via llvm-dev

2016-Nov-16 16:02 UTC

head link

[llvm-dev] RFC: Consider changing the semantics of 'fast' flag implying all fast-math-flags

I know Hal and Mehdi are already aware, but for the wider audience and back
reference, we recently had some discussions about the broken state of
fast-math in:
https://reviews.llvm.org/D24816 ([Target] move reciprocal estimate settings
from TargetOptions to TargetLowering)
https://reviews.llvm.org/D24815 ([clang] make reciprocal estimate codegen a
function attribute)

Starting by fixing the IR flags is likely the best first step because if we
fix the backend first, we may expose more bugs-on-top-of-bugs as described
here:
https://llvm.org/bugs/show_bug.cgi?id=27372#c5

But we may need a bigger change than just removing the umbrella effect of
'fast':
When I started looking at how we could move the reciprocal-estimate setting
(this is different than 'arcp') to instructions, I noted that the
existing
FMF uses Value's SubclassOptionalData.

I'm guessing this is for maximum performance, but SubclassOptionalData is
exactly 7 bits, and we've already burned 5 of them. So if we need more
refinement than 'fast' / 'aggr' to express what's possible
via the
front-end flags, we either need to expand SubclassOptionalData, add a
custom field for FMF, or move FMF to metadata or attributes. I'm assuming
the first 2 options are much larger undertakings than the third.

For reciprocal-estimates, using the existing FMF/SubclassOptionalData isn't
a viable solution because there's no way to express all of the front-end
info (number of refinement steps, etc) in 2 bits. So what about metadata?
We already have an instruction-level metadata type called "MD_fpmath".
AFAICT, it's currently only used to indicate a reduced accuracy operation
for AMDGPU. So I was hoping to just add an option to that metadata type for
reciprocal-estimates.

I realize that metadata is optional (ie, it can be dropped without
affecting correctness). I think that's fine for reciprocal-estimates, but
I'm not sure. Are -ffast-math and friends optional optimization flags? Or
do we consider those non-optional modes of execution? If they are not
optional, then should we have instruction-level attributes for FMF? This
possibility came up when we enabled FMF on calls ( https://reviews.llvm.org/
D14707 ).

Apologies for raising too many issues at once, but consider that another
potential subclass of relaxed-FP-math was recently discussed here:
https://reviews.llvm.org/D26602

Is a "distributive" transform different than an
"associative" transform?


On Wed, Nov 16, 2016 at 12:59 AM, Hal Finkel via llvm-dev <
llvm-dev at lists.llvm.org> wrote:
>
> ------------------------------
>
> *From: *"Mehdi Amini via llvm-dev" <llvm-dev at
lists.llvm.org>
> *To: *"Warren Ristow" <warren.ristow at sony.com>
> *Cc: *llvm-dev at lists.llvm.org
> *Sent: *Tuesday, November 15, 2016 11:10:48 PM
> *Subject: *Re: [llvm-dev] RFC: Consider changing the semantics of
'fast'
> flag implying all fast-math-flags
>
>
> Hi,
>
> On Nov 15, 2016, at 5:15 PM, Ristow, Warren via llvm-dev <
> llvm-dev at lists.llvm.org> wrote:
>
> Hi all,
>
> This is about https://reviews.llvm.org/D26708
>
> Currently when the command-line switch '-ffast-math' is specified,
the
> IR-level fast-math-flag 'fast' gets attached to appropriate FP math
> instructions.  That flag acts as an "umbrella" to implicitly turn
on all
> the
> other fast-math-flags ('nnan', 'ninf', 'nsz' and
'arcp'):
>
> http://llvm.org/docs/LangRef.html#fast-math-flags
>
> This approach has the shortcoming that when there is a desire to disable
> one
> of these fast-math-flags, if the 'fast' flag remains, it implicitly
> re-enables the one being disabled.  For example, compiling this test-case:
>
>     extern void use(float x, float y);
>     void test(float a, float b, float c) {
>       float q1 = a / c;
>       float q2 = b / c;
>       use(q1, q2);
>     }
>
> at '-O2 -ffast-math' does a reciprocal-transformation, so only one
division
> is done (as desired with fast-math).  Compiling it with:
>
>   -O2 -ffast-math -fno-reciprocal-math
>
> should disable the reciprocal transformations (the flag 'arcp'),
but leave
> all the other fast-math transformations enabled.  The current
> implementation
> doesn't do that, since the 'fast' IR-level flag still gets set.
>
> Motivation of this discussion: https://llvm.org/b
> ugs/show_bug.cgi?id=27372#c2
>
> As an aside, when '-ffast-math' is specified on the command-line,
the
> following six switches are all passed to cc1:
>
>     -menable-no-infs
>     -menable-no-nans
>     -fno-signed-zeros
>     -freciprocal-math
>     -fno-trapping-math
>     -ffp-contract=fast
>
> and '-ffast-math' itself is also passed cc1 (the act of passing
> '-ffast-math'
> to cc1 results in the macro '__FAST_MATH__' being defined).  When
(for
> example) '-fno-reciprocal-math' is passed in addition to
'-ffast-math',
> then
> '-freciprocal-math' is no longer passed to cc1 (and the other five
listed
> above still are passed, along with '-ffast-math' still being
passed).  It
> seems like the intention was that these individual switches were to enable
> the individual floating-point transformations (and so the lack of any of
> those switches would suppress the relevant transformations), but the
> '-ffast-math' "umbrella" is over-riding the attempted
suppression.
>
>
> Sure, this looks like a bug, disable an individual fast-math flag on the
> command line should be possible and override a prior -ffast-math (usually
> the last one on the command line “wins”/override).
>
> The Cfe-dev mailing list would be more appropriate to discuss the behavior
> of clang command line flags though.
>
>
>
> The change proposed at https://reviews.llvm.org/D26708 deals with this
> issue
>
>
> This patch seems to modify on LLVM, it does not deal at all with the issue
> you describe above.
> I don’t see why the issue with the clang command line flags need to be
> dealt with at the LLVM level.
>
>
> just for the reciprocal-transformation case, but it changes the semantics
> of
> the 'fast' IR-level flag so that it no longer implies all the
others.
>
>
> The starting point for any change is: http://llvm.org/docs/LangRef.h
> tml#fast-math-flags
> You would need to write a new definition for what “fast” would mean.
>
> However I don’t need anything need to be changed here to address the
> use-case you want to fix.
>
> I suspect that we want to start by getting rid of 'fast' on the IR
level
> and replacing it with individual flags for the various optimization classes
> - Do we have only allowing reassociation and libm optimizations? Then we
> can readjust the Clang flags in a straightforward way.
>
>  -Hal
>
>
>
> With
> that proposed approach, rather than an "umbrella" flag such as
'fast' being
> checked in the back-end (along with an individual flag like
'arcp'), just
> checking the individual flag ('arcp') would be done.
>
>
> There is already no need to check the “fast” *and* arcp flag, if a
> transformation is about reciprocal, then you only need to check arcp (fast
> implies arcp, checking for fast would be redundant).
>
> Be careful also that the fast-math flags are mainly an IR level
> definition, the backend only inherited these per instruction flag very
> recently. It has been entirely converted to use these, and it still uses
> a global flag in some places.
> The line you’re touching in your patch for instance is about this legacy:
>
>   if (!UnsafeMath && !Flags->hasAllowReciprocal())
>
> The first flag is the global “fast-math” mode on the backend, which is not
> as fine grain as the per-instruction model.
> The second flag is the “per instruction” flag, which is the model we aim
> at.
>
> We should get rid of the “global” UnsafeMath in the backend, but that does
> not require any change to the IR or the individual fast-math flags.
>
>
> Any fast-math-related
> transformation that doesn't have an individual flag (e.g.,
re-association
> currently doesn't), should eventually have an individual flag defined
for
> it, and then that individual flag should be checked.
>
> What do people think?
>
>
> I think these are valuable problems to solve, but you should tackle them
> piece by piece:
>
> 1) the clang part of overriding the individual FMF and emitting the right
> IR is the first thing to fix.
> 2) the backend is still using the global UnsafeFPMath and it should be
> killed.
>
> Hope this makes sense.
>
> —
> Mehdi
>
>
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>
>
>
>
> --
> Hal Finkel
> Lead, Compiler Technology and Programming Languages
> Leadership Computing Facility
> Argonne National Laboratory
>
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>
>-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20161116/12f7b288/attachment-0001.html>

Nicolai Hähnle via llvm-dev

2016-Nov-16 16:38 UTC

head link

[llvm-dev] RFC: Consider changing the semantics of 'fast' flag implying all fast-math-flags

I still haven't sorted through all this, and I have to run, but as a 
general rule, can we define flags in a positive sense? E.g. NoInfsFPMath 
is currently defined in the comment as

     /// NoInfsFPMath - This flag is enabled when the
     /// -enable-no-infs-fp-math flag is specified on the command line. When
     /// this flag is off (the default), the code generator is not 
allowed to
     /// assume the FP arithmetic arguments and results are never +-Infs.

I find it easier to parse

     /// NoInfsFPMath - This flag is enabled when the
     /// -enable-no-infs-fp-math flag is specified on the command line. When
     /// this flag is on, the code generator may assume that FP
     /// arithmetic arguments and results are never +-Infs.

On 16.11.2016 17:02, Sanjay Patel via llvm-dev wrote:> I realize that metadata is optional (ie, it can be dropped without
> affecting correctness). I think that's fine for reciprocal-estimates,
> but I'm not sure. Are -ffast-math and friends optional optimization
> flags? Or do we consider those non-optional modes of execution? If they
> are not optional, then should we have instruction-level attributes for
> FMF? This possibility came up when we enabled FMF on calls (
> https://reviews.llvm.org/D14707 <https://reviews.llvm.org/D14707> ).
Well, for AMDGPU we'd prefer the metadata not be dropped, of course. But 
despite that I'd say they should be considered optional hints that give 
more freedom to the optimizer. Once you enable those flags, you're 
pretty much saying goodbye to reproducibility anyway, because different 
compilers will absolutely treat them differently.

> Apologies for raising too many issues at once, but consider that another
> potential subclass of relaxed-FP-math was recently discussed here:
> https://reviews.llvm.org/D26602
>
> Is a "distributive" transform different than an
"associative" transform?
Probably not, but the name is misleading :)

Cheers,
Nicolai
>
>
> On Wed, Nov 16, 2016 at 12:59 AM, Hal Finkel via llvm-dev
> <llvm-dev at lists.llvm.org <mailto:llvm-dev at
lists.llvm.org>> wrote:
>
>
>    
------------------------------------------------------------------------
>
>         *From: *"Mehdi Amini via llvm-dev" <llvm-dev at
lists.llvm.org
>         <mailto:llvm-dev at lists.llvm.org>>
>         *To: *"Warren Ristow" <warren.ristow at sony.com
>         <mailto:warren.ristow at sony.com>>
>         *Cc: *llvm-dev at lists.llvm.org <mailto:llvm-dev at
lists.llvm.org>
>         *Sent: *Tuesday, November 15, 2016 11:10:48 PM
>         *Subject: *Re: [llvm-dev] RFC: Consider changing the semantics
>         of 'fast' flag implying all fast-math-flags
>
>
>         Hi,
>
>             On Nov 15, 2016, at 5:15 PM, Ristow, Warren via llvm-dev
>             <llvm-dev at lists.llvm.org <mailto:llvm-dev at
lists.llvm.org>>
>             wrote:
>
>             Hi all,
>
>             This is about https://reviews.llvm.org/D26708
>             <https://reviews.llvm.org/D26708>
>
>             Currently when the command-line switch '-ffast-math' is
>             specified, the
>             IR-level fast-math-flag 'fast' gets attached to
appropriate
>             FP math
>             instructions.  That flag acts as an "umbrella" to
implicitly
>             turn on all the
>             other fast-math-flags ('nnan', 'ninf',
'nsz' and 'arcp'):
>
>             http://llvm.org/docs/LangRef.html#fast-math-flags
>             <http://llvm.org/docs/LangRef.html#fast-math-flags>
>
>             This approach has the shortcoming that when there is a
>             desire to disable one
>             of these fast-math-flags, if the 'fast' flag remains,
it
>             implicitly
>             re-enables the one being disabled.  For example, compiling
>             this test-case:
>
>                 extern void use(float x, float y);
>                 void test(float a, float b, float c) {
>                   float q1 = a / c;
>                   float q2 = b / c;
>                   use(q1, q2);
>                 }
>
>             at '-O2 -ffast-math' does a reciprocal-transformation,
so
>             only one division
>             is done (as desired with fast-math).  Compiling it with:
>
>               -O2 -ffast-math -fno-reciprocal-math
>
>             should disable the reciprocal transformations (the flag
>             'arcp'), but leave
>             all the other fast-math transformations enabled.  The
>             current implementation
>             doesn't do that, since the 'fast' IR-level flag
still gets set.
>
>             Motivation of this
>             discussion: https://llvm.org/bugs/show_bug.cgi?id=27372#c2
>             <https://llvm.org/bugs/show_bug.cgi?id=27372#c2>
>
>             As an aside, when '-ffast-math' is specified on the
>             command-line, the
>             following six switches are all passed to cc1:
>
>                 -menable-no-infs
>                 -menable-no-nans
>                 -fno-signed-zeros
>                 -freciprocal-math
>                 -fno-trapping-math
>                 -ffp-contract=fast
>
>             and '-ffast-math' itself is also passed cc1 (the act of
>             passing '-ffast-math'
>             to cc1 results in the macro '__FAST_MATH__' being
defined).
>             When (for
>             example) '-fno-reciprocal-math' is passed in addition
to
>             '-ffast-math', then
>             '-freciprocal-math' is no longer passed to cc1 (and the
>             other five listed
>             above still are passed, along with '-ffast-math' still
being
>             passed).  It
>             seems like the intention was that these individual switches
>             were to enable
>             the individual floating-point transformations (and so the
>             lack of any of
>             those switches would suppress the relevant transformations),
>             but the
>             '-ffast-math' "umbrella" is over-riding the
attempted
>             suppression.
>
>
>         Sure, this looks like a bug, disable an individual fast-math
>         flag on the command line should be possible and override a prior
>         -ffast-math (usually the last one on the command line
>         “wins”/override).
>
>         The Cfe-dev mailing list would be more appropriate to discuss
>         the behavior of clang command line flags though.
>
>
>
>             The change proposed at https://reviews.llvm.org/D26708
>             <https://reviews.llvm.org/D26708> deals with this issue
>
>
>         This patch seems to modify on LLVM, it does not deal at all with
>         the issue you describe above.
>         I don’t see why the issue with the clang command line flags need
>         to be dealt with at the LLVM level.
>
>
>             just for the reciprocal-transformation case, but it changes
>             the semantics of
>             the 'fast' IR-level flag so that it no longer implies
all
>             the others.
>
>
>         The starting point for any change is:
>         http://llvm.org/docs/LangRef.html#fast-math-flags
>         <http://llvm.org/docs/LangRef.html#fast-math-flags>
>         You would need to write a new definition for what “fast” would
mean.
>
>         However I don’t need anything need to be changed here to address
>         the use-case you want to fix.
>
>     I suspect that we want to start by getting rid of 'fast' on the
IR
>     level and replacing it with individual flags for the various
>     optimization classes - Do we have only allowing reassociation and
>     libm optimizations? Then we can readjust the Clang flags in a
>     straightforward way.
>
>      -Hal
>
>
>
>             With
>             that proposed approach, rather than an "umbrella"
flag such
>             as 'fast' being
>             checked in the back-end (along with an individual flag like
>             'arcp'), just
>             checking the individual flag ('arcp') would be done.
>
>
>         There is already no need to check the “fast” *and* arcp flag, if
>         a transformation is about reciprocal, then you only need to
>         check arcp (fast implies arcp, checking for fast would be
>         redundant).
>
>         Be careful also that the fast-math flags are mainly an IR level
>         definition, the backend only inherited these per instruction
>         flag very recently. It has been entirely converted to use these,
>         and it still uses a global flag in some places.
>         The line you’re touching in your patch for instance is about
>         this legacy:
>
>           if (!UnsafeMath && !Flags->hasAllowReciprocal())
>
>         The first flag is the global “fast-math” mode on the backend,
>         which is not as fine grain as the per-instruction model.
>         The second flag is the “per instruction” flag, which is the
>         model we aim at.
>
>         We should get rid of the “global” UnsafeMath in the backend, but
>         that does not require any change to the IR or the individual
>         fast-math flags.
>
>
>             Any fast-math-related
>             transformation that doesn't have an individual flag (e.g.,
>             re-association
>             currently doesn't), should eventually have an individual
>             flag defined for
>             it, and then that individual flag should be checked.
>
>             What do people think?
>
>
>         I think these are valuable problems to solve, but you should
>         tackle them piece by piece:
>
>         1) the clang part of overriding the individual FMF and emitting
>         the right IR is the first thing to fix.
>         2) the backend is still using the global UnsafeFPMath and it
>         should be killed.
>
>         Hope this makes sense.
>
>         —
>         Mehdi
>
>
>         _______________________________________________
>         LLVM Developers mailing list
>         llvm-dev at lists.llvm.org <mailto:llvm-dev at
lists.llvm.org>
>         http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>         <http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev>
>
>
>
>
>     --
>     Hal Finkel
>     Lead, Compiler Technology and Programming Languages
>     Leadership Computing Facility
>     Argonne National Laboratory
>
>     _______________________________________________
>     LLVM Developers mailing list
>     llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>
>     http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>     <http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev>
>
>
>
>
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>

Mehdi Amini via llvm-dev

2016-Nov-16 17:03 UTC

head link

[llvm-dev] RFC: Consider changing the semantics of 'fast' flag implying all fast-math-flags

> On Nov 15, 2016, at 11:59 PM, Hal Finkel <hfinkel at anl.gov> wrote:
> 
> 
> From: "Mehdi Amini via llvm-dev" <llvm-dev at
lists.llvm.org>
> To: "Warren Ristow" <warren.ristow at sony.com>
> Cc: llvm-dev at lists.llvm.org
> Sent: Tuesday, November 15, 2016 11:10:48 PM
> Subject: Re: [llvm-dev] RFC: Consider changing the semantics of
'fast' flag implying all fast-math-flags
> 
> Hi,
> 
> On Nov 15, 2016, at 5:15 PM, Ristow, Warren via llvm-dev <llvm-dev at
lists.llvm.org <mailto:llvm-dev at lists.llvm.org>> wrote:
> 
> Hi all,
>  
> This is about https://reviews.llvm.org/D26708
<https://reviews.llvm.org/D26708>
>  
> Currently when the command-line switch '-ffast-math' is specified,
the
> IR-level fast-math-flag 'fast' gets attached to appropriate FP math
> instructions.  That flag acts as an "umbrella" to implicitly turn
on all the
> other fast-math-flags ('nnan', 'ninf', 'nsz' and
'arcp'):
>  
> http://llvm.org/docs/LangRef.html#fast-math-flags
<http://llvm.org/docs/LangRef.html#fast-math-flags>
>  
> This approach has the shortcoming that when there is a desire to disable
one
> of these fast-math-flags, if the 'fast' flag remains, it implicitly
> re-enables the one being disabled.  For example, compiling this test-case:
>  
>     extern void use(float x, float y);
>     void test(float a, float b, float c) {
>       float q1 = a / c;
>       float q2 = b / c;
>       use(q1, q2);
>     }
>  
> at '-O2 -ffast-math' does a reciprocal-transformation, so only one
division
> is done (as desired with fast-math).  Compiling it with:
>  
>   -O2 -ffast-math -fno-reciprocal-math
>  
> should disable the reciprocal transformations (the flag 'arcp'),
but leave
> all the other fast-math transformations enabled.  The current
implementation
> doesn't do that, since the 'fast' IR-level flag still gets set.
>  
> Motivation of this discussion:
https://llvm.org/bugs/show_bug.cgi?id=27372#c2
<https://llvm.org/bugs/show_bug.cgi?id=27372#c2>
>  
> As an aside, when '-ffast-math' is specified on the command-line,
the
> following six switches are all passed to cc1:
>  
>     -menable-no-infs
>     -menable-no-nans
>     -fno-signed-zeros
>     -freciprocal-math
>     -fno-trapping-math
>     -ffp-contract=fast
>  
> and '-ffast-math' itself is also passed cc1 (the act of passing
'-ffast-math'
> to cc1 results in the macro '__FAST_MATH__' being defined).  When
(for
> example) '-fno-reciprocal-math' is passed in addition to
'-ffast-math', then
> '-freciprocal-math' is no longer passed to cc1 (and the other five
listed
> above still are passed, along with '-ffast-math' still being
passed).  It
> seems like the intention was that these individual switches were to enable
> the individual floating-point transformations (and so the lack of any of
> those switches would suppress the relevant transformations), but the
> '-ffast-math' "umbrella" is over-riding the attempted
suppression.
> 
> Sure, this looks like a bug, disable an individual fast-math flag on the
command line should be possible and override a prior -ffast-math (usually the
last one on the command line “wins”/override).
> 
> The Cfe-dev mailing list would be more appropriate to discuss the behavior
of clang command line flags though.
> 
> 
>  
> The change proposed at https://reviews.llvm.org/D26708
<https://reviews.llvm.org/D26708> deals with this issue
> 
> This patch seems to modify on LLVM, it does not deal at all with the issue
you describe above.
> I don’t see why the issue with the clang command line flags need to be
dealt with at the LLVM level.
> 
> 
> just for the reciprocal-transformation case, but it changes the semantics
of
> the 'fast' IR-level flag so that it no longer implies all the
others.
> 
> The starting point for any change is:
http://llvm.org/docs/LangRef.html#fast-math-flags
<http://llvm.org/docs/LangRef.html#fast-math-flags>
> You would need to write a new definition for what “fast” would mean.
> 
> However I don’t need anything need to be changed here to address the
use-case you want to fix.
> I suspect that we want to start by getting rid of 'fast' on the IR
level and replacing it with individual flags for the various optimization
classes - Do we have only allowing reassociation and libm optimizations? Then we
can readjust the Clang flags in a straightforward way.
Individual flags for various optimization classes make sense only if you don’t
end up with a lot of very specialized new flags.
If a single “reassociate” flag could be enough to complete the existing and
replace the “fast” that would be great.
But some auditing of all the users of “fast" would be needed first. For
instance is "X * log2(0.5*Y) = X*log2(Y) - X” covered by “reassociation”?
That seems a bit more than what people think about with reassociation at first.

— 
Mehdi
> 
> 
> 
> With
> that proposed approach, rather than an "umbrella" flag such as
'fast' being
> checked in the back-end (along with an individual flag like
'arcp'), just
> checking the individual flag ('arcp') would be done. 
> 
> There is already no need to check the “fast” *and* arcp flag, if a
transformation is about reciprocal, then you only need to check arcp (fast
implies arcp, checking for fast would be redundant).
> 
> Be careful also that the fast-math flags are mainly an IR level definition,
the backend only inherited these per instruction flag very recently. It has been
entirely converted to use these, and it still uses a global flag in some places.
> The line you’re touching in your patch for instance is about this legacy:
> 
>   if (!UnsafeMath && !Flags->hasAllowReciprocal())
> 
> The first flag is the global “fast-math” mode on the backend, which is not
as fine grain as the per-instruction model.
> The second flag is the “per instruction” flag, which is the model we aim
at.
> 
> We should get rid of the “global” UnsafeMath in the backend, but that does
not require any change to the IR or the individual fast-math flags.
> 
> 
> Any fast-math-related
> transformation that doesn't have an individual flag (e.g.,
re-association
> currently doesn't), should eventually have an individual flag defined
for
> it, and then that individual flag should be checked.
>  
> What do people think?
> 
> I think these are valuable problems to solve, but you should tackle them
piece by piece:
> 
> 1) the clang part of overriding the individual FMF and emitting the right
IR is the first thing to fix.
> 2) the backend is still using the global UnsafeFPMath and it should be
killed.
> 
> Hope this makes sense.
> 
> — 
> Mehdi
> 
> 
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
<http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev>
> 
> 
> 
> -- 
> Hal Finkel
> Lead, Compiler Technology and Programming Languages
> Leadership Computing Facility
> Argonne National Laboratory
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20161116/209d0c85/attachment.html>

Hal Finkel via llvm-dev

2016-Nov-16 21:14 UTC

head link

[llvm-dev] RFC: Consider changing the semantics of 'fast' flag implying all fast-math-flags

----- Original Message -----
> From: "Mehdi Amini" <mehdi.amini at apple.com>
> To: "Hal Finkel" <hfinkel at anl.gov>
> Cc: llvm-dev at lists.llvm.org, "Warren Ristow" <warren.ristow
at sony.com>
> Sent: Wednesday, November 16, 2016 11:03:48 AM
> Subject: Re: [llvm-dev] RFC: Consider changing the semantics of
> 'fast' flag implying all fast-math-flags
> > On Nov 15, 2016, at 11:59 PM, Hal Finkel < hfinkel at anl.gov >
wrote:
> 
> > ----- Original Message -----
> 
> > > From: "Mehdi Amini via llvm-dev" < llvm-dev at
lists.llvm.org >
> > 
> 
> > > To: "Warren Ristow" < warren.ristow at sony.com >
> > 
> 
> > > Cc: llvm-dev at lists.llvm.org
> > 
> 
> > > Sent: Tuesday, November 15, 2016 11:10:48 PM
> > 
> 
> > > Subject: Re: [llvm-dev] RFC: Consider changing the semantics of
> > > 'fast' flag implying all fast-math-flags
> > 
> 
> > > Hi,
> > 
> 
> > > > On Nov 15, 2016, at 5:15 PM, Ristow, Warren via llvm-dev
<
> > > > llvm-dev at lists.llvm.org > wrote:
> > > 
> > 
> 
> > > > Hi all,
> > > 
> > 
> 
> > > > This is about https://reviews.llvm.org/D26708
> > > 
> > 
> 
> > > > Currently when the command-line switch '-ffast-math'
is
> > > > specified,
> > > > the
> > > 
> > 
> 
> > > > IR-level fast-math-flag 'fast' gets attached to
appropriate FP
> > > > math
> > > 
> > 
> 
> > > > instructions. That flag acts as an "umbrella" to
implicitly
> > > > turn
> > > > on
> > > > all the
> > > 
> > 
> 
> > > > other fast-math-flags ('nnan', 'ninf',
'nsz' and 'arcp'):
> > > 
> > 
> 
> > > > http://llvm.org/docs/LangRef.html#fast-math-flags
> > > 
> > 
> 
> > > > This approach has the shortcoming that when there is a
desire
> > > > to
> > > > disable one
> > > 
> > 
> 
> > > > of these fast-math-flags, if the 'fast' flag
remains, it
> > > > implicitly
> > > 
> > 
> 
> > > > re-enables the one being disabled. For example, compiling
this
> > > > test-case:
> > > 
> > 
> 
> > > > extern void use(float x, float y);
> > > 
> > 
> 
> > > > void test(float a, float b, float c) {
> > > 
> > 
> 
> > > > float q1 = a / c;
> > > 
> > 
> 
> > > > float q2 = b / c;
> > > 
> > 
> 
> > > > use(q1, q2);
> > > 
> > 
> 
> > > > }
> > > 
> > 
> 
> > > > at '-O2 -ffast-math' does a
reciprocal-transformation, so only
> > > > one
> > > > division
> > > 
> > 
> 
> > > > is done (as desired with fast-math). Compiling it with:
> > > 
> > 
> 
> > > > -O2 -ffast-math -fno-reciprocal-math
> > > 
> > 
> 
> > > > should disable the reciprocal transformations (the flag
> > > > 'arcp'),
> > > > but
> > > > leave
> > > 
> > 
> 
> > > > all the other fast-math transformations enabled. The current
> > > > implementation
> > > 
> > 
> 
> > > > doesn't do that, since the 'fast' IR-level flag
still gets set.
> > > 
> > 
> 
> > > > Motivation of this discussion:
> > > > https://llvm.org/bugs/show_bug.cgi?id=27372#c2
> > > 
> > 
> 
> > > > As an aside, when '-ffast-math' is specified on the
> > > > command-line,
> > > > the
> > > 
> > 
> 
> > > > following six switches are all passed to cc1:
> > > 
> > 
> 
> > > > -menable-no-infs
> > > 
> > 
> 
> > > > -menable-no-nans
> > > 
> > 
> 
> > > > -fno-signed-zeros
> > > 
> > 
> 
> > > > -freciprocal-math
> > > 
> > 
> 
> > > > -fno-trapping-math
> > > 
> > 
> 
> > > > -ffp-contract=fast
> > > 
> > 
> 
> > > > and '-ffast-math' itself is also passed cc1 (the act
of passing
> > > > '-ffast-math'
> > > 
> > 
> 
> > > > to cc1 results in the macro '__FAST_MATH__' being
defined).
> > > > When
> > > > (for
> > > 
> > 
> 
> > > > example) '-fno-reciprocal-math' is passed in
addition to
> > > > '-ffast-math', then
> > > 
> > 
> 
> > > > '-freciprocal-math' is no longer passed to cc1 (and
the other
> > > > five
> > > > listed
> > > 
> > 
> 
> > > > above still are passed, along with '-ffast-math'
still being
> > > > passed).
> > > > It
> > > 
> > 
> 
> > > > seems like the intention was that these individual switches
> > > > were
> > > > to
> > > > enable
> > > 
> > 
> 
> > > > the individual floating-point transformations (and so the
lack
> > > > of
> > > > any
> > > > of
> > > 
> > 
> 
> > > > those switches would suppress the relevant transformations),
> > > > but
> > > > the
> > > 
> > 
> 
> > > > '-ffast-math' "umbrella" is over-riding
the attempted
> > > > suppression.
> > > 
> > 
> 
> > > Sure, this looks like a bug, disable an individual fast-math flag
> > > on
> > > the command line should be possible and override a prior
> > > -ffast-math
> > > (usually the last one on the command line “wins”/override).
> > 
> 
> > > The Cfe-dev mailing list would be more appropriate to discuss the
> > > behavior of clang command line flags though.
> > 
> 
> > > > The change proposed at https://reviews.llvm.org/D26708 deals
> > > > with
> > > > this issue
> > > 
> > 
> 
> > > This patch seems to modify on LLVM, it does not deal at all with
> > > the
> > > issue you describe above.
> > 
> 
> > > I don’t see why the issue with the clang command line flags need
> > > to
> > > be dealt with at the LLVM level.
> > 
> 
> > > > just for the reciprocal-transformation case, but it changes
the
> > > > semantics of
> > > 
> > 
> 
> > > > the 'fast' IR-level flag so that it no longer
implies all the
> > > > others.
> > > 
> > 
> 
> > > The starting point for any change is:
> > > http://llvm.org/docs/LangRef.html#fast-math-flags
> > 
> 
> > > You would need to write a new definition for what “fast” would
> > > mean.
> > 
> 
> > > However I don’t need anything need to be changed here to address
> > > the
> > > use-case you want to fix.
> > 
> 
> > I suspect that we want to start by getting rid of 'fast' on
the IR
> > level and replacing it with individual flags for the various
> > optimization classes - Do we have only allowing reassociation and
> > libm optimizations? Then we can readjust the Clang flags in a
> > straightforward way.
> 
> Individual flags for various optimization classes make sense only if
> you don’t end up with a lot of very specialized new flags.
> If a single “reassociate” flag could be enough to complete the
> existing and replace the “fast” that would be great.
> But some auditing of all the users of “fast" would be needed first.
> For instance is "X * log2(0.5*Y) = X*log2(Y) - X” covered by
> “reassociation”? That seems a bit more than what people think about
> with reassociation at first.I can only think of two flags we might need: One for reassociation and one for
libm-function optimization. Your example with log2 might require both.

-Hal 
> —
> Mehdi
> > > > With
> > > 
> > 
> 
> > > > that proposed approach, rather than an "umbrella"
flag such as
> > > > 'fast'
> > > > being
> > > 
> > 
> 
> > > > checked in the back-end (along with an individual flag like
> > > > 'arcp'),
> > > > just
> > > 
> > 
> 
> > > > checking the individual flag ('arcp') would be done.
> > > 
> > 
> 
> > > There is already no need to check the “fast” *and* arcp flag, if
> > > a
> > > transformation is about reciprocal, then you only need to check
> > > arcp
> > > (fast implies arcp, checking for fast would be redundant).
> > 
> 
> > > Be careful also that the fast-math flags are mainly an IR level
> > > definition, the backend only inherited these per instruction flag
> > > very recently. It has been entirely converted to use these, and
> > > it
> > > still uses a global flag in some places.
> > 
> 
> > > The line you’re touching in your patch for instance is about this
> > > legacy:
> > 
> 
> > > if (!UnsafeMath && !Flags->hasAllowReciprocal())
> > 
> 
> > > The first flag is the global “fast-math” mode on the backend,
> > > which
> > > is not as fine grain as the per-instruction model.
> > 
> 
> > > The second flag is the “per instruction” flag, which is the model
> > > we
> > > aim at.
> > 
> 
> > > We should get rid of the “global” UnsafeMath in the backend, but
> > > that
> > > does not require any change to the IR or the individual fast-math
> > > flags.
> > 
> 
> > > > Any fast-math-related
> > > 
> > 
> 
> > > > transformation that doesn't have an individual flag
(e.g.,
> > > > re-association
> > > 
> > 
> 
> > > > currently doesn't), should eventually have an individual
flag
> > > > defined
> > > > for
> > > 
> > 
> 
> > > > it, and then that individual flag should be checked.
> > > 
> > 
> 
> > > > What do people think?
> > > 
> > 
> 
> > > I think these are valuable problems to solve, but you should
> > > tackle
> > > them piece by piece:
> > 
> 
> > > 1) the clang part of overriding the individual FMF and emitting
> > > the
> > > right IR is the first thing to fix.
> > 
> 
> > > 2) the backend is still using the global UnsafeFPMath and it
> > > should
> > > be killed.
> > 
> 
> > > Hope this makes sense.
> > 
> 
> > > —
> > 
> 
> > > Mehdi
> > 
> 
> > > _______________________________________________
> > 
> 
> > > LLVM Developers mailing list
> > 
> 
> > > llvm-dev at lists.llvm.org
> > 
> 
> > > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
> > 
> 
> > --
> 
> > Hal Finkel
> 
> > Lead, Compiler Technology and Programming Languages
> 
> > Leadership Computing Facility
> 
> > Argonne National Laboratory
> -- 

Hal Finkel 
Lead, Compiler Technology and Programming Languages 
Leadership Computing Facility 
Argonne National Laboratory 
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20161116/664fc4b3/attachment.html>

Nicolai Hähnle via llvm-dev

2016-Nov-16 21:42 UTC

head link

[llvm-dev] RFC: Consider changing the semantics of 'fast' flag implying all fast-math-flags

On 16.11.2016 18:03, Mehdi Amini via llvm-dev wrote:> Individual flags for various optimization classes make sense only if you
> don’t end up with a lot of very specialized new flags.
> If a single “reassociate” flag could be enough to complete the existing
> and replace the “fast” that would be great.
> But some auditing of all the users of “fast" would be needed first.
For
> instance is "X * log2(0.5*Y) = X*log2(Y) - X” covered by
> “reassociation”? That seems a bit more than what people think about with
> reassociation at first.
Why are there individual flags for different types of optimization _at all_?

The way I see it, if you started from a blank slate, then the flags 
should define what the application cares about. In other words, the 
following four flags would do:

nnan, ninf, nsz, inexact

The first three have the same meaning as today. The last one is "Allow 
algebraically equivalent transformations that may change the result due 
to different rounding only".

This includes what is today arcp, fp-contract=fast, and most of "unsafe 
math", but it *doesn't* by itself allow you to ignore inf or nan.
I'm
not sure about nsz; it seems like that could be implied by inexact.

I'd be really curious to know if there is anybody who really needs arcp 
without fp-contract=fast or vice versa, or who needs both of these but 
not the X*log2(0.5*Y) transform you mentioned, and so on.[1]

One could consider expressing inexact in terms of ulp, but since errors 
can accumulate through multiple transforms I doubt that that's too 
useful in general. (We do have a use case for ulp in AMDGPU since we 
want fast inexact reciprocals for some applications. We currently use 
metadata for that, and I see no need to change that.)

Cheers,
Nicolai

[1] One case I _can_ think of (and which may have been a reason for the 
proliferation of flags in the first place) is somebody who enables fast 
math, but then doesn't want their results to change when they update the 
compiler and get a new set of optimizations. But IMO that's a use case 
that should be explicitly rejected.
>
> —
> Mehdi
>
>>
>>
>>
>>         With
>>         that proposed approach, rather than an "umbrella"
flag such as
>>         'fast' being
>>         checked in the back-end (along with an individual flag like
>>         'arcp'), just
>>         checking the individual flag ('arcp') would be done.
>>
>>
>>     There is already no need to check the “fast” *and* arcp flag, if a
>>     transformation is about reciprocal, then you only need to check
>>     arcp (fast implies arcp, checking for fast would be redundant).
>>
>>     Be careful also that the fast-math flags are mainly an IR level
>>     definition, the backend only inherited these per instruction flag
>>     very recently. It has been entirely converted to use these, and it
>>     still uses a global flag in some places.
>>     The line you’re touching in your patch for instance is about this
>>     legacy:
>>
>>       if (!UnsafeMath && !Flags->hasAllowReciprocal())
>>
>>     The first flag is the global “fast-math” mode on the backend,
>>     which is not as fine grain as the per-instruction model.
>>     The second flag is the “per instruction” flag, which is the model
>>     we aim at.
>>
>>     We should get rid of the “global” UnsafeMath in the backend, but
>>     that does not require any change to the IR or the individual
>>     fast-math flags.
>>
>>
>>         Any fast-math-related
>>         transformation that doesn't have an individual flag (e.g.,
>>         re-association
>>         currently doesn't), should eventually have an individual
flag
>>         defined for
>>         it, and then that individual flag should be checked.
>>
>>         What do people think?
>>
>>
>>     I think these are valuable problems to solve, but you should
>>     tackle them piece by piece:
>>
>>     1) the clang part of overriding the individual FMF and emitting
>>     the right IR is the first thing to fix.
>>     2) the backend is still using the global UnsafeFPMath and it
>>     should be killed.
>>
>>     Hope this makes sense.
>>
>>     —
>>     Mehdi
>>
>>
>>     _______________________________________________
>>     LLVM Developers mailing list
>>     llvm-dev at lists.llvm.org <mailto:llvm-dev at
lists.llvm.org>
>>     http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>
>>
>>
>>
>> --
>> Hal Finkel
>> Lead, Compiler Technology and Programming Languages
>> Leadership Computing Facility
>> Argonne National Laboratory
>
>
>
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>

llvm dev - Nov 2016 - RFC: Consider changing the semantics of 'fast' flag implying all fast-math-flags

[llvm-dev] RFC: Consider changing the semantics of 'fast' flag implying all fast-math-flags

[llvm-dev] RFC: Consider changing the semantics of 'fast' flag implying all fast-math-flags

[llvm-dev] RFC: Consider changing the semantics of 'fast' flag implying all fast-math-flags

[llvm-dev] RFC: Consider changing the semantics of 'fast' flag implying all fast-math-flags

[llvm-dev] RFC: Consider changing the semantics of 'fast' flag implying all fast-math-flags

[llvm-dev] RFC: Consider changing the semantics of 'fast' flag implying all fast-math-flags