Duncan,

Your effort to improve the control of floating point optimizations in LLVM is noble and commendable. I'd like to make two points that appear not to have been raised previously in the discussion of your proposal to date:

1) Most compiler and back-end control of floating point behavior appears to be motivated by controlling the loss or gain of a few low bits of precision on a whole-module scale. In fact, these concerns are usually insignificant for programmers of floating-point intensive applications. The inputs to most floating point computations have far lower significance than the computations themselves, and therefore they have precision to burn. So the vast majority of such app developers would happily trade precision for performance, even as the default behavior. However, the place where trouble DOES occur is with overflow and underflow behavior at critical points. Changing the order of operations, or combining operations, can cause overflows or underflows to occur that wouldn't otherwise occur, and vice versa. Sometimes this is beneficial, but it is almost always unexpected. Underflows may sound less important in this regard, but they can be worse than overflows, because they can mostly or completely eliminate the significant bits, in complete silence, leaving the entire computation worthless. Much of numerical analysis, especially in writing floating point library functions, concerns the precise control of overflow and loss of significance in specific operations. To the extent that optimizations make such control difficult or impossible, they can render a compiler or backend unusable for that purpose.

2) While the use of metadata for control of LLVM behavior is attractive for its simplicity and power, the philosophy that it can be safely ignored or even removed in some optimization passes would seem to doom its effectiveness for controlling floating point optimizations. For anyone trying to use source language and compiler option mechanisms to control fp overflow and underflow, this approach would seem ill conceived.

For the purpose of providing a Front-End developer with a powerful platform for supporting fp-intensive programming, the primary requirement is that the Front-end should be able to precisely control optimizations that can change the fp intermediate results, under all optimization levels, for each individual fp operation specified in the IR. The vast majority of such usage can and should be chosen to default to high-performance behavior. But it should be possible for the front-end to precisely control IR re-ordering, operation combining (including exploitation of mul-add hardware support), and reactions to overflow and underflow conditions (using the exception handling conditions and underlying hardware support). Providing this power in the IR allows a Front-end developer to reliably support source language mechanisms (e.g. use of parentheses) and front-end recognized compiler options (e.g. for fp exception handling) to respond to the needs of the source language programmer for fp-intensive applications.

It should be possible to define one or more attribute flags for FP operations in the IR with semantics that guarantee allowance or suppression of optimizations that might create or eliminate overflow, underflow, or significant precision loss. The implementation of such semantics in the existing optimization passes might take a fair amount of work, I admit.
But that is exactly what Front-End developers and their source language programmers would most benefit from.

-Kevin
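To make the reordering point concrete, here is a minimal LLVM IR sketch of the kind of hazard described above; the function names and the example values are invented purely for illustration:

  ; Evaluated in source order, (%a * %b) * %c stays in range for inputs such as
  ; %a = 1.0e-30, %b = 1.0e30, %c = 1.0e30 in single precision:
  ; 1.0e-30 * 1.0e30 = 1.0, then 1.0 * 1.0e30 = 1.0e30.
  define float @scale3(float %a, float %b, float %c) {
    %t = fmul float %a, %b
    %r = fmul float %t, %c
    ret float %r
  }

  ; A reassociating optimization could instead compute %b * %c first. For the
  ; same inputs the intermediate 1.0e30 * 1.0e30 overflows to +inf, so the
  ; result becomes +inf instead of 1.0e30. With the magnitudes reversed
  ; (%a = 1.0e30, %b = %c = 1.0e-30) the intermediate silently underflows to
  ; 0.0 and the result is 0.0 instead of the correct 1.0e-30.
  define float @scale3_reassociated(float %a, float %b, float %c) {
    %t = fmul float %b, %c
    %r = fmul float %a, %t
    ret float %r
  }

Both functions compute a mathematically identical product; only the grouping differs, which is exactly why this class of rewrite is so unexpected when it misbehaves.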
Hi Kevin,

> 1. Most compiler and back-end control of floating point behavior appears to be motivated by controlling the loss or gain of a few low bits of precision on a whole module scale. In fact, these concerns are usually insignificant for programmers of floating-point intensive applications. The input to most floating point computations have far lower significance than the computations themselves, and therefore they have precision to burn. So the vast majority of such app developers would happily trade precision for performance, even as the default behavior. However, the place where trouble DOES occur is with overflow and underflow behavior at critical points. Changing the order of operations, or combining operations, can cause overflows or underflows to occur that wouldn't otherwise occur, and vice versa.

for the moment I'm distinguishing (mentally) between transformations that introduce a uniformly bounded relative error, for example x+0 -> x, or x/constant -> x * (1/constant) if constant and 1/constant are normal (and not denormal), and those that can introduce an unbounded relative error. Reassociation is an example of a transformation that can introduce unbounded relative error, for example (1 + epsilon) - 1 -> 0 if epsilon is small enough, while (1 - 1) + epsilon -> epsilon. I'm basically assuming that everyone is happy with the transforms that introduce a bounded relative error - it sounds to me like this is the distinction that you are making too. Transforms that introduce unbounded relative error (like reassociation) are a can of worms, and I'm not sure how best to handle them. So for the moment I'm not planning to handle them, just gather ideas and discuss.

> Sometimes this is beneficial, but it is almost always unexpected. Underflows may sound less important in this regard, but they can be worse than overflows, because they can mostly or completely eliminate the significant bits, in complete silence, leaving the entire computation worthless. Much of numerical analysis, especially in writing floating point library functions, concerns the precise control of overflow and loss of significance in specific operations. To the extent that optimizations which make such control difficult or impossible, can render the use of a compiler or backend unusable for that purpose.
> 2. While the use of metadata for control of LLVM behavior is attractive for its simplicity and power, the philosophy that it can be safely ignored or even removed in some optimization passes would seem to doom its effectiveness for controlling floating point optimizations. For anyone trying to use source language and compiler option mechanisms to control for fp overflow and underflow, this approach would seem ill conceived.

I think there may be a misunderstanding here. True, the design of metadata is that it is not wrong to drop it. However the compiler isn't trying to drop it, it tries hard not to drop it: any cases of pointlessly dropped metadata are a bug. In this respect, fpmath metadata is analogous to tbaa (type based alias analysis metadata): if it is dropped you get conservatively correct results, but some optimizations are missed. Compiler writers don't like missing optimizations! If you see any cases of fpmath metadata being dropped then please report it.

> For the purpose of providing a Front-End developer with a powerful platform for supporting fp-intensive programming,

Let me just say up front that it is not clear to me that this is a goal of LLVM.

> the primary requirement is that the Front-end should be able to precisely control optimizations that can change the fp intermediate results under all optimization levels for each individual fp operation specified in the IR. The vast majority of such usage can and should chosen to default to high performance behavior. But it should be possible for the front-end to precisely control IR re-ordering, operation combining (including exploitation of mul-add hardware support), and reactions to overflow and underflow conditions (using the exception handling conditions and underlying the hardware support). By providing this power in the IR, it allows a Front-end developer to reliably support source language mechanisms (e.g. use of parentheses) and front-end recognized compiler options (e.g. for fp exception handling) to respond to the needs of the source language programmer for fp-intensive applications.

Given that LLVM doesn't even properly support rounding modes, I think you are going to have to wait a few years at least before we are anywhere near something like this. That said, we'd get there sooner (assuming we actually want to go there) if you help - patches welcome!

> It should be possible to define one or more attribute flags for FP operations in the IR with semantics that guarantee allowance or suppression of optimizations that might create or eliminate overflow, underflow, or significant precision loss. The implementation of such semantics in the existing optimization passes might take a fair amount of work, I admit. But that is exactly what Front-End developers and their source language programmers would most benefit from.

I'm pretty sure that building lots of flags into floating point operations is not going to fly at this stage. Metadata allows us to grow lots of flags if we want without much impact on the compiler. Once the metadata approach has matured and shown its usefulness or limitations then we can consider baking things into the IR or other such approaches. But that's a long way off.

Ciao,

Duncan.
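For reference, the fpmath metadata discussed above looks like this in the IR (the node spelling is the LLVM 3.1-era syntax, and the function name is made up for the example):

  define float @fast_div(float %x, float %y) {
    %q = fdiv float %x, %y, !fpmath !0   ; up to 2.5 ulps of error is acceptable here
    ret float %q
  }

  !0 = metadata !{ float 2.5 }

If a pass loses the !fpmath attachment, what remains is a plain fdiv: the result must then be fully accurate, so only the opportunity for a faster, less precise expansion is lost, which is the "conservatively correct" behavior described above.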
On Tue, 17 Apr 2012 17:52:30 +0200 Duncan Sands <baldrick at free.fr> wrote:

> Hi Kevin,
>
> > 1. Most compiler and back-end control of floating point behavior appears to be motivated by controlling the loss or gain of a few low bits of precision on a whole module scale. In fact, these concerns are usually insignificant for programmers of floating-point intensive applications. The input to most floating point computations have far lower significance than the computations themselves, and therefore they have precision to burn. So the vast majority of such app developers would happily trade precision for performance, even as the default behavior. However, the place where trouble DOES occur is with overflow and underflow behavior at critical points. Changing the order of operations, or combining operations, can cause overflows or underflows to occur that wouldn't otherwise occur, and vice versa.
>
> for the moment I'm distinguishing (mentally) between transformations that introduce a uniformly bounded relative error, for example x+0 -> x, or x/constant -> x * (1/constant) if constant and 1/constant are normal (and not denormal), and those that can introduce an unbounded relative error. Reassociation is an example of a transformation that can introduce unbounded relative error, for example (1 + epsilon) - 1 -> 0 if epsilon is small enough, while (1 - 1) + epsilon -> epsilon. I'm basically assuming that everyone is happy with the transforms that introduce a bounded relative error - it sounds to me like this is the distinction that you are making too. Transforms that introduce unbounded relative error (like reassociation) are a can of worms, and I'm not sure how best to handle them. So for the moment I'm not planning to handle them, just gather ideas and discuss.

I agree, these two things are quite different. For the constant relative error class, yes, we should do them if at all possible. For the others, we may want to only do these if we have some additional information available (input ranges, for example). As far as this goes, I would suggest trying to get advice from "the experts" (there are a number of relevant projects at INRIA (Gappa, flocq, etc.), so perhaps pinging someone from there would be helpful).

-Hal
--
Hal Finkel
Postdoctoral Appointee
Leadership Computing Facility
Argonne National Laboratory
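The bounded/unbounded distinction is easy to see in IR form. The sketch below renders the (1 + epsilon) - 1 example directly; the function names are invented, and the numbers in the comments assume single precision:

  define float @add_then_subtract(float %eps) {
    %t = fadd float 1.0, %eps   ; rounds to exactly 1.0 when %eps is tiny, e.g. 1.0e-10
    %r = fsub float %t, 1.0     ; so this returns 0.0
    ret float %r
  }

  define float @subtract_then_add(float %eps) {
    %t = fsub float 1.0, 1.0    ; exactly 0.0
    %r = fadd float %t, %eps    ; so this returns %eps
    ret float %r
  }

For small %eps the two results (0.0 versus %eps) differ by a relative error of 100%, so no fixed relative-error bound covers the rewrite; that is what puts reassociation in the "can of worms" category rather than the x+0 -> x category.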
Duncan,

On Apr 17, 2012, at 8:52 AM, Duncan Sands <baldrick at free.fr> wrote:

>> Underflows may sound less important in this regard, but they can be worse than overflows, because they can mostly or completely eliminate the significant bits, in complete silence, leaving the entire computation worthless. Much of numerical analysis, especially in writing floating point library functions, concerns the precise control of overflow and loss of significance in specific operations. To the extent that optimizations which make such control difficult or impossible, can render the use of a compiler or backend unusable for that purpose.
>> 2. While the use of metadata for control of LLVM behavior is attractive for its simplicity and power, the philosophy that it can be safely ignored or even removed in some optimization passes would seem to doom its effectiveness for controlling floating point optimizations. For anyone trying to use source language and compiler option mechanisms to control for fp overflow and underflow, this approach would seem ill conceived.
>
> I think there may be a misunderstanding here. True, the design of metadata is that it is not wrong to drop it. However the compiler isn't trying to drop it, it tries hard not to drop it: any cases of pointlessly dropped metadata are a bug. In this fpmath metadata is analogous to tbaa (type based alias analysis metadata): if it is dropped you get conservatively correct results, but some optimizations are missed. Compiler writers don't like missing optimizations! If you see any cases of fpmath metadata being dropped then please report it.

There's a deeper concern here. Suppose that we have a caller A, that is compiled in "fast" mode, which calls B, which was compiled in "strict" mode. Suppose they both contain a computation of x*y, and that B got inlined into A.

  %mul_A = fmul float %x, %y !fpmath !{"fast"}   ; Or however it's spelled
  %mul_B = fmul float %x, %y

GVN (and EarlyCSE, and InstCombine, ...) are going to want to CSE the two copies of x*y that arise. If the version from A appears earlier than the version from B, it will want to CSE the latter into the former. This is not legal, unless it also knows how to make the metadata on the early multiply stricter. However, that means that a pass like GVN or EarlyCSE cannot safely ignore metadata that it does not understand! This is a general problem with the semantics of LLVM metadata. Being able to strip it arbitrarily is not sufficient to guarantee that optimizations are allowed to be completely ignorant of it.

--Owen
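A sketch of the state Owen describes, written with the actual fpmath accuracy metadata rather than his placeholder "fast" spelling (the function name and the 2.5 ulp figure are chosen only for illustration):

  define float @A_after_inlining(float %x, float %y) {
    %mul_A = fmul float %x, %y, !fpmath !0   ; from A, compiled "fast": 2.5 ulps allowed
    %mul_B = fmul float %x, %y               ; from inlined B, compiled "strict"
    %sum = fadd float %mul_A, %mul_B
    ret float %sum
  }

  !0 = metadata !{ float 2.5 }

  ; For CSE to replace %mul_B with %mul_A, the surviving multiply has to satisfy
  ; the stricter of the two requirements, i.e. the pass must drop (or tighten)
  ; the !fpmath attachment rather than ignore it:
  ;
  ;   %mul = fmul float %x, %y
  ;   %sum = fadd float %mul, %mul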
Duncan,

Thanks for the thoughtful response. Some follow up:

> -----Original Message-----
> From: Duncan Sands [mailto:baldrick at free.fr]
> Sent: Tuesday, April 17, 2012 11:53 AM
> To: Harris, Kevin
> Cc: llvmdev at cs.uiuc.edu
> Subject: Re: [LLVMdev] Representing -ffast-math at the IR level
>
> Hi Kevin,
>
>> 1. Most compiler and back-end control of floating point behavior appears to be motivated by controlling the loss or gain of a few low bits of precision on a whole module scale. In fact, these concerns are usually insignificant for programmers of floating-point intensive applications. The input to most floating point computations have far lower significance than the computations themselves, and therefore they have precision to burn. So the vast majority of such app developers would happily trade precision for performance, even as the default behavior. However, the place where trouble DOES occur is with overflow and underflow behavior at critical points. Changing the order of operations, or combining operations, can cause overflows or underflows to occur that wouldn't otherwise occur, and vice versa.
>
> for the moment I'm distinguishing (mentally) between transformations that introduce a uniformly bounded relative error, for example x+0 -> x, or x/constant -> x * (1/constant) if constant and 1/constant are normal (and not denormal), and those that can introduce an unbounded relative error. Reassociation is an example of a transformation that can introduce unbounded relative error, for example (1 + epsilon) - 1 -> 0 if epsilon is small enough, while (1 - 1) + epsilon -> epsilon. I'm basically assuming that everyone is happy with the transforms that introduce a bounded relative error - it sounds to me like this is the distinction that you are making too. Transforms that introduce unbounded relative error (like reassocation) are a can of worms, and I'm not sure how best to handle them. So for the moment I'm not planning to handle them, just gather ideas and discuss.

This is a reasonable distinction. How you could enforce it across the various optimization passes is not obvious. Loss of precision problems are difficult to diagnose even when strong fp correctness goals and methods are in place.

>> Sometimes this is beneficial, but it is almost always unexpected. Underflows may sound less important in this regard, but they can be worse than overflows, because they can mostly or completely eliminate the significant bits, in complete silence, leaving the entire computation worthless. Much of numerical analysis, especially in writing floating point library functions, concerns the precise control of overflow and loss of significance in specific operations. To the extent that optimizations which make such control difficult or impossible, can render the use of a compiler or backend unusable for that purpose.
>> 2. While the use of metadata for control of LLVM behavior is attractive for its simplicity and power, the philosophy that it can be safely ignored or even removed in some optimization passes would seem to doom its effectiveness for controlling floating point optimizations. For anyone trying to use source language and compiler option mechanisms to control for fp overflow and underflow, this approach would seem ill conceived.
>
> I think there may be a misunderstanding here. True, the design of metadata is that it is not wrong to drop it. However the compiler isn't trying to drop it, it tries hard not to drop it: any cases of pointlessly dropped metadata are a bug. In this fpmath metadata is analogous to tbaa (type based alias analysis metadata): if it is dropped you get conservatively correct results, but some optimizations are missed. Compiler writers don't like missing optimizations! If you see any cases of fpmath metadata being dropped then please report it.

Yes, I (and others, obviously) have been confused before about the extent to which metadata can be ignored or dropped. I think its use for providing additional information that allows optimizations that would otherwise be invalid is well motivated and reasonably straightforward. And your proposal doesn't change that usage. Any attempt to provide tighter restrictions for fp optimizations, however, would seem to muddy the situation, since it would violate the basic assumption that the undecorated IR is the "most conservative".

>> For the purpose of providing a Front-End developer with a powerful platform for supporting fp-intensive programming,
>
> Let me just say up front that it is not clear to me that this is a goal of LLVM.

I realize that good fp precision and control is a fairly specialized niche, esp. for an open source compiler. This is the main reason why I hesitated to comment initially. I didn't even necessarily mean to inject any additional goals in this space. But since you had made an effort in this direction, and generated some thoughtful discussion already, I wanted to alert the community to some practical issues they might not have considered.

>> the primary requirement is that the Front-end should be able to precisely control optimizations that can change the fp intermediate results under all optimization levels for each individual fp operation specified in the IR. The vast majority of such usage can and should chosen to default to high performance behavior. But it should be possible for the front-end to precisely control IR re-ordering, operation combining (including exploitation of mul-add hardware support), and reactions to overflow and underflow conditions (using the exception handling conditions and underlying the hardware support). By providing this power in the IR, it allows a Front-end developer to reliably support source language mechanisms (e.g. use of parentheses) and front-end recognized compiler options (e.g. for fp exception handling) to respond to the needs of the source language programmer for fp-intensive applications.
>
> Given that LLVM doesn't even properly support rounding modes, I think you are going to have to wait a few years at least before we are anywhere near something like this. That said, we'd get there sooner (assuming we actually want to go there) if you help - patches welcome!

Point taken. I definitely have fantasies in this area, but won't likely have extra cycles to devote to this area in the near future. :-( Regarding rounding modes specifically, in spite of the hardware support for these, I think they are an even more specialized area than controlling for overflow / underflow. They are almost never useful outside the context of fp library function authorship, and there are several commercial compilers that support library development adequately.

>> It should be possible to define one or more attribute flags for FP operations in the IR with semantics that guarantee allowance or suppression of optimizations that might create or eliminate overflow, underflow, or significant precision loss. The implementation of such semantics in the existing optimization passes might take a fair amount of work, I admit. But that is exactly what Front-End developers and their source language programmers would most benefit from.
>
> I'm pretty sure that building lots of flags into floating point operations is not going to fly at this stage. Metadata allows us to grow lots of flags if we want without much impact on the compiler. Once the metadata approach has matured and shown its usefulness or limitations then we can consider baking things into the IR or other such approaches. But that's a long way off.

The role of metadata as a prototyping vehicle is clear, and may indeed continue to be useful in this space. Clarifying the role of metadata in cases where it would restrict optimizations rather than permit them would seem to be a step in the right direction.

-Kevin
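The distinction between metadata that grants permission and metadata that would withhold it can be put side by side in IR. The !fpmath use below is real; !fpstrict is a purely hypothetical flag, invented here only to show why a restrictive annotation could not share the "safe to drop" rule:

  define float @two_roles(float %x, float %y) {
    %a = fdiv float %x, %y, !fpmath !0    ; permissive: dropping this only forfeits a speedup
    %b = fadd float %x, %y, !fpstrict !1  ; hypothetical restrictive flag: dropping it would
                                          ; silently re-enable transforms the front end forbade
    %r = fmul float %a, %b
    ret float %r
  }

  !0 = metadata !{ float 2.5 }
  !1 = metadata !{ i1 true }

In the permissive case the undecorated IR really is the most conservative reading; in the restrictive case it would be the least conservative one, which is the muddying of the model noted above.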