I feel like this discussion is getting a bit off track...

On Sun, Apr 15, 2012 at 12:00 AM, Dmitry Babokin <babokin at gmail.com> wrote:

> I would define the set of transformations, such as (I can help with a more
> complete list if you prefer):
>
> - reassociation
> - x+0.0 => x
> - x*0.0 => 0.0
> - x*1.0 => x
> - a/b => a*(1/b)
> - a*b+c => fma(a,b,c)
> - ignoring NaNs in compares, i.e. (a<b) => !(a>=b)
> - value-unsafe transformations (for aggressive fp optimizations, like
>   a*b+a*c => a*(b+c)) and others of the kind.
>
> and several aliases for the "strict", "precise", "fast" models (which are
> effectively combinations of the flags above).
>
> So the metadata would be able to say "fast", "fast, but no fma allowed",
> "strict, but fma allowed", i.e. the metadata should be a base level plus
> an optional set of adjustments from the list above.

I would love to see such detailed models if we have real use cases and
people interested in implementing them.

However, today we have a feature in moderately widespread use,
'-ffast-math'. Its semantics may not be the ideal way to enable
restricted, predictable optimizations of floating point operations, but
they are effective for a wide range of programs today.

I think having a generic flag value which specifically attempts to model
the *loose* semantics of '-ffast-math' is really important, and any more
detailed framework for classifying and enabling specific optimizations
should be layered on afterward. While I share your frustration with the
very vague and hard-to-reason-about semantics of '-ffast-math', I think
we can provide a clear enough spec to make it implementable, and we
should give ourselves the freedom to implement all the optimizations
within that spec which existing applications rely on for performance.

> And, again, I think this should be a function-level model, unless
> specified otherwise on the instruction, as that will be the case in
> 99.9999% of compilations.

I actually lobbied with Duncan to use a function default with
instruction-level overrides, but after his posts about the metadata
overhead of just doing it on each instruction, I think his approach is
simpler.

As he argued to me, *eventually* this has to end up on the instruction
in order to model inlining correctly -- a function compiled with
'-ffast-math' might be inlined into a function compiled without it, and
vice versa. Since you need this ability anyway, it makes sense to
simplify the inliner, the metadata schema, etc. and just always place
the data on the instructions *unless* there is some significant scaling
problem. I think Duncan has demonstrated it scales pretty well.
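A minimal standalone C sketch (illustrative only, not from the original
mails) of why several of the transformations listed above are
value-unsafe under IEEE-754; compile it without '-ffast-math' to see the
strict results:

    #include <math.h>
    #include <stdio.h>

    int main(void) {
        /* x*0.0 => 0.0 is unsafe: it loses NaNs, infinities and the
           sign of zero. */
        printf("NAN * 0.0  = %g\n", NAN * 0.0);       /* nan, not 0 */
        printf("INF * 0.0  = %g\n", INFINITY * 0.0);  /* nan, not 0 */
        printf("-1.0 * 0.0 = %g\n", -1.0 * 0.0);      /* -0, not +0 */

        /* Reassociation is unsafe: rounding makes fp addition
           non-associative. */
        double a = 1.0, b = 1e16, c = -1e16;
        printf("(a+b)+c = %g\n", (a + b) + c);  /* 0: the 1.0 is absorbed */
        printf("a+(b+c) = %g\n", a + (b + c));  /* 1 */
        return 0;
    }

A "fast" mode licenses exactly these kinds of result changes, which is
why the proposal above treats the NaN/infinity behaviour and the
rounding behaviour as separable knobs.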
On Sun, Apr 15, 2012 at 3:53 AM, Chandler Carruth <chandlerc at google.com> wrote:

>> And, again, I think this should be a function-level model, unless
>> specified otherwise on the instruction, as that will be the case in
>> 99.9999% of compilations.
>
> I actually lobbied with Duncan to use a function default with
> instruction-level overrides, but after his posts about the metadata
> overhead of just doing it on each instruction, I think his approach is
> simpler.
>
> As he argued to me, *eventually* this has to end up on the instruction
> in order to model inlining correctly -- a function compiled with
> '-ffast-math' might be inlined into a function compiled without it, and
> vice versa. Since you need this ability anyway, it makes sense to
> simplify the inliner, the metadata schema, etc. and just always place
> the data on the instructions *unless* there is some significant scaling
> problem. I think Duncan has demonstrated it scales pretty well.

For simple metadata, like "fast" in the initial proposal, that could be
OK. But if more complex metadata is possible (like I've described), then
this approach could consume more bitcode size than expected. And I'm
sure there will be attempts to add fine-grained precision control; the
first candidate is probably enabling/disabling FMAs.

Inlining is a valid concern, though within a single module the fp model
will be the same in the absolute majority of cases. People also tend to
use consistent flags across a project, so it shouldn't be a rare case
that the flags are consistent between modules either.

A function- or module-level default setting is really just an
optimization, but IMHO quite a useful one.

It would also simplify dumps and make it easier to understand what is
going on for people who don't want to dig into the details of fp
precision problems and be distracted by additional metadata.

Just to be clear: as it's not me who is going to implement this, I'm
just trying to draw attention to the issues that we'll eventually
encounter down the road.

Dmitry.
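A hypothetical two-file C sketch of the cross-module inlining scenario
discussed above (the file names, build flags, and the particular folded
expression are assumptions chosen for illustration):

    /* fastlib.c -- built with: cc -O2 -flto -ffast-math -c fastlib.c */
    float scale(float x) {
        /* Under fast-math the compiler may fold this to just x. */
        return (x / 3.0f) * 3.0f;
    }

    /* main.c -- built with: cc -O2 -flto -c main.c (no fast-math) */
    extern float scale(float x);
    float caller(float x) {
        /* At LTO link time, scale() may be inlined here. If the relaxed
           semantics were recorded only per function, the inliner would
           have to either refuse the inline or silently change this
           function's fp model; per-instruction annotations simply
           travel with the inlined body. */
        return scale(x) + x;
    }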
On Sun, Apr 15, 2012 at 3:50 AM, Dmitry Babokin <babokin at gmail.com> wrote:

> For simple metadata, like "fast" in the initial proposal, that could be
> OK. But if more complex metadata is possible (like I've described),
> then this approach could consume more bitcode size than expected. And
> I'm sure there will be attempts to add fine-grained precision control;
> the first candidate is probably enabling/disabling FMAs.
>
> Inlining is a valid concern, though within a single module the fp model
> will be the same in the absolute majority of cases. People also tend to
> use consistent flags across a project, so it shouldn't be a rare case
> that the flags are consistent between modules either.
>
> A function- or module-level default setting is really just an
> optimization, but IMHO quite a useful one.

And I don't disagree, I just think it is premature until we have
measured an issue with the simpler form. Since we will almost certainly
need the simpler form anyway, we might as well wait until the problem
manifests.

The reason I don't expect it to get worse with more complex
specifications is that the actual metadata nodes are uniqued. Thus we
should see many instructions all referring to the same (potentially
complex) node.

> It would also simplify dumps and make it easier to understand what is
> going on for people who don't want to dig into the details of fp
> precision problems and be distracted by additional metadata.

The IR is already not a normalized representation, though. Its primary
consumers and producers are libraries and machines, not humans. Debug
metadata, TBAA metadata, and numerous other complexities are already
present.

> Just to be clear: as it's not me who is going to implement this, I'm
> just trying to draw attention to the issues that we'll eventually
> encounter down the road.

Yep, I'm just trying to explain my perspective on these issues. =]
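For readers unfamiliar with uniquing, a small C sketch (hypothetical;
these are not LLVM's actual data structures) of why a rich, uniqued node
stays cheap no matter how many instructions reference it:

    #include <stdio.h>
    #include <string.h>

    typedef struct FPModel {        /* a hypothetical "complex" node */
        int allow_reassoc, no_nans, allow_fma;
    } FPModel;

    static FPModel pool[16];
    static int npool;

    /* Return a canonical pointer: equal models map to one shared node,
       so each instruction stores a single pointer, and the cost of a
       complex model is paid once per distinct setting, not per use. */
    static const FPModel *unique_model(FPModel m) {
        for (int i = 0; i < npool; i++)
            if (memcmp(&pool[i], &m, sizeof m) == 0)
                return &pool[i];
        pool[npool] = m;
        return &pool[npool++];
    }

    int main(void) {
        FPModel fast = {1, 1, 1};
        const FPModel *a = unique_model(fast); /* first fmul's node  */
        const FPModel *b = unique_model(fast); /* second fmul's node */
        printf("shared: %s\n", a == b ? "yes" : "no"); /* "yes" */
        return 0;
    }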
Hi,

> I would love to see such detailed models if we have real use cases and
> people interested in implementing them.
>
> However, today we have a feature in moderately widespread use,
> '-ffast-math'. Its semantics may not be the ideal way to enable
> restricted, predictable optimizations of floating point operations,
> but they are effective for a wide range of programs today.
>
> I think having a generic flag value which specifically attempts to
> model the *loose* semantics of '-ffast-math' is really important, and
> any more detailed framework for classifying and enabling specific
> optimizations should be layered on afterward. While I share your
> frustration with the very vague and hard-to-reason-about semantics of
> '-ffast-math', I think we can provide a clear enough spec to make it
> implementable, and we should give ourselves the freedom to implement
> all the optimizations within that spec which existing applications
> rely on for performance.

I agree with Chandler. Also, don't forget that the safest way to proceed
is to start with a permissive interpretation of the flags and tighten
them up later.

For example, suppose we start with an fpaccuracy of "fast" meaning:
ignore NaNs, ignore infinities, do whatever you like; and then later
tighten it to mean: do the right thing with NaNs and infinities, and
only introduce a bounded number of ULPs of error. This is conservatively
safe: existing bitcode created under the loose semantics will be
correctly optimized and codegened under the new tight semantics (just
less optimized than it used to be).

However, if we start with tight semantics and then decide later that
they were too tight, we are in trouble, since existing bitcode might
then undergo optimizations that the creator of the bitcode didn't want.

So I'd rather start with a quite permissive setup which seems generally
useful and allows the most important optimizations, and worry about
decomposing and tightening it later.

Given the fact that no one was interested enough to implement any kind
of relaxed floating point mode in LLVM IR in all the years gone by, I
actually suspect that there might never be anything more than this
simple and not very well defined 'fast-math' mode. But at least there is
a clear path for how to evolve towards a more sophisticated setup.

Ciao,

Duncan.
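As a concrete check of why the loose mode needs this room (a standalone
C program, not from the original mail), the a/b => a*(1/b) rewrite from
earlier in the thread rounds twice, so it does not always produce the
correctly rounded quotient:

    #include <stdio.h>

    int main(void) {
        int differ = 0, total = 0;
        for (int ai = 1; ai <= 100; ai++) {
            for (int bi = 1; bi <= 100; bi++) {
                volatile float a = (float)ai, b = (float)bi;
                float q1 = a / b;          /* correctly rounded       */
                float q2 = a * (1.0f / b); /* the "fast" rewrite:     */
                total++;                   /* 1/b is rounded before   */
                if (q1 != q2) differ++;    /* the multiply            */
            }
        }
        printf("%d of %d quotients change under a/b => a*(1/b)\n",
               differ, total);
        return 0;
    }

A tightened "bounded ULPs" mode could arguably still permit this rewrite
(the error is roughly within a couple of ULPs), while a strict mode
could not, which is exactly the loosen-first asymmetry Duncan describes.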
On 15 April 2012 09:22, Duncan Sands <baldrick at free.fr> wrote:

> Given the fact that no one was interested enough to implement any kind
> of relaxed floating point mode in LLVM IR in all the years gone by, I
> actually suspect that there might never be anything more than this
> simple and not very well defined 'fast-math' mode. But at least there
> is a clear path for how to evolve towards a more sophisticated setup.

Once it's implemented, there will be zealots complaining that your
"-ffast-math" is not as good as <insert-compiler-here>'s. But you can
kindly ask them to contribute code.

--
cheers,
--renato

http://systemcall.org/