I feel like this discussion is getting a bit off track...

On Sun, Apr 15, 2012 at 12:00 AM, Dmitry Babokin <babokin at gmail.com> wrote:

> I would define the set of transformations, such as (I can help with a more
> complete list if you prefer):
>
> - reassociation
> - x+0.0 => x
> - x*0.0 => 0.0
> - x*1.0 => x
> - a/b => a*(1/b)
> - a*b+c => fma(a,b,c)
> - ignoring NaNs in compares, i.e. (a<b) => !(a>=b)
> - value-unsafe transformations (for aggressive fp optimizations, like
>   a*b+a*c => a*(b+c)) and others of the kind.
>
> and several aliases for the "strict", "precise", "fast" models (which are
> effectively combinations of the flags above).
>
> So the metadata would be able to say "fast", "fast, but no fma allowed",
> "strict, but fma allowed", i.e. the metadata should be a base level plus
> an optional set of adjustments from the list above.

I would love to see such detailed models if we have real use cases and
people interested in implementing them.

However, today we have a feature in moderately widespread use,
'-ffast-math'. Its semantics may not be the ideal way to enable
restricted, predictable optimizations of floating point operations, but
they are effective for a wide range of programs today.

I think having a generic flag value which specifically attempts to model
the *loose* semantics of '-ffast-math' is really important, and any more
detailed framework for classifying and enabling specific optimizations
should be layered on afterward. While I share your frustration with the
very vague and hard-to-reason-about semantics of '-ffast-math', I think
we can provide a clear enough spec to make it implementable, and we
should give ourselves the freedom to implement all the optimizations
within that spec which existing applications rely on for performance.

> And, again, I think this should be a function-level model, unless
> specified otherwise on the instruction, as that will be the case in
> 99.9999% of compilations.

I actually lobbied with Duncan to use a function default with
instruction-level overrides, but after his posts about the metadata
overhead of just doing it on each instruction, I think his approach is
simpler.

As he argued to me, *eventually* this has to end up on the instruction
in order to model inlining correctly -- a function compiled with
'-ffast-math' might be inlined into a function compiled without it, and
vice versa. Since you need this ability anyway, it makes sense to
simplify the inliner, the metadata schema, etc. and just always place
the data on the instructions *unless* there is some significant scaling
problem. I think Duncan has demonstrated it scales pretty well.
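A minimal standalone C sketch (illustrative only, not from the original
mails) of why several of the transformations listed above are
value-unsafe under IEEE-754; compile it without '-ffast-math' to see the
strict results:

    #include <math.h>
    #include <stdio.h>

    int main(void) {
        /* x*0.0 => 0.0 is unsafe: it loses NaNs, infinities and the
           sign of zero. */
        printf("NAN * 0.0  = %g\n", NAN * 0.0);       /* nan, not 0 */
        printf("INF * 0.0  = %g\n", INFINITY * 0.0);  /* nan, not 0 */
        printf("-1.0 * 0.0 = %g\n", -1.0 * 0.0);      /* -0, not +0 */

        /* Reassociation is unsafe: rounding makes fp addition
           non-associative. */
        double a = 1.0, b = 1e16, c = -1e16;
        printf("(a+b)+c = %g\n", (a + b) + c);  /* 0: the 1.0 is absorbed */
        printf("a+(b+c) = %g\n", a + (b + c));  /* 1 */
        return 0;
    }

A "fast" mode licenses exactly these kinds of result changes, which is
why the proposal above treats the NaN/infinity behaviour and the
rounding behaviour as separable knobs.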
On Sun, Apr 15, 2012 at 3:53 AM, Chandler Carruth <chandlerc at google.com> wrote:

>> And, again, I think this should be a function-level model, unless
>> specified otherwise on the instruction, as that will be the case in
>> 99.9999% of compilations.
>
> I actually lobbied with Duncan to use a function default with
> instruction-level overrides, but after his posts about the metadata
> overhead of just doing it on each instruction, I think his approach is
> simpler.
>
> As he argued to me, *eventually* this has to end up on the instruction
> in order to model inlining correctly -- a function compiled with
> '-ffast-math' might be inlined into a function compiled without it, and
> vice versa. Since you need this ability anyway, it makes sense to
> simplify the inliner, the metadata schema, etc. and just always place
> the data on the instructions *unless* there is some significant scaling
> problem. I think Duncan has demonstrated it scales pretty well.

For simple metadata, like "fast" in the initial proposal, that could be
OK. But if more complex metadata is possible (like I've described), then
this approach could consume more bitcode size than expected. And I'm
sure there will be attempts to add fine-grained precision control; the
first candidate is probably enabling/disabling FMAs.

Inlining is a valid concern, though within a single module the fp model
will be the same in the absolute majority of cases. People also tend to
use consistent flags across a project, so it shouldn't be a rare case
that the flags are consistent between modules either.

A function- or module-level default setting is really just an
optimization, but IMHO quite a useful one.

It would also simplify dumps and make it easier to understand what is
going on for people who don't want to dig into the details of fp
precision problems and be distracted by additional metadata.

Just to be clear: as it's not me who is going to implement this, I'm
just trying to draw attention to the issues that we'll eventually
encounter down the road.

Dmitry.
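A hypothetical two-file C sketch of the cross-module inlining scenario
discussed above (the file names, build flags, and the particular folded
expression are assumptions chosen for illustration):

    /* fastlib.c -- built with: cc -O2 -flto -ffast-math -c fastlib.c */
    float scale(float x) {
        /* Under fast-math the compiler may fold this to just x. */
        return (x / 3.0f) * 3.0f;
    }

    /* main.c -- built with: cc -O2 -flto -c main.c (no fast-math) */
    extern float scale(float x);
    float caller(float x) {
        /* At LTO link time, scale() may be inlined here. If the relaxed
           semantics were recorded only per function, the inliner would
           have to either refuse the inline or silently change this
           function's fp model; per-instruction annotations simply
           travel with the inlined body. */
        return scale(x) + x;
    }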
On Sun, Apr 15, 2012 at 3:50 AM, Dmitry Babokin <babokin at gmail.com> wrote:

> For simple metadata, like "fast" in the initial proposal, that could be
> OK. But if more complex metadata is possible (like I've described),
> then this approach could consume more bitcode size than expected. And
> I'm sure there will be attempts to add fine-grained precision control;
> the first candidate is probably enabling/disabling FMAs.
>
> Inlining is a valid concern, though within a single module the fp model
> will be the same in the absolute majority of cases. People also tend to
> use consistent flags across a project, so it shouldn't be a rare case
> that the flags are consistent between modules either.
>
> A function- or module-level default setting is really just an
> optimization, but IMHO quite a useful one.

And I don't disagree, I just think it is premature until we have
measured an issue with the simpler form. Since we will almost certainly
need the simpler form anyway, we might as well wait until the problem
manifests.

The reason I don't expect it to get worse with more complex
specifications is that the actual metadata nodes are uniqued. Thus we
should see many instructions all referring to the same (potentially
complex) node.

> It would also simplify dumps and make it easier to understand what is
> going on for people who don't want to dig into the details of fp
> precision problems and be distracted by additional metadata.

The IR is already not a normalized representation, though. Its primary
consumers and producers are libraries and machines, not humans. Debug
metadata, TBAA metadata, and numerous other complexities are already
present.

> Just to be clear: as it's not me who is going to implement this, I'm
> just trying to draw attention to the issues that we'll eventually
> encounter down the road.

Yep, I'm just trying to explain my perspective on these issues. =]
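For readers unfamiliar with uniquing, a small C sketch (hypothetical;
these are not LLVM's actual data structures) of why a rich, uniqued node
stays cheap no matter how many instructions reference it:

    #include <stdio.h>
    #include <string.h>

    typedef struct FPModel {        /* a hypothetical "complex" node */
        int allow_reassoc, no_nans, allow_fma;
    } FPModel;

    static FPModel pool[16];
    static int npool;

    /* Return a canonical pointer: equal models map to one shared node,
       so each instruction stores a single pointer, and the cost of a
       complex model is paid once per distinct setting, not per use. */
    static const FPModel *unique_model(FPModel m) {
        for (int i = 0; i < npool; i++)
            if (memcmp(&pool[i], &m, sizeof m) == 0)
                return &pool[i];
        pool[npool] = m;
        return &pool[npool++];
    }

    int main(void) {
        FPModel fast = {1, 1, 1};
        const FPModel *a = unique_model(fast); /* first fmul's node  */
        const FPModel *b = unique_model(fast); /* second fmul's node */
        printf("shared: %s\n", a == b ? "yes" : "no"); /* "yes" */
        return 0;
    }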
Hi,

> I would love to see such detailed models if we have real use cases and
> people interested in implementing them.
>
> However, today we have a feature in moderately widespread use,
> '-ffast-math'. Its semantics may not be the ideal way to enable
> restricted, predictable optimizations of floating point operations,
> but they are effective for a wide range of programs today.
>
> I think having a generic flag value which specifically attempts to
> model the *loose* semantics of '-ffast-math' is really important, and
> any more detailed framework for classifying and enabling specific
> optimizations should be layered on afterward. While I share your
> frustration with the very vague and hard-to-reason-about semantics of
> '-ffast-math', I think we can provide a clear enough spec to make it
> implementable, and we should give ourselves the freedom to implement
> all the optimizations within that spec which existing applications
> rely on for performance.

I agree with Chandler. Also, don't forget that the safest way to proceed
is to start with a permissive interpretation of the flags and tighten
them up later.

For example, suppose we start with an fpaccuracy of "fast" meaning:
ignore NaNs, ignore infinities, do whatever you like; and then later
tighten it to mean: do the right thing with NaNs and infinities, and
only introduce a bounded number of ULPs of error. This is conservatively
safe: existing bitcode created under the loose semantics will be
correctly optimized and codegened under the new tight semantics (just
less optimized than it used to be).

However, if we start with tight semantics and then decide later that
they were too tight, we are in trouble, since existing bitcode might
then undergo optimizations that the creator of the bitcode didn't want.

So I'd rather start with a quite permissive setup which seems generally
useful and allows the most important optimizations, and worry about
decomposing and tightening it later.

Given the fact that no one was interested enough to implement any kind
of relaxed floating point mode in LLVM IR in all the years gone by, I
actually suspect that there might never be anything more than this
simple and not very well defined 'fast-math' mode. But at least there is
a clear path for how to evolve towards a more sophisticated setup.

Ciao,

Duncan.
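As a concrete check of why the loose mode needs this room (a standalone
C program, not from the original mail), the a/b => a*(1/b) rewrite from
earlier in the thread rounds twice, so it does not always produce the
correctly rounded quotient:

    #include <stdio.h>

    int main(void) {
        int differ = 0, total = 0;
        for (int ai = 1; ai <= 100; ai++) {
            for (int bi = 1; bi <= 100; bi++) {
                volatile float a = (float)ai, b = (float)bi;
                float q1 = a / b;          /* correctly rounded       */
                float q2 = a * (1.0f / b); /* the "fast" rewrite:     */
                total++;                   /* 1/b is rounded before   */
                if (q1 != q2) differ++;    /* the multiply            */
            }
        }
        printf("%d of %d quotients change under a/b => a*(1/b)\n",
               differ, total);
        return 0;
    }

A tightened "bounded ULPs" mode could arguably still permit this rewrite
(the error is roughly within a couple of ULPs), while a strict mode
could not, which is exactly the loosen-first asymmetry Duncan describes.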
On 15 April 2012 09:22, Duncan Sands <baldrick at free.fr> wrote:

> Given the fact that no one was interested enough to implement any kind
> of relaxed floating point mode in LLVM IR in all the years gone by, I
> actually suspect that there might never be anything more than this
> simple and not very well defined 'fast-math' mode. But at least there
> is a clear path for how to evolve towards a more sophisticated setup.

Once it's implemented, there will be zealots complaining that your
"-ffast-math" is not as good as <insert-compiler-here>'s. But you can
kindly ask them to contribute code.

--
cheers,
--renato

http://systemcall.org/