Hi Dmitry,

> That's possible (I already discussed this with Chandler), but in my opinion it is
> only worth doing if we see unreasonable increases in bitcode size in real code.
>
> What is reasonable or not is defined not only by absolute numbers (0.8% or any
> other number). Does it make sense to increase bitcode size by 1% if it's used
> only by math library writers and a couple of other people who really care about
> precision *and* performance at the same time, and are knowledgeable enough to
> restrict precision on particular instructions only? In my experience it is an
> extremely rare case that people want more than compiler flags to control fp
> accuracy and are ready to deal with pragmas (when they are available).

There is no increase in bitcode size if you don't use this feature. If more
options are added it will hardly increase the bitcode size: there will be one
metadatum holding all the options (!0 = metadata !{ this, that, other }), and
each instruction just holds a reference to it. So the size increase isn't
(number of options) * (number of instructions), it is
(number of options) + (number of instructions).

> And, again, I think this should be a function-level model, unless specified
> otherwise on the instruction, as it will be the case in 99.9999% of
> compilations.

Link-time optimization will sometimes result in "fast-math" functions being
inlined into non-fast-math functions and vice versa. This pretty much
inevitably means that per-instruction fpmath options are required. That said,
to save space, if every fp instruction in a function has the same fpmath
metadata then the metadata could be attached to the function instead. But
since (in my opinion) the size increase is mild, I don't think it is worth
the added complexity.

Ciao, Duncan.
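To make the size argument concrete, here is a minimal IR sketch (the attachment name and the option set are illustrative only; the actual contents of the node were still being discussed at this point): a single metadata node carries all the options, and every fp instruction that wants them adds just one reference to it, so more options grow one line rather than every instruction.

    define double @sum_of_products(double %a, double %b, double %c, double %d) {
      %t0 = fmul double %a, %b, !fpmath !0
      %t1 = fmul double %c, %d, !fpmath !0
      %t2 = fadd double %t0, %t1, !fpmath !0
      ret double %t2
    }

    ; One node shared by all three instructions; adding more options only
    ; changes this line. The string here is a placeholder for whatever
    ; option set is eventually agreed on.
    !0 = metadata !{ metadata !"fast" }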
On 15 April 2012 09:07, Duncan Sands <baldrick at free.fr> wrote:
> Link-time optimization will sometimes result in "fast-math" functions being
> inlined into non-fast-math functions and vice versa. This pretty much
> inevitably means that per-instruction fpmath options are required.

I guess it would be a user error if a strict function used the results of a
non-strict function (explicitly compiled with -ffast-math) and then complained
about loss of precision. In that case, keeping the option per instruction
during inlining makes total sense.

Would there be a need to make fast-math less strict, i.e. to only use it when
no strict FP computation depends on its result? In that case, an option on the
whole function would guarantee that all inlined instructions are changed to
strict, even if they were relaxed in the first place.

Just guessing for the future; I agree with you that the first implementation
should be very simple, as it is.

cheers,
--renato
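A minimal sketch of the inlining scenario being discussed, assuming the per-instruction metadata form above (the functions, names and option node are hypothetical): after LTO inlines the relaxed callee into the strict caller, only the instructions that came from the callee carry the metadata, which is something a single function-level flag could not express.

    ; Compiled with -ffast-math: the fp instruction carries the relaxed options.
    define double @relaxed(double %x, double %y) {
      %m = fmul double %x, %y, !fpmath !0
      ret double %m
    }

    ; Compiled strictly: no fpmath metadata anywhere.
    define double @strict(double %x, double %y) {
      %c = call double @relaxed(double %x, double %y)
      %s = fadd double %c, 1.0
      ret double %s
    }

    ; After inlining, the fmul keeps its metadata but the caller's fadd stays strict.
    define double @strict.after.inlining(double %x, double %y) {
      %m = fmul double %x, %y, !fpmath !0
      %s = fadd double %m, 1.0
      ret double %s
    }

    !0 = metadata !{ metadata !"fast" }   ; hypothetical option set, as before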
On Sun, Apr 15, 2012 at 1:20 PM, Renato Golin <rengolin at systemcall.org> wrote:
> On 15 April 2012 09:07, Duncan Sands <baldrick at free.fr> wrote:
>> Link-time optimization will sometimes result in "fast-math" functions being
>> inlined into non-fast-math functions and vice versa. This pretty much
>> inevitably means that per-instruction fpmath options are required.
>
> I guess it would be a user error if a strict function used the results of a
> non-strict function (explicitly compiled with -ffast-math) and then complained
> about loss of precision. In that case, keeping the option per instruction
> during inlining makes total sense.

It's not a user error. The user knows his code, and the accuracy of his code,
much better than any compiler possibly could, and may have strong reasons to
specify fast-math for one function and not for another.

> Would there be a need to make fast-math less strict, i.e. to only use it when
> no strict FP computation depends on its result? In that case, an option on the
> whole function would guarantee that all inlined instructions are changed to
> strict, even if they were relaxed in the first place.

If the user specified different fp models for different functions on purpose,
then most likely you will ruin performance by assuming the stricter model for
the result of inlining.

> Just guessing for the future; I agree with you that the first implementation
> should be very simple, as it is.
>
> cheers,
> --renato
[Resend as I forgot this list doesn't set reply-to to list. Oops]

On Sun, Apr 15, 2012 at 10:20 AM, Renato Golin <rengolin at systemcall.org> wrote:
> On 15 April 2012 09:07, Duncan Sands <baldrick at free.fr> wrote:
>> Link-time optimization will sometimes result in "fast-math" functions being
>> inlined into non-fast-math functions and vice versa. This pretty much
>> inevitably means that per-instruction fpmath options are required.
>
> I guess it would be a user error if a strict function used the results of a
> non-strict function (explicitly compiled with -ffast-math) and then complained
> about loss of precision. In that case, keeping the option per instruction
> during inlining makes total sense.

As a writer of numerical code, the perspective being taken here seems bizarre
to me. I would never write code, or use optimizations, that I expect to produce
inaccurate results. What I would do is write code which, _for the input data it
is going to see_, is not going to be (to any noticeable degree) less accurate
if some optimizations are used. (It is well known that for most such
optimizations there are some sets of input data that cause big changes in
accuracy; however, there seems to be no neat way of telling the compiler that
these aren't going to occur other than by specifying modes/allowed
transformations.) As such, when code that uses more optimizations ("fast-math
flagged code") is inlined into more sensitive code, the sensitive code that
consumes those results still needs "strict math" to retain accuracy through to
the final result.

My personal interest is in automatic differentiation, where there are two kinds
of "variable entities" in the code after auto-differentiation, original
variables and derivatives, and it is desirable to apply different fp
optimizations to the two kinds. (It's quite important that 0*x -> 0 is used to
shrink down the number of "pointless" instructions generated for derivatives.)
However, I have to admit I can't think of any other problem where I'd want
control over fp optimizations at the per-instruction level, so I don't know if
it's worth it for the LLVM codebase in general.

Finally, a minor aside: I was talking to Duncan Sands at EuroLLVM, discussing
whether the FP optimizations would apply to vector ops as well as scalar ops,
and he mentioned that the plan was to mirror the integer case, where vector
code should be optimized as well as scalar code. Since there are no FP
optimizations yet, I looked at what LLVM produces for integer code for

    t0 := a * b
    t1 := c * d
    t2 := t0 + t1
    t3 := t2 + e
    return t3

in the 16 cases where a and c are each taken from {variable, -1, 0, +1}, in
both the scalar and vector cases. The good news is that in each case both the
scalar and the vector code get fully optimized; interestingly, however,
different choices are made in a couple of cases between vector and scalar.
(Basically, given an expression like w+x+y-z there are various ways to build it
from binary instructions, and different choices seem to be made.) Anyway, I'll
rerun this test code for FP mode once some FP optimizations are implemented.

HTH,
Dave Tweed
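For concreteness, here is a sketch of what one of those 16 cases might look like in IR (my own reconstruction, not Dave's actual test code), taking a = -1 and c as a variable, in both scalar and vector form. The expectation described above is that the standard optimizations fold the multiply-by-minus-one into a subtraction in both versions.

    ; Scalar: a = -1, so a*b + c*d + e is expected to simplify to (c*d + e) - b.
    define i32 @scalar_case(i32 %b, i32 %c, i32 %d, i32 %e) {
      %t0 = mul i32 -1, %b
      %t1 = mul i32 %c, %d
      %t2 = add i32 %t0, %t1
      %t3 = add i32 %t2, %e
      ret i32 %t3
    }

    ; Vector analogue of the same case.
    define <4 x i32> @vector_case(<4 x i32> %b, <4 x i32> %c, <4 x i32> %d, <4 x i32> %e) {
      %t0 = mul <4 x i32> <i32 -1, i32 -1, i32 -1, i32 -1>, %b
      %t1 = mul <4 x i32> %c, %d
      %t2 = add <4 x i32> %t0, %t1
      %t3 = add <4 x i32> %t2, %e
      ret <4 x i32> %t3
    }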