Oleg Ranevskyy
2014-Sep-22 15:56 UTC
[LLVMdev] Bug 16257 - fmul of undef ConstantExpr not folded to undef
Hi Duncan,

On 17.09.2014 21:10, Duncan Sands wrote:
> Hi Oleg,
>
> On 17/09/14 18:45, Oleg Ranevskyy wrote:
>> Hi,
>>
>> Thank you for all your helpful comments.
>>
>> To sum up, below is the list of correct folding examples for fadd:
>>     (1) fadd %x, -0.0        -> %x
>>     (2) fadd undef, undef    -> undef
>>     (3) fadd %x, undef       -> NaN (undef is a NaN which is propagated)
>>
>> Looking through the code I found the "NoNaNs" flag accessed through an instance
>> of the FastMathFlags class.
>> (2) and (3) should probably depend on it.
>> If the flag is set, (2) and (3) cannot be folded as there are no NaNs and we are
>> not guaranteed to get an arbitrary bit pattern from fadd, right?
>
> I think it's exactly the other way round: if NoNans is set then you can fold
> (2) and (3) to undef.  That's because (IIRC) the NoNans flag promises that no
> NaNs will be used by the program.  However "undef" could be a NaN, thus the
> promise is broken, meaning the program is performing undefined behaviour, and
> you can do whatever you want.

Oh, I see the point now. I thought if NoNaNs was set then no NaNs were possible
at all. But undef is still an arbitrary bit pattern that might occasionally be
the same as the one of a NaN. Thank you for the explanation.

Thus, "fadd/fsub/fmul/fdiv undef, undef" can always be folded to undef, whereas
"fadd/fsub/fmul/fdiv %x, undef" is folded to either undef (NoNaNs is set) or a
NaN (NoNaNs is not set).

Oleg

>
>>
>> Other arithmetic FP operations (fsub, fmul, fdiv) also propagate NaNs. Thus, the
>> same rules seem applicable to them as well:
>> ---------------------------------------------------------------------
>> - fdiv:
>>     (4) "fdiv %x, undef" is now folded to undef.
>
> But should be folded to NaN, not undef.
>
>>     The code comment states this is done because undef might be a sNaN. We
>> can't rely on sNaNs as they can either be masked or the platform might not have
>> FP exceptions at all.
>> Nevertheless, such folding is still correct due to the NaN
>> propagation rules we found in the Standard - undef might be chosen to be a NaN
>> and its payload will be propagated.
>>     Moreover, this looks similar to (3) and can be folded to a NaN. /Is it
>> worth doing?/
>
> As the current folding to undef is wrong, it has to be fixed.
>
>>
>>     (5) fdiv undef, undef    -> undef
>
> Yup.
>
>> ---------------------------------------------------------------------
>> - fmul:
>>     (6) fmul undef, undef    -> undef
>
> Yup.
>
>>     (7) fmul %x, undef       -> NaN or undef (undef is a NaN, which is propagated)
>
> Should be folded to NaN, not undef.
>
>> ---------------------------------------------------------------------
>> - fsub:
>>     (8) fsub %x, -0.0        -> %x (if %x is not -0.0; works this way now)
>
> Should this be: fsub %x, +0.0 ?

fsub %x, +0.0 is also covered and always folded to %x.
The version with -0.0 is similar except it additionally checks if %x is not -0.0.

>
>>     (9) fsub %x, undef       -> NaN or undef (undef is a NaN, which is propagated)
>
> Should fold to NaN not undef.
>
>>     (10) fsub undef, undef   -> undef
>
> Yup.
>
> Ciao, Duncan.
>
>> ---------------------------------------------------------------------
>>
>> I will be very thankful if you could review this final summary and share your
>> thoughts.
>>
>> Thank you.
>>
>> P.S. Sorry for bothering you again and again.
>> Just want to make sure I clearly understand the subject in order to make correct
>> code changes and to be able to help others with this in the future.
>>
>> Kind regards,
>> Oleg
>>
>> On 16.09.2014 21:42, Duncan Sands wrote:
>>> On 16/09/14 19:37, Owen Anderson wrote:
>>>> As far as I know, LLVM does not try very hard to guarantee constant folded
>>>> NaN payloads that match exactly what the target would generate.
>>>
>>> I'm with Owen here.
>>> Unless ARM people object, I think it is reasonable to say
>>> that at the LLVM IR level we may assume that the IEEE rules are followed.
>>>
>>> Ciao, Duncan.
>>>
>>>>
>>>> -- Owen
>>>>
>>>>> On Sep 16, 2014, at 10:30 AM, Oleg Ranevskyy <llvm.mail.list at gmail.com> wrote:
>>>>>
>>>>> Hi Duncan,
>>>>>
>>>>> I reread everything we've discussed so far and would like to pay closer
>>>>> attention to the ARM's FPSCR register mentioned by Stephen.
>>>>> It's really possible on ARM systems that floating point operations on one or
>>>>> more qNaN operands return a NaN different from the operands. I.e. operand
>>>>> NaN is not propagated. This happens when the "default NaN" flag is set in
>>>>> the FPSCR (floating point status and control register). The result in this
>>>>> case is some default NaN value.
>>>>>
>>>>> This means "fadd %x, -0.0", which is currently folded to %x by
>>>>> InstructionSimplify, might produce a different result if %x is a NaN. This
>>>>> breaks the NaN propagation rules the IEEE standard establishes and
>>>>> significantly reduces folding capabilities for the FP operations.
>>>>>
>>>>> This also applies to "fadd undef, undef" and "fadd %x, undef". We can't rely
>>>>> on getting an arbitrary NaN here on ARMs.
>>>>>
>>>>> Would you be able to confirm this please?
>>>>>
>>>>> Thank you in advance for your time!
>>>>>
>>>>> Kind regards,
>>>>> Oleg
>>>>>
>>>>> On 10.09.2014 22:50, Duncan Sands wrote:
>>>>>> Hi Oleg,
>>>>>>
>>>>>> On 01/09/14 18:46, Oleg Ranevskyy wrote:
>>>>>>> Hi Duncan,
>>>>>>>
>>>>>>> I looked through the IEEE standard and here is what I found:
>>>>>>>
>>>>>>> *6.2 Operations with NaNs*
>>>>>>> /"For an operation with quiet NaN inputs, other than maximum and minimum
>>>>>>> operations, if a floating-point result is to be delivered the result shall be a
>>>>>>> quiet NaN which should be one of the input NaNs"/.
>>>>>>>
>>>>>>> *6.2.3 NaN propagation*
>>>>>>> /"An operation that propagates a NaN operand to its result and has a single NaN
>>>>>>> as an input should produce a NaN with the payload of the input NaN if
>>>>>>> representable in the destination format"./
>>>>>>
>>>>>> thanks for finding this out.
>>>>>>
>>>>>>>
>>>>>>> Floating point add propagates a NaN. There is no conversion in the context of
>>>>>>> LLVM's fadd. So, if %x in "fadd %x, -0.0" is a NaN, the result is also a NaN
>>>>>>> with the same payload.
>>>>>>
>>>>>> Yes, folding "fadd %x, -0.0" to "%x" is correct.  This implies that "fadd
>>>>>> undef, undef" can be folded to "undef".
>>>>>>
>>>>>>>
>>>>>>> As regards "fadd %x, undef", where %x might be a NaN and undef might be chosen
>>>>>>> to be (probably some different) NaN, and a possibility to fold this to a
>>>>>>> constant (NaN), the standard says:
>>>>>>> /"If two or more inputs are NaN, then the payload of the resulting NaN should be
>>>>>>> identical to the payload of one of the input NaNs if representable in the
>>>>>>> destination format. *This standard does not specify which of the input NaNs will
>>>>>>> provide the payload*"/.
>>>>>>>
>>>>>>> Thus, this makes it possible to fold "fadd %x, undef" to a NaN. Is this right?
>>>>>>
>>>>>> Yes, I agree.
>>>>>>
>>>>>> Ciao, Duncan.
>>>>>>
>>>>>>>
>>>>>>> Oleg
>>>>>>>
>>>>>>> On 01.09.2014 10:04, Duncan Sands wrote:
>>>>>>>> Hi Oleg,
>>>>>>>>
>>>>>>>> On 01/09/14 15:42, Oleg Ranevskyy wrote:
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> Thank you for your comment, Owen.
>>>>>>>>> My LLVM expertise is certainly not enough to make such decisions yet.
>>>>>>>>> Duncan, do you have any comments on this or do you know anyone else who can
>>>>>>>>> decide about preserving NaN payloads?
>>>>>>>>
>>>>>>>> my take is that the first thing to do is to see what the IEEE standard says
>>>>>>>> about NaNs.  Consider for example "fadd x, -0.0".  Does the standard specify
>>>>>>>> the exact NaN bit pattern produced as output when a particular NaN x is
>>>>>>>> input?  Or does it just say that the output is a NaN?  If the standard doesn't
>>>>>>>> care exactly which NaN is output, I think it is reasonable for LLVM to assume
>>>>>>>> it is whatever NaN is most convenient for LLVM; in this case that means using
>>>>>>>> x itself as the output.
>>>>>>>>
>>>>>>>> However this approach does implicitly mean that we may end up not folding
>>>>>>>> floating point operations completely deterministically: depending on the
>>>>>>>> optimization that kicks in, in one case we might fold to NaN A, and in some
>>>>>>>> different optimization we might fold the same expression to NaN B.  I think
>>>>>>>> this is pretty reasonable, but it is something to be aware of.
>>>>>>>>
>>>>>>>> Ciao, Duncan.
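
The signed-zero identities mentioned in this message, items (1) and (8) plus the "fsub %x, +0.0" case, can be checked empirically for non-NaN inputs. What follows is a minimal Python sketch, not LLVM code. The helper names `bits` and `same_bits` and the sample list are made up for illustration, and the sketch assumes Python floats are IEEE-754 binary64 evaluated in the default round-to-nearest mode. NaN inputs are deliberately left out because payload propagation depends on the hardware.

```python
import math
import struct


def bits(x: float) -> int:
    """Return the IEEE-754 binary64 bit pattern of x as an integer (hypothetical helper)."""
    return struct.unpack("<Q", struct.pack("<d", x))[0]


def same_bits(a: float, b: float) -> bool:
    """Bitwise comparison, so that +0.0 and -0.0 count as different values."""
    return bits(a) == bits(b)


# Non-NaN samples only: zeros of both signs, normals, a subnormal, infinities.
SAMPLES = [0.0, -0.0, 1.5, -2.25, 1e308, 5e-324, math.inf, -math.inf]

# (1) fadd %x, -0.0 -> %x: x + (-0.0) reproduces x bit for bit.
fadd_neg_zero_ok = all(same_bits(x + (-0.0), x) for x in SAMPLES)

# fsub %x, +0.0 -> %x: x - 0.0 reproduces x bit for bit.
fsub_pos_zero_ok = all(same_bits(x - 0.0, x) for x in SAMPLES)

# (8) fsub %x, -0.0 -> %x needs the extra check: (-0.0) - (-0.0) yields +0.0.
fsub_neg_zero_failures = [x for x in SAMPLES if not same_bits(x - (-0.0), x)]

# For contrast, fadd %x, +0.0 is not an identity: (-0.0) + 0.0 yields +0.0.
fadd_pos_zero_failures = [x for x in SAMPLES if not same_bits(x + 0.0, x)]
```

In both failure lists the only offending sample is -0.0, which matches the statement that the -0.0 variant of the fsub fold additionally checks that %x is not -0.0.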
Duncan Sands
2014-Sep-23 13:58 UTC
[LLVMdev] Bug 16257 - fmul of undef ConstantExpr not folded to undef
Hi Oleg,

On 22/09/14 17:56, Oleg Ranevskyy wrote:
> Hi Duncan,
>
> On 17.09.2014 21:10, Duncan Sands wrote:
>> Hi Oleg,
>>
>> On 17/09/14 18:45, Oleg Ranevskyy wrote:
>>> Hi,
>>>
>>> Thank you for all your helpful comments.
>>>
>>> To sum up, below is the list of correct folding examples for fadd:
>>>     (1) fadd %x, -0.0        -> %x
>>>     (2) fadd undef, undef    -> undef
>>>     (3) fadd %x, undef       -> NaN (undef is a NaN which is propagated)
>>>
>>> Looking through the code I found the "NoNaNs" flag accessed through an instance
>>> of the FastMathFlags class.
>>> (2) and (3) should probably depend on it.
>>> If the flag is set, (2) and (3) cannot be folded as there are no NaNs and we are
>>> not guaranteed to get an arbitrary bit pattern from fadd, right?
>>
>> I think it's exactly the other way round: if NoNans is set then you can fold
>> (2) and (3) to undef.  That's because (IIRC) the NoNans flag promises that no
>> NaNs will be used by the program.  However "undef" could be a NaN, thus the
>> promise is broken, meaning the program is performing undefined behaviour, and
>> you can do whatever you want.
> Oh, I see the point now. I thought if NoNaNs was set then no NaNs were possible
> at all. But undef is still an arbitrary bit pattern that might occasionally be
> the same as the one of a NaN. Thank you for the explanation.
>
> Thus, "fadd/fsub/fmul/fdiv undef, undef" can always be folded to undef, whereas
> "fadd/fsub/fmul/fdiv %x, undef" is folded to either undef (NoNaNs is set) or a
> NaN (NoNaNs is not set).

for fmul and fdiv, the reasoning does depend on fmul %x, 1.0 always being equal
to %x (likewise: fdiv %x, 1.0 being equal to %x).  Is this true?

Ciao, Duncan.
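
The question of whether "fmul %x, 1.0" and "fdiv %x, 1.0" always give back %x can at least be probed for non-NaN inputs. What follows is a minimal Python sketch, not LLVM code. The helper `bits` and the sample list are made up for illustration, and Python floats are assumed to be IEEE-754 binary64. It does not settle the NaN case: whether a NaN payload survives the operation depends on the target, so the sketch only checks that a NaN stays a NaN.

```python
import math
import struct


def bits(x: float) -> int:
    """Return the IEEE-754 binary64 bit pattern of x as an integer (hypothetical helper)."""
    return struct.unpack("<Q", struct.pack("<d", x))[0]


# Non-NaN samples: both zeros, normals, the smallest subnormal,
# the smallest normal, and both infinities.
SAMPLES = [
    0.0, -0.0, 1.5, -2.25, 1e308,
    5e-324, 2.2250738585072014e-308,
    math.inf, -math.inf,
]

# Multiplying or dividing by 1.0 is exact, so the bit pattern is unchanged.
fmul_one_ok = all(bits(x * 1.0) == bits(x) for x in SAMPLES)
fdiv_one_ok = all(bits(x / 1.0) == bits(x) for x in SAMPLES)

# A NaN operand yields a NaN result; the exact payload is not examined here.
nan_stays_nan = math.isnan(math.nan * 1.0) and math.isnan(math.nan / 1.0)
```

For the sampled values both identities hold bit for bit, so any remaining doubt about the fold concerns NaN operands only.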
Oleg Ranevskyy
2014-Sep-23 15:32 UTC
[LLVMdev] Bug 16257 - fmul of undef ConstantExpr not folded to undef
Hi Duncan,

On 23.09.2014 17:58, Duncan Sands wrote:
> Hi Oleg,
>
> On 22/09/14 17:56, Oleg Ranevskyy wrote:
>> Hi Duncan,
>>
>> On 17.09.2014 21:10, Duncan Sands wrote:
>>> Hi Oleg,
>>>
>>> On 17/09/14 18:45, Oleg Ranevskyy wrote:
>>>> Hi,
>>>>
>>>> Thank you for all your helpful comments.
>>>>
>>>> To sum up, below is the list of correct folding examples for fadd:
>>>>     (1) fadd %x, -0.0        -> %x
>>>>     (2) fadd undef, undef    -> undef
>>>>     (3) fadd %x, undef       -> NaN (undef is a NaN which is propagated)
>>>>
>>>> Looking through the code I found the "NoNaNs" flag accessed through an instance
>>>> of the FastMathFlags class.
>>>> (2) and (3) should probably depend on it.
>>>> If the flag is set, (2) and (3) cannot be folded as there are no NaNs and we are
>>>> not guaranteed to get an arbitrary bit pattern from fadd, right?
>>>
>>> I think it's exactly the other way round: if NoNans is set then you can fold
>>> (2) and (3) to undef.  That's because (IIRC) the NoNans flag promises that no
>>> NaNs will be used by the program.  However "undef" could be a NaN, thus the
>>> promise is broken, meaning the program is performing undefined behaviour, and
>>> you can do whatever you want.
>> Oh, I see the point now. I thought if NoNaNs was set then no NaNs were possible
>> at all. But undef is still an arbitrary bit pattern that might occasionally be
>> the same as the one of a NaN. Thank you for the explanation.
>>
>> Thus, "fadd/fsub/fmul/fdiv undef, undef" can always be folded to undef, whereas
>> "fadd/fsub/fmul/fdiv %x, undef" is folded to either undef (NoNaNs is set) or a
>> NaN (NoNaNs is not set).
>
> for fmul and fdiv, the reasoning does depend on fmul %x, 1.0 always being equal
> to %x (likewise: fdiv %x, 1.0 being equal to %x).  Is this true?

Do you mean that we can't apply the "fmul/fdiv undef, undef -> undef" folding if
"fmul/fdiv %x, 1.0" is not guaranteed to be %x?
If we choose one undef to have an arbitrary bit pattern and the other undef to be
1.0, we need a guarantee that the result has the bit pattern of the first undef.
Do I get it right?

I checked the standard regarding "x*1.0 == x" and found that only "10.4 Literal
meaning and value-changing optimizations" addresses this.
I don't pretend to thoroughly understand this paragraph yet, but it seems to me
that language standards are required to preserve the literal meaning of the
source code. Applying the identity property x*1 is a part of this.
Here is a quote from IEEE-754:

/"The following value-changing transformations, among others, preserve the
literal meaning of the source code:
 - Applying the identity property 0 + x when x is not zero and is not a
   signaling NaN and the result has the same exponent as x.
 - Applying the identity property 1 × x when x is not a signaling NaN and the
   result has the same exponent as x."/

Maybe Owen or Stephen would be able to clarify this.

Thank you.
Oleg
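
The folding rules for undef operands as summarised in this thread can be written down as a small decision function. What follows is a minimal Python sketch of that summary, not LLVM's InstructionSimplify code. The names `fold_with_undef`, `UNDEF`, `NAN` and `FP_OPS` are made up for illustration, operands are modelled as plain strings, and treating "undef, %x" the same as "%x, undef" is an assumption based on NaNs propagating from either operand. The open question about "fmul/fdiv undef, undef" raised above is not resolved here; the sketch simply encodes the rule as stated earlier in the thread.

```python
from typing import Optional

UNDEF = "undef"  # stand-in for an LLVM undef operand
NAN = "NaN"      # stand-in for a folded NaN constant
FP_OPS = {"fadd", "fsub", "fmul", "fdiv"}


def fold_with_undef(op: str, lhs: str, rhs: str, no_nans: bool = False) -> Optional[str]:
    """Return the folded result for an FP op with undef operands, or None if no rule applies.

    Rules taken from the thread:
      op undef, undef -> undef
      op %x, undef    -> undef if NoNaNs is set, otherwise NaN
    """
    if op not in FP_OPS:
        raise ValueError(f"unsupported operation: {op}")
    if lhs == UNDEF and rhs == UNDEF:
        return UNDEF
    if lhs == UNDEF or rhs == UNDEF:
        # With NoNaNs set, choosing undef to be a NaN breaks the flag's promise,
        # so the result may be anything; otherwise the NaN propagates.
        return UNDEF if no_nans else NAN
    return None


examples = [
    fold_with_undef("fmul", "undef", "undef"),
    fold_with_undef("fdiv", "%x", "undef"),
    fold_with_undef("fdiv", "%x", "undef", no_nans=True),
    fold_with_undef("fadd", "%x", "%y"),
]
```

Keeping the NoNaNs decision in a single branch mirrors the argument that the flag widens what may be folded to undef instead of blocking the fold.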