Duncan Sands
2014-Aug-27 13:34 UTC
[LLVMdev] Bug 16257 - fmul of undef ConstantExpr not folded to undef
Hi Oleg, On 27/08/14 18:17, Oleg Ranevskyy wrote:> Duncan, >> Hi Oleg, >> >>> >> /This is either a mistake, or a decision that in LLVM IR snans are always >>> considered to be signalling. / >>> Yes, this seems to be an agreement to treat "undef" as a SNaN for "fdiv". >> >> "undef" is whatever bit pattern you want it to be, i.e. the compiler can >> assume it is any convenient value. For one fdiv optimization maybe it is >> convenient to choose it to be the bit pattern of an snan, for some other fdiv >> optimization the compiler can choose it to be 1.0 if that is convenient. >> >>> The question is whether we can make the same assumption for other floating point >>> operations, or "fdiv" needs a correction to prevent folding since signalling of >>> SNaNs might be disabled. >> >> You can assume that undef is an snan if you want in any floating point >> operation. But what does that assumption buy you? If you are willing to >> assume that the processor will trap on snans then it buys you a lot for any >> floating point operation, but if you are not willing to assume that then you >> need to find some other trick or give up on folding the operation to undef. > This is confusing me a bit. > On the one hand, we can assume undef to be an SNaN for the sake of folding. > InstructionSimplify makes such an assumption and folds "fdiv undef, %X" and > "fdiv %X, undef" to undef. > On the other hand, there are the requests to fold "fmul undef, undef" / "fadd > undef, undef" to undef as well. However, it was stated that such folding is > questionable as signalling of SNaNs can potentially be disabled. > Shouldn't fdiv / fmul / fadd be consistent as regards the assumptions they make?use, they should be consistent. Either snans can be assumed to be signalling, in which case this information should be used everywhere, or they can't be assumed to be signalling in which case the signalling argument can't be used anywhere.> Could you please advise how we should proceed with the bugs 16257 and 16258 then?I think you should try to get LLVM floating point experts involved, to find out their opinion about whether LLVM should really assume that snans always trap. If they think it is fine to assume trapping, then you can fold any floating point operation with an "undef" operand to "undef". If they think it is no good, then the existing folds that use this need to be removed or weakened, though maybe another argument can be found to justify them. Ciao, Duncan.>> >>> >>> >> /InstructionSimplify folds "mul %X, undef" to 0 always/ >>> Sorry, I malformed this line and forgot to highlight that by "%X" I meant a >>> constant here. So, constant folding comes into play. The result depends on the >>> constant parity. E.g.: >>> mul i64 5, undef --> undef >>> mul i64 4, undef --> 0 >>> I still have difficulty understanding why the first one is folded to "undef". >>> "Undef" could be zero, so the result is also zero. Besides, if "undef" isn't >>> zero, it's impossible to get an arbitrary bit pattern anyway. The result will >>> always be divisible by 5. >> >> No, that's not true, because we are dealing with arithmetic modulo 2^64 here. >> Since 5 and 2^64 are relatively prime, you can prove that given any i64 value >> R, there exists an i64 value X such that mul i64 5, X is equal to R. Here's >> an example using i3 arithmetic (possible values: 0, ..., 7). >> mul i3 5, 0 -> 0 >> mul i3 5, 5 -> 1 (because 5 * 5 is 25; reduced modulo 8 that gives 1) >> mul i3 5, 2 -> 2 >> mul i3 5, 7 -> 3 >> mul i3 5, 4 -> 4 >> mul i3 5, 1 -> 5 >> mul i3 5, 6 -> 6 >> mul i3 5, 3 -> 7 >> So you see that every possible i3 output is achievable. Thus it is correct to >> fold "mul i3 5, undef" to undef. The same goes for any other number that is >> relatively prime to 2, i.e. is an odd number. > Yes, this explanation is absolutely reasonable. Thanks! >> >> Ciao, Duncan. >> >>> >>> Would you be able to elaborate on this please? >>> Thank you! >>> >>> Kind regards, >>> Oleg >>> >>> On 26.08.2014 15:47, Duncan Sands wrote: >>>> Hi Oleg, >>>> >>>> On 26/08/14 21:20, Oleg Ranevskyy wrote: >>>>> Hi Duncan, >>>>> >>>>> Thank you for your comment to the bug 16257. >>>>> >>>>> I am new to LLVM, so not all the aspects of LLVM's /"undef"/ seem clear to me >>>>> yet. >>>>> I read and understood the examples from the LLVM documentation: >>>>> llvm.org/docs/LangRef.html#undefined-values >>>>> >>>>> However, those examples do not cover all of the possible contexts where >>>>> /"undef"/ can appear. >>>>> E.g., I can't realize why /"fmul undef, undef"/ may not result in/"undef"/ >>>>> whereas /"mul undef, undef"/ is always folded to /"undef"/. Seems I am missing >>>>> something important about floats here. >>>> >>>> in the case of mul undef, undef, the two inputs may be anything. In order to >>>> fold the mul to undef you have to prove that the output may be anything (any >>>> bit pattern). For example it would be wrong to fold "mul 0, undef" to undef >>>> since no matter what value you substitute for "undef" the output of the mul >>>> will always be zero. Note that in something like "mul undef, undef", where >>>> undef occurs more than once, you are allowed to assume that the two undefs are >>>> independent of each other, uncorrelated, i.e. in your reasoning you can insert >>>> any bit pattern for the first undef and any other bit pattern for the second >>>> undef. >>>> >>>> Here is a proof that "mul undef, undef" is undef. Let R represent an >>>> arbitrary bit pattern. We have to find two bit patterns X and Y such that >>>> "mul X, Y" is equal to R. For example set X to 1 and Y to R. This ends the >>>> proof. >>>> >>>> Now consider "fmul undef, undef". You could try to apply the same reasoning, >>>> and maybe it is OK. Let R represent an arbitrary bit pattern. Set Y = R and >>>> set X to be the bit pattern for the floating point number 1.0. Then "fmul X, >>>> Y" is equal to R, proving that "fmul undef, undef" can be folded to "undef". >>>> But... is this correct? Is it always true that "fmul 1.0, R" always has >>>> exactly the same bits as R (this is not the same as the output comparing equal >>>> to R using a floating point comparison)? After all, R might be something >>>> funky like a signed zero, an infinity, or a NaN. I don't know if it is always >>>> true if "fmul 1.0, R" always has the same bits as R, hopefully a floating >>>> point expert can tell us. >>>> >>>> You could always argue that you could choose "undef" to be a signalling NaN, >>>> causing the program to have undefined behaviour at the point of the >>>> multiplication, in which case you could do anything, but kindly instead just >>>> fold the fmul to undef. I don't much like using snans like this though since >>>> they can be made non-signalling by poking the processor AFAIK. >>>> >>>>> >>>>> The same applies to "fdiv". >>>>> /"fdiv undef, undef"/ is not folded but /"div %X, undef"/ and /"div undef, >>>>> %X"/ >>>>> are folded to /"undef"/ (SimplifyFDivInst function in >>>>> lib/Analysis/InstructionSimplify.cpp). >>>> >>>> Here is the argument for "div %X, undef -> undef". The undef value might be >>>> zero (div %X, 0), so lets choose it to be zero. That means that the program >>>> has undefined behaviour at this point, and so we could fold the div to >>>> "exit(1)" or "launch nuclear missile". But instead we only fold it to undef. >>>> >>>> You are wrong to say that "div undef, %X" is folded to "undef" by >>>> InstructionSimplify, it is folded to zero. It would be wrong to fold to >>>> undef, since (eg) if %X is 2 then you can't obtain an arbitrary bit pattern as >>>> the output. >>>> >>>> Fdiv is harder than div because a floating point division by 0.0 has a defined >>>> result, unlike for div. >>>> >>>> Moreover, SimplifyFDivInst does not take >>>>> into account whether signalling of SNaNs can be switched off or not - it >>>>> always >>>>> folds if one of the operands is /"undef"/. >>>> >>>> This is either a mistake, or a decision that in LLVM IR snans are always >>>> considered to be signalling. >>>> >>>>> >>>>> Another mysterious thing for me is folding of /"mul %X, undef"/. >>>>> The result depends on whether %X is odd or even: >>>>> >>>>> * "undef" if %X is odd or equal to "undef"; >>>>> * 0 otherwise. >>>> >>>> InstructionSimplify folds "mul %X, undef" to 0 always, since after all "undef" >>>> could be zero, in which case the output is 0. It would be wrong to fold it to >>>> undef, as you point out, and it isn't folded to undef. >>>> >>>>> >>>>> There is a similar bug 16258 about folding of /"fadd undef, undef"/. /"Add" >>>>> /gets folded to /"undef"/ if one or both its operands are /"undef"/. >>>>> Should /"fadd"/ behave differently than integer /"add"/ too? >>>> >>>> It's the same problem. It is easy to show that "add undef, %X" can produce >>>> any desired bit pattern as output, but it is less clear whether that is true >>>> for fadd, unless you use the snan argument. >>>> >>>> Ciao, Duncan. >>>> >>>>> >>>>> I've tried to google these questions, scanned StackOverflow, but didn't >>>>> find any >>>>> good articles / posts answering them. LLVM's /"undef"/ is covered quite >>>>> poorly. >>>>> >>>>> Duncan, do you know of any web resources explaining this in more detail? >>>>> >>>>> Thank you in advance for any help. >>>>> >>>>> Kind regards, >>>>> Oleg >>>> >>> >> >
Oleg Ranevskyy
2014-Aug-27 16:17 UTC
[LLVMdev] Bug 16257 - fmul of undef ConstantExpr not folded to undef
Duncan,> Hi Oleg, > >> >> /This is either a mistake, or a decision that in LLVM IR snans >> are always >> considered to be signalling. / >> Yes, this seems to be an agreement to treat "undef" as a SNaN for >> "fdiv". > > "undef" is whatever bit pattern you want it to be, i.e. the compiler > can assume it is any convenient value. For one fdiv optimization > maybe it is convenient to choose it to be the bit pattern of an snan, > for some other fdiv optimization the compiler can choose it to be 1.0 > if that is convenient. > >> The question is whether we can make the same assumption for other >> floating point >> operations, or "fdiv" needs a correction to prevent folding since >> signalling of >> SNaNs might be disabled. > > You can assume that undef is an snan if you want in any floating point > operation. But what does that assumption buy you? If you are willing > to assume that the processor will trap on snans then it buys you a lot > for any floating point operation, but if you are not willing to assume > that then you need to find some other trick or give up on folding the > operation to undef.This is confusing me a bit. On the one hand, we can assume undef to be an SNaN for the sake of folding. InstructionSimplify makes such an assumption and folds "fdiv undef, %X" and "fdiv %X, undef" to undef. On the other hand, there are the requests to fold "fmul undef, undef" / "fadd undef, undef" to undef as well. However, it was stated that such folding is questionable as signalling of SNaNs can potentially be disabled. Shouldn't fdiv / fmul / fadd be consistent as regards the assumptions they make? Could you please advise how we should proceed with the bugs 16257 and 16258 then?> >> >> >> /InstructionSimplify folds "mul %X, undef" to 0 always/ >> Sorry, I malformed this line and forgot to highlight that by "%X" I >> meant a >> constant here. So, constant folding comes into play. The result >> depends on the >> constant parity. E.g.: >> mul i64 5, undef --> undef >> mul i64 4, undef --> 0 >> I still have difficulty understanding why the first one is folded to >> "undef". >> "Undef" could be zero, so the result is also zero. Besides, if >> "undef" isn't >> zero, it's impossible to get an arbitrary bit pattern anyway. The >> result will >> always be divisible by 5. > > No, that's not true, because we are dealing with arithmetic modulo > 2^64 here. Since 5 and 2^64 are relatively prime, you can prove that > given any i64 value R, there exists an i64 value X such that mul i64 > 5, X is equal to R. Here's an example using i3 arithmetic (possible > values: 0, ..., 7). > mul i3 5, 0 -> 0 > mul i3 5, 5 -> 1 (because 5 * 5 is 25; reduced modulo 8 that gives 1) > mul i3 5, 2 -> 2 > mul i3 5, 7 -> 3 > mul i3 5, 4 -> 4 > mul i3 5, 1 -> 5 > mul i3 5, 6 -> 6 > mul i3 5, 3 -> 7 > So you see that every possible i3 output is achievable. Thus it is > correct to fold "mul i3 5, undef" to undef. The same goes for any > other number that is relatively prime to 2, i.e. is an odd number.Yes, this explanation is absolutely reasonable. Thanks!> > Ciao, Duncan. > >> >> Would you be able to elaborate on this please? >> Thank you! >> >> Kind regards, >> Oleg >> >> On 26.08.2014 15:47, Duncan Sands wrote: >>> Hi Oleg, >>> >>> On 26/08/14 21:20, Oleg Ranevskyy wrote: >>>> Hi Duncan, >>>> >>>> Thank you for your comment to the bug 16257. >>>> >>>> I am new to LLVM, so not all the aspects of LLVM's /"undef"/ seem >>>> clear to me >>>> yet. >>>> I read and understood the examples from the LLVM documentation: >>>> llvm.org/docs/LangRef.html#undefined-values >>>> >>>> However, those examples do not cover all of the possible contexts >>>> where >>>> /"undef"/ can appear. >>>> E.g., I can't realize why /"fmul undef, undef"/ may not result >>>> in/"undef"/ >>>> whereas /"mul undef, undef"/ is always folded to /"undef"/. Seems I >>>> am missing >>>> something important about floats here. >>> >>> in the case of mul undef, undef, the two inputs may be anything. In >>> order to >>> fold the mul to undef you have to prove that the output may be >>> anything (any >>> bit pattern). For example it would be wrong to fold "mul 0, undef" >>> to undef >>> since no matter what value you substitute for "undef" the output of >>> the mul >>> will always be zero. Note that in something like "mul undef, >>> undef", where >>> undef occurs more than once, you are allowed to assume that the two >>> undefs are >>> independent of each other, uncorrelated, i.e. in your reasoning you >>> can insert >>> any bit pattern for the first undef and any other bit pattern for >>> the second >>> undef. >>> >>> Here is a proof that "mul undef, undef" is undef. Let R represent an >>> arbitrary bit pattern. We have to find two bit patterns X and Y >>> such that >>> "mul X, Y" is equal to R. For example set X to 1 and Y to R. This >>> ends the >>> proof. >>> >>> Now consider "fmul undef, undef". You could try to apply the same >>> reasoning, >>> and maybe it is OK. Let R represent an arbitrary bit pattern. Set >>> Y = R and >>> set X to be the bit pattern for the floating point number 1.0. Then >>> "fmul X, >>> Y" is equal to R, proving that "fmul undef, undef" can be folded to >>> "undef". >>> But... is this correct? Is it always true that "fmul 1.0, R" always >>> has >>> exactly the same bits as R (this is not the same as the output >>> comparing equal >>> to R using a floating point comparison)? After all, R might be >>> something >>> funky like a signed zero, an infinity, or a NaN. I don't know if it >>> is always >>> true if "fmul 1.0, R" always has the same bits as R, hopefully a >>> floating >>> point expert can tell us. >>> >>> You could always argue that you could choose "undef" to be a >>> signalling NaN, >>> causing the program to have undefined behaviour at the point of the >>> multiplication, in which case you could do anything, but kindly >>> instead just >>> fold the fmul to undef. I don't much like using snans like this >>> though since >>> they can be made non-signalling by poking the processor AFAIK. >>> >>>> >>>> The same applies to "fdiv". >>>> /"fdiv undef, undef"/ is not folded but /"div %X, undef"/ and /"div >>>> undef, %X"/ >>>> are folded to /"undef"/ (SimplifyFDivInst function in >>>> lib/Analysis/InstructionSimplify.cpp). >>> >>> Here is the argument for "div %X, undef -> undef". The undef value >>> might be >>> zero (div %X, 0), so lets choose it to be zero. That means that the >>> program >>> has undefined behaviour at this point, and so we could fold the div to >>> "exit(1)" or "launch nuclear missile". But instead we only fold it >>> to undef. >>> >>> You are wrong to say that "div undef, %X" is folded to "undef" by >>> InstructionSimplify, it is folded to zero. It would be wrong to >>> fold to >>> undef, since (eg) if %X is 2 then you can't obtain an arbitrary bit >>> pattern as >>> the output. >>> >>> Fdiv is harder than div because a floating point division by 0.0 has >>> a defined >>> result, unlike for div. >>> >>> Moreover, SimplifyFDivInst does not take >>>> into account whether signalling of SNaNs can be switched off or not >>>> - it always >>>> folds if one of the operands is /"undef"/. >>> >>> This is either a mistake, or a decision that in LLVM IR snans are >>> always >>> considered to be signalling. >>> >>>> >>>> Another mysterious thing for me is folding of /"mul %X, undef"/. >>>> The result depends on whether %X is odd or even: >>>> >>>> * "undef" if %X is odd or equal to "undef"; >>>> * 0 otherwise. >>> >>> InstructionSimplify folds "mul %X, undef" to 0 always, since after >>> all "undef" >>> could be zero, in which case the output is 0. It would be wrong to >>> fold it to >>> undef, as you point out, and it isn't folded to undef. >>> >>>> >>>> There is a similar bug 16258 about folding of /"fadd undef, >>>> undef"/. /"Add" >>>> /gets folded to /"undef"/ if one or both its operands are /"undef"/. >>>> Should /"fadd"/ behave differently than integer /"add"/ too? >>> >>> It's the same problem. It is easy to show that "add undef, %X" can >>> produce >>> any desired bit pattern as output, but it is less clear whether that >>> is true >>> for fadd, unless you use the snan argument. >>> >>> Ciao, Duncan. >>> >>>> >>>> I've tried to google these questions, scanned StackOverflow, but >>>> didn't find any >>>> good articles / posts answering them. LLVM's /"undef"/ is covered >>>> quite poorly. >>>> >>>> Duncan, do you know of any web resources explaining this in more >>>> detail? >>>> >>>> Thank you in advance for any help. >>>> >>>> Kind regards, >>>> Oleg >>> >> >
Owen Anderson
2014-Aug-27 17:06 UTC
[LLVMdev] Bug 16257 - fmul of undef ConstantExpr not folded to undef
> On Aug 27, 2014, at 6:34 AM, Duncan Sands <duncan.sands at deepbluecap.com> wrote: > > I think you should try to get LLVM floating point experts involved, to find out their opinion about whether LLVM should really assume that snans always trap. > > If they think it is fine to assume trapping, then you can fold any floating point operation with an "undef" operand to "undef". > > If they think it is no good, then the existing folds that use this need to be removed or weakened, though maybe another argument can be found to justify them.LLVM is used to target many platforms for which sNaNs do not trap, and indeed many platforms that do not have floating point exceptions at all. —Owen -------------- next part -------------- An HTML attachment was scrubbed... URL: <lists.llvm.org/pipermail/llvm-dev/attachments/20140827/0d160d6c/attachment.html>
Stephen Canon
2014-Aug-28 12:39 UTC
[LLVMdev] Bug 16257 - fmul of undef ConstantExpr not folded to undef
> I think you should try to get LLVM floating point experts involved, to find out their opinion about whether LLVM should really assume that snans always trap.We definitely cannot assume that operations on sNaN always trap. Actually, are there *any* llvm-supported platforms where the invalid exception is unmasked by default? – Steve