Hello, R friends

My student unearthed this quirk that might interest you.

I wondered if this might be a bug in the R interpreter. If not a bug, it certainly stands as a good example of the dangers of floating point numbers in computing.

What do you think?

> 100*(23/40)
[1] 57.5
> (100*23)/40
[1] 57.5
> round(100*(23/40))
[1] 57
> round((100*23)/40)
[1] 58

The result in the 2 rounds should be the same, I think. Clearly some digital number devil is at work. I *guess* that when you put in whole numbers and group them like this (100*23), the interpreter does integer math, but if you group (23/40), you force a fractional division and a floating point number. The results from the first 2 calculations are not actually 57.5, they just appear that way.

Before you close the books, look at this:

> aa <- 100*(23/40)
> bb <- (100*23)/40
> all.equal(aa,bb)
[1] TRUE
> round(aa)
[1] 57
> round(bb)
[1] 58

I'm putting this one in my collection of "difficult to understand" numerical calculations.

If you have seen this before, I'm sorry to waste your time.

pj
--
Paul E. Johnson http://pj.freefaculty.org
Director, Center for Research Methods and Data Analysis http://crmda.ku.edu

To write to me directly, please address me at pauljohn at ku.edu.
Nordlund, Dan (DSHS/RDA)
2017-Apr-20 22:20 UTC
[R] Interesting quirk with fractions and rounding
This is FAQ 7.31. It is not a bug, it is the unavoidable problem of accurately representing floating point numbers with a finite number of bits of precision. Look at the following:

> a <- 100*(23/40)
> b <- (100*23)/40
> print(a,digits=20)
[1] 57.499999999999993
> print(b,digits=20)
[1] 57.5

Your example with all.equal evaluates TRUE because all.equal uses a 'fuzz factor'. From the all.equal man page:

"all.equal(x, y) is a utility to compare R objects x and y testing 'near equality'."

Hope this is helpful,

Dan

Daniel Nordlund, PhD
Research and Data Analysis Division
Services & Enterprise Support Administration
Washington State Department of Social and Health Services

> -----Original Message-----
> From: R-help [mailto:r-help-bounces at r-project.org] On Behalf Of Paul Johnson
> Sent: Thursday, April 20, 2017 2:56 PM
> To: R-help
> Subject: [R] Interesting quirk with fractions and rounding
>
> ______________________________________________
> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
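A minimal sketch of the point made above, for anyone who wants to reproduce it: it prints the two stored values with 17 significant digits (enough to tell any two distinct doubles apart) and compares them three ways. The variable names follow the reply; wrapping all.equal in isTRUE is my own choice here, since all.equal returns a character string rather than FALSE when the values differ.

```r
# Two algebraically identical expressions evaluated in double precision.
a <- 100 * (23 / 40)   # 23/40 has no exact binary representation
b <- (100 * 23) / 40   # 2300, 40 and 57.5 are all exactly representable

# 17 significant digits are enough to distinguish any two doubles.
sprintf("%.17g", a)    # "57.499999999999993"
sprintf("%.17g", b)    # "57.5"

# Exact comparison sees the difference.
a == b                                   # FALSE

# all.equal() with its default tolerance (about 1.5e-8) does not.
isTRUE(all.equal(a, b))                  # TRUE

# With the tolerance set to zero it reports the relative difference
# as a character string, so isTRUE() turns that into FALSE.
isTRUE(all.equal(a, b, tolerance = 0))   # FALSE
```

The isTRUE wrapper matters in if() conditions: a bare all.equal call that fails hands back text, not a logical value.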
I might add that things that *look* like integers in R are not really integers, unless you explicitly label them as such:

> str(20)
 num 20
> str(20.5)
 num 20.5
> str(20L)
 int 20

I think that Python 2 will do integer arithmetic on things that look like integers:

$ python2
. . .
>>> 30 / 20
1
>>>

But that behavior has changed in Python 3:

$ python3
. . .
>>> 30 / 20
1.5
>>>

-- Mike

On Thu, Apr 20, 2017 at 3:20 PM, Nordlund, Dan (DSHS/RDA) <NordlDJ at dshs.wa.gov> wrote:
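To tie this back to the original guess that (100*23) triggers integer math, here is a short sketch using typeof. It only uses base R; the specific numbers are taken from the thread. It suggests that both groupings in the original post are double arithmetic, and that even with the L suffix the / operator still returns a double.

```r
# A bare literal is stored as a double; the L suffix requests an integer.
typeof(100)             # "double"
typeof(100L)            # "integer"

# So (100*23) in the original post is double arithmetic, not integer math.
typeof(100 * 23)        # "double"

# Integer operands stay integer under *, but / always yields a double.
typeof(100L * 23L)      # "integer"
typeof(23L / 40L)       # "double"

# Integer quotient and remainder keep integer storage.
2300L %/% 40L           # 57
2300L %% 40L            # 20
typeof(2300L %/% 40L)   # "integer"
```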
Use all.equal(tolerance=0, aa, bb) to check for exact equality:

> aa <- 100*(23/40)
> bb <- (100*23)/40
> all.equal(aa,bb)
[1] TRUE
> all.equal(aa,bb,tolerance=0)
[1] "Mean relative difference: 1.235726e-16"
> aa < bb
[1] TRUE

The numbers there are rounded to 53 binary digits (about 16 decimal digits) for storage, and rounding is not a linear or associative operation. Think of doing arithmetic by hand where you store all numbers, including intermediate results, with only 2 significant decimal digits:

(3 * 1) / 3 -> 3 / 3 -> 1
3 * (1/3) -> 3 * 0.33 -> 0.99

Bill Dunlap
TIBCO Software
wdunlap tibco.com

On Thu, Apr 20, 2017 at 2:56 PM, Paul Johnson <pauljohn32 at gmail.com> wrote:
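The two-digit hand arithmetic above can be imitated in R. This is only a sketch: store is a hypothetical helper of mine that uses signif to round every stored value to 2 significant decimal digits, standing in for a machine with very little precision. Real doubles round in binary, not decimal, so it illustrates the mechanism rather than modelling R's actual arithmetic.

```r
# Hypothetical helper: keep only 2 significant decimal digits whenever a
# value is "stored", including intermediate results.
store <- function(x) signif(x, 2)

# Multiply first: the product 3 is exact, and 3 / 3 is exactly 1.
multiply_first <- store(store(3 * 1) / 3)

# Divide first: 1/3 is stored as 0.33, and the error survives the
# multiplication by 3.
divide_first <- store(3 * store(1 / 3))

multiply_first                   # 1
divide_first                     # 0.99
multiply_first == divide_first   # FALSE
```

The same thing happens in the original example, just at the 53rd binary digit instead of the 2nd decimal one.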
Hi

The problem is that people using Excel or probably other such spreadsheets do not encounter this behaviour, as Excel silently rounds all your calculations and makes approximate comparisons without telling you it does so. Therefore most people usually do not have any knowledge of floating point number representation.

Cheers
Petr

-----Original Message-----
From: R-help [mailto:r-help-bounces at r-project.org] On Behalf Of Paul Johnson
Sent: Thursday, April 20, 2017 11:56 PM
To: R-help <r-help at r-project.org>
Subject: [R] Interesting quirk with fractions and rounding

________________________________
This e-mail and any documents attached to it may be confidential and are intended only for its intended recipients.
If you received this e-mail by mistake, please immediately inform its sender. Delete the contents of this e-mail with all attachments and its copies from your system.
If you are not the intended recipient of this e-mail, you are not authorized to use, disseminate, copy or disclose this e-mail in any manner.
The sender of this e-mail shall not be liable for any possible damage caused by modifications of the e-mail or by delay with transfer of the email.

In case that this e-mail forms part of business dealings:
- the sender reserves the right to end negotiations about entering into a contract in any time, for any reason, and without stating any reasoning.
- if the e-mail contains an offer, the recipient is entitled to immediately accept such offer; The sender of this e-mail (offer) excludes any acceptance of the offer on the part of the recipient containing any amendment or variation.
- the sender insists on that the respective contract is concluded only upon an express mutual agreement on all its aspects.
- the sender of this e-mail informs that he/she is not authorized to enter into any contracts on behalf of the company except for cases in which he/she is expressly authorized to do so in writing, and such authorization or power of attorney is submitted to the recipient or the person represented by the recipient, or the existence of such authorization is known to the recipient or the person represented by the recipient.
We all agree it is a problem with digital computing, not unique to R. I don't think that is the right place to stop.

What to do? The round example arose in a real funded project where 2 R programs differed in results, and the cause was that one person got 57 and another got 58. The explanation was found, but it's less clear how to prevent similar problems in the future. Guidelines, anyone?

So far, these are my guidelines.

1. Insert L on numbers to signal that you really mean INTEGER. In R, forgetting the L in a single number will usually promote the whole calculation to floats.
2. S3 variables are called 'numeric' if they are integer or double storage. So avoid "is.numeric" and prefer "is.double".
3. == is a total fail on floats.
4. Run print with digits=20 so we can see the less rounded number. Perhaps start sessions with "options(digits=20)".
5. all.equal does what it promises, but one must be cautious.

Are there math habits we should follow? For example, is it generally true in R that (100*x)/y is more accurate than 100*(x/y), if x > y? (If that is generally true, couldn't the R interpreter do it for the user?)

I've seen this problem before. In later editions of the game theory program Gambit, extraordinary effort was taken to keep values symbolically as integers as long as possible. Avoid division until the last steps. Same in Swarm simulations. Gary Polhill wrote an essay about the Ghost in the Machine along those lines, showing accidents from trusting floats.

I wonder now if all uses of > or < with numeric variables are suspect.

Oh well. If everybody posts their advice, I will write a summary.

Paul Johnson
University of Kansas

On Apr 21, 2017 12:02 AM, "PIKAL Petr" <petr.pikal at precheza.cz> wrote:
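A possible sketch of two of the habits discussed above, offered as one way to do it rather than a recommendation from the thread. Both helper names (near and pct_half_up) are hypothetical, and the tolerance of 1e-9 is an arbitrary assumption. On the accuracy question: when x and y are whole numbers of moderate size, 100*x is exact in a double, so (100*x)/y is rounded once, whereas 100*(x/y) is rounded twice; that does not make the first form exact in general. The second helper goes further and decides the rounding from an integer quotient and remainder, so no fractional double is involved. Note that it rounds halves up, which differs by design from R's round, which rounds halves to the even digit.

```r
# Hypothetical helper: compare doubles with an explicit relative tolerance
# instead of ==. The default tolerance is an arbitrary choice.
near <- function(x, y, tol = 1e-9) {
  abs(x - y) <= tol * pmax(abs(x), abs(y), 1)
}

# Hypothetical helper: 100 * x / y rounded half up, using only integer
# quotient and remainder. Assumes non-negative integer x, positive integer
# y, and that 100L * x does not overflow R's 32-bit integers.
pct_half_up <- function(x, y) {
  stopifnot(is.integer(x), is.integer(y), x >= 0L, y > 0L)
  num <- 100L * x
  q <- num %/% y          # whole part of the percentage
  r <- num %% y           # remainder, still an exact integer
  q + (2L * r >= y)       # add 1 when the remainder is at least half of y
}

aa <- 100 * (23 / 40)
bb <- (100 * 23) / 40

aa == bb                   # FALSE
near(aa, bb)               # TRUE
pct_half_up(23L, 40L)      # 58
pct_half_up(113L, 200L)    # 57, whereas round(56.5) gives 56
```

Whether half-up or half-even is wanted is a project decision; the point of the sketch is only that the decision is made on exact integers.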