George Trojan - NOAA Federal
2017-Apr-21 15:54 UTC
[R] Interesting quirk with fractions and rounding
The subject is messy. I vaguely remember learning this stuff on my first numerical analysis course over 40 years ago. The classic reference material (much newer, only 25 years old) is:

What Every Computer Scientist Should Know About Floating-Point Arithmetic, David Goldberg, ACM Computing Surveys, Vol 23, No 1, 1991.

Available here: http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.22.6768

George

> I suggest you read some basic books on numerical analysis and/or talk
> with a numerical analyst. You are (like most of us) an amateur at this
> sort of thing trying to reinvent wheels. If you are concerned with
> details, talk with experts. Don't assume what you don't know. This
> list is *not* a reliable source of such expertise, although there
> *are* individuals in the R universe with considerable knowledge who
> may or may not choose to respond.
>
> Cheers,
> Bert
>
> Bert Gunter
>
> "The trouble with having an open mind is that people keep coming along
> and sticking things into it."
> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )

On Fri, Apr 21, 2017 at 5:19 AM, Paul Johnson <pauljohn32 at gmail.com> wrote:

> We all agree it is a problem with digital computing, not unique to R. I
> don't think that is the right place to stop.
>
> What to do? The round example arose in a real funded project where 2 R
> programs differed in results and cause was that one person got 57 and
> another got 58. The explanation was found, but its less clear how to
> prevent similar in future. Guidelines, anyone?
>
> So far, these are my guidelines.
>
> 1. Insert L on numbers to signal that you really mean INTEGER. In R,
> forgetting the L in a single number will usually promote whole calculation
> to floats.
> 2. S3 variables are called 'numeric' if they are integer or double storage.
> So avoid "is.numeric" and prefer "is.double".
> 3. == is a total fail on floats
> 4. Run print with digits=20 so we can see the less rounded number. Perhaps
> start sessions with "options(digits=20)"
> 5. all.equal does what it promises, but one must be cautious.
>
> Are there math habits we should follow?
>
> For example, Is it generally true in R that (100*x)/y is more accurate than
> 100*(x/y), if x > y? (If that is generally true, couldn't the R
> interpreter do it for the user?)
>
> I've seen this problem before. In later editions of the game theory program
> Gambit, extraordinary effort was taken to keep values symbolically as
> integers as long as possible. Avoid division until the last steps. Same in
> Swarm simulations. Gary Polhill wrote an essay about the Ghost in the
> Machine along those lines, showing accidents from trusting floats.
>
> I wonder now if all uses of > or < with numeric variables are suspect.
>
> Oh well. If everybody posts their advice, I will write a summary.
>
> Paul Johnson
> University of Kansas
>
> On Apr 21, 2017 12:02 AM, "PIKAL Petr" <petr.pikal at precheza.cz> wrote:
>
>> Hi
>>
>> The problem is that people using Excel or probably other such spreadsheets
>> do not encounter this behaviour as Excel silently rounds all your
>> calculations and makes approximate comparison without telling it does so.
>> Therefore most people usually do not have any knowledge of floating point
>> numbers representation.
>>
>> Cheers
>> Petr
>>
>> -----Original Message-----
>> From: R-help [mailto:r-help-bounces at r-project.org] On Behalf Of Paul
>> Johnson
>> Sent: Thursday, April 20, 2017 11:56 PM
>> To: R-help <r-help at r-project.org>
>> Subject: [R] Interesting quirk with fractions and rounding
>>
>> Hello, R friends
>>
>> My student unearthed this quirk that might interest you.
>>
>> I wondered if this might be a bug in the R interpreter. If not a bug, it
>> certainly stands as a good example of the dangers of floating point numbers
>> in computing.
>>
>> What do you think?
>>
>> > 100*(23/40)
>> [1] 57.5
>> > (100*23)/40
>> [1] 57.5
>> > round(100*(23/40))
>> [1] 57
>> > round((100*23)/40)
>> [1] 58
>>
>> The result in the 2 rounds should be the same, I think. Clearly some
>> digital number devil is at work. I *guess* that when you put in whole
>> numbers and group them like this (100*23), the interpreter does integer
>> math, but if you group (23/40), you force a fractional division and a
>> floating point number. The results from the first 2 calculations are not
>> actually 57.5, they just appear that way.
>>
>> Before you close the books, look at this:
>>
>> > aa <- 100*(23/40)
>> > bb <- (100*23)/40
>> > all.equal(aa,bb)
>> [1] TRUE
>> > round(aa)
>> [1] 57
>> > round(bb)
>> [1] 58
>>
>> I'm putting this one in my collection of "difficult to understand"
>> numerical calculations.
>>
>> If you have seen this before, I'm sorry to waste your time.
>>
>> pj
>> --
>> Paul E. Johnson http://pj.freefaculty.org
>> Director, Center for Research Methods and Data Analysis
>> http://crmda.ku.edu
>>
>> To write to me directly, please address me at pauljohn at ku.edu.
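
For anyone who wants to reproduce the quirk from the original post, here is a minimal R sketch. It assumes IEEE 754 double precision, which is what R uses on common platforms. The names aa and bb follow the original session. The sprintf() calls are my addition, used only to show more digits than the default print does. One correction to the guess in the original post, as far as I can tell: typed without an L suffix, 100, 23 and 40 are all doubles in R, so no integer math takes place. The difference is that 23/40 cannot be stored exactly, so 100*(23/40) is rounded twice, while 100*23 is exact and 2300/40 is exactly 57.5.

```r
# Sketch only; assumes IEEE 754 doubles.
aa <- 100 * (23 / 40)   # 23/40 is rounded on storage, then the product is rounded again
bb <- (100 * 23) / 40   # 2300 is exact, and 2300/40 = 57.5 is exactly representable

# Show 17 significant digits instead of the default 7.
print(sprintf("%.17g", aa))       # "57.499999999999993"
print(sprintf("%.17g", bb))       # "57.5"

# Exact comparison tells the two values apart. all.equal() compares within a
# tolerance (about 1.5e-8 by default), so it reports them as equal.
print(aa == bb)                   # FALSE
print(isTRUE(all.equal(aa, bb)))  # TRUE

# round() then sees values on different sides of the half-way point.
print(round(aa))                  # 57
print(round(bb))                  # 58
```

This also suggests why postponing the division, as in the Gambit remark above, helped in this particular case: the intermediate product stays exact, so only one rounding step remains. I would not read it as a general rule for arbitrary x and y.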