Dear R programmers,

is there a sensible explanation for the following behaviour? The second command seems not to be interpreted correctly.

> seq(0.6, 0.9, by=0.1) == 0.8
[1] FALSE FALSE  TRUE FALSE
> seq(0.7, 0.9, by=0.1) == 0.8
[1] FALSE FALSE FALSE
> c(0.7, 0.8, 0.9) == 0.8
[1] FALSE  TRUE FALSE
> seq(0.9, 0.7, by=-0.1) == 0.8
[1] FALSE  TRUE FALSE

I am running R version 1.7.1 on XP and NT.

Thanks,
Marc

Marc Vandemeulebroecke wrote:
> is there a sensible explanation for the following behaviour? The second
> command seems not to be interpreted correctly.
>
> > seq(0.7, 0.9, by=0.1) == 0.8
> [1] FALSE FALSE FALSE

The result is correct. It is not a bug but a consequence of how floating point numbers are stored: in general, decimal fractions cannot be represented exactly in binary, so the value that seq() computes carries a tiny rounding error and is not exactly equal to 0.8.

Uwe Ligges

______________________________________________
R-help at stat.math.ethz.ch mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help

To be more precise, the decimal number 0.1 does not have an exact binary equivalent. A long time ago, there was a book called, IIRC, "Pascal with Style" or something of that ilk, which set out the warning "Never compare floating point numbers for equality."

--
M. Edward (Ed) Borasky
mailto:znmeb at borasky-research.net
http://www.borasky-research.net

> -----Original Message-----
> From: r-help-bounces at stat.math.ethz.ch On Behalf Of Uwe Ligges
> Sent: Monday, July 14, 2003 1:10 AM
> To: Marc Vandemeulebroecke
> Cc: r-help at stat.math.ethz.ch
> Subject: Re: [R] bug?
>
> It is correct, just an instability of the representation of that
> floating point number, because (regularly) floating point numbers
> cannot be represented exactly.
>
> Uwe Ligges
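
The point about 0.1 can be made visible by printing more digits than R shows by default. This is a minimal sketch that was not part of the original posts; the variable name `gap` is only illustrative, and the exact digits printed may vary slightly by platform.

```r
# Show the stored values with 20 decimal places instead of R's default 7
# significant digits. None of them is exactly the decimal number written.
cat(sprintf("%.20f", 0.1), "\n")
cat(sprintf("%.20f", 0.7 + 0.1), "\n")
cat(sprintf("%.20f", 0.8), "\n")

# The sum 0.7 + 0.1 and the literal 0.8 are neighbouring doubles:
# they differ by one unit in the last place, which is 2^-53 for
# values between 0.5 and 1.
gap <- (0.7 + 0.1) - 0.8
print(gap)
print(gap == -2^-53)
```

The last comparison is safe to do with `==` only because 2^-53 is a power of two and therefore exactly representable.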

On Mon, 2003-07-14 at 03:10, Uwe Ligges wrote:
> It is correct, just an instability of the representation of that
> floating point number, because (regularly) floating point numbers cannot
> be represented exactly.
>
> Uwe Ligges

A good online reference for these issues is at:

http://grouper.ieee.org/groups/754/

Specifically, see the article by David Goldberg entitled "What Every Computer Scientist Should Know About Floating-Point Arithmetic", which is listed toward the bottom of that page.

HTH,

Marc Schwartz

At 7/14/2003 at 03:29 AM, Marc Vandemeulebroecke wrote:
>Dear R programmers,
>
>is there a sensible explanation for the following behaviour?
>
> > seq(0.7, 0.9, by=0.1) == 0.8
>[1] FALSE FALSE FALSE

As Uwe Ligges pointed out, most decimal fractions are not exactly representable in binary floating point. Therefore, floating-point comparisons for equality often will not yield the common-sense results. A reasonably short and free article that describes this in a bit more detail is

http://www.lahey.com/float.htm

As a result, a better way of doing such comparisons is something like this:

> eps = 1e-6
> aa = seq(0.7, 0.9, by=0.1)
> abs(aa-0.8) < eps
[1] FALSE  TRUE FALSE

If the scales of numbers vary in a given computation, it can be better to compare abs((a-b)/(a+b)) to some epsilon, rather than just abs(a-b), provided a+b is not zero.

Hope that helps.

--
Michael Prager, Ph.D. <Mike.Prager at noaa.gov>
NOAA Center for Coastal Fisheries and Habitat Research
Beaufort, North Carolina 28516
http://shrimp.ccfhrb.noaa.gov/~mprager/
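
The relative comparison suggested above could be sketched as follows. The helper name `rel_close` and its default `eps` are assumptions made for illustration, not something from the original post, and the function assumes a + b is never zero. Base R's `all.equal()` is shown alongside as a ready-made tolerance comparison.

```r
# Hypothetical helper: relative difference abs((a-b)/(a+b)) compared to eps.
# Assumes a + b is never zero; otherwise the division gives NaN or Inf.
rel_close <- function(a, b, eps = 1e-6) {
  abs((a - b) / (a + b)) < eps
}

aa <- seq(0.7, 0.9, by = 0.1)
print(rel_close(aa, 0.8))

# Rescaling both sides by the same factor does not change the answer,
# which is the point of using a relative rather than absolute difference.
print(rel_close(1e6 * aa, 8e5))

# all.equal() returns TRUE or a description of the difference,
# so wrap it in isTRUE() when a plain logical is needed.
print(isTRUE(all.equal(0.7 + 0.1, 0.8)))
```

Note that `all.equal()` compares two whole objects and yields a single answer, whereas `rel_close()` here works element by element.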

> At 7/14/2003 at 03:29 AM, Marc Vandemeulebroecke wrote:
> >Dear R programmers,
> >
> >is there a sensible explanation for the following behaviour?
> >
> > > seq(0.7, 0.9, by=0.1) == 0.8
> >[1] FALSE FALSE FALSE

Yet another %**% function ...

> "%~=%"<-function(x,y){abs(x-y)<1e-15}
> seq(0.7, 0.9, by=0.1) %~=% 0.8
[1] FALSE  TRUE FALSE

Ted.

--------------------------------------------------------------------
E-Mail: (Ted Harding) <Ted.Harding at nessie.mcc.ac.uk>
Fax-to-email: +44 (0)870 167 1972
Date: 14-Jul-03                                       Time: 22:16:38
------------------------------ XFMail ------------------------------
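
One caveat with a fixed absolute tolerance such as 1e-15 is that rounding errors grow with the magnitude of the numbers involved, so the operator can report FALSE for values that are equal for practical purposes. The sketch below is an illustration only: the operator `%~%` is a hypothetical variant, and its tolerance, sqrt(.Machine$double.eps), is borrowed from the default used by base R's `all.equal()`.

```r
# The operator from the post, with its fixed absolute tolerance.
"%~=%" <- function(x, y) { abs(x - y) < 1e-15 }

# Hypothetical variant: the tolerance scales with the larger operand.
"%~%" <- function(x, y) {
  tol <- sqrt(.Machine$double.eps)
  abs(x - y) <= tol * pmax(abs(x), abs(y))
}

# Mathematically x is 0.1, but adding and subtracting 1e6 leaves a
# rounding error far larger than 1e-15.
x <- (1e6 + 0.1) - 1e6
print(abs(x - 0.1))
print(x %~=% 0.1)
print(x %~% 0.1)

# The scaled operator still handles the original question.
print(seq(0.7, 0.9, by = 0.1) %~% 0.8)
```

Which tolerance is appropriate depends on how much error the preceding computation can accumulate, so any default is a judgment call.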

Marc Vandemeulebroecke <vandemem at gmx.de> asked:
	is there a sensible explanation for the following behaviour?

	> seq(0.6, 0.9, by=0.1) == 0.8
	[1] FALSE FALSE  TRUE FALSE
	> seq(0.7, 0.9, by=0.1) == 0.8
	[1] FALSE FALSE FALSE

Yes. It's called "floating-point arithmetic". The problem is that only computers using decimal floating-point arithmetic can represent 0.1 exactly; computers using binary floating-point can only represent numbers of the form (whole number) * (power of 2) {plus other stuff you probably don't want to know about, like NaNs, which aren't relevant here}.

Let's see what you got:

> seq(0.6, 0.9, by=0.1) - 0.8
[1] -0.2 -0.1  0.0  0.1
> seq(0.7, 0.9, by=0.1) - 0.8
[1] -1.000000e-01 -1.110223e-16  1.000000e-01
                  ^^^^^^^^^^^^^
This difference isn't 0; it's about one unit in the last place.

The best way to work around this is only to use by=x when x is a whole number times a power of two. For example,

> seq(6, 9, by=1)*0.1 == 0.8
[1] FALSE FALSE  TRUE FALSE
> seq(7, 9, by=1)*0.1 == 0.8
[1] FALSE  TRUE FALSE

This is the reason why DO-loops with REAL control variables are deprecated in Fortran 90; they often give you very nasty surprises.
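
A variation on the workaround above is to keep the whole-number sequence around and do the equality test on it, converting to decimals only when the values are needed. This is a sketch under that assumption, with illustrative variable names `k` and `grid`; it is not taken from the original post. It also shows that multiplying a whole number by 0.1 does not make `==` safe in general.

```r
# Whole numbers of this size are exactly representable, so == is reliable.
k <- seq(7, 9, by = 1)
print(k == 8)

# Convert to decimal values only for use or display, and select
# elements with the integer-based test rather than a decimal one.
grid <- k * 0.1
print(grid[k == 8])

# Scaling by 0.1 still rounds, so a decimal equality test can fail
# even though no error has accumulated across steps.
print(3 * 0.1 == 0.3)
```

The design choice is to compare in the units where the arithmetic is exact and treat the decimal values as derived quantities.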