peter dalgaard
2017-Apr-23 13:37 UTC
[R] Interesting quirk with fractions and rounding / using == for floating point
> On 23 Apr 2017, at 14:49 , J C Nash <profjcnash at gmail.com> wrote: > > > So equality in floating point is not always "wrong", though it should be used > with some attention to what is going on. > > Apologies to those (e.g., Peter D.) who have heard this all before. I suspect > there are many to whom it is new.Peter D. still insists on never trusting exact equality, though. There was at least one case in the R sources where age-old code got itself into a condition where a residual terme that provably should decrease on every iteration oscillated between two values of 1-2 ulp in magnitude without ever reaching 0. The main thing is that you cannot trust optimising compilers these days. There is, e.g., no guarantee that a compiler will not transform (x_new + offset) == (x_old + offset) to (x_new + offset) - (x_old + offset) == 0 to (x_new - x_old) + (offset - offset) == 0 to.... well, you get the point. -pd -- Peter Dalgaard, Professor, Center for Statistics, Copenhagen Business School Solbjerg Plads 3, 2000 Frederiksberg, Denmark Phone: (+45)38153501 Office: A 4.23 Email: pd.mes at cbs.dk Priv: PDalgd at gmail.com
J C Nash
2017-Apr-23 14:06 UTC
[R] Interesting quirk with fractions and rounding / using == for floating point
Yes. I should have mentioned "optimizing" compilers, and I can agree with "never trusting exact equality", though I consider conscious use of equality tests useful. Optimizing compilers have bitten me once or twice. Unfortunately, a lot of floating-point work requires attention to detail. In the situation of testing for convergence, the alternatives to the test I propose require quite a lot of code, with variants for different levels of precision e.g., single, double, quad. There's no universal answer and we do have to look "under the hood". A particularly nasty instance (now fortunately long gone) was the Tektronix 4051 graphics station, where the comparisons automatically included a "fuzz". There was a FUZZ command to set this. Sometimes the "good old days" weren't! Today's equivalent is when there is an upstream change to an "optimization" that changes the manner of computation, as in Peter's examples. If we specify x <- y * (a / b), then we should not get x <- (y * a) / b. A slightly related case concerns eigenvectors / singular vectors when there are degenerate values (i.e., two or more equal eigenvalues). The vectors are then determined only to lie in a (hyper)plane. A large computer contracting firm spent two weeks of high-priced but non-numerical help trying to find the "error" in either an IBM or Univac program because they gave very different eigenvectors. And in my own field of function minimization / nonlinear least squares, it is quite common to have multiple minima or a plateau. Does anyone know if R has a test suite to check some of these situations? If not, I'll be happy to participate in generating some. They would not need to be run very often, and could be useful as a didactic tool as well as checking for differences in platforms. JN On 2017-04-23 09:37 AM, peter dalgaard wrote:> >> On 23 Apr 2017, at 14:49 , J C Nash <profjcnash at gmail.com> wrote: >> >> >> So equality in floating point is not always "wrong", though it should be used >> with some attention to what is going on. >> >> Apologies to those (e.g., Peter D.) who have heard this all before. I suspect >> there are many to whom it is new. > > Peter D. still insists on never trusting exact equality, though. There was at least one case in the R sources where age-old code got itself into a condition where a residual terme that provably should decrease on every iteration oscillated between two values of 1-2 ulp in magnitude without ever reaching 0. The main thing is that you cannot trust optimising compilers these days. There is, e.g., no guarantee that a compiler will not transform > > (x_new + offset) == (x_old + offset) > > to > > (x_new + offset) - (x_old + offset) == 0 > > to > > (x_new - x_old) + (offset - offset) == 0 > > to.... well, you get the point. > > -pd >
Richard M. Heiberger
2017-Apr-23 15:42 UTC
[R] Interesting quirk with fractions and rounding / using == for floating point
John, I would be happy to participate in designing the test suite you suggest. About a year ago I revised FAQ 7.31, based on my talk at the Aalberg R conference. It now points, in addition to the Goldberg paper that has been referenced there for a long time, to my appendix on precision. Here is the link from the FAQ A discussion with many easily followed examples is in Appendix G "Computational Precision and Floating Point Arithmetic", pages 753-771 of _Statistical Analysis and Data Display: An Intermediate Course with Examples in R_, Richard M. Heiberger and Burt Holland (Springer 2015, second edition). This appendix is a free download from <http://link.springer.com/content/pdf/bbm%3A978-1-4939-2122-5%2F1.pdf>. The appendix is based on the paper I gave at the Aalberg R Conference. It uses the Rmpfr package to illustrate exactly what is happening at the level of the machine arithmetic. The investigation and discussion of the effect of optimization on floating point arithmetic should use the Rmpfr tools. Rich On Sun, Apr 23, 2017 at 10:06 AM, J C Nash <profjcnash at gmail.com> wrote:> Yes. I should have mentioned "optimizing" compilers, and I can agree with > "never > trusting exact equality", though I consider conscious use of equality tests > useful. > Optimizing compilers have bitten me once or twice. Unfortunately, a lot of > floating-point work requires attention to detail. In the situation of > testing > for convergence, the alternatives to the test I propose require quite a lot > of > code, with variants for different levels of precision e.g., single, double, > quad. > > There's no universal answer and we do have to look "under the hood". A > particularly > nasty instance (now fortunately long gone) was the Tektronix 4051 graphics > station, > where the comparisons automatically included a "fuzz". There was a FUZZ > command to > set this. Sometimes the "good old days" weren't! Today's equivalent is when > there > is an upstream change to an "optimization" that changes the manner of > computation, > as in Peter's examples. If we specify x <- y * (a / b), then we should not > get > x <- (y * a) / b. > > A slightly related case concerns eigenvectors / singular vectors when there > are > degenerate values (i.e., two or more equal eigenvalues). The vectors are > then > determined only to lie in a (hyper)plane. A large computer contracting firm > spent > two weeks of high-priced but non-numerical help trying to find the "error" > in either > an IBM or Univac program because they gave very different eigenvectors. > > And in my own field of function minimization / nonlinear least squares, it > is quite > common to have multiple minima or a plateau. > > Does anyone know if R has a test suite to check some of these situations? If > not, > I'll be happy to participate in generating some. They would not need to be > run > very often, and could be useful as a didactic tool as well as checking for > differences in platforms. > > JN > > On 2017-04-23 09:37 AM, peter dalgaard wrote: >> >> >>> On 23 Apr 2017, at 14:49 , J C Nash <profjcnash at gmail.com> wrote: >>> >>> >>> So equality in floating point is not always "wrong", though it should be >>> used >>> with some attention to what is going on. >>> >>> Apologies to those (e.g., Peter D.) who have heard this all before. I >>> suspect >>> there are many to whom it is new. >> >> >> Peter D. still insists on never trusting exact equality, though. There was >> at least one case in the R sources where age-old code got itself into a >> condition where a residual terme that provably should decrease on every >> iteration oscillated between two values of 1-2 ulp in magnitude without ever >> reaching 0. The main thing is that you cannot trust optimising compilers >> these days. There is, e.g., no guarantee that a compiler will not transform >> >> (x_new + offset) == (x_old + offset) >> >> to >> >> (x_new + offset) - (x_old + offset) == 0 >> >> to >> >> (x_new - x_old) + (offset - offset) == 0 >> >> to.... well, you get the point. >> >> -pd >> > > ______________________________________________ > R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code.