After upgrading to R-2.3.1 on Linux Redhat, I was surprised by this:

R> x <- c(721.077, 592.291, 372.208, 381.182)
R> sum(x) - 2066.758
[1] 4.547474e-13

Now I understand that floating point arithmetic is not precise, but
1) the result is exactly 0 in R-2.2.1 (patched) on the same machine,
2) .Machine$double.eps = 2.2e-16, so the error seems quite large.

Also note I get the same result on R-2.3.1 under Windows, and that
R> (721.077 + 592.291 + 372.208 + 381.182) - 2066.758
[1] 0

Is this related to the (2.3.0) NEWS item:
  sum(), prod(), mean(), rowSums() and friends use a long double
  accumulator where available and so may be more accurate.
and should I be concerned? Thanks.

-- David Brahm (brahm at alum.mit.edu)

Version:
 platform = i686-pc-linux-gnu
 arch = i686
 os = linux-gnu
 system = i686, linux-gnu
 status =
 major = 2
 minor = 3.1
 year = 2006
 month = 06
 day = 01
 svn rev = 38247
 language = R
 version.string = Version 2.3.1 (2006-06-01)

Locale:
C

Search Path:
 .GlobalEnv, package:methods, package:stats, package:graphics,
 package:grDevices, package:utils, package:datasets, Autoloads,
 package:base
Roger D. Peng
2006-Aug-18 19:36 UTC
[R] Floating point imprecision in sum() under R-2.3.1?
I think you want to look at

  sum(x)/2066.758 - 1

which on my Linux box is 2.2e-16.

-roger

--
Roger D. Peng | http://www.biostat.jhsph.edu/~rpeng/
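A minimal sketch of why 4.547474e-13 is not a large error at this magnitude (added for illustration, not part of the original posts). Assuming IEEE 754 doubles, adjacent representable values near 2066.758 are 2^-41 apart, which is about 4.547474e-13, so the reported difference is a single unit in the last place. The helper name ulp_at is hypothetical, and the value of sum(x) - 2066.758 itself may differ between R versions and platforms.

```r
# Hypothetical helper: spacing between adjacent doubles at the magnitude of v.
# Assumes IEEE 754 doubles (53-bit significand) and a normal, nonzero v.
ulp_at <- function(v) 2^(floor(log2(abs(v))) - 52)

x <- c(721.077, 592.291, 372.208, 381.182)
target <- 2066.758

ulp_at(target)                    # 2^-41, about 4.547474e-13

# Absolute difference: 0 or a small multiple of the spacing above,
# depending on how the build accumulates the sum.
abs_diff <- sum(x) - target

# Relative difference: the quantity to compare with .Machine$double.eps.
rel_diff <- sum(x) / target - 1

abs_diff / ulp_at(target)         # the difference expressed in units of spacing
```

Dividing a one-unit gap of 2^-41 by 2066.758 gives a relative error of roughly 2.2e-16, the same order as .Machine$double.eps.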
I was concerned by this result (new in R-2.3.1):

R> x <- c(721.077, 592.291, 372.208, 381.182)
R> sum(x) - 2066.758
[1] 4.547474e-13

But after Roger Peng's <rdpeng at gmail.com> insightful comment that
the relative difference (sum(x)/2066.758 - 1) is exactly what is
expected, I'm convinced that sum() is indeed being "more accurate"
than it was in 2.2.1, i.e. accurately preserving the numerical
imprecision of the original inputs. Sorry for the distraction...

-- David Brahm (brahm at alum.mit.edu)
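As a practical footnote (an illustration added here, not something proposed in the thread): when a sum like this has to be checked against a constant, a tolerance-based comparison avoids depending on whether the last bit matches. The sketch below uses base R's all.equal(), whose default tolerance is about 1.5e-8; the variable names are arbitrary.

```r
x <- c(721.077, 592.291, 372.208, 381.182)
target <- 2066.758

# Exact comparison: may be TRUE or FALSE depending on R version and platform,
# because the last bit of sum(x) depends on how the sum is accumulated.
exact_match <- sum(x) == target

# Tolerance-based comparison: all.equal() returns TRUE or a description of
# the difference, so wrap it in isTRUE() to get a plain logical.
approx_match <- isTRUE(all.equal(sum(x), target))
approx_match
```

The one-unit difference discussed above is many orders of magnitude below the default tolerance, so approx_match holds whichever accumulator sum() uses.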