Greetings all,

I'm porting an algorithm from MATLAB to R, and noticed some minor discrepancies in small decimal values using rowSums and colSums, which are exacerbated after heavy iteration and log-space transformation. This was rather perplexing, as both programs claim and appear to use the IEEE 754 standard for floating-point arithmetic (confirmed with basic manual operations). After some tracing and testing, I've managed to isolate a minimal working example as follows:

a = 0.812672
b = 0.916541
c = 0.797810
sum(c(a, b, c)) == (a + b + c)
[1] FALSE

Should I attribute this to the woes of working with floating-point numbers and accept it? i.e.

sprintf("%.30f", sum(c(a, b, c)))
[1] "2.527022999999999797182681504637"
sprintf("%.30f", (a + b + c))
[1] "2.527023000000000241271891354700"

Change the OS or version I'm using?

MAC OSX 10.5.8:
sessionInfo()
R version 2.13.1 (2011-07-08)
Platform: i386-apple-darwin9.8.0/i386 (32-bit)
attached base packages:
[1] stats graphics grDevices utils datasets methods base

Linux 2.6.34:
R version 2.12.0 (2010-10-15)
Platform: x86_64-unknown-linux-gnu (64-bit)
attached base packages:
[1] stats graphics grDevices utils datasets methods base

Or report this as a bug?

Thanks,
Daniel
R. Michael Weylandt
2011-Aug-23 20:48 UTC
[R] Bug or feature? sum(c(a, b, c)) != (a + b + c)
Not directly related to your question, but might I suggest that for numerical work all.equal() might be a little more robust in a computationally heavy implementation.

x = c(0.812672, 0.916541, 0.797810) # don't call variables c -- just a bad idea
y = x[1] + x[2] + x[3]
> sum(x) == y
[1] FALSE
> identical(sum(x), y)
[1] FALSE
> all.equal(sum(x), y)
[1] TRUE

But yeah, it's just a floating point thing.

Michael Weylandt

PS -- Just so no one else has to say it: obligatory mention of R FAQ 7.31: http://cran.r-project.org/doc/FAQ/R-FAQ.html#Why-doesn_0027t-R-think-these-numbers-are-equal_003f
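To make the all.equal() suggestion concrete, here is a minimal sketch using the values from the thread. The helper `near()` is hypothetical (it is not part of base R) and its default tolerance of 1e-9 is an arbitrary choice for illustration; all.equal() itself uses a relative tolerance of roughly 1.5e-8 by default.

```r
# Values from the thread
x <- c(0.812672, 0.916541, 0.797810)
y <- x[1] + x[2] + x[3]

# all.equal() returns TRUE or a character description of the difference,
# so wrap it in isTRUE() before using it in a condition.
same <- isTRUE(all.equal(sum(x), y))

# Hypothetical helper (not base R): explicit absolute-tolerance comparison.
near <- function(u, v, tol = 1e-9) abs(u - v) < tol

print(same)
print(near(sum(x), y))
```

In an if() condition, isTRUE(all.equal(...)) is the safer form, since a bare all.equal() yields a character string rather than FALSE when the values differ.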
?all.equal

-- 
Clint Bowman            INTERNET: clint at ecy.wa.gov
Air Quality Modeler     INTERNET: clint at math.utah.edu
Department of Ecology   VOICE:    (360) 407-6815
PO Box 47600            FAX:      (360) 407-7534
Olympia, WA 98504-7600

USPS:    PO Box 47600, Olympia, WA 98504-7600
Parcels: 300 Desmond Drive, Lacey, WA 98503-1274
On Tue, Aug 23, 2011 at 8:17 PM, Daniel Lai <danlai at bccrc.ca> wrote:
> a = 0.812672
> b = 0.916541
> c = 0.797810
> sum(c(a, b, c)) == (a + b + c)
> [1] FALSE

It's probably to do with the order of summation. With your a, b, c you get:

 > (a+b+c) == (c+b+a)
 [1] TRUE
 > (a+b+c) == (c+a+b)
 [1] FALSE

Shock horror, addition is not associative[1]. Let's investigate:

 > sum(c(a,b,c)) == c+a+b
 [1] TRUE
 > sum(c(a,b,c)) == a+c+b
 [1] TRUE

'sum' seems to get the same answer as adding the first and the third, then adding the second - explicitly:

 > sum(c(a,b,c)) == (a+c)+b
 [1] TRUE

I'm not sure what it would do for four values in the sum. Have fun finding out. Does MATLAB similarly have a+b+c != c+b+a?

Barry

[1] or commutative or distributive or one of those -ives you learn one day in school. Too lazy to wikipedia it right now...
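The order-of-summation experiment above can be sketched as a small script that evaluates every grouping and prints the results with enough digits to tell neighbouring doubles apart. The variable name `c3` is my own choice (to avoid masking c()), and the exact digits printed may differ between platforms, so no expected output is given. As far as I know, sum() may also accumulate in extended precision (long double) where the platform supports it, which would be another reason it need not match a chain of `+`; treat that as an assumption to check rather than a statement from this thread.

```r
a <- 0.812672
b <- 0.916541
c3 <- 0.797810  # named c3 so the c() function is not masked

# Each grouping rounds a different intermediate result.
orders <- c(
  "(a+b)+c" = (a + b) + c3,
  "(a+c)+b" = (a + c3) + b,
  "(b+c)+a" = (b + c3) + a,
  "sum()"   = sum(c(a, b, c3))
)

# 17 significant digits are enough to distinguish any two doubles.
print(sprintf("%-8s %.17g", names(orders), orders))

# The groupings should agree to within a few units in the last place.
spread <- diff(range(orders))
print(spread)
print(isTRUE(all.equal(min(orders), max(orders))))
```

Whichever grouping sum() happens to match on a given machine, the spread stays at the level of rounding error, which is why a tolerance-based comparison is the usual remedy.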