thr3ads.net - R help - [R] Bug or feature? sum(c(a, b, c)) != (a + b + c) [Aug 2011]

Daniel Lai

2011-Aug-23 19:17 UTC

[R] Bug or feature? sum(c(a, b, c)) != (a + b + c)

Greetings all,

I'm porting an algorithm from MATLAB to R, and noticed some minor 
discrepancies in small decimal values using rowSums and colSums which 
are exacerbated after heavy iteration and log space transformation. 
This was rather perplexing as both programs claimed and appeared to use 
the IEEE 754 standard for floating point arithmetic (confirmed with 
manual basic operations).  After some tracing and testing, I've managed 
to isolate a minimal working example as follows:

a = 0.812672
b = 0.916541
c = 0.797810
sum(c(a, b, c)) == (a + b + c)
[1] FALSE

Should I attribute this to the woes of working with floating point 
numbers and accept it? i.e.

sprintf("%.30f", sum(c(a, b, c)))
[1] "2.527022999999999797182681504637"
sprintf("%.30f", (a + b + c))
[1] "2.527023000000000241271891354700"

Change the OS or version I'm using?

MAC OSX 10.5.8:
sessionInfo()
R version 2.13.1 (2011-07-08)
Platform: i386-apple-darwin9.8.0/i386 (32-bit)
attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

Linux 2.6.34:
R version 2.12.0 (2010-10-15)
Platform: x86_64-unknown-linux-gnu (64-bit)
attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

Or report this as a bug?

Thanks,
Daniel
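
The two values printed in the post differ by exactly 2^-51, which is the spacing between adjacent doubles in the interval [2, 4), i.e. a single unit in the last place. A minimal sketch of how that gap could be measured, assuming IEEE 754 doubles; the names `c3`, `ulp` and `gap_in_ulps` are illustrative and not from the post, and on a platform where sum() has no extended-precision accumulator the gap may well be zero:

```r
a  <- 0.812672
b  <- 0.916541
c3 <- 0.797810   # named c3 (illustrative) to avoid masking base::c

via_sum  <- sum(c(a, b, c3))
via_plus <- a + b + c3

# Both results lie in [2, 4), where adjacent doubles are
# 2 * .Machine$double.eps (= 2^-51) apart.
ulp <- 2 * .Machine$double.eps

# Distance between the two results, in units in the last place.
gap_in_ulps <- (via_plus - via_sum) / ulp
print(gap_in_ulps)
```

A gap of at most one unit in the last place is what rounding alone would explain; it does not by itself point to a defect in either function.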

R. Michael Weylandt

2011-Aug-23 20:48 UTC

Not directly related to what you said below, but might I suggest that for
numerical work all.equal() might be a little more robust in a
computationally heavy implementation.

x = c(0.812672, 0.916541, 0.797810) # don't call variables c -- just a bad idea
y = x[1] + x[2] + x[3]
> sum(x) == y
[1] FALSE
> identical(sum(x), y)
[1] FALSE
> all.equal(sum(x), y)
[1] TRUE

But yeah, it's just a floating point thing.

Michael Weylandt

PS -- Just so no one else has to say it: obligatory mention of R FAQ 7.31:
http://cran.r-project.org/doc/FAQ/R-FAQ.html#Why-doesn_0027t-R-think-these-numbers-are-equal_003f
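
A short sketch of how the all.equal() suggestion might be used inside a condition. all.equal() returns either TRUE or a character description of the difference, so it is usually wrapped in isTRUE(); its default tolerance is sqrt(.Machine$double.eps), about 1.5e-8. The helper `nearly_equal` is hypothetical, written here only to spell out a relative-tolerance check by hand:

```r
x <- c(0.812672, 0.916541, 0.797810)
y <- x[1] + x[2] + x[3]

# Safe to use in if(): TRUE when the values agree within tolerance,
# FALSE otherwise (never a character string).
close_enough <- isTRUE(all.equal(sum(x), y))

# Hypothetical helper: an explicit relative-tolerance comparison.
nearly_equal <- function(u, v, tol = sqrt(.Machine$double.eps)) {
  abs(u - v) <= tol * max(abs(u), abs(v))
}

# Values that genuinely differ are still reported as different.
far_apart <- isTRUE(all.equal(1, 1.001))

print(c(close_enough = close_enough,
        nearly_equal = nearly_equal(sum(x), y),
        far_apart    = far_apart))
```

The right tolerance depends on the algorithm; after heavy iteration the accumulated error can be much larger than a single rounding step, so the default may need loosening.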


Clint Bowman

2011-Aug-23 20:51 UTC

?all.equal

-- 
Clint Bowman			INTERNET:	clint at ecy.wa.gov
Air Quality Modeler		INTERNET:	clint at math.utah.edu
Department of Ecology		VOICE:		(360) 407-6815
PO Box 47600			FAX:		(360) 407-7534
Olympia, WA 98504-7600


         USPS:           PO Box 47600, Olympia, WA 98504-7600
         Parcels:        300 Desmond Drive, Lacey, WA 98503-1274


Barry Rowlingson

2011-Aug-23 22:20 UTC

On Tue, Aug 23, 2011 at 8:17 PM, Daniel Lai <danlai at bccrc.ca> wrote:
> Greetings all,
>
> I'm porting an algorithm from MATLAB to R, and noticed some minor
> discrepancies in small decimal values using rowSums and colSums which are
> exacerbated after heavy iteration and log space transformation. This was
> rather perplexing as both programs claimed and appeared to use the IEEE 754
> standard for floating point arithmetic (confirmed with manual basic
> operations).  After some tracing and testing, I've managed to isolate a
> minimal working example as follows:
>
> a = 0.812672
> b = 0.916541
> c = 0.797810
> sum(c(a, b, c)) == (a + b + c)
> [1] FALSE

 It's probably to do with the order of summations. With your a,b,c you get:

 > (a+b+c) == (c+b+a)
 [1] TRUE
 > (a+b+c) == (c+a+b)
 [1] FALSE

Shock horror, addition is not associative[1]. Let's investigate:

 > sum(c(a,b,c)) == c+a+b
 [1] TRUE
 > sum(c(a,b,c)) == a+c+b
 [1] TRUE

 'sum' seems to get the same answer as adding the first and the third,
then adding the second - explicitly:

 > sum(c(a,b,c)) == (a+c)+b
 [1] TRUE

I'm not sure what it would do for four values in the sum. Have fun
finding out. Does MATLAB similarly have a+b+c != c+a+b?

Barry

[1] or commutative or distributive or one of those -ives you learn one
day in school. Too lazy to wikipedia it right now...
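
The ordering experiment above can be run over every permutation at once. A minimal sketch, assuming IEEE 754 doubles: Reduce(`+`, v) adds strictly left to right, rounding to double after each step, so it mirrors a written-out chain such as (a+c)+b. The names `vals`, `orders` and `totals` are illustrative. Separately, the help page for sum() states that extended-precision accumulators are used where possible, so sum() may round only once at the end; that is offered here as a plausible reason for it differing from a chain of +, not as a confirmed diagnosis of this thread.

```r
vals <- c(a = 0.812672, b = 0.916541, c = 0.797810)

# All six orders in which three values can be added.
orders <- list(c("a", "b", "c"), c("a", "c", "b"), c("b", "a", "c"),
               c("b", "c", "a"), c("c", "a", "b"), c("c", "b", "a"))

# Strict left-to-right addition in double precision for each order.
totals <- vapply(orders,
                 function(o) Reduce(`+`, unname(vals[o])),
                 numeric(1))
names(totals) <- vapply(orders, paste, character(1), collapse = "+")

# Print with 17 significant digits, enough to tell adjacent doubles apart.
print(sprintf("%s = %.17g", names(totals), totals))
print(sprintf("sum() = %.17g", sum(vals)))
```

Swapping the first two terms never changes a result, because a single floating point addition is commutative; only a change in which pair is added first can. The same loop extends to four or more values by listing the longer permutations.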
