thr3ads.net - R devel - [Rd] Floating point maths in R [Dec 2006]

Tom McCallum

2006-Dec-09 13:29 UTC

[Rd] Floating point maths in R

Hi,

I am not sure if this is just me using R (R-2.3.1 and R-2.4.0) in the  
wrong way or if there is a more serious bug.  I was having problems  
getting some calculations to add up so I ran the following tests:
> (2.34567 - 2.00000) == 0.34567 <------- should be true
[1] FALSE
> (2.23-2.00) == 0.23 <------- should be true
[1] FALSE
> 4-2==2
[1] TRUE
> (4-2)==2
[1] TRUE
> (4.0-2)==2
[1] TRUE
> (4.0-2.0)==2
[1] TRUE
> (4.0-2.0)==2.0
[1] TRUE
> (4.2-2.2)==2.0
[1] TRUE
> (4.20-2.20)==2.00
[1] TRUE
> (4.23-2.23)==2.00  <------- should be true
[1] FALSE
> (4.230-2.230)==2.000 <------- should be true
[1] FALSE
> (4.230-2.230)==2.00 <------- should be true
[1] FALSE
> (4.230-2.23)==2.00 <------- should be true
[1] FALSE

I have tried these on both 64 and 32-bit machines.  Surely R should be  
able to do maths to 2 decimal places and be able to test these simple  
expressions?  The problem seems to be that the FPU is placing junk in the  
16th decimal place.  I have also tried:
> (4.2300000000000000-2.230000000000000) == 2
[1] FALSE
> a <- (4.2300000000000000-2.230000000000000)
> a == 2
[1] FALSE
> (4.2300000000000000-2.230000000000000) == 2.0000000000000000
[1] FALSE
> (4.2300000000000000-2.230000000000000) == 2.0000000000000004 <-- correct when add 16th decimal place to 4
[1] TRUE
> (4.2300000000000000-2.230000000000000) == 2.00000000000000043  <-- any values after the 16th decimal place mean that the expression is true
[1] TRUE
> (4.2300000000000000-2.230000000000000) == 2.000000000000000435
[1] TRUE

Also :
> (4.2300000000000000-2.230000000000000) == 2.0000000000000001
[1] FALSE
> (4.2300000000000000-2.230000000000000) == 2.0000000000000003
[1] TRUE
> (4.2300000000000000-2.230000000000000) == 2.0000000000000004
[1] TRUE
> (4.2300000000000000-2.230000000000000) == 2.0000000000000005
[1] TRUE
> (4.2300000000000000-2.230000000000000) == 2.0000000000000006 <-- 3,5 I can understand being true if rounding occurring, but 6?
[1] TRUE
> (4.2300000000000000-2.230000000000000) == 2.0000000000000007
[1] FALSE
> (4.2300000000000000-2.230000000000000) == 2.0000000000000008
[1] FALSE
> (4.2300000000000000-2.230000000000000) == 2.0000000000000009
[1] FALSE
> (4.2300000000000000-2.230000000000000) == 2.0000000000000010
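
The pattern in these comparisons looks consistent with the spacing of doubles rather than with junk. Between 2 and 4, adjacent doubles are 2^-51 (about 4.44e-16) apart, and a decimal literal is rounded to the nearest double when it is parsed. A minimal sketch, assuming IEEE 754 double precision; the name `ulp` is introduced here only for illustration:

```r
a   <- 4.23 - 2.23
ulp <- 2^-51                          # spacing of adjacent doubles in [2, 4)

a == 2                                # FALSE
a == 2 + ulp                          # TRUE: a is the first double above 2
(a - 2) == 2 * .Machine$double.eps    # TRUE: double.eps is 2^-52

# Offsets from 2 of the midpoints on either side of a. A literal whose
# offset from 2 lies between them is nearest to a, so it parses as a.
sprintf("%.3g", c(0.5, 1.5) * ulp)    # roughly 2.22e-16 and 6.66e-16
```

On that reading, the literals ending in 3, 4, 5 and 6 all fall between the two midpoints and parse to the same double as `a`, the one ending in 1 falls below the lower midpoint and parses as 2, and the one ending in 7 falls above the upper midpoint and parses as the next double up.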

This is an example of junk being added in the FPU:
> formatC(a, digits=20)
[1] "2.0000000000000004441"

I don't know if this is just a formatC error when using more than 16  
decimal places or if this junk is what is stopping the equality from being  
true:
> formatC(a, digits=16)
[1] "                2"
> formatC(a, digits=17)  <-- 16 decimal places, 17 significant figures shown
[1] "2.0000000000000004" <-- the problem is the 4 at the end

Obviously the bytes are divided between the exponent and mantissa in  
16-16bit share it seems, but this doesn't account for the 16th decimal  
place behaviour does it?
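
For reference, an IEEE 754 double does not split its bits evenly: it uses 1 sign bit, 11 exponent bits and 52 stored significand bits (53 counting the implicit leading bit). A short sketch that reads these figures from R's `.Machine` list, assuming the platform uses IEEE 754 doubles:

```r
.Machine$double.digits     # significand bits, normally 53
.Machine$double.exponent   # exponent bits, normally 11
.Machine$double.eps        # 2^-52, gap between 1 and the next double

# Decimal digits that a 53-bit significand can carry: just under 16,
# which is why differences show up around the 16th significant figure.
.Machine$double.digits * log10(2)
```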

If any one has a work around or reason why this should occur it would be  
useful to know.

What I would like is to be able to do sums such as (2.3456 - 2) == 0.3456  
and get a sensible answer - any suggestions?  Currently the only way is  
to formatC the expression to a known number of decimal places - is there  
a better way?
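
Two common alternatives to formatting the result as a string are sketched below: comparing with an explicit tolerance, and scaling to whole units before comparing. The helper `near` and its default tolerance of 1e-9 are hypothetical, chosen only for illustration; a suitable tolerance depends on the calculation.

```r
# Hypothetical helper: TRUE when x and y differ by less than tol.
near <- function(x, y, tol = 1e-9) abs(x - y) < tol

near(2.3456 - 2, 0.3456)            # TRUE
near(4.23 - 2.23, 2)                # TRUE
near(2.3456 - 2, 0.3457)            # FALSE: differs in the 4th decimal

# When the number of decimal places is known, scale to whole units and
# round, so that the comparison is between integer-valued doubles.
round((2.3456 - 2) * 1e4) == 3456   # TRUE
round((4.23 - 2.23) * 100) == 200   # TRUE
```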

Many thanks

Tom


-- 
Dr. Thomas McCallum
Systems Architect,
Level E Limited
ETTC, The King's Buildings
Mayfield Road,
Edinburgh EH9 3JL, UK
Work  +44 (0) 131 472 4813
Fax:  +44 (0) 131 472 4719
http://www.levelelimited.com
Email: tom at levelelimited.com

Level E is a limited company incorporated in Scotland. The c...{{dropped}}

Peter Dalgaard

2006-Dec-09 13:48 UTC

[Rd] Floating point maths in R

Tom McCallum wrote:
> Hi,
>
> I am not sure if this is just me using R (R-2.3.1 and R-2.4.0) in the
> wrong way or if there is a more serious bug.  I was having problems
> getting some calculations to add up so I ran the following tests:

Please read FAQ 7.31 and the reference therein.

http://cran.r-project.org/doc/FAQ/R-FAQ.html#Why-doesn_0027t-R-think-these-numbers-are-equal_003f

(Short answer: you cannot represent thirds exactly in decimal, nor
tenths in binary.)
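
The short answer can be seen by printing a few values with more digits than R shows by default. This is a minimal sketch, assuming IEEE 754 doubles; only fractions whose denominator is a power of 2 are stored exactly.

```r
# 0.5 and 0.25 are exact; 0.1 and 0.23 are stored as nearby binary fractions.
sprintf("%.20f", c(0.5, 0.25, 0.1, 0.23))

0.5 + 0.25 == 0.75    # TRUE: every operand is exactly representable
0.1 + 0.2 == 0.3      # FALSE: each operand carries its own rounding error

# In 4.2 - 2.2 the two rounding errors happen to cancel;
# in 4.23 - 2.23 they do not.
(4.2 - 2.2) == 2      # TRUE
(4.23 - 2.23) == 2    # FALSE
```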

Duncan Murdoch

2006-Dec-09 13:49 UTC

[Rd] Floating point maths in R

On 12/9/2006 8:29 AM, Tom McCallum wrote:
> Hi,
>
> I am not sure if this is just me using R (R-2.3.1 and R-2.4.0) in the
> wrong way or if there is a more serious bug.  I was having problems
> getting some calculations to add up so I ran the following tests:

You should read the FAQ item "Why doesn't R think these numbers are
equal?" at

http://cran.r-project.org/doc/FAQ/R-FAQ.html#Why-doesn_0027t-R-think-these-numbers-are-equal_003f

Duncan Murdoch
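
Base R provides all.equal() for comparing numbers up to a tolerance, which is the usual substitute for == on computed doubles. A minimal sketch follows; it assumes the default tolerance (about 1.5e-8) suits the data. Because all.equal() returns a character description rather than FALSE when the values differ, it is wrapped in isTRUE().

```r
x <- 4.23 - 2.23

x == 2                                   # FALSE: exact comparison
isTRUE(all.equal(x, 2))                  # TRUE: equal within tolerance
isTRUE(all.equal(2.3456 - 2, 0.3456))    # TRUE
isTRUE(all.equal(1, 1.001))              # FALSE: differs by far more than the tolerance
```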
