Hi,

I am not sure if this is just me using R (R-2.3.1 and R-2.4.0) in the wrong way or if there is a more serious bug. I was having problems getting some calculations to add up so I ran the following tests:

> (2.34567 - 2.00000) == 0.34567    <------- should be true
[1] FALSE
> (2.23-2.00) == 0.23               <------- should be true
[1] FALSE
> 4-2==2
[1] TRUE
> (4-2)==2
[1] TRUE
> (4.0-2)==2
[1] TRUE
> (4.0-2.0)==2
[1] TRUE
> (4.0-2.0)==2.0
[1] TRUE
> (4.2-2.2)==2.0
[1] TRUE
> (4.20-2.20)==2.00
[1] TRUE
> (4.23-2.23)==2.00                 <------- should be true
[1] FALSE
> (4.230-2.230)==2.000              <------- should be true
[1] FALSE
> (4.230-2.230)==2.00               <------- should be true
[1] FALSE
> (4.230-2.23)==2.00                <------- should be true
[1] FALSE

I have tried these on both 64 and 32-bit machines. Surely R should be able to do maths to 2 decimal places and be able to test these simple expressions? The problem occurs as in the 16th decimal place junk is being placed by the FPU it seems. I have also tried:

> (4.2300000000000000-2.230000000000000) == 2
[1] FALSE
> a <- (4.2300000000000000-2.230000000000000)
> a == 2
[1] FALSE
> (4.2300000000000000-2.230000000000000) == 2.0000000000000000
[1] FALSE
> (4.2300000000000000-2.230000000000000) == 2.0000000000000004    <-- correct when add 16th decimal place to 4
[1] TRUE
> (4.2300000000000000-2.230000000000000) == 2.00000000000000043   <-- any values after the 16th decimal place mean that the expression is true
[1] TRUE
> (4.2300000000000000-2.230000000000000) == 2.000000000000000435
[1] TRUE

Also:

> (4.2300000000000000-2.230000000000000) == 2.0000000000000001
[1] FALSE
> (4.2300000000000000-2.230000000000000) == 2.0000000000000003
[1] TRUE
> (4.2300000000000000-2.230000000000000) == 2.0000000000000004
[1] TRUE
> (4.2300000000000000-2.230000000000000) == 2.0000000000000005
[1] TRUE
> (4.2300000000000000-2.230000000000000) == 2.0000000000000006    <-- 3,5 I can understand being true if rounding occurring, but 6?
[1] TRUE
> (4.2300000000000000-2.230000000000000) == 2.0000000000000007
[1] FALSE
> (4.2300000000000000-2.230000000000000) == 2.0000000000000008
[1] FALSE
> (4.2300000000000000-2.230000000000000) == 2.0000000000000009
[1] FALSE
> (4.2300000000000000-2.230000000000000) == 2.0000000000000010

This is an example of junk being added in the FPU:

> formatC(a, digits=20)
[1] "2.0000000000000004441"

I don't know if this is just a formatC error when using more than 16 decimal places or if this junk is what is stopping the equality from being true:

> formatC(a, digits=16)
[1] " 2"
> formatC(a, digits=17)    <-- 16 decimal places, 17 significant figures shown
[1] "2.0000000000000004"   <-- the problem is the 4 at the end

Obviously the bytes are divided between the exponent and mantissa in 16-16bit share it seems, but this doesn't account for the 16th decimal place behaviour does it?

If anyone has a workaround or a reason why this should occur it would be useful to know.

What I would like is to be able to do sums such as (2.3456 - 2 ) == 0.3456 and get a sensible answer - any suggestions? Currently the only way is to formatC the expression to a known number of decimal places - is there a better way?

Many thanks

Tom

--
Dr. Thomas McCallum
Systems Architect, Level E Limited
ETTC, The King's Buildings
Mayfield Road, Edinburgh EH9 3JL, UK
Work +44 (0) 131 472 4813
Fax: +44 (0) 131 472 4719
http://www.levelelimited.com
Email: tom at levelelimited.com
Level E is a limited company incorporated in Scotland.

Tom McCallum wrote:
> Hi,
>
> I am not sure if this is just me using R (R-2.3.1 and R-2.4.0) in the
> wrong way or if there is a more serious bug. I was having problems
> getting some calculations to add up so I ran the following tests:

Please read FAQ 7.31 and the reference therein.

http://cran.r-project.org/doc/FAQ/R-FAQ.html#Why-doesn_0027t-R-think-these-numbers-are-equal_003f

(short answer: You can not represent thirds exactly in decimal nor tenths in binary.)

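As a concrete illustration of the short answer above, here is a minimal sketch of comparing with a tolerance instead of `==`. `all.equal()` and `isTRUE()` are base R; the helper name `nearly_equal` and its default tolerance of `1e-9` are assumptions chosen for illustration and do not come from the thread.

```r
# 4.23 and 2.23 have no exact binary representation, so the difference
# is not exactly 2 (the original post shows it slightly above 2).
a <- 4.23 - 2.23

exact_eq  <- (a == 2)                  # exact comparison, FALSE in the thread
approx_eq <- isTRUE(all.equal(a, 2))   # equal within all.equal's default tolerance

# Hypothetical helper: explicit absolute tolerance chosen by the caller.
nearly_equal <- function(x, y, tol = 1e-9) abs(x - y) < tol

print(c(exact_eq = exact_eq, approx_eq = approx_eq))
print(nearly_equal(2.3456 - 2, 0.3456))
print(nearly_equal(2.23 - 2.00, 0.23))
```

`all.equal()` returns a character description rather than `FALSE` when the values differ, which is why it is wrapped in `isTRUE()` before being used as a condition.
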
On 12/9/2006 8:29 AM, Tom McCallum wrote:
> Hi,
>
> I am not sure if this is just me using R (R-2.3.1 and R-2.4.0) in the
> wrong way or if there is a more serious bug. I was having problems
> getting some calculations to add up so I ran the following tests:

You should read the FAQ item "Why doesn't R think these numbers are equal?" at

http://cran.r-project.org/doc/FAQ/R-FAQ.html#Why-doesn_0027t-R-think-these-numbers-are-equal_003f

Duncan Murdoch
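The "16th decimal place" behaviour in the original post can be reproduced without `formatC`. The sketch below assumes only base R: numerics are IEEE 754 doubles with a 53-bit significand, so just above 2 the representable values are spaced `2 * .Machine$double.eps` apart (about 4.44e-16). The variable names are illustrative. It uses arithmetic rather than long decimal literals, but the same round-to-nearest-double effect is a plausible reading of why 2.0000000000000003 through 2.0000000000000006 all compared equal to `a` in the post.

```r
# R numerics are IEEE 754 doubles with a 53-bit significand.
eps  <- .Machine$double.eps   # 2^-52: gap between 1 and the next double
step <- 2 * eps               # gap between 2 and the next double, about 4.44e-16

a <- 4.23 - 2.23
print(sprintf("%.20f", a))    # show more digits than the default 7
print((a - 2) / step)         # how many steps above 2 the result sits

# A sum is rounded to the nearest representable double, so a whole range
# of offsets in the 16th decimal place collapses onto the same value.
near_two  <- (2 + 1e-16) == 2
one_step  <- c((2 + 3e-16) == 2 + step, (2 + 6e-16) == 2 + step)
two_steps <- (2 + 7e-16) == 2 + 2 * step
print(c(near_two, one_step, two_steps))
```

So the trailing digits are not junk added by the FPU: they are the decimal expansion of the nearest binary value, and they only become visible when more digits are requested than R prints by default.
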