thr3ads.net - R devel - [Rd] Floating point issue [Jul 2022]

Antoine Fabri

2022-Jul-10 14:00 UTC

[Rd] Floating point issue

Dear r-devel,

For some numbers, the printed value is not equivalent to the input:

options(scipen = 999)
## GOOD
1e24
#> [1]  999999999999999983222784
1e24 == 999999999999999983222784
#> [1] TRUE

## BAD
1e25
#> [1] 10000000000000000905969664
1e25 == 10000000000000000905969664
#> [1] FALSE

## STILL BAD
10000000000000000905969664
#> [1] 10000000000000003053453312

## GOOD AGAIN
10000000000000003053453312
#> [1] 10000000000000003053453312

# Additionally
10000000000000000000000000 == 1e25
#> [1] FALSE

Are these bugs?

Dirk Eddelbuettel

2022-Jul-10 14:09 UTC

No, that is how computers work (with floating point numbers).

Please see R FAQ 7.31 "Why doesn't R think these numbers are equal?" at

https://cran.r-project.org/doc/FAQ/R-FAQ.html#Why-doesn_0027t-R-think-these-numbers-are-equal_003f
and the references therein for more.

Dirk

-- 
dirk.eddelbuettel.com | @eddelbuettel | edd at debian.org

Bill Dunlap

2022-Jul-10 20:23 UTC

The following function, 'bitC' from ?numToBits, displays the bits in a
double precision number, separated into the sign bit, the 11 exponent bits,
and the 52 bits in the mantissa.  I've shown the results with your numbers
from R-4.2.0 on my Windows 11 Lenovo laptop: what do you get?

> bitC <- function(x) noquote(vapply(as.double(x), function(x) { # split one double
+     b <- substr(as.character(rev(numToBits(x))), 2L, 2L)
+     paste0(c(b[1L], " ", b[2:12], " | ", b[13:64]), collapse = "")
+   }, ""))
> bitC(10^25)
# [1] 0 10001010010 | 0000100010110010101000101100001010000000001010010001
> bitC(10000000000000000905969664)
# [1] 0 10001010010 | 0000100010110010101000101100001010000000001010010001
> bitC(10000000000000000905969664 - 10^25)
# [1] 0 00000000000 | 0000000000000000000000000000000000000000000000000000
> bitC(1e25)
# [1] 0 10001010010 | 0000100010110010101000101100001010000000001010010001
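
For readers who want to repeat the same check outside R, the following is a rough C counterpart of bitC, offered as a sketch rather than a translation. It assumes a 64-bit IEEE 754 `double`; the names `double_fields` and `split_double` are hypothetical and chosen only for this example.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* The three fields of an IEEE 754 double (hypothetical type name). */
typedef struct {
    unsigned sign;      /* 1 bit */
    unsigned exponent;  /* 11 bits, stored with a bias of 1023 */
    uint64_t fraction;  /* 52 stored mantissa bits */
} double_fields;

/* Copy the raw bits of x into an integer and slice out the fields. */
double_fields split_double(double x) {
    uint64_t bits;
    double_fields f;
    assert(sizeof bits == sizeof x);  /* assumption: 64-bit double */
    memcpy(&bits, &x, sizeof bits);
    f.sign = (unsigned)(bits >> 63);
    f.exponent = (unsigned)((bits >> 52) & 0x7FFu);
    f.fraction = bits & ((UINT64_C(1) << 52) - 1);
    return f;
}
```

For 1e25 the fields should come out as sign 0, biased exponent 1106 (binary 10001010010, i.e. 2^83 after removing the bias), and fraction 0x08b2a2c280291, which are the same bits as in the bitC output shown in this message.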

-Bill


Duncan Murdoch

2022-Jul-10 20:34 UTC

Here's a similar question that may give you some ideas for dealing with 
this:

https://stackoverflow.com/q/72899973/2554330

Duncan Murdoch


Simon Urbanek

2022-Jul-12 01:02 UTC

I don't think there is any guarantee that unrepresentable numbers are parsed
into defined patterns, because printing is done by the OS while parsing is done
by R. The way R parses decimal numbers[1] is simply by using the obvious res =
res * 10 + digit, and it can easily be checked that for doubles the
representation so obtained from 10000000000000000905969664 is
0x1.08b2a2c280292p+83 (see below if you want to see it yourself), which is not
the same as 10^25, which is 0x1.08b2a2c280291p+83. This is true on all platforms;
it is not specific to M1. The only difference is that if you were to use a
different type you can obtain a different result - and that is not well-defined
(e.g. long doubles have no guarantees at all as to the precision).  Note that the
decimal string above would require 83 bits of precision, which is not
representable.

(BTW: to make it even more fun, if you were to use double res = 1; repeat(25)
res = res * 10; in C, i.e. the naive computation of the original 10^25, you'd get
9999999999999998758486016 and 0x1.08b2a2c28029p+83)
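
The repeated multiplication mentioned in this aside can be written out as a small C function. This is only a sketch: `naive_pow10` is a hypothetical name, and the result described below assumes double arithmetic is evaluated in plain IEEE 754 double precision (not x87 extended precision).

```c
#include <assert.h>

/* Naive 10^n by repeated multiplication; every step rounds to double.
   Hypothetical helper for illustration only. */
double naive_pow10(int n) {
    double res = 1;
    assert(n >= 0);
    for (int i = 0; i < n; i++)
        res = res * 10;
    return res;
}
```

Up to n = 22 every intermediate product is exactly representable, so nothing is lost. The products for n = 23, 24 and 25 each fall exactly halfway between two doubles and are rounded to even, and the last of these roundings leaves `naive_pow10(25)` one unit in the last place (2^31) below the correctly rounded 1e25, in line with the value quoted in the message.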

Given that printing is done by the OS and parsing by R, I don't think R
guarantees anything. If you want a representable number, you'd use the binary
representation (sprintf("%a") or hex-mode deparse as noted). One could argue
that it could make sense to change it one way or another - either having R do it
all or having the OS do it all. In the latter case one may obtain more
consistent results (e.g. system strtod() yields the original value even on M1),
but it would be OS-specific. In the former R could impose its own guarantees -
but currently it does not.
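
The binary-representation route can be sketched in C as well. The function below (hypothetical name `roundtrip_hex`) prints a double with the C99 `%a` conversion and reads it back with `strtod()`, which accepts hexadecimal floating-point input in C99 and later. It illustrates the idea only and says nothing about how R implements printing or parsing.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Write x as a hex float ("%a") and parse it back with strtod().
   Hypothetical helper; a hex float names the bits exactly, so no
   decimal rounding is involved in either direction. */
double roundtrip_hex(double x) {
    char buf[64];
    int n = snprintf(buf, sizeof buf, "%a", x);
    assert(n > 0 && (size_t)n < sizeof buf);
    return strtod(buf, NULL);
}
```

For any finite double the value returned should compare equal to the input, including 1e25 and its two neighbours discussed in this thread, because the hex form spells out the stored mantissa and exponent directly.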

Cheers,
Simon

[1] -
https://github.com/r-devel/r-svn/blob/97c0a73f1758d09088c200f924d27b362d55ccdc/src/main/util.c#L2094

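#include <stdio.h>
#include <math.h>
#include <stdlib.h>

int main(void) {
  const char *str = "10000000000000000905969664", *c = str;
  double ans = 0;
  /* naive decimal parsing: ans = 10 * ans + digit, printing every step */
  while (*c) {
    ans = 10 * ans + (*(c++) - '0');
    printf("%a\n", ans);
  }
  /* the C library's own conversion of the same string, for comparison */
  printf("atof: %a\n", atof(str));
  /* 10^25 as computed by pow(), in decimal and in hex */
  double pow1025 = pow(10.0, 25);
  printf("--\n10^25:\n%25.f\n%a\n", pow1025, pow1025);
  return 0;
}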

0x1p+0
0x1.4p+3
0x1.9p+6
0x1.f4p+9
0x1.388p+13
0x1.86ap+16
0x1.e848p+19
0x1.312dp+23
0x1.7d784p+26
0x1.dcd65p+29
0x1.2a05f2p+33
0x1.74876e8p+36
0x1.d1a94a2p+39
0x1.2309ce54p+43
0x1.6bcc41e9p+46
0x1.c6bf52634p+49
0x1.1c37937e08p+53
0x1.6345785d8a001p+56
0x1.bc16d674ec801p+59
0x1.158e460913d01p+63
0x1.5af1d78b58c41p+66
0x1.b1ae4d6e2ef51p+69
0x1.0f0cf064dd593p+73
0x1.52d02c7e14af8p+76
0x1.a784379d99db6p+79
0x1.08b2a2c280292p+83
atof: 0x1.08b2a2c280291p+83
--
10^25:
10000000000000000905969664
0x1.08b2a2c280291p+83

