thr3ads.net - R devel - [Rd] Floating point issue [Jul 2022]

If this information is useful, please help other people find it:
Share via:

Antoine Fabri

2022-Jul-10 21:38 UTC

[Rd] Floating point issue

Thanks, I get the exact same results as yours

``` r
bitC <- function(x) noquote(vapply(as.double(x), function(x) { # split one
double
  b <- substr(as.character(rev(numToBits(x))), 2L, 2L)
      paste0(c(b[1L], " ", b[2:12], " | ", b[13:64]),
collapse = "")
    }, ""))
bitC(10^25)
#> [1] 0 10001010010 | 0000100010110010101000101100001010000000001010010001
bitC(10000000000000000905969664)
#> [1] 0 10001010010 | 0000100010110010101000101100001010000000001010010010
bitC(10000000000000000905969664 - 10^25)
#> [1] 0 10000011110 | 0000000000000000000000000000000000000000000000000000
bitC(1e25)
#> [1] 0 10001010010 | 0000100010110010101000101100001010000000001010010001
```

<sup>Created on 2022-07-10 by the [reprex package](
https://reprex.tidyverse.org) (v2.0.1)</sup>

Le dim. 10 juil. 2022 ? 22:23, Bill Dunlap <williamwdunlap at gmail.com> a
?crit :
> The following function, 'bitC' from ?numToBits, displays the bits
in a
> double precision number, separated into the sign bit, the 11 exponent bits,
> and the 52 bits in the mantissa.  I've shown the results with your
numbers
> from R-2.4.0 on my Windows 11 Lenovo laptop: what do you get?
>
> > bitC <- function(x) noquote(vapply(as.double(x), function(x) { #
split
> one double
> +     b <- substr(as.character(rev(numToBits(x))), 2L, 2L)
> +     paste0(c(b[1L], " ", b[2:12], " | ", b[13:64]),
collapse = "")
> +   }, ""))
> > bitC(10^25)
> # [1] 0 10001010010 | 0000100010110010101000101100001010000000001010010001
> > bitC(10000000000000000905969664)
> # [1] 0 10001010010 | 0000100010110010101000101100001010000000001010010001
> > bitC(10000000000000000905969664 - 10^25)
> # [1] 0 00000000000 | 0000000000000000000000000000000000000000000000000000
> > bitC(1e25)
> # [1] 0 10001010010 | 0000100010110010101000101100001010000000001010010001
>
> -Bill
>
> On Sun, Jul 10, 2022 at 7:00 AM Antoine Fabri <antoine.fabri at
gmail.com>
> wrote:
>
>> Dear r-devel,
>>
>> For some numbers, the printed value is not equivalent to the input :
>>
>> options(scipen = 999)
>> ## GOOD
>> 1e24
>> #> [1]  999999999999999983222784
>> 1e24 == 999999999999999983222784
>> #> [1] TRUE
>>
>> ## BAD
>> 1e25
>> #> [1] 10000000000000000905969664
>> 1e25 == 10000000000000000905969664
>> #> [1] FALSE
>>
>> ## STILL BAD
>> 10000000000000000905969664
>> #> [1] 10000000000000003053453312
>>
>> ## GOOD AGAIN
>> 10000000000000003053453312
>> #> [1] 10000000000000003053453312
>>
>> # Additionally
>> 10000000000000000000000000 == 1e25
>> #> [1] FALSE
>>
>> Are these bugs ?
>>
>>         [[alternative HTML version deleted]]
>>
>> ______________________________________________
>> R-devel at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>
>
	[[alternative HTML version deleted]]

Brodie Gaslam

2022-Jul-11 01:10 UTC

head link

[Rd] Floating point issue

The results are not exactly the same.  Notice that on Bill's system the 
bit pattern of 10^25 and 10000000000000000905969664 are the same, but 
not so on yours.  So there is a mismatch happening on parsing between 
your M1 mac and other's systems.

This is the main thing I wanted to point out.  But since I'm here I'm 
going to add some additional lazily researched speculation.

As Dirk points out, M1 does not have long double, and if you look at 
what I think is responsible for parsing of numbers like the ones we're 
discussing here, we see[1]:

     double R_strtod5(const char *str, char **endptr, char dec,
     		 Rboolean NA, int exact)
     {
         LDOUBLE ans = 0.0;

IIRC long double on systems that implement it as 80 bits (most that have 
x87 coprocessors) has 63-4 bits of precision, vs 53 for 64 bit long 
double.  Roughly speaking, that's 19-20 digits of base 10 precision for 
long double, vs 15-16 for 64 bit double.  Then:

     > substr(rep("10000000000000000905969664", 2),  c(1, 1), c(16,
20))
     [1] "1000000000000000"     "10000000000000000905"

Again, I have not carefully researched this, but it seems likely that 
parsing is producing different a different outcome in this case because 
the intermediate values generated can be kept at higher precision on 
systems with 80 bit long doubles prior to coercing to double for the 
final result.

IIRC, if you need invertible deparsing/parsing I think you can use:

     deparse(1e25, control=c('hexNumeric'))

Although I don't have an 80-bit-less system to test on (and I am too 
lazy to recompile R without long double to test).

Best,

B.

[1]: https://github.com/r-devel/r-svn/blob/master/src/main/util.c#L1993



On 7/10/22 5:38 PM, Antoine Fabri wrote:> Thanks, I get the exact same results as yours
> 
> ``` r
> bitC <- function(x) noquote(vapply(as.double(x), function(x) { # split
one
> double
>    b <- substr(as.character(rev(numToBits(x))), 2L, 2L)
>        paste0(c(b[1L], " ", b[2:12], " | ", b[13:64]),
collapse = "")
>      }, ""))
> bitC(10^25)
> #> [1] 0 10001010010 |
0000100010110010101000101100001010000000001010010001
> bitC(10000000000000000905969664)
> #> [1] 0 10001010010 |
0000100010110010101000101100001010000000001010010010
> bitC(10000000000000000905969664 - 10^25)
> #> [1] 0 10000011110 |
0000000000000000000000000000000000000000000000000000
> bitC(1e25)
> #> [1] 0 10001010010 |
0000100010110010101000101100001010000000001010010001
> ```
> 
> <sup>Created on 2022-07-10 by the [reprex package](
> https://reprex.tidyverse.org) (v2.0.1)</sup>
> 
> Le dim. 10 juil. 2022 ? 22:23, Bill Dunlap <williamwdunlap at
gmail.com> a
> ?crit :
> 
>> The following function, 'bitC' from ?numToBits, displays the
bits in a
>> double precision number, separated into the sign bit, the 11 exponent
bits,
>> and the 52 bits in the mantissa.  I've shown the results with your
numbers
>> from R-2.4.0 on my Windows 11 Lenovo laptop: what do you get?
>>
>>> bitC <- function(x) noquote(vapply(as.double(x), function(x) { #
split
>> one double
>> +     b <- substr(as.character(rev(numToBits(x))), 2L, 2L)
>> +     paste0(c(b[1L], " ", b[2:12], " | ",
b[13:64]), collapse = "")
>> +   }, ""))
>>> bitC(10^25)
>> # [1] 0 10001010010 |
0000100010110010101000101100001010000000001010010001
>>> bitC(10000000000000000905969664)
>> # [1] 0 10001010010 |
0000100010110010101000101100001010000000001010010001
>>> bitC(10000000000000000905969664 - 10^25)
>> # [1] 0 00000000000 |
0000000000000000000000000000000000000000000000000000
>>> bitC(1e25)
>> # [1] 0 10001010010 |
0000100010110010101000101100001010000000001010010001
>>
>> -Bill
>>
>> On Sun, Jul 10, 2022 at 7:00 AM Antoine Fabri <antoine.fabri at
gmail.com>
>> wrote:
>>
>>> Dear r-devel,
>>>
>>> For some numbers, the printed value is not equivalent to the input
:
>>>
>>> options(scipen = 999)
>>> ## GOOD
>>> 1e24
>>> #> [1]  999999999999999983222784
>>> 1e24 == 999999999999999983222784
>>> #> [1] TRUE
>>>
>>> ## BAD
>>> 1e25
>>> #> [1] 10000000000000000905969664
>>> 1e25 == 10000000000000000905969664
>>> #> [1] FALSE
>>>
>>> ## STILL BAD
>>> 10000000000000000905969664
>>> #> [1] 10000000000000003053453312
>>>
>>> ## GOOD AGAIN
>>> 10000000000000003053453312
>>> #> [1] 10000000000000003053453312
>>>
>>> # Additionally
>>> 10000000000000000000000000 == 1e25
>>> #> [1] FALSE
>>>
>>> Are these bugs ?
>>>
>>>          [[alternative HTML version deleted]]
>>>
>>> ______________________________________________
>>> R-devel at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>>
>>
> 
> 	[[alternative HTML version deleted]]
> 
> ______________________________________________
> R-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel

Bill Dunlap

2022-Jul-11 01:14 UTC

head link

[Rd] Floating point issue

You said you got the same results as I did.  Not so, the parsing of the
long numeric differs in the last bit of the mantissa.

(A=Antoine, B=Bill):

bitC(10^25)
#A [1] 0 10001010010 | 0000100010110010101000101100001010000000001010010001
#B [1] 0 10001010010 | 0000100010110010101000101100001010000000001010010001

bitC(10000000000000000905969664)
#A [1] 0 10001010010 | 0000100010110010101000101100001010000000001010010010
#B [1] 0 10001010010 | 0000100010110010101000101100001010000000001010010001

bitC(10000000000000000905969664 - 10^25)
#A [1] 0 10000011110 | 0000000000000000000000000000000000000000000000000000
#B [1] 0 00000000000 | 0000000000000000000000000000000000000000000000000000

On Sun, Jul 10, 2022 at 2:39 PM Antoine Fabri <antoine.fabri at gmail.com>
wrote:
> Thanks, I get the exact same results as yours
>
> ``` r
> bitC <- function(x) noquote(vapply(as.double(x), function(x) { # split
one
> double
>   b <- substr(as.character(rev(numToBits(x))), 2L, 2L)
>       paste0(c(b[1L], " ", b[2:12], " | ", b[13:64]),
collapse = "")
>     }, ""))
> bitC(10^25)
> #> [1] 0 10001010010 |
0000100010110010101000101100001010000000001010010001
> bitC(10000000000000000905969664)
> #> [1] 0 10001010010 |
0000100010110010101000101100001010000000001010010010
> bitC(10000000000000000905969664 - 10^25)
> #> [1] 0 10000011110 |
0000000000000000000000000000000000000000000000000000
> bitC(1e25)
> #> [1] 0 10001010010 |
0000100010110010101000101100001010000000001010010001
> ```
>
> <sup>Created on 2022-07-10 by the [reprex package](
> https://reprex.tidyverse.org) (v2.0.1)</sup>
>
> Le dim. 10 juil. 2022 ? 22:23, Bill Dunlap <williamwdunlap at
gmail.com> a
> ?crit :
>
>> The following function, 'bitC' from ?numToBits, displays the
bits in a
>> double precision number, separated into the sign bit, the 11 exponent
bits,
>> and the 52 bits in the mantissa.  I've shown the results with your
numbers
>> from R-2.4.0 on my Windows 11 Lenovo laptop: what do you get?
>>
>> > bitC <- function(x) noquote(vapply(as.double(x), function(x) {
# split
>> one double
>> +     b <- substr(as.character(rev(numToBits(x))), 2L, 2L)
>> +     paste0(c(b[1L], " ", b[2:12], " | ",
b[13:64]), collapse = "")
>> +   }, ""))
>> > bitC(10^25)
>> # [1] 0 10001010010 |
0000100010110010101000101100001010000000001010010001
>> > bitC(10000000000000000905969664)
>> # [1] 0 10001010010 |
0000100010110010101000101100001010000000001010010001
>> > bitC(10000000000000000905969664 - 10^25)
>> # [1] 0 00000000000 |
0000000000000000000000000000000000000000000000000000
>> > bitC(1e25)
>> # [1] 0 10001010010 |
0000100010110010101000101100001010000000001010010001
>>
>> -Bill
>>
>> On Sun, Jul 10, 2022 at 7:00 AM Antoine Fabri <antoine.fabri at
gmail.com>
>> wrote:
>>
>>> Dear r-devel,
>>>
>>> For some numbers, the printed value is not equivalent to the input
:
>>>
>>> options(scipen = 999)
>>> ## GOOD
>>> 1e24
>>> #> [1]  999999999999999983222784
>>> 1e24 == 999999999999999983222784
>>> #> [1] TRUE
>>>
>>> ## BAD
>>> 1e25
>>> #> [1] 10000000000000000905969664
>>> 1e25 == 10000000000000000905969664
>>> #> [1] FALSE
>>>
>>> ## STILL BAD
>>> 10000000000000000905969664
>>> #> [1] 10000000000000003053453312
>>>
>>> ## GOOD AGAIN
>>> 10000000000000003053453312
>>> #> [1] 10000000000000003053453312
>>>
>>> # Additionally
>>> 10000000000000000000000000 == 1e25
>>> #> [1] FALSE
>>>
>>> Are these bugs ?
>>>
>>>         [[alternative HTML version deleted]]
>>>
>>> ______________________________________________
>>> R-devel at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>>
>>
	[[alternative HTML version deleted]]

R devel - Jul 2022 - Floating point issue

[Rd] Floating point issue

[Rd] Floating point issue

[Rd] Floating point issue