thr3ads.net - R devel - [Rd] Converting non-32-bit integers from python to R to use bit64: reticulate [Jun 2019]

Juan Telleria Ruiz de Aguirre

2019-May-30 16:46 UTC

[Rd] Converting non-32-bit integers from python to R to use bit64: reticulate

Thank you Gabriel for valuable insights on the 64-bit integers topic.

In addition, my earlier statement was wrong: Python 3 seems to have unlimited
(variable) size integers. Here is the related CPython code:

https://github.com/python/cpython/blob/master/Objects/longobject.c

The split between 32-bit and 64-bit integers seems to exist only in Python 2.
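
As a minimal sketch of the point above, the following Python 3 snippet shows an integer growing past 64 bits and checks whether a value would fit in an R integer or in a signed 64-bit slot. The helper names `fits_r_integer` and `fits_int64` are hypothetical and are not part of reticulate or bit64. The R range assumes that the smallest 32-bit value is reserved for NA; any NA sentinel on the 64-bit side is ignored here.

```python
# R's 32-bit integer range; the smallest 32-bit value is reserved for NA.
R_INT_MAX = 2**31 - 1
R_INT_MIN = -(2**31) + 1


def fits_r_integer(value: int) -> bool:
    """Return True if a Python int can be stored in an R integer without loss."""
    return R_INT_MIN <= value <= R_INT_MAX


def fits_int64(value: int) -> bool:
    """Return True if a Python int fits in a signed 64-bit slot."""
    return -(2**63) <= value <= 2**63 - 1


big = 2**100  # Python 3 ints grow as needed, so this does not overflow
print(big.bit_length())            # 101
print(fits_r_integer(2**31 - 1))   # True
print(fits_r_integer(2**31))       # False
print(fits_int64(2**31))           # True
print(fits_int64(big))             # False
```

A converter could use checks like these to decide between an R integer, a 64-bit representation, or a fallback such as a double or a character string.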

Best,
Juan

On Wednesday, May 29, 2019, Gabriel Becker <gabembecker at gmail.com> wrote:
> Hi Juan,
>
> Comments inline.
>
> On Wed, May 29, 2019 at 12:48 PM Juan Telleria Ruiz de Aguirre <
> jtelleria.rproject at gmail.com> wrote:
>
>> Dear R Developers,
>>
>> There is an interesting issue related to the "reticulate" R package which
>> discusses how to convert Python's non-32-bit integers to R, and which has
>> had quite an exhaustive discussion:
>>
>> https://github.com/rstudio/reticulate/issues/323
>>
>> Python seems to handle integers differently from R, and depends on the
>> system architecture: on 32-bit systems it uses 32-bit integers, and on
>> 64-bit systems it uses 64-bit integers.
>>
>> So my question is:
>>
>> As regards R's C interface, how costly would it be to convert INTSXP from
>> 32 bits to 64 bits using C, on 64-bit systems? Do the benefits outweigh
>> the costs? And should such development be handled within R Core /
>> Ordinary Members, or should it be left to package maintainers?
>>
>
> Well, I am not an R-core member, but I can mention a few things:
>
> 1. This seems like it would make the results of R code non-reproducible
> between 32 and 64bit versions of R; at least some code would give different
> results (at the very least in terms of when integer values overflow to NA,
> which is documented behavior).
> 2. Obviously all integer data would take twice as much memory, memory
> bandwidth, space in caches, etc, even when it doesn't need it.
> 3. Various places treat data / data pointers coming out of INTSXP and
> LGLSXP objects the same within the internal R sources (as currently they're
> both int/int*). Catching and fixing all those wouldn't be impossible, but
> it would take at least some doing.
>
> For me personally 1 seems like a big problem, and 3 makes the conversion
> more work than it might have seemed initially.
>
> As a related side note, as far as I understand what I've heard from R-core
> members directly, the choice to not have multiple types of integers is
> intentional and unlikely to change.
>
> Best,
> ~G
>
>
>
>
>>
>> Thank you! :)
>>
>> ______________________________________________
>> R-devel at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>
>
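
Points 1 and 2 of the reply above can be illustrated with a small Python sketch. It is only an analogy, not R's implementation: the hypothetical helper `r_style_add` returns `None` in place of NA when a sum leaves the signed range of the chosen width, and the `array` module stands in for fixed-width integer storage (typecode `"i"` is a C int, typically 4 bytes; `"q"` is a C long long, 8 bytes).

```python
from array import array


def r_style_add(a: int, b: int, bits: int = 32):
    """Add two ints; return None (standing in for NA) on signed overflow.

    The smallest value of the width is treated as reserved for NA,
    so the valid range is symmetric: -limit .. limit.
    """
    limit = 2 ** (bits - 1) - 1
    total = a + b
    return total if -limit <= total <= limit else None


# Point 1: the same code gives different answers at different widths.
print(r_style_add(2**31 - 1, 1, bits=32))  # None
print(r_style_add(2**31 - 1, 1, bits=64))  # 2147483648

# Point 2: the same values need more storage in a wider type.
values = range(1000)
narrow = array("i", values)  # C int, typically 4 bytes per element
wide = array("q", values)    # C long long, 8 bytes per element
print(narrow.itemsize * len(narrow), wide.itemsize * len(wide))
```

On common platforms the last line reports twice as many bytes for the wide array, even though every value would fit in the narrow one.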

Martin Maechler

2019-Jun-01 16:29 UTC

>>>>> Juan Telleria Ruiz de Aguirre 
>>>>>     on Thu, 30 May 2019 18:46:29 +0200 writes:
    >Thank you Gabriel for valuable insights on the 64-bit integers topic.
    >In addition, my statement was wrong, as Python3 seems to have unlimited
    >(and variable) size integers.
    ....

If you are interested in using unlimited size integers, you
could use the CRAN R package 'gmp' which builds on the GMP = GNU
MP = GNU Multi Precision C library.

   https://cran.r-project.org/package=gmp

(and for arbitrary precision "floats", see CRAN pkg 'Rmpfr'
 built on package gmp, and both the GNU C libraries  GMP and
 MPFR:
	   https://cran.r-project.org/package=Rmpfr
) 
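
For readers coming from the Python side, a rough analog of the two ideas (unlimited-size integers and user-chosen floating precision) can be sketched with the standard library alone. This is not the API of 'gmp' or 'Rmpfr'; it uses Python's built-in `int` and the `decimal` module, and `decimal` works in base 10, whereas MPFR works in binary.

```python
from decimal import Decimal, getcontext

# Arbitrary-size integer: the product stays exact however large it gets.
factorial_30 = 1
for k in range(1, 31):
    factorial_30 *= k
print(factorial_30)

# Arbitrary-precision decimal float: the precision is chosen by the user.
getcontext().prec = 50  # 50 significant digits
third = Decimal(1) / Decimal(3)
print(third)
```

The integer result is far beyond the 64-bit range, and the quotient carries 50 significant digits instead of the roughly 16 of a double.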


Juan Telleria Ruiz de Aguirre

2019-Jun-03 04:50 UTC

Thank you Martin for pointing out and developing the 'Rmpfr' library for
unlimited-size integers (GNU C GMP) and arbitrary-precision floats (GNU C
MPFR):

https://cran.r-project.org/package=Rmpfr

My question is, in the long term (for R 3.7.0 or R 3.8.0):

Would it make sense for GMP to replace INTSXP, and MPFR to replace
REALSXP code? With this, an integer would always be an integer, and a
double-precision numeric float always a double-precision numeric float,
without occasional casting underneath.

And would the R community / R Ordinary Members be willing to help R Core
with such an implementation (if it makes sense and R Core wants to adopt it)?

Thank you all! :)

>
> If you are interested in using unlimited size integers, you
> could use the CRAN R package 'gmp' which builds on the GMP = GNU
> MP = GNU Multi Precision C library.
>
>    https://cran.r-project.org/package=gmp
>
> (and for arbitrary precision "floats", see CRAN pkg 'Rmpfr'
>  built on package gmp, and both the GNU C libraries  GMP and
>  MPFR:
>            https://cran.r-project.org/package=Rmpfr
> )
>
>