Juan Telleria Ruiz de Aguirre
2019-May-30 16:46 UTC
[Rd] Converting non-32-bit integers from python to R to use bit64: reticulate
Thank you, Gabriel, for the valuable insights on the 64-bit integer topic.

In addition, my statement was wrong: Python 3 seems to have unlimited (variable-size) integers. Here is the related CPython code:

https://github.com/python/cpython/blob/master/Objects/longobject.c

The distinction between 32-bit and 64-bit integers seems to exist only in Python 2.

Best,
Juan

On Wednesday, 29 May 2019, Gabriel Becker <gabembecker at gmail.com> wrote:

> Hi Juan,
>
> Comments inline.
>
> On Wed, May 29, 2019 at 12:48 PM Juan Telleria Ruiz de Aguirre <jtelleria.rproject at gmail.com> wrote:
>
>> Dear R Developers,
>>
>> There is an interesting issue related to the "reticulate" R package that
>> discusses how to convert Python's non-32-bit integers to R, and it has had
>> quite an exhaustive discussion:
>>
>> https://github.com/rstudio/reticulate/issues/323
>>
>> Python seems to handle integers differently from R, depending on the
>> system architecture: on 32-bit systems it uses 32-bit integers, and on
>> 64-bit systems it uses 64-bit integers.
>>
>> So my question is:
>>
>> As regards R's C interface, how costly would it be to convert INTSXP from
>> 32 bits to 64 bits using C, on 64-bit systems? Do the benefits outweigh
>> the costs? And should such development be handled within R Core /
>> Ordinary Members, or should it be left to package maintainers?
>
> Well, I am not an R-core member, but I can mention a few things:
>
> 1. This seems like it would make the results of R code non-reproducible
>    between 32-bit and 64-bit versions of R; at least some code would give
>    different results (at the very least in terms of when integer values
>    overflow to NA, which is documented behavior).
> 2. Obviously all integer data would take twice as much memory, memory
>    bandwidth, space in caches, etc., even when it doesn't need it.
> 3. Various places treat data / data pointers coming out of INTSXP and
>    LGLSXP objects the same within the internal R sources (as currently
>    they're both int/int*). Catching and fixing all those wouldn't be
>    impossible, but it would take at least some doing.
>
> For me personally, 1 seems like a big problem, and 3 makes the conversion
> more work than it might have seemed initially.
>
> As a related side note, as far as I understand what I've heard from R-core
> members directly, the choice to not have multiple types of integers is
> intentional and unlikely to change.
>
> Best,
> ~G
>
>> Thank you! :)
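
To make the points about Python 3 integers, overflow to NA, and doubled memory concrete, here is a minimal Python sketch. It is only an illustration: the helper `add_like_r_integer` and the constants `R_INT_MIN` / `R_INT_MAX` are hypothetical names, not part of R or reticulate, and `None` merely stands in for R's NA.

```python
import struct

# R stores integers as 32-bit signed values and reserves the smallest
# 32-bit value for NA_integer_, so the usable range is symmetric.
R_INT_MIN = -(2**31) + 1
R_INT_MAX = 2**31 - 1


def add_like_r_integer(a, b):
    """Hypothetical helper: add two ints, returning None (a stand-in for
    NA) when the result leaves the 32-bit range R integers can hold."""
    result = a + b  # Python 3 ints never overflow; they grow as needed
    if R_INT_MIN <= result <= R_INT_MAX:
        return result
    return None


if __name__ == "__main__":
    big = 2**100  # far beyond 64 bits, still an ordinary Python 3 int
    print(big.bit_length())
    print(add_like_r_integer(R_INT_MAX, 1))
    # Standard sizes in bytes of a 32-bit and a 64-bit integer element.
    print(struct.calcsize("<i"), struct.calcsize("<q"))
```

The last line shows the 4-byte versus 8-byte element sizes behind point 2: widening every integer element doubles the storage for integer data.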
Martin Maechler
2019-Jun-01 16:29 UTC
[Rd] Converting non-32-bit integers from python to R to use bit64: reticulate
>>>>> Juan Telleria Ruiz de Aguirre
>>>>>     on Thu, 30 May 2019 18:46:29 +0200 writes:

    > Thank you Gabriel for valuable insights on the 64-bit integers topic.
    > In addition, my statement was wrong, as Python3 seems to have unlimited
    > (and variable) size integers.

....

If you are interested in using unlimited-size integers, you could use the CRAN R package 'gmp', which builds on the GMP (GNU MP, GNU Multiple Precision) C library:

https://cran.r-project.org/package=gmp

(For arbitrary-precision "floats", see the CRAN package 'Rmpfr', built on package gmp and on both GNU C libraries, GMP and MPFR: https://cran.r-project.org/package=Rmpfr)

    > Division between Int-32 and Int-64 seems to only happen in Python2.

    > Best,
    > Juan
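
As a rough illustration of what unlimited-size integers and arbitrary-precision "floats" provide, here is a small Python sketch. It uses only Python's standard library as a stand-in and does not show the API of 'gmp' or 'Rmpfr'; note also that the `decimal` module works in decimal rather than binary floating point, so it is only an analogy to MPFR.

```python
from decimal import Decimal, localcontext
from math import factorial

# Unlimited-size integers: 25! no longer fits in 64 bits, yet it is exact.
f25 = factorial(25)
print(f25 > 2**64)

# Arbitrary-precision floats (decimal stand-in): sqrt(2) to 50 digits.
with localcontext() as ctx:
    ctx.prec = 50  # number of significant digits for this block only
    root2 = Decimal(2).sqrt()

print(root2)
```

The precision is chosen per computation here, which mirrors the idea that such libraries trade fixed-width hardware arithmetic for a user-selected number of digits.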
Juan Telleria Ruiz de Aguirre
2019-Jun-03 04:50 UTC
[Rd] Converting non-32-bit integers from python to R to use bit64: reticulate
Thank you, Martin, for pointing out, and for developing, the 'Rmpfr' package, which together with 'gmp' offers unlimited-size integers (GNU GMP) and arbitrary-precision floats (GNU MPFR):

https://cran.r-project.org/package=Rmpfr

My question is: in the long term (for R 3.7.0 or R 3.8.0), would it make sense for GMP to replace the INTSXP code and for MPFR to replace the REALSXP code?

With this, an integer would always remain an integer, and a double-precision numeric would always remain a double-precision numeric, without occasional casting underneath.

And would the R community / R Ordinary Members be willing to help R Core with such an implementation (if it makes sense and R Core wants to adopt it)?

Thank you all! :)

> If you are interested in using unlimited size integers, you
> could use the CRAN R package 'gmp' which builds on the GMP = GNU
> MP = GNU Multi Precision C library.
>
> https://cran.r-project.org/package=gmp
>
> (and for arbitrary precision "floats", see CRAN pkg 'Rmpfr'
> built on package gmp, and both the GNU C libraries GMP and
> MPFR:
> https://cran.r-project.org/package=Rmpfr
> )
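
The concern about "casting underneath" can be sketched in Python, assuming a large integer ends up stored as a double-precision float. The function name `survives_double_roundtrip` is hypothetical and only serves this illustration.

```python
def survives_double_roundtrip(value: int) -> bool:
    """Hypothetical check: is the integer unchanged after being stored
    as a double-precision float and converted back?"""
    try:
        return int(float(value)) == value
    except OverflowError:
        # The integer is too large to be represented as a double at all.
        return False


if __name__ == "__main__":
    # A double has a 53-bit significand, so 2**53 is still exact.
    print(survives_double_roundtrip(2**53))
    # One past 2**53 is rounded to a neighboring representable value.
    print(survives_double_roundtrip(2**53 + 1))
    # The largest 64-bit signed integer is not exactly representable.
    print(survives_double_roundtrip(2**63 - 1))
```

Integers up to 2^53 in magnitude pass the round trip, which is why silent conversion to double is harmless for small values and lossy for large 64-bit ones.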