Juan Telleria Ruiz de Aguirre
2019-Jun-03 04:50 UTC
[Rd] Converting non-32-bit integers from python to R to use bit64: reticulate
Thank you, Martin, for pointing out and developing the 'Rmpfr' library for unlimited size integers (GNU C GMP) and arbitrary precision floats (GNU C MPFR):

https://cran.r-project.org/package=Rmpfr

My question is: in the long term (for R 3.7.0 or R 3.8.0):

Does it have sense that GMP substitutes the INTSXP code, and MPFR substitutes the REALSXP code? With this we would achieve that an integer is always an integer, and a numeric double precision float always a numeric double precision float, without sometimes casting underneath.

And would the R Community / R Ordinary Members be willing to help R Core with such an implementation (if it has sense, and is to be adopted)?

Thank you all! :)

> If you are interested in using unlimited size integers, you
> could use the CRAN R package 'gmp' which builds on the GMP = GNU
> MP = GNU Multi Precision C library.
>
> https://cran.r-project.org/package=gmp
>
> (and for arbitrary precision "floats", see CRAN pkg 'Rmpfr'
> built on package gmp, and both the GNU C libraries GMP and
> MPFR:
> https://cran.r-project.org/package=Rmpfr
> )
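For readers who want to try the two packages named in the quoted advice, here is a minimal sketch. It assumes the CRAN packages 'gmp' and 'Rmpfr' are installed and does nothing otherwise. The variable names are illustrative only, and the calls shown (`as.bigz()`, `mpfr()` with `precBits`) are a small slice of what the packages offer.

```r
# Assumption: 'gmp' and 'Rmpfr' are installed from CRAN; the block is skipped if not.
have_pkgs <- requireNamespace("gmp", quietly = TRUE) &&
  requireNamespace("Rmpfr", quietly = TRUE)

if (have_pkgs) {
  # Exact integer arithmetic far beyond the 32-bit range of a base integer.
  big <- gmp::as.bigz(2)^100
  print(big)
  print(class(big))      # a separate class, not base "integer"

  # A float with a 200-bit mantissa instead of the 53 bits of a double.
  third <- Rmpfr::mpfr(1, precBits = 200) / 3
  print(third)
  print(class(third))    # again a separate class, not base "numeric"
}
```

Both results live in their own classes layered on top of R, which is the arrangement the quoted advice describes, rather than a change to the built-in integer and double types.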
Martin Maechler
2019-Jun-04 16:08 UTC
[Rd] Converting non-32-bit integers from python to R to use bit64: reticulate
>>>>> Juan Telleria Ruiz de Aguirre
>>>>>     on Mon, 3 Jun 2019 06:50:17 +0200 writes:

> Thank you, Martin, for pointing out and developing the 'Rmpfr' library for
> unlimited size integers (GNU C GMP) and arbitrary precision floats (GNU C
> MPFR):

> https://cran.r-project.org/package=Rmpfr

> My question is: in the long term (for R 3.7.0 or R 3.8.0):

> Does it have sense that GMP substitutes the INTSXP code, and MPFR substitutes
> the REALSXP code? With this we would achieve that an integer is always an
> integer, and a numeric double precision float always a numeric double
> precision float, without sometimes casting underneath.

> And would the R Community / R Ordinary Members be willing to help R
> Core with such an implementation (if it has sense, and is to be adopted)?

No, such a change has "no sense" and hence won't be adopted (in this form):

- INTSXP and REALSXP are part of the C API of R, and are well defined. Changing them will almost surely break 100s and, by dependencies, probably 1000s of existing R packages.

- I'm sure Python and other systems do have fixed size "double precision" vectors, because that's how you interface with all pre-existing computational libraries, and I am almost sure that support of arbitrarily long integers (or doubles) is via another class/type.

- I know that Julia has adopted these (GMP and MPFR I think) types and nicely interfaces them on a relatively "base" level. With their nice class hierarchy (and very nice "S4 like" multi-argument method dispatch for *all* functions) it can look quite seamless for the user to work with these extended classes, but they are not all identical to the basic "real"/"double" or "integer" classes.

- I'm not the expert here (but there are not so many experts ..), but I'm pretty sure that adding new "basic types" at the underlying C level is not at all easy for R. It would mean a big break in backward compatibility -- which is conceivable -- and *may* also need a big rewrite of much of the R code base, which seems less conceivable in the mid term (2-3 years; long term: > 5 years).

> Thank you all! :)

You are welcome.

I think we should close this thread here, unless some real experts join.

Martin
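As a small illustration of the fixed-size base types discussed above, the following base R sketch shows where an integer stays an integer and where R silently converts to double. It uses no packages, and the variable names are illustrative only.

```r
# Base R only; no packages needed.
small <- 5L
typeof(small)            # "integer": a 32-bit INTSXP at the C level
typeof(5)                # "double":  a 64-bit REALSXP

# The largest value an integer vector can hold.
.Machine$integer.max     # 2147483647

# Integer overflow does not widen the type; it yields NA (with a warning,
# suppressed here to keep the output quiet).
overflowed <- suppressWarnings(.Machine$integer.max + 1L)
is.na(overflowed)        # TRUE
typeof(overflowed)       # "integer"

# Some operations silently promote integer to double ...
typeof(small / 2L)       # "double"
typeof(small * 2)        # "double"

# ... while others stay integer.
typeof(small %/% 2L)     # "integer"
typeof(small + 1L)       # "integer"
```

Package code written against the C API relies on exactly this layout, which is why replacing the storage behind these two types would ripple through so many existing packages.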
Kevin Ushey
2019-Jun-04 17:14 UTC
[Rd] Converting non-32-bit integers from python to R to use bit64: reticulate
I think a more productive conversation could be: what additions to R would allow for user-defined types / classes that behave just like the built-in vector types? As a motivating example, one cannot currently use the 64bit integer objects from bit64 to subset data frames:

    > library(bit64); mtcars[as.integer64(1:3), ]
     [1] mpg  cyl  disp hp   drat wt   qsec vs   am   gear carb
    <0 rows> (or 0-length row.names)

I think ALTREP presents a possibility here, in that we could have a 64bit integer ALTREP object that behaves either like an INTSXP or REALSXP as necessary. But I'm not sure how we would handle large 64bit integer values which won't fit in either an INTSXP or REALSXP (in the REALSXP case, precision could be lost for values > 2^53).

One possibility would be to allow ALTREP objects to have a chance at managing dispatch in some methods, so that (for example) in data[<ALTREP>], the ALTREP object has the opportunity to choose how the data object should be subsetted. Of course, this implies wiring through yet another dispatch mechanism through a category of primitive / internal functions, which could be expensive in terms of implementation / maintenance... and I'm not sure if this could play well with the existing S3 / S4 dispatch mechanisms.

FWIW, I think most commonly 64bit integers arise as e.g. database keys / IDs, and are typically just used for subsetting / reordering of data as opposed to math. In these cases, converting the 64bit integers to a character vector is typically a viable workaround, although it's much slower.

Still, at least to me, it seems like there is likely a path forward with ALTREP for 64bit integer vectors that can behave (more or less) just like builtin R vectors.
Best,
Kevin
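Two points from the message above can be sketched in base R: the loss of precision above 2^53 when 64bit values are held in doubles, and the character-vector workaround for 64bit keys. The data frame, its column names, and the key values below are made up for illustration.

```r
# Base R only. Above 2^53 a double can no longer tell neighbouring
# integers apart, so two distinct 64bit keys could collapse into one value.
2^53 + 1 == 2^53                      # TRUE

# Workaround: keep the keys as character strings (hypothetical example data).
records <- data.frame(
  id    = c("9007199254740993", "9007199254740992", "17"),
  value = c("a", "b", "c"),
  stringsAsFactors = FALSE
)

# As strings the two large keys remain distinct.
records$id[1] != records$id[2]        # TRUE

# Subset / reorder by key with match(), which compares the strings exactly.
wanted <- c("17", "9007199254740993")
picked <- records[match(wanted, records$id), ]
print(picked)
```

The lookup is exact but works on strings rather than numbers, which is where the slowdown mentioned above comes from; arithmetic on such keys is not possible without converting them to some other type.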