Oliver, for clarification that section in writing R extensions mentions VECTOR_ELT and REAL but not REAL_ELT nor any other *_ELT functions. I was looking for an explanation of all the *_ELT functions (which are apparently new), not just VECTOR_ELT. Thanks Simon that response was very helpful. One more question: are there any circumstances in which one should use REAL_ELT(x,i) rather than REAL(x)[i] or vice versa? Or can they be used interchangeably? On Wed, Jun 16, 2021 at 4:29 PM Simon Urbanek <simon.urbanek at r-project.org> wrote:> The usual quote applies: "use the source, Luke": > > $ grep _ELT *.h | sort > Rdefines.h:#define SET_ELEMENT(x, i, val) SET_VECTOR_ELT(x, i, val) > Rinternals.h: The function STRING_ELT is used as an argument to > arrayAssign even > Rinternals.h:#define VECTOR_ELT(x,i) ((SEXP *) DATAPTR(x))[i] > Rinternals.h://SEXP (STRING_ELT)(SEXP x, R_xlen_t i); > Rinternals.h:Rbyte (RAW_ELT)(SEXP x, R_xlen_t i); > Rinternals.h:Rbyte ALTRAW_ELT(SEXP x, R_xlen_t i); > Rinternals.h:Rcomplex (COMPLEX_ELT)(SEXP x, R_xlen_t i); > Rinternals.h:Rcomplex ALTCOMPLEX_ELT(SEXP x, R_xlen_t i); > Rinternals.h:SEXP (STRING_ELT)(SEXP x, R_xlen_t i); > Rinternals.h:SEXP (VECTOR_ELT)(SEXP x, R_xlen_t i); > Rinternals.h:SEXP ALTSTRING_ELT(SEXP, R_xlen_t); > Rinternals.h:SEXP SET_VECTOR_ELT(SEXP x, R_xlen_t i, SEXP v); > Rinternals.h:double (REAL_ELT)(SEXP x, R_xlen_t i); > Rinternals.h:double ALTREAL_ELT(SEXP x, R_xlen_t i); > Rinternals.h:int (INTEGER_ELT)(SEXP x, R_xlen_t i); > Rinternals.h:int (LOGICAL_ELT)(SEXP x, R_xlen_t i); > Rinternals.h:int ALTINTEGER_ELT(SEXP x, R_xlen_t i); > Rinternals.h:int ALTLOGICAL_ELT(SEXP x, R_xlen_t i); > Rinternals.h:void ALTCOMPLEX_SET_ELT(SEXP x, R_xlen_t i, Rcomplex v); > Rinternals.h:void ALTINTEGER_SET_ELT(SEXP x, R_xlen_t i, int v); > Rinternals.h:void ALTLOGICAL_SET_ELT(SEXP x, R_xlen_t i, int v); > Rinternals.h:void ALTRAW_SET_ELT(SEXP x, R_xlen_t i, Rbyte v); > Rinternals.h:void ALTREAL_SET_ELT(SEXP x, R_xlen_t i, double v); > Rinternals.h:void ALTSTRING_SET_ELT(SEXP, R_xlen_t, SEXP); > Rinternals.h:void SET_INTEGER_ELT(SEXP x, R_xlen_t i, int v); > Rinternals.h:void SET_LOGICAL_ELT(SEXP x, R_xlen_t i, int v); > Rinternals.h:void SET_REAL_ELT(SEXP x, R_xlen_t i, double v); > Rinternals.h:void SET_STRING_ELT(SEXP x, R_xlen_t i, SEXP v); > > So the indexing is with R_xlen_t and they return the value itself as one > would expect. > > Cheers, > Simon > > > > On Jun 17, 2021, at 2:22 AM, Toby Hocking <tdhock5 at gmail.com> wrote: > > > > By the way, where is the documentation for INTEGER_ELT, REAL_ELT, etc? I > > looked in Writing R Extensions and R Internals but I did not see any > > mention. > > REAL_ELT is briefly mentioned on > > https://svn.r-project.org/R/branches/ALTREP/ALTREP.html > > Would it be possible to please add some mention of them to Writing R > > Extensions? > > - how many of these _ELT functions are there? INTEGER, REAL, ... ? > > - in what version of R were they introduced? > > - I guess input types are always SEXP and int? > > - What are the output types for each? > > > > On Fri, May 28, 2021 at 5:16 PM <luke-tierney at uiowa.edu> wrote: > > > >> Since the INTEGER_ELT, REAL_ELT, etc, functions are fairly new it may > >> be possible to check that places where they are used allow for them to > >> allocate. I have fixed the one that got caught by Gabor's example, and > >> a rchk run might be able to pick up others if rchk knows these could > >> allocate. (I may also be forgetting other places where the _ELt > >> methods are used.) Fixing all call sites for REAL, INTEGER, etc, was > >> never realistic so there GC has to be suspended during the method > >> call, and that is done in the dispatch mechanism. > >> > >> The bigger problem is jumps from inside things that existing code > >> assumes will not do that. Catching those jumps is possible but > >> expensive; doing anything sensible if one is caught is really not > >> possible. > >> > >> Best, > >> > >> luke > >> > >> On Fri, 28 May 2021, Gabriel Becker wrote: > >> > >>> Hi Jim et al, > >>> Just to hopefully add a bit to what Luke already answered, from what I > am > >>> recalling looking back at that bioconductor thread Elt methods are used > >> in > >>> places where there are hard implicit assumptions that no garbage > >> collection > >>> will occur (ie they are called on things that aren't PROTECTed), and > >> beyond > >>> that, in places where there are hard assumptions that no error > (longjmp) > >>> will occur. I could be wrong, but I don't know that suspending garbage > >>> collection would protect from the second one. Ie it is possible that an > >>> error *ever* being raised from R code that implements an elt method > could > >>> cause all hell to break loose. > >>> > >>> Luke or Tomas Kalibera would know more. > >>> > >>> I was disappointed that implementing ALTREPs in R code was not in the > >> cards > >>> (it was in my original proposal back in 2016 to the DSC) but I trust > Luke > >>> that there are important reasons we can't safely allow that. > >>> > >>> Best, > >>> ~G > >>> > >>> On Fri, May 28, 2021 at 8:31 AM Jim Hester <james.f.hester at gmail.com> > >> wrote: > >>> From reading the discussion on the Bioconductor issue tracker it > >>> seems like > >>> the reason the GC is not suspended for the non-string ALTREP Elt > >>> methods is > >>> primarily due to performance concerns. > >>> > >>> If this is the case perhaps an additional flag could be added to > >>> the > >>> `R_set_altrep_*()` functions so ALTREP authors could indicate if > >>> GC should > >>> be halted when that particular method is called for that > >>> particular ALTREP > >>> class. > >>> > >>> This would avoid the performance hit (other than a boolean > >>> check) for the > >>> standard case when no allocations are expected, but allow > >>> authors to > >>> indicate that R should pause GC if needed for methods in their > >>> class. > >>> > >>> On Fri, May 28, 2021 at 9:42 AM <luke-tierney at uiowa.edu> wrote: > >>> > >>>> integer and real Elt methods are not expected to allocate. You > >>> would > >>>> have to suspend GC to be able to do that. This currently can't > >>> be done > >>>> from package code. > >>>> > >>>> Best, > >>>> > >>>> luke > >>>> > >>>> On Fri, 28 May 2021, G?bor Cs?rdi wrote: > >>>> > >>>>> I have found some weird SEXP corruption behavior with > >>> ALTREP, which > >>>>> could be a bug. (Or I could be doing something wrong.) > >>>>> > >>>>> I have an integer ALTREP vector that calls back to R from > >>> the Elt > >>>>> method. When this vector is indexed in a lapply(), its first > >>> element > >>>>> gets corrupted. Sometimes it's just a type change to > >>> logical, but > >>>>> sometimes the corruption causes a crash. > >>>>> > >>>>> I saw this on macOS from R 3.5.3 to 4.2.0. I created a small > >>> package > >>>>> that demonstrates this: > >>> https://github.com/gaborcsardi/redfish > >>>>> > >>>>> The R callback in this package calls > >>> `loadNamespace("Matrix")`, but > >>>>> the same crash happens for other packages as well, and > >>> sometimes it > >>>>> also happens if I don't load any packages at all. (But that > >>> example > >>>>> was much more complicated, so I went with the package > >>> loading.) > >>>>> > >>>>> It is somewhat random, and sometimes turning off the JIT > >>> avoids the > >>>>> crash, but not always. > >>>>> > >>>>> Hopefully I am just doing something wrong in the ALTREP code > >>> (see > >>>>> > >>> https://github.com/gaborcsardi/redfish/blob/main/src/test.c), > >>> and it > >>>>> is not actually a bug. > >>>>> > >>>>> Thanks, > >>>>> Gabor > >>>>> > >>>>> ______________________________________________ > >>>>> R-devel at r-project.org mailing list > >>>>> https://stat.ethz.ch/mailman/listinfo/r-devel > >>>>> > >>>> > >>>> -- > >>>> Luke Tierney > >>>> Ralph E. Wareham Professor of Mathematical Sciences > >>>> University of Iowa Phone: > >>> 319-335-3386 > >>>> Department of Statistics and Fax: > >>> 319-335-3017 > >>>> Actuarial Science > >>>> 241 Schaeffer Hall email: > >>> luke-tierney at uiowa.edu > >>>> Iowa City, IA 52242 WWW: > >>> http://www.stat.uiowa.edu > >>>> ______________________________________________ > >>>> R-devel at r-project.org mailing list > >>>> https://stat.ethz.ch/mailman/listinfo/r-devel > >>>> > >>> > >>> [[alternative HTML version deleted]] > >>> > >>> ______________________________________________ > >>> R-devel at r-project.org mailing list > >>> https://stat.ethz.ch/mailman/listinfo/r-devel > >>> > >>> > >>> > >> > >> -- > >> Luke Tierney > >> Ralph E. Wareham Professor of Mathematical Sciences > >> University of Iowa Phone: 319-335-3386 > >> Department of Statistics and Fax: 319-335-3017 > >> Actuarial Science > >> 241 Schaeffer Hall email: luke-tierney at uiowa.edu > >> Iowa City, IA 52242 WWW: http://www.stat.uiowa.edu > >> ______________________________________________ > >> R-devel at r-project.org mailing list > >> https://stat.ethz.ch/mailman/listinfo/r-devel > >> > > > > [[alternative HTML version deleted]] > > > > ______________________________________________ > > R-devel at r-project.org mailing list > > https://stat.ethz.ch/mailman/listinfo/r-devel > > > >[[alternative HTML version deleted]]
iuke-tier@ey m@iii@g oii uiow@@edu
2021-Jun-17 18:31 UTC
[Rd] [External] Possible ALTREP bug
On Thu, 17 Jun 2021, Toby Hocking wrote:> Oliver, for clarification that section in writing R extensions mentions > VECTOR_ELT and REAL but not REAL_ELT nor any other *_ELT functions. I was > looking for an explanation of all the *_ELT functions (which are apparently > new), not just VECTOR_ELT. > Thanks Simon that response was very helpful. > One more question: are there any circumstances in which one should use > REAL_ELT(x,i) rather than REAL(x)[i] or vice versa? Or can they be used > interchangeably?For a single call it is better to use REAL_ELT(x, i) since it doesn't force allocating a possibly large object in order to get a pointer to its data with REAL(x). If you are iterating over a whole object you may want to get data in chunks. There are iteration macros that help. Some examples are in src/main/summary.c. Best, luke> > On Wed, Jun 16, 2021 at 4:29 PM Simon Urbanek <simon.urbanek at r-project.org> > wrote: > The usual quote applies: "use the source, Luke": > > $ grep _ELT *.h | sort > Rdefines.h:#define SET_ELEMENT(x, i, val)? ? ? > ?SET_VECTOR_ELT(x, i, val) > Rinternals.h:? ?The function STRING_ELT is used as an argument > to arrayAssign even > Rinternals.h:#define VECTOR_ELT(x,i)? ? ((SEXP *) DATAPTR(x))[i] > Rinternals.h://SEXP (STRING_ELT)(SEXP x, R_xlen_t i); > Rinternals.h:Rbyte (RAW_ELT)(SEXP x, R_xlen_t i); > Rinternals.h:Rbyte ALTRAW_ELT(SEXP x, R_xlen_t i); > Rinternals.h:Rcomplex (COMPLEX_ELT)(SEXP x, R_xlen_t i); > Rinternals.h:Rcomplex ALTCOMPLEX_ELT(SEXP x, R_xlen_t i); > Rinternals.h:SEXP (STRING_ELT)(SEXP x, R_xlen_t i); > Rinternals.h:SEXP (VECTOR_ELT)(SEXP x, R_xlen_t i); > Rinternals.h:SEXP ALTSTRING_ELT(SEXP, R_xlen_t); > Rinternals.h:SEXP SET_VECTOR_ELT(SEXP x, R_xlen_t i, SEXP v); > Rinternals.h:double (REAL_ELT)(SEXP x, R_xlen_t i); > Rinternals.h:double ALTREAL_ELT(SEXP x, R_xlen_t i); > Rinternals.h:int (INTEGER_ELT)(SEXP x, R_xlen_t i); > Rinternals.h:int (LOGICAL_ELT)(SEXP x, R_xlen_t i); > Rinternals.h:int ALTINTEGER_ELT(SEXP x, R_xlen_t i); > Rinternals.h:int ALTLOGICAL_ELT(SEXP x, R_xlen_t i); > Rinternals.h:void ALTCOMPLEX_SET_ELT(SEXP x, R_xlen_t i, > Rcomplex v); > Rinternals.h:void ALTINTEGER_SET_ELT(SEXP x, R_xlen_t i, int v); > Rinternals.h:void ALTLOGICAL_SET_ELT(SEXP x, R_xlen_t i, int v); > Rinternals.h:void ALTRAW_SET_ELT(SEXP x, R_xlen_t i, Rbyte v); > Rinternals.h:void ALTREAL_SET_ELT(SEXP x, R_xlen_t i, double v); > Rinternals.h:void ALTSTRING_SET_ELT(SEXP, R_xlen_t, SEXP); > Rinternals.h:void SET_INTEGER_ELT(SEXP x, R_xlen_t i, int v); > Rinternals.h:void SET_LOGICAL_ELT(SEXP x, R_xlen_t i, int v); > Rinternals.h:void SET_REAL_ELT(SEXP x, R_xlen_t i, double v); > Rinternals.h:void SET_STRING_ELT(SEXP x, R_xlen_t i, SEXP v); > > So the indexing is with R_xlen_t and they return the value > itself as one would expect. > > Cheers, > Simon > > > > On Jun 17, 2021, at 2:22 AM, Toby Hocking <tdhock5 at gmail.com> > wrote: > > > > By the way, where is the documentation for INTEGER_ELT, > REAL_ELT, etc? I > > looked in Writing R Extensions and R Internals but I did not > see any > > mention. > > REAL_ELT is briefly mentioned on > > https://svn.r-project.org/R/branches/ALTREP/ALTREP.html > > Would it be possible to please add some mention of them to > Writing R > > Extensions? > > - how many of these _ELT functions are there? INTEGER, REAL, > ... ? > > - in what version of R were they introduced? > > - I guess input types are always SEXP and int? > > - What are the output types for each? > > > > On Fri, May 28, 2021 at 5:16 PM <luke-tierney at uiowa.edu> > wrote: > > > >> Since the INTEGER_ELT, REAL_ELT, etc, functions are fairly > new it may > >> be possible to check that places where they are used allow > for them to > >> allocate. I have fixed the one that got caught by Gabor's > example, and > >> a rchk run might be able to pick up others if rchk knows > these could > >> allocate. (I may also be forgetting other places where the > _ELt > >> methods are used.)? Fixing all call sites for REAL, INTEGER, > etc, was > >> never realistic so there GC has to be suspended during the > method > >> call, and that is done in the dispatch mechanism. > >> > >> The bigger problem is jumps from inside things that existing > code > >> assumes will not do that. Catching those jumps is possible > but > >> expensive; doing anything sensible if one is caught is really > not > >> possible. > >> > >> Best, > >> > >> luke > >> > >> On Fri, 28 May 2021, Gabriel Becker wrote: > >> > >>> Hi Jim et al, > >>> Just to hopefully add a bit to what Luke already answered, > from what I am > >>> recalling looking back at that bioconductor thread Elt > methods are used > >> in > >>> places where there are hard implicit assumptions that no > garbage > >> collection > >>> will occur (ie they are called on things that aren't > PROTECTed), and > >> beyond > >>> that, in places where there are hard assumptions that no > error (longjmp) > >>> will occur. I could be wrong, but I don't know that > suspending garbage > >>> collection would protect from the second one. Ie it is > possible that an > >>> error *ever* being raised from R code that implements an elt > method could > >>> cause all hell to break loose. > >>> > >>> Luke or Tomas Kalibera would know more. > >>> > >>> I was disappointed that implementing ALTREPs in R code was > not in the > >> cards > >>> (it was in my original proposal back in 2016 to the DSC) but > I trust Luke > >>> that there are important reasons we can't safely allow that. > >>> > >>> Best, > >>> ~G > >>> > >>> On Fri, May 28, 2021 at 8:31 AM Jim Hester > <james.f.hester at gmail.com> > >> wrote: > >>>? ? ? From reading the discussion on the Bioconductor issue > tracker it > >>>? ? ? seems like > >>>? ? ? the reason the GC is not suspended for the non-string > ALTREP Elt > >>>? ? ? methods is > >>>? ? ? primarily due to performance concerns. > >>> > >>>? ? ? If this is the case perhaps an additional flag could be > added to > >>>? ? ? the > >>>? ? ? `R_set_altrep_*()` functions so ALTREP authors could > indicate if > >>>? ? ? GC should > >>>? ? ? be halted when that particular method is called for > that > >>>? ? ? particular ALTREP > >>>? ? ? class. > >>> > >>>? ? ? This would avoid the performance hit (other than a > boolean > >>>? ? ? check) for the > >>>? ? ? standard case when no allocations are expected, but > allow > >>>? ? ? authors to > >>>? ? ? indicate that R should pause GC if needed for methods > in their > >>>? ? ? class. > >>> > >>>? ? ? On Fri, May 28, 2021 at 9:42 AM > <luke-tierney at uiowa.edu> wrote: > >>> > >>>> integer and real Elt methods are not expected to allocate. > You > >>>? ? ? would > >>>> have to suspend GC to be able to do that. This currently > can't > >>>? ? ? be done > >>>> from package code. > >>>> > >>>> Best, > >>>> > >>>> luke > >>>> > >>>> On Fri, 28 May 2021, G?bor Cs?rdi wrote: > >>>> > >>>>> I have found some weird SEXP corruption behavior with > >>>? ? ? ALTREP, which > >>>>> could be a bug. (Or I could be doing something wrong.) > >>>>> > >>>>> I have an integer ALTREP vector that calls back to R from > >>>? ? ? the Elt > >>>>> method. When this vector is indexed in a lapply(), its > first > >>>? ? ? element > >>>>> gets corrupted. Sometimes it's just a type change to > >>>? ? ? logical, but > >>>>> sometimes the corruption causes a crash. > >>>>> > >>>>> I saw this on macOS from R 3.5.3 to 4.2.0. I created a > small > >>>? ? ? package > >>>>> that demonstrates this: > >>>? ? ? https://github.com/gaborcsardi/redfish > >>>>> > >>>>> The R callback in this package calls > >>>? ? ? `loadNamespace("Matrix")`, but > >>>>> the same crash happens for other packages as well, and > >>>? ? ? sometimes it > >>>>> also happens if I don't load any packages at all. (But > that > >>>? ? ? example > >>>>> was much more complicated, so I went with the package > >>>? ? ? loading.) > >>>>> > >>>>> It is somewhat random, and sometimes turning off the JIT > >>>? ? ? avoids the > >>>>> crash, but not always. > >>>>> > >>>>> Hopefully I am just doing something wrong in the ALTREP > code > >>>? ? ? (see > >>>>> > >>>? ? ? > https://github.com/gaborcsardi/redfish/blob/main/src/test.c), > >>>? ? ? and it > >>>>> is not actually a bug. > >>>>> > >>>>> Thanks, > >>>>> Gabor > >>>>> > >>>>> ______________________________________________ > >>>>> R-devel at r-project.org mailing list > >>>>> https://stat.ethz.ch/mailman/listinfo/r-devel > >>>>> > >>>> > >>>> -- > >>>> Luke Tierney > >>>> Ralph E. Wareham Professor of Mathematical Sciences > >>>> University of Iowa? ? ? ? ? ? ? ? ? Phone: > >>>? ? ? ?319-335-3386 > >>>> Department of Statistics and? ? ? ? Fax: > >>>? ? ? ?319-335-3017 > >>>>? ? Actuarial Science > >>>> 241 Schaeffer Hall? ? ? ? ? ? ? ? ? email: > >>>? ? ? ?luke-tierney at uiowa.edu > >>>> Iowa City, IA 52242? ? ? ? ? ? ? ? ?WWW: > >>>? ? ? http://www.stat.uiowa.edu > >>>> ______________________________________________ > >>>> R-devel at r-project.org mailing list > >>>> https://stat.ethz.ch/mailman/listinfo/r-devel > >>>> > >>> > >>>? ? ? ? ? ? ? [[alternative HTML version deleted]] > >>> > >>>? ? ? ______________________________________________ > >>>? ? ? R-devel at r-project.org mailing list > >>>? ? ? https://stat.ethz.ch/mailman/listinfo/r-devel > >>> > >>> > >>> > >> > >> -- > >> Luke Tierney > >> Ralph E. Wareham Professor of Mathematical Sciences > >> University of Iowa? ? ? ? ? ? ? ? ? Phone:? ? ? ? ? ? > ?319-335-3386 > >> Department of Statistics and? ? ? ? Fax:? ? ? ? ? ? ? > ?319-335-3017 > >>? ? Actuarial Science > >> 241 Schaeffer Hall? ? ? ? ? ? ? ? ? email:? > ?luke-tierney at uiowa.edu > >> Iowa City, IA 52242? ? ? ? ? ? ? ? ?WWW:? > http://www.stat.uiowa.edu > >> ______________________________________________ > >> R-devel at r-project.org mailing list > >> https://stat.ethz.ch/mailman/listinfo/r-devel > >> > > > >? ? ? ?[[alternative HTML version deleted]] > > > > ______________________________________________ > > R-devel at r-project.org mailing list > > https://stat.ethz.ch/mailman/listinfo/r-devel > > > > >-- Luke Tierney Ralph E. Wareham Professor of Mathematical Sciences University of Iowa Phone: 319-335-3386 Department of Statistics and Fax: 319-335-3017 Actuarial Science 241 Schaeffer Hall email: luke-tierney at uiowa.edu Iowa City, IA 52242 WWW: http://www.stat.uiowa.edu