Michael Chirico
2025-Apr-23 17:03 UTC
[Rd] R should add an API routine for safe use of memcpy(), memset() for use with 0-length SEXP
h/t Tim Taylor for pointing out my blindspot :) We have Memcpy() in API already [1], which wraps a 0-aware R_chk_memcpy() [2]. We don't quite have Memset() in API, though; instead we have Memzero() [3] for R_chk_memset(s, 0, n) which is 0-aware memset() [4]. [1] https://github.com/r-devel/r-svn/blob/9e597ca8132e8b6298b0bedb39fa8deba5c819df/src/include/R_ext/RS.h#L59-L60 [2] https://github.com/r-devel/r-svn/blob/9e597ca8132e8b6298b0bedb39fa8deba5c819df/src/main/memory.c#L3575-L3580 [3] https://github.com/r-devel/r-svn/blob/9e597ca8132e8b6298b0bedb39fa8deba5c819df/src/include/R_ext/RS.h#L62-L63 [4] https://github.com/r-devel/r-svn/blob/9e597ca8132e8b6298b0bedb39fa8deba5c819df/src/main/memory.c#L3582-L3587 Mike C On Wed, Apr 23, 2025 at 8:57?AM Michael Chirico <michaelchirico4 at gmail.com> wrote:> > From R 4.5.0 [1], all builds of R discourage use of INTEGER() [and > friends REAL(), ... and *_RO() equivalents] on length-0 SEXP [2]. > Before R 4.5.0, this was the behavior under --enable-strict-barrier. > > That means the following can segfault under strict builds (e.g. > -fsanitize=alignment and -O0): > > SEXP x = PROTECT(Rf_allocVector(INTSXP, 0)); > SEXP y = PROTECT(Rf_allocVector(INTSXP, 0)); > const int *x = INTEGER_RO(x); // invalid! > int *y = INTEGER(y); > memcpy(y, x, 0); // alluring, but undefined behavior! > > There are a number of CRAN packages that fall victim to this, see e.g. > this PR and others linked to it [3]. I'm sure there are dozens if not > hundreds of other equivalent bugs waiting to be discovered that just > aren't covered by existing tests. > > {rlang} took the approach to define r_memcpy() and r_memset() which > wrap memcpy() and memset(), resp., with an added length-0 check [4]; I > think R itself should offer these (probably more consistently styled > as R_Memcpy() and R_Memset()). > > (NB there's a possibility I'm still not fully grasping what's going on here :) ) > > Mike C > > [1] related: https://stat.ethz.ch/pipermail/r-devel/2024-June/083456.html > [2] https://github.com/r-devel/r-svn/blob/2b29e52e1c4e3d26b649cb7ac320b8a3dd13de30/src/main/memory.c#L4146 > [3] https://github.com/r-lib/vctrs/pull/1968 > [4] https://github.com/r-lib/rlang/pull/1797
Tomas Kalibera
2025-Apr-23 20:38 UTC
[Rd] R should add an API routine for safe use of memcpy(), memset() for use with 0-length SEXP
On 4/23/25 19:03, Michael Chirico wrote:> h/t Tim Taylor for pointing out my blindspot :) > > We have Memcpy() in API already [1], which wraps a 0-aware R_chk_memcpy() [2]. > > We don't quite have Memset() in API, though; instead we have Memzero() > [3] for R_chk_memset(s, 0, n) which is 0-aware memset() [4].I don't think that using wrappers for this is the right approach. Even loading an invalid pointer is undefined behavior. While it wouldn't matter on typical hardware where R is used today, it could cause a crash on some hardware, and it may well be that various checkers will start warning about such things. Also, holding (intentionally) invalid pointers makes debugging harder. At least in new code, I would avoid holding invalid pointers. Certainly I don't think we should be adding wrappers to R for C functions to make them work with invalid pointers. Best Tomas> > [1] https://github.com/r-devel/r-svn/blob/9e597ca8132e8b6298b0bedb39fa8deba5c819df/src/include/R_ext/RS.h#L59-L60 > [2] https://github.com/r-devel/r-svn/blob/9e597ca8132e8b6298b0bedb39fa8deba5c819df/src/main/memory.c#L3575-L3580 > [3] https://github.com/r-devel/r-svn/blob/9e597ca8132e8b6298b0bedb39fa8deba5c819df/src/include/R_ext/RS.h#L62-L63 > [4] https://github.com/r-devel/r-svn/blob/9e597ca8132e8b6298b0bedb39fa8deba5c819df/src/main/memory.c#L3582-L3587 > > Mike C > > On Wed, Apr 23, 2025 at 8:57?AM Michael Chirico > <michaelchirico4 at gmail.com> wrote: >> From R 4.5.0 [1], all builds of R discourage use of INTEGER() [and >> friends REAL(), ... and *_RO() equivalents] on length-0 SEXP [2]. >> Before R 4.5.0, this was the behavior under --enable-strict-barrier. >> >> That means the following can segfault under strict builds (e.g. >> -fsanitize=alignment and -O0): >> >> SEXP x = PROTECT(Rf_allocVector(INTSXP, 0)); >> SEXP y = PROTECT(Rf_allocVector(INTSXP, 0)); >> const int *x = INTEGER_RO(x); // invalid! >> int *y = INTEGER(y); >> memcpy(y, x, 0); // alluring, but undefined behavior! >> >> There are a number of CRAN packages that fall victim to this, see e.g. >> this PR and others linked to it [3]. I'm sure there are dozens if not >> hundreds of other equivalent bugs waiting to be discovered that just >> aren't covered by existing tests. >> >> {rlang} took the approach to define r_memcpy() and r_memset() which >> wrap memcpy() and memset(), resp., with an added length-0 check [4]; I >> think R itself should offer these (probably more consistently styled >> as R_Memcpy() and R_Memset()). >> >> (NB there's a possibility I'm still not fully grasping what's going on here :) ) >> >> Mike C >> >> [1] related: https://stat.ethz.ch/pipermail/r-devel/2024-June/083456.html >> [2] https://github.com/r-devel/r-svn/blob/2b29e52e1c4e3d26b649cb7ac320b8a3dd13de30/src/main/memory.c#L4146 >> [3] https://github.com/r-lib/vctrs/pull/1968 >> [4] https://github.com/r-lib/rlang/pull/1797 > ______________________________________________ > R-devel at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-devel