Is there a missing value constant defined for R_xlen_t, cf. NA_INTEGER (== R_NaInt == INT_MIN) for int(eger)? If not, is it correct to assume that missing values should be taken care of/tested for before coercing from int or double?

Thank you,
Henrik
On Sep 20, 2015, at 3:06 PM, Henrik Bengtsson <henrik.bengtsson at ucsf.edu> wrote:

> Is there a missing value constant defined for R_xlen_t, cf. NA_INTEGER
> (== R_NaInt == INT_MIN) for int(eger)? If not, is it correct to
> assume that missing values should be taken care/tested for before
> coercing from int or double?

R_xlen_t is the type of the vector length (see XLENGTH()) and as such never holds a missing value (since there is no such thing as a missing length). It is *not* a native type for R vectors and therefore there is no official representation of NAs in R_xlen_t.

Although native R vectors can be used as indices, the way it typically works is that the code first checks for NAs in the R vector and only then converts to R_xlen_t, so the NA value is never stored in R_xlen_t even for indexing.

--- cut here, content below is less relevant ---

That said, when converting packages from "legacy" .Call code (written before long vector support) that used asInteger() to convert an index, I tend to use this utility for convenience:

    static R_INLINE R_xlen_t asLength(SEXP x, R_xlen_t NA) {
        double d;
        /* integer input: read the first element and map NA_INTEGER to NA */
        if (TYPEOF(x) == INTSXP && LENGTH(x) > 0) {
            int res = INTEGER(x)[0];
            return (res == NA_INTEGER) ? NA : ((R_xlen_t) res);
        }
        /* anything else: coerce to double; non-finite values map to NA */
        d = asReal(x);
        return (R_finite(d)) ? ((R_xlen_t) d) : NA;
    }

Note that this explicitly allows the caller to specify the NA representation, since it depends on the use: often it's simply 0, other times -1 will do, since typically anything negative is equally bad. As noted above, this is not what R itself does, so it's more of a convenience to simplify conversion of legacy code.

Cheers,
Simon
On Mon, Sep 21, 2015 at 11:20 AM, Simon Urbanek <simon.urbanek at r-project.org> wrote:

> R_xlen_t is the type of the vector length (see XLENGTH()) and as such
> never holds a missing value (since there is no such thing as a missing
> length). It is *not* a native type for R vectors and therefore there is
> no official representation of NAs in R_xlen_t.

Thank you Simon, all this helped clarify it for me. It's in line with what I suspected, but it is really useful to hear it from the "officials".

Cheers,
Henrik
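
The "check for NA first, convert second" pattern that Simon describes can be sketched in plain C roughly as follows. This is only an illustration, not code from R or from the thread: `xlen_t` and `NA_INT` are stand-ins for `R_xlen_t` and `NA_INTEGER` so that the snippet compiles without R's headers, the names `index_from_int()` and `index_from_double()` are hypothetical, and the range check on doubles is an extra guard that the thread does not mention.

```c
#include <limits.h>   /* INT_MIN */
#include <math.h>     /* isfinite */
#include <stddef.h>   /* ptrdiff_t */
#include <stdint.h>   /* PTRDIFF_MAX, PTRDIFF_MIN */

/* Stand-ins (assumptions) so the sketch builds without <Rinternals.h>. */
typedef ptrdiff_t xlen_t;   /* plays the role of R_xlen_t   */
#define NA_INT INT_MIN      /* plays the role of NA_INTEGER */

/* Convert an integer index. Returns 1 on success, 0 if the input is NA.
 * The NA test happens before the cast, so *out never holds an NA code. */
int index_from_int(int v, xlen_t *out)
{
    if (v == NA_INT)
        return 0;
    *out = (xlen_t) v;
    return 1;
}

/* Convert a double index. NaN (which includes R's NA_real_) and +/-Inf
 * are non-finite and rejected; values outside the range of xlen_t are
 * rejected as well so that the cast is always well defined. */
int index_from_double(double v, xlen_t *out)
{
    if (!isfinite(v))
        return 0;
    if (v >= (double) PTRDIFF_MAX || v < (double) PTRDIFF_MIN)
        return 0;
    *out = (xlen_t) v;      /* truncates toward zero */
    return 1;
}
```

Because missingness is reported through the return value, the caller decides what to do with an NA (raise an error, skip the element, or substitute 0 or -1 as in asLength() above), and no value of the length type has to be reserved as a missing-value marker.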