Wang Jiefei
2019-Sep-03 18:49 UTC
[Rd] [ALTREP] What is the meaning of the return value of Is_sorted and No_NA function?
Hi,

I would like to figure out the meaning of the return value of these two functions. Here are the default definitions I found in the R source code:

    static int altreal_Is_sorted_default(SEXP x) { return UNKNOWN_SORTEDNESS; }
    static int altreal_No_NA_default(SEXP x) { return 0; }

I guess the macro UNKNOWN_SORTEDNESS in Is_sorted and the 0 in No_NA simply mean that the sorted/NA status of the vector is unknown, so R will loop over the vector to find the answer. However, what should we return from these functions to indicate whether the vector is sorted or contains NA? My initial guess is 0/1, but since No_NA uses 0 as its default value, that would be ambiguous. Are there any macros that define yes/no return values for these functions?

I would appreciate any thoughts here.

Best,
Jiefei
Gabriel Becker
2019-Sep-11 17:57 UTC
[Rd] [ALTREP] What is the meaning of the return value of Is_sorted and No_NA function?
Hi Jiefei,

The meanings of the return values for sortedness can be found in Rinternals.h, and are as follows:

    /* ALTREP sorting support */
    enum {SORTED_DECR_NA_1ST = -2,
          SORTED_DECR = -1,
          UNKNOWN_SORTEDNESS = INT_MIN, /*INT_MIN is NA_INTEGER! */
          SORTED_INCR = 1,
          SORTED_INCR_NA_1ST = 2,
          KNOWN_UNSORTED = 0};

The default value there is NA_INTEGER (i.e. INT_MIN), indicating that there is no sortedness information.

Currently, the *_No_NA methods effectively return a boolean (even though the actual return type is int). This can be seen in the method we provide for compact sequences in altclasses.c:

    static int compact_intseq_No_NA(SEXP x)
    {
    #ifdef COMPACT_INTSEQ_MUTABLE
        /* If the vector has been expanded it may have been modified. */
        if (COMPACT_SEQ_EXPANDED(x) != R_NilValue)
            return FALSE;
    #endif
        return TRUE;
    }

(FALSE is a macro for 0, TRUE is a macro for 1.)

Think of the return value of a No_NA method as the object's answer to the following question:

"Are you sure there are zero NAs in your data?"

When it is sure of that, it says "yes" (returning 1, i.e. TRUE). When it either is sure there are NAs *OR* doesn't have any information about whether there are NAs, it says "no" (returning 0, i.e. FALSE).

Also please note that there may be another API point in the future which asks the object how many NAs it has. If that materializes, No_NA would just consume the answer to that to get the binarized version, but again there is nothing like that in there now.

Hope that helps.

Best,
~G
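The semantics described in this reply can be illustrated with a small standalone C sketch. It does not use the R API: the enum values are copied from the reply above, while `toy_meta`, `toy_Is_sorted`, `toy_No_NA`, `toy_needs_na_scan` and `toy_known_sorted` are hypothetical names invented for illustration. The two consumer functions are only an assumption about how a caller might interpret the answers, not a description of what R does internally.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Sortedness codes, copied from the enum quoted in the reply above. */
enum {
    SORTED_DECR_NA_1ST = -2,
    SORTED_DECR        = -1,
    UNKNOWN_SORTEDNESS = INT_MIN,  /* same value as NA_INTEGER */
    SORTED_INCR        = 1,
    SORTED_INCR_NA_1ST = 2,
    KNOWN_UNSORTED     = 0
};

/* Hypothetical stand-in for the metadata an ALTREP class might cache. */
typedef struct {
    int sortedness;   /* one of the enum values above */
    int known_no_na;  /* nonzero only when the class is certain there are no NAs */
} toy_meta;

/* Models an Is_sorted method: report exactly what is known. */
static int toy_Is_sorted(const toy_meta *m)
{
    assert(m != NULL);
    return m->sortedness;
}

/* Models a No_NA method: 1 means "sure there are no NAs";
 * 0 covers both "there are NAs" and "no information". */
static int toy_No_NA(const toy_meta *m)
{
    assert(m != NULL);
    return m->known_no_na ? 1 : 0;
}

/* Hypothetical consumer: a 0 from No_NA forces a full scan,
 * because 0 does not distinguish "has NAs" from "unknown". */
static int toy_needs_na_scan(const toy_meta *m)
{
    return !toy_No_NA(m);
}

/* Hypothetical consumer: true only when the object reports a definite
 * sorted order (any of the four SORTED_* codes). */
static int toy_known_sorted(const toy_meta *m)
{
    int s = toy_Is_sorted(m);
    return s != UNKNOWN_SORTEDNESS && s != KNOWN_UNSORTED;
}
```

Note the asymmetry: Is_sorted can distinguish "known unsorted" (0) from "unknown" (INT_MIN), whereas No_NA folds "has NAs" and "unknown" into the same 0.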
Wang Jiefei
2019-Sep-11 18:49 UTC
[Rd] [ALTREP] What is the meaning of the return value of Is_sorted and No_NA function?
Hi Gabriel,

Thanks for your answer and for the future update plan. Somehow this email was delayed for a week, so there might be a weird reply from me saying that I have found the answer in the R source code; I sent it last week. Hopefully this reply will not take another week to post :)

As a side note, I like the idea of defining a macro for sortedness, and I can see why we can only have a binary answer for No_NA (since the return value is effectively a boolean). To make the code more readable, and to keep it working with future R releases, would it be possible to define a macro for the No_NA function in Rinternals.h? Then, if the No_NA function ever changes, there would be no need to modify the code.

Best,
Jiefei
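Until R provides such names, a package could define its own. The sketch below is a standalone illustration, not R API: `MY_NO_NA_SURE`, `MY_NO_NA_NOT_SURE`, `toy_seq` and `toy_seq_No_NA` are all hypothetical, and the `expanded` field merely imitates the "may have been modified" check in the compact-sequence method quoted earlier in the thread.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, package-local names; R itself does not define these. */
#define MY_NO_NA_SURE     1  /* certain the data holds no NA */
#define MY_NO_NA_NOT_SURE 0  /* NAs are present, or nothing is known */

/* Hypothetical stand-in for a compact sequence object. */
typedef struct {
    int expanded;  /* nonzero once the data has been materialized */
} toy_seq;

/* Models a No_NA method written with the named constants above. */
static int toy_seq_No_NA(const toy_seq *x)
{
    assert(x != NULL);
    /* Once expanded, the data may have been modified, so be conservative. */
    if (x->expanded)
        return MY_NO_NA_NOT_SURE;
    return MY_NO_NA_SURE;
}
```

Keeping the two constants in one package header means that a future change to the No_NA contract would require touching only those definitions, which is the benefit asked about above.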