Hi,
I would like to figure out the meaning of the return values of these two
functions. Here are the default definitions I found in the R source code:
static int altreal_Is_sorted_default(SEXP x) { return UNKNOWN_SORTEDNESS; }
static int altreal_No_NA_default(SEXP x) { return 0; }
I guess the macro *UNKNOWN_SORTEDNESS* in *Is_sorted* and the 0 in *No_NA*
simply mean that the sorted/NA status of the vector is unknown, so R will
loop over the vector to find the answer. However, what should we return from
these functions to indicate whether the vector is sorted or contains NAs? My
initial guess is 0/1, but since *No_NA* uses 0 as its default value, that
would be ambiguous. Are there any macros that define yes/no return values for
these functions? I would appreciate any thoughts here.
Best,
Jiefei
[ALTREP] What is the meaning of the return value of Is_sorted and No_NA function?
4 messages · Gabriel Becker, Jiefei Wang, Martin Maechler
7 days later
Hi Jiefei,
The meanings of the return values for sortedness can be found in
Rinternals.h, and are as follows:
/* ALTREP sorting support */
enum {SORTED_DECR_NA_1ST = -2,
      SORTED_DECR = -1,
      UNKNOWN_SORTEDNESS = INT_MIN, /*INT_MIN is NA_INTEGER! */
      SORTED_INCR = 1,
      SORTED_INCR_NA_1ST = 2,
      KNOWN_UNSORTED = 0};
The default value there is NA_INTEGER (i.e., INT_MIN), indicating that there
is no sortedness information.
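As a rough illustration of how an Is_sorted method might use these codes,
here is a minimal sketch. It is not code from R: toy_seq and
toy_seq_Is_sorted are hypothetical stand-ins, and a real method would take a
SEXP and read the sequence description from the ALTREP object. The enum is
repeated only so that the sketch compiles on its own.

```c
#include <limits.h>
#include <stddef.h>

/* Sortedness codes, repeated from the enum quoted above. */
enum {
    SORTED_DECR_NA_1ST = -2,
    SORTED_DECR        = -1,
    UNKNOWN_SORTEDNESS = INT_MIN, /* same value as NA_INTEGER */
    SORTED_INCR        = 1,
    SORTED_INCR_NA_1ST = 2,
    KNOWN_UNSORTED     = 0
};

/* Hypothetical payload: the n values start, start + step, start + 2*step, ... */
typedef struct {
    int    start;
    int    step;
    size_t n;
} toy_seq;

/* Hypothetical stand-in for an Is_sorted method (a real one takes a SEXP). */
static int toy_seq_Is_sorted(const toy_seq *s)
{
    if (s->step > 0)
        return SORTED_INCR;
    if (s->step < 0)
        return SORTED_DECR;
    /* Assumption of this sketch: for a constant sequence, stay conservative
       and report that no sortedness information is available. */
    return UNKNOWN_SORTEDNESS;
}
```

Note that "no information" (UNKNOWN_SORTEDNESS) and "known to be unsorted"
(KNOWN_UNSORTED, 0) are distinct codes, so there is no ambiguity here.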
Currently, the *_No_NA methods effectively return a boolean (even though the
actual return type is int). This can be seen in the method we provide for
compact sequences in altclasses.c:
static int compact_intseq_No_NA(SEXP x)
{
#ifdef COMPACT_INTSEQ_MUTABLE
    /* If the vector has been expanded it may have been modified. */
    if (COMPACT_SEQ_EXPANDED(x) != R_NilValue)
        return FALSE;
#endif
    return TRUE;
}
(FALSE is a macro for 0, TRUE is a macro for 1).
Think of the return value of a No_NA method as the object's answer to the
following question:
"Are you sure there are zero NAs in your data?"
When it is sure of that, it says "yes" (returning 1, i.e., TRUE). When it
either is sure there are NAs *OR* doesn't have any information about
whether there are NAs, it says "no" (returning 0, i.e., FALSE).
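To make the "yes only when certain" rule concrete, here is a small sketch.
It is not code from R: toy_na_info and toy_No_NA are hypothetical names, and
a real No_NA method would take a SEXP and look up whatever the class has
recorded about its data.

```c
/* Hypothetical three-state record of what a class knows about NAs. */
typedef enum {
    TOY_NA_UNKNOWN, /* never checked, or the data may have changed since */
    TOY_NA_NONE,    /* checked: the data contain no NAs */
    TOY_NA_SOME     /* checked: the data contain at least one NA */
} toy_na_info;

/* Hypothetical stand-in for a No_NA method (a real one takes a SEXP). */
static int toy_No_NA(toy_na_info info)
{
    /* 1 only when the absence of NAs is certain; "some NAs" and
       "unknown" both collapse to 0. */
    return info == TOY_NA_NONE;
}
```

So a return value of 0 does not assert that NAs are present; it only
declines to promise that there are none.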
Also, please note that there may be another API point in the future which
asks the object *how many NAs it has*. If that materializes, No_NA would
just consume the answer to that to get the binarized version, but again,
there is nothing like that in there now.
Hope that helps.
Best,
~G
______________________________________________ R-devel at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Hi Gabriel,
Thanks for your answer and for the future update plan. Somehow this email
has been delayed for a week, so there might be a weird reply from me saying
that I have found the answer in the R source code; I sent it last week.
Hopefully, this reply will not take another week to post :)
As a side note, I like the idea of defining a macro for sortedness, and I
can see why we can only have a binary answer for No_NA (since the return
value is effectively a boolean). To make the code more readable, and to keep
it working with future R releases, would it be possible to define macros for
the No_NA return values in Rinternals.h? Then, if the No_NA function ever
changes, there would be no need to modify the code.
Best,
Jiefei
Wang Jiefei
on Wed, 11 Sep 2019 14:49:13 -0400 writes:
> Hi Gabriel,
> Thanks for your answer and future update plan. Somehow this email has been
> delayed for a week, so there might be a weird reply from me saying that I
> have found the answer from the R source code, it was sent from me last
> week. Hopefully, this reply will not cost another week to post:)
All our e-mail is, fortunately, heavily spam filtered, through
quite a few filters whose results sum up to a final spam score; when
that score is too high, the message is "diverted" to the spam
collection.
In your case, the "NiceBayes" spam filter somehow decided to give
the message quite a high score, and that score got a relatively large
weight
(maybe you should stop using all capitals such as ALTREP in your
subject!?).
We, the volunteer mailing list moderators, get e-mails (daily or weekly;
in this case daily) from the spam software giving us a full
list of the filtered messages. However, we usually lack the
time to go through that list carefully, notably for R-help or
R-devel, where the list is quite long,
so I detected your "ham" message among the many dozens of
spam ones only a day ago, and released it.
Martin Maechler
ETH Zurich