Time to revisit ifelse ?
On Fri, 11 Jul 2025 04:41:13 -0400
Mikael Jagan <jaganmn2 at gmail.com> wrote:
But perhaps we should aim for consensus on a few issues beforehand.
Thank you for raising this topic!
(Sorry if these have been discussed to death already elsewhere. In that case, links to relevant threads would be helpful ...)
The data.table::fifelse issue [1] comes to mind together with the vctrs article section about the need for a less strict ifelse() [2].
1. Should the type and class attribute of the return value be exactly the type and class attribute of c(yes[0L], no[0L]), independent of 'test'? Or something else?
Can we afford an escape hatch for cases when one of the ifelse() branches is NA or another special value handled by the '[<-' method belonging to the class of the other branch? data.table::fifelse() has a not exactly documented special case where it coerces NA_LOGICAL to the appropriate type, so that data.table::fifelse(runif(10) < .5, Sys.Date(), NA) works as intended. dplyr::if_else() also supports this case, but none of the other ifelses I tested does. Can we say that if only some of the 'yes' / 'no' / 'na' arguments have classes, those classes must match and they determine the class of the return value? It could be convenient, and it could also be a source of bugs.
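To make the trade-off concrete, here is a minimal and lightly hedged sketch of the c(yes[0L], no[0L]) rule combined with filling through '[<-'. The name ifelse_proto is hypothetical, and this is only an illustration, not a proposal for the actual implementation.

```r
# Hypothetical helper: the result prototype is c(yes[0L], no[0L]),
# and the branches are filled in via the '[<-' method of that prototype.
ifelse_proto <- function(test, yes, no) {
  test <- as.logical(test)
  n <- length(test)
  # Indexing a zero-length prototype with NA yields n missing values
  # of the prototype's type and class.
  out <- c(yes[0L], no[0L])[rep(NA_integer_, n)]
  yes <- rep(yes, length.out = n)
  no <- rep(no, length.out = n)
  pos <- which(test)
  neg <- which(!test)
  out[pos] <- yes[pos]  # dispatches on class(out), e.g. '[<-.Date'
  out[neg] <- no[neg]
  out
}

d <- as.Date("2025-07-11")
class(ifelse(c(TRUE, FALSE), d, NA))        # base ifelse() drops "Date"
class(ifelse_proto(c(TRUE, FALSE), d, NA))  # prototype keeps "Date"
```

Note that c() dispatches on its first argument, so with this rule ifelse_proto(test, NA, d) would lose the class again. That asymmetry is the kind of thing an explicit "unclassed NA" escape hatch would have to paper over.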
2. What should be the attributes of the return value (other than 'class')?
data.table::fifelse (and kit::iif, which shares a lot of the code) also preserve the names, but neither dplyr nor hutils does. I think it would be reasonable to preserve the 'dim' attribute and thus the 'dimnames' attribute too.
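As a rough sketch of that suggestion, assuming the shape attributes are taken from 'test' (which is where base ifelse() takes its attributes from), something like the following could be applied to the result as a last step. The helper name copy_shape is hypothetical.

```r
# Hypothetical helper: copy 'dim' and 'dimnames' (or else 'names')
# from 'test' onto an already computed result of the same length.
copy_shape <- function(out, test) {
  stopifnot(length(out) == length(test))
  if (!is.null(dim(test))) {
    dim(out) <- dim(test)
    dimnames(out) <- dimnames(test)
  } else {
    names(out) <- names(test)
  }
  out
}

test <- matrix(c(TRUE, FALSE, TRUE, FALSE), 2L,
               dimnames = list(c("a", "b"), c("x", "y")))
copy_shape(c(1, 2, 3, 4), test)
```

Whether the names should instead come from 'yes' and 'no' is an open question that this sketch does not try to answer.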
3. Should the new function be stricter and/or more verbose? E.g., should it signal a condition if length(yes) or length(no) is equal to neither 1 nor length(test)?
Leaning towards yes, but only because I haven't met any uses for recycling of non-length-1 inputs myself. An allow.recycle=FALSE option is probably overkill, right?
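For illustration, the strict variant could be as small as the check below. This is a sketch only: the helper name check_branch_length and the wording of the message are made up, and whether the condition should be an error or a warning is exactly what is being asked.

```r
# Hypothetical helper: accept only length 1 or length(test) for a branch.
check_branch_length <- function(x, n, arg) {
  len <- length(x)
  if (len != 1L && len != n) {
    stop(sprintf("length(%s) is %d, but must be 1 or length(test) = %d",
                 arg, len, n), call. = FALSE)
  }
  invisible(x)
}

test <- c(TRUE, FALSE, TRUE)
check_branch_length(1, length(test), "yes")      # fine: scalar
check_branch_length(1:3, length(test), "no")     # fine: same length as test
try(check_branch_length(1:2, length(test), "no"))  # signals an error
```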
4. Should the most common case, in which neither 'yes' nor 'no' has a 'class' attribute, be handled in C?
This could be a very reasonable performance-correctness trade-off.
FWIW, my first (and untested) approximation of an ifelse2 is just this:
function (test, yes, no)
I think a widely asked-for feature is a separate 'na' branch.
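A minimal sketch of what such a branch might look like, assuming a fourth argument called 'na' that defaults to NA and takes part in determining the result type. The function name ifelse_na is hypothetical and the code is only meant to show the semantics, not the performance.

```r
# Hypothetical sketch: ifelse with a separate branch for missing 'test'.
ifelse_na <- function(test, yes, no, na = NA) {
  test <- as.logical(test)
  n <- length(test)
  # The prototype is built from all three branches, independent of 'test'.
  out <- c(yes[0L], no[0L], na[0L])[rep(NA_integer_, n)]
  fill <- function(out, idx, value) {
    out[idx] <- rep(value, length.out = n)[idx]
    out
  }
  out <- fill(out, which(test), yes)
  out <- fill(out, which(!test), no)
  fill(out, which(is.na(test)), na)
}

ifelse_na(c(TRUE, FALSE, NA), "yes", "no", "missing")
```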
Best regards,
Ivan

[1] https://github.com/rdatatable/data.table/issues/3657
[2] https://vctrs.r-lib.org/articles/stability.html#ifelse