On Sun, May 26, 2019 at 6:47 AM Joshua Ulrich <josh.m.ulrich at gmail.com>
wrote:
On Sun, May 26, 2019 at 4:06 AM Michael Chirico
<michaelchirico4 at gmail.com> wrote:
Have finally managed to come up with a fix after checking out the call
stack from within the as.Date.IDate debugger, which shows something like:
[[1]] rbind(DF, DF)
[[2]] rbind(deparse.level, ...)
[[3]] `[<-`(`*tmp*`, ri, value = 18042L)
[[4]] `[<-.Date`(`*tmp*`, ri, value = 18042L)
[[5]] as.Date(value)
[[6]] as.Date.IDate(value)
I'm not sure why [<- is called; I guess the implementation is to assign to
the output block by block? Anyway, we didn't have a [<- method. And
[<-.Date looks like:
value <- unclass(as.Date(value)) # <- converts to double
.Date(NextMethod(.Generic), oldClass(x)) # <- restores 'IDate' class
So we can fix our bug by defining a [<- method for IDate; the question that
I don't see answered in documentation or source code is, why/where is [<-
called, exactly?
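A minimal sketch of such a method, using a hypothetical integer-backed class MyIDate as a stand-in for IDate. The class name, the constructor, and the method body are illustrative assumptions, not data.table's actual code. The idea is to do the assignment on the bare integer vector and then restore the class, so that `[<-.Date` is never reached:

```r
# Hypothetical integer-backed subclass of Date (stand-in for IDate).
new_my_idate <- function(days) {
  structure(as.integer(days), class = c("MyIDate", "Date"))
}

# Sketch of a replacement method that keeps integer storage.
`[<-.MyIDate` <- function(x, i, value) {
  if (!length(value)) return(x)
  cl <- oldClass(x)
  # Coerce the new values to whole day counts stored as integer.
  value <- as.integer(unclass(as.Date(value)))
  # Assign on the bare vector so `[<-.Date` is not dispatched.
  x <- unclass(x)
  if (missing(i)) x[] <- value else x[i] <- value
  structure(x, class = cl)
}
# Register explicitly so that dispatch from inside base functions such as
# rbind.data.frame() does not depend on where this script is evaluated.
registerS3method("[<-", "MyIDate", `[<-.MyIDate`)

d <- new_my_idate(18042)
d[2L] <- as.Date("2019-05-27")
print(storage.mode(d))

DF2 <- data.frame(date = new_my_idate(18042))
print(storage.mode(rbind(DF2, DF2)$date))
```

The sketch assumes the replacement value is already a Date (or something as.Date() accepts without extra arguments); a real method would likely need to handle bare numbers as well.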
Your rbind(DF, DF) call dispatches to base::rbind.data.frame(). The
`[<-` call is this line:
value[[jj]][ri] <- if (is.factor(xij)) as.vector(xij) else xij
That's where the storage.mode changes from integer to double.
debug: value[[jj]][ri] <- if (is.factor(xij)) as.vector(xij) else xij
Browse[2]>
debug: xij
Browse[2]> storage.mode(xij)
[1] "integer"
Browse[2]> value[[jj]][ri]
[1] "2019-05-26"
Browse[2]> storage.mode(value[[jj]][ri])
[1] "integer"
Browse[2]>
debug: if (!is.null(nm <- names(xij))) names(value[[jj]])[ri] <- nm
Browse[2]> storage.mode(value[[jj]][ri])
[1] "double"
To be clear, I don't think this is a bug in rbind() or
rbind.data.frame(). The confusion is that rbind.data.frame() calls
`[<-` for each column of the data.frame, and there is no `[<-.IDate`
method. So the parent class method is dispatched, which converts the
storage mode to double.
Someone may argue that this is an issue with `[<-.Date`, and that it
shouldn't convert the storage.mode from integer to double.
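The dispatch chain described above can be reproduced without data.table. The sketch below uses a hypothetical class IntDate and a toy as.Date method that returns a double-backed Date. That behaviour is an assumption modelled on the as.Date.IDate call in the traceback, not a copy of data.table's code:

```r
# Hypothetical integer-backed subclass of Date, standing in for IDate.
x <- structure(18042L, class = c("IntDate", "Date"))

# Assumption: like the as.Date.IDate seen in the traceback, this toy
# as.Date method hands back a plain, double-backed Date.
as_date_int_date <- function(x, ...) {
  structure(as.numeric(unclass(x)), class = "Date")
}
registerS3method("as.Date", "IntDate", as_date_int_date)

before <- storage.mode(x)

# No `[<-.IntDate` exists, so this dispatches to `[<-.Date`, which runs
# value <- unclass(as.Date(value)) before calling NextMethod().
x[1L] <- x[1L]

after <- storage.mode(x)
print(c(before = before, after = after))
print(class(x))
```

The class attribute survives because `[<-.Date` restores oldClass(x), while the underlying vector has been coerced to match the double replacement value.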
Mike C
On Sun, May 26, 2019 at 1:16 PM Michael Chirico
<michaelchirico4 at gmail.com> wrote:
The IDate class inherits from Date (it uses
integer storage for efficiency, hence, I).
The concatenation done by rbind, however, breaks this and returns a double:
library(data.table)
DF = data.frame(date = as.IDate(Sys.Date()))
storage.mode(rbind(DF, DF)$date)
# [1] "double"
This is specific to base::rbind (data.table's rbind returns an integer as
expected); in ?rbind we see:
The method dispatching is not done via UseMethod(), but by C-internal
dispatching. Therefore there is no need for, e.g., rbind.default.
The dispatch algorithm is described in the source file
(‘.../src/main/bind.c’) as
1. For each argument we get the list of possible class memberships from
the class attribute.
2. *We inspect each class in turn to see if there is an applicable
method.*
3. If we find an applicable method we make sure that it is identical to
any method determined for prior arguments. If it is identical, we proceed,
otherwise we immediately drop through to the default code.
It's not clear what #2 means -- an applicable method *for what*? Looking
at the source code would suggest it's looking for rbind.IDate:
const char *generic = ((PRIMVAL(op) == 1) ? "cbind" : "rbind"); // should be rbind here
const char *s = translateChar(STRING_ELT(classlist, i)); // iterating over the classes, should get to IDate first
sprintf(buf, "%s.%s", generic, s); // should be rbind.IDate
but adding this method (or even exporting it) is no help [ simply
rbind.IDate = function(...) as.IDate(NextMethod()) ]
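One way to read step 2 is that the method is looked up from the class of each argument passed to rbind(), not from the classes of the columns inside a data.frame. The sketch below illustrates that reading with a hypothetical class Toy and a method that only reports being called. The method is registered explicitly so the lookup does not depend on where the code is evaluated:

```r
# Hypothetical class with an rbind method that only reports being called.
rbind_toy <- function(..., deparse.level = 1) "rbind method for Toy called"
registerS3method("rbind", "Toy", rbind_toy)

x <- structure(18042L, class = c("Toy", "Date"))

# The arguments themselves have class Toy, so the method is found.
res_direct <- rbind(x, x)
print(res_direct)

# Here the arguments are data.frames; their class is "data.frame", so
# rbind.data.frame is used and the Toy method is not consulted.
df <- data.frame(date = x)
res_df <- rbind(df, df)
print(class(res_df))
```

Under this reading, an rbind.IDate method would only matter for a call like rbind(DF$date, DF$date), not for rbind(DF, DF).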
Lastly, it appears that as.Date.IDate is called, which is causing the
conversion:
debug(data.table:::as.Date.IDate)
rbind(DF, DF) # launches debugger
x
# [1] "2019-05-26" <-- singleton, so apparently applied to DF$date,
# not c(DF$date, DF$date)
undebug(data.table:::as.Date.IDate)
I can't really wrap my head around why as.Date is being called here, and
even allowing that, why the end result still has the original class [
class(rbind(DF, DF)$date) == c('IDate', 'Date') ]
So, I'm beginning to think this might be a bug. Am I missing something?