Suggestions for 'diff.default'
--- On Mon, 28/1/13, Suharto Anggono Suharto Anggono <suharto_anggono at yahoo.com> wrote:
From: Suharto Anggono Suharto Anggono <suharto_anggono at yahoo.com>
Subject: Suggestions for 'diff.default'
To: R-devel at lists.R-project.org
Date: Monday, 28 January, 2013, 5:31 PM

I have suggestions for function 'diff.default' in R.

Suggestion 1: If the input is a matrix, always return a matrix, even if the result is empty.

What happens in R 2.15.2:
rbind(1:2)    # matrix
     [,1] [,2]
[1,]    1    2
diff(rbind(1:2))    # not matrix
integer(0)
sessionInfo()
R version 2.15.2 (2012-10-26)
Platform: i386-w64-mingw32/i386 (32-bit)

locale:
[1] LC_COLLATE=English_United States.1252
[2] LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C
[5] LC_TIME=English_United States.1252

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

The documentation for 'diff' says, "If 'x' is a matrix then the difference operations are carried out on each column separately." If the result is empty, I expect that the result still has as many columns as the input.

Suggestion 2: Make 'diff.default' applicable more generally by
(a) not performing 'unclass';
(b) generalizing (changing)
    ismat <- is.matrix(x)
to become
    ismat <- length(dim(x)) == 2L

If suggestion 1 is to be applied and 'unclass' is not wanted (that is, point (a) in suggestion 2 is also to be applied),
    if (lag * differences >= xlen)
        return(x[0L])
can be changed to
    if (lag * differences >= xlen)
        return(
            if (ismat) x[0L, , drop = FALSE] - x[0L, , drop = FALSE] else
            x[0L] - x[0L])
This handles classes for which the subtraction (minus) operation changes the class.
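As a small illustration of the replacement expression above (not a patch to 'diff.default' itself), the snippet below evaluates the empty subtraction on a one-row matrix and on a "Date" vector. The object names 'm', 'e' and 'd' are made up for this example.

```r
# Illustration only: the empty-result expression, evaluated by hand.
m <- rbind(1:2)                                      # 1 x 2 integer matrix
e <- m[0L, , drop = FALSE] - m[0L, , drop = FALSE]   # empty, but still a matrix
dim(e)                                               # zero rows, column count kept

d <- as.Date(c("2012-01-01", "2012-02-01"))
class(d[0L])           # plain empty subsetting keeps the input class
class(d[0L] - d[0L])   # subtracting empty pieces gives the class of a difference
```

The point of subtracting rather than just subsetting is that the class of the empty result then follows whatever the '-' method for the input class returns.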
Sorry, I wasn't careful enough. To obtain the correct class for the result, differencing should be done as many times as specified by argument 'differences'.
I consider the case of
diff(as.POSIXct(c("2012-01-01", "2012-02-01"), tz="UTC"), d=2)
versus
diff(diff(as.POSIXct(c("2012-01-01", "2012-02-01"), tz="UTC")))
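To compare the two routes without relying on which 'diff' method gets dispatched, the subtraction can be written out by hand. This is only a sketch; 'step1', 'step2' and 'shortcut' are names invented here.

```r
# Hand-written differencing of a POSIXct vector, one step at a time.
x <- as.POSIXct(c("2012-01-01", "2012-02-01"), tz = "UTC")

step1 <- x[-1L] - x[-length(x)]                  # first difference: POSIXct minus POSIXct
step2 <- step1[-1L] - step1[-length(step1)]      # second difference: taken on the first result
shortcut <- x[0L] - x[0L]                        # single empty subtraction on the input

# Print class and attributes of each route so they can be compared.
class(step2);    attributes(step2)
class(shortcut); attributes(shortcut)
```

Both results are empty, so any discrepancy between them shows up only in the attributes.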
To be safe, maybe just compute as usual, even when it is known that the end result will be empty. It can be done like this.
    empty <- integer()
    if (ismat)
        for (i in seq_len(differences))
            r <- if (lag >= nrow(r))
                r[empty, , drop = FALSE] - r[empty, , drop = FALSE] else
                ...
    else
        for (i in seq_len(differences))
            r <- if (lag >= length(r))
                r[empty] - r[empty] else
                ...
If that way is used, 'xlen' is no longer needed.
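Put together, the idea might look like the following self-contained sketch. 'diff_sketch' is a hypothetical name and this is not the actual 'diff.default' source. The non-empty branch is my own stand-in for the part elided with "..." above: it assumes the usual lagged subtraction.

```r
# Hypothetical sketch of the proposal; not the real 'diff.default'.
diff_sketch <- function(x, lag = 1L, differences = 1L) {
    if (length(lag) != 1L || length(differences) != 1L ||
        lag < 1L || differences < 1L)
        stop("'lag' and 'differences' must be integers >= 1")
    ismat <- length(dim(x)) == 2L    # point (b) of suggestion 2
    r <- x                           # point (a): no 'unclass'
    empty <- integer()
    for (i in seq_len(differences)) {
        n <- if (ismat) nrow(r) else length(r)
        if (lag >= n) {
            # Too short to difference: subtract empty pieces so that the
            # class (and the columns, for a matrix) still come from '-'.
            r <- if (ismat) {
                r[empty, , drop = FALSE] - r[empty, , drop = FALSE]
            } else {
                r[empty] - r[empty]
            }
        } else {
            # Assumed stand-in for the elided branch: lagged subtraction.
            lead  <- seq.int(lag + 1L, n)
            trail <- seq_len(n - lag)
            r <- if (ismat) {
                r[lead, , drop = FALSE] - r[trail, , drop = FALSE]
            } else {
                r[lead] - r[trail]
            }
        }
    }
    r
}

dim(diff_sketch(rbind(1:2)))                  # empty result keeps two columns
diff_sketch(c(1, 4, 9, 16), differences = 2)  # ordinary numeric input
```

Because the loop always runs 'differences' times, no 'xlen' check is needed up front, at the cost of a few subtractions on empty objects.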
Otherwise, if 'unclass' is wanted, maybe the handling of an empty result can be moved to after 'unclass', to be consistent with the non-empty result.

If point (a) in suggestion 2 is applied, 'diff.default' can handle input of class "Date" and "POSIXt". If, in addition, point (b) in suggestion 2 is also applied, 'diff.default' can handle a data frame as input.
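The reason point (b) matters for data frames can be checked directly. In this sketch the data frame 'df' and the name 'res' are invented for illustration, and the differencing is written out by hand rather than through 'diff'.

```r
# A data frame has two dimensions but is not a matrix.
df <- data.frame(a = c(1, 4, 9), b = c(2, 3, 5))
is.matrix(df)            # the current test in 'diff.default'
length(dim(df)) == 2L    # the proposed, more general test

# Row-wise differencing using data-frame subtraction, with no 'unclass'.
res <- df[-1L, , drop = FALSE] - df[-nrow(df), , drop = FALSE]
res
```

This only works when every column supports '-', so a data frame with, say, a character column would still fail.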