I found an implementation that is faster (by an order of magnitude in my
tests) than the one using xts, split and merge (from Joshua).
I report the two fastest solutions below, with code to generate a test
case; some work remains on column order and naming.
The test case has grown since my previous post to give more realistic timings.
Any comments or ideas to further speed up the creation of multivariate time
series of class xts or timeSeries from a data.frame
like the one shown here are welcome.
Best regards,
Den
A data.frame example (code below to generate it):
   ID                DATE     VALUE
14  3 2000-01-01 00:00:03 0.5726334
4   1 2000-01-01 00:00:03 0.8830174
1   1 2000-01-01 00:00:00 0.2875775
15  3 2000-01-01 00:00:04 0.1029247
11  3 2000-01-01 00:00:00 0.9568333
9   2 2000-01-01 00:00:03 0.5514350
7   2 2000-01-01 00:00:01 0.5281055
6   2 2000-01-01 00:00:00 0.0455565
12  3 2000-01-01 00:00:01 0.4533342
8   2 2000-01-01 00:00:02 0.8924190
3   1 2000-01-01 00:00:02 0.4089769
13  3 2000-01-01 00:00:02 0.6775706
And I want to get a timeSeries object or xts object like this:
                            1         2         3
2000-01-01 00:00:00 0.2875775 0.0455565 0.9568333
2000-01-01 00:00:01        NA 0.5281055 0.4533342
2000-01-01 00:00:02 0.4089769 0.8924190 0.6775706
2000-01-01 00:00:03 0.8830174 0.5514350 0.5726334
2000-01-01 00:00:04        NA        NA 0.1029247
# CODE:
set.seed(123)
# set N to 5 to reproduce above data.frame
N <- 1000
# set K to 3 to reproduce above data.frame
K <- 10
X <- data.frame(
  ID = rep(1:K, each = N),
  DATE = as.character(rep(as.POSIXct("2000-01-01", tz = "GMT")+ 0:(N-1), K)),
  VALUE = runif(N*K), stringsAsFactors = FALSE)
X <- X[sample(1:(N*K), N*K),]
X <- X[-(sample(1:nrow(X), floor(nrow(X)*0.2))),]
str(X)
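# Solution 1: build an xts indexed by DATE, split VALUE by ID,
# then merge the pieces into one wide series
xtsSplit <- function(x)
{
  library(xts)
  x <- xts(x[,c("ID","VALUE")], as.POSIXct(x[,"DATE"]))
  return(do.call(merge, split(x$VALUE,x$ID)))
}
# [[1]] of system.time() is the user CPU time
xtsSplitTime <- replicate(50,
  system.time(xtsSplit(X))[[1]])
median(xtsSplitTime)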
# Solution 2: reshape to wide format with base reshape(), then convert
# to xts; column 1 of the reshaped data.frame is DATE
xtsReshape <- function(x)
{
  library(xts)
  x <- reshape(x, idvar = "DATE", timevar = "ID", direction = "wide")
  x <- xts(x[,-1], as.POSIXct(x[,1]))
  return(x)
}
xtsReshapeTime <- replicate(50,
  system.time(xtsReshape(X))[[1]])
median(xtsReshapeTime)
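On the column order and naming still to be done for the reshape() version: reshape() names the wide columns VALUE.<ID> and keeps them in order of first appearance of each ID. Below is a minimal sketch of one possible fix, using base R only. The helper tidyWideColumns and the small data.frame "long" are hypothetical names introduced for illustration, and the sketch assumes the IDs are numeric.

```r
# Hypothetical helper (not part of the solutions above): strip the
# "VALUE." prefix added by reshape() and sort columns by numeric ID.
tidyWideColumns <- function(x, prefix = "VALUE.")
{
  ids <- sub(prefix, "", colnames(x), fixed = TRUE)
  colnames(x) <- ids
  # assumes every ID can be converted with as.numeric()
  x[, order(as.numeric(ids)), drop = FALSE]
}

# Small made-up long-format example with IDs deliberately out of order
long <- data.frame(
  ID = c(3, 1, 2, 3, 1),
  DATE = c("2000-01-01 00:00:00", "2000-01-01 00:00:00",
           "2000-01-01 00:00:00", "2000-01-01 00:00:01",
           "2000-01-01 00:00:01"),
  VALUE = c(0.9, 0.2, 0.0, 0.4, 0.8),
  stringsAsFactors = FALSE)

wide <- reshape(long, idvar = "DATE", timevar = "ID", direction = "wide")
colnames(wide)                       # columns follow first appearance of ID
tidy <- tidyWideColumns(wide[, -1])  # drop DATE, then rename and reorder
colnames(tidy)
```

In xtsReshape the same helper could presumably be applied to x[,-1] just before the call to xts(), so the extra cost would be one sub() and one column reordering per call; I have not timed it.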