Skip to content
Back to formatted view

Raw Message

Message-ID: <AANLkTinV5-5zu95TrCJfc=dS_DWy8NDeR8w2BddHT7CW@mail.gmail.com>
Date: 2010-09-12T21:40:47Z
From: Gabor Grothendieck
Subject: aligning time series data
In-Reply-To: <AANLkTimVizP-Xyk8GZc5wq13JTD6EWmRPQCAfMpWsqKb@mail.gmail.com>

On Sun, Sep 12, 2010 at 3:45 PM, Anil Vijendran
<anil.vijendran at gmail.com> wrote:
> Hi, I have a data frame that looks like this:
>
>> print(tmp.2, row.names=F)
> ? ? ? Date ? ? ? ?A ? ? ? B ? ? ? ?C ? ? ? ?D
> ?2009-06-30 437.4941 1498.94 3013.123 7771.071
> ?2009-07-31 445.5752 1612.31 3182.020 7800.285
> ?2009-08-31 460.4486 1670.52 3230.984 7900.123
> ?2009-09-30 476.5982 1732.86 3438.276 7980.975
> ?2009-10-31 490.8214 1700.67 3205.386 7937.876
> ?2009-11-30 491.5338 1802.68 3329.164 8091.661
> ?2009-12-31 492.6630 1837.50 3257.927 7701.011
> ?2010-01-29 495.9595 ? ? ?NA ? ? ? NA ? ? ? NA
> ?2010-01-31 495.9595 1771.40 3280.050 7865.038
> ?2010-02-26 493.2970 ? ? ?NA ? ? ? NA ? ? ? NA
> ?2010-02-28 493.2970 1826.27 3373.147 7901.680
> ?2010-03-31 508.8930 1936.48 3236.337 7778.707
> ?2010-04-30 515.2000 1967.05 3356.522 7899.239
> ?2010-05-28 507.9438 ? ? ?NA ? ? ? NA ? ? ? NA
> ?2010-05-30 ? ? ? NA ? ? ?NA 3446.907 ? ? ? NA
> ?2010-05-31 507.9438 1809.98 ? ? ? NA 8172.843
> ?2010-06-30 513.0341 1715.23 3454.708 8426.962
> ?2010-07-30 529.3736 ? ? ?NA ? ? ? NA ? ? ? NA
> ?2010-07-31 ? ? ? NA 1835.40 3568.250 8468.670
> ?2010-08-31 553.9723 1752.55 ? ? ? NA 8824.940
>>
>
> Can someone recommend a way to align all these series on the same date (last
> day of month) - i.e for each series pick the value as of the last day of the
> month if its non-NA, else use the previous value for the last day of the
> month? I was planning to use timeSeries, but solutions that use zoo or xts
> will be great as well.
>
> I could build something myself but am hoping there is a robust solution
> already available to solve a common problem like this.
>

Try this:

library(zoo)
z <- read.zoo(DF)
na.locf(z)[!duplicated(as.yearmon(time(z)), fromLast = TRUE)]

That is:

> DF <-  # input data frame
+ structure(list(Date = structure(c(14425, 14456, 14487, 14517,
+ 14548, 14578, 14609, 14638, 14640, 14666, 14668, 14699, 14729,
+ 14757, 14759, 14760, 14790, 14820, 14821, 14852), class = "Date"),
+     A = c(437.4941, 445.5752, 460.4486, 476.5982, 490.8214, 491.5338,
+     492.663, 495.9595, 495.9595, 493.297, 493.297, 508.893, 515.2,
+     507.9438, NA, 507.9438, 513.0341, 529.3736, NA, 553.9723),
+     B = c(1498.94, 1612.31, 1670.52, 1732.86, 1700.67, 1802.68,
+     1837.5, NA, 1771.4, NA, 1826.27, 1936.48, 1967.05, NA, NA,
+     1809.98, 1715.23, NA, 1835.4, 1752.55), C = c(3013.123, 3182.02,
+     3230.984, 3438.276, 3205.386, 3329.164, 3257.927, NA, 3280.05,
+     NA, 3373.147, 3236.337, 3356.522, NA, 3446.907, NA, 3454.708,
+     NA, 3568.25, NA), D = c(7771.071, 7800.285, 7900.123, 7980.975,
+     7937.876, 8091.661, 7701.011, NA, 7865.038, NA, 7901.68,
+     7778.707, 7899.239, NA, NA, 8172.843, 8426.962, NA, 8468.67,
+     8824.94)), .Names = c("Date", "A", "B", "C", "D"),
+     row.names = c(NA, -20L), class = "data.frame")
>
> library(zoo)
> z <- read.zoo(DF) # convert to zoo
> na.locf(z)[!duplicated(as.yearmon(time(z)), fromLast = TRUE)]
                  A       B        C        D
2009-06-30 437.4941 1498.94 3013.123 7771.071
2009-07-31 445.5752 1612.31 3182.020 7800.285
2009-08-31 460.4486 1670.52 3230.984 7900.123
2009-09-30 476.5982 1732.86 3438.276 7980.975
2009-10-31 490.8214 1700.67 3205.386 7937.876
2009-11-30 491.5338 1802.68 3329.164 8091.661
2009-12-31 492.6630 1837.50 3257.927 7701.011
2010-01-31 495.9595 1771.40 3280.050 7865.038
2010-02-28 493.2970 1826.27 3373.147 7901.680
2010-03-31 508.8930 1936.48 3236.337 7778.707
2010-04-30 515.2000 1967.05 3356.522 7899.239
2010-05-31 507.9438 1809.98 3446.907 8172.843
2010-06-30 513.0341 1715.23 3454.708 8426.962
2010-07-31 529.3736 1835.40 3568.250 8468.670
2010-08-31 553.9723 1752.55 3568.250 8824.940


-- 
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com