5th of month working day
5 messages · Research, Diethelm Wuertz, Gabor Grothendieck
On Mon, Jan 18, 2010 at 9:54 AM, Research <risk2009 at ath.forthnet.gr> wrote:
Hello,

I have a daily data zoo object with prices such as:

05/04/2006 1311.56
06/04/2006 1309.04
07/04/2006 1295.5
10/04/2006 1296.6
11/04/2006 1286.57
12/04/2006 1288.12
13/04/2006 1289.12
14/04/2006 1289.12
17/04/2006 1285.33
18/04/2006 1307.65
19/04/2006 1309.93
20/04/2006 1311.46
21/04/2006 1311.28
24/04/2006 1308.11
25/04/2006 1301.74
26/04/2006 1305.41
27/04/2006 1309.72
28/04/2006 1310.61
01/05/2006 1305.19
02/05/2006 1313.21
03/05/2006 1307.85
04/05/2006 1312.25
05/05/2006 1325.76

How can I isolate the 5th day of each month (if this was a working/trading day) otherwise the most recent (before the 5th) working day for each month?
Your sample data always has the 5th of the month filled in but assuming that that is not the case for the real data, merge your series with a zero width series having every date and use na.locf to move values up into subsequent NAs. Then just pick off the 5th of each month.

Lines <- "05/04/2006 1311.56
06/04/2006 1309.04
07/04/2006 1295.5
10/04/2006 1296.6
11/04/2006 1286.57
12/04/2006 1288.12
13/04/2006 1289.12
14/04/2006 1289.12
17/04/2006 1285.33
18/04/2006 1307.65
19/04/2006 1309.93
20/04/2006 1311.46
21/04/2006 1311.28
24/04/2006 1308.11
25/04/2006 1301.74
26/04/2006 1305.41
27/04/2006 1309.72
28/04/2006 1310.61
01/05/2006 1305.19
02/05/2006 1313.21
03/05/2006 1307.85
04/05/2006 1312.25
05/05/2006 1325.76"
library(zoo)
z <- read.zoo(textConnection(Lines), format = "%d/%m/%Y")
rng <- range(time(z))
zz <- na.locf(merge(z, zoo(, seq(rng[1], rng[2], by = "day"))))
zz[format(time(zz), "%d") == "05"]
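To make "move values up into subsequent NAs" concrete, here is a minimal base R sketch of last-observation-carried-forward on a plain vector. It is not how zoo implements na.locf; the vector x and the names last_obs and filled are made up for illustration, and the sketch assumes the first element is not NA.

```r
# Hypothetical daily values with gaps (NA = no observation that day)
x <- c(1311.56, NA, NA, 1309.04, NA, 1295.5)

# Observed positions keep their own index, NA positions contribute 0,
# and cummax() carries the most recent observed index forward.
last_obs <- cummax(ifelse(is.na(x), 0L, seq_along(x)))

# Assumption: x[1] is not NA. A leading NA would yield index 0,
# and x[0] silently drops that element.
filled <- x[last_obs]
print(filled)
```

Each NA is replaced by the nearest earlier observation, which is why picking the 5th of the month from the filled series returns the value from the last trading day on or before the 5th.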
Since I do not understand Gabor's solution, I add my own suggestion to
solve the problem ...
> require(timeSeries)
>
> # Compose for 2006 a calendar with the first 5 days in each month:
> years = rep(2006, times = 60)
> months = rep(1:12, each = 5)
> days = rep(1:5, times = 12)
> tD = timeCalendar(years, months, days)
> tD
GMT
[1] [2006-01-01] [2006-01-02] [2006-01-03] [2006-01-04] [2006-01-05]
[6] [2006-02-01] [2006-02-02] [2006-02-03] [2006-02-04] [2006-02-05]
[11] [2006-03-01] [2006-03-02] [2006-03-03] [2006-03-04] [2006-03-05]
[16] [2006-04-01] [2006-04-02] [2006-04-03] [2006-04-04] [2006-04-05]
[21] [2006-05-01] [2006-05-02] [2006-05-03] [2006-05-04] [2006-05-05]
[26] [2006-06-01] [2006-06-02] [2006-06-03] [2006-06-04] [2006-06-05]
[31] [2006-07-01] [2006-07-02] [2006-07-03] [2006-07-04] [2006-07-05]
[36] [2006-08-01] [2006-08-02] [2006-08-03] [2006-08-04] [2006-08-05]
[41] [2006-09-01] [2006-09-02] [2006-09-03] [2006-09-04] [2006-09-05]
[46] [2006-10-01] [2006-10-02] [2006-10-03] [2006-10-04] [2006-10-05]
[51] [2006-11-01] [2006-11-02] [2006-11-03] [2006-11-04] [2006-11-05]
[56] [2006-12-01] [2006-12-02] [2006-12-03] [2006-12-04] [2006-12-05]
>
> # Then extract the Business days (not weekdays!) according
> # to a given holiday Calendar, here I used the NYSE holiday
> # calendar for 2006
> # Note with Rmetrics you can create your own business calendars!
> tM = matrix(as.integer(isBizday(tD, holidayNYSE(2006))), byrow = TRUE, ncol = 5)
> rownames(tM) = paste(200600+1:12)
> colnames(tM) = paste(1:5)
> tM
       1 2 3 4 5
200601 0 0 1 1 1
200602 1 1 1 0 0
200603 1 1 1 0 0
200604 0 0 1 1 1
200605 1 1 1 1 1
200606 1 1 0 0 1
200607 0 0 1 0 1
200608 1 1 1 1 0
200609 1 0 0 0 1
200610 0 1 1 1 1
200611 1 1 1 0 0
200612 1 0 0 1 1
>
> # Then isolate the 5th day of each month if this was a business day,
> # otherwise the most recent business day before the 5th for each
> # month. This is what you want, isn't it?
> # Take care, there may be holidays in between the previous working
> # days! Here they are handled properly.
> tW = t(apply(tM, 1, cumsum))[,5:1]
> tW
       5 4 3 2 1
200601 3 2 1 0 0
200602 3 3 3 2 1
200603 3 3 3 2 1
200604 3 2 1 0 0
200605 5 4 3 2 1
200606 3 2 2 2 1
200607 2 1 1 0 0
200608 4 4 3 2 1
200609 2 1 1 1 1
200610 4 3 2 1 0
200611 3 3 3 2 1
200612 3 2 1 1 1
> tIndex = which(t(tW) == 1)
>
>
> # After having the Index, you can get the timeDate objects for 2006
> tD[tIndex]
GMT
[1] [2006-01-03] [2006-02-05] [2006-03-05] [2006-04-03] [2006-05-05]
[6] [2006-06-05] [2006-07-02] [2006-07-03] [2006-08-05] [2006-09-02]
[11] [2006-09-03] [2006-09-04] [2006-09-05] [2006-10-04] [2006-11-05]
[16] [2006-12-03] [2006-12-04] [2006-12-05]
>
> # and finally index your time series with the timeDate objects.
> # Isn't it powerful to use timeDate and timeSeries objects?
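Note that tD[tIndex] in the session above prints 18 dates for 12 months, because which(t(tW) == 1) matches every position where the running count equals 1. As a hedged alternative, the sketch below takes, for each month, the largest day number that the printed tM matrix flags as a business day. The matrix is re-typed from the output above so the snippet runs without timeDate; last_biz and dates are hypothetical names, and the sketch assumes every month has at least one business day among its first five days.

```r
# Business-day indicator re-typed from the tM matrix printed above
# (rows = months of 2006, columns = day of month 1..5)
tM <- matrix(c(
  0, 0, 1, 1, 1,
  1, 1, 1, 0, 0,
  1, 1, 1, 0, 0,
  0, 0, 1, 1, 1,
  1, 1, 1, 1, 1,
  1, 1, 0, 0, 1,
  0, 0, 1, 0, 1,
  1, 1, 1, 1, 0,
  1, 0, 0, 0, 1,
  0, 1, 1, 1, 1,
  1, 1, 1, 0, 0,
  1, 0, 0, 1, 1
), ncol = 5, byrow = TRUE,
   dimnames = list(paste(200600 + 1:12), paste(1:5)))

# For each month, the latest day (<= 5) that is a business day.
# Assumption: each row contains at least one 1.
last_biz <- apply(tM, 1, function(r) max(which(r == 1)))

# Turn the day numbers back into dates, one per month
dates <- as.Date(sprintf("2006-%02d-%02d", 1:12, last_biz))
print(dates)
```

With the matrix as printed, this gives the 5th for most months, the 3rd for February, March and November, and the 4th for August.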
Exercise: write a small function to extract the n-th business day for
each month of a timeDate calendar object given a specific holiday Calendar
enjoy Rmetrics!
Diethelm
PS: I found this example really nice to show what timeDate and timeSeries
methods can do for you. I will add this example to the FAQs in the next
edition of our timeSeries FAQ e-book: http://www.rmetrics.org/node/8
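One possible take on the exercise above, sketched in base R rather than with timeDate so that it runs without extra packages. The function name nth_bizday and its holidays argument are hypothetical, weekends are assumed to be Saturday and Sunday, and the holiday vector in the usage lines is only a partial list for illustration, not a full NYSE calendar.

```r
# Hypothetical helper: n-th business day of every month in a year.
# 'holidays' is a Date vector of non-trading weekdays (assumed input).
nth_bizday <- function(year, n, holidays = as.Date(character(0))) {
  days <- seq(as.Date(sprintf("%d-01-01", year)),
              as.Date(sprintf("%d-12-31", year)), by = "day")
  wday <- as.POSIXlt(days)$wday            # 0 = Sunday, ..., 6 = Saturday
  biz  <- days[wday %in% 1:5 & !(days %in% holidays)]
  # Within each month, take the n-th element of the business-day sequence;
  # months with fewer than n business days yield NA.
  picks <- tapply(seq_along(biz), format(biz, "%m"), function(i) i[n])
  biz[as.vector(picks)]
}

# Usage: first business day of each month in 2006, with a partial,
# illustrative holiday list (2 January, 4 July, 4 September)
hol <- as.Date(c("2006-01-02", "2006-07-04", "2006-09-04"))
first_days <- nth_bizday(2006, 1, hol)
print(first_days)
```

With timeDate available, the weekday and holiday test could presumably be replaced by isBizday(tD, holidayNYSE(2006)) as used in the session above.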
_______________________________________________ R-SIG-Finance at stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-sig-finance -- Subscriber-posting only. If you want to post, subscribe first. -- Also note that this is not the r-help list where general R questions should go.
Here is a second example that does the same thing (that is, it
returns the last value on or prior to the 5th); however, in the prior
solution the date was shown as the 5th, and in this one it is shown as
the date of the last filled-in value. Not sure which you would
prefer. See ?na.locf for more info on moving values forward into NAs,
and also read the three vignettes that come with zoo.
# same as your example except I have removed the 5ths of the month and
# added 4/4/2006.
Lines <- "
04/04/2006 1311.56
06/04/2006 1309.04
07/04/2006 1295.5
10/04/2006 1296.6
11/04/2006 1286.57
12/04/2006 1288.12
13/04/2006 1289.12
14/04/2006 1289.12
17/04/2006 1285.33
18/04/2006 1307.65
19/04/2006 1309.93
20/04/2006 1311.46
21/04/2006 1311.28
24/04/2006 1308.11
25/04/2006 1301.74
26/04/2006 1305.41
27/04/2006 1309.72
28/04/2006 1310.61
01/05/2006 1305.19
02/05/2006 1313.21
03/05/2006 1307.85
04/05/2006 1312.25
06/05/2006 1325.76"
library(zoo)
z <- read.zoo(textConnection(Lines), format = "%d/%m/%Y")
# z.na is the same as z but with missing days added using NAs.
# It's formed by merging z with a zero-width series containing all days.
rng <- range(time(z))
z.na <- merge(z, zoo(, seq(rng[1], rng[2], by = "day")))
# form a series that has NAs wherever z.na does but has 1, 2, 3, ...
# instead of z.na's data values and then use na.locf to fill in NAs
idx <- na.locf(seq_along(z.na) + (0 * z.na))
# pick off elements of z.na corresponding to 5th of month
z.na[idx[format(time(z.na), "%d") == "05"]]
Here is the final result:
2006-04-04 2006-05-04
1311.56 1312.25
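The same lookup can also be sketched without building a daily grid, using base R's findInterval to locate, for each 5th of the month, the last observation dated on or before it. This is an alternative technique rather than the zoo approach above; the four-row sample and the names dates, prices, targets, pos and result are made up for illustration.

```r
# Hypothetical subset of the sample above, with both 5ths missing
dates  <- as.Date(c("2006-04-04", "2006-04-06", "2006-05-04", "2006-05-06"))
prices <- c(1311.56, 1309.04, 1312.25, 1325.76)

# Target dates: the 5th of each month covered by the data
targets <- seq(as.Date("2006-04-05"), as.Date("2006-05-05"), by = "month")

# For each target, index of the last observation on or before it
# (0 if there is none); 'dates' must be sorted in increasing order.
pos <- findInterval(as.numeric(targets), as.numeric(dates))
ok  <- pos > 0

result <- data.frame(target   = targets[ok],
                     observed = dates[pos[ok]],
                     price    = prices[pos[ok]])
print(result)
```

For this sample the observed dates are 2006-04-04 and 2006-05-04 with prices 1311.56 and 1312.25, matching the result shown above.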
The first line should have read:

Here is a second example that does the same thing (that is it