Query about creating time sequences

On Sat, May 26, 2012 at 6:22 AM, R. Michael Weylandt <michael.weylandt at gmail.com> wrote:

One (somewhat kludgy) way would be to use seq() to make one day's worth of
times then to pass those to outer() to add in the needed days and then
coerce the whole thing back to a sorted vector.

I'm not at a computer right now so this won't be quite right but something like

x <- seq(x.start.first.day, x.end.first.day, by = "sec")

y <- 24*60*60 * (0:(n.days - 1))  # day offsets in seconds; 0 keeps the first day

sort(as.vector(outer(x, y, "+")))

Changing the order of x and y might make the sort unnecessary. 

M
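
For reference, here is a minimal runnable sketch of the outer() idea above. The time zone, the start date and n.days = 3 are assumptions made only for illustration. It also assumes consecutive calendar days, so it does not skip weekends or holidays. The offsets start at 0 so that the first day itself is included, and the sum is done on plain numbers and converted back, since as.vector() would otherwise drop the POSIXct class.

```r
tz <- "Asia/Kolkata"   # assumed time zone, not stated in the thread

# One day's worth of one-second stamps for the first trading day.
x <- seq(as.POSIXct("2011-01-03 09:15:00", tz = tz),
         as.POSIXct("2011-01-03 15:30:00", tz = tz), by = "sec")

# Day offsets in seconds for consecutive calendar days.
n.days <- 3            # illustrative value
y <- 24 * 60 * 60 * (0:(n.days - 1))

# Rows are times of day, columns are days, so flattening the matrix
# column by column is already in chronological order.
secs <- as.vector(outer(as.numeric(x), y, "+"))
tseq <- as.POSIXct(secs, origin = "1970-01-01", tz = tz)

length(tseq)   # 3 days x 22501 stamps = 67503
range(tseq)
```

With x as the first argument no sort() is needed. Swapping the arguments would interleave the days and make sorting necessary.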

On May 25, 2012, at 1:14 PM, Shivam <shivamsingh at gmail.com> wrote:

Hi All,

I have a query about time-based sequences. I know such questions have been
asked a lot on forums, but I couldn't find the exact thing that I was
looking for.

I want to create a time-based sequence which will mimic the trading window
AND would span multiple days. Something like below:

"2011-01-03 09:15:00 IST"
"2011-01-03 09:15:01 IST"
....
....
....
"2011-01-03 15:29:59 IST"
"2011-01-03 15:30:00 IST"
"2011-01-04 09:15:00 IST"
"2011-01-04 09:15:01 IST"
....
....
....
"2011-01-04 15:29:59 IST"
"2011-01-04 15:30:00 IST"

Kindly notice the change of date in the sequence.

The Indian Equity markets open at 09:15:00 and close at 15:30:00. I have
equity data that spans 124 days, and I need to create a corresponding
sequence which I will later use to regularize the irregular dataset to make
a regular time-series.

I was able to accomplish this task for a single day (i.e. creating a
sequence, then merging my dataset with it and using na.locf to make my
dataset regular) but am unable to create a sequence for 'n' number of days.
Can anyone help me with this?

If it is of any help, I have a file which contains all the dates for which
I need the sequence. The dput of the file is placed at the end of the
email.

One option is to create sequences covering each entire day and then remove
the records outside the trading window after merging. Although I haven't
checked the feasibility of this method, it would be complex and, moreover,
it would increase the data fourfold (I already have 2 million records in
the dataframe which I have to make regular).

Another approach that I could think of was to make a time-based sequence
based on the dates from the file and then use a loop to append one sequence
after another. But I am not having much success there either.

Any kind of help would be greatly appreciated.

Thanks and regards,
Shivam

structure(list("20110103", "20110104", "20110105", "20110106",
   "20110107", "20110110", "20110111", "20110112", "20110113",
   "20110114", "20110117", "20110118", "20110119", "20110120",
   "20110121", "20110124", "20110125", "20110127", "20110128",
   "20110131", "20110201", "20110202", "20110203", "20110204",
   "20110207", "20110208", "20110209", "20110210", "20110211",
   "20110214", "20110215", "20110216", "20110217", "20110218",
   "20110221", "20110222", "20110223", "20110224", "20110225",
   "20110228", "20110301", "20110303", "20110304", "20110307",
   "20110308", "20110309", "20110310", "20110311", "20110314",
   "20110315", "20110316", "20110317", "20110318", "20110321",
   "20110322", "20110323", "20110324", "20110325", "20110328",
   "20110329", "20110330", "20110331", "20110401", "20110404",
   "20110405", "20110406", "20110407", "20110408", "20110411",
   "20110413", "20110415", "20110418", "20110419", "20110420",
   "20110421", "20110425", "20110426", "20110427", "20110428",
   "20110429", "20110502", "20110503", "20110504", "20110505",
   "20110506", "20110509", "20110510", "20110511", "20110512",
   "20110513", "20110516", "20110517", "20110518", "20110519",
   "20110520", "20110523", "20110524", "20110525", "20110526",
   "20110527", "20110530", "20110531", "20110601", "20110602",
   "20110603", "20110606", "20110607", "20110608", "20110609",
   "20110610", "20110613", "20110614", "20110615", "20110616",
   "20110617", "20110620", "20110621", "20110622", "20110623",
   "20110624", "20110627", "20110628", "20110629", "20110630"), .Dim =
c(124L,
1L), .Dimnames = list(c("X1", "X2", "X3", "X4", "X5", "X6", "X7",
"X8", "X9", "X10", "X11", "X12", "X13", "X14", "X15", "X16",
"X17", "X18", "X19", "X20", "X21", "X22", "X23", "X24", "X25",
"X26", "X27", "X28", "X29", "X30", "X31", "X32", "X33", "X34",
"X35", "X36", "X37", "X38", "X39", "X40", "X41", "X42", "X43",
"X44", "X45", "X46", "X47", "X48", "X49", "X50", "X51", "X52",
"X53", "X54", "X55", "X56", "X57", "X58", "X59", "X60", "X61",
"X62", "X63", "X64", "X65", "X66", "X67", "X68", "X69", "X70",
"X71", "X72", "X73", "X74", "X75", "X76", "X77", "X78", "X79",
"X80", "X81", "X82", "X83", "X84", "X85", "X86", "X87", "X88",
"X89", "X90", "X91", "X92", "X93", "X94", "X95", "X96", "X97",
"X98", "X99", "X100", "X101", "X102", "X103", "X104", "X105",
"X106", "X107", "X108", "X109", "X110", "X111", "X112", "X113",
"X114", "X115", "X116", "X117", "X118", "X119", "X120", "X121",
"X122", "X123", "X124"), NULL))

______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Create a minute by minute sequence of datetimes (tseq) from the first
datetime to the last datetime and then extract those datetimes whose
times (tt) lie between the desired times of day:

from <- as.POSIXct("2011-01-03 09:15:00")
to <- as.POSIXct("2011-01-04 15:30:00")
tseq <- seq(from, to, "1 min")

# keep only the stamps inside the 09:15-15:30 trading window
tt <- format(tseq, "%H:%M")
tseq[tt >= "09:15" & tt <= "15:30"]
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com
Try this:

# Setting TZ is optional, but I find it helps me to be more aware of 
# timezone effects
Sys.setenv( TZ="Asia/Kolkata")
library(lubridate)

pdates <- as.POSIXct( fdates )
# new_period() is the lubridate call of the time; later releases use period()
hstart <- new_period( hour=9, minute=15 )
hend <- new_period( hour=15, minute=30 )
mperiod <- new_period( minute=15 )
numperday <- (hend-hstart)/mperiod
dtms <- expand.grid( dt=pdates, tm=hstart + mperiod * seq( from=0, to=numperday ) )
dtms$dtm <- with( dtms, dt + tm )
dtms <- dtms[ order( dtms$dtm ), ]

You can discard everything but dtms$dtm once it has been created.
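
The same expand.grid() idea can be sketched in base R without lubridate, which may be handy if the period arithmetic above does not run on a newer lubridate. This is only an illustration: the time zone, the three-date subset (with a deliberate gap on 2011-01-05) and the one-second step are assumptions, not part of the code above.

```r
tz <- "Asia/Kolkata"                                   # assumed time zone
fdates <- c("2011-01-03", "2011-01-04", "2011-01-06")  # made-up subset with a gap

# Seconds after midnight for every stamp from 09:15:00 to 15:30:00.
open  <- 9 * 3600 + 15 * 60
close <- 15 * 3600 + 30 * 60
tm <- seq(open, close, by = 1)

# Every (date, time-of-day) combination, then a single datetime column.
dtms <- expand.grid(dt = as.POSIXct(fdates, tz = tz), tm = tm)
dtms$dtm <- dtms$dt + dtms$tm
dtms <- dtms[order(dtms$dtm), ]

head(dtms$dtm)
```

Because only the listed dates enter the grid, closed-market days never appear in the result.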

Thanks for the effort Michael, but the problem here is that the dates for
which the sequences need to be created have gaps in between. Basically I
need the sequence for only those days on which the security market is open
(I have the dates in a file which is present at the end of THIS mail).

What I have been able to do is to create a list, where each element of the
list is a sequence for a single day. It was done like below:

library(xts)
seqtimes <- list()
for (i in seq_along(fdates)) {
  seqtimes[[i]] <- xts(, seq(as.POSIXct(paste(fdates[i], "09:15:00")),
                             as.POSIXct(paste(fdates[i], "15:30:00")), by = 1))
}

where 'fdates' is the file which contains the dates for which the sequences
need to be created.

But now I am stuck. I need a way to get all these sequences in a
(vector/dataframe/xts object) where all the list items are sequentially
present.

I tried merge.xts, but to no avail.

seqtimes[[1]]
Data:
numeric(0)

Index:
POSIXct[1:22501], format: "2011-01-03 09:15:00" "2011-01-03 09:15:01"
"2011-01-03 09:15:02" "2011-01-03 09:15:03" "2011-01-03 09:15:04"
"2011-01-03 09:15:05" ...

seqtimes[[2]]
Data:
numeric(0)

Index:
POSIXct[1:22501], format: "2011-01-04 09:15:00" "2011-01-04 09:15:01"
"2011-01-04 09:15:02" "2011-01-04 09:15:03" "2011-01-04 09:15:04"
"2011-01-04 09:15:05" ...

tseq = merge.xts(seqtimes[[1]],seqtimes[[2]], all = TRUE)
tseq
Data:
numeric(0)

Index:
integer(0)

Any help would be greatly appreciated.

Thanks in advance,
Regards,
Shivam

P.S. - The dput of the fdates file:

dput(fdates)
structure(c("2011-01-03", "2011-01-04", "2011-01-05", "2011-01-06",
"2011-01-07", "2011-01-10", "2011-01-11", "2011-01-12", "2011-01-13",
"2011-01-14", "2011-01-17", "2011-01-18", "2011-01-19", "2011-01-20",
"2011-01-21", "2011-01-24", "2011-01-25", "2011-01-27", "2011-01-28",
"2011-01-31", "2011-02-01", "2011-02-02", "2011-02-03", "2011-02-04",
"2011-02-07", "2011-02-08", "2011-02-09", "2011-02-10", "2011-02-11",
"2011-02-14", "2011-02-15", "2011-02-16", "2011-02-17", "2011-02-18",
"2011-02-21", "2011-02-22", "2011-02-23", "2011-02-24", "2011-02-25",
"2011-02-28", "2011-03-01", "2011-03-03", "2011-03-04", "2011-03-07",
"2011-03-08", "2011-03-09", "2011-03-10", "2011-03-11", "2011-03-14",
"2011-03-15", "2011-03-16", "2011-03-17", "2011-03-18", "2011-03-21",
"2011-03-22", "2011-03-23", "2011-03-24", "2011-03-25", "2011-03-28",
"2011-03-29", "2011-03-30", "2011-03-31", "2011-04-01", "2011-04-04",
"2011-04-05", "2011-04-06", "2011-04-07", "2011-04-08", "2011-04-11",
"2011-04-13", "2011-04-15", "2011-04-18", "2011-04-19", "2011-04-20",
"2011-04-21", "2011-04-25", "2011-04-26", "2011-04-27", "2011-04-28",
"2011-04-29", "2011-05-02", "2011-05-03", "2011-05-04", "2011-05-05",
"2011-05-06", "2011-05-09", "2011-05-10", "2011-05-11", "2011-05-12",
"2011-05-13", "2011-05-16", "2011-05-17", "2011-05-18", "2011-05-19",
"2011-05-20", "2011-05-23", "2011-05-24", "2011-05-25", "2011-05-26",
"2011-05-27", "2011-05-30", "2011-05-31", "2011-06-01", "2011-06-02",
"2011-06-03", "2011-06-06", "2011-06-07", "2011-06-08", "2011-06-09",
"2011-06-10", "2011-06-13", "2011-06-14", "2011-06-15", "2011-06-16",
"2011-06-17", "2011-06-20", "2011-06-21", "2011-06-22", "2011-06-23",
"2011-06-24", "2011-06-27", "2011-06-28", "2011-06-29", "2011-06-30"
), .Dim = c(124L, 1L))
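
One possible way to get the per-day sequences into a single vector is to build them as plain POSIXct vectors and concatenate them with c(). The sketch below is an assumption-laden illustration rather than a tested answer to the xts question: the time zone, the helper name one_day and the three-date subset are all made up for the example.

```r
tz <- "Asia/Kolkata"                                   # assumed time zone
fdates <- c("2011-01-03", "2011-01-04", "2011-01-06")  # made-up subset of the dates

# Hypothetical helper: one-second stamps for one trading day.
one_day <- function(d) {
  seq(as.POSIXct(paste(d, "09:15:00"), tz = tz),
      as.POSIXct(paste(d, "15:30:00"), tz = tz), by = 1)
}

seqtimes <- lapply(fdates, one_day)  # list with one POSIXct vector per day
tseq <- do.call(c, seqtimes)         # concatenate in list order
attr(tseq, "tzone") <- tz            # older R versions drop the time zone in c()

length(tseq)   # 3 days x 22501 stamps = 67503
```

If an xts object is needed afterwards, something like xts(order.by = tseq) should give a zero-width series to merge the data against.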


---------------------------------------------------------------------------
Jeff Newmiller                        The     .....       .....  Go Live...
DCN:<jdnewmil at dcn.davis.ca.us>        Basics: ##.#.       ##.#.  Live Go...
                                       Live:   OO#.. Dead: OO#..  Playing
Research Engineer (Solar/Batteries            O.O#.       #.O#.  with
/Software/Embedded Controllers)               .OO#.       .OO#.  rocks...1k
Thanks for the responses ppl.

@Gabor - The issue with your approach was that I had to select the time
window for many days (124), which would be very difficult to achieve. I
really appreciate your time though.

Why does the number of days "make it difficult to achieve"?  The
number of days does not affect the code at all.  Is there some aspect
of the problem you haven't mentioned?

Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com
On Sun, May 27, 2012 at 8:01 PM, Shivam <shivamsingh at gmail.com> wrote:

It's not the number of days per se; it is the random gaps between the dates
(corresponding to the dates on which the security market was closed) that
will be difficult to accommodate in the solution you proposed. I would
have to remove the sequences corresponding to those days from the entire
sequence. This was the part which I deemed difficult to achieve.

I had mentioned this issue in my previous mails but you might have missed
it.

If dd is a vector of the dates you want then just change the last line
to choose only those using as.Date(tseq, tz = "") %in% dd as below:

dd <- as.Date(c("2011-01-03", "2011-01-04")) ##

from <- as.POSIXct(paste(dd[1], "09:15:00")) ##
to <- as.POSIXct(paste(tail(dd, 1), "15:30:00")) ##

tseq <- seq(from, to, "1 min")

tt <- format(tseq, "%H:%M:%S")
tresult <- tseq[tt >= "09:15:00" & tt <= "15:30:00" &
                as.Date(tseq, tz = "") %in% dd] ##
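
A hedged variant of the filter above at one-second resolution is sketched below. The explicit time zone and the three-date subset (with a gap on 2011-01-05) are assumptions for illustration. The calendar date is compared as text via format(), which sidesteps the time-zone argument of as.Date().

```r
tz <- "Asia/Kolkata"                                 # assumed time zone
dd <- c("2011-01-03", "2011-01-04", "2011-01-06")    # made-up subset with a gap

from <- as.POSIXct(paste(dd[1], "09:15:00"), tz = tz)
to   <- as.POSIXct(paste(tail(dd, 1), "15:30:00"), tz = tz)
tseq <- seq(from, to, by = "1 sec")                  # every second, all days

tt  <- format(tseq, "%H:%M:%S")                      # time of day as text
day <- format(tseq, "%Y-%m-%d")                      # calendar date as text

# Keep stamps inside the trading window on the wanted dates only.
tresult <- tseq[tt >= "09:15:00" & tt <= "15:30:00" & day %in% dd]

length(tresult)   # 3 days x 22501 stamps = 67503
```

Note that this builds the full second-by-second sequence before filtering, so the intermediate vectors are several times larger than the result.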
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com
Depending on your exchange of interest, you might also find some of
the functions of the timeDate package helpful, e.g., holidayNYSE() --
it will miss the day the market was closed for extraordinary
circumstances, but it seems to do a very good job. [Disclaimer: I
haven't used it myself extensively]

Michael
