I am having trouble with non-unique time stamps in an xts.
My underlying data has some repeated rows (in a csv file).
How can I easily get rid of the duplicates?
I feel I must be missing something simple. If not I can concoct an
example to illustrate my problem.
cheers
Worik
On Monday, January 31, 2011 12:55:03 am Worik wrote:
> I am having trouble with non-unique time stamps in an xts.
> My underlying data has some repeated rows (in a csv file).
> How can I easily get rid of the duplicates?
> I feel I must be missing something simple. If not I can concoct an
> example to illustrate my problem.
Worik,
It depends on what you need.
If you can remove the rows with duplicated indices, then a construction such
as:
    myxts <- myxts[!duplicated(index(myxts))]
should work.
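A minimal sketch of this first approach, assuming the xts package is installed. The object name `x`, the timestamps, and the values are made up for illustration; only `duplicated()` and `index()` come from the construction above.

```r
library(xts)  # also attaches zoo, which provides index()

# Five observations; the first and the last timestamp each occur twice
idx <- as.POSIXct("2011-01-31 09:30:00", tz = "UTC") + c(0, 0, 1, 2, 2)
x <- xts(1:5, order.by = idx)

# duplicated() flags every repeat after the first occurrence,
# so negating it keeps the first row seen for each timestamp
dedup <- x[!duplicated(index(x))]

nrow(x)      # 5
nrow(dedup)  # 3
```

Note that this drops rows purely on the timestamp, so two rows with the same time but different values are also reduced to one.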
If you need all of the observations, and need to artificially make them unique
(as is a common problem with tick data), then you will see discussion in the
list archives here and other places regarding adding artificial indices to
high-frequency data while preserving order. You will need the latest xts from
R-Forge and use a construction like this:
    myxts <- make.unique.index(myxts)
which will (by default) add .00001 sec to each non-unique index after the
first, preserving order, and providing every observation with a unique index.
Note that this presumes that the original order of the observations was
correct in the first place; no provision has been made if you have different
circumstances.
Thanks to Jeff Ryan for (very) recently adding this second method.
Regards,
- Brian
Brian, Worik
w.r.t the new functionality in xts.
It is so bleeding edge that Brian gave you the wrong name ;-) think
"make [the] index unique". It probably will also be extended to do
the former removal of subsequent non-unique observations/times as
well.
HTH,
Jeff
?make.index.unique

make.index.unique             package:xts              R Documentation

Force Time Values To Be Unique

Description:

     A generic function to force sorted time vectors to be unique.
     Useful for high-frequency time-series where original time-stamps
     may have identical values. For the case of xts objects, the
     default 'eps' is set to ten microseconds. In practice this
     advances each subsequent identical time by 'eps' over the previous
     (possibly also advanced) value.

Usage:

     make.index.unique(x, eps = 1e-05, ...)

     make.time.unique(x, eps = 1e-05, ...)

Arguments:

       x: An xts object, or POSIXct vector.

     eps: value to add to force uniqueness.

     ...: unused

Details:

     The returned time-series object will have new time-stamps so that
     'isOrdered( .index(x) )' evaluates to TRUE.

Value:

     A modified version of x.

Note:

     Incoming values must be pre-sorted, and no check is done to make
     sure that this is the case. If the index values are of
     storage.mode 'integer', they will be coerced to 'double'.

Author(s):

     Jeffrey A. Ryan

See Also:

     'align.time'

Examples:

     ds <- options(digits.secs=6) # so we can see the change

     x <- xts(1:10, as.POSIXct("2011-01-21") + c(1,1,1,2:8)/1e3)
     x
     make.index.unique(x)

     options(ds)
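
To make the "advances each subsequent identical time by eps" wording concrete, here is a rough base-R sketch of that rule applied to plain numeric seconds. `bump_duplicates` is a hypothetical helper written only for illustration; it is not the xts implementation, and it assumes the input is already sorted, as the Note above requires.

```r
# Hypothetical helper (not part of xts): walk a sorted vector of seconds and
# push any value that does not exceed its predecessor to predecessor + eps.
bump_duplicates <- function(secs, eps = 1e-5) {
  stopifnot(is.numeric(secs), !is.unsorted(secs))
  for (i in seq_along(secs)[-1]) {
    if (secs[i] <= secs[i - 1]) {
      # the predecessor may itself have been advanced already
      secs[i] <- secs[i - 1] + eps
    }
  }
  secs
}

# Same timestamps as the help page example, as seconds since the epoch
t0   <- as.numeric(as.POSIXct("2011-01-21", tz = "UTC"))
secs <- t0 + c(1, 1, 1, 2:8) / 1e3

fixed <- bump_duplicates(secs)
any(duplicated(secs))   # TRUE: the first three values are identical
any(duplicated(fixed))  # FALSE
all(diff(fixed) > 0)    # TRUE: strictly increasing
```

Because the comparison is against the already-adjusted predecessor, a run of three identical stamps ends up at t, t + eps and t + 2*eps rather than all landing on t + eps.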
_______________________________________________
R-SIG-Finance at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-sig-finance
-- Subscriber-posting only. If you want to post, subscribe first.
-- Also note that this is not the r-help list where general R questions should go.
Jeffrey Ryan
jeffrey.ryan at lemnica.com
www.lemnica.com
On 31 January 2011 at 09:37, Jeffrey Ryan wrote:
| Usage:
|
| make.index.unique(x, eps = 1e-05, ...)
|
| make.time.unique(x, eps = 1e-05, ...)
Why eps=1e-05? I wrote variants of this in-house and use
incrementTimestamps <- function(times, incr=1.0e-6, ...) {
...
}
Dirk
On Mon, Jan 31, 2011 at 10:09 AM, Dirk Eddelbuettel <edd at debian.org> wrote:
> Why eps=1e-05? I wrote variants of this in-house and use
>
>    incrementTimestamps <- function(times, incr=1.0e-6, ...) {
>       ...
>    }
Why not. ;-)
On the topic of semantics, why 1.0e-6: 1.0 is redundant, given that a
negative exponent will assure you a mode of "double".
All kidding aside, Brian told me to. And it is user settable of course.
Cuique suum
Jeff
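
The point about the literal can be checked directly in R. This is only a small illustration; as far as base R goes, any numeric literal without an `L` suffix is stored as a double, whatever the sign of the exponent.

```r
# Both spellings denote the same double value
identical(1.0e-6, 1e-6)  # TRUE

typeof(1e-6)  # "double"
typeof(1e6)   # "double" as well: the exponent sign does not matter
typeof(1L)    # "integer": only the L suffix gives an integer literal
```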
On 01/31/2011 10:09 AM, Dirk Eddelbuettel wrote:
> Why eps=1e-05? I wrote variants of this in-house and use
>
>    incrementTimestamps <- function(times, incr=1.0e-6, ...) {
>       ...
>    }

On 01/31/2011 10:23 AM, Jeffrey Ryan wrote:
> Why not. ;-)
>
> On the topic of semantics, why 1.0e-6: 1.0 is redundant, given that a
> negative exponent will assure you a mode of "double".
I recall (hazy, I suspect there was beer involved) that we did some
testing a year or so ago on multiple different R installations and found
that 1e-5 was reliable even on 32 bit architectures, and 1e-6 was
(potentially) in the precision wiggle. My recollection could easily be
incorrect, and I'm sure that any precision issue like this is highly
compiler dependent.
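
A rough way to see why 1e-6 sits closer to the precision limit than 1e-5, assuming POSIXct times are held as IEEE 754 doubles counting seconds since 1970. The date and the variable names are chosen only for illustration, and the exact behaviour on a given platform may differ, as noted above.

```r
# POSIXct stores seconds since 1970-01-01 as a double
t0 <- as.numeric(as.POSIXct("2011-01-31", tz = "UTC"))

# Gap between adjacent representable doubles near t0 (53-bit significand)
ulp <- 2^(floor(log2(t0)) - 52)

for (eps in c(1e-5, 1e-6)) {
  stored <- (t0 + eps) - t0  # the increment that survives rounding
  cat(sprintf("eps = %g  stored = %.4g  relative error = %.2f%%\n",
              eps, stored, 100 * abs(stored - eps) / eps))
}
cat(sprintf("spacing of doubles near t0 = %.4g seconds\n", ulp))
```

For dates around 2011 the spacing works out to 2^-22 seconds, roughly 0.24 microseconds, so an increment of 1e-6 is only about four representable steps while 1e-5 is about forty-two.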
As µs (microsecond, 1e-6) timestamped data becomes increasingly
available, this functionality should become less necessary. For now,
I'm happy that it's in xts.
Cheers,
- Brian
Jeff,
When you do extend it to remove duplicates, you might want to
give a choice whether to keep the first or last duplicated value.
I know I've used both for various reasons.
Thanks for such a great set of packages!
-- David
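
Until such an option exists, the choice between keeping the first or the last duplicate can be sketched with the `fromLast` argument of base R's `duplicated()`. The vectors `idx` and `val` below are made-up stand-ins for an index and its data; the same logical mask could presumably be used to subset an xts object, as in myxts[!duplicated(index(myxts), fromLast = TRUE)].

```r
# Made-up timestamps with repeats, and one value per observation
idx <- as.POSIXct("2011-01-31 09:30:00", tz = "UTC") + c(0, 0, 1, 2, 2, 2)
val <- c(10, 11, 12, 13, 14, 15)

keep_first <- !duplicated(idx)                   # first row per timestamp
keep_last  <- !duplicated(idx, fromLast = TRUE)  # last row per timestamp

val[keep_first]  # 10 12 13
val[keep_last]   # 11 12 15
```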