RTAQ - convert function: warning causes incorrect loading of data
Hello,

Thanks for looking into it. What process do people usually follow to fix bugs in RTAQ? It took me a while to realise what was wrong, so it would be great if we could address it so that others don't run into the same problem.

Best,
Nicolae

On Sat, 13 Oct 2012 12:33:01 -0500, Jeff Ryan <jeff.a.ryan at gmail.com> wrote:
FWIW %m is the proper conversion for months. %M is minutes. Looks like a bug.

Jeffrey Ryan | Founder | jeffrey.ryan at lemnica.com
www.lemnica.com

On Oct 13, 2012, at 10:33 AM, Nicolae Caprarescu <caprarn9 at cs.man.ac.uk> wrote:
Hi Michael,

Thank you for pointing me in the right direction; I'm now using an email client rather than Nabble.

The issue I described below is now resolved; I managed to fix it myself. However, I believe this might be a bug, or at least something that needs improving. I have described how to reproduce the issue, and its solution, in the 4 steps below:

1) library(RTAQ)

2) Create an XXX_trades.csv file with the contents below, using a relative path like [somewhere]/TAQData/2010-11-01/XXX_trades.csv

SYMBOL,DATE,TIME,PRICE,SIZE,G127,CORR,COND,EX
XXX,20101101,10:30:00,11.49,500,0,0,@,B
XXX,20101101,10:30:02,11.49,322,0,0,0,B
XXX,20101101,10:30:02,11.49,178,0,0,@,B
XXX,20101101,10:30:03,11.49,500,0,0,@,B
XXX,20101101,10:30:03,11.49,187,0,0,@,D

3) #convert does not generate any errors/warnings, however it does not work properly
convert(from="2010-11-01", to="2010-11-01",datasource="[somewhere]/TAQData/",
datadestination="[somewhere]/TAQDataRData/",trades=T,quotes=F,ticker="XXX",dir=T,
extention="csv", header=T, tradecolnames=c("SYMBOL", "DATE", "TIME",
"PRICE", "SIZE", "G127", "CORR", "COND", "EX"))
#loading the RData created by convert
TAQLoad("XXX",from="2010-11-01",to="2010-11-01",datasource="[somewhere]TAQDataRData/",
trades=T,quotes=F)
#output of TAQLoad
     SYMBOL EX  PRICE   SIZE  COND CORR G127
<NA> "XXX"  "B" "11.49" "500" "@"  "0"  "0"
<NA> "XXX"  "B" "11.49" "322" "0"  "0"  "0"
<NA> "XXX"  "B" "11.49" "178" "@"  "0"  "0"
<NA> "XXX"  "B" "11.49" "500" "@"  "0"  "0"
<NA> "XXX"  "D" "11.49" "187" "@"  "0"  "0"
Warning message:
timezone of object (GMT) is different than current timezone ().
The problem is the <NA>s. If one does not supply the format of date and time
to the convert function, it is assumed that the standard NYSE format is used,
and therefore RTAQ internally (convert_to_RData.r line 32) represents this as
"%Y%M%D %H:%M:%S". Whilst this works fine for some things, it does not work
when a timeDate is initialised using this format (convert_to_RData.r line 102).
timeDate expects a correct format like "%Y%m%d %H:%M:%S" rather than
"%Y%M%D %H:%M:%S".
Run the two calls below to confirm:
tdobject=timeDate:::timeDate(paste(as.vector("2010-10-11"),
as.vector("10:30:30")), format="%Y%M%D %H:%M:%S",FinCenter="GMT",zone="GMT")
#tdobject is GMT [1] [NA]
tdobject=timeDate:::timeDate(paste(as.vector("20101011"),
as.vector("10:30:30")), format="%Y%m%d %H:%M:%S",FinCenter="GMT",zone="GMT")
#tdobject is now GMT [1] [2010-10-11 10:30:30]
Therefore, if one explicitly includes format="%Y%m%d %H:%M:%S" in the
convert function, everything works fine and the <NA> problem above is
solved; this is my solution. Can I please suggest that, once you
investigate this and provided that you confirm my understanding,
convert_to_RData.r is changed in order to use "%Y%m%d %H:%M:%S" as the
default format?
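
The difference between the two format strings can also be checked without RTAQ or timeDate. The sketch below is only an illustration in base R: it uses strptime as a stand-in (an assumption; it is not what convert_to_RData.r calls), takes two of the sample rows from step 2, and the variable names are made up for the example.

```r
# Two of the sample trade rows from step 2, read as character columns.
csv_text <- "SYMBOL,DATE,TIME,PRICE,SIZE,G127,CORR,COND,EX
XXX,20101101,10:30:00,11.49,500,0,0,@,B
XXX,20101101,10:30:02,11.49,322,0,0,0,B"
trades <- read.csv(text = csv_text, colClasses = "character")

# Build "YYYYMMDD HH:MM:SS" strings from the DATE and TIME columns.
stamp <- paste(trades$DATE, trades$TIME)

# %M is minutes and %D is shorthand for %m/%d/%y, so this format cannot
# match the input and every timestamp comes back as NA.
bad <- as.POSIXct(strptime(stamp, format = "%Y%M%D %H:%M:%S", tz = "GMT"))

# %m (month) and %d (day of month) match the NYSE-style date.
good <- as.POSIXct(strptime(stamp, format = "%Y%m%d %H:%M:%S", tz = "GMT"))

print(bad)
print(good)
```

With RTAQ itself, the workaround is the one described above: pass format="%Y%m%d %H:%M:%S" explicitly to convert.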
4) My environment:
R version 2.15.1 (2012-06-22)
Platform: i686-pc-linux-gnu (32-bit)
locale:
[1] LC_CTYPE=en_GB LC_NUMERIC=C LC_TIME=en_GB
[4] LC_COLLATE=C LC_MONETARY=en_GB LC_MESSAGES=en_GB
[7] LC_PAPER=C LC_NAME=C LC_ADDRESS=C
[10] LC_TELEPHONE=C LC_MEASUREMENT=en_GB LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] RTAQ_0.2 timeDate_2160.97 xts_0.8-6 zoo_1.7-8
loaded via a namespace (and not attached):
[1] grid_2.15.1 lattice_0.20-6

Best wishes,
Nicolae

On Fri, 12 Oct 2012 21:52:22 +0100, "R. Michael Weylandt" <michael.weylandt at gmail.com> wrote:
I'm forwarding this to the R-SIG-Finance list, where you'll have a more specialized audience. In the meanwhile, you may wish to look at
http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example

Finally, I note you're posting from Nabble. Please do include context in your reply -- I don't believe Nabble does this automatically, so you'll need to manually include it. Most of the regular respondents on these lists don't use Nabble -- it is a _mailing list_ after all -- so we don't get the forum view you do, only emails of the individual posts. Combine that with the high volume of posts, and it's quite difficult to trace a discussion if we all don't make sure to include context.

Cheers,
Michael

On Fri, Oct 12, 2012 at 7:01 PM, caprarn9 <caprarn9 at cs.man.ac.uk> wrote:
Hello,

I am closely following the RTAQ documentation in order to load my dataset
into R, however I get this warning when running the convert function in the
following way:

convert(from="2010-11-01", to="2010-11-01",datasource=datasource,
datadestination=datadestination,trades=T,quotes=T,ticker="BAC",dir=T,
extention="csv", header=T, tradecolnames=c("SYMBOL", "DATE", "TIME",
"PRICE", "SIZE", "G127", "CORR", "COND", "EX"),
quotecolnames=c("SYMBOL",
"DATE", "TIME", "BID", "OFR", "BIDSIZ", "OFRSIZ", "MODE", "EX"))

The only warning returned is:

In `[<-.factor`(`*tmp*`, is.na(tdata$G127), value = c(1L, 1L, 1L, :
invalid factor level, NAs generated

As it is a warning, the .RData files still get created and I can use TAQLoad
to load them:

x <- TAQLoad("BAC",from="2010-11-01",to="2010-11-01",datasource=datadestination,
trades=T,quotes=T)
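
For reference, a warning of this kind can be reproduced in base R. The sketch below assumes that G127 was read in as a factor (the `[<-.factor` in the message suggests so); the vector and the replacement value "T" are made up for illustration and are not RTAQ's actual code.

```r
# Hypothetical stand-in for the G127 column: a factor with one missing value.
g127 <- factor(c("0", NA, "0"))

# Assigning a value that is not an existing level raises the
# "invalid factor level" warning and leaves the NA in place.
warning_text <- NULL
withCallingHandlers(
  {
    g127[is.na(g127)] <- "T"
  },
  warning = function(w) {
    warning_text <<- conditionMessage(w)  # keep the message for inspection
    invokeRestart("muffleWarning")
  }
)
print(warning_text)
print(g127)

# One way around it: work on a character vector, which has no fixed levels.
g127_chr <- as.character(g127)
g127_chr[is.na(g127_chr)] <- "T"
print(g127_chr)
```

The point is only that the warning concerns the G127 column and says nothing about how the DATE and TIME columns were parsed.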
The PROBLEM:
head(x)
     SYMBOL EX  PRICE     SIZE   COND CORR G127
<NA> "BAC"  "B" "11.4900" " 500" "@"  "0"  "0"
...
This is the same for the quotes objects, but with different headers,
obviously. I get a <NA> instead of the expected YYYY-MM-DD HH:MM:SS
timestamp for each observation.

I've spent a fair number of hours trying to get this right, with no success.
Can you please provide me with some guidance?

Thank you.
A sample from the CSV files I use:
SYMBOL,DATE,TIME,BID,OFR,BIDSIZ,OFRSIZ,MODE,EX
BAC,20101101,9:30:00,11.5,11.51,5,116,12,P
...
SYMBOL,DATE,TIME,PRICE,SIZE,G127,CORR,COND,EX
BAC,20101101,10:30:00,11.49,500,0,0,@,B
...