Yahoo hasn't done a stellar job with the OSI initiative as far as I
can tell. They seem to be getting better, but as Marc points out, it
is far from perfect.
You could possibly go from right to left to avoid the symbol width
issue (which *should* be 6 wide).
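A minimal sketch of the right-to-left idea, assuming the tail of every symbol is fixed-width (6 date digits, one type letter, 8 strike digits in thousandths) as in the symbols quoted in this thread. The function name parse_osi_from_right is hypothetical, not part of quantmod:

```r
# Hypothetical helper: parse option symbols from the right, so that the
# width of the leading ticker part does not matter.
parse_osi_from_right <- function(sym) {
  n <- nchar(sym)
  tail15 <- substr(sym, n - 14, n)           # yymmdd + C/P + 8 strike digits
  data.frame(
    root   = substr(sym, 1, n - 15),         # whatever precedes the tail
    expiry = as.Date(substr(tail15, 1, 6), format = "%y%m%d"),
    type   = substr(tail15, 7, 7),
    strike = as.numeric(substr(tail15, 8, 15)) / 1000,
    stringsAsFactors = FALSE
  )
}

parse_osi_from_right(c("BHI1101016P00017000", "BHI101016P00020000"))
```

Both BHI symbols should come back with the same expiry even though their leading parts differ in width.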
One other comment, though: if you are just looking to get days to
expiry, you already specify the expiration in the call that retrieves
the chains, so all the parsing isn't really needed for that part.
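For illustration, a sketch that computes days to expiry directly from an expiration date that is already known, with no symbol parsing. The helper days_to_expiry is a hypothetical name, and it assumes the expiration is available as a full date (or a "yyyy-mm-dd" string):

```r
# Hypothetical helper: calendar days from 'today' until a known expiration.
days_to_expiry <- function(expiry, today = Sys.Date()) {
  as.numeric(as.Date(expiry) - as.Date(today))
}

days_to_expiry("2010-09-18", today = "2010-09-06")   # 12
```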
Best,
Jeff
On Mon, Sep 6, 2010 at 10:29 PM, Marc Delvaux <mdelvaux at gmail.com> wrote:
Just a final word of caution if you automate this procedure across a large
number of stocks. From time to time, you will find option symbols that
deviate from the standard format, i.e. where there are more than 6
characters between the stock ticker and the option type symbol. Code
similar to the one presented by Gabor was failing for me for some stocks
because of that. Currently I use a brute-force approach: I remove all of
these, as they would typically also pollute other calculations such as the
implied volatility. A current example of this type of problem is the
October expiration for BHI, see below. In my approach, I remove all rows
where the number of characters is strictly more than the minimum for that
expiration.
Options <- getOptionChain("BHI",Exp=NULL)
rownames(Options[[2]]$puts)
 [1] "BHI1101016P00017000" "BHI1101016P00018000" "BHI1101016P00019000" "BHI101016P00020000"
 [5] "BHI1101016P00020000" "BHI1101016P00021000" "BHI1101016P00022000" "BHI101016P00022500"
 [9] "BHI1101016P00023000" "BHI1101016P00024000" "BHI101016P00025000"  "BHI1101016P00025000"
[13] "BHI1101016P00026000" "BHI101016P00030000"  "BHI101016P00034000"  "BHI101016P00035000"
[17] "BHI101016P00036000"  "BHI101016P00037000"  "BHI101016P00038000"  "BHI101016P00039000"
[21] "BHI101016P00040000"  "BHI101016P00041000"  "BHI101016P00042000"  "BHI101016P00043000"
[25] "BHI101016P00044000"  "BHI101016P00045000"  "BHI101016P00046000"  "BHI101016P00047000"
[29] "BHI101016P00048000"  "BHI101016P00049000"  "BHI101016P00050000"  "BHI101016P00055000"
[33] "BHI101016P00060000"  "BHI101016P00065000"
nchar(rownames(Options[[2]]$puts))
 [1] 19 19 19 18 19 19 19 18 19 19 18 19 19 18 18 18 18 18 18 18 18 18 18 18 18 18 18 18 18
[30] 18 18 18 18 18
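The brute-force filter described above could be sketched roughly as follows. The function name drop_nonstandard is hypothetical, and the sketch assumes it is applied to the calls or puts table of a single expiration, with the option symbols as row names:

```r
# Hypothetical helper: keep only rows whose symbol has the minimum length
# seen in this table, i.e. drop rows with strictly more characters.
drop_nonstandard <- function(chain) {
  len <- nchar(rownames(chain))
  chain[len == min(len), , drop = FALSE]
}

# Small stand-in for Options[[2]]$puts
puts <- data.frame(
  Strike = c(19, 20, 20),
  row.names = c("BHI1101016P00019000", "BHI101016P00020000",
                "BHI1101016P00020000")
)
drop_nonstandard(puts)
```

Note that this relies on at least one standard-length symbol being present for the expiration; if every row is non-standard, nothing gets removed.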
On Mon, Sep 6, 2010 at 6:45 PM, Gabor Grothendieck
<ggrothendieck at gmail.com> wrote:
On Mon, Sep 6, 2010 at 8:37 PM, rex <rex at nosyntax.net> wrote:
rex <rex at nosyntax.net> [2010-09-06 16:11]:
allOpts <- getOptionChain("AAPL", Exp=optExpire)
allOpts
$calls
                    Strike   Last   Chg    Bid    Ask   Vol    OI
AAPL100918C00150000    150 108.50 16.50 106.80 108.85     3    13
AAPL100918C00155000    155  96.77  0.00 102.40 103.85     4    10
AAPL100918C00160000    160  95.75  4.50  97.40  98.85    10    30
[...]
$puts
                    Strike  Last   Chg   Bid   Ask   Vol    OI
AAPL100918P00150000    150  0.01  0.00    NA  0.01     6   876
AAPL100918P00155000    155  0.02  0.00    NA  0.01    30   666
AAPL100918P00160000    160  0.02  0.00    NA  0.01    79  1535
[...]
$symbol
[1] "AAPL"
The obvious thing fails to produce the desired result:
[1]  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24
What I need are the indexes of both $calls and $puts as a set of
strings that can be split, etc. (I need the expiration date of the
options as a Date to be used to calculate the days to expiration.)
A solution (fugly, for sure):
[[1]]
 [1] "AAPL100918P00150000" "AAPL100918P00155000" "AAPL100918P00160000"
 [4] "AAPL100918P00165000" "AAPL100918P00170000" "AAPL100918P00175000"
 [7] "AAPL100918P00180000" "AAPL100918P00185000" "AAPL100918P00190000"
[10] "AAPL100918P00195000" "AAPL100918P00200000" "AAPL100918P00210000"
[13] "AAPL100918P00220000" "AAPL100918P00230000" "AAPL100918P00240000"
[16] "AAPL100918P00250000" "AAPL100918P00260000" "AAPL100918P00270000"
[19] "AAPL100918P00280000" "AAPL100918P00290000" "AAPL100918P00300000"
[22] "AAPL100918P00310000" "AAPL100918P00320000" "AAPL100918P00330000"
[[2]]
[1] "Strike" "Last"   "Chg"    "Bid"    "Ask"    "Vol"    "OI"
id <- dimnames(allOpts$puts)[1]
id
[[1]]
 [1] "AAPL100918P00150000" "AAPL100918P00155000" "AAPL100918P00160000"
 [4] "AAPL100918P00165000" "AAPL100918P00170000" "AAPL100918P00175000"
 [7] "AAPL100918P00180000" "AAPL100918P00185000" "AAPL100918P00190000"
[10] "AAPL100918P00195000" "AAPL100918P00200000" "AAPL100918P00210000"
[13] "AAPL100918P00220000" "AAPL100918P00230000" "AAPL100918P00240000"
[16] "AAPL100918P00250000" "AAPL100918P00260000" "AAPL100918P00270000"
[19] "AAPL100918P00280000" "AAPL100918P00290000" "AAPL100918P00300000"
[22] "AAPL100918P00310000" "AAPL100918P00320000" "AAPL100918P00330000"
[1] "AAPL100918P00155000"
dat <- paste("20", substr(id2[2], 5, 10), sep="")
expDate <- as.Date(paste(substr(dat,1,4), "-", substr(dat,5,6), "-",
substr(dat,7,8), sep=""))
expDate
[1] "2010-09-18"
The above has got to be about the most convoluted and arcane
method to get the expiration date one can imagine.
Here are a few approaches and variations:
x <- rep( "AAPL100918P00155000", 3)
# 1 - gsub
as.Date(gsub("^\\D+|P.*", "", x), "%y%m%d")
[1] "2010-09-18" "2010-09-18" "2010-09-18"
# 2 - rbind and strsplit
as.Date(do.call(rbind, strsplit(x, "\\D+"))[,2], "%y%m%d")
[1] "2010-09-18" "2010-09-18" "2010-09-18"
# 3 - sapply and strsplit
as.Date(sapply(strsplit(x, "\\D+"), "[[", 2), "%y%m%d")
[1] "2010-09-18" "2010-09-18" "2010-09-18"
# 4 - strapply
library(gsubfn)
strapply(x, "(\\d+)P", ~ as.Date(x, "%y%m%d"), simplify = c)
[1] "2010-09-18" "2010-09-18" "2010-09-18"
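A further variation, offered only as a sketch: the examples above all use a put symbol, and the "P" in patterns 1 and 4 ties them to puts. Assuming the fixed-width tail seen in the symbols in this thread (6 date digits, C or P, 8 strike digits), a single pattern anchored at the end of the string should cover calls and puts alike, as well as symbols with an extra character after the ticker. The vector ids is made-up stand-in data for the row names of both tables:

```r
# Stand-in for c(rownames(allOpts$calls), rownames(allOpts$puts))
ids <- c("AAPL100918C00150000", "AAPL100918P00150000",
         "BHI1101016P00017000")

# Capture the 6 date digits that sit just before the C/P flag and the
# 8-digit strike at the end of the symbol.
expiry <- as.Date(sub("^.*([0-9]{6})[CP][0-9]{8}$", "\\1", ids),
                  format = "%y%m%d")
expiry
```

A symbol that does not end in this tail is returned unchanged by sub(), so it is worth checking for NA dates afterwards.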
--
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com