cointegration
Stephen,
It depends what you mean by "logic".
If you mean statistical logic, I'll defer to Eric Zivot and Sarbo who are far wiser than I am. I will note, however, that you are testing at a significance level of 0.05, so even among pairs that are not cointegrated at all I expect about 5% to pass the test by chance. In other words, for every 20 unrelated pairs tested by your batch job, I expect about one false positive.
"Spurious cointegration" is a serious problem. I suggest Googling that topic. You may be surprised what you learn. (The irony, of course, is that cointegration was supposed to cure "spurious correlation." Oh well.)
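A rough way to see the size of the problem is to run the same procedure on pairs that are unrelated by construction. The sketch below simulates independent random walks, applies the regression-through-the-origin hedge ratio and PP.test used in the code quoted further down this thread, and counts how many pairs pass at 0.05. The helper name, sample sizes and seed are illustrative assumptions, not anything from the original posts.

```r
# Sketch: how often do unrelated random walks pass the batch test?
# Hypothetical helper; n_pairs, n_obs and the seed are arbitrary choices.
false_positive_rate <- function(n_pairs = 200, n_obs = 500, alpha = 0.05) {
  p_values <- vapply(seq_len(n_pairs), function(i) {
    # Two independent random walks: no cointegration by construction
    x <- cumsum(rnorm(n_obs))
    y <- cumsum(rnorm(n_obs))
    # Hedge ratio from a regression through the origin, as in buildLM()
    beta <- as.double(coef(lm(y ~ x + 0))[1])
    sprd <- y - beta * x
    # Same unit-root test as testCoint()
    as.double(PP.test(sprd, lshort = FALSE)$p.value)
  }, numeric(1))
  list(p_values = p_values, rate = mean(p_values < alpha))
}

set.seed(42)
res <- false_positive_rate()
cat("Share of unrelated pairs flagged as mean-reverting:", res$rate, "\n")
```

Because the hedge ratio is estimated before the unit-root test is run, the share need not equal the nominal 5%. Residual-based cointegration tests normally come with their own critical values (Engle-Granger, Phillips-Ouliaris; for example po.test in the tseries package).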
If you mean financial logic, I strongly suggest not blindly risking money on your statistical test. Some filtering is required. Look for trades that make sense.
For example, my software reports that the stocks of MSFT and GOOG form a mean-reverting pair. But I would not trade that spread: too much idiosyncratic risk. My software also reports that Corn futures and Soybean Oil futures form a mean-reverting pair. But I would not trade that spread because the economic connection between corn and bean oil is too weak.
Hope that helps.
Paul
_____
From: r-sig-finance-bounces at stat.math.ethz.ch [mailto:r-sig-finance-bounces at stat.math.ethz.ch] On Behalf Of Stephen Choularton
Sent: Monday, October 18, 2010 9:46 PM
To: r-sig-finance at stat.math.ethz.ch
Subject: [R-SIG-Finance] cointegration
Hi Folks
I'm using this to find cointegrated stocks on the AX.
library(xts)
library(quantmod)

# quickly re-source this file
s <- function() source('meanrev.R')

checkPairFromYahoo <- function(sym1, sym2, dateFilter='::')
{
    t.xts <- getCombined(sym1, sym2, dateFilter=dateFilter)
    cat("Date range is", format(start(t.xts)), "to", format(end(t.xts)), "\n")
    # Build linear model
    m <- buildLM(t.xts)
    # Note beta -- http://en.wikipedia.org/wiki/Beta_(finance)
    beta <- getBeta(m)
    cat("Assumed hedge ratio is", beta, "\n")
    # Build spread
    sprd <- buildSpread(t.xts, beta)
    # Test cointegration
    ht <- testCoint(sprd)
    cat("PP p-value is", as.double(ht$p.value), "\n")
    if (as.double(ht$p.value) < 0.05)
    {
        cat("###############################################################\n", sym1 ,":", sym2 ," is likely mean-reverting.\n", "###########################################################\n" )
    }
    else
    {
        #cat(sym1 ,":", sym2 ," is not mean-reverting.\n")
    }
}

getCombined <- function(sym1, sym2, dateFilter='::')
{
    # Grab historical data for both symbols
    one <- getSymbols(sym1, auto.assign=FALSE)
    two <- getSymbols(sym2, auto.assign=FALSE)
    # Give columns more usable names
    colnames(one) <- c('Open', 'High', 'Low', 'Close', 'Volume', 'Adjusted')
    colnames(two) <- c('Open', 'High', 'Low', 'Close', 'Volume', 'Adjusted')
    # Build combined object
    return(merge(one$Close, two$Close, all=FALSE)[dateFilter])
}

buildLM <- function(combined)
{
    return(lm(Close ~ Close.1 + 0, combined))
}

getBeta <- function(m)
{
    return(as.double(coef(m)[1]))
}

buildSpread <- function(combined, beta)
{
    return(combined$Close - beta*combined$Close.1)
}

testCoint <- function(sprd)
{
    return(PP.test(sprd, lshort = FALSE))
}
I run it on batches of stock pairs and then have a look at those that come out as cointegrated, assuming my code is right (and if anyone thinks there is something wrong with it, please let me know ;-).
Just wondered if anyone simply goes with the results, or if a test of logic is required. I found, for example, that AGL (a big gas company) was cointegrated with Bunnings Warehouse (a hardware superstore chain). Can't see the reason for that. AMP (major insurer) cointegrates with AXA (another major insurer). That makes sense. It also cointegrates with Westpac (major bank): still some logic, but a bit thinner. It also cointegrates with Fortescue Metals (big iron ore operation). Not much logic there. Anyway, the question is: do you get better results by using informed judgement on these things, or do you just trust the figures?
Any comments most welcome.
Stephen Choularton Ph.D., FIoD
9999 2226
0413 545 182
for insurance go to www.netinsure.com.au
for markets go to www.organicfoodmarkets.com.au
On 19/10/2010 12:35 PM, Yihao Lu aeolus_lu wrote:
I am doing a rolling ADF test on some time series to check for mean reversion. When I use a short rolling window, I find the residual is not stationary at all. However, when I use a horizon longer than 5 years, I find very significant stationarity. On the other hand, I find the half-life is only around 30 days. Is there anyone who can give me a possible explanation or point me to a reference?
Thanks.
Best,
Yihao
________________________________
Date: Tue, 19 Oct 2010 09:03:55 +1100
From: stephen at organicfoodmarkets.com.au
To: r-sig-finance at stat.math.ethz.ch
CC: bjorn.skogtro at gmail.com
Subject: Re: [R-SIG-Finance] Ornstein-Uhlenbeck
Hi
I am still trying to sort this one out. Any comments from anyone would be most welcome.
Stephen Choularton Ph.D., FIoD
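One thing that may be worth checking is how often a unit-root test rejects on short windows even when the series really is mean-reverting with a 30-day half-life. The sketch below simulates such an AR(1) series and applies a rolling test on a short and a long window. It uses PP.test from base R in place of an ADF test, and the helper name, window lengths and step size are assumptions chosen for illustration.

```r
# Sketch: rolling unit-root test on a simulated mean-reverting series.
# PP.test (stats) stands in for the ADF test; all sizes are arbitrary.
set.seed(1)
half_life <- 30                       # in days, as in the question
phi <- exp(-log(2) / half_life)       # implied daily AR(1) coefficient
n <- 2500                             # roughly ten years of daily data
x <- as.numeric(arima.sim(list(ar = phi), n = n))

# Hypothetical helper: share of windows where the unit root is rejected
rolling_reject_rate <- function(x, width, step = 50, alpha = 0.05) {
  starts <- seq(1, length(x) - width + 1, by = step)
  p <- vapply(starts, function(s) {
    as.double(PP.test(x[s:(s + width - 1)], lshort = FALSE)$p.value)
  }, numeric(1))
  mean(p < alpha)
}

short_rate <- rolling_reject_rate(x, width = 125)    # about six months
long_rate  <- rolling_reject_rate(x, width = 1250)   # about five years
cat("Rejection share, 125-day windows: ", short_rate, "\n")
cat("Rejection share, 1250-day windows:", long_rate, "\n")
```

Unit-root tests are known to have low power in short samples, so a comparison like this can help separate a weak test from a relationship that is genuinely unstable over time.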
On 14/10/2010 7:29 AM, Stephen Choularton wrote:
Thanks for this help.
Trying to make sense of it so I have added some notes to the code. I
have marked them #?#
Delighted if you can tell me if I am right or wrong, add any comments,
answers.
#?# This appears to be the function that is doing the 'Ornstein-Uhlenbeck
#?# process work' particularly via dcOU
#?# I have noted in several places that I am after:
#?# 'the half-life of the decay equals ln(2)/θ'
#?# 'The half-life is given as log(2)/mean-reversion speed.'
#?# and I see theta appearing at a number of points in the code.
#?# Can you tell me why 3 thetas viz theta1, theta2, theta3 and what they do?
#?# eg is one of these the theta I am after?
# ex3.01.R
OU.lik <- function(theta1, theta2, theta3){
n <- length(X)
dt <- deltat(X)
-sum(dcOU(X[2:n], dt, X[1:(n-1)], c(theta1,theta2,theta3), log=TRUE))
}
require(stats4)
require(sde)
#?# random number generation seed
set.seed(123)
#?# creation of a data set
X <- sde.sim(model="OU", theta=c(3,1,2), N=1000, delta=1)
#?# If I look at X it's like this:
#?# Time Series:
#?# Start = 0
#?# End = 1000
#?# Frequency = 1
#?# [1] 1.00000000 etc
#?# What sort of data object is it and how would I coerce an object with one
#?# column from a read.csv into it?
mle(OU.lik, start=list(theta1=1, theta2=0.5, theta3=1),
method="L-BFGS-B", lower=c(-Inf,0,0)) -> fit
summary(fit)
#?# This gives:
#?# Maximum likelihood estimation
#?# Call:
#?# mle(minuslogl = OU.lik, start = list(theta1 = 1, theta2 = 0.5,
#?# theta3 = 1), method = "L-BFGS-B", lower = c(-Inf, 0, 0))
#?# Coefficients:
#?# Estimate Std. Error
#?# theta1 3.355322 0.28159504
#?# theta2 1.106107 0.09010627
#?# theta3 2.052815 0.07624441
#?# -2 log L: 3366.389
#?# What's this telling me?
# ex3.01.R (cont.)
prof <- profile(fit)
par(mfrow=c(1,3))
plot(prof)
par(mfrow=c(1,1))
vcov(fit)
confint(fit)
#?# This provides me with this output using 'fit' from before:
#?# > vcov(fit)
#?# theta1 theta2 theta3
#?# theta1 0.07929576 0.024620718 0.016634557
#?# theta2 0.02462072 0.008119141 0.005485549
#?# theta3 0.01663456 0.005485549 0.005813209
#?# > confint(fit)
#?# Profiling...
#?# 2.5 % 97.5 %
#?# theta1 2.8448980 3.960982
#?# theta2 0.9433338 1.300629
#?# theta3 1.9147136 2.216113
#?# and 'fit' is:
#?# Call:
#?# mle(minuslogl = OU.lik, start = list(theta1 = 1, theta2 = 0.5,
#?# theta3 = 1), method = "L-BFGS-B", lower = c(-Inf, 0, 0))
#?# Coefficients:
#?# theta1 theta2 theta3
#?# 3.355322 1.106107 2.052815
#?# plus some graphic output
#?# Again, what's this telling me?
#?# This looks like a further example?
# ex3.01.R (cont.)
set.seed(123)
X <- sde.sim(model="OU", theta=c(3,1,2), N=1000, delta=1e-3)
mle(OU.lik, start=list(theta1=1, theta2=0.5, theta3=1),
method="L-BFGS-B", lower=c(-Inf,0,0)) -> fit2
summary(fit2)
Please excuse the length of this email (and my lack of understanding)
Hope you can help and thanks.
Stephen Choularton Ph.D., FIoD
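On the three thetas: according to ?dcOU, the sde package writes the process as dX = (theta1 - theta2*X) dt + theta3 dW, so theta2 is the mean-reversion speed, theta1/theta2 is the long-run level and theta3 is the volatility. The sketch below plugs in the estimates printed above and shows one way to turn a one-column CSV into a "ts" object like X. The file and column names are made up for illustration, and deltat = 1 is an assumption about the sampling interval.

```r
# Estimates copied from the summary(fit) output quoted above
theta <- c(theta1 = 3.355322, theta2 = 1.106107, theta3 = 2.052815)

# Under dX = (theta1 - theta2 * X) dt + theta3 dW:
long_run_mean <- unname(theta["theta1"] / theta["theta2"])
half_life     <- unname(log(2) / theta["theta2"])   # in units of deltat(X)
cat("Long-run mean:", long_run_mean, " half-life:", half_life, "\n")

# Coercing one CSV column into a ts object (hypothetical file and column)
csv_file <- tempfile(fileext = ".csv")
write.csv(data.frame(spread = c(1.0, 1.2, 0.9, 1.1, 1.05)), csv_file,
          row.names = FALSE)
raw <- read.csv(csv_file)
X <- ts(raw$spread, start = 0, deltat = 1)   # deltat = 1 is an assumption
class(X)
deltat(X)
```

The half-life comes out in whatever time unit deltat(X) is measured in, so the choice of deltat when building the ts object matters for interpretation.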
On 13/10/2010 2:41 AM, stefano iacus wrote:
Just for completeness: the OU process is Gaussian and its transition density is known in exact form. So maximum likelihood estimation works fine and I suggest avoiding GMM.
sde package contains exact transition density for this process (e.g. ?dcOU) which you can use to build the likelihood to pass to mle() function.
This example is taken from the "inst" directory of the package sde. For the parametrization of the model see ?dcOU
# ex3.01.R
OU.lik <- function(theta1, theta2, theta3){
n <- length(X)
dt <- deltat(X)
-sum(dcOU(X[2:n], dt, X[1:(n-1)], c(theta1,theta2,theta3), log=TRUE))
}
require(stats4)
require(sde)
set.seed(123)
X <- sde.sim(model="OU", theta=c(3,1,2), N=1000, delta=1)
mle(OU.lik, start=list(theta1=1, theta2=0.5, theta3=1),
method="L-BFGS-B", lower=c(-Inf,0,0)) -> fit
summary(fit)
# ex3.01.R (cont.)
prof <- profile(fit)
par(mfrow=c(1,3))
plot(prof)
par(mfrow=c(1,1))
vcov(fit)
confint(fit)
# ex3.01.R (cont.)
set.seed(123)
X <- sde.sim(model="OU", theta=c(3,1,2), N=1000, delta=1e-3)
mle(OU.lik, start=list(theta1=1, theta2=0.5, theta3=1),
method="L-BFGS-B", lower=c(-Inf,0,0)) -> fit2
summary(fit2)
I hope this helps out
stefano
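To make the likelihood less of a black box, here is a base-R sketch of what an exact OU likelihood looks like, assuming the parametrization dX = (theta1 - theta2*X) dt + theta3 dW. The functions ou_cond_density and ou_negloglik are hypothetical stand-ins written for illustration, not the sde implementation, and optim replaces mle() only to keep the example free of package dependencies.

```r
# Conditional density of X(t + dt) given X(t) = x0 for
# dX = (theta1 - theta2 * X) dt + theta3 dW  (Gaussian transition)
ou_cond_density <- function(x, dt, x0, theta, log = FALSE) {
  mu <- theta[1] / theta[2]
  m  <- mu + (x0 - mu) * exp(-theta[2] * dt)
  v  <- theta[3]^2 * (1 - exp(-2 * theta[2] * dt)) / (2 * theta[2])
  dnorm(x, mean = m, sd = sqrt(v), log = log)
}

# Negative log-likelihood of an observed path X sampled every dt
ou_negloglik <- function(theta, X, dt) {
  n <- length(X)
  -sum(ou_cond_density(X[2:n], dt, X[1:(n - 1)], theta, log = TRUE))
}

# Simulate a path with the exact transition (theta = 3, 1, 2; dt = 1)
set.seed(123)
theta_true <- c(3, 1, 2)
dt <- 1
n  <- 1000
X  <- numeric(n + 1)
X[1] <- 1
mu <- theta_true[1] / theta_true[2]
sd_step <- sqrt(theta_true[3]^2 * (1 - exp(-2 * theta_true[2] * dt)) /
                (2 * theta_true[2]))
for (i in 1:n) {
  X[i + 1] <- mu + (X[i] - mu) * exp(-theta_true[2] * dt) + sd_step * rnorm(1)
}

# Same optimiser and starting values as in the example above
fit <- optim(c(1, 0.5, 1), ou_negloglik, X = X, dt = dt,
             method = "L-BFGS-B", lower = c(-Inf, 1e-6, 1e-6))
fit$par                       # estimates of theta1, theta2, theta3
log(2) / fit$par[2]           # implied half-life, in units of dt
```

The small positive lower bounds on theta2 and theta3 are a choice made here to keep the variance strictly positive during the search.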
On 12 Oct 2010, at 12:33, Bjorn Skogtro wrote:
Hi Stephen,
You could take a look at http://sitmo.com/doc/Calibrating_the_Ornstein-Uhlenbeck_model for the linear regression method, or take a look at the package "sde" which contains some examples using GMM (not for the Ornstein-Uhlenbeck process, though, only the CIR).
The half-life is given as log(2)/mean-reversion speed. Do keep an eye on the partition of the time-axis, e.g. what frequency you are using (daily, yearly) for interpreting the half-life.
BR,
Bjørn
------------------------------
Date: Tue, 12 Oct 2010 05:43:32 -0400
From: Sarbo
To: r-sig-finance at stat.math.ethz.ch
Subject: Re: [R-SIG-Finance] Ornstein-Uhlenbeck
By half-life, do you mean the speed of mean-reversion? If so, there's a bit of algebraic tomfoolery that's required to discretise the equation and then fit the data to it. I don't have the time right now to go into all the details but it's not hard: you can parameterise the process using simple linear regression. If you need help with that I'll try and get back to you tonight about it.
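The regression approach mentioned above can be sketched in a few lines. Sampled at a fixed step dt, an OU process satisfies x[t+1] = a + b*x[t] + noise with b = exp(-speed*dt), so the speed is -log(b)/dt and the half-life is log(2)/speed. The simulated data, the variable names and the daily step are assumptions for illustration.

```r
# Sketch: half-life of mean reversion from an AR(1) regression
set.seed(2010)
speed_true <- 0.05            # mean-reversion speed per day (assumed)
dt <- 1                       # one observation per day
n  <- 20000
b_true <- exp(-speed_true * dt)

# Simulate a zero-mean OU path via its exact AR(1) discretisation
x <- numeric(n)
for (i in 2:n) x[i] <- b_true * x[i - 1] + rnorm(1)

# Regress each observation on the previous one
x_lag  <- x[-n]
x_next <- x[-1]
ar_fit <- lm(x_next ~ x_lag)
b_hat  <- unname(coef(ar_fit)["x_lag"])

speed_hat     <- -log(b_hat) / dt
half_life_hat <- log(2) / speed_hat     # in the same units as dt (days)
cat("True half-life:     ", log(2) / speed_true, "days\n")
cat("Estimated half-life:", half_life_hat, "days\n")
```

With daily data the result is in days; with dt = 1/252 it would be in years, which is the point about keeping an eye on the partition of the time axis.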
On Tue, 2010-10-12 at 13:47 +1100, Stephen Choularton wrote:
Hi
Wonder if anyone could point me to how I can use this method to discover the half-life of a mean-reverting process. I am looking into pair trading and the time it takes for a cointegrated pair to revert to the norm.
--
Stephen Choularton Ph.D., FIoD
_______________________________________________
R-SIG-Finance at stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-sig-finance
-- Subscriber-posting only. If you want to post, subscribe first.
-- Also note that this is not the r-help list where general R questions should go.
-----------------------------------
Stefano M. Iacus
Department of Economics, Business and Statistics
University of Milan
Via Conservatorio, 7
I-20123 Milan - Italy
Ph.: +39 02 50321 461
Fax: +39 02 50321 505
http://www.economia.unimi.it/iacus