Blotter example by kafka from R-bloggers
Hi
One thing that I want to understand is the effect of stop-loss activity
on results, and how to employ more complex rules. The two examples I am
looking at have fairly simple rules like:
# three days higher close, high and open than on previous day
#one day before
lag1<-lag((SPY),1)
#two days before
lag2<-lag((SPY),2)
signal<-ifelse( (Cl(lag2)>Cl(lag1) & Cl(lag1)>Cl(SPY))&
(Hi(lag2)>Hi(lag1) & Hi(lag1)>Hi(SPY)) &
(Op(lag2)>Op(lag1) & Op(lag1)>Op(SPY)),
1,0
)
and
# if today's low is higher than yesterday's close 1, else 0
signal<-ifelse(Lo(SPY)>Cl(tmp),1,0)
signal[1]<-0
First, on more complex rules: I have tried looking at vector operations,
but trying to write a rule for spreads like this:
[rule for opening]
if not_open and yesterday(last_spread > 2 * standard deviation) and
today(last_spread < 2 * standard deviation) -> open short spread (and
vice versa)
[rule for stop loss]
if open(last_spread > opening_spread * 1.05 [stop loss]) -> close short
(and vice versa)
[rule for closing]
if open(last_spread < moving average) -> close short (and vice versa)
defeated me, and I ended up writing some code like this (notice that I
haven't got the stop loss rule in it):
n <- nrow(spread.data)
# signal vectors, one entry per period
sigup <- numeric(n)
sigdn <- numeric(n)
long <- 0
short <- 0
for (i in seq_len(n)) {
  # just to deal with the first period when there is no yesterday
  j <- max(i - 1, 1)
  # let's get the data into more usable names
  close.today <- as.numeric(spread.data[i, 1])
  close.yesterday <- as.numeric(spread.data[j, 1])
  mean.today <- as.numeric(spread.data[i, 2])
  mean.yesterday <- as.numeric(spread.data[j, 2])
  upper.boundary.today <- as.numeric(spread.data[i, 3])
  upper.boundary.yesterday <- as.numeric(spread.data[j, 3])
  lower.boundary.today <- as.numeric(spread.data[i, 4])
  lower.boundary.yesterday <- as.numeric(spread.data[j, 4])
  # let's try and find if we have a long signal
  # print(c(i, close.yesterday, lower.boundary.yesterday, close.today, lower.boundary.today))
  ################## RULES FROM HERE ##################
  # spread$Close - spread$Close.1
  ####### FIRST FOR A LONG #####
  ####### first find lower boundary crossings #####
  if (close.yesterday <= lower.boundary.yesterday &&
      close.today > lower.boundary.today) long <- 1
  ####### find mean crossings #####
  if (long == 1 && close.today > mean.today) long <- 0
  sigup[i] <- long
  # print(c(i, long, close.yesterday, lower.boundary.yesterday, close.today, lower.boundary.today))
  ####### THEN FOR A SHORT #####
  ####### first find upper boundary crossings #####
  if (close.yesterday >= upper.boundary.yesterday &&
      close.today < upper.boundary.today) short <- -1
  ####### find mean crossings #####
  if (short == -1 && close.today < mean.today) short <- 0
  sigdn[i] <- short
}
So I put it all in a loop and carry forward my positions/triggers from
one day to the next which is sort of the way I would normally program.
Can rules such as the ones I am trying to write be expressed using
vector operations, and does blotter lend itself to this?
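
For what it is worth, the entry/exit part of the loop above can be written with vector operations if each day is first reduced to an "event" (enter, exit, or nothing) and the last event is then carried forward. The following is only a minimal sketch in base R, assuming plain numeric vectors (e.g. obtained with as.numeric()) rather than the real spread.data; the toy numbers and the helpers prev() and carry_forward() are made up for illustration and are not part of blotter or quantmod.

```r
# Toy data (made up for illustration): close, moving average and bands
close <- c(10, 8, 9, 11, 12, 13, 11, 9)
avg   <- rep(10, 8)
lower <- rep(8.5, 8)
upper <- rep(12.5, 8)

# Yesterday's values; the first period reuses today's, as in the loop
prev <- function(x) c(x[1], x[-length(x)])

# Hypothetical helper: carry the last non-NA event forward, starting from init
carry_forward <- function(ev, init = 0) {
  idx <- cummax(ifelse(is.na(ev), 0L, seq_along(ev)))  # position of last event
  out <- ev[pmax(idx, 1L)]
  out[idx == 0L] <- init                               # no event seen yet
  out
}

# Boundary crossings (entries) and mean crossings (exits)
entry.long  <- prev(close) <= prev(lower) & close > lower
exit.long   <- close > avg
entry.short <- prev(close) >= prev(upper) & close < upper
exit.short  <- close < avg

# An exit overrides an entry on the same day, matching the order of the
# if() tests in the loop; NA means "no event, keep yesterday's state"
sigup <- carry_forward(ifelse(exit.long,  0, ifelse(entry.long,   1, NA)))
sigdn <- carry_forward(ifelse(exit.short, 0, ifelse(entry.short, -1, NA)))

sigup  # 0 0 1 0 0 0 0 0
sigdn  # 0 0 0 0 0 0 -1 0
```

This works because the state after each day depends only on the most recent event. A stop loss tied to the opening spread is different: it needs the price at which the position was opened, so a loop is probably still the more natural way to write that rule.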
Second: on a more general note, this whole question of stop loss is very
significant to results. I find that most back testing is based upon not
adopting such a policy, but prudence would almost always insist on one
doing so. If you have no real option but to adopt a stop loss policy
then the most important question is what is the correct level of
protection. I get very annoyed when my strategy works without a stop
loss and then the first time I take a position I get closed out by my
stop loss and lose money and then the next day or the day after I find
the figures put me back in the black. Anyway, I guess this is just an
iterative process using a binary search but, again, are there any useful
ideas about how one can go about this sort of optimisation re-using
existing packages/code?
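
To make the question concrete, here is a rough sketch of what I have in mind, in base R only. It assumes plain numeric vectors, a positive spread (so that a multiplicative stop makes sense), and the short-side opening, stop-loss and closing rules described earlier in this message; the function short_spread_pnl() and the toy numbers are hypothetical. It scans a small grid of stop levels rather than using a binary search, since nothing guarantees that the P&L changes monotonically with the stop level.

```r
# Hypothetical helper: realised P&L of the short-spread rule for one stop level.
# spread, avg and upper are numeric vectors of equal length.
short_spread_pnl <- function(spread, avg, upper, stop.mult) {
  open.at <- NA_real_   # spread level at which the short was opened
  pnl <- 0
  for (i in seq_along(spread)[-1]) {
    if (is.na(open.at)) {
      # opening rule: above the upper band yesterday, back below it today
      if (spread[i - 1] > upper[i - 1] && spread[i] < upper[i]) {
        open.at <- spread[i]
      }
    } else if (spread[i] > open.at * stop.mult || spread[i] < avg[i]) {
      # stop loss hit, or spread back below its moving average: close short
      pnl <- pnl + (open.at - spread[i])
      open.at <- NA_real_
    }
  }
  pnl   # a position still open at the end is ignored in this sketch
}

# Toy data (made up for illustration)
spread <- c(10, 13, 12, 12.5, 12.8, 11, 9, 13, 12, 11.5, 9.5)
upper  <- rep(12.5, length(spread))
avg    <- rep(10, length(spread))

# Scan a grid of candidate stop levels and keep the best one
stops <- c(1.02, 1.05, 1.10)
pnl <- sapply(stops, function(s) short_spread_pnl(spread, avg, upper, s))
best <- stops[which.max(pnl)]
```

On these toy numbers the middle stop level does worse than both the tighter and the looser one, which is the sort of behaviour that would mislead a binary search. A level picked this way is also fitted to the sample it was chosen on, so it would need checking on data that was held back.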
Stephen Choularton Ph.D., FIoD
On 29/12/2010 7:18 AM, Brian G. Peterson wrote:
On 12/28/2010 01:28 PM, Stephen Choularton wrote:
My apologies. I did not realize the script worked so slowly. I reduced the time scale it covered so it commenced at the beginning of the year and it did run to completion. I will try the full term and see if it produces the same graphs as the original example. I'm always a bit worried about warnings as they often mean something is going wrong and it might be useful if kafka had warned one not to worry about them. Mind you I think he did say it all took a long time ;-)
The reason this script runs slowly is that it is calling updatePortf, updateAcct, and updateEndEq after each and every observation to do order sizing. As a matter of practice, if you can 'cheat' and say 'I've got $1000000 to invest, and I don't mind being a little leveraged', you don't need to do that, and things are *much* faster. For example, we can typically run a strategy backtest on *tick* data (millions of observations) in less than a minute per day. The reason for this divergent length of time is that the blotter update* functions do a *lot* of calculations, and all of those take time, even though they are vectorized where possible. Perhaps a middle ground would be to call the update* functions monthly, or something similar. I found his example script to be slower than I am used to, but not unbearable, and believe that it finished in a couple of minutes, though it's been a while since I ran it...
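
As a rough illustration of the monthly middle ground, the sketch below flags the last observation of each month in base R and only counts an "update" there. The dates are made up, and the update* calls are left as a comment with their arguments omitted, so this shows the scheduling idea only and is not a working blotter script. Any order sizing done between updates would be working from the equity as of the previous update.

```r
# Made-up daily observation dates spanning three calendar months
dates <- seq(as.Date("2010-01-27"), as.Date("2010-03-03"), by = "day")

# TRUE on the last observation of each month, and on the final observation
ym <- format(dates, "%Y-%m")
is.month.end <- c(ym[-1] != ym[-length(ym)], TRUE)

n.updates <- 0
for (i in seq_along(dates)) {
  # ... compute the signal and record any transaction for dates[i] here ...
  if (is.month.end[i]) {
    # the expensive bookkeeping (updatePortf, updateAcct, updateEndEq)
    # would be called here instead of on every observation
    n.updates <- n.updates + 1
  }
}

n.updates       # 3 updates instead of one per observation
length(dates)   # 36 observations
```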
I can assure you I do try and read man before I ask for help, but dealing
with other people's code is not always easy, particularly when working
with a programming system that uses a different paradigm like R, with its
emphasis on operations on vectors and the like, and the extensive use of
calls to functions, each of which often requires a wet towel and cup of
coffee to understand.
I added the parameter definitions you suggest:
currency("USD")
stock("SPY",currency="USD",multiplier=1)
and the warnings reduced to one:
Good.
Warning messages:
1: In updatePortf(ltportfolio, Dates = currentDate) :
Incompatible methods ("Ops.Date", "Ops.POSIXt") for ">="
<...>
"Ops.Date", "Ops.POSIXt" don't appear in the function call so they must be somewhere deeper. I'm afraid I'm currently a windows user so grep is not available and the windows native text search didn't reveal much. However, I did find some references in the documentation (Date-Time Classes, Operators on the Date Class & S3 Group Generic Functions) but Ops.POSIXt doesn't appear therein only POSIXlt and Ops.POSIXct. Is there a typo somewhere ?
It's likely not a typo, but rather an incompatible index between one time series and another. You'd need to check the indices of each of the input series, or of the custom order sizing function from the script to see what's going on. If the output from your run and the blog post agree, I wouldn't bother.
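
A minimal sketch of that check in base R, with two made-up index vectors standing in for the indices of the real series (for an xts object the index would come from index()). It assumes the mismatch is a Date index on one side and a POSIXct index on the other, which is what the method names in the warning suggest; POSIXt is the parent class that POSIXct and POSIXlt share.

```r
# Made-up indices standing in for two input series
idx.prices <- as.Date(c("2010-01-04", "2010-01-05"))
idx.signal <- as.POSIXct(c("2010-01-04", "2010-01-05"), tz = "UTC")

class(idx.prices)   # "Date"
class(idx.signal)   # "POSIXct" "POSIXt"

# Do the two indices share a class?
same.class <- identical(class(idx.prices), class(idx.signal))

# One possible fix: convert the POSIXct index to Date before comparing
idx.signal.fixed <- as.Date(idx.signal, tz = "UTC")
all(idx.prices == idx.signal.fixed)
```

Comparing like with like (Date against Date here) should avoid the mixed Date/POSIXt dispatch that the warning complains about.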
It would be nice to get rid of the warnings.