No, I have not seen this. Thanks.

My post was more to point out how I structure data when I'm running a compute-intensive program written in C. I thought this might help Atakan think of a way to structure directories to fit his data. The batch command to run R can be run multiple times in different directories.

This is an example of how to run R from a Windows batch file. Some may not know how to run R from a batch file. Breaking the data up into multiple directories and then running R code from a batch file is an alternative to using the console. I do run the regressions using R from a batch file, but they take seconds to complete.

Best,
Frank
Chicago

From: Dane Edwards [mailto:daneedwards1 at hotmail.com]
Sent: Tuesday, March 07, 2017 9:06 PM
To: Frank <frankm60606 at gmail.com>
Subject: RE: [R-SIG-Finance] Parallelizing applyStrategy to multiple symbols

Hi Frank

have you tried this?

http://blog.revolutionanalytics.com/2009/05/parallelized-backtesting-with-foreach.html
http://stackoverflow.com/questions/22340488/r-using-foreach-with-blotter-portfolio-already-exists-error

From: Frank <frankm60606 at gmail.com>
Sent: Wednesday, 8 March 2017 8:18 AM
To: 'Atakan Okan' <atakanokan at outlook.com>
Cc: r-sig-finance at r-project.org; 'Brian G. Peterson' <brian at braverock.com>
Subject: Re: [R-SIG-Finance] Parallelizing applyStrategy to multiple symbols

Hi Atakan,

I use a batch file to run most of my R programs. That way I just have to get it right once and then I can run it many times. The following is a simple batch command, scatter_plot.bat, to run some regressions:

"C:\Program Files\R\R-3.0.2\bin\x64\R.exe" CMD BATCH "Scatter_Plot.txt" "Scatter_Plot.out"

Scatter_Plot.txt contains generic R commands that use data in the current directory. Scatter_Plot.out will contain the output from the commands in the text file.
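As a rough illustration of the kind of "generic R commands" a script such as Scatter_Plot.txt might hold, here is a minimal sketch. The file name data.csv, the column names x and y, and the function name run_regression are assumptions made for this example only; they are not taken from Frank's actual script.

```r
# Hypothetical contents of a script run via R CMD BATCH.
# Assumption: the directory holds data.csv with numeric columns x and y.
run_regression <- function(dir = getwd(), file = "data.csv") {
  dat <- read.csv(file.path(dir, file))
  fit <- lm(y ~ x, data = dat)
  # Under R CMD BATCH, printed output is written to the .out file.
  print(summary(fit))
  invisible(coef(fit))
}
```

Ending the script with a call to run_regression() would make it use whatever directory the batch file was started in, which is what lets the same script be reused across directories.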
If I'm analyzing SPY data for 2016, I would use a directory structure like:

\SPY\2016\01
\SPY\2016\02
\SPY\2016\03
. . .

so that I can analyze one month's data and save the output in one directory: January data and output go to \SPY\2016\01, etc.

I have 8 execution paths and can run 8 months of data simultaneously. My program is small and does not use up all available physical memory. I would run the final 4 months when 4 of the 8 initial months are finished.

If I run more than 8 data-intensive regressions, what Brian is saying is that the OS will spend extra time deciding which thread from which process gets loaded into the next available execution path. If I were to use more than the available physical memory, a process that had been swapped out to disk would need to be loaded back into memory before it could run, while some other process in memory would have to be swapped out to the hard drive. This traffic will slow things down dramatically.

At the end of the batch file, the output is copied up one directory, in this case to 2016, with the year and month appended to a generic file name. There is a batch file in 2016 to concatenate all data from the different months into one file for 2016.

Best,
Frank
Chicago

-----Original Message-----
From: Atakan Okan [mailto:atakanokan at outlook.com]
Sent: Monday, March 06, 2017 4:37 PM
To: Frank <frankm60606 at gmail.com>
Cc: Brian G. Peterson <brian at braverock.com>; r-sig-finance at r-project.org
Subject: Re: [R-SIG-Finance] Parallelizing applyStrategy to multiple symbols

Hi Frank,

I just thought of an idea based on your suggestion. Instead of trying to implement a foreach loop, I will try to split my symbol set across different R sessions, using the "create a new R session" option in RStudio, and then run each subset in its own session with the default call to applyStrategy.
I think this is what you were suggesting, or I might have understood it incorrectly.

Hi Brian,

My understanding of parallelization wasn't enough to grasp all of your reply, but I am not planning on doing rebalancing or testing any strategy that needs to "talk" to other threads. Each symbol is backtested on its own without any input from or output to other symbols' backtests. Would the idea I suggested above work in this case?

I think I explained my problem inadequately: the completion time of a single symbol's backtest is not the issue. The main issue is that each symbol's backtest is computed sequentially, so the total completion time grows linearly with the number of symbols. I just want to assign each symbol's applyStrategy call to one of the CPUs my laptop has to speed up the process. It would be like apply.paramset, but parallel over symbols rather than over parameter combinations.

I hope I have explained it better. Thanks for the help.

Best,
Atakan Okan
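The "one independent backtest per symbol" idea described above can be sketched with base R's parallel package. This is only an outline under stated assumptions: backtest_one and run_per_symbol are hypothetical names, and backtest_one uses synthetic prices as a stand-in for a real per-symbol applyStrategy() run.

```r
library(parallel)

# Placeholder for one symbol's backtest (hypothetical, not a quantstrat
# function). A real version would build a per-symbol portfolio, call
# applyStrategy(), and return only a small summary.
backtest_one <- function(sym) {
  set.seed(sum(utf8ToInt(sym)))        # synthetic prices, illustration only
  px <- 100 + cumsum(rnorm(250))
  data.frame(symbol = sym, last = px[250], stringsAsFactors = FALSE)
}

# Run `fun` once per symbol, sequentially or on a local cluster.
run_per_symbol <- function(symbols, fun, workers = 1L) {
  if (workers <= 1L) {
    res <- lapply(symbols, fun)
  } else {
    cl <- makeCluster(workers)
    on.exit(stopCluster(cl), add = TRUE)
    # Load-balanced: a worker picks up the next symbol when it is free.
    res <- parLapplyLB(cl, symbols, fun)
  }
  do.call(rbind, res)
}

run_per_symbol(c("AAPL", "GOOGL", "MSFT"), backtest_one)
```

Passing workers = detectCores() would use every logical CPU. Because each call receives only a ticker string and returns a one-row data frame, very little has to be copied between processes.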
On 6 Mar 2017, at 23:55, Frank <frankm60606 at gmail.com> wrote:

Atakan,

What kind of computer do you have? Number of cores, memory, hyperthreaded or not?

Brian,

Does this package take advantage of hyperthreading? Your comment suggests it does for multiple cores, and I would assume for hyperthreading as well.

When I do non-R compute-intensive work, I break it up into chunks of 8. I have an i7 with hyperthreading, and this pegs the CPU at 100%. If you had a similar setup, you could break your 100 symbol list down into 8 datasets and run them simultaneously.

Regardless, adding memory is usually a cheap and mindless way to improve throughput.
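Splitting a 100-symbol list into 8 groups, as suggested above, could look like the following sketch. The ticker names are made up for illustration, and the round-robin assignment is just one reasonable way to get groups of nearly equal size.

```r
# Hypothetical ticker list; a real run would use the index constituents.
symbols  <- sprintf("SYM%03d", 1:100)
n_chunks <- 8

# Assign symbols to chunks round-robin so chunk sizes differ by at most one.
chunks <- split(symbols, rep_len(seq_len(n_chunks), length(symbols)))

lengths(chunks)
```

Each element of chunks could then be handed to its own R session or batch run.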
Best,
Frank
Chicago, IL

-----Original Message-----
From: R-SIG-Finance [mailto:r-sig-finance-bounces at r-project.org] On Behalf Of Brian G. Peterson
Sent: Monday, March 06, 2017 1:46 PM
To: Atakan Okan <atakanokan at outlook.com>; r-sig-finance at r-project.org
Subject: Re: [R-SIG-Finance] Parallelizing applyStrategy to multiple symbols

I suspect you're running up against communication and memory management time and resource contention.

applyIndicators and applySignals should all be using vectorized code, so the potential benefit from parallelization will likely be negative, as communication and memory management swamp any benefit from the calculations.
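To make "vectorized" concrete, here is a small sketch of a crossover test written both ways. Both functions are hypothetical helpers for illustration and are not quantstrat's sigCrossover() implementation; the point is only that the vectorized form handles the whole series in a few whole-vector operations.

```r
# Vectorized: TRUE where `a` moves from not-above `b` to above `b`.
cross_up_vectorized <- function(a, b) {
  above <- a > b
  c(FALSE, above[-1] & !above[-length(above)])
}

# Same logic as an explicit element-by-element loop, for comparison.
cross_up_loop <- function(a, b) {
  out <- logical(length(a))
  for (i in seq_along(a)[-1]) {
    out[i] <- a[i] > b[i] && !(a[i - 1] > b[i - 1])
  }
  out
}
```

When the per-symbol work is already this cheap, the time spent starting workers and shipping data to them can easily exceed the time saved.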

applyRules might benefit from parallelization, but you would need to come back together on any rebalancing period. You would also have significant copying time.

If you were going to make this work, you'd need to minimize copies. Your effective 'reduce' operation at the end, by only returning tradeStats, could do this for the end of the calculation, but at the start, you'd need to be smarter about how you segment market data to each worker.

Just putting getSymbols on the workers might run into I/O contention issues. You also don't need to redeclare the strategy object. You could just copy that to each worker.

When we've done things as a one-off, we typically create portfolios for each segment, and try to avoid as many copies as we can.

You'd need to profile to see exactly where you're getting hung up, but this approach seems too simplistic (see my first sentence for hints).
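A first pass at profiling could simply time each stage separately, as in this sketch. time_stage is a hypothetical helper, and the Sys.sleep() calls are stand-ins for the real data-loading and applyStrategy() steps.

```r
# Hypothetical helper: time one stage of a run and label the result.
time_stage <- function(label, expr) {
  elapsed <- system.time(expr)[["elapsed"]]
  data.frame(stage = label, seconds = elapsed, stringsAsFactors = FALSE)
}

timings <- rbind(
  time_stage("load data", Sys.sleep(0.02)),  # stand-in for getSymbols()
  time_stage("run rules", Sys.sleep(0.05))   # stand-in for applyStrategy()
)

# Slowest stage first.
timings[order(-timings$seconds), ]
```

For a finer breakdown, base R's sampling profiler, Rprof() together with summaryRprof(), reports time per function.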

We haven't bothered to do this in the package itself since with a little work we can usually get to around one core minute per symbol per day on L1 tick data, which means that even a large backtest on tick data can finish in a few hours. Optimizing execution time further doesn't seem to be worth the cost in programming and testing time.

Regards,
Brian

--
Brian G. Peterson
http://braverock.com/brian/
Ph: 773-459-4973
IM: bgpbraverock

On Mon, 2017-03-06 at 18:53 +0000, Atakan Okan wrote:

Hello,

I am trying to parallelize applyStrategy() to make it faster when applied to multiple symbols. The reproducible code below contains only 3 symbols, so it finishes fast; however, when I apply it to 100 symbols in an index, sequential computing takes a lot of time. What is the best way to accomplish this? Using a foreach loop does not seem to work, and I couldn't find any info on stackexchange or the usual mailing lists.

Thanks.
Atakan Okan

Code with applyStrategy (foreach is below this):
library(quantmod)
library(quantstrat)
symbols <- c("AAPL","GOOGL","MSFT")
getSymbols(Symbols = symbols, from = "2010-01-01")
currency('USD')
stock(symbols, currency="USD")
strategy.st <- "multiple_symbols_parallel_applystrategy"
rm.strat(strategy.st)
initPortf(strategy.st, symbols = symbols)
initAcct(strategy.st, portfolios=strategy.st, initEq=100000)
initOrders(portfolio=strategy.st)
strategy(strategy.st,store=TRUE)
rule.longenter = TRUE
rule.longexit = TRUE
rule.shortenter = TRUE
rule.shortexit = TRUE
txn.model <- 0
add.indicator(strategy.st,
name = "MACD",
# mktdata is the per-symbol data object quantstrat supplies at run time
arguments = list(x=quote(Cl(mktdata))),
label='macd')
add.signal(strategy.st,name="sigCrossover",
arguments = list(columns=c("macd.macd","signal.macd"),
relationship="gt"),
label="macd.gt.signal")
add.signal(strategy.st,name="sigCrossover",
arguments = list(columns=c("macd.macd","signal.macd"),
relationship="lt"),
label="macd.lt.signal")
add.rule(strategy.st,
name='ruleSignal',
arguments = list(sigcol="macd.gt.signal",
sigval=TRUE,
prefer="Open",
orderqty= 1000,
#osFUN="osAllInLong",
ordertype='market',
orderside='long',
orderset='ocolong',
TxnFees = txn.model),
type='enter',
label='longenter',
enabled=FALSE
)
add.rule(strategy.st,
name='ruleSignal',
arguments = list(sigcol="macd.lt.signal",
sigval=TRUE,
prefer="Open",
orderqty='all',
ordertype='market',
orderside='long',
orderset='ocolong',
TxnFees = txn.model),
type='exit',
label='longexit',
enabled=FALSE
)
add.rule(strategy.st,
name='ruleSignal',
arguments = list(sigcol="macd.lt.signal",
sigval=TRUE,
prefer="Open",
orderqty=-1000,
#osFUN="osAllInShort",
ordertype='market',
orderside='short',
orderset='ocoshort',
TxnFees = txn.model),
type='enter',
label='shortenter',
enabled=FALSE
)
add.rule(strategy.st,
name='ruleSignal',
arguments = list(sigcol="macd.gt.signal",
sigval=TRUE,
prefer="Open",
orderqty='all',
ordertype='market',
orderside='short',
orderset='ocoshort',
TxnFees = txn.model),
type='exit',
label='shortexit',
enabled=FALSE
)
enable.rule(strategy.st,type="enter",label="longenter", enable =
rule.longenter)
enable.rule(strategy.st,type="exit",label="longexit", enable =
rule.longexit)
enable.rule(strategy.st,type="enter",label="shortenter", enable =
rule.shortenter)
enable.rule(strategy.st,type="exit",label="shortexit", enable =
rule.shortexit)
summary(getStrategy(strategy.st))
applyStrategy( strategy=strategy.st ,
portfolios=strategy.st,
symbols = symbols,
verbose=TRUE)
updatePortf(strategy.st)
updateAcct(strategy.st)
updateEndEq(strategy.st)
-------------------------------------------------------------------
Code with foreach:
library(quantmod)
library(quantstrat)
if(Sys.info()["sysname"] == "Windows") {
library(doSNOW)
cl <- makeCluster(4)
registerDoSNOW(cl)
}
if(Sys.info()["sysname"] == "Linux") {
library(doMC)
registerDoMC(cores=4)
#registerDoSEQ()
getDoParWorkers()
}
symbols <- c("AAPL","GOOGL","MSFT")
# iterate over the ticker strings themselves, not their integer positions
sens.df <- foreach(sym = symbols,
.combine = 'rbind',
.packages = c("quantstrat","quantmod")) %dopar% {
getSymbols(Symbols = sym, from = "2010-01-01")
currency('USD')
stock(sym, currency="USD")
strategy.st <- "multiple_symbols_parallel_applystrategy"
rm.strat(strategy.st)
initPortf(strategy.st, symbols = sym)
initAcct(strategy.st, portfolios=strategy.st, initEq=100000)
initOrders(portfolio=strategy.st)
strategy(strategy.st,store=TRUE)
rule.longenter = TRUE
rule.longexit = TRUE
rule.shortenter = TRUE
rule.shortexit = TRUE
txn.model <- 0
add.indicator(strategy.st,
name = "MACD",
# mktdata is the per-symbol data object quantstrat supplies at run time
arguments = list(x=quote(Cl(mktdata))),
label='macd')
add.signal(strategy.st,name="sigCrossover",
arguments = list(columns=c("macd.macd","signal.macd"),
relationship="gt"),
label="macd.gt.signal")
add.signal(strategy.st,name="sigCrossover",
arguments = list(columns=c("macd.macd","signal.macd"),
relationship="lt"),
label="macd.lt.signal")
add.rule(strategy.st,
name='ruleSignal',
arguments = list(sigcol="macd.gt.signal",
sigval=TRUE,
prefer="Open",
orderqty= 1000,
#osFUN="osAllInLong",
ordertype='market',
orderside='long',
orderset='ocolong',
TxnFees = txn.model),
type='enter',
label='longenter',
enabled=FALSE
)
add.rule(strategy.st,
name='ruleSignal',
arguments = list(sigcol="macd.lt.signal",
sigval=TRUE,
prefer="Open",
orderqty='all',
ordertype='market',
orderside='long',
orderset='ocolong',
TxnFees = txn.model),
type='exit',
label='longexit',
enabled=FALSE
)
add.rule(strategy.st,
name='ruleSignal',
arguments = list(sigcol="macd.lt.signal",
sigval=TRUE,
prefer="Open",
orderqty=-1000,
#osFUN="osAllInShort",
ordertype='market',
orderside='short',
orderset='ocoshort',
TxnFees = txn.model),
type='enter',
label='shortenter',
enabled=FALSE
)
add.rule(strategy.st,
name='ruleSignal',
arguments = list(sigcol="macd.gt.signal",
sigval=TRUE,
prefer="Open",
orderqty='all',
ordertype='market',
orderside='short',
orderset='ocoshort',
TxnFees = txn.model),
type='exit',
label='shortexit',
enabled=FALSE
)
enable.rule(strategy.st,type="enter",label="longenter", enable =
rule.longenter)
enable.rule(strategy.st,type="exit",label="longexit", enable =
rule.longexit)
enable.rule(strategy.st,type="enter",label="shortenter", enable =
rule.shortenter)
enable.rule(strategy.st,type="exit",label="shortexit", enable =
rule.shortexit)
summary(getStrategy(strategy.st))
applyStrategy( strategy=strategy.st ,
portfolios=strategy.st,
symbols = sym,
verbose=TRUE)
updatePortf(strategy.st)
updateAcct(strategy.st)
updateEndEq(strategy.st)
results.checkstrat <- data.frame(t(tradeStats(strategy.st)))
return(results.checkstrat[,1])
}
if (Sys.info()["sysname"] == "Windows"){
snow::stopCluster(cl) #dosnow windows
}

_______________________________________________
R-SIG-Finance at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-sig-finance
-- Subscriber-posting only. If you want to post, subscribe first.
-- Also note that this is not the r-help list where general R questions should go.