Hello All,
Generally I do most of my back-tests (basically running
correlation/regression analysis etc. on a set of historical data) in R.
Since I haven't done any similar analysis on any other
platform/language, I wonder if any of you out there have any experience
in conducting computation-intensive back-tests in other languages and
have done some sort of statistics on the time taken to perform the
back-test in R as opposed to other languages.
My hunch is that processing time shouldn't be *very* different
between R & other languages, but then it's just a hunch. Can anybody
share/comment on their experience? Sorry for the open-ended nature of
the question.
Cheers
Manoj
Backtesting speed
On 7/2/06, Manoj <manojsw at gmail.com> wrote:
[...]
My hunch is that processing time shouldn't be *very* different
between R & other languages, but then it's just a hunch. Can anybody
share/comment on their experience?
In my experience scripted languages can be quite fast. The bulk of the
time is usually consumed by a relatively small part of the code.
Profiling to see where the problems are, optimisation of the
time-heavy sections, and perhaps offloading the still-irksome portions
to a C or C++ library can help.
jab
John Bollinger, CFA, CMT www.BollingerBands.com If you advance far enough, you arrive at the beginning.
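For what it's worth, the profile-then-optimise loop jab describes looks roughly like this in R (the back-test function here is hypothetical, purely for illustration):

```r
# Profile one back-test run to find the hot spots
Rprof("backtest.out")
result <- run.backtest(prices)   # run.backtest() is a made-up stand-in
Rprof(NULL)

# Summarise time spent by function; the top few entries by self.time
# are the candidates for vectorisation, or for offloading to C/C++
# via .C()/.Call()
print(summaryRprof("backtest.out")$by.self)
```

Typically a handful of functions dominate `self.time`; rewrite those in vectorised R first, and only reach for compiled code if that is still not enough.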
On 7/2/06, Manoj <manojsw at gmail.com> wrote:
My hunch is that processing time shouldn't be *very* different between R & other languages, but then it's just a hunch. Can anybody share/comment on their experience? Sorry for the open-ended nature of the question.
There is an interesting thread on Slashdot this AM that addresses this question: http://it.slashdot.org/it/06/07/18/0146216.shtml
jab
John Bollinger, CFA, CMT www.BollingerBands.com If you advance far enough, you arrive at the beginning.
roger bos wrote:
Yeah, reading the link above, I would summarize it as this: if someone is good at and likes C/C++, you will never be able to convince them that an interpreted language is as good. Most proponents of interpreted languages just figure that processor speed and memory improvements will allow them to carry on without using compilers.
When I profile my R code, the vast majority of the time is usually in read.table and write.table, so I figure there is not much I can do to improve my code. While using Perl & C & R together could bring some speed improvement, there is also a downside to learning and maintaining code in different languages and putting all the pieces together. But then again, I work with monthly data, so it's not really a concern of mine. Most hedge funds that work with tick data use Perl to process the data and then maybe R to analyze it; basically, the volume is too great to do in R. Of course, linking to a database is a nice plus in R; I don't know if Perl can do that.
Actually Perl has excellent database connectivity. The DBAs I work with tend to use Perl more than anything else to write DB maintenance and data transformation tools, because the combination of Perl's db connectivity and text manipulation capabilities is very hard to beat. As for interpreted vs. compiled code, the gap is narrowing every day: as CPUs become more sophisticated, C no longer maps as directly to machine instructions as it once did, and languages like Java use techniques like run-time optimization that are not available to statically compiled languages.
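On the R side, the database linking roger mentions is also fairly painless via the DBI package; a minimal sketch using SQLite (the file, table, and column names here are invented for illustration):

```r
library(DBI)
library(RSQLite)  # assumes the RSQLite package is installed

# Connect, pull a price series into a data frame, disconnect
con <- dbConnect(dbDriver("SQLite"), "ticks.db")
prices <- dbGetQuery(con,
    "SELECT date, close FROM prices WHERE symbol = 'IBM'")
dbDisconnect(con)
```

The same DBI front end works with other backends (RODBC, RMySQL, etc.), so the analysis code does not need to change when the database does.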
Just to play contrarian here, let me mention that on the quantlib-user list two issues recently surfaced:
- replacing float/double calculations with (scaled) integers for greater speed
- using dedicated hardware (field-programmable gate arrays) with their specialised language dialects/compilers for speed increases of up to 100x (relative to a baseline of C++, not R)
And then there are the folks who want to do linear algebra in their graphics cards, as those things have better bus speed and floating-point performance than the usual motherboard and FPU ...
So to sum it up, there is always someone trying to be faster yet. But then most of us use R for the speed and power of prototyping and 'r & d', right?
Dirk
Hell, there are no rules here - we're trying to accomplish something.
-- Thomas A. Edison
On Wed, Jul 19, 2006 at 08:06:07AM -0400, roger bos wrote:
When I profile my R code, the vast majority of the time is usually in read.table and write.table, so I figure there is not much I can do to improve my code.
No, that's likely an unwarranted assumption. If I remember right, R's read.table() can be GROTESQUELY inefficient in some cases. So you might just have one super-slow thing obscuring the fact that you also have lots of other moderately slow things, all of which could be dramatically (and perhaps usefully) sped up.
Andrew Piskorski <atp at piskorski.com> http://www.piskorski.com/
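The usual read.table() speedups are worth spelling out; a sketch, where the file name and the column layout (one character column, four numeric) are assumptions:

```r
# read.table spends much of its time guessing column types and
# growing its result as it reads; supplying both up front avoids it.
prices <- read.table("prices.csv", header = TRUE, sep = ",",
                     colClasses = c("character", rep("numeric", 4)),
                     nrows = 250000,      # upper bound, lets R pre-allocate
                     comment.char = "")   # disable comment scanning
```

Specifying colClasses alone often gives a large improvement; for truly big files, scan() with a fixed `what` list is faster still.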
Python is probably an even better agile language than Perl at this point. Its database connectivity is second to none, and the RPy interface provides the capability to combine both R and Python functionality in a single program. For more information, check out "Poor Man's BI", an article I wrote for the June 2006 DM Review Extended Edition: http://www.dmreview.com/ee/
Steve Miller
-----Original Message-----
From: r-sig-finance-bounces at stat.math.ethz.ch [mailto:r-sig-finance-bounces at stat.math.ethz.ch] On Behalf Of eric larson
Sent: Wednesday, July 19, 2006 5:07 PM
To: R-sig-finance
Subject: Re: [R-sig-Finance] Backtesting speed
roger bos wrote:
[...]
_______________________________________________
R-SIG-Finance at stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-sig-finance