From: Barry Rowlingson
Jan T. Kim wrote:
Generally, I fully agree -- modular coding is good, not only in R.
However, with regard to execution time, modularisation that involves
passing of large amounts of data (100 x 1000 data frames etc.) can
cause problems.
I've just tried a few simple examples of throwing biggish (3000x3000)
matrices around and haven't encountered any pathological behaviour yet.
I tried modifying the matrices within the functions, and tried looping a
few thousand times to estimate the matrix-passing overhead; in most
cases the modular version ran pretty much as fast as - or occasionally
faster than - the inline version. There was some variability in CPU
time taken, probably due to garbage collection.
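For what it's worth, here's a minimal sketch of the kind of timing test
I mean (the function and object names are just illustrative, not the
exact code I ran):

  m <- matrix(rnorm(3000 * 3000), nrow = 3000)

  ## "modular" version: the matrix is passed in and modified
  ## inside the function
  scale.modular <- function(x, k) {
      x <- x * k        # modifying the argument forces a local copy
      sum(x)
  }

  ## the same work done inline, without a function call
  system.time(for (i in 1:20) s1 <- scale.modular(m, 2))
  system.time(for (i in 1:20) { tmp <- m * 2; s2 <- sum(tmp) })

In tests along these lines the two timings come out much the same for me.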
Does anyone have a simple example where passing large data sets causes
a huge increase in CPU time? I think R is pretty smart with its
parameter passing these days - anyone who thinks it's still like Splus
version 2.3 should update their brains to the 21st Century.
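As far as I understand it, the reason is R's copy-on-modify semantics:
handing a big object to a function that only reads it costs essentially
nothing, and a copy is only made when the function actually modifies its
argument. A rough illustration (again just a sketch):

  m <- matrix(rnorm(3000 * 3000), nrow = 3000)

  read.only  <- function(x) dim(x)                    # no copy needed
  modify.arg <- function(x) { x[1, 1] <- 0; dim(x) }  # forces a copy of x

  system.time(for (i in 1:20) read.only(m))
  system.time(for (i in 1:20) modify.arg(m))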