Message-ID: <4289D634.1020103@lancaster.ac.uk>
Date: 2005-05-17T11:32:04Z
From: Barry Rowlingson
Subject: parsing speed
In-Reply-To: <20050517112941.GF20694@jtkpc.cmp.uea.ac.uk>
Jan T. Kim wrote:
> Generally, I fully agree -- modular coding is good, not only in R.
> However, with regard to execution time, modularisation that involves
> passing of large amounts of data (100 x 1000 data frames etc.) can
> cause problems.
I've just tried a few simple examples of throwing biggish (3000x3000)
matrices around and haven't encountered any pathological behaviour yet.
I tried modifying the matrices within the functions, tried looping a few
thousand times to estimate the matrix passing overhead, and in most
cases the modular version ran pretty much as fast as - or occasionally
faster than - the inline version. There was some variability in CPU time
taken, probably due to garbage collection.
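For anyone wanting to repeat the experiment, a minimal sketch of that kind of timing comparison looks like this (the matrix size and loop count here are illustrative, not the exact figures from my runs):

```r
## Compare a "modular" version (matrix passed to a function each call)
## against an "inline" version doing the same work directly.
n <- 1000                          # smaller than 3000 to keep the demo quick
m <- matrix(rnorm(n * n), n, n)

colsum_fun <- function(x) sum(x[, 1])   # modular: m is passed as an argument

modular <- system.time(
  for (i in 1:1000) s1 <- colsum_fun(m)
)

inline <- system.time(
  for (i in 1:1000) s2 <- sum(m[, 1])   # inline: no function call, no passing
)

print(modular)
print(inline)
stopifnot(identical(s1, s2))       # both versions compute the same value
```

On my machine the elapsed times for the two loops are close enough that the function-call and argument-passing overhead is lost in the noise.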
Does anyone have a simple example where passing large data sets causes
a huge increase in CPU time? I think R is pretty smart with its
parameter passing these days - anyone who thinks it's still like Splus
version 2.3 should update their brains to the 21st Century.
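The reason passing is cheap, as I understand it, is R's copy-on-modify semantics: an argument is not duplicated when the function merely reads it, only when the function assigns into it. A quick sketch of the difference (sizes and iteration counts are arbitrary):

```r
big <- matrix(0, 1000, 1000)       # ~8 MB of doubles

read_only <- function(x) x[1, 1]                   # never modified: no copy
modify    <- function(x) { x[1, 1] <- 1; x[1, 1] } # assignment forces a copy

print(system.time(for (i in 1:100) read_only(big)))  # cheap
print(system.time(for (i in 1:100) modify(big)))     # pays for 100 copies

big[1, 1]   # still 0: the caller's matrix is untouched by modify()
```

So it's only the modify-inside-the-function pattern that should cost anything, and even then the cost is one copy per call, not something pathological.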
Baz