Sweave for "big" data analysis

5 messages · Lars Bishop, Uwe Ligges, Sarah Goslee +2 more

#
Google for "cache" and "Sweave" and you will find several packages
that extend Sweave with some form of caching, which is roughly what you
have in mind anyway.

Best,
Uwe Ligges
On 31.12.2010 21:35, Lars Bishop wrote:
#
My very simple approach is to check, within the Sweave file, whether the
output file exists, and to run the time-consuming bits only if it does not.
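
A minimal sketch of that check, for illustration only. The helper name
cached_result, the stand-in slow_step and the output file are all
hypothetical (none of them come from the thread), and saveRDS()/readRDS()
are just one way to store the result. In a real document this code would
sit inside a Sweave chunk.

```r
# Return the value stored at `path` if that file exists; otherwise run
# `compute()`, store its value at `path`, and return it.
# `cached_result` is a hypothetical helper name, not from any package.
cached_result <- function(path, compute) {
  if (file.exists(path)) {
    readRDS(path)
  } else {
    value <- compute()
    saveRDS(value, path)
    value
  }
}

# Hypothetical stand-in for the time-consuming part of the analysis.
slow_step <- function() {
  Sys.sleep(1)
  colMeans(matrix(1:6, nrow = 2))
}

out_file <- tempfile(fileext = ".rds")        # hypothetical output file
first  <- cached_result(out_file, slow_step)  # file missing: computes and saves
second <- cached_result(out_file, slow_step)  # file present: slow step skipped
```

Deleting the output file (for example with unlink(out_file)) forces the
slow step to run again on the next pass.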

As Uwe says, there are more sophisticated approaches too.

Sarah
On Fri, Dec 31, 2010 at 3:35 PM, Lars Bishop <lars52r at gmail.com> wrote:
#
I still recommend the pgfSweave package (as usual) -- you can cache
both data objects (using cacheSweave) and graphics (using pgf).

Regards,
Yihui
--
Yihui Xie <xieyihui at gmail.com>
Phone: 515-294-2465 Web: http://yihui.name
Department of Statistics, Iowa State University
2215 Snedecor Hall, Ames, IA
On Fri, Dec 31, 2010 at 2:35 PM, Lars Bishop <lars52r at gmail.com> wrote:
#
On 31/12/2010 3:35 PM, Lars Bishop wrote:
As others have said, there are packages that provide caching.

I haven't used them, because I like to keep my projects as
self-contained as possible: adding a dependency on one of those
packages is undesirable[1]. When there are time-consuming
calculations, I do all of them in a separate script and
save the results (using save()). The Sweave document then loads the
objects (using load()) and does the post-processing, plotting, etc.
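
A rough sketch of this split, under stated assumptions: the file name, the
objects x and x_summary, and the toy calculation are hypothetical
placeholders, not taken from any actual project. Only save() and load()
come from the description above.

```r
# --- Part 1: stand-alone script, run once outside Sweave ---------------
results_file <- file.path(tempdir(), "results.RData")  # hypothetical path

set.seed(1)
x <- rnorm(1000)            # stand-in for the "big" data
x_summary <- summary(x)     # stand-in for the time-consuming calculation
save(x, x_summary, file = results_file)

# --- Part 2: what a chunk in the Sweave document would do --------------
# A separate environment is used here only to mimic a fresh R session;
# in the real document a plain load(results_file) would be enough.
doc_env <- new.env()
loaded <- load(results_file, envir = doc_env)  # names of restored objects

# Cheap post-processing on the restored objects.
x_mean <- mean(doc_env$x)
```

With this layout the Sweave run never triggers the expensive calculation;
the script has to be rerun by hand whenever the inputs change.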

Duncan Murdoch

1.  I do generally write things that are dependent on my own patchDVI 
package, and curse myself for the dependency all the time.