System.time

9 messages · John Kerpel, jim holtman, Stavros Macrakis +3 more

#
It depends on the granularity at which the operating system records
that time; on some systems the minimum might be 0.01 seconds.  If it
is that short, why worry about it?  There is nothing unusual about the
result.
On Wed, Feb 11, 2009 at 7:49 PM, John Kerpel <john.kerpel at gmail.com> wrote:
#
On Wed, 2009-02-11 at 18:49 -0600, John Kerpel wrote:
Jim has already suggested why this might be the case. When I'm testing
the speed of things like this (that are in and of themselves very quick)
for situations where it may matter, I wrap the function call in a call
to replicate():

system.time(replicate(1000, svd(Mean_svd_data)))

to run it 1000 times, and that allows me to judge how quickly the
function executes.

HTH

G
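Gavin's replicate() idiom also gives a per-call estimate once you divide the elapsed time by the replication count. A minimal sketch of that step (the small random matrix below is a stand-in, since the thread's Mean_svd_data isn't shown):

```r
# Sketch of the replicate() timing idiom; the matrix is a small
# stand-in for the thread's Mean_svd_data, which isn't available here.
m <- matrix(rnorm(100), 10, 10)
n <- 1000
timing <- system.time(replicate(n, svd(m)))
per.call <- timing["elapsed"] / n  # approximate seconds per single svd() call
```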
#
Gavin Simpson wrote:
the timethese function in the Benchmark module in perl takes as
arguments not only the code to be run, but also the number of
replications to be performed.  it will complain if the code runs too
fast, i.e., if the measured time is too short and is not a reliable
estimate.

it might be a good idea to have similar functionality in r (maybe it's
already there?), which would basically wrap over system.time and issue a
warning when a reliable measurement cannot be made.

vQ
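As far as the thread goes, no such wrapper exists in base R. A minimal sketch of what vQ describes might look like the following; the function name, default count, and threshold are all invented for illustration:

```r
# Hypothetical wrapper over system.time(): run an expression n times and
# warn when the total elapsed time is too short for a reliable estimate.
timed <- function(expr, n = 100, min.elapsed = 0.1) {
   e <- substitute(expr)
   env <- parent.frame()
   timing <- system.time(replicate(n, { eval(e, env); NULL }))
   if (timing["elapsed"] < min.elapsed)
      warning("total elapsed time under ", min.elapsed,
              "s; increase n for a reliable estimate")
   timing
}
```

For example, `timed(sum(1:10))` would typically trigger the warning, since 100 runs of so cheap an expression finish well under the threshold.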
#
On Thu, Feb 12, 2009 at 4:28 AM, Gavin Simpson <gavin.simpson at ucl.ac.uk> wrote:
I do the same, but with a small twist:

     system.time(replicate(1000, {svd(Mean_svd_data); 0} ))

This allows the values of svd(...) to be garbage collected.

If you don't do this and the output of the timed code is large, you
may allocate large amounts of memory (which may influence your timing
results) or run out of memory (which will also influence your timing
results :-) ).

              -s
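The effect is easy to see with object.size(); a quick illustration (the helper function and sizes below are made up for the example) comparing replicate() with and without the discard trick:

```r
# Illustration of the memory point: replicate() keeps every result by
# default, so a large return value accumulates across replications.
big <- function() numeric(1e4)              # stand-in for a large result
kept      <- replicate(100, big())          # all 100 results retained
discarded <- replicate(100, { big(); 0 })   # results freed; only the 0s kept
object.size(kept) > object.size(discarded)  # TRUE: far less memory retained
```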
#
On Thu, Feb 12, 2009 at 8:42 AM, Stavros Macrakis <macrakis at alum.mit.edu> wrote:
You could also do

r_ply(1000, svd(Mean_svd_data))

which has the same effect - the results are discarded after each
evaluation (as opposed to raply, rlply and rdply where they are kept
and returned in various formats)

Hadley
#
On Thu, 2009-02-12 at 09:42 -0500, Stavros Macrakis wrote:
Thanks for that tip, Stavros. I hadn't realised that.

G
6 days later
#
Stavros Macrakis wrote:
to contribute my few cents, here's a simple benchmarking routine,
inspired by the perl module Benchmark.  it allows one to benchmark an
arbitrary number of expressions with an arbitrary number of
replications, and provides a summary matrix with selected timings.

the code below is also available from google code [1], if anyone is
interested in updates (should there be any) or contributions.

benchmark = function(
      ...,
      columns = c('test', 'replications', 'user.self', 'sys.self',
                  'elapsed', 'user.child', 'sys.child'),
      replicate = 100,
      environment = parent.frame()) {
   arguments = match.call()[-1]
   parameters = names(arguments)
   if (is.null(parameters))
      parameters = as.character(arguments)
   else {
      indices = ! parameters %in% c('columns', 'replicate', 'environment')
      arguments = arguments[indices]
      parameters = parameters[indices] }
   result = cbind(
      test = rep(ifelse(parameters == '', as.character(arguments), parameters),
                 each = length(replicate)),
      as.data.frame(
         do.call(rbind,
            lapply(arguments,
               function(argument)
                  do.call(rbind,
                     lapply(replicate,
                        function(count)
                           c(replications = count,
                             system.time(replicate(count,
                                { eval(argument, environment); NULL })))))))))
   result[, columns, drop = FALSE] }

it's rudimentary and not fool-proof, but might be helpful if used with
care.  (the nested do.call-rbind-lapply sequence can surely be
simplified, but i could not resist the pattern.  someone once wrote that
if you need more than three (five?) levels of indentation in your code,
there must be something wrong with it;  presumably, he was a fortran
programmer.)

examples:

benchmark(1:10^7)
#     test replications user.self sys.self elapsed user.child sys.child
# 1 1:10^7          100     2.168        0   2.166          0         0

benchmark(allocation=1:10^8, replicate=10)
#         test replications user.self sys.self elapsed user.child sys.child
# 1 allocation           10      0.98    3.073    4.05          0         0

means.rep = function(n, m) replicate(n, mean(rnorm(m)))
means.pat = function(n, m) colMeans(array(rnorm(n*m), c(m, n)))
(result = benchmark(replicate=c(10, 100, 1000),
    rep=means.rep(100, 100),
    pat=means.pat(100, 100),
    columns=c('test', 'replications', 'elapsed')))
#   test replications elapsed
# 1  rep           10   0.037
# 2  rep          100   0.387
# 3  rep         1000   3.840
# 4  pat           10   0.017
# 5  pat          100   0.170
# 6  pat         1000   1.731

result$elapsed/result$replications
# [1] 0.003700 0.003870 0.003840 0.001700 0.001700 0.001731

with(result, t.test(elapsed/replications ~ test, paired=TRUE))
# silly, i know...

manual on demand.
vQ


[1] http://code.google.com/p/rbenchmark/
#
Wacek Kusnierczyk wrote:
<snip>
i have cleaned up the code, removing the fancy nested structure.  the
code plus detailed documentation is available from googlecode [1], and i
stop the self-marketing here.

vQ

[1] http://code.google.com/p/rbenchmark/