
execution time of .packages

Prof Brian Ripley wrote:
Thanks for the info, I'll try to find my way with these directions.
I usually do not trust the result of the profiler when a for loop is 
involved, as it tends to miss the point (or maybe I am missing it).

Consider the script below: the profiler reports 0.22 seconds when the 
actual time spent is about 6 seconds, and it would blame rnorm as the 
bottleneck when the inefficiency is in growing the data structure.

Rprof()
x <- numeric()
for (i in 1:10000) {
  x <- c(x, rnorm(10))
}
Rprof(NULL)
print(summaryRprof())

$ time Rscript --vanilla profexample.R
$by.self
        self.time self.pct total.time total.pct
"rnorm"      0.22      100       0.22       100

$by.total
        total.time total.pct self.time self.pct
"rnorm"       0.22       100      0.22      100

$sampling.time
[1] 0.22

real    0m6.164s
user    0m5.156s
sys     0m0.737s

$ time Rscript --vanilla -e "rnorm(10)"
 [1]  0.836411851  1.762081444  1.076305644  2.063515383  0.643254750
 [6]  1.698620443 -1.774479062 -0.432886214 -0.007949533  0.284089832

real    0m0.224s
user    0m0.187s
sys     0m0.024s
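For comparison, here is a sketch of my own (not one of the runs above) that preallocates x so the loop never grows the vector; under that assumption most of the 6 seconds should disappear, which would confirm that the growth, not rnorm, is the real cost:

```r
Rprof()
n <- 10000
x <- numeric(n * 10)                             # preallocate the full length up front
for (i in 1:n) {
  x[((i - 1) * 10 + 1):(i * 10)] <- rnorm(10)    # fill in place, no copying
}
Rprof(NULL)
print(summaryRprof())
```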


Now, if I replace the for loop with a similar silly lapply construct, 
the profiler tells me a rather different story:

Rprof()
x <- numeric()
y <- lapply(1:10000, function(i) {
  x <<- c(x, rnorm(10))
  NULL
})
Rprof(NULL)
print(summaryRprof())

$ time Rscript --vanilla prof2.R
$by.self                                           
         self.time self.pct total.time total.pct   
"FUN"         6.48     96.1       6.68      99.1   
"rnorm"       0.20      3.0       0.20       3.0
"lapply"      0.06      0.9       6.74     100.0

$by.total
         total.time total.pct self.time self.pct
"lapply"       6.74     100.0      0.06      0.9
"FUN"          6.68      99.1      6.48     96.1
"rnorm"        0.20       3.0      0.20      3.0

$sampling.time
[1] 6.74

real    0m8.352s
user    0m4.762s
sys     0m2.574s

Or let us wrap the for loop of the first example in a function:

Rprof()
x <- numeric()
ffor <- function() {
  for (i in 1:10000) {
    x <- c(x, rnorm(10))
  }
}
ffor()
Rprof(NULL)
print(summaryRprof())
  

$ time Rscript --vanilla prof3.R
$by.self
        self.time self.pct total.time total.pct
"ffor"        5.4     96.4        5.6     100.0
"rnorm"       0.2      3.6        0.2       3.6

$by.total
        total.time total.pct self.time self.pct
"ffor"         5.6     100.0       5.4     96.4
"rnorm"        0.2       3.6       0.2      3.6

$sampling.time
[1] 5.6

real    0m6.379s
user    0m5.408s
sys     0m0.717s



Maybe I am getting this all wrong; maybe the global assignment operator 
is responsible for some of the time in the second example. But how can I 
analyse the result of the profiler in the first example when it seems 
only interested in the 0.22 seconds? I want to know what is going on 
during the rest of the time.
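One way to check the suspicion about the global assignment operator directly, bypassing the profiler entirely, would be to time the two growth idioms with system.time() (a quick sketch of my own, with a smaller n so both variants finish fast):

```r
n <- 2000

## for loop with ordinary local assignment
t_for <- system.time({
  x <- numeric()
  for (i in 1:n) x <- c(x, rnorm(10))
})

## lapply growing the same vector through <<-
t_lapply <- system.time({
  x <- numeric()
  invisible(lapply(1:n, function(i) {
    x <<- c(x, rnorm(10))
    NULL
  }))
})

print(rbind(for_loop = t_for, lapply_global = t_lapply))
```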

Is it possible to treat "for" as a function when writing the profiler 
data, so that I can trust it more?

Romain