Benefit of Iterators (package iterator)

2 messages · Doran, Harold, Thierry Onkelinx

#
R-Help (and package author)

I'm trying to understand, within the context of R, what the benefit of using an iterator is. My only goal in using the foreach package is to speed up some embarrassingly parallel computations.

I took the example at the link below as a reproducible example and also ran it the "conventional" way, iterating over indices in a loop. The timings here (as with my actual project) suggest that using an iterator generates the same object, but much more slowly.

If I can get the same result faster without an iterator, what is the benefit of using one?

https://msdn.microsoft.com/en-us/microsoft-r/foreach
user  system elapsed 
   0.40    0.08    0.87
user  system elapsed 
   0.41    0.03    0.81
[1] TRUE
#
Dear Harold,

I get a different story

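library(doParallel)      # also attaches foreach, iterators and parallel
library(microbenchmark)

# Start a 4-worker cluster and register it as the %dopar% backend
cl <- makeCluster(4)
registerDoParallel(cl)

x <- matrix(rnorm(1000000), ncol = 1000)   # 1000 x 1000 matrix
itx <- iter(x, by = 'row')                 # row iterator, created once here

# Compare iterating over rows with indexing rows by number
microbenchmark(
  iterator = foreach(i = itx, .combine = c) %dopar% mean(i),
  base = foreach(i = 1:nrow(x), .combine = c) %dopar% mean(x[i, ])
)

stopCluster(cl)          # release the workers when done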

Unit: milliseconds
     expr       min         lq       mean     median         uq      max neval cld
 iterator   2.11206   2.298507   6.254412   2.540116   2.691283  370.623   100  a
     base 390.21825 442.561737 550.169590 452.729684 466.343894 2554.329   100   b
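
One caveat when reading these timings, offered as a possible explanation rather than something confirmed in this thread: iterators from the iterators package are stateful, and `itx` is created once, outside `microbenchmark()`. If every evaluation reuses that same object, only the first evaluation would walk the rows and the later ones would find the iterator already exhausted. That would fit an iterator maximum of about 370 ms next to a median of about 2.5 ms. The sketch below uses a small matrix and a hypothetical helper, `count_remaining()`, to show the statefulness on its own, without a cluster.

```r
library(iterators)

x <- matrix(1:6, nrow = 2)     # small stand-in: 2 rows, 3 columns
itx <- iter(x, by = "row")     # stateful iterator over the rows

# Hypothetical helper: drain an iterator and count the elements it yields.
# nextElem() signals the end of the iterator with an error ("StopIteration"),
# so any error is treated here as "no more elements".
count_remaining <- function(it) {
  n <- 0
  repeat {
    ok <- tryCatch({ nextElem(it); TRUE }, error = function(e) FALSE)
    if (!ok) break
    n <- n + 1
  }
  n
}

first_pass  <- count_remaining(itx)                  # walks both rows
second_pass <- count_remaining(itx)                  # same object, reused
fresh_pass  <- count_remaining(iter(x, by = "row"))  # newly built iterator

print(c(first = first_pass, second = second_pass, fresh = fresh_pass))
```

In the benchmark, the equivalent change would be to build the iterator inside the timed expression, for example `foreach(i = iter(x, by = 'row'), .combine = c) %dopar% mean(i)`, so that each evaluation gets its own iterator.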

Best regards,


ir. Thierry Onkelinx
Instituut voor natuur- en bosonderzoek / Research Institute for Nature and
Forest
team Biometrie & Kwaliteitszorg / team Biometrics & Quality Assurance
Kliniekstraat 25
1070 Anderlecht
Belgium

To call in the statistician after the experiment is done may be no more
than asking him to perform a post-mortem examination: he may be able to say
what the experiment died of. ~ Sir Ronald Aylmer Fisher
The plural of anecdote is not data. ~ Roger Brinner
The combination of some data and an aching desire for an answer does not
ensure that a reasonable answer can be extracted from a given body of data.
~ John Tukey

2016-12-08 15:20 GMT+01:00 Doran, Harold <HDoran at air.org>: