
[Bioc-devel] Memory issues with BiocParallel::SnowParam()

Hi Valerie,

I have re-run my two examples twice using "log = TRUE" and updated the
output at http://lcolladotor.github.io/SnowParam-memory/. As I was
writing this email (all morning...), I made a fourth run where I saved
the gc() information to compare against R 3.1.x. That fourth run kind
of debunked what I was taking away from replicate runs 2 and 3.
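For reference, the logs were generated along these lines. This is only
a sketch: the worker count matches the 10-core cluster runs, but the
real workload lives in my scripts, and the placeholder function below
is mine.

```r
library(BiocParallel)

## With log = TRUE, BiocParallel records per-task logs (including gc()
## output in recent versions), which is what the linked log files show.
## The function below is a stand-in for the actual test workload.
param <- SnowParam(workers = 10, log = TRUE)
res <- bplapply(seq_len(10), function(i) {
    x <- rnorm(1e6)  ## placeholder allocation
    mean(x)
}, BPPARAM = param)
```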



## Between runs


Between runs there's almost no variability in the memory used with R
3.1.x, a little with R 3.2.1, and more with R 3.2.0. The variability
shown could be due to using BiocParallel 1.2.9 in runs 2 to 4 vs 1.2.7
in run 1 for R 3.2.0, and 1.3.31 vs 1.3.34 in R 3.2.1.



## gc() output differences

Now, from the gc() output obtained with "log = TRUE", I see hardly any
difference in the first example between R:

* 3.2.0 https://github.com/lcolladotor/SnowParam-memory/blob/gh-pages/logs/snow-3.2.o6463210
* 3.2.1 https://github.com/lcolladotor/SnowParam-memory/blob/gh-pages/logs/snow-3.2.x.o6463213

which makes sense given that the memory usage was the same:

* 3.2.0 https://github.com/lcolladotor/SnowParam-memory/blob/gh-pages/mem_emails/snow-3.2.txt#L24
* 3.2.1 https://github.com/lcolladotor/SnowParam-memory/blob/gh-pages/mem_emails/snow-3.2.x.txt#L24

However, I do notice a large difference versus R 3.1.x:

* log https://github.com/lcolladotor/SnowParam-memory/blob/gh-pages/logs/snow-3.1.x.o6463536
* mem info https://github.com/lcolladotor/SnowParam-memory/blob/gh-pages/mem_emails/snow-3.1.x.txt#L50



In the derfinder example, the memory goes from 10.9 GB in R 3.2.0 to
12.87 GB in R 3.2.1 (run 2) with SnowParam(). The 18% increase
reported there is roughly in line with the increase obtained by
comparing the max used Mb output from gc():

# R 3.2.1 vs 3.2.0
* gc() max mem used Mb ratio: (303.6 + 548.9) / (251.3 + 547.6) =~ 1.07
* max mem used in GB from cluster email: 12.871 / 10.904 =~ 1.18

If the memory use reported by gc() tracks the actual use, it will be
useful for testing changes in BiocParallel to see whether they
increase or decrease memory use.
When I run the tests locally (using 2 cores), I get similar ratios;
data at https://github.com/lcolladotor/SnowParam-memory/blob/gh-pages/local_run.txt.


However, the same type of comparison between R 3.2.1 and R 3.1.x
shows that the numbers can be off, although the 2.8 ratio is closer
to what I saw in my analysis scenario (> 2.5).

# R 3.2.1 vs 3.1.x
* gc() max mem used Mb ratio: (303.6 + 548.9) / (236 + 66) =~ 2.83
* max mem used in GB from cluster email: 12.871 / 7.175 =~ 1.79

Numbers from:
# gc() info
* 3.2.1 https://github.com/lcolladotor/SnowParam-memory/blob/gh-pages/logs/der-snow-3.2.x.o6463204#L15-L16
* 3.2.0 https://github.com/lcolladotor/SnowParam-memory/blob/gh-pages/logs/der-snow-3.2.o6463201#L15-L16
* 3.1.x https://github.com/lcolladotor/SnowParam-memory/blob/gh-pages/logs/der-snow-3.1.x.o6463545#L10-L11
# cluster mem info
* 3.2.1 https://github.com/lcolladotor/SnowParam-memory/blob/gh-pages/mem_emails/der-snow-3.2.x.txt#L24
* 3.2.0 https://github.com/lcolladotor/SnowParam-memory/blob/gh-pages/mem_emails/der-snow-3.2.txt#L24
* 3.1.x https://github.com/lcolladotor/SnowParam-memory/blob/gh-pages/mem_emails/der-snow-3.1.x.txt#L50






## Max mem vs gc()


Comparing the memory used in GB from the cluster email to the max
used Mb from gc() multiplied by 10 (the number of cores used), I see
that the ratio is somewhat consistent:


# first example, R 3.2.0, SnowParam() run 2
7.285 * 1024 / (10 * (23.9 + 461.8)) =~ 1.54
# first example, R 3.2.1, SnowParam(), run 2
7.286 * 1024 / (10 * (24 + 461.8)) =~ 1.54

# derfinder example, R 3.2.0, SnowParam(), run 2
10.904 * 1024 / (10 * (251.3 + 547.6)) =~ 1.4
# derfinder example, R 3.2.1, SnowParam(), run 2
12.871 * 1024 / (10 * (303.6 + 548.9)) =~ 1.55

# derfinder example, R 3.2.0, MulticoreParam(), run 2
13.789 * 1024 / (10 * (230.8 + 770.2)) =~ 1.41
# derfinder example, R 3.2.1, MulticoreParam(), run 2
13.671 * 1024 / (10 * (245.9 + 757.5)) =~ 1.4
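The arithmetic above can be wrapped in a small helper to make the
comparison easier to repeat. `mem_ratio` is a hypothetical name of
mine; it takes the cluster-reported GB and the Ncells/Vcells max used
Mb from gc():

```r
## Hypothetical helper: cluster-reported memory (GB) divided by the
## summed gc() "max used" Mb (Ncells + Vcells) across all cores.
mem_ratio <- function(cluster_gb, ncells_mb, vcells_mb, cores = 10) {
    cluster_gb * 1024 / (cores * (ncells_mb + vcells_mb))
}

## derfinder example, R 3.2.1, SnowParam(), run 2:
mem_ratio(12.871, 303.6, 548.9)  ## =~ 1.55
```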

From this observation, maybe I can use the gc() output from
"log = TRUE" to get an idea of whether the memory use reported by my
cluster is in line with previous runs, or whether there's a cluster
issue. The ratio could also be used to compare different cluster
environments to see which ones report greater or lower memory use.


However, the above ratios are different with R 3.1.x:

# first example, R 3.1.x, SnowParam(), run 4
5.036 * 1024 / (10 * (32.1 + 218.5)) =~ 2.06

# derfinder example, R 3.1.x, SnowParam(), run 4
7.175 * 1024 / (10 * (236 + 66)) =~ 2.43
# derfinder example, R 3.1.x, MulticoreParam(), run 4
8.473 * 1024 / (10 * (240.7 + 189.1)) =~ 2.02


I'm not sure if this is a hint, but the largest difference between R
3.1.x and the other two versions is in the max used Mb from Vcells.




To summarize, I was thinking that:

(A) we could use the output from gc() to compare between versions and
check which changes lowered the memory required, and
(B) we could estimate the actual memory needed as measured by the
cluster, as well as compare cluster environments.

However, the gc() numbers from R 3.1.x (only available on the fourth
replicate run) don't seem to support these ideas.



Or do you interpret these numbers differently?



Best,
Leo

On Sun, Jul 12, 2015 at 11:00 AM, Valerie Obenchain
<vobencha at fredhutch.org> wrote: