On Mon, 12 Sept 2022 at 09:57, Maxim Nazarov
<maxim.nazarov at openanalytics.eu> wrote:
If you profile the second run, you will see that the majority of the time is spent in the `tools:::.remove_stale_dups` function, which loops over all duplicated packages, which in this case means every package.
One improvement I could think of is to replace the first line of that function
pkgs <- ap[, "Package"]
with
pkgs <- ap[!duplicated(ap[, c("Package", "Version")]), "Package"]
which would help in your example, but the function might still run slowly if many packages are present in several different versions, so there may be even better optimizations.
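To illustrate the effect of that one-line change, here is a minimal sketch on a toy matrix. The package names, versions and repository URLs are made up, and the `duplicated()` step at the end only counts how many package names would still be flagged as duplicated; it is not the actual body of `tools:::.remove_stale_dups`.

```r
# Toy stand-in for the matrix returned by available.packages().
# All names, versions and URLs below are hypothetical.
ap <- rbind(
  c(Package = "foo", Version = "1.0", Repository = "https://a.example/src/contrib"),
  c(Package = "bar", Version = "2.1", Repository = "https://a.example/src/contrib"),
  c(Package = "foo", Version = "1.0", Repository = "https://a.example/src/contrib"),
  c(Package = "bar", Version = "2.1", Repository = "https://a.example/src/contrib"),
  c(Package = "foo", Version = "1.1", Repository = "https://b.example/src/contrib")
)

# Current first line: every row contributes a package name.
pkgs_before <- ap[, "Package"]

# Proposed first line: rows repeating the same Package/Version pair are dropped first.
pkgs_after <- ap[!duplicated(ap[, c("Package", "Version")]), "Package"]

# Package names that would still count as duplicated in each case.
dups_before <- unique(pkgs_before[duplicated(pkgs_before)])
dups_after  <- unique(pkgs_after[duplicated(pkgs_after)])

print(dups_before)
print(dups_after)
```

In this toy case only "foo", which really is available in two versions, remains flagged after the change, whereas before it every package listed twice is flagged.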
Stating the obvious here, but since we don't know your 'real' use case: wrapping the `repos` argument of `available.packages` in a `unique` call would achieve a similar improvement without any changes to `tools`.
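A minimal sketch of that workaround, assuming the duplicated mirror from the original example; the `available.packages()` call is left commented out because it needs network access.

```r
mirror <- "https://cloud.r-project.org/"
repos <- c(mirror, mirror)

# Drop repeated repository URLs before querying them.
repos_unique <- unique(repos)
print(repos_unique)

# If repos is a named vector (e.g. c(CRAN = mirror)), unique() drops the
# names; subsetting with !duplicated() keeps them.
named_repos <- c(CRAN = mirror, CRAN_again = mirror)
repos_named_unique <- named_repos[!duplicated(named_repos)]
print(repos_named_unique)

# ap <- available.packages(repos = repos_unique)  # requires network access
```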
Kind regards,
Maxim Nazarov
----- Original Message -----
From: "Colin Gillespie" <csgillespie at gmail.com>
To: "r-devel" <r-devel at r-project.org>
Sent: Friday, September 9, 2022 7:33:09 PM
Subject: [Rd] Duplicated mirrors on available packages
Hi
When there are duplicated repos, available.packages() takes
significantly longer to run.
For example
mirror = "https://cloud.r-project.org/"
system.time(available.packages(repos = mirror))
# user system elapsed
# 1.054 0.031 1.245
system.time(available.packages(repos = c(mirror, mirror)))
# user system elapsed
# 22.389 0.037 22.429
Best wishes,
Colin
R version 4.2.0 (2022-04-22)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 22.04.1 LTS
Matrix products: default
BLAS: /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.20.so
locale:
[1] LC_CTYPE=en_GB.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_GB.UTF-8 LC_COLLATE=en_GB.UTF-8
[5] LC_MONETARY=en_GB.UTF-8 LC_MESSAGES=en_GB.UTF-8
[7] LC_PAPER=en_GB.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
loaded via a namespace (and not attached):
[1] compiler_4.2.0 tools_4.2.0
Dr Colin Gillespie
https://twitter.com/csgillespie