If you profile the second run, you will see that the majority of the time is
spent in the `tools:::.remove_stale_dups` function, which loops over every
duplicated package; with the same repository listed twice, that is every package.
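In case it helps to reproduce the profile, here is a minimal sketch using `Rprof()`. It assumes network access, and `profile_available` is a helper name made up for this illustration, not something provided by `utils` or `tools`.

```r
# Hypothetical helper (not part of utils or tools): profile one call to
# available.packages() and return the functions ranked by total time.
profile_available <- function(repos) {
  prof_file <- tempfile(fileext = ".out")
  on.exit({
    Rprof(NULL)          # make sure profiling is off even after an error
    unlink(prof_file)    # clean up the temporary profile file
  }, add = TRUE)
  Rprof(prof_file)       # start the sampling profiler
  utils::available.packages(repos = repos)
  Rprof(NULL)            # stop profiling before reading the file
  utils::summaryRprof(prof_file)$by.total
}

# Usage (needs network access, so left commented out):
# mirror <- "https://cloud.r-project.org/"
# head(profile_available(c(mirror, mirror)))
```

With the duplicated mirror, I would expect `tools:::.remove_stale_dups` to appear near the top of that table, but the exact ranking will depend on your machine and R version.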
One improvement I could think of is to replace the first line of that function

    pkgs <- ap[, "Package"]

with

    pkgs <- ap[!duplicated(ap[, c("Package", "Version")]), "Package"]

which would help in your example. The function might still be slow if many
packages are present in several different versions, so there may be even
better optimizations.
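To illustrate the effect of the `duplicated()` filter without touching `tools`, here is a small sketch on a toy matrix. The package names, versions, and repository labels are invented, and the sketch filters the rows of the matrix itself (so that row indices stay aligned), which is an assumption about how one might apply the idea rather than a tested patch.

```r
# Toy stand-in for the matrix returned by available.packages();
# all names and versions here are invented for illustration.
ap <- matrix(
  c("pkgA", "1.0", "repo1",
    "pkgB", "2.1", "repo1",
    "pkgA", "1.0", "repo2",   # exact duplicate of row 1 (same version)
    "pkgB", "2.2", "repo2"),  # same package, different version
  ncol = 3, byrow = TRUE,
  dimnames = list(NULL, c("Package", "Version", "Repository"))
)

# Drop rows whose (Package, Version) pair has already been seen.
keep <- !duplicated(ap[, c("Package", "Version")])
ap_dedup <- ap[keep, , drop = FALSE]

# Number of duplicated package names the loop would have to visit.
n_dup_before <- sum(duplicated(ap[, "Package"]))
n_dup_after  <- sum(duplicated(ap_dedup[, "Package"]))
```

Only packages that occur with genuinely different versions remain as duplicates after the filter, so the expensive loop has far fewer entries to process when the repositories are identical.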
Stating the obvious here, but since we don't know your 'real' use
case: wrapping the `repos` argument of `available.packages()` in `unique()`
would achieve a similar improvement without any modifications needed in
`tools`.
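As a minimal sketch of that workaround, reusing the mirror URL from your example (the `available.packages()` call is commented out because it needs network access):

```r
mirror <- "https://cloud.r-project.org/"
repos <- c(mirror, mirror)     # the same repository listed twice

# Collapse identical repository URLs before querying them.
repos_unique <- unique(repos)

# ap <- available.packages(repos = repos_unique)
```

This only helps when the URLs are character-for-character identical; two different URLs pointing at the same mirror would still be treated as separate repositories.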
Kind regards,
Maxim Nazarov
----- Original Message -----
From: "Colin Gillespie" <csgillespie at gmail.com>
To: "r-devel" <r-devel at r-project.org>
Sent: Friday, September 9, 2022 7:33:09 PM
Subject: [Rd] Duplicated mirrors on available packages
Hi
When there are duplicated repos, available.packages() takes
significantly longer to run.
For example:

    mirror = "https://cloud.r-project.org/"
    system.time(available.packages(repos = mirror))
    #   user  system elapsed
    #  1.054   0.031   1.245
    system.time(available.packages(repos = c(mirror, mirror)))
    #   user  system elapsed
    # 22.389   0.037  22.429

Best wishes,
Colin
> sessionInfo()
R version 4.2.0 (2022-04-22)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 22.04.1 LTS
Matrix products: default
BLAS: /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.20.so
locale:
[1] LC_CTYPE=en_GB.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_GB.UTF-8 LC_COLLATE=en_GB.UTF-8
[5] LC_MONETARY=en_GB.UTF-8 LC_MESSAGES=en_GB.UTF-8
[7] LC_PAPER=en_GB.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
loaded via a namespace (and not attached):
[1] compiler_4.2.0 tools_4.2.0
Dr Colin Gillespie
https://twitter.com/csgillespie
______________________________________________
R-devel at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel