On 12/10/24 00:35, Lluís Revilla wrote:
> Dear R-devel,
>
> I read with interest the recent blog post on how R will have parallel
> downloads, on blog.r-project.org
> (https://blog.r-project.org/2024/12/02/faster-downloads/index.html).
> Thanks Tomas!
>
> The blog mentions that one of the areas where this will be observed is
> while installing packages (which I did!). However, I noticed that packages
> might be downloaded multiple times, for example if one interrupts
> install.packages (via Ctrl+C), if it fails because a system dependency
> is missing and I fix that in a different terminal session, or if the
> internet connection is cut and I try again.
Yes, and this has been the case before - it's not new for simultaneous
downloads.
> One possible way to make installations/downloads faster and also
> reduce the bandwidth of repositories (and its mirrors) would be to
> check if they need to be downloaded (again).
> PACKAGES file on <repo>/src/contrib includes the MD5sum field that
> could be used to check packages on the local folder (But it might be
> faster to first check if any file exists there for the same package).
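The check described above could be sketched roughly as follows. This is only an illustration: needs_download() is a hypothetical helper, not an existing R function, and the package name "foo" and the destdir variable in the usage comment are placeholders.

```r
# Sketch: decide whether a local tarball still matches the MD5sum that the
# repository index advertises. needs_download() is a hypothetical helper.
needs_download <- function(path, expected_md5) {
  if (!file.exists(path)) return(TRUE)      # nothing downloaded yet
  if (is.na(expected_md5)) return(TRUE)     # index carries no checksum
  actual <- unname(tools::md5sum(path))     # checksum of the local file
  !identical(tolower(actual), tolower(expected_md5))
}

# Usage (needs network access, so it is left as a comment):
# ap <- available.packages(fields = "MD5sum")
# needs_download(file.path(destdir, "foo_1.0.tar.gz"), ap["foo", "MD5sum"])
```

The existence test comes first because it is cheap, which matches the remark that checking for the file may be faster than computing a checksum.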
>
> In short, I propose:
> 1) Checking before downloading packages their existence on the destdir
> directory used by install.packages.
> 2) I suppose the most common scenario is to use install.packages with
> the default destdir parameter (NULL). If 1) is implemented it might be
> useful to keep the temporary directory common for a single R session.
When destdir is NULL (the default), non-local packages are downloaded to
a subdirectory of the temporary session directory (see
?install.packages), so the downloaded files would be readily available
to further installation attempts done by the same R session.
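For illustration, the location in question can be inspected from within a session. The directory name below is the one documented in ?install.packages; the variable names are only for this sketch.

```r
# Default download location when destdir = NULL, per ?install.packages:
# the "downloaded_packages" subdirectory of the session temp directory.
default_destdir <- file.path(tempdir(), "downloaded_packages")

# Source tarballs left there by earlier attempts in this session
# (character(0) if the directory does not exist yet).
cached <- list.files(default_destdir, pattern = "\\.tar\\.gz$")
```

Since tempdir() is removed when the session ends, these files are only available to later attempts made from the same R session.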
I think we could at some point extend download.file() to support re-use of
already downloaded files, so that it can continue an interrupted
download of a single file or re-use the whole file. This shouldn't be
the default because the files in general may change between downloads,
and may be even from different URLs, but it could be used by
install.packages(), where this shouldn't happen, at least when destdir
is NULL. I think an extra round of checking checksums shouldn't be
needed in install.packages().
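A very rough sketch of the whole-file re-use idea, written as a wrapper rather than as a change to download.file() itself. download_reuse() and its downloader argument are hypothetical names for this illustration, and continuing a partially downloaded file is not covered here.

```r
# Hypothetical wrapper, not an existing R API: skip the download when a
# non-empty file is already present at destfile.
download_reuse <- function(url, destfile, ...,
                           downloader = utils::download.file) {
  if (file.exists(destfile) && file.size(destfile) > 0) {
    return(invisible(0L))  # 0 is the success code download.file() returns
  }
  # Download under a temporary name first, so that an interrupted transfer
  # is never mistaken for a complete file by a later call.
  tmp <- paste0(destfile, ".part")
  on.exit(unlink(tmp), add = TRUE)
  status <- downloader(url, tmp, ...)
  if (identical(as.integer(status), 0L)) {
    if (!file.rename(tmp, destfile)) stop("could not move download into place")
  }
  invisible(status)
}
```

The existing file is trusted without any checksum, which is only reasonable under the assumption stated above that the file cannot change between attempts.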
Best
Tomas
> I would appreciate feedback on these ideas.
>
> Best,
>
> Lluís Revilla
>
> PS: New users encountering download & installation issues often keep
> seeing the progress bar (and in the future "trying URL 'https://...")
> of the same packages. There are some ways to avoid repeated
> downloads, such as using the system library dependency resolver or
> having local mirrors. But these are not easy or available for new useRs,
> and some causes are difficult to avoid (like an unreliable
> internet connection).
>
> ______________________________________________
> R-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel