Lluís Revilla
2025-Dec-04 09:22 UTC
[Rd] Remove http parameters from filename in download.packages()
Hi Jeroen and Martin (and other readers),

I like this suggestion, Jeroen, but I am not sure of the implications
of enabling this beyond getting a valid file name locally on the
computer.

The File field is documented in tools::write_PACKAGES as "the
filenames be included ... in the 'PACKAGES' file." (which might need a
slight modification if this patch is implemented).

What would parameters do on CRAN-like repositories? How would they
alter the behaviour of the request for the downloaded package?
The only idea I could think of is to control which/how packages are
installed (as the name, version and file extension are still fixed in
this patch).

There are different methods used by CRAN-like repositories to control that:

- CRAN handles parameters at the web server level, not in the PACKAGES file:
  you can get a specific package version with
  https://CRAN.R-project.org/package=PKG&version=VER but it doesn't install
  dependencies, as it is understood as a single package.
- https://bioconductor.org recommends using BiocManager to point to
  the right CRAN-like repository.
- The CRAN-like repositories from https://r-universe.dev don't
  redirect but use sha256 as File (as you are well aware).
- https://rpkgs.com redirects to appropriate binaries based on the
  headers, installing a package and its dependencies for the right R
  version and OS.
- https://r-multiverse.org/ has two repositories but doesn't redirect
  or use headers to control which packages are served.

CRAN's redirects work for a single package, but the headers approach
works more smoothly for more than one, at the cost of user/R admin
preparations.

I am not sure if some of these repositories (or new ones) could
benefit from this feature.
But I don't see a back compatibility problem, as this couldn't be used
by repositories since it wasn't available.

Best wishes,

On Wed, 3 Dec 2025, 12:34, Jeroen Ooms <jeroenooms at gmail.com> wrote:
>
> On Wed, Dec 3, 2025 at 11:55 AM Martin Maechler
> <maechler at stat.math.ethz.ch> wrote:
> >
> > >>>>> Jeroen Ooms
> > >>>>>     on Tue, 2 Dec 2025 22:33:05 +0100 writes:
> >
> >     > Currently `download.packages()` copies the full `File`
> >     > field from the URL in PACKAGES, including http parameters,
> >     > as the local filename on disk. So for example, if the
> >     > `PACKAGES` file contains
> >
> >     > Package: jsonlite
> >     > Version: 2.0.0
> >     > File: jsonlite_2.0.0.tar.gz?auth=blabla123&hash=79fad1b6092c1d1cc71e096d02cbc7618837fda1f90b61443f09adc25caab095
> >
> >     > Then the file is saved on disk not as
> >     > `jsonlite_2.0.0.tar.gz` but as the full url including `?`
> >     > and `=` and `&` characters which are not supported and
> >     > create corrupt files on some platforms.
> >
> > Yes... but why should a "CRAN-like repository" use such file names ?
>
> Such that we can host a "CRAN-like repository" on modern
> infrastructure that may use this sort of URLs.
>
> > I am considering, but still not convinced why it is needed (see above).
> > Maybe I'm overlooking something ?
> > Also, I'd have it as a switch (argument) of download.packages() in
> > order to provide back compatibility [no need for a new patch, though !]
>
> The patch should not break functionality; it only fixes an edge case
> where R would otherwise write to an illegal filename on disk. I am
> guessing nobody has used this yet so I am not sure what would be the
> value of back compatibility to ensure R keeps doing this.
>
> In either case the fix would only be useful if install.packages()
> makes use of it.
>
> ______________________________________________
> R-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
Martin Maechler
2025-Dec-05 09:01 UTC
[Rd] Remove http parameters from filename in download.packages()
>>>>> Lluís Revilla
>>>>>     on Thu, 4 Dec 2025 10:22:40 +0100 writes:

    > Hi Jeroen and Martin (and other readers),
    > I like this suggestion, Jeroen, but I am not sure of the implications
    > of enabling this beyond getting a valid file name locally on the
    > computer.

    > The File field is documented in tools::write_PACKAGES as "the
    > filenames be included ... in the 'PACKAGES' file." (which might need a
    > slight modification if this patch is implemented).

    > What would parameters do on CRAN-like repositories? How would they
    > alter the behaviour of the request for the downloaded package?
    > The only idea I could think of is to control which/how packages are
    > installed (as the name, version and file extension are still fixed in
    > this patch).

    > There are different methods used by CRAN-like repositories to control that:
    > - CRAN handles parameters at the web server level, not in the PACKAGES file:
    > you can get a specific package version with
    > https://CRAN.R-project.org/package=PKG&version=VER but it doesn't install
    > dependencies, as it is understood as a single package.

Interesting; I have not yet seen this documented; how would it work?
(To me it looks as if the "?...." part is completely disregarded,
 at least when used in a web browser ...)

    > - https://bioconductor.org recommends using BiocManager to point to
    > the right CRAN-like repository.
    > - The CRAN-like repositories from https://r-universe.dev don't
    > redirect but use sha256 as File (as you are well aware).
    > - https://rpkgs.com redirects to appropriate binaries based on the
    > headers, installing a package and its dependencies for the right R
    > version and OS.
    > - https://r-multiverse.org/ has two repositories but doesn't redirect
    > or use headers to control which packages are served.

    > CRAN's redirects work for a single package, but the headers approach
    > works more smoothly for more than one, at the cost of user/R admin
    > preparations.

    > I am not sure if some of these repositories (or new ones) could
    > benefit from this feature.
    > But I don't see a back compatibility problem, as this couldn't be used
    > by repositories since it wasn't available.

Thank you, Lluís, for the (partial) summary of what different
CRAN-like repositories do.

On "back compatibility":

We cannot know in how many different ways install.packages() and
update.packages() are used for, e.g., site-internal (typically
company-internal) package repositories.

It could well be that some of these also provide 'File' entries
with a "?" (possibly even with a different meaning than "url
parameters"), and their own setup code may rely on the long file
names one way or another.

On decent operating systems (e.g., Linux; possibly macOS)
filenames containing a `?` work without problems:

  $ touch foo?bar
  $ ls -lG foo?bar
  -rw-r--r-- 1 maechler 0  5. Dec 09:38 'foo?bar'

Hence, unconditionally cutting such filenames (as in the
original proposal) will break such R code.

Hence, we want to allow back compatible behaviour, such that the
previous "non-cutting" behaviour would still be possible; easy
and simple.
I'd tend to agree that the new behaviour would still be the default.

Lastly, to Jeroen's question: Yes, of course, update.packages()
can use all arguments of install.packages() via `...`
(I'd be sure you'd know..). I've documented it explicitly in
help(update.packages), yesterday.

Martin

    > On Wed, 3 Dec 2025, 12:34, Jeroen Ooms <jeroenooms at gmail.com> wrote:
    >>
    >> On Wed, Dec 3, 2025 at 11:55 AM Martin Maechler
    >> <maechler at stat.math.ethz.ch> wrote:
    >> >
    >> > >>>>> Jeroen Ooms
    >> > >>>>>     on Tue, 2 Dec 2025 22:33:05 +0100 writes:
    >> >
    >> >     > Currently `download.packages()` copies the full `File`
    >> >     > field from the URL in PACKAGES, including http parameters,
    >> >     > as the local filename on disk. So for example, if the
    >> >     > `PACKAGES` file contains
    >> >
    >> >     > Package: jsonlite
    >> >     > Version: 2.0.0
    >> >     > File: jsonlite_2.0.0.tar.gz?auth=blabla123&hash=79fad1b6092c1d1cc71e096d02cbc7618837fda1f90b61443f09adc25caab095
    >> >
    >> >     > Then the file is saved on disk not as
    >> >     > `jsonlite_2.0.0.tar.gz` but as the full url including `?`
    >> >     > and `=` and `&` characters which are not supported and
    >> >     > create corrupt files on some platforms.
    >> >
    >> > Yes... but why should a "CRAN-like repository" use such file names ?
    >>
    >> Such that we can host a "CRAN-like repository" on modern
    >> infrastructure that may use this sort of URLs.
    >>
    >> > I am considering, but still not convinced why it is needed (see above).
    >> > Maybe I'm overlooking something ?
    >> > Also, I'd have it as a switch (argument) of download.packages() in
    >> > order to provide back compatibility [no need for a new patch, though !]
    >>
    >> The patch should not break functionality; it only fixes an edge case
    >> where R would otherwise write to an illegal filename on disk. I am
    >> guessing nobody has used this yet so I am not sure what would be the
    >> value of back compatibility to ensure R keeps doing this.
    >>
    >> In either case the fix would only be useful if install.packages()
    >> makes use of it.
    >>
    >> ______________________________________________
    >> R-devel at r-project.org mailing list
    >> https://stat.ethz.ch/mailman/listinfo/r-devel
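
For readers following the thread, the behaviour under discussion can be sketched in a few lines of R. This is only an illustration and not the patch itself: `local_filename()` and its `strip_params` argument are hypothetical names invented here, and the default of `TRUE` merely mirrors the preference stated above that the new "cutting" behaviour be the default while the old one stays available.

```r
# Hypothetical helper (NOT part of R or of the proposed patch): derive the
# local file name from a 'File' entry read from PACKAGES.
local_filename <- function(file, strip_params = TRUE) {
  if (strip_params) {
    # Drop everything from the first "?" onwards (the URL query string).
    file <- sub("[?].*$", "", file)
  }
  file
}

# The 'File' entry from the example quoted in this thread.
file_field <- paste0(
  "jsonlite_2.0.0.tar.gz?auth=blabla123&hash=",
  "79fad1b6092c1d1cc71e096d02cbc7618837fda1f90b61443f09adc25caab095"
)

local_filename(file_field)
# [1] "jsonlite_2.0.0.tar.gz"

# Back compatible ("non-cutting") behaviour: the entry is kept unchanged.
identical(local_filename(file_field, strip_params = FALSE), file_field)
# [1] TRUE
```

In this sketch only the name used on disk changes; the full `File` entry, parameters included, would presumably still be used to build the download URL.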