Ivan, I suspect that the `which' case is just the tip of the iceberg - generally, R expects all tools it detects at configure time to be callable, just to list a few from a running session: PAGER /usr/bin/less R_BROWSER /usr/bin/open R_BZIPCMD /usr/bin/bzip2 R_GZIPCMD /usr/bin/gzip R_PDFVIEWER /usr/bin/open R_QPDF /Library/Frameworks/R.framework/Resources/bin/qpdf R_TEXI2DVICMD /usr/local/bin/texi2dvi R_UNZIPCMD /usr/bin/unzip R_ZIPCMD /usr/bin/zip SED /usr/bin/sed TAR /usr/bin/tar I would claim it is not an unreasonable expectation that the user doesn't delete tools after R was built. Obviously it is tedious, but Spack may need to patch all absolute paths if it wants to relocate things (it's not easy as it includes all linker paths as well FWIW) - they must be doing something like that already as even the R start-up script uses absolute paths: $ grep ^R_ /Library/Frameworks/R.framework/Resources/bin/R R_HOME_DIR=/Library/Frameworks/R.framework/Resources R_HOME="${R_HOME_DIR}" R_SHARE_DIR=/Library/Frameworks/R.framework/Resources/share R_INCLUDE_DIR=/Library/Frameworks/R.framework/Resources/include R_DOC_DIR=/Library/Frameworks/R.framework/Resources/doc R_binary="${R_HOME}/bin/exec${R_ARCH}/R" That said, WHICH is a mess - it may make sense to switch to the command -v built-in which is part of POSIX (where available - which is almost everywhere today) which would not require an external tool, but as noted I don't think this is the only problem Spack has... (and that's just core R - even a bigger can of worms with R packages :P). Cheers, Simon> On Jan 11, 2024, at 8:43 AM, Ivan Krylov via R-devel <r-devel at r-project.org> wrote: > > Hello R-devel, > > Currently on Unix-like systems, Sys.which incorporates the absolute > path to the `which` executable, obtained at the configure stage: > >> ## hopefully configure found [/usr]/bin/which >> which <- "@WHICH@" >> if (!nzchar(which)) { >> warning("'which' was not found on this platform") > > This poses a problem for the Spack package manager and software > distribution. In Spack, like in Nix, Guix, and GoboLinux, packages live > under their own path prefixes, which look like the following: > >>> /opt/spack/opt/spack/linux-ubuntu18.04-x86_64_v3/gcc-7.5.0/r-4.3.0-eqteloqhjzix6ta373ruzt5imvvbcesc > > Unfortunately, Spack packages are expected to get relocated, changing > the path prefix and invalidating stored paths, including the path to > `which`: <https://github.com/spack/spack/issues/41953>. > > Harmen Stoppels, who is not subscribed to R-devel but interested in > making R work in Spack, currently creates a symlink to `which` > <https://github.com/r-devel/r-svn/pull/151> as part of a patch to R. > > What would be the minimally disruptive way to avoid this dependency or > at least make it easier to fix post-factum, during relocation? What > would be the pitfall if Sys.which() were to find `which` on the $PATH > by itself, without remembering the full path? > > -- > Best regards, > Ivan > > ______________________________________________ > R-devel at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-devel >
For context: I don't think Nix and Guix have to relocate anything, cause I think they require absolute paths like /nix/store where all binaries go. Spack on the other hand can install packages w/o sudo to a location of choice, e.g. ~/spack/opt/spack. That's why we have to patch binaries. However, Spack's relocation stuff is not about creating truly relocatable binaries where everything is referenced by relative paths. We still use absolute paths almost everywhere, it's just that they have to be rewired when the location things are built is different from where they are installed. I'm sure there are people who would like to have an actually fully relocatable R, but that's not my goal.> I would claim it is not an unreasonable expectation that the user doesn't delete tools after R was built. Obviously it is tedious, but Spack may need to patch all absolute paths if it wants to relocate things (it's not easy as it includes all linker paths as well FWIW) - they must be doing something like that already as even the R start-up script uses absolute pathsBasically Spack does (a) special handling of the dynamic section of ELF files for Linux / FreeBSD, and load commands of mach-o files for macOS, (b) find & replace of prefixes in text files / scripts and (c) somewhat fancy but still basic replacement of C-strings containing prefixes in binaries. This works reliably because the package prefixes contain hashes that make false positives unlikely. It's just that it does not work when the absolute path to be relocated is captured inside serialized bytecode in a zlib-compressed database base.rdb :) I believe `which` is the only occurrence of this.> That said, WHICH is a mess - it may make sense to switch to the command -v built-in which is part of POSIX (where available - which is almost everywhere today) which would not require an external toolThat sounds like a decent solution to me, probably `command -v` is more commonly available than `which`.> I don't think this is the only problem Spack has... (and that's just core R - even a bigger can of worms with R packages :P).We can deal with most issues, just not with compressed byte code.
Harmen, thanks for the additional details, it wasn't exactly clear what this is about. Ivan's post didn't mention that the issue here is the caching, not the path replacement which you are apparently already doing, now it makes more sense. I still think it is dangerous as you have no way of knowing who else is caching values at installation time since there is no reason to assume that the system will change after installation - technically, it breaks the contract with the application. We are trying to get packages to not hard-code or cache paths, but that typically only applies to the package library location, not to system tools. Cheers, Simon> On Jan 11, 2024, at 10:36 AM, Harmen Stoppels <me at harmenstoppels.nl> wrote: > > For context: I don't think Nix and Guix have to relocate anything, cause I think they require absolute paths like /nix/store where all binaries go. Spack on the other hand can install packages w/o sudo to a location of choice, e.g. ~/spack/opt/spack. That's why we have to patch binaries. > > However, Spack's relocation stuff is not about creating truly relocatable binaries where everything is referenced by relative paths. We still use absolute paths almost everywhere, it's just that they have to be rewired when the location things are built is different from where they are installed. > > I'm sure there are people who would like to have an actually fully relocatable R, but that's not my goal. > >> I would claim it is not an unreasonable expectation that the user doesn't delete tools after R was built. Obviously it is tedious, but Spack may need to patch all absolute paths if it wants to relocate things (it's not easy as it includes all linker paths as well FWIW) - they must be doing something like that already as even the R start-up script uses absolute paths > > Basically Spack does (a) special handling of the dynamic section of ELF files for Linux / FreeBSD, and load commands of mach-o files for macOS, (b) find & replace of prefixes in text files / scripts and (c) somewhat fancy but still basic replacement of C-strings containing prefixes in binaries. > > This works reliably because the package prefixes contain hashes that make false positives unlikely. > > It's just that it does not work when the absolute path to be relocated is captured inside serialized bytecode in a zlib-compressed database base.rdb :) > > I believe `which` is the only occurrence of this. > >> That said, WHICH is a mess - it may make sense to switch to the command -v built-in which is part of POSIX (where available - which is almost everywhere today) which would not require an external tool > > That sounds like a decent solution to me, probably `command -v` is more commonly available than `which`. > >> I don't think this is the only problem Spack has... (and that's just core R - even a bigger can of worms with R packages :P). > > We can deal with most issues, just not with compressed byte code. > >
On Thu, 11 Jan 2024 09:30:55 +1300 Simon Urbanek <simon.urbanek at R-project.org> wrote:> That said, WHICH is a mess - it may make sense to switch to the > command -v built-in which is part of POSIX (where available - which > is almost everywhere today) which would not require an external toolThis is a bit tricky to implement. I've prepared the patch at the end of this e-mail, tested it on GNU/Linux and tried to test on OpenBSD [*] (I cannot test on a Mac), but then I realised one crucial detail: unlike `which`, `command -v` returns names of shell builtins if something is both an executable and a builtin. So for things like `[`, Sys.which would behave differently if changed to use command -v: $ sh -c 'which [' /usr/bin/[ $ sh -c 'command -v [' [ R checks the returned string with file.exists(), so the new Sys.which('[') returns an empty string instead of /usr/bin/[. That's probably undesirable, isn't it? Index: configure ==================================================================--- configure (revision 85802) +++ configure (working copy) @@ -949,7 +949,6 @@ PDFTEX TEX PAGER -WHICH SED INSTALL_DATA INSTALL_SCRIPT @@ -5390,66 +5389,6 @@ done test -n "$SED" || SED="/bin/sed" - -## 'which' is not POSIX, and might be a shell builtin or alias -## (but should not be in 'sh') -for ac_prog in which -do - # Extract the first word of "$ac_prog", so it can be a program name with args. -set dummy $ac_prog; ac_word=$2 -{ printf "%s\n" "$as_me:${as_lineno-$LINENO}: checking for $ac_word" >&5 -printf %s "checking for $ac_word... " >&6; } -if test ${ac_cv_path_WHICH+y} -then : - printf %s "(cached) " >&6 -else $as_nop - case $WHICH in - [\\/]* | ?:[\\/]*) - ac_cv_path_WHICH="$WHICH" # Let the user override the test with a path. - ;; - *) - as_save_IFS=$IFS; IFS=$PATH_SEPARATOR -for as_dir in $PATH -do - IFS=$as_save_IFS - case $as_dir in #((( - '') as_dir=./ ;; - */) ;; - *) as_dir=$as_dir/ ;; - esac - for ac_exec_ext in '' $ac_executable_extensions; do - if as_fn_executable_p "$as_dir$ac_word$ac_exec_ext"; then - ac_cv_path_WHICH="$as_dir$ac_word$ac_exec_ext" - printf "%s\n" "$as_me:${as_lineno-$LINENO}: found $as_dir$ac_word$ac_exec_ext" >&5 - break 2 - fi -done - done -IFS=$as_save_IFS - - ;; -esac -fi -WHICH=$ac_cv_path_WHICH -if test -n "$WHICH"; then - { printf "%s\n" "$as_me:${as_lineno-$LINENO}: result: $WHICH" >&5 -printf "%s\n" "$WHICH" >&6; } -else - { printf "%s\n" "$as_me:${as_lineno-$LINENO}: result: no" >&5 -printf "%s\n" "no" >&6; } -fi - - - test -n "$WHICH" && break -done -test -n "$WHICH" || WHICH="which" - -if test "${WHICH}" = which ; then - ## needed to build and run R - ## ends up hard-coded in the utils package - as_fn_error $? "which is required but missing" "$LINENO" 5 -fi - ## Make : ${MAKE=make} Index: configure.ac ==================================================================--- configure.ac (revision 85802) +++ configure.ac (working copy) @@ -680,15 +680,6 @@ ## we would like a POSIX sed, and need one on Solaris AC_PATH_PROGS(SED, sed, /bin/sed, [/usr/xpg4/bin:$PATH]) -## 'which' is not POSIX, and might be a shell builtin or alias -## (but should not be in 'sh') -AC_PATH_PROGS(WHICH, which, which) -if test "${WHICH}" = which ; then - ## needed to build and run R - ## ends up hard-coded in the utils package - AC_MSG_ERROR([[which is required but missing]]) -fi - ## Make : ${MAKE=make} AC_SUBST(MAKE) Index: src/library/base/Makefile.in ==================================================================--- src/library/base/Makefile.in (revision 85802) +++ src/library/base/Makefile.in (working copy) @@ -28,7 +28,7 @@ all: Makefile DESCRIPTION @$(ECHO) "building package '$(pkg)'" @$(MKINSTALLDIRS) $(top_builddir)/library/$(pkg) - @WHICH="@WHICH@" $(MAKE) mkRbase mkdesc2 mkdemos2 + @$(MAKE) mkRbase mkdesc2 mkdemos2 @$(INSTALL_DATA) $(srcdir)/inst/CITATION $(top_builddir)/library/$(pkg) include $(top_srcdir)/share/make/basepkg.mk @@ -45,12 +45,12 @@ mkR: mkRbase Rsimple: - @WHICH="@WHICH@" $(MAKE) mkRbase mkRsimple + @$(MAKE) mkRbase mkRsimple ## Remove files to allow this to be done repeatedly Rlazy: - at rm -f $(top_builddir)/library/$(pkg)/R/$(pkg)* - @WHICH="@WHICH@" $(MAKE) mkRbase + @$(MAKE) mkRbase @cat $(srcdir)/makebasedb.R | \ R_DEFAULT_PACKAGES=NULL LC_ALL=C $(R_EXE) > /dev/null @$(INSTALL_DATA) $(srcdir)/baseloader.R \ @@ -57,4 +57,4 @@ $(top_builddir)/library/$(pkg)/R/$(pkg) Rlazycomp: - @WHICH="@WHICH@" $(MAKE) mkRbase mklazycomp + @$(MAKE) mkRbase mklazycomp Index: src/library/base/R/unix/system.unix.R ==================================================================--- src/library/base/R/unix/system.unix.R (revision 85802) +++ src/library/base/R/unix/system.unix.R (working copy) @@ -114,23 +114,14 @@ Sys.which <- function(names) { res <- character(length(names)); names(res) <- names - ## hopefully configure found [/usr]/bin/which - which <- "@WHICH@" - if (!nzchar(which)) { - warning("'which' was not found on this platform") - return(res) - } for(i in seq_along(names)) { if(is.na(names[i])) {res[i] <- NA; next} ## Quoting was added in 3.0.0 - ans <- suppressWarnings(system(paste(which, shQuote(names[i])), - intern = TRUE, ignore.stderr = TRUE)) - ## Solaris' which gives 'no foo in ...' message on stdout, - ## GNU which does it on stderr - if(grepl("solaris", R.version$os)) { - tmp <- strsplit(ans[1], " ", fixed = TRUE)[[1]] - if(identical(tmp[1:3], c("no", i, "in"))) ans <- "" - } + ans <- tryCatch( + suppressWarnings(system(paste("command -v", shQuote(names[i])), + intern = TRUE, ignore.stderr = TRUE)), + error = \(e) "" + ) res[i] <- if(length(ans)) ans[1] else "" ## final check that this is a real path and not an error message if(!file.exists(res[i])) res[i] <- "" -- Best regards, Ivan [*] example(Sys.which()) works, but there are multiple problems elsewhere, e.g. tan(1+1000i) returns NaN+1i and Matrix doesn't install without gmake