Henrik Bengtsson
2023-Feb-08 17:21 UTC
[Rd] Compiling R-devel on older Linux distributions, e.g. RHEL / CentOS 7
I just want to add a few reasons that I know of for why users are
still on Red Hat/CentOS 7, learned from being deeply involved with
big academic and research high-performance compute (HPC) environments.
These systems are not like your regular sailing boat, but more like a
giant container ship: much harder to navigate, slower to react, and you
cannot just cruise around and pop into any harbor you like, whenever
you like. It takes much more effort, more logistics, and more
people to operate them. If you mess up, the damage is much bigger.

Reasons:

* Users don't have many options, but have to use what is available.

* Red Hat/CentOS is designed for long-term stability and backward
  compatibility. They've only done major upgrades every 3-4 years.

* Red Hat backports security fixes to old versions of common software,
  which is why you see, for instance, Python 3.6 still being the
  provided version, although upstream Python declared it end-of-life
  in December 2021.

* HPC environments (aka "compute clusters") often have 100s to 1000s of
  users. Imagine the number of software tools and different versions
  installed in such environments.

* Upgrading an HPC environment is a major disruption for users who
  rely on it in their work and research, e.g. some software stacks,
  pipelines, and scripts have to be reinstalled and re-coded.

* The majority of users and sysadmins prefer stability over being
  able to run the latest tools.

* Over the years, stability increases the technical debt, to a point
  where it is cheaper to upgrade than to stay behind.

* Even if sysadmins want to upgrade to a newer release, their hands
  are often tied because of external factors. For example, the
  global parallel file system or the backup system you rely on has not
  yet been validated for the next version of the OS you want to upgrade
  to.
  There might also be research-critical scientific pipelines that do
  not yet support the new version, which can be because the maintainers
  of those tools don't have access to a new version to test on. GPU
  drivers might not be available. This is also the case for commercial
  tools. Sometimes IT security requirements cannot be met on the new
  version, because security scanning tools are not yet up-to-date. There
  can also be hardware limitations, e.g. you might even have to replace
  some central server for the whole cluster to be able to upgrade.

* Although you might want to tell everyone to just run a new version
  via Linux containers, it's not the magic sauce for all of the above.
  Savvy users might be able to do it, but not your average users. Also,
  this basically puts the common sysadmin burden on the end-user, who
  now has to keep their container stacks up-to-date and in sync. In
  contrast to a homogeneous environment, this strategy increases the
  support burden on sysadmins, because they will get many more
  questions and requests for troubleshooting on very specific setups.

Specifically to Red Hat/CentOS 7: When sites started to think about
migrating to CentOS 8, Red Hat decided to pull the plug and change
their long-term business plan. This itself was a disruptive event,
because any plans to do a "regular" distro upgrade had to be flushed
down the toilet. The community waited to see what would happen and
what the options would be. A lot of sites now plan on migrating to
Rocky 8, which (AFAIU) tries to stay true to the original CentOS
mission. This means they are waiting for third-party hardware and
software providers to validate their products for Rocky 8, e.g.
parallel file systems, backup software, software stacks, etc.

What R Core, Gabor, and many others are doing, often silently in the
background, is to make sure R works smoothly for many R users out
there, whatever operating system and version they may be on.
This is so essential to R's success and, more importantly, for
research and science to be able to move forward. Those endless hours
spent on trying to support some OS, even obscure ones, pay off many
times over, especially on common OSes such as Red Hat and CentOS. You
spare lots of users and sysadmins lots of pain when you put those
hours in. So, thank you for doing all that.

/Henrik

On Wed, Feb 8, 2023 at 2:24 AM Iñaki Ucar <iucar at fedoraproject.org> wrote:
>
> On Wed, 8 Feb 2023 at 07:05, Prof Brian Ripley <ripley at stats.ox.ac.uk> wrote:
> >
> > On 08/02/2023 00:13, Gábor Csárdi wrote:
> > > As preparation for the next release, I am trying to compile R devel on
> > > RHEL / CentOS 7, which is still supported by RedHat until 2024 June.
>
> True, but with a big asterisk. Full updates ended on 2020-08-06, and
> it's been in maintenance mode since then, meaning that only security
> and critical fixes are released until EOL to facilitate a transition
> to a newer version. So CentOS 7 users shouldn't expect new releases of
> software to be available.
>
> > > There are two issues.
> > >
> > > One is that the libcurl version in CentOS 7 is quite old, 7.29.0, and
> > > R devel now requires 7.32.0, since 83715 about a week ago. This
> > > requirement is here to stay for R 4.3.0, right?
>
> I suppose that if R-devel doesn't use any API endpoint not available
> in 7.29, you could just patch out that requirement. Otherwise, you
> would need to build your own.
>
> > Unless we revert it. The comment in the manual says
> >
> > @c libcurl 7.32.0 was released in Aug 2013
> >
> > and Centos 7 was released in 2014-07-07, 11 months later. Do they
> > really never security-patch libcurl?
>
> Oh, they do port all security fixes, but without changing the version,
> which is the whole point of LTS. In fact, current version is
> 7.29.0-59, and there are probably a hundred patches on top of those 59
> builds.
>
> > > The second is that the recommended packages are now installed with R
> > > CMD INSTALL --use-C17, which fails on CentOS 7 with
> > >
> > > begin installing recommended package MASS
> > > * installing *source* package 'MASS' ...
> > > ** package 'MASS' successfully unpacked and MD5 sums checked
> > > ** using non-staged installation
> > > ** libs
> > > Error: C17 standard requested but CC17 is not defined
> > > * removing '/root/R-devel/library/MASS'
> > >
> > > CentOS 7 has GCC 4.8.5, which does not have a -std=gnu17 option.
> > > However the commit message of this change in commit 83566 hints that
> > > this requirement might be temporary. Hence my questions.
> >
> > It is temporary -- needed for survival (now updated) and mgcv (awaited).
> > However,
> >
> > 1) You should be able to set
> >
> > CC17="gcc -std=gnu11"
> >
> > in config.site, as C17 is a bug-fixed C11.
> >
> > 2) Centos 7 has later compilers available, and people are going to need
> > them for C++. The manual says
> >
> > ... later compilers are available: for RHEL/Centos 7 look for
> > ‘devtoolset’.
>
> Exactly, here is the reference:
> https://www.softwarecollections.org/en/scls/rhscl/devtoolset-7/
>
> R 3.6.0 is the last version we support in EPEL, because EPEL is not
> allowed to build on top of SCL. But you can enable SCL and install the
> devtoolset available, which contains gcc version 7.3.1.
>
> But anyway, I don't think that staying in an almost 10-year old distro
> in maintenance mode and at the same time expecting a cutting-edge
> version of R (or any software) is reasonable.
>
> Iñaki
>
> > > Is the C17 requirement temporary or it will be a requirement for R 4.3.0?
> > > Should I expect any problems if I remove the --use-C17 flag for
> > > installing the recommended packages?
> >
> > Not with that compiler.
> >
> > > There are a lot of R users still on RHEL 7, so it would be great to
> > > know what to expect for the next release.
> >
> > --
> > Brian D. Ripley, ripley at stats.ox.ac.uk
> > Emeritus Professor of Applied Statistics, University of Oxford
> >
> > ______________________________________________
> > R-devel at r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-devel
>
> --
> Iñaki Úcar
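
The CC17 suggestion quoted above (setting CC17="gcc -std=gnu11" in
config.site when the compiler has no -std=gnu17) could be scripted
roughly as follows. This is only a sketch under assumptions: the
function names supports_std, pick_cc17 and set_cc17 are hypothetical
helpers, not part of R or of the thread, and the probe simply asks the
compiler to syntax-check an empty C input with the requested standard.

```shell
#!/bin/sh
# Sketch only: pick a value for CC17 and record it in config.site.
# All function names here are hypothetical helpers, not part of R.

# Succeed if compiler "$1" accepts -std="$2" on an empty C input.
supports_std() {
    "$1" -std="$2" -x c -fsyntax-only /dev/null >/dev/null 2>&1
}

# Print a CC17 value for compiler "$1" (default: gcc).
# Prefer gnu17; fall back to gnu11, as suggested in the thread.
pick_cc17() {
    cc="${1:-gcc}"
    if supports_std "$cc" gnu17; then
        printf '%s -std=gnu17\n' "$cc"
    else
        printf '%s -std=gnu11\n' "$cc"
    fi
}

# Append CC17="..." to the config.site file "$1",
# unless exactly that line is already present.
set_cc17() {
    site_file="$1"
    line="CC17=\"$(pick_cc17 "${2:-gcc}")\""
    grep -qxF -- "$line" "$site_file" 2>/dev/null ||
        printf '%s\n' "$line" >> "$site_file"
}
```

Assuming the helpers are sourced in the top-level R source directory,
calling set_cc17 config.site before running ./configure would record
the setting. On a stock CentOS 7 GCC 4.8.5 the gnu17 probe is expected
to fail, so the gnu11 fallback would be written.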
Eric Berger
2023-Feb-08 19:13 UTC
[Rd] Compiling R-devel on older Linux distributions, e.g. RHEL / CentOS 7
A different route to get the "latest and greatest version of R" while
sticking with an older distribution of an OS would be via docker. At
work, our linux servers run Ubuntu 18.04 on which I run an Ubuntu 20.04
docker image with (close to) the latest version of R (4.2.2) for a
shiny app I supply across the firm.
Iñaki Ucar
2023-Feb-08 20:22 UTC
[Rd] Compiling R-devel on older Linux distributions, e.g. RHEL / CentOS 7
On Wed, 8 Feb 2023 at 19:59, Henrik Bengtsson <henrik.bengtsson at gmail.com> wrote:
>
> I just want to add a few reasons that I know of for why users are
> still on Red Hat/CentOS 7 and learned from being deeply involved with
> big academic and research high-performance compute (HPC) environments.
> These systems are not like your regular sailing boat, but more like a
> giant container ship; much harder to navigate, slower to react, you
> cannot just cruise around and pop into any harbor you like to, or when
> you like to. It takes much more efforts, more logistics, and more
> people to operate them. If you mess up, the damage is much bigger.

I'm fully aware of, and I understand, all the technical and
organizational reasons why there are CentOS 7 systems out there. I
only challenge a single point (cherry-picked from your list):

> * The majority of users and sysadmins prefer stability over the being
> able to run the latest tools.

This is simply not true. In general, sysadmins do prefer stability,
but users want the latest tools (otherwise, this very thread would not
exist, QED). And the first thing is hardly compatible with the second
one. That is, without containers, which brings us to the next point.

> * Although you might want to tell everyone to just run a new version
> via Linux containers, it's not the magic sauce for all of the above.
> Savvy users might be able to do it, but not your average users. Also,
> this basically puts the common sysadmin burden on the end-user, who
> now have to keep their container stacks up-to-date and in sync. In
> contrast to a homogeneous environment, this strategy increases the
> support burden on sysadms, because they will get much more questions
> and request for troubleshooting on very specific setups.

How is that so? Let's say a user wants the latest version of R.
Nothing prevents a sysadmin from setting up a script called "R" in the
PATH that runs e.g. the r2u container [1] with the proper mounts.
And that's it: the user runs "R" and receives the latest version (and
even package installations seem to be blazing fast now!) without even
knowing that it's running inside a container.

I know, you are thinking "security", "permissions"...

$ yum install podman

Drop-in replacement for docker, but rootless, daemonless. Also there's
a similar thing called Apptainer [2], formerly Singularity, that was
specifically designed with HPC in mind, now part of the Linux
Foundation.

[1] https://github.com/eddelbuettel/r2u
[2] https://apptainer.org/

> What R Core, Gabor, and many others are doing, often silently in the
> background, is to make sure R works smoothly for many R users out
> there, whatever operating system and version they may be on. This is
> so essential to R's success, and, more importantly, for research and
> science to be able to move forward.

+1000

--
Iñaki Úcar
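
A wrapper of the kind described above (a script called "R" on the PATH
that starts a container with the proper mounts) might look roughly
like the sketch below. Several things in it are assumptions rather
than anything stated in the thread: the image reference
rocker/r2u:22.04 (check the r2u README for the names actually
published), the variable names R_IMAGE, R_ENGINE and
R_WRAPPER_DRY_RUN, and the choice to mount only the current working
directory. A real site would likely need more mounts and options.

```shell
#!/bin/sh
# Sketch of a wrapper that could be installed as "R" somewhere on the PATH.
# Assumed, not from the thread: image name, variable names, mount layout.

R_IMAGE="${R_IMAGE:-docker.io/rocker/r2u:22.04}"  # assumed image reference
R_ENGINE="${R_ENGINE:-podman}"                    # podman, or docker

# Run R inside the container, with the current directory mounted at the
# same path and used as the working directory.
run_r() {
    workdir=$(pwd)
    tty_flag=
    if [ -t 0 ]; then tty_flag=-t; fi   # allocate a TTY only when interactive
    set -- "$R_ENGINE" run --rm -i $tty_flag \
        -v "$workdir:$workdir" -w "$workdir" \
        "$R_IMAGE" R "$@"
    if [ -n "${R_WRAPPER_DRY_RUN:-}" ]; then
        printf '%s\n' "$*"              # show the command instead of running it
    else
        exec "$@"
    fi
}

# Only dispatch when this file is invoked under the name "R".
case "$(basename -- "$0")" in
    R) run_r "$@" ;;
esac
```

Setting R_WRAPPER_DRY_RUN=1 prints the container command without
starting anything, which may help when checking the mounts before
rolling the wrapper out to users.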