Dirk Eddelbuettel
2023-Aug-07 12:12 UTC
[Rd] Detecting physical CPUs in detectCores() on Linux platforms
On 7 August 2023 at 08:48, Nils Kehrein wrote:
| I recently noticed that `detectCores()` ignores the `logical=FALSE`
| argument on Linux platforms. This means that the function will always
| return the number of logical CPUs, i.e. it will count the number of threads
| that theoretically can run in parallel due to e.g. hyper-threading.
| Unfortunately, this can result in issues in high-performance computing use
| cases where hyper-threading might degrade performance instead of improving
| it.
|
| Currently, src/library/parallel/R/detectCores.R uses the following R/shell
| code fragment to identify the number of logical CPUs:
| linux = 'grep "^processor" /proc/cpuinfo 2>/dev/null | wc -l'
|
| As far as I understand, one could derive the number of online physical CPUs
| by parsing the contents of /sys/devices/system/cpu/* but that seems rather
| cumbersome. Instead, could we amend the R code with the following line?
| linux = if(logical) 'grep "^processor" /proc/cpuinfo 2>/dev/null | wc -l'
|         else 'lscpu -b --parse="CORE" | tail -n +5 | sort -u | wc -l'

That's good, but you also need to protect this against `lscpu` not being in
the path. Maybe `if (logical && nzchar(Sys.which("lscpu")))` ?

Dirk

| This solution uses `lscpu` from `sys-utils`. The -b switch makes sure that
| only online CPUs/cores are listed and due to the --parse="CORE", the output
| will contain only a single column with logical core ids. It seems to do the
| job in my view, but there might be edge cases for exotic CPU topologies
| that I am not aware of.
|
| Thank you, Nils
|
| ______________________________________________
| R-devel at r-project.org mailing list
| https://stat.ethz.ch/mailman/listinfo/r-devel

--
dirk.eddelbuettel.com | @eddelbuettel | edd at debian.org
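For illustration, the guarded lookup discussed above could be sketched in plain shell roughly as follows. This is only a sketch under assumptions, not the code in detectCores.R: the function names `count_unique_cores` and `physical_cores` are hypothetical, header lines are dropped with `grep -v '^#'` rather than the `tail -n +5` from the proposal, and the fallback when `lscpu` is missing is simply the existing logical count.

```shell
#!/bin/sh

# Count distinct core ids in `lscpu --parse=CORE`-style output read from
# stdin. Comment lines (starting with '#') and empty lines are dropped
# instead of skipping a fixed number of header lines.
count_unique_cores() {
    grep -v -e '^#' -e '^$' | sort -u | wc -l | tr -d ' '
}

# Hypothetical wrapper: use lscpu only when it is on the PATH; otherwise
# fall back to the logical CPU count from /proc/cpuinfo (an assumption
# made for this sketch, not something decided in the thread).
physical_cores() {
    if command -v lscpu >/dev/null 2>&1; then
        lscpu -b --parse=CORE 2>/dev/null | count_unique_cores
    else
        grep -c '^processor' /proc/cpuinfo 2>/dev/null
    fi
}
```

Calling `physical_cores` prints a single number. Keeping the counting step in its own function means it can be exercised with canned input on machines where `lscpu` is not installed.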
Julian Hniopek
2023-Aug-07 12:47 UTC
[Rd] Detecting physical CPUs in detectCores() on Linux platforms
On Mon, 2023-08-07 at 07:12 -0500, Dirk Eddelbuettel wrote:
>
> On 7 August 2023 at 08:48, Nils Kehrein wrote:
> > I recently noticed that `detectCores()` ignores the `logical=FALSE`
> > argument on Linux platforms. This means that the function will always
> > return the number of logical CPUs, i.e. it will count the number of
> > threads that theoretically can run in parallel due to e.g.
> > hyper-threading. Unfortunately, this can result in issues in
> > high-performance computing use cases where hyper-threading might
> > degrade performance instead of improving it.
> >
> > Currently, src/library/parallel/R/detectCores.R uses the following
> > R/shell code fragment to identify the number of logical CPUs:
> > linux = 'grep "^processor" /proc/cpuinfo 2>/dev/null | wc -l'
> >
> > As far as I understand, one could derive the number of online
> > physical CPUs by parsing the contents of /sys/devices/system/cpu/*
> > but that seems rather cumbersome. Instead, could we amend the R code
> > with the following line?
> > linux = if(logical) 'grep "^processor" /proc/cpuinfo 2>/dev/null | wc -l'
> >         else 'lscpu -b --parse="CORE" | tail -n +5 | sort -u | wc -l'
>
> That's good but you also need to at protect this from `lscpu` being in
> the path. Maybe `if (logical && nzchar(Sys.which("lscpu")))` ?
>
> Dirk
>

Alternatively, using only POSIX utils, which should be in the path of all
Linux systems, and /proc/cpuinfo:

    awk '/^physical id/{PHYS_ID=$NF; next} /^cpu cores/{print PHYS_ID" "$NF;}' /proc/cpuinfo 2>/dev/null | sort | uniq | awk '{sum+=$NF;} END {print sum}'

This parses /proc/cpuinfo for the physical id and the number of physical
cores reported for each CPU. It keeps only the unique combinations of
physical id (i.e. socket) and core count, then sums the core counts over
all physical ids to get the total number of physical cores. Something I had
lying around. Someone with better awk skills could probably do the sorting
and filtering in awk as well to save on pipes. Works on single- and
multi-socket AMD/Intel systems in my experience.

Julian

> > This solution uses `lscpu` from `sys-utils`. The -b switch makes sure
> > that only online CPUs/cores are listed and due to the --parse="CORE",
> > the output will contain only a single column with logical core ids. It
> > seems to do the job in my view, but there might be edge cases for
> > exotic CPU topologies that I am not aware of.
> >
> > Thank you, Nils
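As a possible follow-up to the remark about doing the sorting and filtering in awk itself, here is a minimal single-process sketch. It is not code from the thread: the function name `count_physical_cores` and the optional file argument are hypothetical, added so the logic can be tried on a saved copy of /proc/cpuinfo. It also assumes that every entry of one socket reports the same "cpu cores" value, since only one value per physical id is kept.

```shell
#!/bin/sh

# Sum the "cpu cores" value once per distinct "physical id" (socket).
# Reads the file given as first argument, defaulting to /proc/cpuinfo.
count_physical_cores() {
    awk '
        # Remember which socket the current processor entry belongs to.
        /^physical id/ { phys_id = $NF; next }
        # Keep one core count per socket; repeated entries overwrite it.
        /^cpu cores/   { cores[phys_id] = $NF }
        END {
            sum = 0
            for (id in cores) sum += cores[id]
            print sum
        }
    ' "${1:-/proc/cpuinfo}" 2>/dev/null
}
```

The associative array `cores` replaces the `sort | uniq` stage of the pipeline above, so no intermediate pipes are needed. If the file contains no "physical id" or "cpu cores" lines, the printed sum is 0, which a caller would need to treat as "unknown".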