Nils Kehrein
2023-Aug-07 06:48 UTC
[Rd] Detecting physical CPUs in detectCores() on Linux platforms
Dear all,

I recently noticed that `detectCores()` ignores the `logical=FALSE` argument on Linux platforms. This means that the function will always return the number of logical CPUs, i.e. it will count the number of threads that theoretically can run in parallel due to e.g. hyper-threading. Unfortunately, this can result in issues in high-performance computing use cases where hyper-threading might degrade performance instead of improving it.

Currently, src/library/parallel/R/detectCores.R uses the following R/shell code fragment to identify the number of logical CPUs:

    linux = 'grep "^processor" /proc/cpuinfo 2>/dev/null | wc -l'

As far as I understand, one could derive the number of online physical CPUs by parsing the contents of /sys/devices/system/cpu/* but that seems rather cumbersome. Instead, could we amend the R code with the following line?

    linux = if(logical) 'grep "^processor" /proc/cpuinfo 2>/dev/null | wc -l' else 'lscpu -b --parse="CORE" | tail -n +5 | sort -u | wc -l'

This solution uses `lscpu` from `sys-utils`. The -b switch makes sure that only online CPUs/cores are listed and due to the --parse="CORE", the output will contain only a single column with logical core ids. It seems to do the job in my view, but there might be edge cases for exotic CPU topologies that I am not aware of.

Thank you,
Nils
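For illustration, the counting step of the proposed pipeline can be sketched as a small shell function. This is only a sketch under assumptions: the function name `count_cores` is hypothetical, and it filters the header by dropping lines that start with `#` instead of skipping a fixed number of lines with `tail -n +5`, on the assumption that `lscpu --parse` marks all of its header lines that way.

```shell
#!/bin/sh
# count_cores (hypothetical helper): reads `lscpu --parse=CORE` style
# output on stdin and prints the number of distinct core ids.
# Assumption: header/comment lines start with '#'.
count_cores() {
    grep -v '^#' |   # drop the comment header
        sort -u |    # keep one line per distinct core id
        wc -l |      # count the remaining lines
        tr -d ' '    # strip padding that some wc implementations add
}

# Intended use on a Linux system where lscpu is installed:
#   lscpu -b --parse=CORE | count_cores
```

On a machine with two threads per core, the `lscpu` output lists each core id twice, so the count after `sort -u` is half the number of logical CPUs.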
Dirk Eddelbuettel
2023-Aug-07 12:12 UTC
[Rd] Detecting physical CPUs in detectCores() on Linux platforms
On 7 August 2023 at 08:48, Nils Kehrein wrote:
| I recently noticed that `detectCores()` ignores the `logical=FALSE`
| argument on Linux platforms. This means that the function will always
| return the number of logical CPUs, i.e. it will count the number of threads
| that theoretically can run in parallel due to e.g. hyper-threading.
| Unfortunately, this can result in issues in high-performance computing use
| cases where hyper-threading might degrade performance instead of improving
| it.
|
| Currently, src/library/parallel/R/detectCores.R uses the following R/shell
| code fragment to identify the number of logical CPUs:
| linux = 'grep "^processor" /proc/cpuinfo 2>/dev/null | wc -l'
|
| As far as I understand, one could derive the number of online physical CPUs
| by parsing the contents of /sys/devices/system/cpu/* but that seems rather
| cumbersome. Instead, could we amend the R code with the following line?
| linux = if(logical) 'grep "^processor" /proc/cpuinfo 2>/dev/null | wc -l'
| else 'lscpu -b --parse="CORE" | tail -n +5 | sort -u | wc -l'

That's good, but you also need to at least protect this against `lscpu` not being in the path. Maybe `if (logical || !nzchar(Sys.which("lscpu")))`, so that the `grep` branch is also used whenever `lscpu` cannot be found?

Dirk

| This solution uses `lscpu` from `sys-utils`. The -b switch makes sure that
| only online CPUs/cores are listed and due to the --parse="CORE", the output
| will contain only a single column with logical core ids. It seems to do the
| job in my view, but there might be edge cases for exotic CPU topologies
| that I am not aware of.
|
| Thank you, Nils

--
dirk.eddelbuettel.com | @eddelbuettel | edd at debian.org
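One way to read the availability guard discussed above is: use `lscpu` only when a physical count is requested and the tool is actually on the path, otherwise stay with the `/proc/cpuinfo` count. The shell sketch below illustrates that decision. The names `choose_method` and `detect_cores` are hypothetical, and silently falling back to the logical count when `lscpu` is missing is an assumption about the desired behaviour, not something settled in this thread (returning NA would be another option).

```shell
#!/bin/sh
# choose_method LOGICAL HAVE_LSCPU   (hypothetical helper)
# Prints "proc" when the /proc/cpuinfo count should be used and "lscpu"
# when the lscpu-based physical count should be used.
choose_method() {
    if [ "$1" = "TRUE" ] || [ "$2" != "yes" ]; then
        echo proc
    else
        echo lscpu
    fi
}

# detect_cores LOGICAL   (hypothetical helper, LOGICAL is TRUE or FALSE)
detect_cores() {
    logical=${1:-TRUE}
    # Shell analogue of nzchar(Sys.which("lscpu")) in R.
    if command -v lscpu >/dev/null 2>&1; then have=yes; else have=no; fi
    case $(choose_method "$logical" "$have") in
        proc)  grep -c '^processor' /proc/cpuinfo 2>/dev/null ;;
        lscpu) lscpu -b --parse=CORE | grep -v '^#' | sort -u | wc -l ;;
    esac
}
```

Calling `detect_cores FALSE` on a system without `lscpu` then yields the same number as `detect_cores TRUE`, which matches the current behaviour of ignoring `logical=FALSE` in that situation.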