Greetings,

I'm a software engineer with Intel. Recently I've been investigating R performance on Intel Xeon and Xeon Phi processors and RH Linux. I've also compared the performance of R built with the Intel compilers and Intel Math Kernel Library to a "default" build (no config options) that uses the GNU compilers. To my dismay, I've found that the GNU build always runs on a single CPU core, even during matrix operations. The Intel build runs matrix operations on multiple cores, so it is much faster on those operations. Running the benchmark-2.5 on a 24 core Xeon system, the Intel build is 13x faster than the GNU build (21 seconds vs 275 seconds). Unfortunately, this advantage is not documented anywhere that I can see.

Building with the Intel tools is very easy. Assuming the tools are installed in /opt/intel/composerxe, the process is simply (in bash shell):

$ . /opt/intel/composerxe/bin/compilervars.sh intel64
$ ./configure --with-blas="-L/opt/intel/composerxe/mkl/lib/intel64 -lmkl_intel_lp64 -lmkl_intel_thread -lmkl_core -liomp5 -lpthread -lm" --with-lapack CC=icc CFLAGS=-O2 CXX=icpc CXXFLAGS=-O2 F77=ifort FFLAGS=-O2 FC=ifort FCFLAGS=-O2
$ make
$ make check

My questions are:
1) Do most system admins and/or R installers know about this performance difference, and use the Intel tools to build R?
2) Can we add information on the advantage of building with the Intel tools, and how to do it, to the installation instructions and FAQ?

I can post my data if anyone is interested.

Thanks,
Jonathan Anspach
Sr. Software Engineer
Intel Corp.
jonathan.p.anspach at intel.com
713-751-9460
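As a quick check of the "13x" figure, the ratio of the two quoted timings can be computed directly. This is only a minimal sketch: the function name speedup is made up for illustration, and the two numbers are the ones given in the post, not new measurements.

```shell
# speedup BASELINE_SECONDS OPTIMIZED_SECONDS
# Prints baseline/optimized rounded to one decimal place.
# (Hypothetical helper name, not part of R or the Intel tools.)
speedup() {
    awk -v base="$1" -v opt="$2" 'BEGIN { printf "%.1f\n", base / opt }'
}

# Timings quoted in the post: 275 s (GNU build) vs 21 s (Intel build).
speedup 275 21
```

The ratio covers the whole benchmark run, so it mixes the matrix operations that benefit from multiple cores with everything else in the benchmark.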
Jonathan,

I tried something like this myself, comparing gcc, clang and the Intel compiler on a Mac. From my experience with HPC on the university cluster (Landeshochleistungscluster, RWTH Aachen University, where we also use the Xeon Phi), the Intel compiler optimizes code better with regard to vectorisation, etc. (clang still suffers from not yet having an OpenMP library).

Here is a Revolution Analytics article on this topic:
http://blog.revolutionanalytics.com/2010/06/performance-benefits-of-multithreaded-r.html

As I usually use the Rcpp package for C++ extensions, this could give me further performance gains. However, I have already failed when trying to compile R with the Intel compiler and link it against the MKL (see my topic in the Intel Developer Zone, http://software.intel.com/en-us/comment/1767418, and my thread on the R-SIG-Mac list, https://stat.ethz.ch/pipermail/r-sig-mac/2013-November/010472.html).

So, to your questions:

1) I think that most admins do not even use the Intel compiler to compile R; this seems rare to me. I know a few people who do, and I think they could be aware of the difference, but they are only a few. R usage is growing, and I know from regional user meetings that very large companies are starting to use it in their BI units, so this should be of interest.

2) I would really welcome this step, because compiling with the Intel compiler (especially on a Mac) and linking against the MKL seems to be delicate.

I am interested in the data, so if possible please send it via the list or directly to my account. Further, could you show some of the code that you used for the computations?

Best
Simon

On 04 Mar 2014, at 22:44, Anspach, Jonathan P <jonathan.p.anspach at intel.com> wrote:

> Greetings,
>
> I'm a software engineer with Intel. Recently I've been investigating R performance on Intel Xeon and Xeon Phi processors and RH Linux. I've also compared the performance of R built with the Intel compilers and Intel Math Kernel Library to a "default" build (no config options) that uses the GNU compilers. To my dismay, I've found that the GNU build always runs on a single CPU core, even during matrix operations. The Intel build runs matrix operations on multiple cores, so it is much faster on those operations. Running the benchmark-2.5 on a 24 core Xeon system, the Intel build is 13x faster than the GNU build (21 seconds vs 275 seconds). Unfortunately, this advantage is not documented anywhere that I can see.
>
> Building with the Intel tools is very easy. Assuming the tools are installed in /opt/intel/composerxe, the process is simply (in bash shell):
>
> $ . /opt/intel/composerxe/bin/compilervars.sh intel64
> $ ./configure --with-blas="-L/opt/intel/composerxe/mkl/lib/intel64 -lmkl_intel_lp64 -lmkl_intel_thread -lmkl_core -liomp5 -lpthread -lm" --with-lapack CC=icc CFLAGS=-O2 CXX=icpc CXXFLAGS=-O2 F77=ifort FFLAGS=-O2 FC=ifort FCFLAGS=-O2
> $ make
> $ make check
>
> My questions are:
> 1) Do most system admins and/or R installers know about this performance difference, and use the Intel tools to build R?
> 2) Can we add information on the advantage of building with the Intel tools, and how to do it, to the installation instructions and FAQ?
>
> I can post my data if anyone is interested.
>
> Thanks,
> Jonathan Anspach
> Sr. Software Engineer
> Intel Corp.
> jonathan.p.anspach at intel.com
> 713-751-9460
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
Could you tell us whether we should expect the same or similar performance benefits when the GNU compiler suite is teamed up with MKL? And how would one configure such a build?

Many thanks
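For the GNU compilers plus MKL question, one possible starting point is sketched below. It is not a tested recipe from this thread. It assumes MKL is installed under the same prefix as in the original post, and that the GNU interface and GNU OpenMP threading layers (libmkl_gf_lp64, libmkl_gnu_thread) are the right ones for your MKL version; the exact link line should be checked against Intel's link line guidance. The helper name mkl_gnu_blas_flags is made up for illustration, and the script only prints the configure command rather than running it.

```shell
# Assumption: MKL lives under the same prefix used in the original post.
MKL_LIB_PATH="${MKL_LIB_PATH:-/opt/intel/composerxe/mkl/lib/intel64}"

# mkl_gnu_blas_flags DIR
# Prints link flags for MKL using the GNU Fortran interface layer and the
# GNU OpenMP threading layer (hypothetical helper, not part of R or MKL).
mkl_gnu_blas_flags() {
    printf '%s\n' "-L$1 -lmkl_gf_lp64 -lmkl_gnu_thread -lmkl_core -lgomp -lpthread -lm"
}

# Dry run: print the configure command instead of executing it.
# CC, CXX, F77 and FC are left at configure's defaults (the GNU compilers).
printf './configure --with-blas="%s" --with-lapack\n' \
    "$(mkl_gnu_blas_flags "$MKL_LIB_PATH")"
```

Since BLAS-bound matrix operations execute inside MKL whichever compiler built R itself, one would expect a similar benefit on those operations, but that should be confirmed by running the same benchmark on both builds.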