Dirk Eddelbuettel
2001-Nov-12 23:20 UTC
[R] Announcement: Automatic ATLAS support under Debian GNU/Linux
[ If this is considered off-topic please let me know in private mail. ]
With the current version of the glibc library in Debian's "testing" and
"unstable" distributions, ldconfig now loads the ATLAS optimised BLAS without
any user intervention beyond installation of the Atlas and R or Octave
packages.
ATLAS can lead to very dramatic speed increases (up to a factor of ten, see
below for simple examples) for common linear algebra operations. The ability
to use these optimised libraries along with either R or Octave without having
to compile any code is probably a first among Linux, and Unix, distributions.
The README file below (included with Debian's R and Octave packages) provides
a few more details, and has the appropriate acknowledgements.
Comments or questions are welcome.
Best regards, Dirk
Notes on using Atlas libs with GNU Octave and GNU R
I. Overview
As of the Debian releases 2.1.34-6 (for GNU Octave) and 1.3.0-3 (for GNU R),
both Octave and R can be used with Atlas, the Automatically Tuned Linear
Algebra Software, in order to obtain much faster linear algebra operations.
To make use of Atlas, Debian users need to install the Atlas libraries for
their given cpu architecture. Concretely, one of
atlas2-base - Automatically Tuned Linear Algebra Software
atlas2-p3 - Automatically Tuned Linear Algebra Software
atlas2-p4 - Automatically Tuned Linear Algebra Software
atlas2-athlon - Automatically Tuned Linear Algebra Software
must be installed. Here, 'base' provides generic libraries which run on all
platforms whereas 'p3', 'p4' and 'athlon' stand for the Pentium III and IV as
well as the AMD Athlon, respectively. The actual libraries are installed in
/usr/lib/atlas (in the case of 'base') and in /usr/lib/$arch/atlas for the
cpu-specific versions. Here $arch stands for the cpu code used by the kernel
and shown in /proc/cpuinfo.
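The relation between package and library directory can be written out as a small lookup. This is only a sketch: the function name atlas_dir is made up for illustration, and the directory names are the ones this README gives at the end of section II.B.

```shell
# Hypothetical helper: print the library directory for a given Atlas
# package, using only the directories named in this README.
atlas_dir() {
    case "$1" in
        atlas2-base)   echo /usr/lib/atlas ;;
        atlas2-athlon) echo /usr/lib/3dnow/atlas ;;
        atlas2-p3)     echo /usr/lib/xmm/atlas ;;
        atlas2-p4)     echo /usr/lib/26/atlas ;;
        *)             echo "unknown Atlas package: $1" >&2; return 1 ;;
    esac
}
```

On a system that still needs the manual setup of section II.B, something like LD_LIBRARY_PATH=$(atlas_dir atlas2-p3) octave2.1 -q would then select the Pentium III libraries.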
The Atlas libraries can be loaded dynamically instead of the (non-optimised)
blas libraries against which both Octave and R are compiled.
Section III below briefly describes how Atlas libraries can be compiled for
your specific machine to further optimise performance.
II. Using the Atlas libraries
II.A New default behaviour with automatic loading of the Atlas libraries
In order to have the libraries loaded at run-time, the location needs to be
communicated to the dynamic linker/loader. As of the Debian release
libc6_2.2.4-5 of the glibc library, a patch to ldconfig automates the use of
the Atlas library. If an Atlas package is installed, and correctly registered
in /etc/ld.so.conf as done by its postinst script, ldconfig will
automatically load the Atlas version of the Blas instead of the (slower)
default Blas.
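A quick check of the registration described above might look like the following. It is a sketch only: the function name atlas_registered is invented for illustration, and it assumes that the Atlas postinst script adds a line containing "/atlas" to /etc/ld.so.conf (the exact line is not shown in this README).

```shell
# Hypothetical helper: succeed if some non-comment line of an
# ld.so.conf-style file mentions an atlas directory.
# $1 is the file to inspect; it defaults to /etc/ld.so.conf.
atlas_registered() {
    conf="${1:-/etc/ld.so.conf}"
    [ -r "$conf" ] || return 1
    # Drop comment lines first, then look for an atlas path.
    grep -v '^[[:space:]]*#' "$conf" | grep -q '/atlas'
}
```

Running atlas_registered && echo "Atlas directory is registered" reports the state of the configuration file; ldconfig -p | grep -i blas then lists the Blas libraries currently in the loader cache.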
The following text is hence only relevant for systems which have not yet
upgraded to libc6_2.2.4-5 or later.
II.B Old behaviour requiring LD_LIBRARY_PATH for Octave
For Octave, use the variable LD_LIBRARY_PATH. On a computer with the
atlas2-base package:
$ LD_LIBRARY_PATH=/usr/lib/atlas octave2.1 -q
octave2.1:1> X=randn(1000,1000);t=cputime();Y=X'*X;cputime-t
ans = 7.9600
$ octave2.1 -q
octave2.1:1> X=randn(1000,1000);t=cputime();Y=X'*X;cputime-t
ans = 61.520
For R version 1.3.0-4, the R_LD_LIBRARY_PATH variable has to be used, and its
value needs to be copied out of /usr/bin/R (or edited therein). For R version
1.3.1 or later this is done automatically in the R startup shell script. For
an Athlon machine, and with the explicit definition which is no longer needed
as of R 1.3.1, the example becomes
$ R_LD_LIBRARY_PATH=/usr/lib/R/bin:/usr/local/lib:/usr/X11R6/lib:/usr/lib/3dnow/atlas:/usr/lib:/usr/X11R6/lib:/usr/lib/gcc-lib/i386-linux/2.95.4:. \
    R --vanilla -q
> mm <- matrix(rnorm(10^6), ncol = 10^3)
> system.time(crossprod(mm))
[1] 2.38 0.04 2.84 0.00 0.00
$ R --vanilla -q
> mm <- matrix(rnorm(10^6), ncol = 10^3)
> system.time(crossprod(mm))
[1] 28.28 0.08 33.54 0.00 0.00
>
Running such a small example is highly recommended to ascertain that the
libraries are indeed found, and to "prove" that the speed gain is real (and
significant) for problems of at least medium size, such as the 1000x1000
examples above.
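The gain can be put into a single number by dividing the two timings. The helper below is a sketch; the name speedup is made up for illustration, and the inputs are the CPU times quoted in the Octave and R sessions above.

```shell
# Hypothetical helper: ratio of two CPU times, slow (default Blas) over
# fast (Atlas), printed with one decimal place.
speedup() {
    awk -v slow="$1" -v fast="$2" 'BEGIN { printf "%.1f\n", slow / fast }'
}

speedup 61.520 7.9600   # Octave timings quoted above, prints 7.7
speedup 28.28 2.38      # R user times quoted above, prints 11.9
```

Both ratios are of the same order as the speed increase mentioned in the announcement.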
Note that the Octave example uses "/usr/lib/atlas" for the atlas2-base
package; Athlon users should employ "/usr/lib/3dnow/atlas", Pentium III users
should employ "/usr/lib/xmm/atlas" and Pentium IV users should employ
"/usr/lib/26/atlas".
Lastly, it should be pointed out that it is probably worthwhile to locally
compile, and thereby optimise, the Atlas libraries if at least a moderately
intensive load is expected. This is described in the next section.
III. Locally compiling the Atlas libraries
The Debian Atlas packages have been set up to allow for local recompilation of
the Atlas libraries. This way the behaviour will be tuned exactly to the
specific CPU rather than the broader class of CPUs. It has been reported that
this can increase performance by a further 12% on the examples above.
Detailed instructions are in /usr/share/doc/atlas2-base/README.debian.gz but
the process is essentially the following [ courtesy of Doug Bates ]
apt-get source atlas2-base
cd atlas2-$VERSION
fakeroot debian/rules custom
# wait for a *very* long time
dpkg -i ../atlas2-base*.deb
IV. See also
The Atlas packages have a very detailed README.Debian file which should be
consulted; it also details local recompilation. Sources and documentation for
Atlas are at http://www.netlib.org/atlas.
V. Acknowledgements
Camm Maguire developed the scheme of overloading Atlas over the default blas
libraries and deserves all the credit. Many thanks to John Eaton for helping
debug some errors in the initial setup, and to Doug Bates for work on the R
package. Special thanks to Ben Collins for providing a patched ldconfig as
part of the libc6 package.
Initial version
-- Dirk Eddelbuettel <edd at debian.org> Tue, 21 Aug 2001 21:37:15 -0500
First updated
-- Dirk Eddelbuettel <edd at debian.org> Sun, 11 Nov 2001 11:03:19 -0600
--
Better to have an approximate answer to the right question
than a precise answer to the wrong question. -- John Tukey
-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !) To: r-help-request at stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
I have recently done a clean install of Mandrake 8.1 and am trying to
compile R with the aim of preparing a working RPM.
If anyone has managed to compile R-base on Mandrake 8.1, please tell me.
When I try, the R-base build keeps failing with both gcc-2.96 and
gcc-3.0.1, whatever the optimizations. The details vary according to
the choice of compiler and options, but the failure always occurs during the
build of lapack. Installing the liblapack3 and liblapack3-devel rpms shipped
with the Mandrake 8.1 distribution hasn't solved the problem.
Switching between versions of gcc is simple, as /usr/bin/gcc is actually a
link to /etc/alternatives/gcc which is a link to /usr/bin/gcc-2.96 but can
be changed to /usr/bin/gcc-3.0.1 (or even /usr/bin/kgcc, which immediately
gives "/usr/include/stdio.h:300 parse error before `__gnuc_va_list'").
The following examples were each run from a clean state, i.e. the entire old
build directory was deleted each time. Result: five different errors in
double.f, one in cmplx.f. I have kept the config.cache and config.log if
anyone is interested.
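For anyone reproducing this, the compiler that the /usr/bin/gcc link chain described above finally points to can be confirmed before each run. This is a sketch: the function name real_compiler is made up for illustration, and it assumes a readlink that supports -f, as GNU coreutils does.

```shell
# Hypothetical helper: follow every symbolic link from the given path
# and print the file it finally points to.
# Assumes a readlink that supports -f (GNU coreutils does).
real_compiler() {
    readlink -f "${1:-/usr/bin/gcc}"
}
```

With the links as described, real_compiler would be expected to print /usr/bin/gcc-2.96 until /etc/alternatives/gcc is re-pointed.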
Example 1:
gcc 3.0.1, no optimizations set in R-base.spec
....
gcc -I. -I../../../src/include -I../../../src/include -I/usr/local/include
-DHAVE_CONFIG_H -mieee-fp -D__NO_MATH_INLINES -fPIC -g -O2 -c Lapack.c -o
Lapack.lo
g77 -fPIC -g -O2 -c double.f -o double.lo
double.f: In subroutine `dgees':
double.f:2362: Internal error: Segmentation fault.
Example 2:
gcc 3.0.1, CFLAGS and FFLAGS set to $RPM_OPT_FLAGS but
-ffast-math removed (this flag has always caused the R build
to fail in the past)
....
gcc -I. -I../../../src/include -I../../../src/include -I/usr/local/include
-DHAVE_CONFIG_H -mieee-fp -D__NO_MATH_INLINES -fPIC -O3
-fomit-frame-pointer -pipe -mcpu=pentiumpro -march=i586
-fno-strength-reduce -c Lapack.c -o Lapack.lo
g77 -fPIC -O3 -fomit-frame-pointer -pipe -mcpu=pentiumpro -march=i586
-fno-strength-reduce -c double.f -o double.lo
double.f: In subroutine `dbdsqr':
double.f:731: Internal error: Segmentation fault.
Example 3:
gcc 3.0.1, $RPM_OPT_FLAGS enabled
....
gcc -I. -I../../../src/include -I../../../src/include -I/usr/local/include
-DHAVE_CONFIG_H -mieee-fp -D__NO_MATH_INLINES -fPIC -O3
-fomit-frame-pointer -pipe -mcpu=pentiumpro -march=i586 -ffast-math
-fno-strength-reduce -c Lapack.c -o Lapack.lo
etc. gives a segmentation fault in yet another subroutine of double.f
Example 4:
gcc 2.96, no optimizations set in R-base.spec
....
make[4]: Entering directory
`/home/mike/rpm/BUILD/R-1.3.1/src/modules/lapack'
gcc -I. -I../../../src/include -I../../../src/include -I/usr/local/include
-DHAVE_CONFIG_H -mieee-fp -D__NO_MATH_INLINES -fPIC -g -O2 -c Lapack.c -o
Lapack.lo
g77 -fPIC -g -O2 -c double.f -o double.lo
double.f: In subroutine `dtrsyl':
double.f:27942: Internal error: Segmentation fault.
Example 5:
gcc 2.96, CFLAGS and FFLAGS set to $RPM_OPT_FLAGS
but -ffast-math removed (this flag has always caused the R
build to fail in the past)
....
gcc -I. -I../../../src/include -I../../../src/include -I/usr/local/include
-DHAVE_CONFIG_H -mieee-fp -D__NO_MATH_INLINES -fPIC -O3
-fomit-frame-pointer -pipe -mcpu=pentiumpro -march=i586
-fno-strength-reduce -c Lapack.c -o Lapack.lo
g77 -fPIC -O3 -fomit-frame-pointer -pipe -mcpu=pentiumpro -march=i586
-fno-strength-reduce -c double.f -o double.lo
double.f: In subroutine `dgecon':
double.f:1931: Internal error: Segmentation fault.
Example 6:
gcc 2.96, $RPM_OPT_FLAGS enabled
....
mcpu=pentiumpro -march=i586 -ffast-math -fno-strength-reduce -c Lapack.c
-o Lapack.lo
g77 -fPIC -O3 -fomit-frame-pointer -pipe -mcpu=pentiumpro -march=i586
-ffast-math -fno-strength-reduce -c double.f -o double.lo
g77 -fPIC -O3 -fomit-frame-pointer -pipe -mcpu=pentiumpro -march=i586
-ffast-math -fno-strength-reduce -c cmplx.f -o cmplx.lo
cmplx.f: In subroutine `zbdsqr':
cmplx.f:783: Internal error: Segmentation fault.
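To compare failures across build logs like the ones above, the failing file, line and subroutine can be pulled out mechanically. The following is a sketch only: the function name ice_summary is made up for illustration, and it assumes that g77 prints an "In subroutine" line before each "Internal error" line, as in the excerpts above.

```shell
# Hypothetical helper: read a build log on standard input and print
# "file:line subroutine" for every g77 internal error it contains.
ice_summary() {
    awk '
        /In subroutine/ {
            # The routine name sits between the backtick and the
            # closing quote; strip the trailing punctuation.
            split($0, parts, "`")
            routine = parts[2]
            sub(/[^A-Za-z0-9_]+$/, "", routine)
        }
        /Internal error/ {
            # Fields are file, line number, message.
            split($0, loc, ":")
            print loc[1] ":" loc[2] " " routine
        }
    '
}
```

Piping a build through it, for instance make 2>&1 | ice_summary, would reduce Example 1 to the single line "double.f:2362 dgees".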
Any suggestions?
--
Dr. Michele Alzetta
Agustin Lobo
2001-Nov-21 14:08 UTC
[R] Announcement: Automatic ATLAS support under Debian GNU/Linux
What about other distributions? Can ATLAS be used with, e.g., SuSE?
Thanks
Agus
Dr. Agustin Lobo
Instituto de Ciencias de la Tierra (CSIC)
Lluis Sole Sabaris s/n
08028 Barcelona SPAIN
tel 34 93409 5410
fax 34 93411 0012
alobo at ija.csic.es