Dear R-help,

Here's a posting to the most recent NA-digest:

From: Robert van de Geijn <rvdg at cs.utexas.edu>
Date: Fri, 13 Dec 2002 11:15:23 -0600
Subject: Fast BLAS Libraries for Current Architectures

Recent research by Kazushige Goto, Visiting Scientist at UT-Austin, has
resulted in high-performance BLAS libraries for the Intel (R) Pentium (R)
III and 4 processors, the HP/Compaq/DEC Alpha processor, and the IBM
Power 3 and 4 architectures.

Performance improvements appear to be substantial. For example, by
linking this library instead of other commonly used BLAS libraries, the
performance of the 600 processor (Pentium 4 processor based) cluster at
the University at Buffalo-SUNY was increased from roughly 1.5 TeraFLOPS
to 2.0 TeraFLOPS (HPL LINPACK benchmark used for the TOP500 list. See
http://www.ccr.buffalo.edu/newsReleases/newsRelease.htm).

To help us evaluate this library, kindly visit

    http://www.cs.utexas.edu/users/flame/goto/

For information on the techniques underlying the implementation, see

    Kazushige Goto and Robert van de Geijn.
    On Reducing TLB Misses in Matrix Multiplication.
    FLAME Working Note #9, The University of Texas at Austin,
    Department of Computer Sciences.
    Technical Report TR-2002-55. Nov. 2002.

available from

    http://www.cs.utexas.edu/users/flame/pubs.html

Regards
Robert van de Geijn

=========================================

Is anyone interested in trying to link R against it to see how much of a
difference it makes? (I would, except that I won't have high-speed net
access until I get back to the office next year, and downloading large
files is too painful...)

Cheers,
Andy