Basic Linear Algebra Subprograms
Basic Linear Algebra Subprograms (BLAS) is a de facto application programming interface standard for libraries that perform basic linear algebra operations such as vector and matrix multiplication. The routines were first published in 1979 and are used to build larger packages such as LAPACK. BLAS is heavily used in high-performance computing, and highly optimized implementations of the interface have been developed by hardware vendors such as Intel and AMD, as well as by other authors, e.g. Goto BLAS and ATLAS (a portable self-optimizing BLAS). The LINPACK and HPL benchmarks rely heavily on DGEMM, a BLAS subroutine, for their performance.
The BLAS functionality is divided into three levels: 1, 2 and 3.
Level 1 
This level contains vector operations of the form $\mathbf{y} \leftarrow \alpha \mathbf{x} + \mathbf{y}$.
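A Level 1 update of the AXPY kind, y ← αx + y, can be sketched in plain Python as follows. This is a minimal illustration of what the operation computes, not a binding to any BLAS library; the function name `axpy` and its list-based signature are assumptions chosen for this example.

```python
def axpy(alpha, x, y):
    """Return alpha*x + y for two equal-length vectors given as lists."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    # One multiply and one add per element: O(n) work on O(n) data.
    return [alpha * xi + yi for xi, yi in zip(x, y)]


if __name__ == "__main__":
    print(axpy(2.0, [1.0, 2.0, 3.0], [10.0, 20.0, 30.0]))
```

Unlike this sketch, which returns a new list, BLAS routines typically overwrite y in place.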
Level 2 
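This level contains matrix-vector operations of the form $\mathbf{y} \leftarrow \alpha A \mathbf{x} + \beta \mathbf{y}$, as well as solving $T \mathbf{x} = \mathbf{y}$ for $\mathbf{x}$ with $T$ being triangular, among other things.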
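The two Level 2 operations named here, the matrix-vector update y ← αAx + βy and a triangular solve Tx = y, can be sketched in plain Python. The names `gemv` and `trsv_lower`, the list-of-rows matrix layout, and the restriction to a lower triangular T are assumptions made for this illustration; real BLAS routines take additional arguments (storage order, transposition flags, strides) that are omitted.

```python
def gemv(alpha, A, x, beta, y):
    """Return alpha*A*x + beta*y, with A given as a list of rows."""
    if len(A) != len(y) or any(len(row) != len(x) for row in A):
        raise ValueError("dimension mismatch")
    return [
        alpha * sum(a_ij * x_j for a_ij, x_j in zip(row, x)) + beta * y_i
        for row, y_i in zip(A, y)
    ]


def trsv_lower(T, y):
    """Solve T x = y by forward substitution.

    T is assumed to be square, lower triangular, with a nonzero diagonal.
    """
    n = len(y)
    x = [0.0] * n
    for i in range(n):
        # Subtract the contribution of the already-computed unknowns.
        s = sum(T[i][j] * x[j] for j in range(i))
        x[i] = (y[i] - s) / T[i][i]
    return x


if __name__ == "__main__":
    A = [[1.0, 2.0], [3.0, 4.0]]
    print(gemv(2.0, A, [1.0, 1.0], 3.0, [1.0, 1.0]))
    print(trsv_lower([[2.0, 0.0], [1.0, 4.0]], [4.0, 10.0]))
```

Both routines do O(n²) arithmetic on O(n²) data.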
Level 3 
This level contains matrix-matrix operations of the form $C \leftarrow \alpha A B + \beta C$, as well as solving $B \leftarrow \alpha T^{-1} B$ for triangular matrices $T$, among other things. This level contains the widely used General Matrix Multiply (GEMM) operation.
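The General Matrix Multiply update, C ← αAB + βC, can be sketched as a naive triple loop in plain Python. The function name `gemm` and the list-of-rows layout are assumptions for this illustration. Optimized implementations compute the same result but reorganize the loops (for example by blocking for cache), which this sketch does not attempt.

```python
def gemm(alpha, A, B, beta, C):
    """Return alpha*A*B + beta*C for non-empty matrices given as lists of rows.

    A is m-by-k, B is k-by-n, and C is m-by-n.
    """
    m, k, n = len(A), len(B), len(B[0])
    if (any(len(row) != k for row in A)
            or any(len(row) != n for row in B)
            or len(C) != m
            or any(len(row) != n for row in C)):
        raise ValueError("dimension mismatch")
    return [
        [
            alpha * sum(A[i][p] * B[p][j] for p in range(k)) + beta * C[i][j]
            for j in range(n)
        ]
        for i in range(m)
    ]


if __name__ == "__main__":
    A = [[1.0, 2.0], [3.0, 4.0]]
    B = [[5.0, 6.0], [7.0, 8.0]]
    C = [[1.0, 1.0], [1.0, 1.0]]
    print(gemm(1.0, A, B, 2.0, C))
```

The operation performs O(n³) arithmetic on O(n²) data, so operands can be reused many times once loaded.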
Implementations
- Apple's Accelerate framework for Mac OS X and iOS, which includes tuned versions of BLAS and LAPACK.
- The AMD Core Math Library (ACML), supporting the AMD Athlon and Opteron CPUs under Linux and Windows.
- C++ AMP BLAS: an open source implementation of BLAS for Microsoft's AMP language extension for Visual C++.
- Automatically Tuned Linear Algebra Software, an open source implementation of BLAS APIs for C and Fortran 77.
- IBM's Engineering and Scientific Subroutine Library, supporting the PowerPC architecture under AIX and Linux.
- Eigen BLAS: a Fortran 77 and C BLAS library implemented on top of the open source Eigen library, supporting x86, x86-64, ARM (NEON), and PowerPC architectures. (Note: as of Eigen 3.0.3, the BLAS interface is not built by default and the documentation refers to it as "a work in progress which is far to be ready for use".)
- Goto BLAS: Kazushige Goto's BSD-licensed implementation of BLAS, tuned in particular for Intel Nehalem/Atom, VIA Nano, and AMD Opteron processors.
- HP MLIB: HP's math library, supporting the IA-64, PA-RISC, x86 and Opteron architectures under HP-UX and Linux.
- Intel MKL: the Intel Math Kernel Library, supporting the old Intel Pentium (although there are some doubts about future support for the Pentium architecture), Core and Itanium CPUs under Linux, Windows and Mac OS X.
- NEC's math library, supporting NEC SX architecture under SUPER-UX, and Itanium under Linux 
- Netlib BLAS: the official reference implementation on Netlib, written in Fortran 77.
- Netlib CBLAS: the reference C interface to the BLAS. It is also possible (and popular) to call the Fortran BLAS from C.
- NEC's Public Domain Mathematical Library for the NEC SX-4 system.
- SGI's Scientific Computing Software Library contains BLAS and LAPACK implementations for SGI's Irix workstations.
- Sun Performance Library: optimized BLAS and LAPACK for SPARC, Core and AMD64 architectures under Solaris 8, 9, and 10 as well as Linux.
- An optimized BLAS by Ei-ji Nakama that attempts to continue the work of Kazushige Goto.
- OpenBLAS, an optimized BLAS based on Goto BLAS and hosted at GitHub, supporting Intel Sandy Bridge and MIPS-architecture Loongson processors.
- cuBLAS, an optimized BLAS for NVIDIA-based GPU cards.
Other libraries offering BLAS-like functionality 
- AMD APPML: the AMD Accelerated Parallel Processing Math Libraries contain FFT and BLAS functions (all three levels) written in OpenCL. They are designed to run on AMD GPUs supporting OpenCL and also work on CPUs, which facilitates multicore programming and debugging.
- Armadillo is a C++ linear algebra library aiming for a good balance between speed and ease of use. It employs template classes and has optional links to BLAS/ATLAS and LAPACK. It is sponsored by NICTA (in Australia) and is distributed under a free license.
- The Eigen template library provides an easy-to-use, highly generic C++ template interface to matrix/vector operations and related algorithms such as solvers and decompositions. It uses vector capabilities and is optimized for fixed-size, dynamically sized, and sparse matrices.
- CUDA SDK: the NVIDIA CUDA SDK includes BLAS functionality for writing C programs that run on GeForce 8 Series or newer graphics cards.
- The GNU Scientific Library contains a multi-platform implementation in C, which is distributed under the GNU General Public License.
- The FLAME project's implementation of a dense linear algebra library.
- The Matrix Algebra on GPU and Multicore Architectures (MAGMA) project develops a dense linear algebra library similar to LAPACK but for heterogeneous and hybrid architectures, including multicore systems accelerated with GPGPU graphics cards.
- The Matrix Template Library version 4 (MTL4) is a generic C++ template library providing sparse and dense BLAS functionality. MTL4 offers an intuitive interface (similar to MATLAB) and broad applicability thanks to generic programming.
- The Parallel Linear Algebra for Scalable Multi-core Architectures (PLASMA) project is a modern replacement of LAPACK for multi-core architectures. PLASMA is a software framework for developing asynchronous operations. It features out-of-order scheduling with a runtime scheduler called QUARK, which may be used for any code that expresses its dependencies as a directed acyclic graph.
- uBLAS: a generic C++ template class library providing BLAS functionality, part of the Boost library. It provides bindings to many hardware-accelerated libraries in a unifying notation. Moreover, uBLAS focuses on correctness of the algorithms using advanced C++ features.
- A C++ template library that can solve linear equations and compute eigenvalues. It is licensed under the BSD License.
The Sparse BLAS 
Sparse extensions to the originally dense BLAS exist, for example in ACML.
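A sparse matrix-vector product, a core operation for a sparse extension of the BLAS, can be sketched as follows. The compressed sparse row (CSR) layout used here is one common storage scheme and is an assumption of this example. The source does not say which formats the Sparse BLAS or ACML use, and the function name `csr_matvec` is hypothetical.

```python
def csr_matvec(values, col_index, row_ptr, x):
    """Return A*x for a sparse matrix A stored in CSR form.

    values    -- nonzero entries, row by row
    col_index -- column of each entry in `values`
    row_ptr   -- row i occupies values[row_ptr[i]:row_ptr[i + 1]]
    """
    y = []
    for i in range(len(row_ptr) - 1):
        s = 0.0
        # Only the stored nonzeros of row i are visited.
        for k in range(row_ptr[i], row_ptr[i + 1]):
            s += values[k] * x[col_index[k]]
        y.append(s)
    return y


if __name__ == "__main__":
    # Dense equivalent: [[1, 0, 2], [0, 0, 3], [4, 5, 0]]
    values = [1.0, 2.0, 3.0, 4.0, 5.0]
    col_index = [0, 2, 2, 0, 1]
    row_ptr = [0, 2, 3, 5]
    print(csr_matvec(values, col_index, row_ptr, [1.0, 1.0, 1.0]))
```

The work is proportional to the number of stored nonzeros rather than to the full matrix size.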
See also 
- Numerical linear algebra, the type of problem BLAS solves
- LAPACK, the Linear Algebra Package
- Math Kernel Library
- List of numerical libraries
External links
- BLAS homepage on Netlib.org
- BLAS FAQ
- BLAS operations from the GNU Scientific Library reference manual
- BLAS Quick Reference Guide from LAPACK Users' Guide
- CSBlas for C#. CSBlas is a translation of the BLAS subroutines from Fortran to C#.
- NLapack. Port of LAPACK and BLAS using unmanaged (native) Fortran libraries.
- Lawson Oral History. One of the original authors of the BLAS discusses its creation in an oral history interview. Charles L. Lawson, oral history interview by Thomas Haigh, 6 and 7 November 2004, San Clemente, California. Society for Industrial and Applied Mathematics, Philadelphia, PA.
- Dongarra Oral History. In an oral history interview, Jack Dongarra explores the early relationship of BLAS to LINPACK, the creation of higher-level BLAS versions for new architectures, and his later work on the ATLAS system to automatically optimize BLAS for particular machines. Jack Dongarra, oral history interview by Thomas Haigh, 26 April 2005, University of Tennessee, Knoxville, TN. Society for Industrial and Applied Mathematics, Philadelphia, PA.
- An Overview of the Sparse Basic Linear Algebra Subprograms: The New Standard from the BLAS Technical Forum