|Original author(s)||Travis Oliphant|
|Initial release||As Numeric, 1995; as NumPy, 2006|
|Stable release||1.11.2 / 3 October 2016|
|Preview release||1.12.0b1 / 16 November 2016|
|Written in||Python, C|
NumPy (pronounced NUM-py or sometimes NUM-pee) is an extension to the Python programming language, adding support for large, multi-dimensional arrays and matrices, along with a large library of high-level mathematical functions to operate on these arrays. The ancestor of NumPy, Numeric, was originally created by Jim Hugunin with contributions from several other developers. In 2005, Travis Oliphant created NumPy by incorporating features of the competing Numarray into Numeric, with extensive modifications. NumPy is open-source software and has many contributors.
NumPy targets the CPython reference implementation of Python, which is a non-optimizing bytecode interpreter. Mathematical algorithms written for this version of Python often run much slower than compiled equivalents. NumPy addresses the slowness problem partly by providing multidimensional arrays, together with functions and operators that operate efficiently on them. Taking advantage of this requires rewriting some code, mostly inner loops, in terms of NumPy operations.
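As a rough illustration of what rewriting an inner loop looks like, the sketch below computes a sum of squares twice: once with an explicit Python loop and once with a single whole-array operation. The function names and the test data are made up for this example, and no timing figures are claimed.

```python
import numpy as np


def sum_of_squares_loop(values):
    """Pure-Python inner loop: one interpreted iteration per element."""
    total = 0.0
    for v in values:
        total += v * v
    return total


def sum_of_squares_numpy(values):
    """The same computation expressed as one whole-array operation."""
    arr = np.asarray(values, dtype=float)
    return float(np.dot(arr, arr))  # the loop runs inside compiled code


# Hypothetical input data, chosen only for illustration.
data = [0.5 * i for i in range(1000)]
loop_result = sum_of_squares_loop(data)
numpy_result = sum_of_squares_numpy(data)
```

Both functions return the same value; the second one simply moves the per-element work out of the interpreter.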
Using NumPy in Python gives functionality comparable to MATLAB since they are both interpreted, and they both allow the user to write fast programs as long as most operations work on arrays or matrices instead of scalars. In comparison, MATLAB boasts a large number of additional toolboxes, notably Simulink, whereas NumPy is intrinsically integrated with Python, a more modern, complete, and open source programming language. Moreover, complementary Python packages are available; SciPy is a library that adds more MATLAB-like functionality and Matplotlib is a plotting package that provides MATLAB-like plotting functionality. Internally, both MATLAB and NumPy rely on BLAS and LAPACK for efficient linear algebra computations.
The ndarray data structure
The core functionality of NumPy is its "ndarray", for n-dimensional array, data structure. These arrays are strided views on memory. In contrast to Python's built-in list data structure (which, despite the name, is a dynamic array), these arrays are homogeneously typed: all elements of a single array must be of the same type.
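The following short sketch illustrates both properties, using small made-up arrays: the `strides` attribute reports how many bytes separate neighbouring elements along each dimension, a basic slice is a view that shares memory with the original, and mixed input is coerced to a single element type.

```python
import numpy as np

# A 3x4 array of 8-byte integers stored in one contiguous block.
grid = np.arange(12, dtype=np.int64).reshape(3, 4)

# Bytes to step to reach the next row and the next column.
row_stride, col_stride = grid.strides

# Basic slicing builds a new strided view; no data is copied.
every_other_col = grid[:, ::2]
every_other_col[0, 0] = 99  # writes through to `grid`

# Homogeneous typing: mixed int/float input is coerced to one dtype.
mixed = np.array([1, 2.5, 3])
```

Here `every_other_col` reuses the buffer of `grid` with a doubled column stride, which is why the assignment is visible through both names.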
Such arrays can also be views into memory buffers allocated by C/C++, Cython, and Fortran extensions to the CPython interpreter without the need to copy data around, giving a degree of compatibility with existing numerical libraries. This functionality is exploited by the SciPy package, which wraps a number of such libraries (notably BLAS and LAPACK). NumPy has built-in support for memory-mapped ndarrays.
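A minimal sketch of both ideas follows. A Python `bytearray` stands in for a buffer owned by some external library (an assumption made for the example), `np.frombuffer` wraps it without copying, and `np.memmap` backs an array with a file on disk. The file name and temporary directory are hypothetical.

```python
import os
import tempfile

import numpy as np

# Stand-in for memory allocated elsewhere: room for four float64 values.
raw = bytearray(8 * 4)

# Wrap the existing buffer; the array shares memory with `raw`.
view = np.frombuffer(raw, dtype=np.float64)
view[:] = [1.0, 2.0, 3.0, 4.0]  # modifies the bytes of `raw` in place

# Memory-mapped array: the data lives in a file rather than in RAM.
path = os.path.join(tempfile.mkdtemp(), "data.bin")
mapped = np.memmap(path, dtype=np.float64, mode="w+", shape=(4,))
mapped[:] = view
mapped.flush()  # push pending changes to disk

# Read the file back through the ordinary file API.
reloaded = np.fromfile(path, dtype=np.float64)
```

Because `view` is only a window onto `raw`, changes made through the array are seen by any other code holding the same buffer.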
NumPy's arrays must be views on contiguous memory buffers. A replacement package called Blaze attempts to overcome this limitation.
Algorithms that are not expressible as a vectorized operation will typically run slowly because they must be implemented in "pure Python", while vectorization may increase the memory complexity of some operations from constant to linear, because temporary arrays must be created that are as large as the inputs. Runtime compilation of numerical code has been implemented by several groups to avoid these problems; open source solutions that interoperate with NumPy include scipy.weave, numexpr and Numba. Cython is a static-compiling alternative to these.
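The memory cost of vectorization can be seen in a small sketch like the one below, which uses made-up input arrays. The expression `a * b + c` allocates a full-size temporary for `a * b` before the addition, whereas passing an `out` array to the ufuncs reuses one preallocated buffer. This reduces the number of temporaries but does not remove the need for a result array; it is not how the runtime compilers listed above work.

```python
import numpy as np

n = 1000
a = np.linspace(0.0, 1.0, n)
b = np.linspace(1.0, 2.0, n)
c = np.ones(n)

# Straightforward vectorized form: `a * b` is materialized as a
# temporary array of length n before `c` is added to it.
result = a * b + c

# Same arithmetic with a single caller-managed buffer.
buffer = np.empty_like(a)
np.multiply(a, b, out=buffer)  # buffer = a * b
np.add(buffer, c, out=buffer)  # buffer = buffer + c, in place
```

Both forms produce the same values; the second trades readability for control over allocations.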
- Array creation
>>> import numpy as np
>>> x = np.array([1, 2, 3])
>>> x
array([1, 2, 3])
>>> y = np.arange(10)  # like Python's range, but returns an array
>>> y
array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
- Basic operations
>>> a = np.array([1, 2, 3, 6])
>>> b = np.linspace(0, 2, 4)  # four equally spaced points, starting with 0 and ending with 2
>>> c = a - b
>>> c
array([ 1. , 1.33333333, 1.66666667, 4. ])
>>> a**2
array([ 1, 4, 9, 36])
- Universal functions
>>> a = np.linspace(-np.pi, np.pi, 100)
>>> b = np.sin(a)
>>> c = np.cos(a)
- Linear algebra
>>> from numpy.random import rand
>>> from numpy.linalg import solve, inv
>>> a = np.array([[1, 2, 3], [3, 4, 6.7], [5, 9.0, 5]])
>>> a.transpose()
array([[ 1. , 3. , 5. ],
       [ 2. , 4. , 9. ],
       [ 3. , 6.7, 5. ]])
>>> inv(a)
array([[-2.27683616, 0.96045198, 0.07909605],
       [ 1.04519774, -0.56497175, 0.1299435 ],
       [ 0.39548023, 0.05649718, -0.11299435]])
>>> b = np.array([3, 2, 1])
>>> solve(a, b)  # solve the equation ax = b
array([-4.83050847, 2.13559322, 1.18644068])
>>> c = rand(3, 3) * 20  # create a 3x3 random matrix of values within [0, 1) scaled by 20
>>> c
array([[ 3.98732789, 2.47702609, 4.71167924],
       [ 9.24410671, 5.5240412 , 10.6468792 ],
       [ 10.38136661, 8.44968437, 15.17639591]])
>>> np.dot(a, c)  # matrix multiplication
array([[ 53.61964114, 38.8741616 , 71.53462537],
       [ 118.4935668 , 86.14012835, 158.40440712],
       [ 155.04043289, 104.3499231 , 195.26228855]])
>>> a @ c  # starting with Python 3.5 and NumPy 1.10
array([[ 53.61964114, 38.8741616 , 71.53462537],
       [ 118.4935668 , 86.14012835, 158.40440712],
       [ 155.04043289, 104.3499231 , 195.26228855]])
The Python programming language was not initially designed for numerical computing, but it attracted the attention of the scientific and engineering community early on, and a special interest group called matrix-sig was founded in 1995 with the aim of defining an array computing package. Among its members was Python designer and maintainer Guido van Rossum, who implemented extensions to Python's syntax (in particular the indexing syntax) to make array computing easier. An implementation of a matrix package was completed by Jim Fulton, then generalized by Jim Hugunin to become Numeric, also variously called Numerical Python extensions or NumPy. Hugunin, a graduate student at MIT, joined CNRI to work on JPython in 1997, leading Paul Dubois of LLNL to take over as maintainer. Other early contributors include David Ascher, Konrad Hinsen and Travis Oliphant.
A new package called Numarray was written as a more flexible replacement for Numeric. Like Numeric, it is now deprecated. Numarray had faster operations for large arrays, but was slower than Numeric on small ones, so for a time both packages were used for different use cases. The last version of Numeric, v24.2, was released on 11 November 2005, and the last version of numarray, v1.5.2, was released on 24 August 2006.
There was a desire to get Numeric into the Python standard library, but Guido van Rossum (the author of Python) was quite clear that the code was not maintainable in its state at the time.
In early 2005, NumPy developer Travis Oliphant wanted to unify the community around a single array package and ported Numarray's features to Numeric, releasing the result as NumPy 1.0 in 2006. This new project was part of SciPy. To avoid installing the large SciPy package just to get an array object, this new package was separated and called NumPy. Support for Python 3 was added in version 1.5.0.