NumPy

Original author(s): Travis Oliphant
Developer(s): Community project
Initial release: as Numeric, 1995; as NumPy, 2006
Stable release: 1.8.0 / 30 October 2013
Written in: Python, C
Operating system: Cross-platform
Type: Technical computing
License: BSD-new license
Website: www.numpy.org

NumPy is an extension to the Python programming language, adding support for large, multi-dimensional arrays and matrices, along with a large library of high-level mathematical functions to operate on these arrays. The ancestor of NumPy, Numeric, was originally created by Jim Hugunin with contributions from several other developers. In 2005, Travis Oliphant created NumPy by incorporating features of Numarray into Numeric with extensive modifications. NumPy is open source and has many contributors.

Traits

NumPy targets the CPython reference implementation of Python, which is a non-optimizing bytecode compiler/interpreter. Mathematical algorithms written for this version of Python often run much slower than compiled equivalents. NumPy seeks to address this problem by providing multidimensional arrays, together with functions and operators that operate efficiently on them. Thus any algorithm that can be expressed primarily as operations on arrays and matrices can run almost as quickly as the equivalent C code.[1]
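
As a rough illustration of the paragraph above, the following sketch computes the same sum of squares twice: once with an explicit Python loop and once as a single array operation. The function names are invented for this example, and no timing figures are claimed, since any speed-up depends on the machine and on the array size.

```python
import numpy as np

def sum_of_squares_loop(values):
    """Pure-Python loop: every iteration runs in the bytecode interpreter."""
    total = 0.0
    for v in values:
        total += v * v
    return total

def sum_of_squares_vectorized(values):
    """Single array operation: the loop runs inside NumPy's compiled code."""
    arr = np.asarray(values, dtype=float)
    return float(np.dot(arr, arr))

data = np.arange(1000, dtype=float)
loop_result = sum_of_squares_loop(data)
vector_result = sum_of_squares_vectorized(data)
```

Both functions return the same value; only the place where the loop executes differs.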

Using NumPy in Python gives functionality comparable to MATLAB: both are interpreted, and both allow the user to write fast programs as long as most operations work on arrays or matrices instead of scalars. MATLAB offers a large number of additional toolboxes, notably Simulink, whereas NumPy is intrinsically integrated with Python, a more modern, complete, and open source programming language. Moreover, complementary Python packages are available: SciPy is a library that adds more MATLAB-like functionality, and Matplotlib is a plotting package that provides MATLAB-like plotting functionality. Internally, both MATLAB and NumPy rely on BLAS and LAPACK for efficient linear algebra computations.

The ndarray data structure

The core functionality of NumPy is its "ndarray" (n-dimensional array) data structure. These arrays are strided views on memory.[2] In contrast to Python's built-in list data structure (which, despite the name, is a dynamic array), these arrays are homogeneously typed: all elements of a single array must be of the same type.
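
The two properties described above, strided views and homogeneous typing, can be observed directly. The sketch below is illustrative only; the variable names are made up, and the stride values assume a C-ordered array of 64-bit integers.

```python
import numpy as np

# A 3x4 array of 64-bit integers stored in one contiguous buffer.
grid = np.arange(12, dtype=np.int64).reshape(3, 4)

# Strides give the number of bytes to step along each axis.
row_stride, col_stride = grid.strides  # (32, 8) for C-ordered int64 data

# Slicing produces a view: same memory, different strides, no copy.
every_other_column = grid[:, ::2]
shares_memory = np.shares_memory(grid, every_other_column)

# Writing through the view changes the original array.
every_other_column[0, 0] = 100

# Homogeneous typing: mixed inputs are converted to one common dtype.
mixed = np.array([1, 2.5, 3])
```

After this runs, `grid[0, 0]` holds 100 because the view and the original share one buffer, and `mixed` has a floating-point dtype because the integers were converted to match the float.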

Such arrays can also be views into memory buffers allocated by C, C++, Cython, and Fortran extensions to the CPython interpreter without the need to copy data around, giving a degree of compatibility with existing numerical libraries. This functionality is exploited by the SciPy package, which wraps a number of such libraries (notably BLAS and LAPACK). NumPy has built-in support for memory-mapped ndarrays.[2]
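
A minimal sketch of both ideas follows. A Python `bytearray` stands in for memory owned by an extension module, and the memory-mapped file is a temporary file created only for this example; both are assumptions made for illustration, not part of any particular library's workflow.

```python
import os
import tempfile
import numpy as np

# A buffer allocated outside NumPy (a stand-in for extension-owned memory).
raw = bytearray(4 * 8)  # room for four 64-bit floats, all zero

# frombuffer wraps the existing memory instead of copying it.
view = np.frombuffer(raw, dtype=np.float64)
view[1] = 2.5  # writes straight into `raw`

# The same bytes read back through a second, independent wrapper.
second_view = np.frombuffer(raw, dtype=np.float64)

# Memory-mapped array backed by a file on disk (hypothetical temp file).
path = os.path.join(tempfile.mkdtemp(), "example.dat")
mapped = np.memmap(path, dtype=np.float32, mode="w+", shape=(2, 3))
mapped[:] = 1.0
mapped.flush()  # push the changes to the file

# Reopen the file read-only and view it as the same 2x3 array.
reloaded = np.memmap(path, dtype=np.float32, mode="r", shape=(2, 3))
```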

Limitations

NumPy's arrays must be views on contiguous memory buffers. A replacement package called Blaze attempts to overcome this limitation.[3]

Algorithms that are not expressible as vectorized operations will typically run slowly because they must be implemented in "pure Python". Vectorization, in turn, may increase the memory complexity of some operations from constant to linear, because temporary arrays must be created that are as large as the inputs. Runtime compilation of numerical code has been implemented by several groups to avoid these problems; open source solutions that interoperate with NumPy include scipy.weave, numexpr[4] and Numba.[5] Cython is a static-compiling alternative to these.
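
The temporary-array cost mentioned above can be made concrete with a small sketch. It does not use numexpr, Numba or Cython; it only shows where a temporary appears in a plain expression and one way to limit it within NumPy itself, using the `out` argument of ufuncs. The variable names are arbitrary.

```python
import numpy as np

n = 1000
a = np.ones(n)
b = np.full(n, 2.0)
c = np.full(n, 3.0)

# Plain vectorized form: `a * b` is materialized as a temporary array of
# length n before `c` is added to it.
simple = a * b + c

# Same computation with an explicit, reusable output buffer.
result = np.empty(n)
np.multiply(a, b, out=result)   # result = a * b, written into `result`
np.add(result, c, out=result)   # result = result + c, in place
```

The second form allocates one output array and no intermediate, at the price of less readable code.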

Examples

Array Creation

>>> import numpy as np
>>> x = np.array([1, 2, 3])
>>> x
array([1, 2, 3])
>>> y = np.arange(10)  # like Python's range, but returns an array
>>> y
array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

Basic Operations

>>> a = np.array([1,2,3,6])
>>> b = np.linspace(0,2,4)  # create an array of 4 evenly spaced points from 0 to 2, end-points included
>>> c = a - b
>>> c
array([ 1.        ,  1.33333333,  1.66666667,  4.        ])
>>> a**2
array([ 1,  4,  9, 36])

Universal Functions

>>> a = np.linspace(-np.pi, np.pi, 100) 
>>> b = np.sin(a)
>>> c = np.cos(a)
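
The session above prints nothing, so the following sketch repeats it as a script and adds a simple check. It relies only on the fact that universal functions act element by element and preserve the shape of their input; the variable names are chosen for this example.

```python
import numpy as np

angles = np.linspace(-np.pi, np.pi, 100)
sines = np.sin(angles)      # ufunc applied to every element
cosines = np.cos(angles)

# Ufuncs preserve shape, so the results combine elementwise.
identity = sines**2 + cosines**2
all_close_to_one = np.allclose(identity, 1.0)
```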

Linear Algebra

>>> from numpy.random import rand
>>> from numpy.linalg import solve, inv
>>> a = np.array([[1, 2, 3], [3, 4, 6.7], [5, 9.0, 5]])
>>> a.transpose()
array([[ 1. ,  3. ,  5. ],
       [ 2. ,  4. ,  9. ],
       [ 3. ,  6.7,  5. ]])
>>> inv(a)
array([[-2.27683616,  0.96045198,  0.07909605],
       [ 1.04519774, -0.56497175,  0.1299435 ],
       [ 0.39548023,  0.05649718, -0.11299435]])
>>> b = np.array([3, 2, 1])
>>> solve(a, b)  # solve the equation ax = b
array([-4.83050847,  2.13559322,  1.18644068])
>>> c = rand(3, 3) * 20  # create a 3x3 random matrix with values in [0, 1), scaled by 20
>>> c
array([[  3.98732789,   2.47702609,   4.71167924],
       [  9.24410671,   5.5240412 ,  10.6468792 ],
       [ 10.38136661,   8.44968437,  15.17639591]])
>>> np.dot(a, c)  # matrix multiplication
array([[  53.61964114,   38.8741616 ,   71.53462537],
       [ 118.4935668 ,   86.14012835,  158.40440712],
       [ 155.04043289,  104.3499231 ,  195.26228855]])

History

NumPy is based on two earlier Python array packages. The first, variously called Numeric, Numerical Python extensions, or NumPy,[2][6] is reasonably complete and stable; it remains available but is now obsolete. It was originally written in 1995[2] largely by Jim Hugunin, then a graduate student at MIT.[6]:10 When Hugunin joined CNRI to work on JPython, Paul Dubois of LLNL took over as maintainer.[6]:10 Other early contributors include David Ascher, Konrad Hinsen and Travis Oliphant.[6]:10

The other package, Numarray, was written as a more flexible replacement for Numeric.[2] It is also deprecated.[7] Numarray had faster operations for large arrays, but was slower than Numeric on small ones,[citation needed] so for a time both packages were used for different use cases. The last version of Numeric, v24.2, was released on 11 November 2005, and the last version of Numarray, v1.5.2, was released on 24 August 2006.[8]

There was a desire to get Numeric into the Python standard library, but Guido van Rossum (the author of Python) was quite clear that the code was not maintainable in its state at the time.[when?][citation needed]

In early 2005, NumPy developer Travis Oliphant wanted to unify the community around a single array package and ported Numarray's features to Numeric, releasing the result as NumPy 1.0 in 2006.[2] This new project was part of SciPy. To avoid installing the large SciPy package just to get an array object, this new package was separated and called NumPy.

While the source code is freely available and it contains significant documentation, there is also an extensive official Guide to NumPy.[9] The documentation is built around a unified docstring standard.[10]

NumPy release 1.5.1 is compatible with Python versions 2.4–2.7 and 3.1–3.2. Support for Python 3 was added in 1.5.0.[11] In 2011, PyPy started development on an implementation of the NumPy API for PyPy.[12] It is not yet fully compatible with NumPy.[13]

References

  1. "SciPy PerformancePython". Retrieved 2006-06-25.
  2. Stéfan van der Walt, S. Chris Colbert and Gaël Varoquaux (2011). "The NumPy array: a structure for efficient numerical computation". Computing in Science and Engineering (IEEE).
  3. "Blaze 0.4.1 Documentation". Blaze. Retrieved 2014-03-08.
  4. Francesc Alted. "numexpr". Retrieved 2014-03-08.
  5. "Numba". Retrieved 2014-03-08.
  6. David Ascher; Paul F. Dubois; Konrad Hinsen; Jim Hugunin; Travis Oliphant (1999). "Numerical Python".
  7. "Numarray Homepage". Retrieved 2006-06-24.
  8. "NumPy Sourceforge Files". Retrieved 2008-03-24.
  9. Oliphant, Travis E. (2006-12-07). Guide to NumPy (PDF).
  10. "NumPy Docstring Standard". Retrieved 2009-11-01.
  11. "NumPy 1.5.0 Release Notes". Retrieved 2011-04-29.
  12. "PyPy Status Blog: Numpy funding and status update". Retrieved 2011-12-22.
  13. "NumPyPy Status". Retrieved 2013-10-14.

Further reading

  • Bressert, Eli (2012). Scipy and Numpy: An Overview for Developers. O'Reilly Media. ISBN 978-1-4493-0546-8. 
