Numpy
What is NumPy?
• NumPy, SciPy, and Matplotlib provide MATLAB-like functionality in Python.
• NumPy features:
Typed multidimensional arrays (matrices)
Fast numerical computations (matrix math)
High-level math functions
NumPy documentation
• Official documentation
http://docs.scipy.org/doc/
• Example list
https://docs.scipy.org/doc/numpy/reference/routines.html
Why do we need NumPy
Let’s see for ourselves!
• Python does numerical computations slowly.
• 1000 x 1000 matrix multiply
A pure-Python triple loop takes > 10 min.
NumPy takes ~0.03 seconds.
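A minimal sketch of this comparison, assuming a much smaller matrix (n = 60, a hypothetical size chosen so the loop version finishes quickly). The helper name matmul_loops is illustrative only, and the timings will vary by machine:

```python
import time
import numpy as np

def matmul_loops(A, B):
    """Naive triple-loop matrix product on nested Python lists."""
    n, m, p = len(A), len(B), len(B[0])
    C = [[0.0] * p for _ in range(n)]
    for i in range(n):
        for j in range(p):
            total = 0.0
            for k in range(m):
                total += A[i][k] * B[k][j]
            C[i][j] = total
    return C

n = 60  # hypothetical size; the slide's figures are for n = 1000
rng = np.random.default_rng(0)
A = rng.random((n, n))
B = rng.random((n, n))

t0 = time.perf_counter()
C_loops = matmul_loops(A.tolist(), B.tolist())
t_loops = time.perf_counter() - t0

t0 = time.perf_counter()
C_numpy = A @ B  # NumPy matrix product
t_numpy = time.perf_counter() - t0

print(f"loops: {t_loops:.4f} s, NumPy: {t_numpy:.6f} s")
```

Both versions compute the same product; only the time differs.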
NumPy Overview
1. Arrays
2. Shaping and transposition
3. Mathematical Operations
4. Indexing and slicing
5. Broadcasting
Arrays
Structured lists of numbers.
• Vectors
• Matrices
• Images
• Tensors
• ConvNets
Examples: a vector $(p_x, p_y, p_z)^T$ and an $m \times n$ matrix with entries $a_{11}, \dots, a_{mn}$.
Arrays, Basic Properties
import numpy as np
a = np.array([[1,2,3],[4,5,6]],dtype=np.float32)
print(a.ndim, a.shape, a.dtype)  # 2 (2, 3) float32
Arrays, creation
• np.ones, np.zeros
• np.arange
• np.concatenate
• np.astype
• np.zeros_like, np.ones_like
• np.random.random
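A short sketch of the creation routines listed above. The variable names are illustrative, and astype is shown in its common form as an array method:

```python
import numpy as np

ones = np.ones((2, 3))                 # 2x3 array of 1.0 (float64 by default)
zeros = np.zeros((2, 3))               # 2x3 array of 0.0
steps = np.arange(0, 10, 2)            # [0 2 4 6 8]
stacked = np.concatenate([ones, zeros], axis=0)  # stack along rows: shape (4, 3)
as_int = ones.astype(np.int32)         # convert dtype (returns a new array)
like = np.zeros_like(as_int)           # same shape and dtype as as_int
noise = np.random.random((2, 3))       # uniform samples in [0, 1)

print(stacked.shape, as_int.dtype, like.dtype)
```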
Arrays, danger zone
• Must be dense, no holes.
• Must be one type
• Cannot combine arrays of incompatible shapes
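A small sketch of two of these constraints, assuming toy arrays: mixed inputs are converted to a single dtype, and adding arrays whose shapes do not match raises an error.

```python
import numpy as np

# Mixed ints and floats end up as one dtype for the whole array.
mixed = np.array([1, 2.5, 3])
print(mixed.dtype)  # float64

# Arrays with incompatible shapes cannot be combined.
a = np.ones(3)
b = np.ones(4)
try:
    a + b
    shape_error = None
except ValueError as err:
    shape_error = err
print(shape_error)
```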
Shaping
a = np.array([1,2,3,4,5,6])
a = a.reshape(3,2)
a = a.reshape(2,-1)
a = a.ravel()
1. Total number of elements cannot change.
2. Use -1 to infer axis shape
3. Row-major by default (MATLAB is column-major)
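The three notes above can be checked with a short sketch. The order="F" call is an addition here, included only to show what column-major filling looks like:

```python
import numpy as np

a = np.array([1, 2, 3, 4, 5, 6])

b = a.reshape(3, 2)    # [[1 2] [3 4] [5 6]], filled row by row
c = b.reshape(2, -1)   # -1 is inferred as 3: [[1 2 3] [4 5 6]]
d = c.ravel()          # back to 1D: [1 2 3 4 5 6]

# Column-major (MATLAB-style) filling, for comparison.
col_major = a.reshape(3, 2, order="F")  # [[1 4] [2 5] [3 6]]

print(b.shape, c.shape, d.shape)
```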
Return values
• NumPy functions return either views or copies.
• Views share data with the original array, like
references in Java/C++. Altering entries of a
view changes the same entries in the original.
• The NumPy documentation says which functions
return views and which return copies.
• np.copy (or a.copy()) makes an explicit copy; a.view() makes an explicit view.
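A minimal sketch of the difference, assuming a small 1D array. Basic slicing returns a view, and np.shares_memory reports whether two arrays overlap in memory:

```python
import numpy as np

a = np.arange(6)        # [0 1 2 3 4 5]

v = a[1:4]              # basic slicing returns a view
c = a[1:4].copy()       # explicit copy with its own data

v[0] = 100              # writes through to a
c[1] = -1               # does not touch a

print(a)                          # [  0 100   2   3   4   5]
print(np.shares_memory(a, v))     # True
print(np.shares_memory(a, c))     # False
```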
Saving and loading arrays
np.savez('data.npz', a=a)
data = np.load('data.npz')
a = data['a']
Image arrays
Images are 3D arrays: width, height, and channels
Common image formats:
height x width x RGB (band-interleaved)
RGB x height x width (band-sequential)
Gotchas:
Channels may also be BGR (OpenCV does this)
May be [width x height], not [height x width]
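A sketch of converting between these layouts, assuming a synthetic 4 x 6 image rather than a file loaded from disk:

```python
import numpy as np

# Hypothetical image: height=4, width=6, channels in RGB order.
img_hwc = np.zeros((4, 6, 3), dtype=np.uint8)
img_hwc[..., 0] = 255                    # fill the red channel

# Band-interleaved (H x W x C) to band-sequential (C x H x W).
img_chw = img_hwc.transpose(2, 0, 1)

# RGB to BGR: reverse the last (channel) axis.
img_bgr = img_hwc[..., ::-1]

print(img_hwc.shape, img_chw.shape)      # (4, 6, 3) (3, 4, 6)
print(img_bgr[0, 0])                     # [  0   0 255]
```

Printing the shape of an array right after loading is a cheap way to catch a swapped axis or channel order.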
Mathematical operators
• Arithmetic operations are element-wise
• Comparison and logical operators return a bool array
• In-place operations modify the array
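A short sketch of the three points, assuming two small float arrays:

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0])
b = np.array([10.0, 20.0, 30.0])

total = a + b             # element-wise: [11. 22. 33.]
prod = a * b              # element-wise, not a matrix product: [10. 40. 90.]

mask = a > 1.5            # bool array: [False  True  True]
both = mask & (b < 25)    # element-wise AND: [False  True False]

alias = a                 # second name for the same array
a += 1                    # in place: no new array is created
print(alias)              # [2. 3. 4.]
```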
Math, upcasting
Just as in Python and Java, the result of a math
operator is cast to the more general or precise
datatype.
uint64 + uint16 => uint64
float32 / int32 => float64 (float32 cannot hold every int32 exactly)
float32 / int16 => float32
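Promotion results can be checked directly by inspecting the dtype of the result. A small sketch, assuming array operands; note that float32 combined with int32 is promoted to float64, while float32 combined with int16 stays float32:

```python
import numpy as np

u = np.ones(2, dtype=np.uint64) + np.ones(2, dtype=np.uint16)
f = np.ones(2, dtype=np.float32) / np.ones(2, dtype=np.int32)
g = np.ones(2, dtype=np.float32) / np.ones(2, dtype=np.int16)

print(u.dtype)  # uint64
print(f.dtype)  # float64
print(g.dtype)  # float32

# np.promote_types reports the promoted dtype without doing any arithmetic.
print(np.promote_types(np.float32, np.int32))  # float64
```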
Math, universal functions
Also called ufuncs
Element-wise
Examples:
np.exp
np.sqrt
np.sin
np.cos
np.isnan
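A brief sketch of a few of these ufuncs applied to whole arrays, assuming a small input that includes a NaN:

```python
import numpy as np

x = np.array([0.0, 1.0, 4.0, np.nan])

roots = np.sqrt(x)              # [ 0.  1.  2. nan]
missing = np.isnan(x)           # [False False False  True]
exps = np.exp(np.zeros(3))      # [1. 1. 1.]

angles = np.array([0.0, np.pi / 2])
sines = np.sin(angles)          # approximately [0. 1.]
cosines = np.cos(angles)        # approximately [1. 0.]

print(roots, missing)
```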
Indexing
x[0,0] # top-left element
x[0,-1] # first row, last column
x[0,:] # first row (many entries)
x[:,0] # first column (many entries)
Notes:
Zero-indexing
Multi-dimensional indices are comma-separated (i.e., a tuple)
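The indexing expressions above, applied to a hypothetical 3 x 4 array so the results are visible:

```python
import numpy as np

x = np.arange(12).reshape(3, 4)
# [[ 0  1  2  3]
#  [ 4  5  6  7]
#  [ 8  9 10 11]]

top_left = x[0, 0]        # 0
first_row_last = x[0, -1] # 3
first_row = x[0, :]       # [0 1 2 3]
first_col = x[:, 0]       # [0 4 8]

# The comma-separated index is a tuple, so this is the same element.
same = x[(0, -1)]         # 3

print(top_left, first_row_last, first_row, first_col)
```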
NumPy – Creating vectors
• From lists, with numpy.array
>>> import numpy
>>> a = numpy.array([1,3,5,7,9])
>>> b = numpy.array([3,5,6,7,9])
>>> c = a + b
>>> print(c)
[ 4  8 11 14 18]
>>> type(c)
<class 'numpy.ndarray'>
>>> c.shape
(5,)
Numpy – Creating matrices
>>> l = [[1, 2, 3], [3, 6, 9], [2, 4, 6]] # create a list
>>> a = numpy.array(l) # convert a list to an array
>>> print(a)
[[1 2 3]
 [3 6 9]
 [2 4 6]]
>>> a.shape
(3, 3)
>>> print(a.dtype) # get type of an array
int64
# or directly as matrix
>>> M = numpy.array([[1, 2], [3, 4]])
>>> M.shape
(2, 2)
>>> M.dtype
dtype('int64')
# only one type
>>> M[0,0] = "hello"
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: invalid literal for int() with base 10: 'hello'
>>> M = numpy.array([[1, 2], [3, 4]], dtype=complex)
>>> M
array([[1.+0.j, 2.+0.j],
       [3.+0.j, 4.+0.j]])
Numpy – Matrices use
>>> print(a)
[[1 2 3]
 [3 6 9]
 [2 4 6]]
>>> print(a[0]) # this is just like a list of lists
[1 2 3]
>>> print(a[1, 2]) # arrays can be given comma-separated indices
9
>>> print(a[1, 1:3]) # and slices
[6 9]
>>> print(a[:,1])
[2 6 4]
>>> a[1, 2] = 7
>>> print(a)
[[1 2 3]
 [3 6 7]
 [2 4 6]]
>>> a[:, 0] = [0, 9, 8]
>>> print(a)
[[0 2 3]
 [9 6 7]
 [8 4 6]]
NumPy – Creating arrays
# a diagonal matrix
>>> numpy.diag([1,2,3])
array([[1, 0, 0],
       [0, 2, 0],
       [0, 0, 3]])
>>> b = numpy.zeros(5)
>>> print(b)
[0. 0. 0. 0. 0.]
>>> b.dtype
dtype('float64')
>>> n = 1000
>>> my_int_array = numpy.zeros(n, dtype=numpy.int32) # numpy.int was removed in NumPy 1.24
>>> my_int_array.dtype
dtype('int32')
>>> c = numpy.ones((3,3))
>>> c
array([[1., 1., 1.],
       [1., 1., 1.],
       [1., 1., 1.]])
Numpy – array creation and use
>>> x, y = numpy.mgrid[0:5, 0:5] # similar to meshgrid in MATLAB
>>> x
array([[0, 0, 0, 0, 0],
       [1, 1, 1, 1, 1],
       [2, 2, 2, 2, 2],
       [3, 3, 3, 3, 3],
       [4, 4, 4, 4, 4]])
# random data
>>> numpy.random.rand(5,5)
array([[ 0.51531133, 0.74085206, 0.99570623, 0.97064334, 0.5819413 ],
[ 0.2105685 , 0.86289893, 0.13404438, 0.77967281, 0.78480563],
[ 0.62687607, 0.51112285, 0.18374991, 0.2582663 , 0.58475672],
[ 0.72768256, 0.08885194, 0.69519174, 0.16049876, 0.34557215],
[ 0.93724333, 0.17407127, 0.1237831 , 0.96840203, 0.52790012]])
Numpy – ndarray attributes
• ndarray.ndim
the number of axes (dimensions) of the array i.e. the rank.
• ndarray.shape
the dimensions of the array. This is a tuple of integers indicating the size of the array in each
dimension. For a matrix with n rows and m columns, shape will be (n,m). The length of the shape
tuple is therefore the rank, or number of dimensions, ndim.
• ndarray.size
the total number of elements of the array, equal to the product of the elements of shape.
• ndarray.dtype
an object describing the type of the elements in the array. One can create or specify dtypes using
standard Python types. NumPy also provides many of its own, for example bool_, int_, int8, int16,
int32, int64, float16, float32, float64, complex64, complex128, object_.
• ndarray.itemsize
the size in bytes of each element of the array. E.g. for elements of type float64, itemsize is 8
(=64/8), while float32 has itemsize 4 (=32/8) (equivalent to ndarray.dtype.itemsize).
• ndarray.data
the buffer containing the actual elements of the array. Normally, we won't need to use this
attribute because we will access the elements in an array using indexing facilities.
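A quick sketch that reads these attributes from a hypothetical 3 x 4 float64 array:

```python
import numpy as np

a = np.zeros((3, 4), dtype=np.float64)

print(a.ndim)      # 2  (number of axes)
print(a.shape)     # (3, 4)
print(a.size)      # 12 (product of the shape entries)
print(a.dtype)     # float64
print(a.itemsize)  # 8  (bytes per element)
print(a.nbytes)    # 96 (size * itemsize)
```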
Numpy – array creation and use
ndarrays are mutable, and two names may refer to, or be views of, the same memory:
# assignment: y is the same object as x
>>> x = np.array([1,2,3,4])
>>> y = x
>>> x is y
True
>>> id(x), id(y)
(139814289111920, 139814289111920)
>>> x[0] = 9
>>> y
array([9, 2, 3, 4])
# slicing: z is a different object, but a view of the same data
>>> x[0] = 1
>>> z = x[:]
>>> x is z
False
>>> id(x), id(z)
(139814289111920, 139814289112080)
>>> x[0] = 8
>>> z
array([8, 2, 3, 4])
# copy: y has its own data
>>> x = np.array([1,2,3,4])
>>> y = x.copy()
>>> x is y
False
>>> id(x), id(y)
(139814289111920, 139814289111840)
>>> x[0] = 9
>>> x
array([9, 2, 3, 4])
>>> y
array([1, 2, 3, 4])
Numpy – array methods - sorting
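>>> arr = numpy.array([4.5, 2.3, 6.7, 1.2, 1.8, 5.5])
>>> arr.sort() # acts on array itself
>>> print(arr)
[1.2 1.8 2.3 4.5 5.5 6.7]
>>> x = numpy.array([4.5, 2.3, 6.7, 1.2, 1.8, 5.5])
>>> y = numpy.array([1.5, 2.3, 4.7, 6.2, 7.8, 8.5]) # a second array, reordered below with the same indices
>>> print(x)
[4.5 2.3 6.7 1.2 1.8 5.5]
>>> s = x.argsort() # indices that would sort x; x itself is unchanged
>>> s
array([3, 4, 1, 0, 5, 2])
>>> x[s]
array([1.2, 1.8, 2.3, 4.5, 5.5, 6.7])
>>> y[s]
array([6.2, 7.8, 2.3, 1.5, 8.5, 4.7])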
Numpy – statistics
In addition to the mean, var, and std functions, NumPy supplies several other methods
for returning statistical features of arrays. The median can be found:
>>> a = np.array([1, 4, 3, 8, 9, 2, 3], float)
>>> np.median(a)
3.0
The correlation coefficient for multiple variables observed at multiple instances can be
found for arrays of the form [[x1, x2, …], [y1, y2, …], [z1, z2, …], …] where x, y, z are
different observables and the numbers indicate the observation times:
>>> a = np.array([[1, 2, 1, 3], [5, 3, 1, 8]], float)
>>> c = np.corrcoef(a)
>>> c
array([[ 1. , 0.72870505],
[ 0.72870505, 1. ]])
Here the entry c[i,j] of the returned array gives the correlation coefficient for the ith and jth
observables. Similarly, the covariance for the data can be found:
>>> np.cov(a)
array([[ 0.91666667, 2.08333333],
[ 2.08333333, 8.91666667]])
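A short sketch relating these functions to each other, using the same data. One detail worth knowing: np.cov normalizes by N-1 by default, whereas np.var and np.std normalize by N unless ddof=1 is passed:

```python
import numpy as np

a = np.array([[1, 2, 1, 3], [5, 3, 1, 8]], float)

row_means = a.mean(axis=1)            # mean of each observable
var_pop = a.var(axis=1)               # divides by N
var_sample = a.var(axis=1, ddof=1)    # divides by N-1

cov = np.cov(a)
corr = np.corrcoef(a)

# The diagonal of the covariance matrix holds the sample variances.
print(np.allclose(np.diag(cov), var_sample))  # True

# Correlation is covariance scaled by the two standard deviations.
manual_corr = cov[0, 1] / np.sqrt(cov[0, 0] * cov[1, 1])
print(np.isclose(manual_corr, corr[0, 1]))    # True
```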
Axes
a.sum() # sum all entries
a.sum(axis=0) # sum along axis 0 (down the rows): one value per column
a.sum(axis=1) # sum along axis 1 (across the columns): one value per row
a.sum(axis=1, keepdims=True) # same sums, kept as a column
1. Use the axis parameter to control which axis NumPy operates on.
2. Typically, the axis specified will disappear; keepdims=True keeps it with length 1.
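A sketch of these calls on a hypothetical 2 x 3 array, showing which axis disappears in each case:

```python
import numpy as np

a = np.array([[1, 2, 3],
              [4, 5, 6]])

total = a.sum()                        # 21
per_column = a.sum(axis=0)             # [5 7 9], shape (3,)
per_row = a.sum(axis=1)                # [ 6 15], shape (2,)
kept = a.sum(axis=1, keepdims=True)    # [[ 6] [15]], shape (2, 1)

print(per_column.shape, per_row.shape, kept.shape)
```

Keeping the summed axis with length 1 lets the result line up with the original array, for example when dividing each row by its sum.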
Broadcasting
a = a + 1 # add one to every element
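The same mechanism extends beyond scalars. A minimal sketch, assuming a 2 x 3 array combined with a row of shape (3,) and a column of shape (2, 1):

```python
import numpy as np

a = np.array([[1, 2, 3],
              [4, 5, 6]])

plus_one = a + 1                     # scalar is applied to every element

row = np.array([10, 20, 30])         # shape (3,): added to each row
plus_row = a + row                   # [[11 22 33] [14 25 36]]

col = np.array([[100], [200]])       # shape (2, 1): added to each column
plus_col = a + col                   # [[101 102 103] [204 205 206]]

print(plus_row)
print(plus_col)
```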