Numerical-and-Scientific-Computing-in-Python-v0.1.2
Go to: scc-ondemand.bu.edu
source /net/scc2/scratch/Numpy_examples.sh
Run Spyder
http://rcs.bu.edu/examples/numpy_scipy
Alternatives to Python
Python’s strengths
“Regular” Python code is not competitive with compiled languages (C, C++,
Fortran) for numeric computing.
The solution: specialized libraries that extend Python with data structures
and algorithms for numeric computing.
Keep the good stuff, speed up the parts that are slow!
Outline
The numpy library
NumPy underlies many other numeric and algorithm libraries available for
Python, such as:
SciPy, matplotlib, pandas, OpenCV’s Python API, and more
Ndarray – the basic NumPy data type
List:
• General purpose
• Untyped
• 1 dimension
• Resizable
• Add/remove elements anywhere
• Accessed with [ ] notation and integer indices
Ndarray:
• Intended to store and process (mostly) numeric data
• Typed
• N-dimensions (chosen at creation time)
• Fixed size (chosen at creation time)
• Accessed with [ ] notation and integer indices
List Review
x = ['a','b',3.14]
x[1] --> 'b'
Indexing backwards from -1:
x[-1] --> 3.14
x[-3] --> 'a'
Slicing: x[start:end:incr]
Slicing produces a COPY of the original list!
x[0:2] --> ['a','b']
x[-1:-3:-1] --> [3.14,'b']
x[:] --> ['a','b',3.14]
Sorting:
x.sort() in-place sort
sorted(x) returns a new sorted list
Depending on list contents a sorting function might be required.
[Figure: the list x holds three pointers, each to a Python object ('a', 'b', 3.14) allocated anywhere in memory.]
[Figure: the ndarray y holds the values 1, 2, 3 in its data array.]
y[1]: check the ndarray data type, retrieve the value at offset 1 in the
data array, return 2.
https://docs.scipy.org/doc/numpy/reference/arrays.html
dtype
Every ndarray has a dtype, the type of data that it holds.
a = np.array([1,2,3])
a.dtype --> dtype('int64')
This is used to interpret the block of data stored in the ndarray.
Can be assigned at creation time:
c = np.array([-1,4,124], dtype='int8')
c.dtype --> dtype('int8')
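A minimal sketch of how the dtype controls the size and interpretation of the data block. The dtype is set explicitly here because the default integer type can differ between platforms; the float conversion at the end is an illustrative extra, not from the slides.

```python
import numpy as np

# Set the dtype explicitly; the default integer dtype is platform dependent.
a = np.array([1, 2, 3], dtype='int64')
c = np.array([-1, 4, 124], dtype='int8')

# itemsize is the number of bytes per element, nbytes the size of the data block.
print(a.dtype, a.itemsize, a.nbytes)   # int64 uses 8 bytes per element
print(c.dtype, c.itemsize, c.nbytes)   # int8 uses 1 byte per element

# astype returns a new array with the values converted to another dtype.
c_float = c.astype('float64')
print(c_float.dtype)
```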
A small amount of memory is used to store info about the ndarray (~few dozen bytes)
The numpy function array creates a new array from any data structure
with array-like behavior (other ndarrays, lists, tuples, etc.)
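As a small illustration of the array function, the sketch below builds ndarrays from a list, a tuple, nested lists, and an existing ndarray. The variable names and values are made up for this example.

```python
import numpy as np

from_list = np.array([1, 2, 3])
from_tuple = np.array((1.5, 2.5))
from_nested = np.array([[1, 2], [3, 4]])   # nested lists give a 2D array

# By default np.array copies the data when given an existing ndarray.
copy_of = np.array(from_list)
copy_of[0] = 99
print(from_list[0])        # the source array is unchanged
print(from_nested.shape)   # (rows, columns)
```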
Read the docs!
ndarray indexing is similar to Python lists, strings, tuples, etc.
Index with integers, starting from zero.
# index from 0
oneD[0] --> 1
oneD[3] --> 4
# -index starts from the end
oneD[-1] --> 4
oneD[-2] --> 3
Indexing N-dimensional arrays, just use commas:
array[i,j,k,l] = 42
twoD --> array([[1, 2],
                [3, 4]])
# For multiple dimensions use a comma
# matrix[row,column]
twoD[0,0] --> 1
twoD[1,0] --> 3
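The indexing rules above can be tried with the short sketch below. The contents of oneD are an assumption inferred from the results shown on the slide, and the 4-dimensional array arr is a hypothetical example.

```python
import numpy as np

oneD = np.array([1, 2, 3, 4])        # assumed contents, consistent with the slide
twoD = np.array([[1, 2], [3, 4]])

print(oneD[0], oneD[-1])     # first and last elements
print(twoD[1, 0])            # row 1, column 0

# A 4-dimensional array indexed with four comma-separated integers.
arr = np.zeros((2, 3, 4, 5))
arr[1, 2, 3, 4] = 42
print(arr[1, 2, 3, 4])
```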
ndarray slicing
y = np.arange(50,300,50)
# y --> array([ 50, 100, 150, 200, 250])
y[0:3] = -1
# y --> array([ -1, -1, -1, 200, 250])
y[0:8] = -1
# NO ERROR!
# y --> array([ -1, -1, -1, -1, -1])
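One difference from lists is worth a sketch: list slicing copies, while a basic slice of an ndarray is a view onto the same data. This is standard NumPy behavior rather than something stated on the slide, and the variable names are illustrative.

```python
import numpy as np

x = ['a', 'b', 3.14]
x_slice = x[0:2]        # list slicing copies
x_slice[0] = 'z'
print(x[0])             # the original list is unchanged

y = np.arange(50, 300, 50)
y_slice = y[0:3]        # basic ndarray slicing returns a view
y_slice[0] = -1
print(y[0])             # the original array has changed

y_copy = y[0:3].copy()  # request an independent copy explicitly
y_copy[1] = 0
print(y[1])             # the original array is unchanged this time
```

Use .copy() whenever a slice has to be modified without touching the original array.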
ndarray addressing with an ndarray
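A minimal sketch of addressing an ndarray with another ndarray, using either integer positions or a boolean mask. The names idx, mask, and picked are hypothetical, and the array y follows the slicing example.

```python
import numpy as np

y = np.arange(50, 300, 50)       # array([ 50, 100, 150, 200, 250])

# An integer ndarray selects the elements at those positions.
idx = np.array([0, 2, 4])
print(y[idx])

# A boolean ndarray of the same shape selects elements where it is True.
mask = y > 120
print(y[mask])

# Indexing with an ndarray returns a copy, unlike a basic slice.
picked = y[idx]
picked[0] = -1
print(y[0])                      # the original array is unchanged
```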
https://docs.scipy.org/doc/numpy/reference/routines.linalg.html
NumPy I/O
When reading files you can use standard Python: read the values into lists,
or allocate ndarrays and fill them.
Or use any of NumPy’s I/O routines that will directly generate ndarrays.
Docs: https://docs.scipy.org/doc/numpy/reference/routines.io.html
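As a sketch of the NumPy I/O routines, the example below reads delimited text with genfromtxt and does a savetxt/loadtxt round trip. In-memory buffers stand in for files, and the data is made up.

```python
import io
import numpy as np

# In-memory text standing in for a small CSV file (made-up data).
text = "x,y\n1.0,2.0\n3.0,4.5\n5.0,\n"

# genfromtxt tolerates missing values; for float data they become nan.
data = np.genfromtxt(io.StringIO(text), delimiter=',', skip_header=1)
print(data.shape)

# savetxt / loadtxt round trip through another in-memory buffer.
buf = io.StringIO()
np.savetxt(buf, data[:2], delimiter=',')
buf.seek(0)
reloaded = np.loadtxt(buf, delimiter=',')
print(reloaded)
```

With real files, pass a filename in place of the io.StringIO objects.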
Numpy docs
As numpy is a large library we can only cover the basic usage here
genfromtxt.
This demonstrates calling the function and extracting all the info it
returns.
Example: scipy.optimize.minimize for $y = 3x^2 + x - 1$
Open scipy_minimize.py
https://docs.scipy.org/doc/scipy/reference/tutorial/optimize.html
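The sketch below is not the contents of scipy_minimize.py; it is a minimal, assumed version of the same idea: minimize y = 3x^2 + x - 1 and inspect the fields of the returned result object. The starting point x0 = 0 is an arbitrary choice.

```python
import numpy as np
from scipy.optimize import minimize

def f(x):
    # minimize passes x as a 1D array, even for a single variable
    return 3.0 * x[0]**2 + x[0] - 1.0

result = minimize(f, x0=np.array([0.0]))

print(result.success)   # whether the optimizer reports convergence
print(result.x)         # location of the minimum
print(result.fun)       # function value at the minimum
print(result.nit)       # number of iterations
```

Setting the derivative 6x + 1 to zero gives x = -1/6 and y = -13/12, which the numerical result should approximate.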
OpenCV
The Open Source Computer Vision Library
Highly optimized and mature C++ library usable from C++, Java, and Python.
Cross platform: Windows, Linux, Mac OSX, iOS, Android
• Image Processing
• Image file reading and writing
• Video I/O
• High-level GUI
• Video Analysis
• Camera Calibration and 3D Reconstruction
• 2D Features Framework
• Object Detection
• Deep Neural Network module
• Machine Learning
• Clustering and Search in Multi-Dimensional Spaces
• Computational Photography
• Image stitching
OpenCV vs SciPy
The OpenCV Python API uses NumPy ndarrays, making OpenCV algorithms
compatible with SciPy and other libraries.
OpenCV vs SciPy
A simple benchmark: Gaussian and median filtering a 1024x671 pixel image
of the CAS building.
Gaussian: radius 5, median: radius 9. See: image_bench.py
Timing: 2.4 GHz Xeon E5-2680 (Sandy Bridge)

Filter    Function                       Time    Speedup
Gaussian  scipy.ndimage.gaussian_filter  85.7
Gaussian  cv2.GaussianBlur               23.2    3.7x
Median    scipy.ndimage.median_filter    1,780
Median    cv2.medianBlur                 79.2    22.5x
When NumPy and SciPy aren’t fast enough
Auto-compile your Python code with the numba and numexpr libraries
Combine your own C++ (with SWIG) or Fortran code (with f2py) and call
from Python
numba
The numba library can translate portions of your Python code and compile
it into machine code on demand.
The @jit decorator is used to indicate which functions are compiled.
Options:
• GPU code generation
• Parallelization
• Caching of compiled code
Can produce faster array code than pure NumPy statements.

from numba import jit, float64

# This will get compiled when it's first executed
@jit
def average(x, y, z):
    return (x + y + z) / 3.0

# With type information this one gets
# compiled when the file is read.
@jit(float64(float64, float64, float64))
def average_eager(x, y, z):
    return (x + y + z) / 3.0
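numexpr
Another acceleration library for Python.

import numpy as np
import numexpr as ne

a = np.arange(10)
b = np.arange(0, 20, 2)
# Illustrative expression (an assumption, not from the slide):
# evaluate takes the expression as a string and compiles it.
c = ne.evaluate("2*a + 3*b")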
Intel now releases a customized build of Python 2.7 and 3.6 based on
their optimized libraries.
In RCS testing on various projects the Intel Python build is always at least
as fast as the regular Python and Anaconda modules on the SCC.
In one case involving processing several GB of XML it was 20x faster!
The Intel Threading Building Blocks library can be used to improve multithreaded
Python programs:
This can make mixing Python, Cython, and C code (or libraries) very
straightforward.