Einar Smith

Introduction to the Tools of Scientific Computing

Second Edition

Editorial Board
T. J. Barth
M. Griebel
D. E. Keyes
R. M. Nieminen
D. Roose
T. Schlick
Texts in Computational Science
and Engineering
Volume 25
Series Editors
Timothy J. Barth, NASA Ames Research Center, National Aeronautics Space
Division, Moffett Field, CA, USA
Michael Griebel, Institut für Numerische Simulation, Universität Bonn, Bonn,
Germany
David E. Keyes, 200 S.W. Mudd Bldg., Apt 5, New York, NY, USA
Risto M. Nieminen, School of Science & Technology, Aalto University, Aalto,
Finland
Dirk Roose, Department of Computer Science, Katholieke Universiteit Leuven,
Leuven, Belgium
Tamar Schlick, Department of Chemistry, Courant Institute of Mathematical
Sciences, New York University, New York, NY, USA
This series contains graduate and undergraduate textbooks on topics described by
the term “computational science and engineering”. This includes theoretical aspects
of scientific computing such as mathematical modeling, optimization methods,
discretization techniques, multiscale approaches, fast solution algorithms, parallelization, and visualization methods as well as the application of these approaches
throughout the disciplines of biology, chemistry, physics, engineering, earth sciences,
and economics.
This Springer imprint is published by the registered company Springer Nature Switzerland AG
The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland
Preface
The core material is essentially the same as in the first edition, but thoroughly revised
and updated to the current versions of the programming environments. Based on
reader comments and suggestions, several cross-references have been included to
facilitate comparison between different programming approaches.
In the area of computer algebra systems, we have added a chapter on Mathematica.
Here, the presentation is essentially based on that of the Maple chapter and can thus
help the reader decide which system to use.
Completely new is the chapter on scientific machine learning, a discipline that is
currently in a rather experimental stage, but shows emerging potential. In any case,
the discussion can help to take a fresh look at the concept of algorithms in general.
The book is based on material from courses held by the author in the Department of
Mathematics at the University of Bonn. Originally primarily intended for students of
mathematics – at both bachelor and master level – the courses also attracted participants from other fields, including computer science, physics and geology.
The book is primarily aimed at students of mathematics and disciplines in which
mathematical methods play an important role. A certain level of mathematical maturity is recommended. Technically, however, only very basic ideas from linear algebra
and analysis are assumed, so that the book can also be read by anyone with a solid
high-school education in mathematics who wants to understand how mathematical
algorithms can be performed by digital computers. Programming experience is not
required.
The book is written in such a way that it can also serve as a text for private self-study.
With the exception of a few advanced examples in the Matlab and Maple chapters,
you can run all programs directly on your home computer, based on free open source
programming environments.
The book can therefore also serve as a refresher and help deepen the understanding of basic numerical algorithms.
Acknowledgments
The author wishes to thank Michael Griebel and Marc Alexander Schweitzer from the
Institute for Numerical Simulation at the University of Bonn for the opportunity to
hold the programming courses and for their help in contacting Springer Verlag.
I would like to thank the course participants for their lively collaboration and critical comments, which have helped to transform the loose lecture notes into a comprehensive presentation. In particular, I would like to thank Angelina Steffens for proofreading the manuscript.
Also very useful was the correspondence with Chris Rackauckas, the author of the Julia differential equation package in Chapter 8, and with Lisandro Dalcin, the author of the Python MPI implementation in Chapter 12.
I am also grateful for helpful suggestions from the anonymous referees.
My special thanks go to Martin Peters, Ruth Allewelt and Leonie Kunz from Springer-Verlag for their support and encouragement while preparing the book.
1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
Part I Background
8 Julia . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 157
8.1 Basics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 157
8.2 Control Structures: Branching, Loops . . . . . . . . . . . . . . . . . . . . . . . . . . . 160
8.3 Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 163
8.4 Collection Types . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 166
8.5 Composite Types . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 171
8.6 Linear Algebra . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 174
8.7 Ordinary Differential Equations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 179
8.8 Partial Differential Equations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 184
8.9 Working with Files . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 187
Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 189
9 Matlab . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 193
9.1 Basics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 193
9.2 Vectors and Matrices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 195
9.3 Control Structures: Branching, Loops . . . . . . . . . . . . . . . . . . . . . . . . . . . 199
9.4 Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 201
9.5 M-Files . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 204
9.6 Linear Algebra . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 207
9.7 Ordinary Differential Equations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 209
9.8 Partial Differential Equations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 213
Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 216
10 Maple . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 219
10.1 Basics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 219
10.2 Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 221
10.3 Linear Algebra . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 224
10.4 Calculus . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 230
10.5 Interpolation with Spline Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . 232
10.6 Differential Equations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 235
10.7 Galerkin Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 238
10.8 Finite Element Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 239
Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 242
11 Mathematica . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 245
11.1 Basics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 245
11.2 Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 249
11.3 Nonlinear Equations, Optimization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 253
11.4 Linear Algebra . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 255
11.5 Calculus . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 258
11.6 Interpolation and Piecewise Functions . . . . . . . . . . . . . . . . . . . . . . . . . . 259
11.7 Differential Equations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 262
11.8 Galerkin Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 264
11.9 Finite Element Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 265
Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 267
Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 399
Supplement I. Logical Operators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 402
Supplement II. Character Recognition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 406
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 411
Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 413
Subjects and Names . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 413
Python . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 416
C/C++ . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 420
Julia . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 422
Matlab . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 424
Maple . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 426
Mathematica . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 427
Chapter 1
Introduction
In public discussion, the opinion is often heard that in our “digital age” computer programming should be considered a basic cultural technique, comparable in importance to reading and writing. In the field of modern mathematics, at least, there can be little doubt that this is indeed the case.
Background
Many of the mathematical computation methods known for centuries, such as Gaussian elimination for solving systems of linear equations, receive their full power, required for instance in meteorology, only in implementations on high-performance computers. Here, systems of linear equations with hundreds of thousands or even millions of unknowns must be solved in the shortest possible time. Likewise, numerical fluid mechanics requires the solution of increasingly complex partial differential equations, which in turn is only possible with the help of powerful computers.
But even in pure mathematics the use of modern computers is becoming increasingly
indispensable. The four color conjecture, first formulated in the middle of the nineteenth century, according to which four colors suffice to color any map in the Euclidean plane such that no two countries sharing a common boundary have the same color, was only proved in 1976 with the help of digital computers.
A more recent finding concerns an even older problem, the so-called Kepler conjecture about sphere packing in three-dimensional Euclidean space, formulated at the beginning of the seventeenth century. A proof based on complex computer calculations was announced in 1998. This proof was then systematically checked with the strong support of computer-assisted theorem provers, and finally accepted by the journal Forum of Mathematics in 2017.
In many cases where computers are actually used for mathematical computations, an
inadequate understanding of how real numbers are processed has led to erroneous
theories. The most famous example is arguably the so-called butterfly effect predicted
in 1972: “Does the flap of a butterfly’s wings in Brazil set off a tornado in Texas?”
With regard to the fundamentals of programming, it was also found that a large number of computational methods, although theoretically correct, are not practicable for practically relevant problem sizes. This is the case, for instance, for Cramer's rule for the solution of a system of linear equations, or for the problem of deciding whether an arbitrary formula of propositional logic is satisfiable.
Within the theoretical foundations of programming and the related field of mathematical logic, it could even be proved that certain problems cannot be solved algorithmically at all. This refutes Leibniz’ “Calculemus” as well as Hilbert’s “In der Mathematik gibt es kein Ignorabimus” (In mathematics there is no ignorabimus).
Programming Languages
The C Language
A language that is also widely used in mathematical contexts is the C language. Its strength lies mainly in system programming, so the language constructs remain closely machine-oriented. This enables highly efficient programs, but with the disadvantage that the language often appears cryptic to beginners.
The user is forced to deal with machine-internal idiosyncrasies even in apparently straightforward problems. In particular, questions of memory allocation and corresponding pointer arithmetic are omnipresent. For instance, the memory space needed to store an 𝑛 × 𝑚 matrix 𝐴 is requested as follows:
It is obvious that the concern with such machine-oriented structures distracts from
the real algorithmic problem in question.
At the other extreme are powerful, comprehensive program environments that offer
a wide range of data structures and algorithms in a ready-made and easy-to-use form.
For instance, in Matlab the linear equation system 𝐴𝑥 = 𝑏 is solved through
x = A \ b
The fact that the operator ‘\’ actually performs Gaussian elimination embedded in an LU decomposition is not visible, and this knowledge is perhaps also not necessary for the user in practical work.
For a deeper understanding of mathematical programming it nonetheless appears
advisable to start at a more fundamental level and learn the craft of translating algorithmic approaches into executable programs from scratch.
Part I Background
Chapter 2 gives a brief introduction to the theory of computability and the principles of digital computers. This includes the presentation of a machine model closely related to real computers, and techniques to successively build up a programming language that resembles basic Python. Other topics include Church’s thesis and the notion of Turing completeness. With reference to “real” computers we explain the basic ideas and pitfalls of the IEEE 754 standard for the representation of floating-point numbers, a topic we return to later in the Python chapter.
Part II is the central part of the book. Here we develop various methods to transfer
mathematical algorithms to the computer, using typical programming languages and
environments.
Chapter 3 provides a systematic introduction to Python. The basic language is built
up, explaining variables, control structures, and collection types, in particular lists. The
function concept is introduced for integer and real valued functions. We emphasize
that a function can also be the return value of another function. On a technical level,
string formatting and file management are explained. As a more advanced paradigm,
we discuss object-oriented programming, based on class constructions, including inheritance.
Chapter 4 introduces the Python extensions Numerical Python, Scientific Python, and
Matplotlib for graphical visualization. Mathematical topics include matrix algebra,
both dense and sparse varieties, optimization methods, nonlinear equations, and ordinary differential equations. For partial differential equations, we develop a program for solving Poisson equations using the finite difference method. We conclude with a Monte Carlo method for computing 𝜋.
In Chap. 5 we illustrate basic concepts in computer algebra with the Symbolic Python
package SymPy. Topics include symbolic linear algebra and exact solutions for ordinary differential equations. At an advanced level we discuss the Galerkin method for
the discretization of continuous operator problems, again illustrated for differential
equations.
Chapter 6 develops the basics of the C language. In particular, we consider arrays and
basic pointer arithmetic. We discuss C structures and their relationship to Python
classes and show how pointers and structures can be combined to construct the
sparse matrices encountered in Scientific Python.
Chapter 7 is concerned with C++ and its relationship to C. Specifically, we discuss the
C++ data type vector and use it to greatly simplify the construction of sparse matrices. We pick up object-oriented programming and class constructions from Python,
and in particular explain the approach of splitting class definitions into header and
implementation files.
Chapter 8 introduces the Julia language that combines the ease of use of Python with
the efficiency of C. We develop the basics by reformulating and expanding ideas from Python, but also from C, in particular the concept of compound types implemented by structures. We then discuss methods from linear algebra and differential
equations. We show how one of Julia’s main features, multiple-dispatch methods in
functions, can be used to emulate class constructions in object-oriented languages.
Part III shows how the ideas developed in Part II are implemented and worked out
in typical commercial programming environments.
In Chap. 9, we discuss Matlab, a highly integrated programming environment for
many areas of scientific computing. We take up various topics from SciPy and Julia,
in particular matrix algebra and ordinary differential equations. Then we reformulate the solution programs for partial differential equations, taking advantage of the Matlab matrix gallery. We conclude with a brief introduction to a prefabricated graphical toolbox to support the solution of partial differential equations.
Chapter 10 is dedicated to Maple, a commercial computer algebra system for evaluating symbolic expressions. We compare Maple with Symbolic Python and summarize
typical examples. We take up the Galerkin method from Chap. 5 and expand it to a
tool for solving ordinary differential equations with the finite element method.
In Chap. 11 we introduce Mathematica, a computer algebra system often considered as an alternative to Maple. For comparison, we transform several examples from SymPy and Maple to Mathematica. We also examine how the built-in solvers for differential equations can handle partial differential equations.
In Chap. 13, the same algorithms are revised and extended in the more widely used
MPI version for C/C++. We then explain the shared memory approach based on the
C/C++ OpenMP module. As typical examples we reformulate integral approximations and the vector dot product. We show how function evaluation can be spread
across multiple threads. We then show how message passing and shared memory
approaches can be combined. Examples include an integral approximation for the
calculation of 𝜋 and a hybrid variant of the conjugate gradient method.
Chapter 14 introduces the proprietary approach to distributed processing in Julia,
revisiting various examples from Python and C/C++.
Part V presents two specialized software frameworks embedded in the Python environment.
In Chap. 15, we discuss FEniCS, a library for the automated solution of differential
equations using the finite element method. The development takes up earlier discussions from SymPy, Maple and Mathematica. We explain the different techniques needed to incorporate Dirichlet and Neumann boundary conditions. As a more advanced topic we discuss the Stokes equation, which includes both a vector-valued
and a scalar-valued function. On a technical level, we illustrate different approaches
to mesh generation in discrete approximations.
In Chap. 16, we discuss how PyTorch, an open-source machine learning package with
a particular focus on backpropagation algorithms in artificial neural networks, can be
applied to scientific computing. Topics include functional approximation, optimization, nonlinear and linear equations, elementary analysis and, at an advanced level,
ordinary and partial differential equations. Finally, for the sake of completeness, we
present a typical example of machine learning in artificial intelligence: the recognition
of handwritten characters.
Part I
Background
Chapter 2
Mathematical Foundations of Programming
The use of modern digital computers is indispensable for large numerical tasks. In
weather forecasting, for example, systems of linear equations with thousands or even
millions of unknowns must be solved quickly. Numerical fluid mechanics is based on the efficient calculation of increasingly complex partial differential equations.
The delegation of mathematical procedures to the machine requires suitable forms
of communication, in particular programming languages and problem-specific program packages.
As mentioned earlier, there are conflicting requirements for programming languages. On the one hand, they should enable human users to formulate algorithms in an intuitively understandable manner. On the other hand, the language should help generate efficient machine-executable code. In order to develop sensible compromises, a basic knowledge of the background of modern computers can certainly be helpful.
Example 2.2 We show how the machine performs multiplication of two numbers 𝑥
and 𝑦, stored in the registers R1 and R2.
Here is the program:
1: if R1 = 0 then stop
2: R3 <- R3 + R2
3: R1 <- R1 - 1
4: goto 1
Through the command ‘res <- 0’ the program itself takes care of initializing the result register with the value 0. The target of the jump command is now also marked by a symbolic label lbl. It is left to the machine to determine the next command in the program-execution sequence. Note that line numbers are now no longer part of the program; they only serve to help the reader refer to program components.
First, the multiplicands 𝑥 and 𝑦 are loaded into the registers arg1 and arg2. The
product is computed in line 3 and the result 𝑧 returned in line 4.
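The listings belonging to this example did not survive in this copy. A plausible reconstruction of the core multiplication routine, chosen to match the line references in the next paragraph (our sketch, not the book's original), is:

1      res <- 0
2 lbl: if arg1 = 0 then stop
3      res <- res + arg2
4      arg1 <- arg1 - 1
5      goto lbl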
Of particular importance is the introduction of loops, a control structure commonly employed in real programming languages. In Example 2.3 we note that the commands in lines 3 and 4 are executed repeatedly as long as the value of arg1 is greater than 0. This can be expressed in the form
1 while arg1 > 0 do
2     res <- res + arg2
3     arg1 <- arg1 - 1
4 end while
The behavior of the program is thus that the inner block consisting of lines 2 and 3
is executed as long as the loop condition (here arg1 > 0) is satisfied. Otherwise,
the next instruction after the loop end (line 4) is executed. The instruction in line 3
guarantees that the value of arg1 eventually reaches 0.
Example 2.4 We illustrate the development so far with a somewhat larger example.
We follow the inventor of the game of chess and the payment of his reward. Allegedly,
the inventor of chess was granted a free wish by the king. His wish now was that the
chessboard was to be filled with rice grains. More precisely as follows: One grain
should be put on the first field, two on the second, four on the third, etc. (i.e. on the
next field always twice as many grains as on the previous one). The king, astonished
by this apparently modest wish, promised to grant it.
The following program computes the sum:
1 s <- 0; n <- 1; fv <- 1
2 while n < 65 do
3     s <- s + fv
4     n <- n + 1
5     fv <- 2 * fv
6 end while
In line 1, the initial values for the sum s, the field counter n, and the field content
fv are set. The semicolon serves to separate the instructions.
In the while loop, the number of grains in each field is added to the sum. Lines 4
and 5 compute the number for the next field. When the loop is finished, the variable s
contains the number 18446744073709551615 of requested grains.
By expanding the macros, the complete “high-level” program can be translated
back into machine instructions.
Church’s Thesis
Now we can ask how powerful our apparently primitive computation model really is.
It is rather straightforward to show that, for instance, the factorial function 𝑓(𝑛) = 𝑛!, or the function 𝑝(𝑛) that returns the 𝑛-th prime number, i.e. 𝑝(1) = 2, 𝑝(2) = 3, 𝑝(3) = 5 etc., are computable. The details can again be found in [12].
In fact, it is possible to formulate the following variant of the so-called Church’s
thesis that the American mathematician Alonzo Church stated for a related model in
1936:
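The displayed statement of the thesis itself is missing in this copy. In the variant meant here it says, roughly (our paraphrase of the standard formulation): every function that is computable at all in the intuitive, mechanical sense is computable by our register machine.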
Church’s thesis is of course not a theorem in the mathematical sense, since proving agreement between a formal and an intuitive concept is obviously impossible. The thesis is perhaps rather comparable to findings in the natural sciences, which are also accepted only on the basis of a large number of empirical observations.
Namely, it has been discovered that a vast number of computation models all lead to the same class of computable functions. Moreover, not a single algorithmic method is known that leads to a larger class. The thesis is today accepted by most mathematicians and computer scientists.
In order to prove that a computation model satisfies the conditions of Church’s thesis,
it has to be established as Turing complete, i.e. it has to be shown that the behavior of
the so-called universal Turing machine can be emulated.
The Turing machine is a reference model for computability, in the sense that everything that is mechanically computable at all is already computable with such a machine. If it can be verified that an automaton model can emulate a Turing machine, it inherits, so to speak, the complete computability power.
A Turing machine consists of a control unit, which is always in one of finitely many
states, and an unbounded computation tape, divided into discrete fields. Each field
can be empty or inscribed with a marker symbol, say ‘I’. In each step exactly one
field, the work field, is processed by a read-write head. The behavior of the machine
is determined by an internal program. Depending on the state of the control unit
and the inscription of the work field, the machine can change the symbol in the field,
move the head to an adjacent field, and pass to a next internal state, see Fig. 2.1.
Fig. 2.1 In state 𝑞, the Turing machine reads the symbol 𝑎 in the present work field (left), replaces it by 𝑎′, moves the head one field to the right and shifts to the next state 𝑞′ (right)
With some formal effort it is possible to show that our register-machine can in fact
emulate the behavior of such a Turing machine. The interested reader is again referred
to the book mentioned above.
Limits of Computability
Of course, the question arises why it should be necessary at all to contemplate the abstract theory of computability. In fact, it was long assumed that every mathematical problem could eventually be solved; the search for limits of computability would hence be senseless. As mentioned in the introduction, Gottfried Wilhelm
Leibniz is said to have exclaimed “Calculemus, let us compute!”, indicating that every
question of logical nature would eventually be solvable by mechanical procedures.
As late as in the year 1900, David Hilbert formulated this assumption as follows: “This
conviction of the solubility of any mathematical problem is a powerful incentive for
us to work; we have in us the constant call: There is the problem, seek the solution.
You can find it by pure thinking; for there is no Ignorabimus in mathematics”.
This hope was however destroyed in the following years by Kurt Gödel and Alan Turing. For the programming of computing machines, the “unsolvability of the halting problem” is of fundamental importance: the finding that there can be no algorithm that checks arbitrary computer programs to determine whether they terminate after a finite number of steps or not. Anyone who has accidentally written a non-terminating while loop would be grateful for such a control procedure.
Repercussions in Mathematics
The unsolvability results in computability theory and mathematical logic have repercussions also in the realm of classical mathematics. Based on the work of Turing and Gödel, the unsolvability of the famous Hilbert’s tenth problem could be established in 1970.
Hilbert’s tenth problem, raised in a list of centenary mathematical problems in
1900, is the challenge of providing a general algorithm that, for any given Diophantine equation (a polynomial equation with integer coefficients and a finite number of unknowns), can decide whether the equation has a solution in which all unknowns
take integer values.
The register machine can only deal with natural numbers. We briefly outline how our
model could be modified to handle extended number ranges.
For example, for the representation of integers from ℤ we could exploit the bijection 𝜑 ∶ ℤ → ℕ given by 𝜑(𝑧) = 2𝑧 for 𝑧 ≥ 0 and 𝜑(𝑧) = −2𝑧 − 1 for 𝑧 < 0. Then, of course, the arithmetic operations must also be adapted accordingly.
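A quick illustration in Python (our own side example, anticipating Chap. 3, not part of the original text) of this encoding and its inverse:

def encode(z): # ℤ -> ℕ
    return 2*z if z >= 0 else -2*z - 1

def decode(n): # ℕ -> ℤ, the inverse of encode
    return n // 2 if n % 2 == 0 else -(n + 1) // 2

print ([encode(z) for z in [-2, -1, 0, 1, 2]]) # Out: [3, 1, 0, 2, 4]
print (all (decode(encode(z)) == z for z in range (-100, 100))) # Out: True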
Extending further, one could then represent rational numbers by pairs of integers.
Such an approach is in principle no longer possible for the real numbers ℝ, be-
cause of their uncountability. Here we must be content with finite approximations.
We do not pursue this further for the abstract model, but discuss the matter in the
context of actual digital computers in the following.
In fact, our model is not that far from how real machines work. A digital computer also consists of an arithmetic logic unit, a control unit, and primary storage, i.e. storage cells in which data as well as program code are kept for immediate access.
In addition, there is also an input/output management, which organizes data input
and output, to the user or to other interfaces.
Modern digital computers are generally binary. This has no metaphysical reason; it is simply that the components are composed of switching elements which only know the states ‘on’ and ‘off ’. From the origin of the word “digital” in the Latin “digitus” for finger, one would rather expect a construction based on the decimal system.
In memory, data is kept in the form of so-called binary words. A memory cell can
today usually store a sequence of 0s and 1s (“bit sequences”) of length 32 or 64.
Number Representation
Natural Numbers
If we interpret the contents of a 64-bit cell as a natural number in binary form, the maximum number that can be represented is 2⁶⁴ − 1 (that is, exactly the number of summed-up rice grains in Example 2.4. Why?), for a 32-bit cell correspondingly 2³² − 1.
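This can be checked directly in Python (our own side calculation, using the arbitrarily large int type introduced in Chap. 3):

print (2**64 - 1) # Out: 18446744073709551615
print (sum (2**k for k in range (64)) == 2**64 - 1) # Out: True, the rice grains of Example 2.4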
In this sense, our register machine model is obviously an idealization. In real computers, the processing of arbitrarily large numbers cannot be an elementary operation, but has to be implemented in program code.
Integer Numbers
To include negative numbers, the representable range is usually shifted, in the case of 64-bit cells to the range of −2⁶³ to 2⁶³ − 1, analogously for 32-bit cells to the range of −2³¹ to 2³¹ − 1.
Remark 2.5 Computationally, this can be established as follows: For the binary representation of a number 𝑛 ∈ ℕ one needs ⌈log₂(𝑛)⌉ bits; for a 7-digit decimal number up to ⌈log₂(10⁷ − 1)⌉ = 24, while the 8-digit number 10⁸ − 1 already requires ⌈log₂(10⁸ − 1)⌉ = 27 bits. With 53 bits, only decimal numbers with up to 15 digits can be represented.
Another source of rounding errors is that most decimal fractions cannot be represented exactly by binary numbers at all! For example, the decimal number 0.1 corresponds to the infinite periodic binary number 0.0001100110011…, with the block 0011 repeating, in normalized representation 1.10011001… ⋅ 2⁻⁴.
The rounding errors are then amplified within calculations. In fact, the finite precision of floating-point numbers is one of the fundamental problems of numerical mathematics. It affects every computation process based on real numbers.
Computation in a real digital computer relies on instructions that are actually as simple as we assumed in our abstract model. Often it appears that the processor provides more complex instructions, but these are then usually programmed internally in the processor, which means that they are ultimately again reduced to addition and subtraction.
The reason to use a higher-level programming language is that it permits formulating problem-oriented algorithmic procedures in a manner more understandable to humans. Programs written in a higher-level language can then automatically be translated into machine-executable code.
Example 2.6 As an example we reconsider our program for calculating the rice
grains in Example 2.4. Here the translation work to be performed is: replace the control structure of the while loop with jump commands, and convert the multiplication in line 5 into a sequence of elementary commands.
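A sketch of what the translated program might look like (our reconstruction, not a listing from the book, assuming a conditional jump that tests for equality); note that the doubling 2 * fv can be expressed by the addition fv + fv:

1: s <- 0
2: n <- 1
3: fv <- 1
4: if n = 65 then stop
5: s <- s + fv
6: n <- n + 1
7: fv <- fv + fv
8: goto 4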
There are two main approaches to translating higher-level programs into machine code. In a compiler, the entire program is translated into an independent machine program, which can then be executed directly without further reference to the higher-level source.
In an interpreter, the program is translated line by line and each line is then executed
immediately. This also means that the program cannot be run on its own but only in
the dedicated interpreter environment.
The advantage of the interpreter approach is that programs can run immediately
without the need for a time-consuming translation process. The disadvantage is that
interpreted programs tend to run slower.
The latter can be explained by the while loop in Example 2.4. In an interpreter-based version, the multiplication macro has to be translated each time it is called, while a compiler performs the full translation only once.
Part II
Core Languages
Chapter 3
Python, the Fundamentals
A First Example
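The introductory listing itself is missing in this copy; it is presumably the classic one-line program:

print ("Hello World")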
Remark Note that the print command is presented in bold. Throughout this intro-
ductory chapter we follow the convention to mark Python keywords, i.e. words that
belong to the core Python language, in this way.
Interactive Mode
Especially for learning or testing small programs, Python offers the convenience of interactive programming. The user enters the input at the command prompt, which in Python usually consists of three ‘greater-than’ characters ‘>>>’, or, as in Spyder, has the form ‘In[n]:’, where n is a counter for the number of inputs within a session.
After pressing the Enter key, the result is returned on the next line, in Spyder preceded
by ‘Out[n]:’. Here we simply use ‘Out:’.
In interactive mode (and only here!) the print command can be omitted:
1 >>> "Hello World" # input on the command line
2 Out: 'Hello World' # response of the interpreter
3 >>> print ("Hello World")
4 Out: Hello World
A subtle difference can still be seen. In line 2, the input is simply mirrored, while the
print command removes the quotation marks that served only to specify the input
argument as a string.
Another useful observation is that the last response of the interpreter can be recalled through input of the special symbol ‘_’. In line 3 above, we could equivalently write print(_). The interpreter then replaces the underline symbol ‘_’ with 'Hello World' from line 2.
For example, the underline symbol is useful in a sequence of calculations such as
the following:
>>> 1 + 1 # Out: 2
>>> 3 * _ # Out: 6
Remark Note that we have written the outputs as comments on the same line as the
input, instead of on a new line. To save vertical printing space we often follow this
convention.
One last tip: In most editors, the last entries can be called up with the up/down arrow
keys.
Of prominent importance in scientific computing are the numerical data types int
and float for the representation of whole and real number values.
Integer Numbers
For integer numbers from ℤ, Python provides the data type int. It can represent arbitrarily large values:
1 >>> 2**100
2 Out: 1267650600228229401496703205376
The operator ‘**’ in line 1 is the Python notation for the power function. Here the number 2¹⁰⁰ is to be calculated. In line 2 the exact value is returned.
Small numbers (in absolute value) are processed within the arithmetical framework provided by the processor. Numbers that exceed that size are then processed by Python itself. In principle, this is significantly slower, but in practice barely noticeable on modern machines.
The limit is usually at 2⁶³ − 1. It can be queried by
1 >>> import sys
2 >>> sys.maxsize
3 Out: 9223372036854775807
The program package sys is loaded in line 1, then the maximum size of numbers
that can be represented in the processor’s own arithmetic is requested in line 2 and
returned in line 3.
More details regarding the package sys can again be found on the site python.org.
We’ll discuss the inclusion and use of additional program packages in more detail
later.
The Python data type float for the representation of real numbers is normally based
on the 64-bit precision of the IEEE 754 standard. In the following we will simply speak
of floats. The float numbers can be represented directly in decimal-point form, but
also in scientific notation such as 1.3e2 or 1.3E2 for 130 or 1e-6 for 0.000001.
As mentioned earlier, the finite representation of real numbers is the fundamental
problem in numerical mathematics, regardless of how many bits are used. We’ll come
back to that later. Here we give just a simple example that confirms our theoretical
observations on representation precision in the last chapter:
1 >>> print (f"{0.1 :.17f}")
2 Out: 0.10000000000000001
The print command in line 1 causes the float value of 0.1, stored internally as a
binary number of length 64, to be returned with a precision of 17 decimal digits.
Here (and throughout this chapter) we use so-called f-strings (a character string
preceded by the letter ‘f’ for ‘formatted’) to generate formatted output. For technical
details we refer to Sect. 3.7.
Remark For the interested reader: The number of representable decimal digits can
be obtained through
>>> import sys
>>> sys.float_info.dig # Out: 15
Observe that the number 15 confirms our estimation in Remark 2.5 in the last chap-
ter.
Arithmetical Operations
Python knows the usual arithmetic operators +, *, ‐, /, and some more, like the power
operator ** used above. A complete list can be found at python.org.
In “mixed expressions”, where at least one float occurs, the result is also a float:
>>> 5 * (3 + 4.0) # Out: 35.0
The remainder of the integer division is computed with the modulo operator ‘%’:
>>> 5 % 2 # remainder in integer division
Out: 1
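The example discussed in the following is missing here; it is presumably the classic comparison (our reconstruction):

>>> 0.1 + 0.2 == 0.3 # Out: False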
In Python the symbol ‘==’ denotes the equality relation (more about this later). The two sides are tested for equality, and the result, surprising at first sight, is returned.
Let’s take a closer look at what happens when we print the left and right-hand side of
the equation with a 17-digit precision, again generating the output with f-strings:
1 >>> print (f"{0.1 + 0.2 :.17f}")
2 Out: 0.30000000000000004
3 >>> print (f"{0.3 :.17f}")
4 Out: 0.29999999999999999
The reason for the different results is that in line 2 the values of 0.1 and 0.2 are first rounded independently and then added. In contrast, the rounded value of 0.3 is displayed directly in line 4.
Example 3.1 Here is a simple example to illustrate how errors can assume astonishing orders of magnitude during a computation. We use some Python constructions that will be explained later:
from math import sqrt
n = 2; x = 2
for _ in range (n): x = sqrt(x)
for _ in range (n): x = x**2
print (x)
The program computes the sequence 𝑎₀ = 2, 𝑎ᵢ₊₁ = √𝑎ᵢ for 𝑖 = 0, … , 𝑛 − 1, then the reverse sequence 𝑏ₙ = 𝑎ₙ, 𝑏ₙ₋₁ = 𝑏ₙ², … , 𝑏₀ = 𝑏₁². Mathematically it is obvious that 𝑏₀ = 2. For 𝑛 = 2 the program execution still seems close to the correct computation. For 𝑛 = 51 the result 1.65 is already significantly off, and a further extension of the sequence to length 𝑛 = 52 concludes with 𝑏₀ = 1.0.
The example can also be reproduced on any standard pocket calculator, where however the critical value for 𝑛 may be different.
Python has two more built-in basic data types to represent numbers.
Complex Numbers
In Python, you can put ‘j’ or ‘J’ after a number to make it imaginary, so you can write
complex literals easily:
>>> (1+2j)*(3-4j) # Out: (11+2j)
The ‘j’ suffix comes from electrical engineering, where the variable ‘i’ is usually used
for current.
The type of a complex number is complex. The type designator can also be used
as a constructor:
>>> complex (2, 3) # Out: (2+3j)
The type bool contains only two values True and False. As in many other program-
ing languages, True and False are just other names for the two integers 1 and 0:
>>> 1 == True # Out: True
Boolean expressions can be composed with the Boolean operators and, or, not.
Number comparisons are denoted by ‘<’ for ‘less than’, ‘<=’ for ‘less than or equal’.
Comparisons can also be chained, for instance:
>>> 1 < 2 < 3 # Out: True
>>> 1 < 2 >= 3 # Out: False
However, observe the notation ‘==’ for ‘is equal to’ and ‘!=’ for ‘is not equal to’. We’ll
come back to that later.
So far we can only evaluate arithmetic expressions and return the result immediately.
However, as we saw in our abstract model, storing values is of paramount importance.
The following program computes the sum 1 + 2 + ⋯ + 100 according to the Gauss
sum formula:
>>> n = 100 # input
>>> sm = n*(n + 1) // 2 # integer division
>>> sm # Out: 5050
A variable may appear on both sides of an assignment. The basic rule above applies
also in this case:
>>> a = 1; a = a + 2 # two commands in a line, separated by ";"
Out: 3
Note that the input consists of two commands. This is perfectly possible. The com-
mands, however, must then be separated by a semicolon.
The semicolon also serves another purpose, namely to suppress the output of an expression:
1 >>> 1 + 1 # Out: 2
2 >>> 1 + 2; # no output
Program commands are executed sequentially, unless control flow is changed by loops
or branching commands.
Loops
We’ll first look at loops, of which there are two main types in Python, while and for
loops.
While Loops
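The example listing discussed below is missing from this copy; a plausible reconstruction, consistent with the line references in the following text:

1 >>> index = 1
2 >>> while index < 3:
3 ...     print (f"{index} times 5 is {5*index}")
4 ...     index = index + 1
5 ...
6 Out: 1 times 5 is 5
7 Out: 2 times 5 is 10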
The loop is initiated by the loop head in line 2, terminated by the colon symbol ‘:’. In the following lines the ellipsis ‘...’ indicates that a block of code is expected after the loop head. The loop body consists of the equally indented lines 3 and 4. Hitting the enter key twice terminates the block.
During program execution the loop body is repeated as long as the index variable
contains a value less than 3. The value is increased by 1 in line 4, such that the exit
condition is eventually reached. Then the command that follows the indented block
is executed (that is, if there is one).
In line 3, note that f-strings can contain expressions, here index and 5*index,
which in lines 6 and 7 are then replaced by their values. Again we refer to Sect. 3.7
for details.
For Loops
If, as here, it is already known before entering the loop, how often it will be executed,
an equivalent formulation with a for loop might be a better choice, namely:
1 >>> for index in range (1,3):
2 ... print (f"{index} times 5 is {5*index}")
The head of the for loop in line 1 declares that the loop body (which in the present case
consists only of line 2) is to be applied successively to every element in range(1,3).
The basic form of the range function is range(start,stop) that produces a sequence
of integers from start (inclusive) to stop (exclusive), such that in the for loop above,
range(1,3) generates the indices 1, 2.
Often used is the short form range(stop) for range(0,stop).
An optional third argument can be included to specify the increment (or decrement) step, such that for instance range(1,10,2) generates the sequence 1, 3, 5, 7, 9, and range(10,1,-2) the sequence 10, 8, 6, 4, 2.
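The listing discussed in the next paragraph is again missing in this copy; presumably (as Example 3.2) it computes the Gauss sum with a for loop, roughly:

1 >>> n, sm = 100, 0
2 >>> for i in range (1, n+1): sm = sm + i
3 >>> sm # Out: 5050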
In line 1, the two assignments n = 100 and sm = 0 are combined. Technically, we make use of the data type ‘tuple’, to be explained in more detail later.
Moreover, line 2 shows that a code block, which consists only of a single line, can
also be written directly after the colon in the same line. We will often make use of this.
Often it is necessary to interrupt the execution of a loop. The break statement terminates the entire loop containing it. Control of the program flow passes to the statement immediately after the loop body. The continue statement is used to skip the rest of the code inside a loop for the current iteration only.
We will see examples later.
Conditional Statements
Often the next command in a program depends on whether a condition holds or not.
In Python (as in most modern programming languages) if-else-commands are used
for this purpose.
In Python, the basic form is
if condition : # True/False switch
statements1 # executed if True
else :
statements2 # executed otherwise
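The listing the following explanation refers to is missing here; a plausible reconstruction, consistent with the line references (the Collatz iteration), is:

1 n = 100
2 while n > 1:
3     if n % 2 == 0: n = n // 2
4     else : n = 3*n + 1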
In the while loop, the value of the variable n is modified, according to whether n is
even or odd.
To test whether n is even, the modulo operator ‘%’ computes the division remainder
in integer division, and the equality test ‘==’ checks if it is 0.
If n is even, the number is replaced by its half, where we use integer division to
keep the result as an integer. If n is odd, the value of n is multiplied by 3 in line 4,
increased by 1, and the new value assigned to n.
In Exercise 3.1 at the end of the chapter you are asked to provide an explanation for
the different behaviors.
All Python data types considered so far are elementary types. In addition to the elementary types, there are various collection types that, as the name implies, collect elements of other types.
Lists
The by far most important collection type in Python is the list type.
For example, the following list
lst = [1, 2, 3]
contains the numbers 1, 2, 3 in that order. The number of elements in a list can be
queried with the ‘len’ function, in this case len(lst) returns 3.
To access the individual elements we use the component notation lst[0], lst[1], lst[2], also known as bracket or subscript notation. Note that the index counter begins with 0. The last element of the list can be accessed by lst[-1], the second to last by lst[-2], etc.
The type list is an example of a class type. It comes along with a large set of access methods, written in the ‘dot’ notation. For instance, with the above list lst, the operation
lst.append(4)
modifies lst, such that it now additionally contains the number 4 at position lst[3].
Conversely, we can, for example, apply the operation
lst.remove(2)
to remove the entry 2 from the list, such that lst now consists of the numbers 1, 3, 4 in that order. Note that the entry 2 is removed, not the entry at index 2. If there are multiple occurrences of 2, only the first one is removed: applying remove(2) to the list [2,3,2] leaves [3,2].
The entries in a list can be of any type, including again lists. Here is e.g. the list representation of Pascal’s triangle with rows 0 through 3:
[[1], [1, 1], [1, 2, 1], [1, 3, 3, 1]]
We can also define matrices in this way. However, we defer this to the next chapter,
where we will encounter more appropriate data structures. In the following we simply
refer to lists of lists as tables.
In general, lists are used to store large amounts of data. To save memory space, an
assignment of the form b = a will not copy the value of a to b. Instead, b is assigned
only the memory address of a, so that a and b now access the same memory space.
If a value is copied, we talk about value assignment, otherwise about reference assignment.
The difference can be illustrated in an everyday situation: “I have made a copy of
the entrance key for you” is a value assignment. “The key is under the door mat” is a
reference assignment.
The difference between the two assignment types is of central importance in most
modern programming languages.
Reference assignments can lead to undesirable side effects when used without caution. The reason is that a change in, say, a also affects the other, b, like this:
>>> a = [1] # a declared as list containing the element 1
>>> b = a # the assignment declares b as a list
>>> a.append(0) # a now becomes [1,0]
>>> b # but then also b has been changed to:
Out: [1, 0]
Remark In fact, this difference is so important that certain data types are specifically
classified as value types, for instance the basic number types. The others are then
called reference types. When in doubt, it is easy to make an assignment as above that
changes one value and then check to see if it affects the other.
A new list with the same entries is constructed with the method copy:
>>> a = [1]
>>> b = a.copy()
>>> a.append(0)
>>> b # unchanged:
Out: [1]
Sieve of Eratosthenes
As an example for the use of the list type, we show how to generate the set of all
prime numbers below a given number 𝑛.
Example 3.4 (Sieve of Eratosthenes) We begin with a list L to store the numbers
2, 3, 4, … , 𝑛, and an initially empty list P:
1 n = 10 # input upper limit
2 L = list (range (2, n+1)) # constructs a list from range()
3 P = [] # [] denotes the empty list
4 while L != []:
5     p = L[0] # the smallest number still in L
6     P.append(p) # is appended to the end of P
7     for i in L.copy(): # iterate over a copy, since L is modified in the loop
8         if i % p == 0: L.remove(i)
9 print (P)
In line 2, the list L is filled with the numbers 2, ..., 𝑛. A short explanation: the range operator produces its values one after the other, as appropriate in a programming loop. Here they are collected into a comprehensive whole through application of the function list.
In line 3, the list P is prepared for collecting the prime numbers. Initially it is empty.
The while loop is executed as long as there are still elements left in L. The symbol ‘!=’
is the Python notation for the relation ‘not equal’. In lines 5 and 6, the smallest number
p still contained in L is appended to P. (The crucial question is: Why is p prime?) Then
all multiples of p (including p itself) are removed from L.
Sublists
Let lst be the list [3,1,4,1,5]. Then lst[1:3] denotes the sublist [1,4] from index position 1 up to, but not including, position 3. (Recall that indices start at 0.) lst[:3] denotes the list [3,1,4] of the first three elements. Similarly, lst[2:] denotes the last elements [4,1,5] starting at index 2.
Concatenation
Let l1 be the list [3,1,4] and let l2 be [1,5]. Then the concatenation ‘l1 + l2’
denotes the list [3,1,4,1,5].
Remark Note that here we encounter a common pattern in Python called operator
overloading, which means that the same operator symbol denotes different operators
in different contexts. In lists, the symbol ‘+’ denotes the concatenation operator.
List Comprehensions
This is an extremely versatile method for creating lists. Let lst be a list, f a function.
(We formally discuss functions below.) Analogous to the mathematical notation for
sets {𝑓(𝑥) ∣ 𝑥 ∈ 𝑋}, then [f(x) for x in lst] is the list of function values resulting
from the successive application of f to the elements of lst.
Finer control is achieved when we additionally apply a filter in the form of an if-condition. For instance, let lst be the list [3,1,4,3]. Then [x**2 for x in lst if x < 4] results in the list [9,1,9], since the element 4 from lst does not satisfy the choice criterion.
Such a list construction is known as list comprehension.
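As a small additional illustration (our own sketch, not from the original text), a list comprehension can generate the rows of the Pascal’s triangle table shown earlier:

pascal = [[1]]
for _ in range (3):
    row = pascal[-1] # the last row computed so far
    pascal.append([1] + [row[i] + row[i+1] for i in range (len (row) - 1)] + [1])
print (pascal) # Out: [[1], [1, 1], [1, 2, 1], [1, 3, 3, 1]]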
Tuples
When collecting elements, the full power of lists (and the associated computation
complexity) is often not required. Then the Python data type tuple comes in handy.
A tuple is a sequence of elements that cannot be changed once it has been created.
A tuple is defined like a list, with the difference that now round parentheses are used instead of brackets:
t1 = (1, 2, 3, 4)
When declaring a tuple, the parentheses can even be left out. The following definition
is equivalent to the one above:
t2 = 1, 2, 3, 4
Note that we have already used tuples before, e.g. in Example 3.2.
With tuples consisting of single elements, some special care is required, however. The
instruction ‘t3 = (1)’ does not assign a tuple, but rather an int number, as can be
verified through the query type(t3). The correct definition of a single-element tuple
requires a comma following the element:
t4 = (1,)
Remark In fact, in the usual mathematical notation there is often no difference between an element 𝑥 and a 1-tuple that consists of this element. If 𝑋ⁿ denotes the set of 𝑛-tuples of a set 𝑋, then for 𝑛 = 1 the tuple (𝑥) ∈ 𝑋¹ is normally identified with the element 𝑥 itself. When programming, however, we have to make this distinction explicit.
Dictionaries
Another useful data type for the collection of elements is the type dict for dictionary. A dictionary consists of a set of key-value pairs. A rectangular cuboid could be defined like this:
c = {'width': 3, 'height': 2, 'depth': 4}
Keys and values are separated by a colon, the individual pairs by commas, the pairs
collected within curly braces. Access to the values is then – as in lists and tuples –
provided by the component operator, such that, say, c['width'] returns the value 3.
Sets
Python also provides a data type set for sets, which unfortunately is rarely discovered
in the wild. Sets are created by enclosing elements in curly braces:
>>> a = {3, 1, 2, 2, 3}
>>> a # Out: {1, 2, 3}
Sets can be iterated in for loops. Note again, however, that element order or multiplicities are not taken into account:
>>> for i in {2, 2, 1, 2}: print (i)
Out: 1
Out: 2
Also the “comprehension” syntax, seen in lists, applies to the type set. The set difference ‘a - b’ could, for instance, equivalently be defined by
c = {x for x in a if x not in b}
Note however, that the empty set cannot be specified as ‘{}’, but instead by e.g.‘set()’:
>>> type ({}) # creates an empty dictionary!
Out: <class 'dict'>
>>> d = set ()
>>> type (d)
Out: <class 'set'>
>>> len (d) # Out: 0, i.e. d is empty
3.6 Functions
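The function definition discussed below is missing from this copy; a reconstruction consistent with the line references in the text:

1 def factorial(n):
2     res = 1
3     for i in range (1, n+1): res = res*i
4     return res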
In line 1, the keyword ‘def’ indicates that a function definition follows, more precisely
the definition of a function called factorial in one variable n.
As with the notation of loops and the if-else statement, the colon ‘:’ indicates that
a block of code follows in which the effect of the function is defined.
The function value res is computed in lines 2 and 3 and output in line 4 with the
instruction ‘return’.
Note in line 3 that for 𝑛 = 0, range(1,1) is empty, such that the loop command
is not executed, and the initial value 1 of res left unchanged.
The function defined in this way is tested for the argument 4:
print (factorial(4)) # Out: 24
For functions whose command block consists of a single expression, a simplified definition based on a so-called anonymous lambda expression is often preferable.
Consider for instance
>>> def double(x): return 2*x
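The equivalent anonymous formulation, presumably shown at this point in the original, is
>>> lambda x: 2*x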
Note that this notation corresponds directly to the mathematical notation 𝑥 ↦ 2𝑥.
A lambda expression can be applied as a function, say
>>> ( lambda x: 2*x)(4) # Out: 8
or assigned to a variable
>>> dbl = lambda x: 2*x
Functions as Arguments
In Python, arbitrary data types may be used as function arguments, in particular also
functions themselves.
Example 3.6 The following function vmax computes the maximum value of a func-
tion 𝑓 on 𝑛 + 1 equidistant points 0 = 0/𝑛, 1/𝑛, 2/𝑛, ..., 𝑛/𝑛 = 1:
1 def vmax(f,n):
2 max_y = f(0)
3 h = 1.0/n
4 for i in range (1, n+1):
5 y = f(i*h)
6 if y > max_y: max_y = y
7 return max_y
In line 1, both the function 𝑓 to be evaluated and the number 𝑛 are declared as argu-
ments of the function vmax(f,n). The function values are determined one after the
other in the for loop and the largest y so far is stored in the variable max_y.
We test the function vmax for an example:
def g(x): return 4*x*(1 ‐ x)
print (vmax(g,7)) # Out: 0.9795918367346939
In Python, functions can not only be used as arguments, but also output as return
values.
Example 3.7 We define a derivative operator ddx that, for an input function 𝑓, com-
putes an approximation to the derivative 𝑓′ according to the formula
𝑓′(𝑥) ≈ (𝑓(𝑥 + ℎ) − 𝑓(𝑥)) / ℎ,  ℎ → 0:
1 def ddx(f):
2 h = 1.e‐6
3 def f_prime(x): return (f(x+h) ‐ f(x)) / h
4 return f_prime
Line 2 defines the value of ‘h’, which is responsible for the accuracy of the approxi-
mation. The notation 1.e-6 stands for 0.000001 in decimal exponent representation.
In line 3, a local function f_prime is defined, which is then returned as result of the
function ddx in line 4.
We test the ddx operator:
5 def g(x): return 4*x*(1 ‐ x)
6 print (ddx(g)(0.3)) # Out: 1.5999960000234736
7 dgdx = ddx(g)
8 print (dgdx(0.5)) # Out: ‐4.0000225354219765e‐06
Line 5 defines a function g for which we test the operator in lines 6–8.
In line 6, we print the approximate value of 𝑔 ′ (0.3).
In line 7, we assign ddx(g) to a variable dgdx, which can then be applied as a new
function in line 8.
Recursion
In many cases the ability to define functions recursively leads to more compact and
also – after getting used to – easier to understand definitions.
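A sketch of the recursive variant analyzed below:
1 def factorial_rec(n):
2     if n == 1: return 1
3     return n*factorial_rec(n-1)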
In lines 2 and 3, you might expect an if-else command. However, this is already im-
plicitly contained in the formulation above. If the condition n == 1 holds, the com-
mand ‘return 1’ terminates the computation with the return value 1, so that line 3
is no longer considered.
A recursive function definition always consists of one or more base cases, for which
the function produces a result directly, and one or more recursive cases, for which
the program recurs (calls itself).
For simple functions, Python provides a convenient shorthand notation for such
if-else cases. The factorial_rec can, for instance, equivalently be defined as
def factorial_rec(n): return 1 if n == 1 else n*factorial_rec(n‐1)
Remark Conditional expressions of the form ‘x if c else y’ as above are often referred to as ternary, since they involve three parameters. We will
meet similar constructions also in various other programming languages.
Here are two more typical examples of recursive functions. We write them in short-
hand notation:
Example 3.10 The Euclidean algorithm for the computation of the greatest common
divisor of two natural numbers:
def gcd(a,b): return a if b == 0 else gcd(b, a % b)
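Another classic, this time with a composite base-case condition, is the Fibonacci sequence (a sketch):
def fib(n): return n if n == 0 or n == 1 else fib(n-1) + fib(n-2)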
Note that here we encounter a composite test condition. Recall that conditions can
generally be combined with the logical operators and, or, not.
Example 3.12 In the mathematical theory of computability (nowadays rather a dis-
cipline within computer science), the following Ackermann function plays a funda-
mental role:
def ack(x,y):
if x == 0: return y + 1
if y == 0: return ack(x‐1, 1)
return ack(x‐1, ack(x, y‐1))
Note that the Ackermann function is only of theoretical interest. Even with the most
advanced machines it is not possible to compute it for more than a handful of small
arguments. You might try an input 𝑥 = 4, 𝑦 > 0. The function is somewhat more
forgiving in the second argument: for 𝑥 = 3 it still works with 𝑦 = 8, but no longer with 𝑦 = 9.
We end our brief introduction to the basic concepts of Python with an example in
which both recursion and Python’s sophisticated list processing play an important
role, the famous Quicksort algorithm for sorting a list of numbers.
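A compact formulation could look as follows (a minimal sketch; the function name and the exact splitting are assumptions):

def qsort(lst):
    if len(lst) <= 1: return lst            # base case: nothing to sort
    pivot = lst[0]                          # first element as pivot
    smaller = [x for x in lst[1:] if x < pivot]
    larger = [x for x in lst[1:] if x >= pivot]
    return qsort(smaller) + [pivot] + qsort(larger)

print (qsort([3, 1, 4, 1, 5, 9, 2, 6])) # Out: [1, 1, 2, 3, 4, 5, 6, 9]

Formatted Strings

For the formatted output of results, Python provides so-called f-strings, in which expressions enclosed in curly braces are evaluated and interpolated into the string:
Example 3.14
1 >>> n = 1
2 >>> fstr = f"{n} divided by 3 is {n/3}"
3 >>> print (fstr)
4 Out: 1 divided by 3 is 0.3333333333333333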
In the string in line 2, the expressions ‘n’ and ‘n/3’ to be evaluated are enclosed in
curly braces. The prefix ‘f’ causes the evaluation and the replacement of the place-
holders by the values. The modified string is then passed to the print command in
line 3.
To allow finer control, expressions can be followed by an optional format specifier
separated by a colon:
Example 3.15 If we replace line 2 in Example 3.14 with
2 >>> fstr = f"{n} divided by 3 is {n/3 :.4}"
we get an output where the result is displayed with 4 significant digits:
3 >>> print (fstr)
4 Out: 1 divided by 3 is 0.3333
Example 3.16 We give an example to illustrate the main formatting instructions used
in numerical computations:
1 >>> f"|{12.34567 :.3f}|" # Out: '|12.346|'
2 >>> f"|{12.34567 :7.3f}|" # Out: '| 12.346|'
3 >>> f"|{12 :3d}|" # Out: '| 12|'
In line 1, the format specification ‘.3f’ causes the number 12.34567 to be rounded
to 3 decimal places, and then output. The number of required places is determined
automatically.
The specifier ‘7.3f’ in line 2 additionally sets the number of places for the out-
put to 7 (including the decimal point), and causes the number to be printed right-
justified.
In line 3, the specifier ‘3d’ reserves 3 digits for an integer decimal number.
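3.8 Writing and Reading Files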
Reading data from files, and saving data to them, is of course also essential in mathematical programming. In the following, we discuss the basic methods for file processing
in Python. First, we take a look at standard text files that can store character strings.
We then consider specialized methods for handling binary data.
Writing Strings
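A minimal listing matching the following description (the string written in line 3 is an arbitrary assumption):
1 wf = open ('parrot.txt', 'w')
2 wf.write('The parrot is dead!\n')
3 wf.write('No, it is just resting.\n')
4 wf.close()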
In line 1, the open command opens a file called parrot.txt (first creating it, if it does
not already exist). The second argument 'w' in open states that the file is opened for
writing. The file is then linked to the file-handler variable wf, so that it can be accessed
from the program.
The write method in line 2 outputs the string 'The parrot is dead!\n' to the
file. Here ‘\n’ is interpreted as a control character, which causes a line break.
In line 3, another text line is appended to the end of the file. In line 4, access to the
file is terminated, and the connection to the file handler wf cut off.
The open command in line 1 above refers to a specific default “working directory”
where the file is stored, here Spyder’s own home directory. If another directory
is desired, an explicit path must be given, such as ‘/Users/myName/Desktop/par‐
rot.txt’, if the file is to be stored on user myName’s desktop.
A convenient way to deal with file paths is to use string concatenation. For example
with
>>> dir_path = '/Users/myName/Desktop/'
we could access the file on the desktop by the path “dir_path + 'parrot.txt'”.
For simplicity, we continue to assume all files to be stored in the default directory,
so that no explicit access path is required.
Reading
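A matching listing (the handler name rf is an assumption):
1 rf = open ('parrot.txt', 'r')
2 fstr = rf.read()
3 print (fstr)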
In line 1, the parameter 'r' indicates that the file is to be accessed in reading mode.
In line 2, the entire file content is assigned to the variable fstr as a string, including
the tag ‘\n’ denoting the line break. Therefore, the print command in line 3 causes
the text to be output in two lines.
Numbers
In fact, the case above is already the general one. Only strings can be stored in files,
no numbers. Number descriptions must therefore be wrapped in character strings
before being written to the file.
This is also illustrated with a simple example:
Example 3.17 We consider a number table tbl and convert it to a character string
tblstrg as follows:
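A sketch of the conversion, matching the line numbers referenced below:
1 tbl = [[1, 2, 3], [4, 5, 6]] # the number table
2 tblstrg = ''
3 for r in tbl:
4     for num in r: tblstrg += f' {num}'
5     tblstrg += '\n'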
Line 4 is central. For each number num in row r of the table, an empty-space marker
followed by a string representation of the number is appended to tblstrg. Note again
the use of f-strings to interpolate the num value.
The string representation of the row is then terminated by the newline charac-
ter ‘\n’ in line 5.
After the first run of the outer for loop, tblstrg looks like this: ' 1 2 3\n', after
the second (and last) like this: ' 1 2 3\n 4 5 6\n'.
The inverse conversion of the string tblstrg into a table in_tbl is then as follows:
1 row_lst = tblstrg.split('\n')
2 in_tbl = []
3 for r in row_lst:
4 nums = [ int (c) for c in r.split()]
5 if nums == []: break
6 in_tbl.append(nums)
In line 1, the string tblstrg is divided at the ‘\n’-marks, and the substrings collected
into a list row_lst, which then looks like this: [' 1 2 3',' 4 5 6','']. The last
empty string results from the newline character at the end of the line.
In the following for loop, each str component r in row_lst is converted to a list
of numbers and inserted into the table in_tbl. Here line 4 is central: first, r is split
up at the empty-space positions by split(), then each component c is converted to
an integer number by the function int(), and finally collected in a new list nums, for
example, [1, 2, 3].
In line 6, the list nums is appended to the end of in_tbl.
Line 5 captures the trailing empty string. The command ‘break’ terminates the
execution of the loop and the program control jumps to the next statement after the
loop.
Remark Recall that the break command terminates the loop’s entire execution,
whereas in contrast, a continue command causes the loop to skip the remainder
of its body in the current iteration round and immediately start the next one.
The approach of encoding data in strings has one major advantage. The resulting text
files can also be read by other programs and then processed further.
In many cases, however, you only want to back up the data for your own needs
or pass it on to other Python users. Then there are solutions which leave the coding and decoding entirely to the machine. One of them is provided by the Python module pickle, from which we import the functions dump and load:
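1 from pickle import dump, load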
We will discuss the library import mechanism in general in more detail later. Here it
is sufficient to note that we can now use dump and load like any other built-in Python
function.
As an example, we store the following table using pickle encoding:
2 tbl = [[1, 2, 3], [4, 5, 6]] # input
We open the file datafile.pkl (creating it, if it does not yet exist), where the table
is to be stored:
3 fwb = open ('datafile.pkl', 'wb')
Note that the second argument in open is now 'wb'. This means that the file is to be
written with binary data, no longer with normal readable text.
The dump command in line 4 causes the table tbl to be encoded and then written
to the file:
4 dump(tbl, fwb) # write
5 fwb.close()
To regain the content we open the file in ‘read binary’ mode, and use load to assign
the decoded content to the variable in_tbl:
6 frb = open ('datafile.pkl', 'rb')
7 in_tbl = load(frb) # read
Finally, we check if the original input has survived the procedure unharmed:
8 print (in_tbl)
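Out: [[1, 2, 3], [4, 5, 6]]

3.9 Object-Oriented Programming and Classes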
A highly versatile idea in modern programming is to group data types together with
their application tools into functional units. To this end, Python, like many other lan-
guages today, follows the paradigm of object-oriented programming and in particular
implements the concept of class constructions.
A class is essentially a data type combined with various access methods, i.e. special
functions that only apply to objects of this class.
We have in fact already made heavy use of one prominent example. The data type
list is a class type. We have also encountered some list methods, such as append
and remove. Typical for class methods is that their application is denoted in the so-
called dot notation, such as lst.append(3) or lst.remove(2).
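As a running example, we define a class for fractions; a sketch of the listing analyzed below:
1 class Fraction:
2     def __init__(self, num, den): # initializer
3         self.num = num            # numerator
4         self.den = den            # denominator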
In a sense, the class is a blueprint for the construction of objects that are to behave
as fractions. An actual object (in object-oriented speech: instance of the class) is then
constructed with an instruction of the form a = Fraction(3,4), assigning it to the
variable a.
In detail, the process looks as follows: First, the initializing function __init__
allocates memory for two internal variables self.num and self.den in lines 2–4
and assigns the external input values of num and den (here the numbers 3 and 4) to
these variables. This memory space is then linked to the variable named ‘a’.
The use of the same designators num and den for both internal and external refer-
ences is unproblematic and in fact quite customary.
Now the inner components can be accessed via a.num and a.den:
>>> a.num, a.den # Out: (3, 4)
Actually we could also access the components in write mode, and for example, change
the numerator of a to 5 with the assignment a.num = 5. But we will not need that
here.
For the processing of Fraction objects we add two methods. In object-oriented
speech, a method is simply a function defined within a class.
First we define a method to add fractions. The following lines are indented to
indicate that they still belong to the code block initiated in line 1:
5 def add(self, b): # fraction addition
6 return Fraction(self.num*b.den + b.num*self.den,
7 self.den*b.den)
The add method serves to add a second fraction b to an already existing one a. It is
applied like this:
>>> b = Fraction(1,2)
>>> c = a.add(b)
>>> c.num, c.den # Out: (10, 8)
Note that the application of the method appears in the so-called dot notation, where
the function is written as postfix in only one variable, instead of – as one could per-
haps expect – in the form add(a,b). However, what happens is that the method
belonging to the distinguished object a is applied, for which then the other object b
becomes the function argument.
If we now call to mind that the placeholder self in lines 6 and 7 stands for the
object that applies the method, the effect of add should be clear.
The following method ‘isEqualTo’, as the name suggests, serves to test whether two
fractions are equal. It is applied in the form a.isEqualTo(b) and returns one of the
Boolean values True or False.
8 def isEqualTo(self, b): # equality between fractions
9 return True if self.num*b.den == self.den*b.num else False
Note that in line 9 we use the shorthand notation explained in Remark 3.9.
Example (with a, b, c as defined above):
>>> d = b.add(a)
>>> c.isEqualTo(d) # Out: True
From a mathematical point of view, it is not entirely satisfactory that the arguments
a and b are not treated symmetrically.
In certain cases, however, Python provides special identifiers that permit more
familiar expressions. If we replace ‘add’ by ‘__add__’ in line 5 in the class definition
above, we can write fraction addition in the usual form ‘a + b’ instead of a.add(b).
Note that this is again an example of operator overloading. Depending on context,
the symbol ‘+’ can denote different operations.
Similarly, in line 8 the designator isEqualTo can be replaced by ‘__eq__’, such
that the equality test for two fractions a and b can be performed as ‘a == b’.
Remark Note that we follow the usual convention of using uppercase for class names
and lowercase for methods.
Exercise In Sect. 3.5 we noted the distinction between value and reference types. To
which do class objects belong?
Polynomials
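As a second example, we build a class for polynomials, given by their coefficient lists. A sketch of the first lines of the definition (the listing continues with lines 5–7 below):
1 class Polynomial:
2     def __init__(self, coeff): # coefficient list, lowest degree first
3         self.coeff = coeff
4     def __call__(self, x):     # evaluation at the point x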
5 s = 0
6 for i in range ( len (self.coeff)): s += self.coeff[i]*x**i
7 return s
In the initialization method ‘__init__’, the coefficient list from the input is assigned
to the internal variable self.coeff.
The method ‘__call__’ computes the function value for a polynomial p applied
to an argument x. It is applied in the form p(x). The function len in line 6 returns
the length of a list, i.e. the numbers of elements it contains.
To illustrate the development so far:
>>> p = Polynomial([1, 2, 3])
>>> p(4) # Out: 57
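A sketch of the addition method consistent with the following discussion (in particular, lines 12 and 15 are assumed to carry the ‘lst +=’ concatenations mentioned in the Remark below):
8     def __add__(self, q): # method polynomial addition
9         lst = []
10        d1, d2 = len(self.coeff), len(q.coeff)
11        if d1 >= d2:
12            lst += self.coeff # longer coefficient list first
13            for i in range(d2): lst[i] += q.coeff[i]
14        else:
15            lst += q.coeff
16            for i in range(d1): lst[i] += self.coeff[i]
17        return Polynomial(lst)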
The method ‘__add__’ serves to add polynomials p, q in the user-friendly form p+q
(similar to the addition in the class Fraction). In line 9, an empty list lst is gener-
ated. The sums of the corresponding coefficients are then appended to the list one
after the other.
It should be noted that the coefficient lists are generally of different lengths. In the
if-else block, the longer one of the two is first appended to the empty list lst in lines
11 or 14, respectively. The following for loop then adds the components of the shorter
one. The return value is a new polynomial with the added coefficients.
Remark One could be tempted to use an assignment of the form ‘lst =’ instead of
the concatenation ‘lst +=’ in lines 12 and 15. However, as we have already seen in
Sect. 3.4, this could possibly have undesired side effects.
We come to multiplication:
19 def __mul__(self, q): # method polynomial multiplication
20 d1, d2 = len (self.coeff), len (q.coeff)
21 lst = [0 for i in range (d1 + d2 ‐ 1)]
22 for i in range (d1):
23 for j in range (d2):
24 lst[i+j] += self.coeff[i]*q.coeff[j]
25 return Polynomial(lst)
The method ‘__mul__’ computes the coefficients of the product polynomial accord-
ing to the standard procedure for polynomial multiplication.
In line 21, a list of the required length is defined and filled with the placeholder
values 0. This initialization occurs so frequently that NumPy, for
example, which will be discussed in the next chapter, provides a special function ze‐
ros for exactly this purpose.
The analysis of the rest of the method’s operation is left to the reader.
Finally we include an equality test ‘p == q’ for polynomials. Intuitively, two poly-
nomials are equal if their coefficient lists are equal. However, we have often stressed
that we should not rely on equality tests between float numbers.
We interpret equality as “indistinguishable within machine precision” and arrive at:
26 def __eq__(self, q):
27 d = len (self.coeff)
28 if d != len (q.coeff): return False
29 for i in range (d):
30 if abs (self.coeff[i] ‐ q.coeff[i]) > 1.e‐14:
31 return False
32 return True
Line 28 states that polynomials with coefficient lists of different length cannot be
equal. Otherwise, the coefficient pairs are compared, and if they at some point differ
by more than 1.e-14, the polynomials are considered not equal.
Only if all comparisons remain within the tolerance do we return the verdict True.
To illustrate the methods in the class Polynomial, we test the distributive law:
>>> p = Polynomial([1,2])
>>> q = Polynomial([3,4,5])
>>> r = Polynomial([6,7,8,9])
>>> p*(q + r) == p*q + p*r
Out: True
Inheritance
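We illustrate inheritance with a subclass Parabola of Polynomial for polynomials of degree 2. A sketch consistent with the following discussion (the printed messages are taken from the outputs shown below):
1 class Parabola(Polynomial):
2     def __init__(self, coeff):
3         if len(coeff) != 3:
4             print('no parabola')
5         else:
6             super().__init__(coeff) # hand over to superclass init
7     def roots(self): # root computation, so far only a stub
8         print('To be implemented')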
The init function rejects input of lists that do not consist precisely of three coeffi-
cients, required for parabolas.
Otherwise, in line 6 the input is handed over to the init function of the superclass
Polynomial, such that then as before the coefficients are imported into the internal
variable coeff.
Let’s check what we have so far:
>>> p = Parabola([1, 2]) # Out: no parabola
>>> p = Parabola([1, 2, 3])
>>> p(4) # Out: 57
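An attempt to compute roots of a product, however, fails, since ‘__mul__’ returns a plain Polynomial:
>>> (p*p).roots()
Out: AttributeError: 'Polynomial' object has no attribute 'roots'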
This last response is acceptable, since the polynomial p*p is in fact not a parabola, to which the roots method could meaningfully apply.
However, we receive the same response for ‘p + p’, which is a parabola. The remedy
is to modify the Polynomial method add to one where the return value is of type
Parabola:
10 def __add__(self, q):
11 lst = [0, 0, 0]
12 for i in range (3): lst[i] += self.coeff[i] + q.coeff[i]
13 return Parabola(lst)
We try again:
>>> p = Parabola([1, 2, 3])
>>> (p + p).roots() # Out: To be implemented
Now it works. What has happened, is that the new Parabola method ‘__add__’ over-
rides the one in the superclass Polynomial.
Exercises
Machine Epsilon
Exercise 3.1 Consider the two while loops discussed in connection with the machine
epsilon in Sect. 3.4.
Give a detailed explanation of the different behaviors. To this end, it might be
useful to consult e.g. the article en.wikipedia.org/wiki/IEEE_754 on the numeri-
cal standard in floating-point computation, and/or follow the program execution by
inserting print commands.
Polynomial Class
Exercise 3.2 Extend the class Polynomial, described in Sect. 3.9, to include methods
for computing the derivative and antiderivative of polynomials, and illustrate them
for 𝑝(𝑥) = 3𝑥 2 + 2𝑥 + 1.
Linear Algebra
Exercise 3.3 Define a basic Matrix class. It should include matrix multiplication in
the form 𝐴∗𝐵, a method to print a matrix to the screen in the usual form as a vertical
sequence of rows, such as e.g.:
1. 2. 3.
4. 5. 6.
and a test for equality using the infix-operator ‘==’. Give an example to illustrate that
multiplication is associative.
The solutions to the following exercises can either be formulated as independent pro-
grams or preferably included as methods in the above matrix class or in a subclass.
Exercise 3.4 Compute the solution of the linear equation system 𝐴𝑥 = 𝑏, where
    ( 1 2 3 )        ( 2 )
𝐴 = ( 4 5 6 ),   𝑏 = ( 3 ).
    ( 7 8 8 )        ( 5 )
    (  3 −1  2 )
𝐴 = ( −3  4 −1 ).
    ( −6  5 −2 )
(∗)  |𝐸(𝑓, 𝑎, 𝑏)| ≤ ((𝑏 − 𝑎)³ / 12) · max_{𝑎≤𝑥≤𝑏} |𝑓″(𝑥)|.
To better approximate the integral, we subdivide the interval [𝑎, 𝑏] into 𝑛 adjacent
equal-size subintervals of length ℎ = (𝑏 − 𝑎)/𝑛. In each subinterval we apply the simple
trapezoidal rule, and then sum up the resulting approximations. This gives the com-
posite trapezoidal rule.
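Written out, with 𝑥ᵢ ∶= 𝑎 + 𝑖ℎ, the composite rule computes ℎ (𝑓(𝑥₀)/2 + 𝑓(𝑥₁) + ⋯ + 𝑓(𝑥ₙ₋₁) + 𝑓(𝑥ₙ)/2); note that each interior point belongs to two adjacent trapezes, but needs to be evaluated only once.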
Note Some functions and constants needed for the exercises below can be found in
the package math. They can be accessed, for instance, through ‘from math import
sin, pi’.
Exercise 3.7
(1) Define a Python function
trapeze(f,a,b,n)
to compute the integral of a function 𝑓 over an interval [𝑎, 𝑏] according to the com-
posite trapezoidal rule for 𝑛 subintervals. Avoid repeated computations of the indi-
vidual function values 𝑓(𝑥).
(2) Use the trapeze function to approximate the integral 𝐼 ∶= ∫₀^𝜋 sin(𝑥) 𝑑𝑥, such
that the result is correct to 6 decimal places. To do this, use the exact value 𝐼 = 2,
which results from the evaluation of the antiderivative −cos of sin.
In the last two exercises, the integral could be calculated directly, using the an-
tiderivative. Of course, numerical integration is of particular interest when the in-
tegral cannot be expressed by known antiderivatives in a closed form.
Exercise 3.9 For example, consider the function 𝑓(𝑥) = 𝑒^(𝑥²). Compute the integral ∫₀¹ 𝑓(𝑥) 𝑑𝑥
to 6 decimal places. For this purpose, determine the number 𝑛 of necessary equidis-
tant evaluation points by applying the above error estimate (∗) to each individual
trapeze over the subintervals.
The Taylor series is probably the most important tool in numerical mathematics. It
plays a major role in error estimation in approximation methods. For instance, the
estimate (∗) above is based on the evaluation of a suitable Taylor series. In general,
the Taylor series allows us to approximate
𝑓(𝑥 + ℎ) ≈ ∑ᵢ₌₀ⁿ (ℎ^𝑖 / 𝑖!) 𝑓^(𝑖)(𝑥)
Exercise 3.10 Write a Python function that uses this expansion for the approximation of 𝑒^𝑥 to 6 decimal places. Formulate the loop condition with-
out reference to the true value. Try to use as few elementary arithmetic operations
+, ∗, −, / as possible.
The Newton method is a standard method for the numerical determination of zeros
of nonlinear functions.
Let 𝑓∶ ℝ → ℝ be a continuously differentiable function. Consider the recursively
defined sequence
𝑥ₙ₊₁ ∶= 𝑥ₙ − 𝑓(𝑥ₙ)/𝑓′(𝑥ₙ),  𝑥₀ given.
If this sequence converges, then to a root of 𝑓. Again, this can be shown with the
aforementioned Taylor-series expansion.
Exercise 3.11
(1) Write a Python function
newton(f, f_prime , x)
which applies the Newton method to the initial value 𝑥0 until |𝑥𝑛+1 − 𝑥𝑛 | < 10−7 . To
catch the case that the computation does not converge, it should be interrupted with
an error message after 100 unsuccessful steps.
Note that in addition to the function 𝑓 itself, the derivative 𝑓′ must also be sup-
plied explicitly.
(2) Test the procedure for 𝑓(𝑥) ∶= 𝑥 2 − 2 and the initial value 𝑥 = 1. Compare the
result to the exact solution √2.
Exercise 3.12 Explain the following apparent counter example to Fermat’s Last The-
orem:
>>> 844487.**5 + 1288439.**5 == 1318202.**5 # Out: True
Chapter 4
Python in Scientific Computation
The Python language itself contains only a limited number of mathematical func-
tions. Fortunately, it is easy to seamlessly integrate comprehensive libraries. For ex-
ample, the computation of sin(𝜋/2) could be performed as follows:
>>> from math import sin, pi
>>> sin(pi/2) # Out: 1.0
4.1 NumPy
There are several ways to include external libraries. One we have already seen above
(and also encountered in the exercises in the last chapter). Another one is for example
(illustrated with numpy as example) to issue the command ‘import numpy’.
In that case, calls of NumPy components must be preceded by the prefix ‘numpy.’,
such as numpy.sin(numpy.pi).
Often the form ‘import numpy as np’ is used, which declares np as an abbreviation for numpy, so that we can write np.sin(np.pi).
Here we often prefer the simplest form, namely ‘from numpy import *’, which
means that we can use the entire library content without prefix, and write sin(pi),
for example.
As a first example we will write a NumPy program for solving quadratic equations.
Before that, however, we introduce the eminently important NumPy data type array.
NumPy Arrays
A NumPy array is similar to a Python list, however with two limitations. An array
can only contain numbers, and the length of an array – once declared – cannot be
changed. These apparent drawbacks however come along with a decisive advantage.
The specialized arrays allow much faster processing, which becomes quite relevant for
data sets with e.g. millions of elements. A further advantage is the algebraic structure
of the arrays. They behave like mathematical vectors. They can be added, multiplied
with scalars, or scalars may be added componentwise to all elements. As we will see,
this idea of componentwise processing is a major feature.
After these preparations we come to our first example:
Example 4.1 (Quadratic Formula) We develop a program for the solution of quadra-
tic equations 𝑎𝑥 2 + 𝑏𝑥 + 𝑐 = 0, 𝑎 ≠ 0, according to the formula
𝑥₁,₂ = (−𝑏 ± √(𝑏² − 4𝑎𝑐)) / (2𝑎).
The following function solveq returns both solutions:
1 from numpy import sqrt, array
2 def solveq(a,b,c):
3 d = b**2 ‐ 4*a*c
4 if d < 0: return 'No real‐valued solutions'
5 w = sqrt(d)
6 return 1.0/(2*a)*(‐b + array([w, ‐w]))
The root function sqrt in line 5 is not contained in the basic Python kernel, it belongs
to NumPy. Note that it always returns only the positive root.
Lines 4 and 6 illustrate that the return value of a function is not limited to a single
type, in line 4 it is a string, in line 6 a pair of floats, more precisely a NumPy array
with two components.
Let’s take a closer look at the expression in line 6. The root in line 5 is stored in
± form in the list [w,‐w] and then converted to a NumPy array by the function ar‐
ray(). Then the value ‐b is added componentwise to each entry, resulting in the
array [‐b+w, ‐b‐w]. The calculation then ends with the multiplication of the array
by the scalar ‘1.0/(2*a)’.
We test the program for 4𝑥 2 − 12𝑥 − 40:
7 print(solveq(4, ‐12, ‐40)) # Out: [5. ‐2.]
Exercise Modify solveq to an implementation of the roots method for the Parabola
class in Sect. 3.9 in the last chapter.
In NumPy, vectors and matrices are directly represented by arrays and can therefore
be added to and multiplied with scalars.
In the following, we assume all required NumPy components to be imported.
Vectors
In the example above we already indicated how NumPy arrays can be used to repre-
sent vectors. Here is a collection of basic vector operations:
1 v = array([1, 2, 3])
2 v[1] # access to components
3 2 * v # multiplication with scalar
4 3 + v # componentwise addition of scalar
5 w = array([4, 5, 6])
6 v + w # addition
7 v * w # attention! componentwise multiplication
8 v @ w # correct dot product
In lines 1 and 5, the vectors v and w are created as Python lists and then converted to
NumPy arrays by the array() function.
Lines 3, 4, 6 and 7 illustrate the component-based approach in NumPy arrays.
Note in line 8 that Python provides ‘@’ as an operator for matrix multiplication
and thus also for the dot product.
Remark The two statements print([1,2,3]) and print(array([1,2,3])) return
the same result [1,2,3]. However, the actual data type can be queried with the ‘type’
function:
1 >>> type([1, 2, 3]) # Out: list
2 >>> type(array([1, 2, 3])) # Out: numpy.ndarray
The response in line 2 is the internal name ndarray for the NumPy data type array.
The prefix ‘nd’ stands for ‘𝑛-dimensional’.
Exercise What happens if we work directly with the lists, i.e. if we consider 2*[1,2,3]
or [1,2,3] + [4,5,6] for lists?
Matrices
Matrices are two-dimensional arrays in the above sense and can also be represented
directly as NumPy arrays:
1 A = array([[1, 2], [3, 4]])
2 A[0,1] # component
3 Atp = A.T # transpose of A
4 B = array([[5, 6], [7, 8]])
5 C = A + B # correct componentwise addition
6 D = A * B # attention! componentwise multiplication
7 E = A @ B # correct matrix multiplication
As already seen for lists in general, it is often useful to extract parts from a given array.
NumPy makes it possible to access not only individual elements of a matrix, but also
arbitrary subblocks. For this, simply replace numbers with lists (or arrays or tuples)
when indexing.
As an example consider the matrix
>>> A = array([[ 1, 2, 3, 4],
[ 5, 6, 7, 8],
[ 9, 10, 11, 12]])
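Sub-blocks are then extracted by indexing with ranges or lists; a short sketch (the concrete selections are arbitrary examples):
>>> A[0:2, 1:3] # rows 0-1, columns 1-2
Out: array([[2, 3],
            [6, 7]])
>>> A[[0, 2]][:, [1, 3]] # rows 0 and 2, columns 1 and 3
Out: array([[ 2,  4],
            [10, 12]])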
With the method flatten all elements of a matrix can be arranged linearly in an
array, by default row by row from left to right:
1 >>> A = array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]]);
2 >>> A.flatten() # elements written in row‐major order:
3 Out: array([ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12])
4 >>> A.flatten(order='F') # in column ‐major order:
5 Out: array([ 1, 5, 9, 2, 6, 10, 3, 7, 11, 4, 8, 12])
The parameter 'F' in line 4 denotes that the matrix is read column by column and
each column from top to bottom, as is usual in the Fortran programming language.
The reshape(n,m) function can be used to reshape an array or matrix of compatible
size into an 𝑛 × 𝑚 matrix with the same elements:
>>> a = array([1, 2, 3, 4, 5, 6])
>>> a.reshape(2, 3) # row‐major order:
Out: array([[1, 2, 3],
[4, 5, 6]])
>>> a.reshape(2, 3, order='F') # column‐major order:
Out: array([[1, 3, 5],
[2, 4, 6]])
Remark Note that the default option for flatten and reshape can also be specified
explicitly as order='C', referring to the representation order of matrices in the C
programming language. The interested reader is referred to Sect. 6.4 in the C chapter.
The column-major option in flatten and reshape will be of particular importance
in Sect. 4.8, where we solve partial differential Poisson equations with the so-called
finite difference method. There we need to flatten a quadratic matrix 𝑈 to a vector 𝑢
with entries in column-major order. After some processing, the new 𝑢-values should
be stored in 𝑈 at the original positions.
The following example illustrates how the original positions can be regained:
>>> U = array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
>>> u = U.flatten(order='F')
Out: array([1, 4, 7, 2, 5, 8, 3, 6, 9])
>>> u.reshape(3, 3, order='F')
Out: array([[1, 2, 3],
[4, 5, 6],
[7, 8, 9]])
Standard Matrices
>>> ones((2,3))
Out: array([[ 1., 1., 1.],
[ 1., 1., 1.]])
>>> identity(2)
Out: array([[ 1., 0.],
[ 0., 1.]])
Other methods for defining arrays and matrices build on the list comprehension dis-
cussed earlier.
Example: Using the NumPy function arange(4) to define the array([0,1,2,3])
we get:
>>> array([i**2 for i in arange(4)]) # Out: array([0, 1, 4, 9])
As a somewhat larger example, a so-called Toeplitz matrix, i.e. a matrix in which each
diagonal descending from left to right is constant, can be constructed as follows:
>>> arc = lambda r, c: r ‐ c
>>> array([[arc(r,c) for c in arange(4)] for r in arange(3)])
Out: array([[ 0, ‐1, ‐2, ‐3],
[ 1, 0, ‐1, ‐2],
[ 2, 1, 0, ‐1]])
Remark The NumPy function arange is similar to the Python function range, with
the difference that arange generates a NumPy array. The general form is
arange(start, stop, step).
We show how to construct the so-called Poisson matrix, which we will use later to
solve partial differential equations.
The basic building block is the 𝑛 × 𝑛, 𝑛 ≥ 3, tridiagonal matrix given by
    (  4 −1             )
    ( −1  4 −1          )
𝐷 = (     ⋱   ⋱   ⋱     ).
    (        −1  4 −1   )
    (           −1  4   )

We first construct the matrix sD, which carries 1s on the sub- and superdiagonal and 0s elsewhere; the NumPy function diag(w, k) places the vector w on the 𝑘-th diagonal:
1 >>> n = 3
2 >>> w = ones(n‐1)
3 >>> sD = diag(w,1) + diag(w,‐1); sD
4 Out: array([[ 0., 1., 0.],
5 [ 1., 0., 1.],
6 [ 0., 1., 0.]])
We can now turn to the construction of the Poisson matrix. For a given 𝑛 ≥ 3, it is
defined as the following 𝑛2 × 𝑛2 block tridiagonal matrix:
    (  𝐷 −𝐼             )
    ( −𝐼  𝐷 −𝐼          )
𝐴 = (     ⋱   ⋱   ⋱     ),
    (        −𝐼  𝐷 −𝐼   )
    (           −𝐼  𝐷   )

where 𝐼 denotes the 𝑛 × 𝑛 identity matrix.
The construction uses the Kronecker product ⊗, in which each entry 𝑎ᵢⱼ of the first factor is replaced by the block 𝑎ᵢⱼ ⋅ 𝐵, as in the following example:

( 1 2 )   ( 7 8 )   (  7  8 14 16 )
( 3 4 ) ⊗ ( 9 0 ) = (  9  0 18  0 )
                    ( 21 24 28 32 )
                    ( 27  0 36  0 )
Example 4.2 We show how to generate the Poisson matrix in NumPy. For conve-
nience, we repeat the construction of the matrices sD and D from above.
1 n = 3
2 w = ones(n‐1); sD = diag(w, 1) + diag(w, ‐1)
3 I = identity(n)
4 D = 4*I ‐ sD
5 A = kron(I, D) + kron(sD, ‐I) # Poisson matrix
In line 5, we construct the Poisson matrix with the NumPy function kron. First,
kron(I,D) creates the block diagonal in A from n copies of D. Then kron(sD,‐I)
adds copies of ‐I on the super and subdiagonal blocks.
Note that the construction also works for arbitrary numbers 𝑛. For a given 𝑛, the
Poisson matrix has the shape (𝑛², 𝑛²).
As an example of how already basic NumPy tools can be used effectively in math-
ematical programming, we discuss the conjugate gradient method for solving linear
equations. We will return to the example several times in later chapters.
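4.2 Conjugate Gradient Method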
The method of conjugate gradients is a widely used iterative method for solving large
linear equation systems 𝐴𝑥 = 𝑏 for 𝑛 × 𝑛 matrices 𝐴 that are symmetric (i.e. 𝐴ᵀ = 𝐴)
and positive definite (i.e. 𝑥 ⋅ 𝐴𝑥 > 0 for 0 ≠ 𝑥 ∈ ℝⁿ). A prominent example is the
Poisson matrix constructed above.
Quadratic Form
The method is based on the fact that for a symmetric and positive definite matrix 𝐴,
the solution 𝑥 ∗ of 𝐴𝑥 = 𝑏 coincides with the unique point 𝑥 ∗ where the so-called
quadratic form
𝑓(𝑥) ∶= ½ 𝑥 ⋅ 𝐴𝑥 − 𝑥 ⋅ 𝑏
takes its minimal value.
It follows that the solution of the linear equation can be obtained by gradient de-
scent along the function 𝑓.
The method of conjugate gradients is a particular variant of gradient descent in
which, after an initial guess of the minimum 𝑥0 of 𝑓, iterative “search directions”
characterized by appropriate mutually conjugate 𝑛-dimensional vectors 𝑝𝑘 , 𝑘 ≥ 1,
are chosen.
Here, two vectors 𝑢, 𝑣 are conjugate (with respect to 𝐴) if
𝑢 ⋅ 𝐴𝑣 = 0.
Note that this actually defines a vector dot product, with respect to which 𝑢 and 𝑣 are
then orthogonal.
The Method
For our construction, we choose 𝑥0 arbitrarily, and then set 𝑝1 ∶= 𝑏 − 𝐴𝑥0 as the
residual vector necessary to reach 𝑏. Here we let 𝑥0 ∶= 0 and 𝑝1 ∶= 𝑏.
Now assume we have inductively determined 𝑝𝑘 and 𝑥𝑘. Then we have to fix the
new direction 𝑝𝑘+1. As indicated for 𝑝1, a key to the choice is the residual

(1)  𝑟𝑘 ∶= 𝑏 − 𝐴𝑥𝑘.

Among the directions conjugate to the previous ones, 𝑝𝑘+1 is chosen to stay as close as possible to this residual, i.e. to minimize

(3)  ||𝑝𝑘+1 − 𝑟𝑘||.
Once the direction 𝑝𝑘+1 is chosen, a new estimate 𝑥𝑘+1 for the solution is obtained by
minimizing 𝑓 along the new direction. That is, 𝑥𝑘+1 ∶= 𝑥𝑘 + 𝛼𝑘+1 𝑝𝑘+1, where the scalar 𝛼𝑘+1 = 𝛼 is determined to minimize 𝑓(𝑥𝑘 + 𝛼 𝑝𝑘+1).
In our solution program we will use that with the same scalar 𝛼𝑘+1, the residual in (1)
can equivalently be expressed by the inductive definition 𝑟𝑘+1 = 𝑟𝑘 − 𝛼𝑘+1 𝐴𝑝𝑘+1.
Remark The method relies on a clever steepest descent, only once along every di-
mension, such that the exact result is reached in 𝑛 steps, but a good approximate
solution in practice much faster.
The construction resembles the Gram–Schmidt orthogonalization process, how-
ever with the advantage that the definition of each 𝑝𝑘+1 depends only on the direct
predecessor 𝑝𝑘 .
A thorough but entertaining discussion can be found in the paper [11], obtainable
from www.cs.cmu.edu.
Example 4.3 (Conjugate Gradient Method) First we import the required NumPy
components and fix an example:
1 from numpy import array, zeros, sqrt
2 A = array([[ 9., 3., ‐6., 12.], [ 3., 26., ‐7., ‐11.],
3 [ ‐6., ‐7., 9., 7.], [ 12., ‐11., 7., 65.]])
4 b = array([ 18., 11., 3., 73.])
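The initialization follows the standard scheme; a sketch of the lines assumed here (we start with 𝑥₀ = 0, so that the first residual and search direction are 𝑟₀ = 𝑝₁ = 𝑏):
5 n = len(b)                   # dimension of the system
6 x = zeros(n)                 # initial guess x0 = 0
7 r = b.copy(); p = r.copy()   # residual and first search direction
8 rs_old = r @ r               # squared norm of the residual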
Note that the numbers (𝑛) in the comments refer to the equations above.
The iterative approximation is computed in the following loop:
9 for i in range(n): # n steps to exact result
10 Ap = A @ p # matrix‐vector mult.
11 alpha = rs_old / (p @ Ap) # needed for (6) and (7)
12 x += alpha*p # ... in (6)
13 r ‐= alpha*Ap # ... in (7)
14 rs_new = r @ r # update residual
15 if sqrt(rs_new) < 1e‐10: break # desired precision
16 p = r + (rs_new / rs_old)*p # used in (4)
17 rs_old = rs_new # prepare for next iteration
18 print(x) # Out: [ 1. 1. 1. 1.]
4.3 SciPy
Throughout this section we also assume the package scipy.linalg to be loaded with
from scipy.linalg import *
Matrix Algorithms
Not surprisingly, the package scipy.linalg provides basic methods for matrix
arithmetic.
Example:
>>> A = array([[1, 2, 3], [1, 1, 1], [3, 3, 1]])
>>> det(A) # determinant
Out: 2.0000000000000004
>>> inv(A) # matrix inverse
Out: array([[‐1. , 3.5, ‐0.5],
[ 1. , ‐4. , 1. ],
[ 0. , 1.5, ‐0.5]])
Linear Equations
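The function solve computes the solution of a linear system 𝐴𝑥 = 𝑏; a quick sketch (with the matrix A from above and an arbitrary right-hand side):
>>> b = array([2., 1., 3.])
>>> x = solve(A, b)
>>> A @ x # returns b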
LU Decomposition
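Applied to the matrix A above, the function lu computes a factorization 𝐴 = 𝑃𝐿𝑈 with a permutation matrix 𝑃 and triangular matrices 𝐿 (lower) and 𝑈 (upper). A sketch of the call whose results are shown below:
>>> P, L, U = lu(A)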
>>> U
Out: array([[ 3. , 3. , 1. ],
[ 0. , 1. , 2.66666667],
[ 0. , 0. , 0.66666667]])
>>> P
Out: array([[ 0., 1., 0.],
[ 0., 0., 1.],
[ 1., 0., 0.]])
Cholesky Decomposition
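For a symmetric positive definite matrix, the function cholesky computes a factorization into a triangular factor and its transpose; a quick sketch of its use (the matrix M is an arbitrary example):
>>> M = array([[4., 2.], [2., 3.]])
>>> L = cholesky(M, lower=True) # lower triangular factor
>>> L @ L.T # returns M

Method of Least Squares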
The method of least squares is a method to solve overdetermined linear equation sys-
tems of the form
𝐴𝑥 = 𝑏,  𝐴 ∈ ℝ^(𝑛×𝑚), 𝑥 ∈ ℝ^𝑚, 𝑏 ∈ ℝ^𝑛, 𝑛 > 𝑚.
The least squares method now determines a vector 𝑥 such that the above equations
are fulfilled “as close as possible”. More precisely, 𝑥 is calculated so that the residual
vector 𝑟 ∶= 𝑏 − 𝐴𝑥 becomes minimal with respect to the Euclidean norm || ⋅ ||.
We are therefore looking for a solution 𝑥 to the minimization problem min ||𝑏 − 𝐴𝑥||.
It is known that this vector 𝑥 is the solution of the Gaussian normal equation

(∗)  𝐴ᵀ𝐴𝑥 = 𝐴ᵀ𝑏.
The matrix 𝐴ᵀ𝐴 is symmetric and positive definite, hence satisfies the conditions of
the Cholesky decomposition.
We show how Cholesky decomposition can be used in solving (∗) by computationally
efficient forward and backward substitution with triangular matrices.
                   ( 3  6 )        ( −1 )
𝐴𝑥 = 𝑏  with   𝐴 = ( 4 −8 ),   𝑏 = (  7 ),
                   ( 0  1 )        (  2 )
QR Decomposition
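As an example, consider the following matrix, to which we apply the function qr; a sketch of the call producing the factors examined below:
>>> A = array([[12., -51., 4.], [6., 167., -68.], [-4., 24., -41.]])
>>> Q, R = qr(A)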
>>> R
Out: array([[‐14., ‐21., 14.],
[ 0., ‐175., 70.],
[ 0., 0., ‐35.]])
>>> Q.T @ Q # returns I
>>> Q @ R # returns A
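The function can equally be applied to non-square matrices, for instance to the 3 × 2 matrix from the least-squares example above; a sketch:
>>> A = array([[3., 6.], [4., -8.], [0., 1.]])
>>> Q, R = qr(A)
>>> Q.shape, R.shape # Out: ((3, 3), (3, 2))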
As we can see, qr has selected Q as a square 3×3 matrix and R in 3×2 shape by adding
a 0-row at the bottom.
We can also use QR decomposition to solve the Gaussian normal equation (∗) above.
However, for this we prefer a factorization with a regular square matrix 𝑅, as returned, for instance, by ‘qr(A, mode='economic')’. Then, observing 𝐴 = 𝑄𝑅 and the identity 𝑄ᵀ𝑄 = 𝐼, we can rewrite (∗) as
𝑅ᵀ𝑅𝑥 = 𝑅ᵀ𝑄ᵀ𝑏, and, 𝑅 being regular, equivalently as 𝑅𝑥 = 𝑄ᵀ𝑏, which is solved by backward substitution.
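Eigenvalues and Eigenvectors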
An eigenvector of a square matrix 𝐴 is a vector 𝑥 that does not change its direction
when multiplied with 𝐴, but is only stretched by a scalar factor 𝜆, i.e. such that
(1) 𝐴𝑥 = 𝜆𝑥.
The factor 𝜆 is a so-called eigenvalue of the matrix. A common method for calcu-
lating eigenvectors and eigenvalues is based on the fact that all eigenvalues can be
determined as roots of the characteristic polynomial 𝜒(𝜆) ∶= det(𝐴 − 𝜆𝐼).
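In scipy.linalg, the computation is bundled in the function eig; a sketch of the call, assuming the matrix A from the examples above:
>>> e_val, e_vec = eig(A)
The array e_val then contains the eigenvalues, and the columns of e_vec the associated eigenvectors.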
Note that eigenvalues are in general complex numbers, but in our example they are
all real, as can be seen from the summand 0.j, where j denotes the imaginary unit.
We calculate the product of 𝐴 and the first eigenvector 𝑣0 , as well as the product
with the associated eigenvalue:
9 >>> (A @ e_vec)[:,0]
10 Out: array([ ‐3.74002233, ‐8.50785632, ‐12.47378976])
11 >>> e_val[0]*e_vec[:,0]
12 Out: array([ ‐3.74002233, ‐8.50785632, ‐12.47378976])
Sparse Matrices
For dealing with sparse matrices SciPy provides the package sparse. We assume it
to be imported in the form
from scipy.sparse import *
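For example, the matrix
1 >>> A = array([[0, 2, 0], [0, 5, 6], [7, 0, 8]])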
seems to be a good candidate that could benefit from a sparse representation. It can
be obtained in this way:
2 >>> A_csr = csr_matrix(A); A_csr
3 Out: <3x3 sparse matrix of type '<class 'numpy.int64'>'
4 with 5 stored elements in Compressed Sparse Row format >
The idea behind a sparse representation is to store values ≠ 0 together with their
position in the matrix, in our example
(2, 0, 1), (5, 1, 1), (6, 1, 2), (7, 2, 0), (8, 2, 2).
Technically, there are several ways to do this. For use in matrix equations, it turns
out to be efficient to choose the so-called compressed sparse row format, as in line 2
above. For details see e.g. the Wikipedia article Sparse_matrix.
For solving equations with sparse matrices we need specially adapted solvers:
Example 4.7 We continue with our last example and solve the matrix equation
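𝐴𝑥 = 𝑏. A sketch with the solver spsolve from scipy.sparse.linalg (the right-hand side is an arbitrary example):
>>> from scipy.sparse.linalg import spsolve
>>> b = array([1., 2., 3.])
>>> x = spsolve(A_csr, b)
>>> A_csr @ x # returns b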
In Example 4.2 we discussed the construction of Poisson matrices. Now, the Poisson
matrix is typically applied when the basic size 𝑛 is quite large, say 𝑛 = 50. The matrix
then has the shape (𝑛², 𝑛²). For 𝑛 = 50 it is (2500, 2500).
As can be easily seen, the matrix is sparse. For memory reasons (and, as we will see
later, also for computational speed reasons) it is therefore advantageous to consider
a sparse implementation.
Example 4.8 We construct a version of the Poisson matrix in compressed sparse row
format. The construction itself is essentially the same as before, only with adapted
functions.
We start with
1 n = 50
2 w = ones(n‐1)
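A sketch of the remaining lines, using the sparse counterparts diags, identity and kron provided by scipy.sparse (the final conversion to CSR format is assumed):
3 sD = diags(w, 1) + diags(w, -1)
4 I = identity(n)
5 D = 4*I - sD
6 A = csr_matrix(kron(I, D) + kron(sD, -I)) # sparse Poisson matrix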
As a result we get:
>>> A # Poisson matrix
Out: <2500x2500 sparse matrix of type '<class 'numpy.float64'>'
with 12300 stored elements in Compressed Sparse Row format >
If desired, the sparse matrix can be converted into a dense matrix with
>>> A.toarray()
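4.5 Graphics with Matplotlib

As an introductory example, we plot the sine function on a coarse grid. The support points are prepared as follows (a minimal sketch of the setup, assuming the grid of Fig. 4.1):
1 xvals = linspace(0, 2*pi, 9)
2 yvals = sin(xvals)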
The graphical tools we need to create and display the plot are available in the library
matplotlib.pyplot:
3 from matplotlib.pyplot import plot, show
4 plot(xvals, yvals)
5 show()
The instruction in line 4 creates the plot. It interprets the first argument xvals as a
list of 𝑥-values and the second as the associated 𝑦-values. The individual points (𝑥, 𝑦)
are then connected by straight line segments to yield the desired graph.
In line 5, the function ‘show’ then displays the plot to the screen. This is illustrated
in Fig. 4.1.
Fig. 4.1 The sine function represented on a grid of 9 equidistant support points
Note that in many programming environments, such as e.g. Spyder, the plot is auto-
matically displayed, so that the call to show is not required.
The plot can also be saved to a file, e.g. graph.png, as follows:
from matplotlib.pyplot import savefig
savefig('graph.png')
Note however that the ‘savefig’ function cannot be called after ‘show’, since the latter
consumes its argument, so to speak.
To obtain high-resolution graphics, the output format can be set to e.g. PDF, sim-
ply by changing the file extension to ‘pdf’ instead of ‘png’. In fact, this is how the plots
were prepared for printing in this book.
Example 4.10 Let’s consider a slightly larger example where two functions are visu-
alized in a common representation: the function 𝑓(𝑥) = 𝑥² 𝑒^(−𝑥²) and its derivative
𝑓′(𝑥) = 2𝑥 (1 − 𝑥²) 𝑒^(−𝑥²) over the interval [0, 3].
The functions are defined by
1 def f(x): return x**2*exp(‐x**2)
2 def dfdx(x): return 2*x*(1 ‐ x**2)*exp(‐x**2)
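The plots themselves could then be generated as follows (a sketch; the assignment of colors and dashing is an assumption consistent with the explanation below):
3 x = linspace(0, 3) # 50 grid points by default
4 y1 = f(x); y2 = dfdx(x)
5 plot(x, y1, 'r--')
6 plot(x, y2, 'b')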
Here, ‘r’ and ‘b’ specify the color of the plots, ‘‐‐’ the dashed shape of the line.
7 xlabel('x‐axis'); ylabel('y‐axis')
8 legend(['f(x) = x^2 e^(‐x^2)', 'd/dx f(x)'])
9 title('Function and derivative')
The result will look like Fig. 4.2. Note that the default number 50 for grid points in
linspace usually results in an acceptably fine grid for smooth rendering.
Fig. 4.2 Two functions plotted in a sufficiently fine grid
Example 4.11 A function to be plotted may also be given by a number of value pairs
of the form (𝑥, 𝑓(𝑥)). In this case, the 𝑥- and 𝑦-values must be distributed across two
“parallel” arrays (or, as here, actually normal Python lists).
Example:
1 >>> pts = [(1, 1), (2, 4), (3, 9)] # input
2 >>> xvals = [x for x, y in pts] # x‐components
3 >>> xvals # Out: [1, 2, 3]
4 >>> yvals = [y for x, y in pts] # y‐components
5 >>> plot(xvals, yvals) # plot
6 >>> z = list(zip(xvals, yvals)) # reconstruction of pts‐list
7 >>> z # Out: [(1, 1), (2, 4), (3, 9)]
In lines 2 and 4, the 𝑥- and 𝑦-components are collected in separate lists. Note that
the pairs from pts are unpacked into the tuple pattern ‘x, y’, which does not have to be enclosed
in parentheses.
Line 6 shows how the components can be brought together again. The analogy
to a zipper in clothing is mirrored in the function name zip. Note that zip works
like the previously considered Python function range: the value pairs are generated
lazily, one after the other. It is the function list() that collects them into an actual list.
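4.6 Nonlinear Equations, Optimization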
In the package optimize, SciPy offers a collection of functions for solving nonlinear
equations and systems of such equations.
Single Equations
In line 2, we define the function for which we want to determine the root. In line 3,
we use the bisection method to find a root in the interval (−2, 2). In line 4, we apply
Newton’s method with initial value 2.
Equation Systems
To compute solutions of equation systems we use the function fsolve. First, however,
we apply fsolve to the same single equation as above:
>>> from scipy.optimize import fsolve
>>> fsolve(f, 0.3) # initial value 0.3
Out: array([‐1.02986653])
Note that the output in array form already indicates that fsolve is intended for solv-
ing systems of multiple equations. Here is one:
𝑦 − 𝑥 3 − 2𝑥 2 + 1 = 0
𝑦 + 𝑥 2 − 1 = 0.
Starting with an initial value (1, 1), the system can be solved like this:
>>> def f(x): return [x[1] ‐ x[0]**3 ‐ 2*x[0]**2 + 1,
x[1] + x[0]**2 ‐ 1]
>>> fsolve(f, [1, 1]) # Out: array([ 0.73205081, 0.46410162])
However, note that all three solvers bisect, newton and fsolve determine roots of a
function 𝑓. An equation, say 𝑥² = 2, must first be rewritten in the homogeneous form
𝑓(𝑥) = 0 with 𝑓(𝑥) ∶= 𝑥² − 2.
Minimization
Often it is necessary to determine the minimum of a function. For this, we use the
minimize operator in the package scipy.optimize.
As an example, we show how to use minimize to find the minimum of the so-
called Rosenbrock function 𝑓(𝑥, 𝑦) = (1 − 𝑥)² + 100 (𝑦 − 𝑥²)².
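The function and a surface plot over a rectangle around the minimum can be set up as follows (a sketch; the plot ranges are assumptions, and the line numbering approximates the references below):
1 def f(x, y): return (1 - x)**2 + 100*(y - x**2)**2
2 from matplotlib.pyplot import figure, show
3 fig = figure()
4 ax = fig.add_subplot(projection='3d')
5 x = linspace(-2, 2, 100)
6 y = linspace(-1, 3, 100)
7 X, Y = meshgrid(x, y)
8 Z = f(X, Y)
9 ax.plot_surface(X, Y, Z, cmap='jet')
10 show()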
Note that the NumPy function meshgrid returns the coordinate matrices X, Y from
the coordinate vectors x, y in line 7. This is needed for the 3D representation in line 9.
The result is shown in Fig. 4.3.
Fig. 4.3 The Rosenbrock function
The Rosenbrock function assumes its minimum value 0 at the point (1, 1). The fol-
lowing program computes an approximate solution:
1 from scipy.optimize import minimize
2 def f(x): return (1 ‐ x[0])**2 + 100*(x[1] ‐ x[0]**2)**2
3 x0 = array([1.3, 0.7])
4 res = minimize(f, x0)
5 print(res.x) # Out: [0.99999552 0.99999102]
6 print(f(res.x)) # Out: 2.011505248124899e‐11
In line 3, we choose an initial value. The solution ‘res’ returned in line 4 contains
various entries, including the minimum coordinates in the array res.x.
To get a better control of the approximation, the function minimize can apply
various approximation methods. In the present case, for example, we could replace
line 4 with
res = minimize(f, x0, method='nelder-mead', tol=1.e-8)
to obtain the exact result [1. 1.]. Use help(minimize) to see the available options.
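4.7 Numerical Integration, Ordinary Differential Equations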
In the package integrate, SciPy offers a large collection of tools for numerical inte-
gration and the solution of ordinary differential equations.
Integration
For the numerical computation of definite integrals, SciPy provides various func-
tions, in particular the general-purpose tool ‘quad’.
As a simple example, let’s calculate the integral ∫₀¹ 𝑥² 𝑑𝑥:
>>> from scipy.integrate import quad
>>> f = lambda x: x**2
>>> ans, err = quad(f, 0, 1) # result and estimated error
>>> ans # Out: 0.33333333333333337
>>> err # Out: 3.700743415417189e‐15
Integrals from/to ±∞ can also be calculated, for example ∫₀^∞ 𝑒^(−𝑥) 𝑑𝑥:
>>> f = lambda x: exp(‐x)
>>> quad(f, 0, inf) # Out: (1.000000, 5.842607e‐11)
The integrate package offers various functions for the numerical solution of sys-
tems of ordinary differential equations, including the solver solve_ivp for initial
value problems (IVPs) and solve_bvp for boundary value problems (BVPs).
If we assume that the evaluation points are equidistant with distance 1/𝑛, and ap-
proximate the integral in (3) by 2𝑥ᵢ/𝑛, we obtain a discrete function 𝑢 that takes the
values 𝑢ᵢ ∶= 𝑢(𝑥ᵢ), inductively defined by

(4)  𝑢₀ = 0,  𝑢ᵢ₊₁ = 𝑢ᵢ + 2𝑥ᵢ/𝑛,  𝑖 = 0, 1, …, 𝑛 − 1.
This function 𝑢 can now easily be computed by a Python program:
u = zeros(100)
n = 100
x = linspace(0, 1, n)
for i in range(n‐1): u[i+1] = (u[i] + 2*x[i]/n)
from matplotlib.pyplot import plot
plot(x,u)
Example 4.12 Equation (1) can also be solved directly with solve_ivp:
1 from scipy.integrate import solve_ivp
2 def dudx(x,u): return 2*x
3 epts = (0, 1)
4 u0 = [0]
5 sol = solve_ivp(dudx, epts, u0)
Line 2 specifies the form in which 𝑢′ is passed to the solver. Line 3 defines the evalu-
ation interval by the tuple of endpoints. Line 4 provides the initial value in the form
of a list. This is necessary because in systems with multiple equations, an initial value
has to be specified for each equation.
The solution sol stores the evaluation points selected by the solver in sol.t and
the associated function values in sol.y[0]. In its basic form, the algorithm actually
only returns 6 value pairs. We’ll see below how to get finer solution sets.
More precisely, t is returned as an array of length 6, y as a 1 × 6 matrix, with the
values for u contained in y[0].
Remark The reason for the specific identifier t in sol.t is that it reflects that initial
value problems are often considered for systems evolving along a time parameter.
We turn to a slightly less trivial example, where we in particular show how to specify
finer evaluation grids.
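As a warm-up, the IVP 𝑢′(𝑥) = 𝑥 − 𝑢(𝑥), 𝑢(0) = 1 shown in Fig. 4.4 can be treated as follows (a sketch; interval and grid are read off the figure):
1 def dudx(x, u): return x - u
2 xvals = linspace(0, 5, 100)
3 sol = solve_ivp(dudx, (0, 5), [1], t_eval=xvals)
4 plot(xvals, sol.y[0])
The argument t_eval passes the desired evaluation points to the solver.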
Lotka–Volterra Equations
Systems of multiple first order differential equations can also be solved with the same
solver solve_ivp.
As an example, we consider the Lotka–Volterra equations, a pair of first order
equations, commonly used to describe the dynamics of predator–prey systems in
biology. In the normalized form considered here (all coefficients set to 1, matching the implementation below), they read

(∗)  𝑢′(𝑡) = 𝑢(𝑡) (1 − 𝑣(𝑡)),
     𝑣′(𝑡) = −𝑣(𝑡) (1 − 𝑢(𝑡))

for the prey population 𝑢 and the predator population 𝑣.
Fig. 4.4 Initial value problem 𝑢′(𝑥) = 𝑥 − 𝑢(𝑥), 𝑢(0) = 1 solved with solve_ivp
To solve the system with solve_ivp, we collect the unknown functions 𝑢 and 𝑣 in a
pair 𝑦 = (𝑢, 𝑣). The equation system (∗) then becomes
𝑦′₀ = 𝑦₀ (1 − 𝑦₁),  𝑦′₁ = −𝑦₁ (1 − 𝑦₀).
Recalling that indices in Python begin at 0, we can now hand it over to the solver:
1 def dydt(t,y): return [y[0]*(1‐ y[1]), ‐y[1]*(1 ‐ y[0])]
2 y0 = [1, 4]
3 tvals = linspace(0, 4*pi, 100)
4 epts = [0, 4*pi]
5 from scipy.integrate import solve_ivp
6 sol = solve_ivp(dydt, epts, y0, t_eval=tvals)
The array sol.y contains the solutions for 𝑢 and 𝑣. We plot both functions together:
7 from matplotlib.pyplot import plot
8 plot(tvals, sol.y[0])
9 plot(tvals, sol.y[1])
Pendulum
As another example, we consider a damped pendulum, modeled by the system

(∗)  𝜃′(𝑡) = 𝜔(𝑡),
     𝜔′(𝑡) = −0.25 𝜔(𝑡) − 5.0 sin(𝜃(𝑡))

for the deflection angle 𝜃 and the angular velocity 𝜔.
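A sketch of the corresponding solver calls (the initial values are assumptions; the time grid is read off the plot below):
1 def dydt(t, y): return [y[1], -0.25*y[1] - 5.0*sin(y[0])]
2 y0 = [3, 0] # initial angle and angular velocity (assumed)
3 tvals = linspace(0, 10, 100)
4 sol = solve_ivp(dydt, (0, 10), y0, t_eval=tvals)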
The return value sol.y is a 2 × 100 matrix. It contains the computed values of the un-
known function 𝜃 in the first row, in the second the values of the “helper” function 𝜔.
The following plot shows them both:
8 from matplotlib.pyplot import plot, legend
9 plot(tvals, sol.y[0], label='theta(t)')
10 plot(tvals, sol.y[1], label='omega(t)')
11 legend(loc='best')
The scipy.integrate package provides the solve_bvp solver for boundary value
problems. We briefly show how it works.
Like solve_ivp in the previous section, solve_bvp can only handle first deriva-
tives, so again we need to translate a given second-order problem into a system of
two first-order equations.
Example 4.16 Consider the BVP 𝑢″(𝑥) = 6𝑥 in [0, 1], 𝑢(0) = 0, 𝑢(1) = 1, shown in Fig. 4.7, with exact solution 𝑢(𝑥) = 𝑥³. With the helper function 𝑣 ∶= 𝑢′, it translates to the first-order system 𝑢′ = 𝑣, 𝑣′ = 6𝑥.
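A sketch of the corresponding first lines (patterned after Example 4.18 below):
1 def dydx(x, y): return (y[1], 6*x)
2 def bc(yl, yr): return (yl[0], yr[0] - 1)
The function bc encodes the boundary conditions: yl and yr receive the values of (𝑢, 𝑣) at the left and right endpoint, and the solver drives the returned residuals to 0.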
The interval under consideration, here [0, 1], is initially encoded in the simplest pos-
sible form, as the pair of endpoints:
3 x_init = [0, 1]
In the matrix y_init, we can provide initial-value guesses for the unknown functions
𝑢 and 𝑣 at the points in the interval x_init and pass them to the solver. Here we
initialize all values to 0:
4 y_init = zeros((2, len(x_init)))
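The solver is then invoked as follows (a sketch consistent with the line references below):
5 from scipy.integrate import solve_bvp
6 res = solve_bvp(dydx, bc, x_init, y_init)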
The resulting approximation functions for both 𝑢 and 𝑣 are contained in res.sol.
For an array of evaluation points, here
7 x = linspace(0, 1, 100)
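the solution can be evaluated and plotted (a sketch):
8 plot(x, res.sol(x)[0])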
Fig. 4.7 BVP 𝑢″(𝑥) = 6𝑥, 𝑢(0) = 0, 𝑢(1) = 1 solved with solve_bvp
Remark For the curious reader: The meaning of the matrix y_init in lines 4 and 6
in Example 4.16 might still appear somewhat elusive. It serves a similar purpose as
the initial value in Newton’s method when solving nonlinear equations, namely to
ensure efficient convergence and, furthermore, in the case of multiple solutions, to
guide the solver to the desired one.
The row y_init[0] can be given initial guesses for the values 𝑢(𝑥) of the unknown func-
tion 𝑢, the row y_init[1] guesses for the helper function 𝑣.
To make that point clear, we illustrate the best possible help we can give to the
solver, namely to provide the exact solution itself. Replace line 3, for example, with
‘x_init = linspace(0,1,5)’, and assign the corresponding exact function values
to the matrix y_init by inserting the following lines after line 4:
y_init[0] = x_init**3 # exact function value
y_init[1] = 3*x_init**2 # exact derivative value
Example 4.17 Next, consider a BVP with constant right-hand side, 𝑢″(𝑥) = −2 in [0, 5], 𝑢(0) = 0, 𝑢(5) = 3, as encoded in the listing below. The main difference from the previous example is the constant curvature −2. It must
be recast to a vector of suitable length:
def dydx(x,y):
z = ‐2*ones(len(x))
return (y[1], z)
The rest of the problem specification is then essentially a repetition of the above:
def bc(yl, yr): return (yl[0], yr[0]‐3)
x_init = [0, 5]
y_init = zeros((2, len(x_init)))
Example 4.18 As a final example, we consider a “real world problem”, the so-called
Bratu equation
𝑢″(𝑥) = −𝑒^(𝑢(𝑥)) in [0, 1],
𝑢(0) = 𝑢(1) = 0,
which is known to have two solutions.
We begin as in the previous examples:
1 def dydx(x,y): return (y[1], ‐exp(y[0]))
2 def bc(yl, yr): return [yl[0], yr[0]]
In the present case we have to provide suitable hints to the solver, not least because
we want to retrieve both solutions.
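We first fix a small initial grid (the concrete choice is an assumption):
3 x_init = linspace(0, 1, 5)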
We introduce two different matrices y_init_1 and y_init_2 for the initial guesses:
4 y_init_1 = zeros((2, len(x_init)));
5 y_init_2 = zeros((2, len(x_init)))
6 y_init_2[0][1] = 3
[Fig.: the two solutions u_1 and u_2 of the Bratu equation]
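4.8 Partial Differential Equations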
The solution of partial differential equations (PDEs) is one of the main concerns in
numerical mathematics. By nature, the matter is significantly more complex than
the IVPs and BVPs studied so far. There is no universal solver like solve_ivp or
solve_bvp. SciPy provides useful support, but the main work has to be done “man-
ually”. In a later chapter, we discuss the FEniCS project, which provides sophisticated
methods for the automated solution of PDEs.
As a model example, we discuss how 2D Poisson equations can be solved in SciPy
using the so-called finite difference method. We limit ourselves to Dirichlet boundary
conditions, for a start even to homogeneous ones.
To this end, we consider the following PDE:
Determine a function 𝑢 ∶ Ω → ℝ, Ω ∶= [0, 1] × [0, 1], such that

(1)  −Δ𝑢 = 𝑓 in Ω,
      𝑢 = 𝑔 ≡ 0 on the boundary 𝜕Ω

for a given function (often called source function) 𝑓, where we here assume the special
case of a constant function 𝑓 ≡ 1. As usual, Δ𝑢 is an abbreviation for the sum of the
partial derivatives 𝜕²𝑢/𝜕𝑥² + 𝜕²𝑢/𝜕𝑦².
For the construction of an approximate solution, we use the general observation that
for a twice continuously differentiable function 𝜑 and small ℎ we have
𝜑″(𝑥) ≈ (𝜑(𝑥 − ℎ) − 2𝜑(𝑥) + 𝜑(𝑥 + ℎ)) / ℎ².
For our function 𝑢 in 2 arguments, we can add both approximations in the 𝑥- and
𝑦-directions to obtain
(2)  Δ𝑢(𝑥, 𝑦) ≈ (𝑢(𝑥 − ℎ, 𝑦) + 𝑢(𝑥, 𝑦 − ℎ) − 4𝑢(𝑥, 𝑦) + 𝑢(𝑥 + ℎ, 𝑦) + 𝑢(𝑥, 𝑦 + ℎ)) / ℎ².
Discretization
After these preparations we can now develop a discrete approximation for (1).
Let ℎ ∶= 1/(𝑛 + 1) for a given 𝑛. We wish to determine the values of our desired
function 𝑢 at the grid points (𝑥ᵢ, 𝑦ⱼ) ∶= (𝑖ℎ, 𝑗ℎ), 𝑖, 𝑗 = 0, 1, …, 𝑛 + 1, where the values at the boundary points (with 𝑖 or 𝑗 in {0, 𝑛 + 1}) are already fixed to 0 by the boundary condition.
From (2) together with −Δ𝑢 = 𝑓 ≡ 1 in (1), we get the following system

(4)  (−𝑢ᵢ₋₁,ⱼ − 𝑢ᵢ,ⱼ₋₁ + 4𝑢ᵢⱼ − 𝑢ᵢ₊₁,ⱼ − 𝑢ᵢ,ⱼ₊₁) / ℎ² = 𝑓(𝑥ᵢ, 𝑦ⱼ) = 1,  𝑖, 𝑗 = 1, …, 𝑛,

of 𝑛² equations as a discretization of the equation in (1). Collecting the unknown inner values 𝑢ᵢⱼ in column-major order into a vector 𝑢 of length 𝑛² – this ordering is the numbering referred to as (6) below – the system (4) can be written as a matrix equation

(5)  𝐴𝑢 = ℎ²𝑏

with an 𝑛² × 𝑛² coefficient matrix 𝐴 and the vector 𝑏 of the values 𝑓(𝑥ᵢ, 𝑦ⱼ).
In this case, each row 𝑣 of 𝐴 consists only of 0s, except for a value 4 on the main
diagonal and a maximum of 4 entries with a value -1, positioned exactly so that the
dot product 𝑣 ⋅ 𝑢 between this row and the unknown vector 𝑢 corresponds to the
added values of the five-point stencil around each 𝑢𝑖𝑗 . For example, for 𝑛 = 3, row 5
looks like this: 𝑣 = (0, −1, 0, −1, 4, −1, 0, −1, 0).
Note, however, that, for example, the first row, referring to the stencil around 𝑢1,1, consists of the entries (4, −1, 0, −1, 0, … , 0). For such border-adjacent elements, there is no need to include the values that refer to neighbors on the boundary, since they are 0 by assumption.
Note that 𝐴 actually turns out to be precisely the Poisson matrix from Sect. 4.2.
The definition of the vector 𝑏 in this simple case with 𝑓 ≡ 1 is trivial, i.e. we define 𝑏
as the one-vector (1, 1, … , 1)ᵀ of length 𝑛².
Solution Program
The equation system (5) can now be solved in SciPy. We first provide the grid:
1 n = 50 # n x n inner grid points
2 h = 1/(n+1) # distance between grid points
Then we introduce the value matrix for the grid points, at first initialized with 0s:
3 u = zeros((n+2,n+2)) # grid points 0-initialized
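The omitted lines 4-8 are not reproduced in this excerpt; they provide the matrix F of source values and the Poisson matrix A. A minimal sketch of what they presumably contain, assembling A densely as in Sect. 4.2 (the names F and A are taken from their use below):

from numpy import ones, identity, diag, kron

F = ones((n, n))                       # source values f = 1 at the inner grid points
w = ones(n-1)
sD = diag(w, 1) + diag(w, -1)          # sub- and superdiagonal pattern
D = 4*identity(n) - sD                 # tridiagonal diagonal block
A = kron(identity(n), D) - kron(sD, identity(n))  # dense Poisson matrix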
We come to the matrix equation (5). This requires a “flat” vector representation of
the function values of 𝑓:
9 b = F.flatten(order='F') # equation right-hand side needs vector
The order='F' option ensures that the order in b corresponds to the column-major
order of the 𝑢𝑖𝑗 according to (6). Actually, here it would not matter, since all 𝑓-values
are equal by definition.
Then we can solve (5) with the operator ‘solve’ from scipy.linalg:
10 from scipy.linalg import solve
11 u_inner = solve(A, b*h*h) # solution of Lin. eq. syst. (5)
The solution u_inner has the form of a "flat" vector and must be reshaped to a matrix, which then stores the inner values of the solution matrix 𝑢. More precisely, the
values are stored in column-major order specified by (6) and must be distributed
accordingly. This is again achieved with the order='F' option:
12 u[1:n+1,1:n+1] = u_inner.reshape(n,n, order='F')
The 0s on the boundary remain. As explained they represent the boundary condition
in (1).
We are now ready to plot the solution:
13 lin = linspace(0, 1, n+2)
14 x, y = meshgrid(lin,lin)
15 from matplotlib.pyplot import figure, show
16 fig = figure()
17 ax = fig.add_subplot(projection='3d')
18 ax.plot_surface(x, y, u, rstride=1, cstride=1,
19 cmap='jet', linewidth=0)
20 show()
We show that by using sparse matrices, a considerable gain in speed can be achieved
in the above program.
Fig. 4.9 Finite-difference approximation to Poisson equation (1)
To do this, we first simply convert the dense Poisson matrix to a sparse one in com-
pressed sparse row format. For this it suffices to insert
from scipy.sparse import csr_matrix
A_csr = csr_matrix(A)
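The solver call then has to use a sparse solver; a sketch, assuming that spsolve from scipy.sparse.linalg replaces the dense solve:

from scipy.sparse.linalg import spsolve
u_inner = spsolve(A_csr, b*h*h)   # sparse solution of (5)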
For example, the test computer used here takes approximately 0.0295 seconds to
compute the solution for 𝑛 = 100 with a sparse Poisson matrix compared to 5.622
with a dense matrix, i.e. a speed increase by a factor of about 190.
The time was measured by surrounding the lines calling the solvers as follows:
import time
start = time.time()
# The solver line goes here
end = time.time()
print(end - start)
If the boundary function 𝑔 in (1) above does not vanish everywhere, we speak of a
nonhomogeneous boundary condition.
We now discuss this general case where, as a further generalization, we no longer
assume that the source function 𝑓 is constant.
Consider the equation system (4) for a stencil around a grid point (𝑥1 , 𝑦𝑗 ) = (1ℎ, 𝑗ℎ),
adjacent to a boundary point 𝑝0𝑗 = (0, 𝑦𝑗 ). The value 𝑢0𝑗 is now given by 𝑔0𝑗 ∶=
𝑔(𝑥0, 𝑦𝑗). The equation (4) then becomes

(1/ℎ²) (−𝑔0𝑗 − 𝑢1,𝑗−1 + 4𝑢1𝑗 − 𝑢2𝑗 − 𝑢1,𝑗+1) = 𝑓(𝑥1, 𝑦𝑗).

Since the boundary value 𝑔0𝑗 is known, it can be moved to the right-hand side, so that the corresponding entry of the right-hand side becomes

𝑓(𝑥1, 𝑦𝑗) + (1/ℎ²) 𝑔0𝑗.
The other points adjacent to the boundary are treated in a similar way, in particular
also the corner points adjacent to both a horizontal and a vertical boundary, such as
(𝑥1 , 𝑦1 ) = (ℎ, ℎ).
This is all we need for the general solution.
This time we write a Python function that solves the PDE (1) and plots the solution
for any given source function 𝑓 and boundary-value function 𝑔 on an 𝑚 × 𝑚 grid.
The computation is based on sparse matrices:
1 def poisson_solver(f,g,m):
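Lines 2-7 of the function body are not reproduced in this excerpt; they presumably set up the grid and the value matrices. A minimal sketch consistent with the names n, lin, u used below and the matrix F used later (the concrete assembly of F is an assumption; NumPy's linspace, zeros, array are assumed imported):

2 n = m                          # m x m inner grid points
3 h = 1/(n+1)
4 lin = linspace(0, 1, n+2)      # grid coordinates including the boundary
5 u = zeros((n+2, n+2))
6 F = array([[f(lin[i], lin[j]) for j in range(1, n+1)]
7            for i in range(1, n+1)])   # source values at inner points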
and then, evaluating the input function 𝑔, the boundary conditions are inserted:
8 for i in range(n+2):
9     u[i,0] = g(lin[i], lin[0])
10    u[i,n+1] = g(lin[i], lin[n+1])
11 for j in range(1, n+1):
12    u[0,j] = g(lin[0], lin[j])
13    u[n+1,j] = g(lin[n+1], lin[j])
Next comes the generation of the sparse Poisson matrix, just as in Example 4.8:
22 w = ones(n-1); sD = diags(w, 1) + diags(w, -1)
23 I = identity(n)
24 D = 4*I - sD
25 A = kron(I, D) + kron(sD, -I)
We can then prepare for the solution process by reshaping F into a column vector,
following the column-major order required in (6):
26 b = F.flatten(order='F')
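The omitted line 27 computes the flat solution vector u_inner, presumably with the sparse solver; a sketch (the conversion to CSR format is an assumption; csr_matrix from scipy.sparse as before):

from scipy.sparse.linalg import spsolve
u_inner = spsolve(csr_matrix(A), b*h*h)   # flat solution vector of (5)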
The flat solution vector is reshaped to a matrix, which is then inserted into the solution matrix 𝑢 as a block of inner points:
28 u[1:n+1, 1:n+1] = u_inner.reshape(n,n, order='F')
Verification
We test poisson_solver for the source function 𝑓 = 1.25 𝑒^(𝑥+𝑦/2) and the boundary-value function 𝑔 = 𝑒^(𝑥+𝑦/2):
from numpy import *
f = lambda x, y: 1.25*exp(x + y/2)
g = lambda x, y: exp(x + y/2)
poisson_solver(f, g, 50)
and get the solution shown in Fig. 4.10 on the next page.
Fig. 4.10 Solution of the nonhomogeneous Poisson equation with source function 𝑓 = 1.25 𝑒^(𝑥+𝑦/2) and boundary-value function 𝑔 = 𝑒^(𝑥+𝑦/2)
4.9 Round off: Random Numbers
We end our journey through the numerical possibilities in Python with a short visit to the casino. Many mathematical contexts (especially statistics) require numbers that appear to be the result of a random process. Previously, specially developed tables had to be used for this purpose. Today this can easily be done by programming. The Python package random contains corresponding functions. We briefly show how random numbers can be used in numerical calculations. More precisely, we show how to approximate the number 𝜋.
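The program itself is not reproduced in this excerpt. The following reconstruction is consistent with the line references in the text below (the number of samples is an assumption):

1 from random import random
2 samples = 1000000
3 hits = 0
4 for _ in range(samples):
5     p = random(), random()
6     if p[0]**2 + p[1]**2 <= 1:
7         hits += 1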
Lines 5–7 are central. The random function returns a random real number in the
interval [0, 1]. In line 5, a random point 𝑝 is generated in the square 𝑄 ∶= [0, 1]×[0, 1].
Line 6 checks if this 𝑝 is in the intersection 𝑆 of the unit circle and 𝑄. If yes, the
counter ‘hits’ is incremented by 1.
Since the ratio of the areas of 𝑆 and 𝑄 is exactly 𝜋/4, the ratio of hits to samples should converge to exactly this value, assuming that the points are uniformly distributed.
The loop counter in line 4 causes the loop to execute exactly 'samples' many times. The symbol '_' indicates that the counter value is not used inside the loop, so no named variable is needed.
The result can then be written to the screen:
8 print(4*(hits/samples)) # Out: 3.143268
Exercises
Linear Equations
Exercise 4.1 The Gauss-Seidel method is an iterative method to solve a system of lin-
ear equations 𝐴𝑥 = 𝑏. The matrix 𝐴 is decomposed into a diagonal matrix 𝐷, a
strictly lower triangular component 𝐿, and a strictly upper triangular component 𝑈,
such that 𝐴 = 𝐿 + 𝐷 + 𝑈.
Starting with an initial value 𝑥^(0), the sequence

𝑥^(𝑘+1) ∶= (𝐷 + 𝐿)^{−1} (𝑏 − 𝑈𝑥^(𝑘)),  𝑘 = 0, 1, 2, …

is computed iteratively. Consider the system

(4  3  0) (𝑥1)   ( 24)
(3  4 −1) (𝑥2) = ( 30) .
(0 −1  4) (𝑥3)   (−24)
Perform three iteration steps using the Gauss-Seidel method, starting from the initial
approximation 𝑥 (0) = (3, 3, 3)𝑇, and compare the result with the exact solution.
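A sketch of the iteration in NumPy (an added illustration, not part of the original exercise; tril and triu split 𝐴 into 𝐷 + 𝐿 and 𝑈):

from numpy import array, tril, triu
from numpy.linalg import solve

A = array([[4., 3, 0], [3, 4, -1], [0, -1, 4]])
b = array([24., 30, -24])
x = array([3., 3, 3])
DL = tril(A)                    # D + L
U = triu(A, 1)                  # strictly upper triangular part
for k in range(3):              # three Gauss-Seidel steps
    x = solve(DL, b - U @ x)
print(x)                        # compare with the exact solution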
Nonlinear Equations
Exercise 4.3 We continue with the Newton method in Exercise 3.11 in the last chap-
ter. Extend the function newton to a Python function
newton_ext(f,x)

that determines the required derivative itself, for example by a difference quotient, instead of taking it as an argument. Test it, for example, for the function

𝑓(𝑥) ∶= 𝑒^𝑥 + 2𝑥.
Newton’s Method in ℝ𝑛
𝐽(𝑣) ∶= (𝜕𝑓𝑖(𝑣)/𝜕𝑣𝑗)_{1≤𝑖,𝑗≤3},  𝑣 = (𝑥, 𝑦, 𝑧).
(2) Solve the equation system 𝑓(𝑣) = 0 approximately by considering the iteration
sequence
𝑢_{𝑘+1} ∶= 𝑢_𝑘 − 𝐽(𝑢_𝑘)^{−1} ⋅ 𝑓(𝑢_𝑘)ᵀ
with initial values 𝑢0 = (±1.0, ±1.0, 0).
Integration
Optimization
Exercise 4.7 Consider the matrix 𝐴 and the vector 𝑏 in Example 4.3 on page 62 in
Sect. 4.2. Use the minimize function from Sect. 4.6 to solve the equation 𝐴𝑥 = 𝑏.
Differential Equations
Solve the initial value problem

𝑢′(𝑥) + 𝑢(𝑥) = 𝑥,
𝑢(0) = 1
with the method solve_ivp in the package scipy.integrate, and plot the solution
together with the exact solution
𝑢(𝑥) = 𝑥 − 1 + 2𝑒−𝑥 .
Note that in the present case, an initial value is specified also for the derivative 𝑢′ .
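A sketch of a possible solution (an illustration; the solution interval [0, 5] is an assumption, chosen as in Example 4.13):

from numpy import linspace, exp
from scipy.integrate import solve_ivp
from matplotlib.pyplot import plot, show

sol = solve_ivp(lambda x, u: x - u, (0, 5), [1], dense_output=True)
x = linspace(0, 5, 100)
plot(x, sol.sol(x)[0])           # numerical solution
plot(x, x - 1 + 2*exp(-x))       # exact solution
show()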
Chapter 5
Python in Computer Algebra

SymPy (for Symbolic Python) is a Python program package for the symbolic solution of mathematical problems. In the area of computer algebra, the commercial products Maple and Mathematica are probably the best known. However, SymPy need not shy away from a comparison with these professional environments; its expressiveness and performance are comparable.
SymPy is available for free. It has a programming community that encourages users to participate and contribute to the development.
As an introduction, the tutorial on the website sympy.org is recommended.
Example (assuming SymPy has been imported with 'from sympy import *'):
>>> sqrt(8) # Out: 2√2
Number Types
To represent numbers, SymPy introduces three new number types: Integer, Float
and Rational. Some examples:
>>> a = Rational(1, 3); type(a)
Out: sympy.core.numbers.Rational
>>> b = 9*a; type(b)
Out: sympy.core.numbers.Integer
The method evalf returns an output of SymPy type Float. The number of decimals
can be chosen as desired. evalf() without argument sets the output precision to 15
digits (Why?).
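The desired number of digits can be passed as an argument; a small illustration (not from the original text):
>>> Rational(1, 3).evalf(5) # Out: 0.33333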
Symbolic Expressions
The main feature of computer algebra systems is that expressions built from symbolic
variables can be manipulated.
In SymPy it is necessary to express such variables explicitly as symbols, e.g.:
>>> x = Symbol('x')
In many other computer algebra systems – such as Maple or Mathematica – this dec-
laration is not necessary. There, a variable that has not been assigned a value is au-
tomatically interpreted as a symbol. This is not possible in Python, however, since
variables can only be declared together with value assignments.
Remark The convention that a variable name should coincide with the represented symbol is meaningful but not compelling. It would also be possible to assign
>>> abc = Symbol('xyz')
In fact, ‘from sympy.abc import *’ initializes all Latin letters and Greek letters such
as alpha, beta etc. as SymPy symbols.
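Several symbols can also be declared in a single call (standard SymPy usage):
>>> x, y, z = symbols('x y z')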
Convention From now on, we always assume that the variables 𝑥, 𝑦, 𝑧 are declared
as symbols in one of the above ways when needed.
Symbols are interpreted as number variables, so that simple algebraic rules are ap-
plied automatically:
>>> x + y + x ‐ y # Out: 2*x
sympify
The use of sympify is not required very often, since basic arithmetic expressions con-
taining only symbol variables are automatically understood as SymPy terms. How-
ever, some caution is required with fractions. We illustrate what happens for the fol-
lowing fractions:
>>> f1 = 1/3 # type: float
>>> f2 = sympify(1/3) # sympy.core.numbers.Float
>>> f3 = sympify(1)/3 # sympy.core.numbers.Rational
>>> f4 = sympify('1/3') # sympy.core.numbers.Rational
To ensure exact computations in expressions with division, one must ensure that the result is of type Rational, since a single float or Float resulting from a division spreads through the entire computation. To be on the safe side, it suffices that one of the arguments in each division is of a well-behaved type.
Actually, the easiest way to achieve this, is to explicitly declare one of the involved
integers as Rational, for example, Rational(1)/3 or 1/Rational(3).
Value Assignments
Value assignments to symbolic variables are made with the method subs, for example
>>> ((x + y)**2).subs(x,1) # Out: (y + 1)**2
Note that obvious algebraic simplifications have again been made automatically.
Substitution of more complex terms is also possible:
>>> ((x + y)**2).subs(x, y + z) # Out: (2*y + z)**2
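Several substitutions can be performed at once by passing a list of pairs (standard SymPy usage):
>>> ((x + y)**2).subs([(x, 1), (y, 2)]) # Out: 9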
Advanced Transformations
The real power of computer algebra systems, however, lies in the ability to mechani-
cally transform even more complex expressions.
Here is an example where SymPy does not automatically detect an obvious trans-
formation. The expression is returned unchanged:
>>> (x + x*y)/x # Out: (x*y + x)/x
simplify
In such cases, the help of the most general SymPy simplification method simplify
is often enough. In the example above we get:
>>> simplify((x + x*y)/x) # Out: y + 1
trigsimp
In many cases, a transformation can be achieved by more specific methods, for ex-
ample in expressions involving trigonometric functions:
>>> trigsimp(sin(x)/cos(x)) # Out: tan(x)
expand
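The method expand multiplies out products and powers; a standard illustration (the original example is not reproduced in this excerpt):
>>> expand((x + y)**2) # Out: x**2 + 2*x*y + y**2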
factor, collect, …
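The method factor is the converse of expand; again a standard illustration:
>>> factor(x**2 - 1) # Out: (x - 1)*(x + 1)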
The method collect collects and orders an expression according to term powers,
here for the powers of 𝑥:
>>> expr = x*y + x - 3 + 2*x**2 - z*x**2 + x**3
>>> collect(expr, x) # Out: x**3 + x**2*(-z + 2) + x*(y + 1) - 3
Functions
Symbolic expressions can be used to define functions. However, some care must be
taken. The reader is invited to explain what happens in the following example:
>>> expr = x**2
>>> def f(x): return x**2
>>> def g(x): return expr
>>> def h(x_var): return expr.subs(x, x_var)
>>> f(1), g(1), h(1) # Out: (1, x**2, 1)
Often we don't need the actual conversion to a function. For example, SymPy provides a plot function that refers directly to term expressions.
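The expression referred to next presumably depends on a parameter 𝑎; the concrete choice here is an assumption for illustration:
>>> a = Symbol('a')
>>> expr = a*x**2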
Continuing with the example above, we can plot the graph for 𝑎 = 2 like this:
5 >>> plot(expr.subs(a, 2), (x, 0, 1))
100 5 Python in Computer Algebra
f(x)
2.00
1.75
1.50
1.25
1.00
0.75
0.50
0.25
0.00
0.0 0.2 0.4 0.6 0.8 1.0
x
Actually, ‘plot’ returns the plot as an instance of the SymPy Plot class, which can
then be stored in a variable for later use.
The save method can be used to save the plot to an output file:
6 >>> graph = plot(expr.subs(a, 2), (x, 0, 1))
7 >>> graph.save('graph.pdf')
lambdify
The function lambdify converts a symbolic expression into an ordinary numerical function, which can be evaluated efficiently, for example over NumPy arrays. In the listing of this section (whose first lines are not reproduced here), SymPy is imported as sym and an expression expr in the symbol x is defined; the expression can first be plotted directly in SymPy:
4 sym.plot(expr, (x, 0, sym.pi))
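The Matplotlib counterpart can then be sketched as follows (an assumption consistent with the reference to line 9 below; expr is assumed to depend only on x):

5 import numpy as np
6 from matplotlib.pyplot import plot, show
7 f = sym.lambdify(x, expr)                # numerical function from expr
8 xs = np.linspace(0, float(sym.pi), 100)
9 plot(xs, f(xs)); show()                  # the Matplotlib plot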
Figure 5.2 shows the SymPy plot generated in line 4 on the left, and the one drawn
by Matplotlib in line 9 on the right.
The simplest (and conventional) way to solve an equation in SymPy uses the operator
solve. For example, the equation 𝑥 2 = 1 is solved by
>>> solve(Eq(x**2, 1), x) # Out: [-1, 1]
Here Eq(.,.) denotes the equation, and x is the variable for which the equation is
to be solved. Note that unlike other computer algebra systems, the notations ‘x**2
= 1’ or ‘x**2 == 1’ do not work, because ‘=’ and ‘==’ are already reserved for other
purposes in Python.
The equation above can equivalently also be written as 𝑥 2 − 1 = 0, which can then
be solved by
>>> solve(x**2 - 1, x)
If the first argument consists of only one term, solve assumes that it is to be tested
for equality with 0.
An even simpler way to write this is
>>> solve(x**2 - 1)
It is often recommended to use the function solveset for solving general equations
and the function linsolve specifically for linear equations.
solveset
For example, the equation 𝑥 2 = 𝑥 has the solutions 0 and 1, both found by solve as
well as solveset:
>>> solve(x**2 - x) # Out: [0, 1]
>>> solveset(x**2 - x) # Out: {0, 1}
But note the difference: solve provides a list of the single solutions, solveset the entire solution set {0, 1}. The difference becomes clearer when we consider the equation 𝑥 = 𝑥. The solution of this equation is the largest possible number set, which for SymPy is the entire set of complex numbers ℂ.
>>> solve(x - x) # Out: []
>>> solveset(x - x) # Out: S.Complexes
The solve method fails, because the solution cannot be represented as a finite list,
whereas the method solveset correctly returns S.Complexes, in pretty print ℂ.
With additional hints, solveset can also provide more accurate characterizations
of the solution sets, such as
1 >>> solveset(x - x, x, domain=S.Reals) # Out: Reals
2 >>> solveset(sin(x) - 1, x, domain=S.Reals)
3 Out: ImageSet(Lambda(_n, 2*_n*pi + pi/2), Integers())
The solution in line 3 becomes clearer, if we assume “pretty printing” to be switched
on by the command ‘init_printing()’. Then, instead of the cryptic expression, we
get the understandable formulation
{2𝑛𝜋 + 𝜋/2 | 𝑛 ∈ ℤ}
Remark For readers familiar with the LaTeX formatting system, it should be noted
that in this book the pretty-print output of SymPy has been prepared for printing
with the built-in latex function, the last example above for instance with
>>> print(latex(solveset(sin(x) - 1, x, domain=S.Reals)))
Out: \left\{2 n \pi + \frac{\pi}{2}\; |\; n \in \mathbb{Z}\right\}
linsolve
Here is just a simple example. We return to the solution of linear equations in the
context of matrix equations.
>>> linsolve([x+y+z-1, x+y+2*z-3], (x,y,z)) # Out: {(-y-1, y, 2)}
The solution set consists of all tuples (𝑥, 𝑦, 𝑧) of the form 𝑥 = −𝑦 − 1, 𝑧 = 2, with
arbitrary values for 𝑦.
Remark The curly braces suggest that the output is of type ‘set’, as discussed in the
basic Python chapter. In fact, it is, but of a particular SymPy type ‘sets.FiniteSet’.
5.3 Linear Algebra
SymPy is familiar with the usual methods of linear algebra. Both symbolic and numerical calculations are supported.
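Matrices are created from lists of rows; the concrete matrix M used in the following method calls is an illustration (the original example is not reproduced in this excerpt):
>>> M = Matrix([[1, 2], [3, 4]])
>>> M.det() # Out: -2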
The determinant is computed by M.det(). The shape (i.e. the number of rows and
columns) of a matrix can be queried with M.shape.
Single rows and columns are accessed by row and col, the first row of M, for ex-
ample, by M.row(0).
SymPy provides a special mechanism for generating vectors or matrices whose en-
tries can be described by a function.
As an example, consider the famous Hilbert matrix:
>>> def f(i,j): return Rational(1, 1+i+j) # Rational for exact fractions
>>> Matrix(3, 3, f)
Out: Matrix([
[ 1, 1/2, 1/3],
[1/2, 1/3, 1/4],
[1/3, 1/4, 1/5]])
Special matrices can be created by ones, zeros, etc., as in SciPy. Note that the square identity matrix is called 'eye'. The name probably comes from the homonymous pronunciation of the single letter "I".
The linear algebra tools of SymPy can be conveniently used to determine the linear
dependence or independence between vectors. We discuss some typical examples.
Example 5.2 We show that the following vectors in ℝ3 form an orthonormal basis:
𝑣1 ∶= (1/√3) (1, 1, 1),  𝑣2 ∶= (1/√2) (1, 0, −1),  𝑣3 ∶= (1/√6) (1, −2, 1).
In SymPy representation:
1 >>> v1 = Matrix([1, 1, 1]) / sqrt(3)
2 >>> v2 = Matrix([1, 0, -1]) / sqrt(2)
3 >>> v3 = Matrix([1, -2, 1]) / sqrt(6)
Observing the SymPy notation for the vector dot product, we get for example:
4 >>> v1.dot(v1) # Out: 1
5 >>> v1.dot(v2) # Out: 0
Now, if {𝑣1 , 𝑣2 , 𝑣3 } is an orthonormal basis, then for every vector 𝑥 = (𝑎, 𝑏, 𝑐) ∈ ℝ3,
taking the sum of the projections we have
𝑥 = ∑_{𝑖=1}^{3} (𝑥 ⋅ 𝑣𝑖) 𝑣𝑖.
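We can check this expansion for a concrete vector; a small illustration (the vector is arbitrary, not from the original text):
>>> xv = Matrix([1, 2, 3])
>>> w = sum(((xv.dot(v))*v for v in (v1, v2, v3)), zeros(3, 1))
>>> simplify(w - xv) # Out: Matrix([[0], [0], [0]])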
The three polynomials 𝑝1, 𝑝2, 𝑝3 considered in the following program constitute a basis of the vector space ℙ2 of polynomials of degree ≤ 2. To prove this, for an arbitrary such polynomial 𝑞(𝑥) = 𝑎𝑥² + 𝑏𝑥 + 𝑐 we have to find coefficients 𝑐1, 𝑐2, 𝑐3 such that
𝑞(𝑥) = ∑_{𝑖=1}^{3} 𝑐𝑖 𝑝𝑖(𝑥).
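The program itself is not reproduced in this excerpt. The following reconstruction matches the line references and the solution below; the concrete basis polynomials (the shifted Legendre polynomials on [0, 1]) are an assumption inferred from the output, and the symbol declarations are added for completeness:

>>> a, b, c = symbols('a b c'); c1, c2, c3 = symbols('c1 c2 c3')
3 >>> p1 = sympify(1)
4 >>> p2 = x - sympify(1)/2
5 >>> p3 = x**2 - x + sympify(1)/6
6 >>> q = a*x**2 + b*x + c
7 >>> d = q - (c1*p1 + c2*p2 + c3*p3)
8 >>> solve([d.coeff(x, k) for k in range(3)], [c1, c2, c3])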
Lines 3–5 define the basis polynomials. In line 6, we fix an arbitrary polynomial of de-
gree 2 that is to be represented as a linear combination of the basis functions. In line 7,
the difference polynomial is generated. In line 8, the values of 𝑐1 , 𝑐2 , 𝑐3 are determined,
such that the coefficients of all 𝑥-powers in the difference polynomial become 0.
SymPy's answer in pretty print:

{𝑐1 ∶ 𝑐 + 𝑏/2 + 𝑎/3, 𝑐2 ∶ 𝑏 + 𝑎, 𝑐3 ∶ 𝑎}
It shows that the coefficients of the basis functions can indeed be chosen to produce
a representation of 𝑞(𝑥) = 𝑎𝑥 2 + 𝑏𝑥 + 𝑐 as a linear combination of the 𝑝𝑖 as desired.
Note again that the result is returned in the form of a Python dictionary of type dict.
We now show that the vectors 𝑢1 ∶= (1, 0, 1), 𝑢2 ∶= (0, 1, 2), 𝑢3 ∶= (2, 1, −1) are linearly independent. For this purpose we collect them as columns in the matrix
1 >>> A = Matrix([[1, 0, 2], [0, 1, 1], [1, 2, -1]])
We can verify that 𝐴 is regular by checking that A.det() yields a non-zero result −5.
Furthermore, A.columnspace() shows that the space of the column vectors has the
maximal dimension 3. The kernel A.nullspace() of the linear mapping induced by
A is empty, which is still another way to establish regularity.
We determine the linear combination of the 𝑢𝑖 to represent an arbitrary vector, say
2 >>> b = Matrix([8, 2, ‐4])
For this purpose, we use one of the customary solvers for matrix equations:
3 >>> u = A.LUsolve(b); u # Out: 𝑢 = (1/5) (8, −6, 16)
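A check of the result, presumably the omitted next line:
4 >>> A*u == b # Out: True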
Note that, in contrast to normal float comparisons, we are safe here, since symbolic computations are exact.
Recall the definition of eigenvectors and eigenvalues in Sect. 4.4 in the last chapter.
To compute these, SymPy provides the methods eigenvects and eigenvals.
Example:
1 >>> M = Matrix([[3, -2, 4, -2], [5, 3, -3, -2],
2 [5, -2, 2, -2], [5, -2, -3, 3]])
3 >>> M.eigenvals() # Out: {3: 1, -2: 1, 5: 2}
The result of line 3 is a dictionary with the individual eigenvalues 3, -2 and 5 as keys
and their multiplicities as values.
This is confirmed by inspecting the characteristic polynomial of 𝑀:
4 >>> lamda = Symbol('lamda') # spelling to avoid conflict
5 >>> p = M.charpoly(lamda) # returns PurePoly object
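The omitted continuation presumably factors the characteristic polynomial and verifies the eigenvalue property; a sketch using the standard SymPy API:
6 >>> factor(p.as_expr()) # Out: (lamda - 5)**2*(lamda - 3)*(lamda + 2)
7 >>> val, mult, vecs = M.eigenvects()[0]
8 >>> M*vecs[0] == val*vecs[0] # Out: True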
The last line checks whether the eigenvalue property is actually fulfilled. Note again
that we can trust the verdict here, since computations in SymPy are exact.
𝑛 × 𝑛 Matrices with 𝑛 ≥ 5
For such matrices, the symbolic computation of eigenvalues leads to polynomial equations of degree ≥ 5, for which, by the Abel-Ruffini theorem, no general solution formulas in radicals exist. In such cases SymPy returns a complicated algebraic expression, which indicates that the attempt to find a symbolic solution has failed.
5.4 Calculus
With its basic on-board equipment, SymPy can already handle limit values as well as
differential and integral computations.
Limits
SymPy knows a number of rules for the calculation of limit values, such as for example lim_{𝑥→0} sin(𝑥)/𝑥:
>>> limit(sin(x)/x, x, 0) # Out: 1
or lim_{𝑥→∞} 1/𝑥:
>>> limit(1/x, x, oo) # Out: 0
Differential Calculus
diff(f(x),x) calculates the derivative of a function 𝑓(𝑥). SymPy knows the derivatives of all standard functions, as well as the usual differentiation rules.
Example:
1 >>> diff(sin(2*x), x) # Out: 2*cos(2*x)
2 >>> diff(tan(x), x) # Out: tan(x)**2 + 1
3 >>> h = Symbol('h')
4 >>> limit((tan(x+h) - tan(x))/h, h, 0) # Out: tan(x)**2 + 1
The computation in line 4 confirms the limit definition of the derivative:

tan′(𝑥) = lim_{ℎ→0} (tan(𝑥 + ℎ) − tan(𝑥)) / ℎ.
The diff operator can also be written as suffix in the dot notation:
>>> expr = sin(2*x)
>>> expr.diff(x) # Out: 2*cos(2*x)
Partial derivatives are also possible. For example, let 𝑓(𝑥, 𝑦) ∶= 𝑥⁴𝑦. The partial derivative 𝜕²𝑓(𝑥, 𝑦)/𝜕𝑥𝜕𝑦 is then computed by:
>>> diff(x**4*y, x, y) # Out: 4*x**3
Integration
SymPy knows the integrals of the standard elementary functions and can handle the
usual integration rules.
Example:
>>> integrate(log(x), x) # indefinite integral
Out: x*log(x) ‐ x
>>> integrate(x**3, (x, -1, 1)) # definite integral
Out: 0
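An example where the result contains 𝜋, here the Gaussian integral (an added illustration, not from the original text):
>>> integrate(exp(-x**2), (x, -oo, oo)) # Out: sqrt(pi)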
As is easily guessed, pi is the SymPy symbol for 𝜋. With init_printing the output
would in fact be precisely the glyph 𝜋.
Remark Note, however, that the SymPy symbol for Euler’s number is E and not e:
>>> exp(1) # Out: E
Series Expansion
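The Taylor expansion of a function 𝑓 around a point 𝑥0 up to order 𝑛 is computed by series(f(x), x, x0, n); for example (an illustration, the original example is not reproduced in this excerpt):
>>> series(exp(x), x, 0, 4)
Out: 1 + x + x**2/2 + x**3/6 + O(x**4)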
The Landau symbol 𝑂 here indicates that powers ≥ 4 are not evaluated.
For the special case 𝑥0 = 0 and 𝑛 = 6, we can also use ‘series(f(x), x)’:
>>> series(exp(x), x)
Out: 1 + 𝑥 + 𝑥²/2 + 𝑥³/6 + 𝑥⁴/24 + 𝑥⁵/120 + 𝑂(𝑥⁶)
Like the diff operator before, the series operator can be written in suffix notation:
>>> exp(x).series(x)
Example 5.6 As a simple example, consider the initial value problem 𝑢′ (𝑥) = 𝑥−𝑢(𝑥)
in the interval [0, 5] with initial value 𝑢(0) = 1, discussed in Example 4.13 in the SciPy
chapter.
For the representation in SymPy, we first need a variable-symbol ‘u’ for the un-
known function 𝑢. We get it through
1 >>> u = Function('u')
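The remaining steps are not reproduced in this excerpt; a sketch using dsolve with the standard ics parameter for the initial condition:
2 >>> ode = Eq(u(x).diff(x), x - u(x))
3 >>> dsolve(ode, u(x), ics={u(0): 1})
Out: Eq(u(x), x + 2*exp(-x) - 1)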
If SymPy does not find a solution in closed form, dsolve may return a symbolic
“intermediate result”:
Example 5.8 Let’s try the Bratu equation
𝑢″(𝑥) = −𝑒^(𝑢(𝑥)),
𝑢(0) = 𝑢(1) = 0.
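A direct attempt might look like this (a sketch; the original lines 1-5 are not reproduced, and the behavior noted in the comment is taken from the text below):
>>> u = Function('u')
>>> ode = Eq(u(x).diff(x, 2), -exp(u(x)))
>>> dsolve(ode, u(x), ics={u(0): 0, u(1): 0}) # eventually returns []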
After some time (about 30 seconds on the computer used here) the solver gives up
and returns an empty solution list in line 5.
Instead, we try the equation itself without the boundary conditions:
6 >>> sol = dsolve(ode, u(x)); sol
7 Out: [𝑢(𝑥) = log(𝐶1 / (1 − cos(𝐶1 √(−1/𝐶1) (𝐶2 + 𝑥)))), 𝑢(𝑥) = log(𝐶1 / (1 − cos(𝐶1 √(−1/𝐶1) (𝐶2 − 𝑥))))]
This time, a symbolic intermediate result is returned in line 7. Notice that the solver
detects that the equation has two solutions.
5.6 The Galerkin Method
The Galerkin method is the basis for a powerful approach to the automated solution of differential equations, which we will discuss in detail in a later chapter: the FEniCS project.
The idea is to approximate the solution 𝑢 of a differential equation by a linear
combination
𝑢 ≈ ∑_{𝑖=1}^{𝑛} 𝑐𝑖 𝜑𝑖 of given basis functions 𝜑1, … , 𝜑𝑛.
The Galerkin method is based on the so-called variational form of a differential equation. We briefly explain the idea for the model boundary value problem

(1)  −𝑢″ = 𝑓 in [0, 1],  𝑢(0) = 𝑢(1) = 0.

Let 𝑉 be the set of integrable functions 𝑣 ∶ [0, 1] → ℝ with 𝑣(0) = 𝑣(1) = 0.
If −𝑢″ = 𝑓, it is then trivially clear that for every test function 𝑣 ∈ 𝑉:
(2)  −∫₀¹ 𝑢″𝑣 = ∫₀¹ 𝑓𝑣.
Now the crucial observation is that also the converse is true: if (2) holds for every
such test function, then −𝑢″ = 𝑓.
This is best seen by contraposition, as in the following exercise:
Exercise Assume 𝑔, ℎ ∈ 𝐶[0, 1], 𝑔 ≠ ℎ. Show that there is a 𝑣 ∈ 𝐶[0, 1], 𝑣(0) = 𝑣(1) = 0, such that ∫₀¹ 𝑔𝑣 ≠ ∫₀¹ ℎ𝑣.
These observations lead to an equivalent formulation for (1) as a variational problem:
Determine 𝑢 with 𝑢(0) = 𝑢(1) = 0 such that (2) holds for every 𝑣 ∈ 𝑉.
It is common to denote the left-hand side in (2) by 𝑎(𝑢, 𝑣), noting that 𝑎 is bilinear
in the arguments 𝑢 and 𝑣 due to the linearity of the derivative and integral. Similarly,
the right-hand side, which does not depend on 𝑢, is linear in 𝑣. It is often referred to
as 𝐿(𝑣).
The Method
Consider a trial function 𝑢 = ∑_{𝑖=1}^{4} 𝑐𝑖 𝜑𝑖 built from given basis functions 𝜑1, … , 𝜑4.
The equation system (4) can now be solved with SymPy. We start with the import of
the SymPy components:
1 from sympy import *
Again, note the use of sympify, which causes all fractions to be of the SymPy type
Rational.
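The omitted lines 2-7 presumably declare the symbol x and the basis functions phi. A minimal sketch, assuming the hypothetical polynomial basis 𝜑𝑖 = 𝑥^𝑖 (1 − 𝑥), which satisfies 𝜑𝑖(0) = 𝜑𝑖(1) = 0 (the original basis, which according to the preceding remark involves sympify'd fractions, is not reproduced):

x = Symbol('x')
phi = [x**i*(1 - x) for i in range(1, 5)]   # four basis functions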
We define a SymPy function 𝑎(𝑢, 𝑣) to compute the integrals on the left-hand side
in (3):
8 a = lambda u,v: -integrate(diff(u,x,2)*v, (x, 0, 1))
All these values are then collected in the so-called stiffness matrix 𝐴:
9 A = Matrix(4, 4, lambda i, j: a(phi[i], phi[j]))
Correspondingly, we define the SymPy function 𝐿(𝑣) to compute the right-hand side
in (3):
10 f = x*(x + 3)*exp(x)
11 L = lambda v: integrate(f*v, (x, 0, 1))
Similarly to the stiffness matrix, we can then collect the values in a vector (called load
vector) 𝑏.
12 b = Matrix([L(phi[j]) for j in range(4)])
Recall that a matrix generated from a list of single elements is interpreted as a column
vector.
This is all we need to determine the coefficients 𝑐𝑖 from the matrix equation 𝐴𝑐 = 𝑏:
13 c = A.LUsolve(b)
With the coefficients 𝑐𝑖 we define the function 𝑢 ∶= ∑_{𝑖=1}^{4} 𝑐𝑖 𝜑𝑖(𝑥) to represent the approximate solution:
14 u = sum(c[i]*phi[i] for i in range(4))
Note that here the standard Python function sum, briefly mentioned in the basic
Python chapter, can also be used to add SymPy terms.
The solution is then plotted with the SymPy function plot:
15 plot(u, (x, 0, 1))
Fig. 5.3 Galerkin approximation for the boundary value problem (1)
Exact Solution
To see how good our approximation is, we calculate the exact solution by direct integration for comparison:
16 from sympy import symbols, solve
17 C1, C2 = symbols('C1, C2')
18 temp = integrate(-f) + C1 # integration variable x clear from context
19 expr = integrate(temp, x) + C2 # ... here x must be given
20 sol = solve([expr.subs(x, 0), expr.subs(x, 1)], [C1, C2])
21 u_e = expr.subs([(C1, sol[C1]), (C2, sol[C2])])
22 plot(u_e, (x, 0, 1))
In fact, the deviation between approximate and exact solution does not show up in
the plot. In such cases, it is often useful to plot the difference:
23 plot(u - u_e, (x, 0, 1))
Fig. 5.4 Difference between exact solution of BVP (1) and Galerkin approximation
Exercises
Linear Equations
(1) Determine the values 𝛼 ∈ ℝ for which the homogeneous system 𝐴𝑥 = 0 only
has the trivial solution 𝑥 = 0.
(2) Are the column vectors of 𝐴 linearly dependent for 𝛼 = 0?
(3) Find the values for 𝛼, 𝛽 ∈ ℝ for which the equation system 𝐴𝑥 = 𝑏 has no
solution.
(4) Compute the solution set of 𝐴𝑥 = 𝑏 for 𝛼 = −3, 𝛽 = 0.
Eigenvalues
Exercise 5.2 Compute the eigenvalues 𝜆𝑖 (𝜀) and the eigenvectors 𝜑𝑖 (𝜀) of the matrix
𝐴(𝜀) ∶= ( 1 + 𝜀 cos(2/𝜀)    −𝜀 sin(2/𝜀)   )
        (  −𝜀 sin(2/𝜀)    1 + 𝜀 cos(2/𝜀) ) .
Nonlinear Equations
Exercise 5.3 Find all solutions of the equation 𝑥³ + 3𝑥 − 𝑎 = 0 for the real parameter 𝑎 ∈ ℝ. Show that for each choice of 𝑎 there is only one real-valued solution 𝑥. Sketch that solution 𝑥 as a function of 𝑎 in the interval [−500, 500].
Exercise 5.4 Modify the Newton solver from Exercise 4.3 in the last chapter, such
that it now computes the derivative function with SymPy. Test it for the equations
Differential Equations
Exercise 5.8 Modify the program in the Galerkin section 5.6 so that it now computes
an approximate solution for the function 𝑢 given by the equation
Function Spaces
Exercise 5.10 In the space 𝐶[−1, 1] we construct an orthogonal basis 𝐿0, … , 𝐿𝑛 for the polynomials of degree ≤ 𝑛. We start with the basis consisting of the monomials

𝑃𝑖(𝑥) ∶= 𝑥^𝑖,  𝑖 = 0, … , 𝑛,
and, applying the Gram-Schmidt process, use them to define the so-called Legendre
polynomials
𝐿0 ∶= 𝑃0,  𝐿𝑚 ∶= 𝑃𝑚 − ∑_{𝑖=0}^{𝑚−1} (⟨𝑃𝑚, 𝐿𝑖⟩ / ⟨𝐿𝑖, 𝐿𝑖⟩) 𝐿𝑖,  𝑚 = 1, … , 𝑛,

where ⟨𝑓, 𝑔⟩ ∶= ∫₋₁¹ 𝑓𝑔 denotes the inner product in 𝐶[−1, 1].
Write a SymPy program to generate the first 𝑛 Legendre polynomials and test, for
example, for 𝐿3 and 𝐿6 to see if they are indeed orthogonal.
Chapter 6
The C Language
C is one of the most widely used programming languages. In fact, we have also already
made extensive use of a C program. The standard implementation of the Python in-
terpreter is written in C.
C was developed in the early 1970s by Dennis Ritchie in the United States, as the
successor to a language B, for the development of the operating system Unix.
The main application of C is in system programming. Language constructs are
designed to correspond closely to machine instructions. This allows for extremely
efficient programming techniques, but with the disadvantage that the language often
appears cryptic to beginners.
C was originally not intended for numerical calculation. There were special lan-
guages for this, such as Fortran (for Formula Translation). In the meantime, however,
an increasing number of extensive Fortran libraries for scientific and numerical com-
putation have been ported to C and its extensions such as C++.
In contrast to Python, C programs are compiled. The development is as follows: The
program is created with an editor and saved in a file, usually with the extension ‘.c’.
The compiler translates it into machine language and includes the required additional
components from the program libraries. The result is an independent program that
can be run directly on the machine.
It is also possible to compile and run C programs from the command line. As an
example we mention Ubuntu Linux, which we will use often in later chapters. If a
program is stored in a file, e.g. test.c in the user’s home directory, it can be compiled
into an executable program with the Terminal command
$ gcc test.c
$ ./a.out
Here a.out is the default name automatically assigned to the executable program, which is again stored in the user's home directory. The prefix './' is an abbreviated representation of the access path to the current directory, here the user's home directory.
If a different file name, e.g. testprog, is desired, it can be specified with the output
option ‘‐o’:
$ gcc test.c ‐o testprog
$ ./testprog
6.1 Basics
Example 6.1 As a first example, let’s look at the greeting “Hello World” again:
1 #include <stdio.h> // contains the printf function
2 int main() { // every C‐prog. begins with main
3 printf("Hello World\n");
4 return 0; } // not required
In line 3, the string enclosed in quotation marks is output to the screen. As in Python,
the control character '\n' causes a line break. The function printf (for "print formatted") is not part of the core language, but rather provided by the library stdio (for "standard input output") in line 1. The #include statement is equivalent to the statement 'from ... import *' in Python.
Every C program starts at the line with the main function, which must therefore
occur exactly once in the program. Formally, main is a function that returns a value
of type int, here the value 0 in line 4. The 0 stands for ‘properly terminated’.
In contrast to Python, an instruction block (here lines 3 and 4) is enclosed in curly
braces. Indentation is only used to improve readability. In the following we adhere to
the Python style, not emphasizing the curly braces. Also, if allowed by C, we will not
enclose single-line statements in braces.
Instructions in C are always terminated by a semicolon.
Comments up to the end of a line are indicated by '//', comment blocks by inclusion in a pair '/* */'.
Basic number types and arithmetic operations are similar to what we have seen in
Python.
Number Representations
The default data types for numbers are int for integers and float for floating-point numbers. On most systems, int has a word length of 32 bits = 4 bytes, and float one of 32 bits = 4 bytes.
There are also different size variants, for integers for example char, typically 1
byte large, and ‘unsigned long’, typically 8 bytes large. For floating-point numbers,
besides float, double with a length of 8 bytes is common.
We will simply speak of int and float also referring to the other variants.
Arithmetic Operators
C knows the basic arithmetic operators +, ‐, *, /, as well as ‘%’ for the remainder
in integer division, however not the power operator. As in Python, the result of a
mixed expression consisting of int and float is of type float. Unlike Python, the
‘/’ operation applied to arguments of type int denotes integer division.
Output
The output of printf is always a string; numbers must be fitted into the string via placeholders:
printf("%2d times %f is %.2f\n", 3, 4.2, 3*4.2);
In C, the percent sign is used to indicate placeholders. Note that the order of place-
holders and values (and their type) must match.
The evaluated string ‘3 times 4.200000 is 12.60’ is then printed to the screen.
In Python, we could assign any value of arbitrary type to a variable. In C, that is not
allowed. A variable must be declared to be of a particular type before use, for example
by statements of the form ‘int a’, or ‘float b’.
These declarations provide memory space that can hold a value of the corresponding type. This static, one-time, no longer changeable specification of memory requirements before program execution ultimately leads to more efficient programs.
However, the declaration does not yet give the variable any value. It must be initialized
by a value assignment before first use. Assignments are identified by the operator ‘=’
as in Python.
Declaration and initialization can also be combined into one statement, such as ‘int
a = 42’.
In general, assignments of new values can then be made as in Python, as long as
type compatibility is respected.
A special feature of C is the notation a++ and ++a for a variable a of type int. By
themselves, both abbreviations stand for the assignment ‘a = a + 1’ or equivalently
‘a += 1’. The difference becomes apparent in assignments of the form ‘b = a++’ and
‘b = ++a’. In the first case, the current value of a is assigned to the variable b, and
then the value of a is incremented by 1. In the second case it is the other way round.
First a is incremented, and then the new value is assigned to b. The same applies to
a-- and --a.
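A small illustration (not from the original text):

int a = 1, b;
b = a++;   // b = 1, then a = 2
b = ++a;   // first a = 3, then b = 3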
Conditional Execution
Conditional execution is written in the form

if (condition) {statements} else {statements}

As in Python, the italicized identifiers are placeholders for actual test conditions and
statement blocks. A condition is an expression that evaluates to one of the Boolean
values true or false. The curly braces enclose the statements to be executed if the
condition is true or false. If a statement block consists of a single statement, the braces
are not required. The ‘else {}’ part can be omitted.
In C, any expression that evaluates to a number can be used as a condition. The value 0
means ‘false’, all others ‘true’.
The comparison operators <, <=, ==, != return the value 1, if the relation is true, 0
otherwise. Conditions can again be composed with Boolean operators. In C, logical
and is denoted by &&, logical or by ||, and the negation by a leading exclamation
mark, as in the characterization of not equal by !=.
Loops
Like Python, C offers several types of iteration loops. The while loop is the easiest
one. It is the same as in Python, only in C syntax:
while (condition) {statements}
The test condition is again enclosed in parentheses, the statement block in curly
braces. If a block consists only of a single statement, the braces are not required.
Example 6.2 We formulate the Collatz problem in Example 3.3 in the Python chapter
as a C program. For brevity, we show only the instructions within the main function.
Of course, in an actual execution, the entire program must be present.
1 int n = 100; // input
2 while (n > 1) {
3 if (n % 2 == 0) n /= 2; // integer division
4 else n = 3*n + 1;
5 printf ("%d\n", n); }
6 printf("arrived at 1\n");
Remark 6.3 In lines 3–4, we use a standard if-else statement to determine the next
sequence value. However, just like the ternary operator in Remark 3.9 in the Python
chapter, C also provides a conditional expression written with the ternary operator ‘?:’
to abbreviate this and similar constructions.
In the example we can replace the if-else statement with
n = (n % 2 == 0) ? n/2 : 3*n + 1;
The general form is 'condition ? expr1 : expr2'. If the condition evaluates to true, then expr1 is evaluated and that is the value of the conditional expression. Otherwise expr2 is evaluated, and that is the value.
Example 6.4 The following program calculates the reward for the chess inventor, as
explained in Example 2.4 in the Mathematical Foundations chapter. At first reading,
it can be helpful to read ‘int’ for ‘unsigned long’.
We declare and initialize the variables for field 1 by
1 int fieldno = 1;
2 unsigned long fieldval = 1;
3 unsigned long sum = 1;
Lines 1 through 3 set the number of grains for field 1 (i.e. 1 grain) and store it as the
first contribution to the total sum.
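The loop over the remaining fields is not reproduced above; it presumably doubles the grain count from field to field, along these lines (a sketch):

for (fieldno = 2; fieldno <= 64; fieldno++) {
    fieldval *= 2;      // grains on the current field
    sum += fieldval; }  // running total
printf("%lu\n", sum);   // Out: 18446744073709551615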
The meaning of the components of a for loop (initialization; condition; increment) was explained in the example. Perhaps the general principle becomes even clearer when we describe the effect with an equivalent construction based on a while loop:
initialization;
while (condition) {statements increment}
Conversely, this construction gives rise to the question: what happens if the initial-
ization and increment instructions in a for loop are both empty? Then, in fact, we get
back the effect of a while loop.
This is illustrated by an example, the computation of √2 by the so-called Babylo-
nian approximation method:
Example 6.5 The sequence
𝑎0 = 1,  𝑎𝑛+1 = (𝑎𝑛 + 2/𝑎𝑛) / 2
converges to √2. This allows us to approximate the root by
1 float r = 1;
2 float eps = 1.e-6;
3 for (; (r*r - 2)*(r*r - 2) > eps*eps; )
4 r = (r + 2/r) / 2;
5 printf("%f\n", r); // Out: 1.414214
The execution is stopped when (𝑟² − 2)² ≤ 𝜀², i.e. when |𝑟² − 2| ≤ 𝜀. The squared formulation is chosen so that we can work without the absolute value and the square root, neither of which are part of the basic C language.
The number 1.e‐6 in line 2 is simply 0.000001 in exponential notation.
Note that we declared the variables in lines 1 and 2 as float. We will generally
continue to follow this standard convention. In practical applications it is usually
advisable to use the more precise variant double instead.
Remark For specially interested readers: This mixed form of the for loop as counting
loop and general while loop is not completely satisfactory from the point of view of
computability theory. There a clear distinction is made between general loops and
those where the number of runs is known before the loop starts.
The class of so-called primitive recursive functions consists exactly of those func-
tions that can be computed using only counting loops. The larger class of 𝜇-recursive
functions also includes computable functions that require unlimited while loops. The
prime example is the Ackermann function in Example 3.12 in the Python chapter.
6.3 Functions
We first consider functions over the basic types for numbers. A C function can take a tuple as argument, but can return only a single value as output, with all values identified by their type. Otherwise, the definition is similar to the one in Python.
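For example, a factorial function; a reconstruction for illustration, since the original listing is not reproduced in this excerpt (the line numbers are chosen to match the call in lines 5-6 below):

1 int factorial(int n) { // defined outside main
2     int res = 1;
3     for (int i = 2; i <= n; i++) res *= i;
4     return res; }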
The function definition must be specified outside the main function. It is however
called inside main, like this:
5 int a = factorial(5); // use in main
6 printf("%d\n", a); // Out: 120
It should now be easy to reformulate all the functions from Examples 3.8, 3.10,
3.11 and 3.12 in the basic Python chapter, including the recursive ones. For recur-
sive functions, the above-mentioned fact is crucial that only values are passed to the
functions, not the variables themselves. What happens is that new variables are set
up at each recursion level to copy the values from the calling level.
Functions as Arguments
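Functions can be passed to other functions via function pointers. The original example is not reproduced here; a sketch of a numerical derivative in this spirit (the step size h is a hypothetical choice):

float derivative(float (*f)(float), float x) {
    float h = 1.e-4;                 // step size (assumption)
    return (f(x + h) - f(x)) / h; }  // difference quotient

float sq(float x) { return x*x; }

// in main:
// printf("%f\n", derivative(sq, 3)); // Out: approx. 6.0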
Remark In this example, we return the value of the derivative function at a given
point.
Unlike Python, in C it is not possible to directly return the derivative function
itself.
6.4 Arrays
C arrays are rather spartan (some even say primitive). An array always consists of a
sequence of a fixed length of numbers of the same type. For example, an array arr of
length 3 with elements of type int is declared as ‘int arr[3]’. The elements can be
read and written as in Python lists. Note again that indices start at 0, hence the last
element in the array of length 3 has index 2.
As with elementary variables, the elements of an array must be initialized before
use, such as:
int arr[3]; // declaration
for (int i = 0; i < 3; i++) arr[i] = i; // initialization
Since the array index runs from 0 to 2, it can only accept values less than 3, as specified
by the condition in the second argument.
Another way to initialize is to assign the sequence of values as in line 1 below:
1 int arr[3] = {0, 1, 2};
2 int arr2[] = {0, 1, 2};
Line 2 shows an alternate method where declaration and initialization are combined.
Here the compiler recognizes by itself that the array has a length of 3.
Because an array is of constant length and each entry is of the same known type,
the compiler can allocate the required memory before the program run (in technical
jargon: at compile time). If int has the word length 4 bytes, 12 bytes must be reserved
for an int array of length 3. The byte length of a data type or variable can be queried
with the built-in function sizeof.
Remark Strictly speaking, sizeof does not return the value in bytes, but as a multiple
of the shortest C data type char, which is usually exactly 1 byte, however.
With sizeof you can conversely also regain the number of elements of an int array
arr by dividing the memory requirement sizeof(arr) by that of its component type
sizeof(int). C itself does not provide a function like len in Python.
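In code, the computation just described reads (a small illustration):

int len = sizeof(arr) / sizeof(int); // number of elements, here 3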
Remark Thus, the memory for an array of length 𝑛 is determined before the program
is started, but during execution it is usually not checked whether the index limits
are respected. For example, if we make the assignment arr[100]=1 for our array
arr above, a critical value may be overwritten, with serious consequences. Not for
nothing, in the past you could sometimes hear: “C is a language for adults, who know
what they are doing.”
Arrays can be used as function arguments. However, you should also include the
length as an additional argument when passing it, since calculations with sizeof
usually do not work within a function. We will come back to this later.
Example 6.8 The following function calculates the Euclidean norm ‖𝑣‖ = √(∑_{𝑖=1}^{𝑛} 𝑥𝑖²) of a vector 𝑣 = (𝑥1, … , 𝑥𝑛) ∈ ℝⁿ. Note that the root function sqrt does not belong to
the basic C language, but to the library math, which therefore has to be imported with
#include<math.h> in the preamble. The functions in math are listed, for example,
in the Wikipedia article C_mathematical_functions.
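The function itself is not reproduced in this excerpt; a sketch consistent with the description (the array length is passed explicitly, as recommended above):

#include <math.h>

float norm(float v[], int n) {
    float s = 0;
    for (int i = 0; i < n; i++) s += v[i]*v[i];
    return sqrt(s); }

// in main: float v[] = {3, 4};
// printf("%f\n", norm(v, 2)); // Out: 5.000000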
Remark Note that in Linux the mathematical functions are bundled separately. To
use them, we must pass the '-lm' option to the compiler. If the program is stored in
a file, e.g. norm.c, it can be compiled and run as follows:
$ gcc norm.c ‐o norm ‐lm
$ ./norm
5.000000
Matrices
Like linear arrays, matrices can also be initialized together with the declaration:
int B[2][3] = { {0, 1, 2}, {3, 4, 5} };
int C[][3] = { {0, 1, 2}, {3, 4, 5} };
int D[][3] = {0, 1, 2, 3, 4, 5};
In B and C the components are assigned row by row. For the matrix C, the compiler itself determines the number 2 of rows from the number 6 of components and the given number 3 of columns. The number of columns cannot be omitted, because the compiler does not care about the inner curly braces, but treats the input as a linear array, as in D. For D it is obvious that the number of columns is actually needed to split the array properly.
If you use matrices as function arguments, you must enter the number of columns.
where each 𝐴1,𝑗 is the minor of 𝐴, obtained by erasing the first row and the 𝑗-th column.
and obtain the correct result 2.000000. Note that C automatically converts the integer
entries to floats.
6.5 Pointers
In the square-root computation with Python in Example 4.1 in the SciPy chapter, the
solution pair was returned as a function value. How can we do that in C? As a matter
of fact, we cannot do it directly. But we can specify a replacement construction, based
on the most powerful (but also most dangerous) instrument of C, the pointer.
Let’s start with the basics. When we declare a variable like ‘int a’, a suitable mem-
ory space is reserved. This area also has a storage location address that can be accessed
directly with the address operator ‘&’:
The instructions
int a;
printf("%p\n", &a); // %p is the placeholder symbol for pointers
print the address of the variable a (in hexadecimal notation) to the screen.
The more abstract term pointer is usually used instead of ‘address’. The pointer is
then interpreted as a reference to the value. In truth, however, an address of the type
mentioned above is always meant.
A pointer variable is declared with an asterisk, for example 'int* p = &a'. The actual value stored at the address can then be checked with the asterisk or dereference operator *p.
So the asterisk has a double meaning. In the declaration it says that p is a variable of
type int*, which means a pointer to int. In use, *p refers to the number value stored
at that location.
Example 6.10 We want to write a function ‘... vpairf(..., int x)’, so that after
execution of
1 int a, b; // possible: two declarations in same line
2 vpairf(..., 1);
Exercise Corresponding to the Python program in Example 4.1 in the SciPy chapter,
write a C program that computes both solutions to a quadratic equation.
It is often said that C arrays are but slightly disguised pointers. There is certainly some
truth to this. Internally, an array ‘int a[n]’ is managed like a pointer a of type int*.
This can be verified as follows:
1 int a[] = {1, 2, 42}; // in main
2 int* p = a + 2; // address arithmetic
3 printf("%d\n", *p); // Out: 42
Pointers as Arrays
The identification of arrays and pointers can also be reversed, and arrays can be de-
clared as pointers. This is especially useful for large data sets that are only needed
temporarily in memory. Arrays are declared statically with a fixed size before the
program is executed, while pointers can be requested (“allocated”) dynamically while
the program is running, and the memory is then freed when it is no longer needed.
We will not go into detail here, but only show how arrays and matrices can be
handled in this way.
Remark It should be emphasized that the use of pointers, addresses and dynamic
memory management are not C-specific concepts, but ultimately form the basis in
all modern programming languages. The only question is to what extent the user is
forced to deal with these technical mechanisms. This cannot be avoided in system
programming, but when using computers to solve mathematical problems, it is quite
desirable not to be confronted with such details.
Example 6.11 We want to declare an int array of length 𝑛 as a pointer and allocate
a corresponding memory area. For this, we need the malloc function (for memory
allocation) from the library stdlib, which returns a memory space of a desired size:
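The allocation itself is not reproduced; reconstructed from the following description, it reads:

int* arr = malloc(sizeof(int) * n);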
The result is an int pointer arr that points to the beginning of a newly provided
address block that can hold 𝑛 int numbers. We can now use arr like a normal array
and access the elements with arr[i].
Example 6.12 Matrices can also be created in this way. To illustrate, let’s consider
a matrix of type ‘int A[n][m]’. Following the above explanation, each row can be
considered an int pointer, so the matrix can be considered an array of int pointers.
With the same argument, the matrix can in turn be identified with a pointer to the
beginning of this array. A could thus also be declared as of type (int*)*, in C written
as int**.
So, after a short breath, on to the actual implementation:
1 int** A = malloc(sizeof(int*) * n); // in main
2 for (int i = 0; i < n; i++) A[i] = malloc(sizeof(int) * m);
When the matrix is no longer needed, its memory space can be released (or “deallo-
cated”) using the stdlib function free:
5 for (int i = 0; i < n; i++) free(A[i]);
6 free(A);
6.6 Structures
Structures provide a way to combine a set of related variables into a single entity. In
this respect, they are similar to classes in Python. Unlike classes, however, structures
do not contain application methods.
A positive aspect is that structures, unlike arrays, can be returned as results of
function evaluations.
We illustrate the basic ideas along a structure for representing fractions. For back-
ground, we refer to Example 3.19 in the Python chapter.
We represent a fraction like 3/4 as a pair (3, 4) of numerator and denominator. A vari-
able ‘a’ of the corresponding structure type can then be declared with the keyword
struct:
struct { int num; int den; } a;
The variable a is now declared to have two so-called members num and den which can
be accessed with the dot notation a.num and a.den.
It can then be initialized:
a.num = 3; a.den = 4;
printf("%d, %d\n", a.num, a.den); // Out: 3, 4
Named Structures
For convenience, the struct type can also be named, let’s say frac, with
struct frac { int num; int den; };
so that we now can declare a fraction ‘a’ in the form ‘struct frac a’.
The typedef declaration can be used to give the type identifier ‘struct frac’ an
alias with any name, especially also frac itself:
typedef struct frac frac;
Example 6.13 We put it all together and define a function add to add fractions:
typedef struct { int num; int den; } frac;
frac add(frac a, frac b) {
int nm = a.num*b.den + a.den*b.num;
int dn = a.den*b.den;
frac c = {.num = nm, .den = dn};
return c; }
Note that the program code so far is formulated before the main function.
We test our fraction addition within main:
frac a = { .num = 3 , .den = 4 }; frac b = {.num = 1 , .den = 2};
frac c = add(a,b);
printf("%d, %d\n", c.num, c.den); // Out: 10, 8
Sparse Matrices
In Sect. 4.4 in the SciPy chapter we discussed the application of sparse matrices.
Example 6.14 As a somewhat larger example for the use of structures, we develop
a program to represent sparse vectors as linked lists. In general, a linked list consists
of individual nodes, each of which contains data and a link to the next node. In our
program we represent nodes with corresponding structure variables and links with
pointers.
The program begins with the preamble
1 #include <stdio.h>
2 #include <stdlib.h> // contains the malloc function
We then come to the central data type for the representation of nodes:
3 struct node { int index; int value; struct node* next; };
4 typedef struct node node;
Note in line 5 that for simplicity we only consider vectors with integer values.
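The function create_node itself is not reproduced; the following reconstruction is consistent with the line references in the surrounding text (the exact line layout is an assumption):

5 node* create_node(int idx, int val) {
6     node* npt = malloc(sizeof(node));
7     (*npt).index = idx;
8     (*npt).value = val;
9     (*npt).next = NULL;
10    return npt; }
11 typedef struct { node* first; node* last; } vec;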
In line 6, memory is requested for a node structure, and npt is declared as a pointer
to this structure.
In lines 7 and 8, the external arguments idx and val are stored in the correspond-
ing members of the node *npt.
The ‘next’ member is not yet linked to any successor. In line 9, it receives the
special value NULL. Formally, NULL is a pointer with a value 0. This can be verified
with the instruction printf("%p\n", NULL).
Remark Notions like (*npt).index above are often written in a more concise form
as npt‐>index. We follow that convention below.
The next function create_vector allocates memory for a vector of type vec and
returns a pointer to it.
12 vec* create_vector(void) { // "void" means "no arguments"
13 vec* v = malloc(sizeof(vec));
14 v->first = NULL;
15 v->last = NULL;
16 return v; }
Initially, the list is empty. Lines 14 and 15 state that no nodes have yet been included
in the vector representation.
The following function ‘append’ is used to actually build up the linked list. Intuitively,
it appends a node *npt to a vector *v. Technically, it is again formulated in pointer
terms:
17 void append(vec* v, node* npt) {
18 if (v->first == NULL) {
19 v->first = npt;
20 v->last = npt; }
21 else (v->last)->next = npt;
22 v->last = npt; }
The condition in the if-clause in line 18 holds when v has just been newly created by
the create_vector function, i.e. no node has been appended yet. In that case, the
pointer to the node to be appended, i.e. npt, is assigned to both pointers first and
last of *v.
Otherwise, *v already has a last node, say tmp_last, to which v->last points. In line 21, the new node *npt is linked to tmp_last as its successor.
In line 22, the node *npt is marked as the new last node in *v.
Note again that the program code so far is formulated before the main function.
The following program part shows how a vector, given as a C array, is converted to a
sparse vector. The code must be written within the main function:
23 int arr[] = {0, 2, 0, 0, 8, 10, 0, 14, 16, 18};
24 vec* v = create_vector();
25 for (int i = 0; i < 10; i++) {
26 if (arr[i] != 0) {
27 node* npt = create_node(i, arr[i]);
28 append(v, npt); } }
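The display loop in lines 29-30 is not reproduced; a reconstruction consistent with the description below:

29 for (node* npt = v->first; npt != NULL; npt = npt->next)
30     printf("%d: %d\n", npt->index, npt->value);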
In the for loop in lines 29–30, all nodes of the sparse vector are run through and their
index-value pairs displayed.
It is worthwhile to follow how the process is controlled by the entries in the loop
head in line 29.
As a simple example we show how a matrix can be written to a file and read out again.
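The listing itself is not reproduced in this excerpt; the following reconstruction matches the line references in the description below (the concrete matrix A in line 1 is an assumption):

1 float A[2][3] = { {0, 1, 2}, {3, 4, 5} };
2 FILE* fp = fopen("test.txt", "w");
3 int n = 2, m = 3;
4 fprintf(fp, "%d %d ", n, m);
5 for (int i = 0; i < n; i++) {
6     for (int j = 0; j < m; j++)
7         fprintf(fp, "%f ", A[i][j]); }
8 fclose(fp);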
In line 2, a pointer fp to an object of C data type FILE is declared and linked to the
file test.txt in write mode with the command fopen.
In line 3 the row and column numbers of the matrix A are parametrized, so that
the program can also be used for other values. In line 4, these values are written to
the beginning of the file.
The fprintf instruction works like printf, except for the additional pointer that
directs the output to the file.
In line 7, fprintf writes the matrix elements to the file.
Line 8 terminates the connection to the file.
In the following program, the contents of the file are read back into a matrix B. For
this we use the counterpart to fprintf, namely fscanf:
1 FILE* fp = fopen("test.txt", "r");
2 int n, m;
3 fscanf(fp, "%d %d ", &n, &m);
4 float B[n][m];
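Lines 5-7, which fill the matrix, are not reproduced; a reconstruction consistent with the description:

5 for (int i = 0; i < n; i++)
6     for (int j = 0; j < m; j++)
7         fscanf(fp, "%f ", &B[i][j]);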
Line 1 opens the file in read mode. In line 3, the row and column numbers of the matrix are loaded into the variables n and m. In line 4, a matrix B is declared, which is filled with the values from the file in line 7.
The following for loop shows that the matrix has been completely reconstructed:
9 for (int i = 0; i < n; i++) {
10 for (int j = 0; j < m; j++)
11 printf("%f ", B[i][j]);
12 printf("\n"); }
6.8 Conclusion
Chapter 7
The C++ Language

As the name suggests, C++ was designed as an extension of C. The language was developed in 1979 by the Danish mathematician and computer scientist Bjarne Stroustrup as "C with classes". The model for the central concept of class construction was
the Simula language, which was developed in Norway in the 1960s.
The tutorial at cplusplus.com provides a good introduction. For further reading,
the textbook [15] written by the developer can be recommended.
The development environments already mentioned for C can be used also for C++.
All examples in this chapter were tested in Xcode under macOS for the C++ variant
C++20.
Under Ubuntu Linux you can save programs in files with the extension ‘.cpp’ and
then compile them with ‘g++’.
Essentially, C++ comprises the whole language scope of C, so that all basic concepts
considered in the last chapter can also be used in C++. In particular, this applies to
the number types, variables, control structures, and functions. The pointer constructions
and the C structures are available without restriction.
The development of the two languages is not always completely synchronous, so
that occasionally some deviations occur. This concerns for example the use of arrays
as function arguments. The recursive determinant function defined in Example 6.9
is not accepted by the C++ compiler used here.
Be that as it may, C++ offers much more user-friendly alternatives, so there is little point in using C arrays.
Let’s get straight on to the welcome example.
Example 7.1 The following is a perfectly correct C++ program:
#include <iostream>
int main() { printf("Hello World\n"); }
The only thing that stands out is that the printf command has been moved from the
library stdio to another library iostream (and that the extension ‘.h’ is missing).
However, using printf is not considered good C++ style. In the sense of “C with
classes”, C++ prefers the “object-oriented” approach also for input/output.
7.2 Basics
Example 7.2 In correct C++ style, the greeting example looks like this:
#include <iostream>
int main() { std::cout << "Hello World\n"; }
The object cout controls the output to the screen. (See below for the meaning of the
additional prefix ‘std::’.) It takes the expression to the right of the output operator
‘<<’, processes it and passes the result to the output device. In our example, cout
detects that a string is to be written, followed by a line break.
It is also possible to pass several expressions one after the other, which can also consist
of different data types. For example, the input
std::cout << 1./3 << " times " << 3.14
<< " is " << 1./3*3.14 << std::endl;
gives the result ‘0.333333 times 3.14 is 1.04667’. Even the data types are au-
tomatically recognized by cout, without need for type placeholders such as ‘%f’ in
printf.
The use of std::endl (for end line) is essentially identical to that of ‘\n’; in addition, endl flushes the output buffer.
Example 7.3 To format the output, appropriate parameters can be set in cout. For
example, consider the input
1 float p = 3.14159;
2 std::cout.precision(4); // 4 digit precision ...
3 std::cout << p << "\n"; // Out: 3.142
4 std::cout.setf(std::cout.fixed); // ... here for the decimals
5 std::cout << p << "\n"; // Out: 3.1416
6 std::cout.width(8);
7 std::cout << p << "\n"; // Out: '  3.1416' (right aligned)
In line 2, the output precision is set to 4 digits. In line 3, the number 𝑝, rounded to 4
digits, is output.
The method setf (for set flag) in line 4 can control various yes/no switches, here
specifically that the output precision ‘4’ should refer to the decimal places.
7.2 Basics 141
Line 6 states that the output is to be printed right aligned within a space of width 8. Note that the cout settings from lines 2 and 4 are still valid; they remain valid until changed again. (The width setting, in contrast, applies only to the next output operation.)
Writing to Files
Writing to a file is quite similar, except that it is not controlled by std::cout, but by
a file manager imported by
1 #include <fstream>
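The lines described below are elided here; a sketch (file name and text are assumptions):

2 ofstream myfile;
3 myfile.open("test.txt");
4 myfile << "Writing to a file.\n";
5 myfile.close();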
In line 2, the type declaration ‘ofstream myfile’ indicates that the following output
data stream is routed to a file that is referenced by the myfile object.
Line 3 establishes the connection to the actual external file.
In line 4, text is written to the file. Note that we could also redirect all formatted
output in Example 7.3 by simply replacing the target std::cout with myfile.
Line 5 closes the file connection. The instruction is not mandatory; at the latest, it happens automatically at the end of the program.
There are many ways to read from files. We consider a simple variant that reads a text
file line by line with the getline function and displays it on the screen. We need
1 #include <fstream>
2 #include <string> // provides 'string' data type in line 5
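The elided lines 3–8 presumably read as follows (a sketch, assuming ‘using namespace std’ and the file name from above):

3 ifstream myfile;
4 myfile.open("test.txt");
5 string line;
6 while (!myfile.eof()) {
7     getline(myfile, line);
8     cout << line << endl; }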
In line 3, a variable myfile of type ifstream is declared and in line 4 linked to the
file to be read.
In line 5, a variable ‘line’ of type string is declared, which receives the next line
of the file myfile from the function getline in line 7 and outputs it to the screen in
line 8.
142 7 The C++ Language
The condition in the loop head in line 6 states that the loop is repeated as long as the
end of the file has not yet been reached. The method eof (for: end of file) returns
the Boolean value ‘true’ when the end is detected, the exclamation mark negates this
condition, so that the loop is executed as long as the end is not reached.
Namespaces
The prefix ‘std::’ before identifiers like cout serves to avoid name collisions. This is
quite analogous to the convention that we already met in Python. For example, the
function sin appears in both packages math and numpy. If both are imported, we can
only distinguish which one is meant by a prefix like math.sin or numpy.sin. Correspondingly, std is a namespace in which all identifiers from the standard libraries are collected, so that no ambiguity can arise.
If you are sure that the identifiers do not conflict with each other, you can de-
clare ‘using namespace std’. You can now write cout instead of std::cout. In the
following we will follow this convention and use identifiers without ‘origin specifier’.
Recall that in Python, unlike in C, a function can return another function as its result. C++ provides lambda expressions, which at least in simple cases allow such constructions.
As an example, we show how to define a second level function ddx that returns the
(approximate) derivative function ddx(f) of a function f.
But first to the lambda expressions themselves. A simple example:
[](int n) { return n*n; };
This defines the function 𝑛 ↦ 𝑛². Here the bracket pair [] tells the compiler that the definition of an anonymous function follows. The arguments are
enclosed in parentheses. The code block defining the function is enclosed in braces.
In simple expressions as the one considered here, there is no need to declare the type
of the return value; the compiler takes care of that.
A lambda expression can then be applied directly to arguments or assigned to a
variable. In our example, the assignment to a function variable fct looks like this:
function<int(int)> fct = [](int n) { return n*n; };
The term ‘function<int(int)>’ denotes the C++ type declaration for functions from int to int; it is provided by the standard header <functional>.
Now fct can be used as a normal function, e.g. to calculate fct(4) with the output 16.
Actually, the lambda construction is even more powerful. It can reference variables
from the environment in which it is defined, in technical jargon: capture variables.
This is best explained when we now turn to the actual implementation of our differ-
ential operator ddx.
In order not to be distracted by rounding errors, we use double precision floating
point numbers.
1 function<double(double)> ddx(double f(double)) { // in preamble
2 function<double(double)> f_prime;
3 f_prime = [f](double x){ double h = 1e-6;
4 return (f(x+h) - f(x)) / h; };
5 return f_prime; };
In line 1, the argument of ddx is declared as a function variable f that takes a double
as argument and also returns a double as value. This is the normal C notation already
used in Example 6.7 in the C chapter.
The return value of ddx is of essentially the same type, now however declared as a
genuine C++ type ‘function<double(double)>’.
In line 2, a variable f_prime of type ‘function<double(double)>’ is declared
and then initialized by the lambda expression in lines 3–4.
Note that line 4 refers to the function f. As mentioned earlier, this is possible
because in line 3 the variable f is “captured” in the lambda expression. The bracket
pair ‘[]’ serves as capture-operator.
We test ddx for the function 𝑔: 𝑥 ↦ 𝑥²:
double g(double x) { return x*x; }; // in preamble
...
function<double(double)> dgdx = ddx(g); // in main
cout << dgdx(2) << endl; // Out: 4
Note that the output is 3.8147 if we use float instead of double precision.
7.4 Data Type vector

With the data type vector, C++ offers a user-friendly alternative to C arrays. C++ vectors do not decay into pointers, the number of components can always be queried directly, and they can be used as function arguments without restriction. In addi-
tion, memory management is organized by the system itself. The data type is provided
by specifying ‘#include <vector>’ in the preamble.
There are several options for declaration and initialization, such as:
1 vector<int> u(3); // 0-initialized
2 vector<int> v(3,1);
3 vector<int> w = {1, 2, 3};
Again, we assume that ‘using namespace std’ is declared. Otherwise we would have
to use the longer form std::vector.
The instruction in line 1 defines a vector u, which consists of 3 elements of type
int. All components are implicitly initialized with a 0 value. Line 2 is analogous,
except that now all elements are explicitly initialized with the value 1. In line 3, the
vector w is implicitly declared with length 3 and initialized with the values enclosed
in curly braces.
As in C arrays, components are accessed in the form v[i] etc. The length of a vector v can be queried with v.size(). Note, however, that as in C, range bounds are not checked. (The access method v.at(i), in contrast, does perform bounds checking.)
Very useful is the possibility to dynamically adjust the length of a vector. For ex-
ample, the declaration ‘vector<int> v’ first creates an “empty” vector of length 0,
which can then be converted to a vector of length 3 with v.resize(3), or with
v.resize(3,1), in which case all components additionally receive the value 1.
Sparse Vectors
In Sect. 6.6 in the last chapter we saw how to represent sparse vectors by linked lists
in C. By using C++ vectors, the construction can be simplified considerably.
We show how to convert a C array
1 int arr[] = {0, 2, 0, 0, 8, 10, 0, 14, 16, 18};
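The elided lines 2–3 presumably define the node structure and an initially empty C++ vector:

2 struct node { int index; int value; };
3 vector<node> node_vec;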
Finally, all index-value pairs of arr with value ≠ 0 are sequentially appended to node_vec:
4 for (int i = 0; i < 10; i++) {
5 if (arr[i] == 0) continue; // skip zero entries
6 node nd = { .index = i, .value = arr[i] }; // produce node
7 node_vec.push_back(nd); } // append to vec
In line 7, the vector method push_back adds a new element at the end of the vector,
after its current last element.
To verify our construction, we check that all arr components have been trans-
ferred correctly. We take this as an opportunity to explain the C++ iterator:
8 vector<node>::iterator it;
9 for (it = node_vec.begin(); it != node_vec.end(); it++)
10 cout << ' ' << '(' << it->index << "," << it->value << ")";
11 cout << endl;
In line 8, the variable it is declared as an iterator for the type vector<node>, com-
parable to a conventional number index in for loops. In line 9, the ‘it’ iterator is set
to the first element of the node_vec vector. Then the loop is executed repeatedly for
each node of the vector as indicated by the third argument it++ in the loop head. This continues until the iterator reaches the past-the-end position node_vec.end(), as required by the condition in the second argument.
The expressions ‘it->index’ and ‘it->value’ in line 10 reveal that C++ iterators are based on the idea of pointers.
Remark Besides the vector type, C++ offers several other so-called sequential con-
tainer types, i.e. data types that aggregate data. The list type, for example, unlike
vector, allows elements to be inserted at arbitrary positions.
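For illustration, a brief sketch of such an insertion (assuming ‘using namespace std’; #include <list> in the preamble):

list<int> l = {1, 2, 4};
auto pos = l.begin();
advance(pos, 2);  // iterator to the third position
l.insert(pos, 3); // l now contains 1, 2, 3, 4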
Matrices
7.5 Reference Operator

One of the advantages of C++ over C is that C++ significantly reduces the need for
low-level constructs such as pointers. However, sometimes it is useful to specify a
variable b that can access the same memory space that belongs to another variable a.
In C++ this can be achieved with the so-called reference operator ‘&’.
As a reminder: if we assign the address &a of a variable a to a pointer p, the contents
of *p and a are linked, i.e. changing one automatically affects the other, precisely
because both identifiers refer to the same memory locations.
In contrast, when assigning b=a between variables of a value type, only the current
value is copied, so that a subsequent change is not transferred from one to the other.
The reference operator ‘&’ now allows the above tight coupling also directly be-
tween variables.
If, for example, a denotes an int variable, then the statement
int& b = a;
declares b as a variable of type int, whose address &b coincides with the one &a of a.
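The example the following remark refers to is elided here; it presumably binds a reference to the first component of a structure, along these lines:

struct point { int x; int y; };
point a = {1, 2};
int& ref = a.x; // ref shares the memory of a.x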
If now the value of ref is incremented with ref++, then a is changed to {2,2}.
7.6 Classes
In Example 3.19 in the Python chapter, we explained the basic ideas of object-
oriented programming and illustrated them using the class Fraction. Here we show
how the same example, a class for representing and managing fractions of integers,
can be created in C++.
Example 7.5 (Fraction Class) The following specification can be written either in
the preamble or in the main function:
1 class Fraction { // in preamble or main
2 public:
3 int num, den;
4 Fraction(int numerator, int denominator) {
5 num = numerator; den = denominator; }
6 Fraction add(Fraction b) {
7 return Fraction(num*b.den + b.num*den, den*b.den); }
8 bool operator==(Fraction b) {
9 return num*b.den == den*b.num; }};
First, we consider only the part consisting of lines 1–5. In line 1, the keyword class
declares that the definition of a class follows, which here is called Fraction. The key-
word ‘public’ in line 2 states that all variables and functions defined in the following
block can be accessed from outside the class instance. In line 3, two internal variables
num and den are declared to hold the numerator and denominator of the fraction to
be represented.
In lines 4–5, a so-called constructor is specified. It controls what happens when a
new class instance is created. (This corresponds to the __init__ method in Python.)
Note that the constructor is a distinguished function, recognizable by bearing the
name of the class, and having no return value.
A new instance of the class Fraction is generated by a statement of the form
Fraction a(3, 4); // class name, identifier a, initial values
The constructor reads the initial values 3 and 4 into its placeholders numerator and
denominator and assigns them to the internal variables num and den in line 5.
The current state can be inspected with:
cout << a.num << "," << a.den << endl; // Out: 3,4
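The comparison operator from lines 8–9 can then be exercised with, say (the second fraction is an assumed example, equal in value to a):

cout << (a == Fraction(6, 8)) << endl; // Out: 1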
The value 1 is output, which is to be read here as the Boolean value ‘true’.
A Class Matrix
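The definition itself is elided here; from the description it presumably begins along these lines (a sketch; the methods discussed below are added before the closing ‘};’):

1 class Matrix {
2   vector<vector<float>> mat; // internal representation
3  public:
4   int rows, cols;
5   Matrix(int r, int c) {
6     rows = r; cols = c;
7     mat.resize(r, vector<float>(c, 0.)); }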
Line 2 creates an empty matrix mat as a vector of vectors. In line 6, the constructor
assigns the number of rows r and columns c of mat to the internal variables rows
and cols.
In line 7, the internal variable mat is converted to the correct form and all entries
initialized to zero. (As mentioned before, the explicit zero initialization is actually
redundant, but it is always a good practice not to rely on “hidden” features.)
The class definition is now already sufficient to declare Matrix objects. An 𝑛 × 𝑚
matrix 𝐴 will be instantiated in the form
Matrix A(n,m);
The number of rows and columns can be queried by A.rows and A.cols.
However, we don’t have access to the matrix elements yet. The variable mat is declared
outside the public area and therefore cannot be directly addressed in the form A.mat.
So we need a public access method. We define a variant that is not technically the simplest, but one that suits a mathematical user.
Namely, we will be able to access the elements of the matrix 𝐴 in the form
𝐴(𝑖, 𝑗) for 1 ≤ 𝑖 ≤ 𝑛, 1 ≤ 𝑗 ≤ 𝑚.
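The access method in question, elided here, presumably reads:

float& operator()(int i, int j) {
    return mat[i-1][j-1]; }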
In the return statement, the internal mat values are exported. To understand what
happens, assume for a moment that the function declaration starts with ‘float get’
instead of ‘float& operator()’. Then the values would be accessible in the usual
A.get(i,j) notation.
However, the function here actually has the unusual name ‘operator()’. This
means that the use of the pair of parentheses ‘( )’ is overloaded, so that access to the
return values becomes possible via the functional notation A(i,j). (Compare the
corresponding construct ‘p(x)’ for the polynomial class in Sect. 3.9 in the Python
chapter.)
However, this only covers read access. This is where the reference operator ‘&’
comes into play, which is appended to the float type of the return value. As men-
tioned above, the declaration as ‘float&’ results in A(i,j) having the same address
as mat[i-1][j-1], so we can access this internal memory space also in write mode.
(Recall that the indices in the internal representation are always 1 less than in the
external matrix object.)
The following method can be included to print a matrix in the usual row-by-row
form:
void printm() {
for (int i = 0; i < rows; i++) {
for (int j = 0; j < cols; j++) cout << mat[i][j] << "\t";
cout << endl; }}
The control character ‘\t’ causes a jump to the next tab stop.
An addition method can be defined along the usual lines:
1 Matrix operator+(Matrix B) {
2 Matrix C(rows, cols); // internal var for sum matrix
3 for (int i = 1; i <= rows; i++)
4 for (int j = 1; j <= cols; j++)
5 C(i,j) = (*this)(i,j) + B(i,j);
6 return C; }
We want to supplement our class skeleton with the determinant function det. To
illustrate the possibilities, we choose to formulate it as a standalone function, not as
a class method.
The program is based on the Laplace expansion in Example 6.9 in the C chapter.
1 float det(Matrix A) {
2 int n = A.rows;
3 if (n != A.cols) { // if not square: Abort
4 cerr << "Error\n";
5 exit(41); }
6 if (n == 1) return A(1,1); // recursion's base case
7 Matrix B(n-1,n-1); // declaration of minor submatrix
8 float d = 0;
9 for (int c = 1; c <= n; c++) {
10 for (int i = 1; i <= n-1; i++)
11 for (int j = 1; j <= n-1; j++)
12 B(i,j) = ( j < c ) ? A(i+1,j) : A(i+1,j+1);
13 float s = A(1,c)*det(B);
14 d += ( c % 2 == 0 ) ? -s : s; }
15 return d; }
In lines 3–5, the computation is aborted with an error message, if the input matrix
is not square. The word “Error” is written to the standard output for error messages
cerr (err for error), which is often just the same screen used for cout. Then the
program execution is aborted and the message: “Program ended with exit code: 41”
is returned.
Example 7.7 Applied to the identity matrix A in the above Example 7.6, det(A)
yields 1 as desired.
Declaration Order
The function det refers to the type Matrix, hence can only be defined after that type
has been introduced. Since function definitions must be made outside of the main
function, the only possible sequence here is: class Matrix, then function det, then function main.
However, it is obvious that it will be difficult or even impossible for large programs
to meet such strict dependency requirements.
Example 7.8 For example, consider two functions 𝑓 and 𝑔 that recursively call each
other, like:
2 int f(int n) { return n < 0 ? 22 : g(n); }
3 int g(int n) { return f(n-1); };
The standard method for dealing with such dependencies is the forward declaration
approach. In the example it is sufficient to insert the new line 1 before line 2:
1 int g(int n);
In line 1, the 𝑔 function is declared to have the correct type for use in line 2, even
though the compiler does not know what 𝑔 does at this point. However, the declara-
tion is sufficient to break the dependency cycle.
For all components it is now clear how they are applied, but not yet what they do.
Note that the det function is again declared outside the Matrix definition.
Note also that the file has the extension ‘.hpp’, which is common but not mandatory.
The results are the same as before in Examples 7.6 and 7.7.
If the program was created in a dedicated development environment like Xcode, the
connection between the files is established automatically. The program can then be
built and executed by clicking the Run button.
To compile and run the program under Ubuntu Linux, the easiest way is to store the
files matrixmain.cpp, matrix.hpp, and matrix.cpp in the home directory and then
compile both cpp files:
$ g++ matrixmain.cpp matrix.cpp -o matrixtest
7.8 Subclassing
As we have seen, class constructions are central to C++. In more complex program
development, the method of deriving new classes from existing classes by subclass-
ing plays an important role. Below we give a simple example to illustrate the basic
techniques. We consider a variant of the class Polynomial from Example 3.20 in the
Python chapter. Actually, here we present only a rudimentary framework and in the
exercises ask the reader to fill in the details.
We choose to represent polynomials by vectors of floats, in ascending order of the powers of 𝑥, e.g. the polynomial 𝑝(𝑥) = 3𝑥² + 2𝑥 + 1 by a vector ‘{1.,2.,3.}’. The
object instantiation should be in the form ‘Polynomial p({1.,2.,3.})’.
In the program preamble we include the files iostream, vector, cmath and declare
the statement ‘using namespace std’.
We then come to the core of the Polynomial class, the definition of polynomial
objects p and a method that supports application to arguments in the form p(x).
1 class Polynomial { // in preamble or in main
2 public:
3 vector<float> coeff;
4 Polynomial(vector<float> arr) { coeff = arr; };
5 float operator()(float x) {
6 float s = 0;
7 for (int i = 0; i < coeff.size(); i++ )
8 { s += coeff[i] * pow(x, i); }
9 return s; }
10 }; // end of Polynomial definition
In line 3, we declare an internal variable coeff that contains the coefficients of the
Polynomial object.
Line 4 defines the object constructor. It takes a float vector as argument and stores
the value in the internal variable.
In lines 5–9, we define a method that allows a polynomial to be applied to an
argument in the usual functional form such as p(4). As explained above for the Matrix class, it is the ‘operator()’ specifier in line 5 that makes this possible.
The power operator ‘pow’ in line 8 is imported from the cmath library.
We can test what we have so far in the main function:
Polynomial p({1,2,3});
cout << p(4) << endl; // Out: 57
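At this point the class is extended; the elided lines presumably add a dummy addition operator as lines 11–12 inside the class:

11 Polynomial operator+(Polynomial b) { // dummy version
12   cout << "To be implemented. Dummy: "; return Polynomial({0}); }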
The end-of-class marker ‘};’ is thereby shifted down to line 13.
Currently, the addition operator returns the zero polynomial as a dummy result. We
continue the above example and test it:
cout << (p+p)(4) << endl; // Out: To be implemented. Dummy: 0
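The subclass definition discussed below is elided here; a sketch of lines 14–17 (exit requires <cstdlib>):

14 class Parabola : public Polynomial {
15  public:
16   Parabola(vector<float> arr) : Polynomial(arr) {
17     if (arr.size() != 3) { cout << "No parabola\n"; exit(41); }}};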
In line 14, the Parabola class is declared as a subclass of Polynomial. The specifier
‘public’ declares that no additional access restrictions apply to the public part of the
superclass.
Lines 16–17 define the constructor for the subclass. In line 16, it first calls the
constructor of the superclass. Note the special colon-notation.
Line 17 simply says that further execution of the program is aborted when the
argument polynomial is not of degree 2.
Within main we can test
Parabola q({1,2,3,4}); // Out: No parabola
In addition, the program execution is aborted with the message “Program ended with
exit code: 41”.
For a polynomial of degree 2 we get:
Parabola r({1,2,3}); // valid parabola
cout << r(4) << endl; // Out: 57
Now it works. The method definition of the subclass overrides the method of the
same name in the superclass:
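The overriding definition itself is elided here; inside Parabola it presumably looks like this sketch:

Parabola operator+(Parabola b) {
    vector<float> sum(3);
    for (int i = 0; i < 3; i++) sum[i] = coeff[i] + b.coeff[i];
    return Parabola(sum); }

With this method in place, the test now uses the subclass version: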
Parabola r({1,2,3});
cout << (r+r)(4) << endl; // Out: 114
7.9 Summary and Outlook

In the last few chapters, we have discussed the basics of the Python, C, and C++
programming languages, focusing on the concepts that are important for scientific
computing. However, all three languages were developed as universal languages that
are not specifically designed for mathematical computation.
If you’re mainly interested in fast number processing, then the Python approach
of type-less variables, which comes with expensive internal memory management,
may be a bit too liberal. Moreover, in Python the interpreted command execution
tends to lead to longer run times. To some extent, these drawbacks can be overcome
by using special types of data structures such as NumPy arrays. Also on the positive
side, many specific mathematical libraries have been developed for the Python lan-
guage, which turn it into a comprehensive user-friendly mathematical development
environment.
C is the other extreme, so to speak. The tight, machine-oriented approach makes it possible to create highly efficient programs, efficient in terms of both runtime and storage utilization. However, this comes at the expense of user-friendliness and program safety. In particular, the generous use of pointers is often seen as a safety risk.
High-level languages such as C++ attempt to combine the efficiency of C with
problem-adapted, more controllable data structures, for example, with the vectors
and classes discussed in this chapter.
One approach to mathematical programming is to develop programs in a user-
friendly language such as Python and then, if necessary, replace resource-intensive
parts with modules formulated in efficiency-optimized machine-oriented languages.
We will see examples of this later in the FEniCS chapter.
Another approach is to use languages that are inherently designed for mathematical
programming and combine the ease of use of Python with the performance advan-
tages of C. In the next chapter, we will discuss a particularly promising newcomer in
this field, the Julia language.
Exercises
Note In numerical computations with C/C++, it is advisable to use the type double
instead of float. Most standard mathematical functions including sin are defined in
math.h (in C++ use ‘#include <cmath>’).
Polynomial Interpolation
that interpolates the (𝑛 + 1) points (𝑥₀, 𝑦₀), (𝑥₁, 𝑦₁), …, (𝑥ₙ, 𝑦ₙ), based on the so-called Vandermonde matrix 𝑉 = (𝑥ᵢʲ)₀≤ᵢ,ⱼ≤ₙ.
Test the program for 𝑦𝑖 = tan(𝑥𝑖 ) on the evaluation points −1.5, −0.75, 0, 0.75, 1.5.
Note This exercise is for “educational purposes only”. In practice, the method is com-
putationally too costly.
𝑢″ − 4𝑢 = 0,
𝑢(0) = 0, 𝑢(1) = 𝑒² − 𝑒⁻²
in the interval [0, 1]. Solve this BVP numerically using the discrete difference method
by calculating the corresponding difference equations at 5 equidistant points.
𝑢″ − 5𝑢′ + 4𝑢 = 𝑥²,
𝑢(0) = 0, 𝑢′ (1) = 0
Classes
The class definitions in the following exercises should follow the pattern in Sect. 7.7,
namely to split declaration and implementation into separate files.
Polynomials
Exercise 7.5 Extend the class Polynomial and its subclass Parabola in Sect. 7.8 to
include addition and multiplication of polynomials and a solver for quadratic equa-
tions.
Quaternions
Adding two more imaginary units 𝑗, 𝑘 to the one 𝑖 in the complex numbers, and defining multiplication between these units by a formula carved in stone (literally, cut on a stone of Broome Bridge in Dublin, Ireland):
𝑖² = 𝑗² = 𝑘² = 𝑖𝑗𝑘 = −1,
yields the quaternions.
Exercise 7.6 Develop a class for the representation of quaternions. It should at least
provide the additive and multiplicative constants 0 and 1, addition, subtraction, mul-
tiplication and division in infix notation, and the multiplicative inverse for 𝑥 ≠ 0.
A test for equality should also be possible using infix notation. Here two quater-
nions should be considered equal, if all components are indiscernible within double
precision.
Illustrate with suitable examples.
Chapter 8
Julia
8.1 Basics
Launching the Julia application opens a Terminal window, where commands can be
entered at the ‘julia>’ prompt, very similar to Python.
Remark 8.1 Julia can also be launched directly from the terminal. To do this, we
need to make the access path available. On the Mac, this can be done by running the
Terminal command
$ ln -s /Applications/Julia-1.7.app/Contents/Resources/julia\
/bin/julia /usr/local/bin/julia
Arithmetical Expressions
Julia is a mathematics oriented language. It provides data types for integers and
floating point numbers, by default represented in machine precision, i.e. normally
with 64 bits. The type of an object can be inspected with typeof, such as
typeof(3) # ans: Int64
typeof(3.2) # ans: Float64
Note that we often write commands without a prompt and put the response on the
same line, separated by a comment character.
As in Python, the result of mixed expressions consisting of integers and floats is again
a float. Division of two integers using ‘/’ always returns a float. There is also a “division
from left” (or “reverse division”) denoted by ‘\’.
Integer division is denoted by div, the remainder by ‘%’. The power operator is ‘^’.
Infix operators can also be written in functional form: for example, ‘+(1,2)’ is equiv-
alent to ‘1+2’. This is especially useful in expressions where associative operations can
be chained.
Some examples:
1 + 2.0 # ans: 3.0
35/5 # 7.0
5\35 # 7.0
2^10 # 1024
div(13, 3) # 4
13 % 3 # 1
+(1, 2, 3) # 6
For convenience, a scalar multiplicand for a variable can be written without the ‘*’
operator:
x = 2; 10x + 4x - 3x/2 + 1 # ans: 26.0
Large numbers can also be input in a more readable form, such as 10_000.
In the basic form, the assignment to variables works exactly as in Python. Variables
are not declared before assigning to them. A variable by itself has no static type, only
the value currently stored in it:
x = 42; typeof(x) # ans: Int64
x = 3.14; typeof(x) # ans: Float64
The standard arithmetic operators also have special versions, which can be used to
update variables quickly, such as x += 1.
The constant 𝜋 can be accessed by pi; Euler's number 𝑒 is buried a bit deeper in the archives as Base.MathConstants.e.
The Boolean constants are denoted by true and false, the Boolean operators written
as in C: ‘!’ for negation, ‘&&’ for and, and ‘||’ for or.
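For example:

!true # ans: false
1 < 2 && 2 < 3 # true
1 > 2 || 2 < 3 # true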
Comparisons and equality are denoted as in Python. They can be chained, such that for example
1 < 2 == 3-1 # ans: true
1 < 2 >= 3 # ans: false
Conditional Statements
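The basic form is ‘if condition statements else statements end’. The example elided here presumably performs a Collatz step, roughly:

n = 7 # assumed start value
if n % 2 == 0
    n = div(n, 2)
else
    n = 3n + 1 end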
The else branch can again be omitted. In addition, the basic form can also contain
one or more elseif clauses.
Remark Note that the statement structure is clearly defined by the if, else, end key-
words. No special formatting is required. However, to improve readability, we usually
follow the Python style of indenting statement blocks.
Note that it is not necessary to enclose the condition in parentheses; if used, they only serve to improve readability.
In short expressions like above, where a conditional choice between single values
is required, we can use the ternary operator ‘?:’ for an equivalent formulation:
(n % 2 == 0) ? n = div(n, 2) : n = 3n + 1
The ternary ‘?:’ operator can also be used to assign new values to a variable in an
if-else manner:
n = (n % 2 == 0) ? div(n, 2) : 3n + 1
While Loop
When we want a loop to run as long as a condition remains true, we use the while
loop. The general form is ‘while condition statements end’.
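The example elided here presumably iterates a Collatz step; a sketch matching the line numbers referred to below:

1 n = 100
2 while n > 1 global n = (n % 2 == 0) ? div(n, 2) : 3n + 1 end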
Remark Note the somewhat unintuitive use of the global keyword in line 2. In Julia,
commands in code blocks can in general only change local variables that are defined
within the block. The while command starts a new block, so 𝑛 must be explicitly
declared as the global variable defined in line 1.
In fact, this applies to while loops and also to for loops discussed below, but not
to if-else statements.
For Loop
The general form of a for loop is ‘for expr in coll statements end’, where expr is an
expression, and coll can be an array, or any iterable collection to be discussed later.
Example 8.4 Here is the Julia version of our rice-grain computation, discussed in
Examples 2.4 and 6.4 in previous chapters:
1 grains = UInt64(0) # max val 2^64-1 exceeds Int64 max
2 fieldval = UInt64(0) # similarly for max value 2^63
3 for fieldno in 1:64 # range 1 ... 64 incl lower and upper bounds
4 global fieldval = fieldno == 1 ? 1 : fieldval *= 2
5 global grains += fieldval end
6 println(grains)
In lines 1 and 2 we define the variables to be of type UInt64, which can store unsigned integers up to 2⁶⁴ − 1.
In line 4, note the short form of an if-else value assignment, discussed above.
The variables grains and fieldval are defined outside the for loop. They have to
be made accessible inside the loop with the global specifier.
Remark As a final general note on for loops, we mention that multiple nested for
loops can be combined into a single loop, such as
for i = 1:2, j = 3:4 println((i, j)) end
The output, here written on one line: (1, 3), (1, 4), (2, 3), (2, 4).
Interlude: Ranges
The expression ‘1:64’ is a simple example of a range, often used in Julia. Unlike
Python, a range includes the upper bound. The general syntax is: ‘start:end’, or in-
cluding an increment specifier ‘start:increment:end’.
When iterating over a range of numbers, ‘=’ is often used instead of ‘in’, so the counter condition in line 3 of Example 8.4 can be written as ‘for fieldno=1:64’.
Julia also provides a convenient constructor function ‘range’, which can take several
forms:
range(0, stop=1, length=5) # ans: 0.0:0.25:1.0
range(2, stop=-2, length=9) # 2.0:-0.5:-2.0
range(1, step=5, length=10) # 1:5:46
range(0, stop=1, step=0.3) # 0.0:0.3:0.9
Example 8.5 As another example of a for loop, we recast the Monte Carlo method
for the computation of 𝜋 in Sect. 4.9 in the SciPy chapter:
1 const samples = 10_000_000 # input
2 hits = 0
3 for _ = 1 : samples # no counter variable used in loop
4 x, y = rand(), rand()
5 d = x*x + y*y
6 if d <= 1 global hits += 1 end end
In line 1, we declare the ‘samples’ variable as a constant whose value will not be
changed during computation. This is not necessary, but may help to improve perfor-
mance.
In line 4, the built-in rand function generates a pair of random numbers from the
interval (0, 1). Technically, the comma-separated entries form a tuple, as in Python.
We will come back to the details later.
The approximation will then result in a value like this:
7 println(4*(hits/samples)) # ans: 3.1416488
As in many other languages, a break statement terminates the loop containing it. The
program control flows to the instruction immediately after the loop body.
If the break statement is inside a nested loop (loop inside another loop), break
will terminate the innermost loop.
The continue statement is used to skip the rest of the code inside a loop for the
current iteration only.
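A small illustration:

for i = 1:10
    if i == 3 continue end # skip i == 3
    if i == 5 break end # leave the loop at i == 5
    println(i) end # prints 1, 2, 4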
8.3 Functions
Functions are the basic building blocks of Julia. A function definition begins with
the keyword function, followed by the name of the function and the argument list.
What the function does, is defined in the function body, a code block terminated by
the keyword end.
For value types such as numbers, only the argument value is passed to the func-
tion. As we shall see later, the behavior is different for reference types such as arrays.
Example 8.6 As a simple example, here is the definition of the factorial function:
function factorial(n) # computes the factorial
res = 1
for i = 1:n res *= i end
return res end
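The example referred to next is elided here; it presumably defines an anonymous squaring function and applies it via the ans variable:

1 x -> x^2
2 ans(2) # ans: 4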
The value of the expression in line 1 is the anonymous function itself. It is automat-
ically stored in the temporary variable ans and can then be used in line 2 to return
the result 4.
When defining anonymous functions, a single argument can also be enclosed in parentheses, e.g. the expression in line 1 can be written as ‘(x) -> x^2’.
As in Python, an anonymous function can also be assigned to a variable:
g = (x,y) -> x^3 - 2x + x*y
g(3, 2) # ans: 27
Recursive Functions
Example 8.10 In the chapter on C++ we saw that dependencies between function
calls must be resolved by forward declaration. In Julia this is done automatically:
f(n) = n < 0 ? 22 : g(n)
g(n) = f(n-1)
f(4) # ans: 22
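The example referred to next (presumably Example 8.11) is elided here; a sketch consistent with the line numbers used below:

1 function ddx(f, x)
2     h = 1.e-14
3     return (f(x+h) - f(x)) / h end
4 g(x) = x^2
5 ddx(g, 0.5)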
Julia is very good at inferring the type of a variable from the context in which it
occurs. For example, line 3 automatically assumes that the argument f is a function
variable.
However, for better performance, it may be beneficial to explicitly specify the vari-
ables involved. For example, as a hint to the compiler we can declare the argument
types in the header in line 1 as
function ddx(f::Function, x::Float64)
Recall that in Python a function could again return a function as output value. This
is just as easy in Julia:
Example 8.12 Continuing with the program in the last example, we write a function
that returns the approximate derivative function 𝑓′ of an input function 𝑓:
6 function ddx(f)
7 h = 1.e-14
8 function f_prime(x) return (f(x+h) - f(x)) / h end
9 return f_prime end
In line 8 we define a local function f_prime inside the body of ddx, and return this
function in line 9.
We apply ddx to the same test function 𝑔 as above and get:
10 ddx(g)(0.5) # ans: 0.0
11 g_prime = ddx(g)
12 g_prime(0.5) # ans: 0.0
Multiple Dispatch
Note that in the above examples we used the same name ‘ddx’ for two different func-
tions in lines 1 and 6. This is a simple example of Julia’s multiple dispatch or multi-
method feature.
In fact, Julia reports that ddx includes two methods:
julia> ddx # ans: ddx (generic function with 2 methods)
In the present case, which of the two methods is used depends on the number of
arguments.
Although apparently a simple concept, multiple dispatch depending on the types
and numbers of arguments is perhaps the most powerful and central feature of the
Julia language.
Core operations in Julia typically have dozens of methods. As an example, let’s con-
sider the addition operator ‘+’:
julia> + # ans: + (generic function with 208 methods)
At first sight, the Julia programming interface looks very much like the interpreter
environment in Python. It is however a bit more sophisticated. Input is passed to a
so-called just-in-time (JIT) compiler, which translates it on-the-fly into lower level
code that is then executed immediately.
Example 8.13 As an example, we measure the time it takes to add 1000 numbers
𝑥 ∈ (0, 1).
1 julia> v = rand(1000); # generates 1000 random numbers in (0, 1)
2 julia> function add()
3 s = 0
4 for x in v s += x end
5 return s end
Execution time can be measured with the built-in Julia macro ‘elapsed’:
6 julia> @elapsed add() # ans: 0.008543041
7 julia> @elapsed add() # ans: 0.000152042
The elapsed time for the first run in line 6 is longer because it includes the extra time
it takes to compile the function.
Note that the call requires the preceding ‘@’ (at-sign) before the name of the macro.
Remark In Julia, macros allow sophisticated code generation that can then be passed
to the compiler.
To illustrate this, we give a simplified definition of the macro elapsed:
1 macro myelapsed(expr)
2 return quote
3 t0 = time()
4 $expr
5 t1 = time()
6 t1 - t0 end end
7 @myelapsed sin(pi) # ans: 2.1457672119140625e-6
In line 7, the macro myelapsed is called for the expression ‘sin(pi)’. This has the
effect that ‘$expr’ in line 4 is replaced by the expression ‘sin(pi)’ and the resulting
code between ‘return quote’ and ‘end’ is then passed to the compiler.
Also frequently used is the extended macro ‘@time’, which additionally returns the value of the evaluated expression itself, together with some further information about resource usage.
8.4 Collection Types

Arrays are used for lists, vectors, and matrices. In the following, we will focus on one-
dimensional vectors and here in particular on the list aspects. Mathematical proper-
ties are discussed later in connection with matrices.
Vectors can be created as a comma-separated sequence of elements enclosed in
square brackets:
julia> v = [1, 2]
2-element Vector{Int64}:
1
2
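The example the following remark refers to is elided; it presumably constructed a vector with mixed element types, e.g.:

julia> w = [1, "two", 3.0]
3-element Vector{Any}:
 1
  "two"
 3.0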
Remark The type Any in line 3 is a so-called abstract data type, there are no objects
of type Any. Abstract types only serve as nodes in the type graph. Any is the root of
Julia’s type hierarchy.
In contrast, concrete types such as Float64, String etc. denote types of actual
objects.
Vector entries are accessed in bracket notation v[n] as in Python. Note, however,
that indices in Julia start with 1 according to the usual mathematical conventions.
The number of items in a vector v can be queried with length(v), the last element
with v[end]:
v = [3, 2, 1]
length(v) # ans: 3
v[end] # ans: 1
Vectors can be concatenated using the function vcat, where the initial ‘v’ for ‘vertical’
reflects that vectors by default are conceived as column vectors.
Example:
v = [1, 2, 3]; w = [4, 5]
c = vcat(v,w) # c is column vector, flattened by print:
println(c) # ans: [1, 2, 3, 4, 5]
An empty vector is denoted by []. Note that the vector in this generic form is of type
Any. If we want to have an empty vector of a certain type, e.g. integer, we have to
specify it as follows:
v = Int64[] # 0-element Vector{Int64}
or equivalently simply as v = Int[], since on 64-bit machines the integer type de-
faults to Int64.
Vectors can hold entries of arbitrary types. Very often vectors of vectors are needed,
such as
v = [[1, 2], [3, 4], [5, 6]]
typeof(v) # ans: Vector{Vector{Int64}}
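The function swap! discussed next is elided here; it presumably resembles this sketch:

vec = [1, 2, 3, 4]
function swap!(i, j)
    vec[i], vec[j] = vec[j], vec[i] end
swap!(1, 4)
println(vec) # ans: [4, 2, 3, 1]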
Note that the function swap! has no return value. It accesses the external variable
vec and performs the swap in-place.
Also note that functions that change one or more of their arguments are usually given
an exclamation mark as part of the name.
Here is a built-in Julia function ‘push!’ that we will need later. It appends a new entry
to the end of the vector:
v = [1, 2]
push!(v, 42)
println(v) # ans: [1, 2, 42]
Below, we use vectors to formulate a Julia program that computes the sieve of Er-
atosthenes discussed in Example 3.4 in the Python chapter. For this we can use the
powerful filter function for vectors.
If v is a vector and f a Boolean function, then the vector v filtered by f results from v
by removing all elements x for which f(x) is false.
Actually, there are two variants of the filter function: filter, which returns a new
vector, and a mutating function filter! (again written with an exclamation mark),
which modifies the argument vector.
Often anonymous Boolean functions defined in the form ‘x -> Boolean expression’ are used to generate filters:
Example 8.15 Filtering with the function ‘x -> x % 2 == 0’ returns the even numbers in the argument vector:
v = collect(1:5)
filter(x -> x % 2 == 0, v) # ans: [2, 4]
println(v) # ans: [1, 2, 3, 4, 5], v not modified
filter!(x -> x % 2 == 0, v)
println(v) # ans: [2, 4], 'v' modified in-place
Example 8.16 Here is a Julia implementation of the sieve of Eratosthenes from Ex-
ample 3.4 in the Python chapter:
1 n = 30
2 L = collect(2:n)
3 P = Int[] # empty integer list
4 while L != []
5 p = L[1] # the smallest number still contained in L
6 push!(P, p) # is appended to the end of P
7 for i in L # removes all multiples of p
8 if i % p == 0 filter!(x -> x != i, L) end end end
9 println(P)
Note that in line 3 we initialize P as an empty integer vector. The program would
also work with the initial assignment ‘P = []’. But then the Any type would prevail
throughout the program.
Remark The attentive reader may have noticed that, unlike the previous loop con-
structions, we have used the variables L and P within the loop body without declaring
them global. Again, the reason is that Vector is a reference type that passes refer-
ences to memory addresses, not values.
Comprehension
The comprehension methods for lists discussed in Python also exist in Julia.
Example 8.17 Here is a Julia version of the Quicksort algorithm from Example 3.13
in the Python chapter:
1 function qsort(lst::Vector{Int})
2 if lst == [] return Int[] end
3 p = lst[1]
4 sml = qsort([x for x in lst[2:end] if x < p])
5 grt = qsort([x for x in lst[2:end] if x >= p])
6 return vcat(sml, [p], grt) end
A test run:
7 testList = [4, 5, 7, 3, 8, 3]
8 println(qsort(testList))
In line 2, the base case in the recursion again refers to an empty integer array. Oth-
erwise, the type ‘Any’ would propagate into the result.
However, specifying the type ‘Vector{Int}’ for the argument is not mandatory.
It only serves as a hint for the compiler to optimize the code.
Tuples
Like vectors, a tuple is an ordered sequence of elements, except that once it is created,
it cannot be modified. A tuple is represented by a comma-separated sequence of ele-
ments enclosed in parentheses, where the parentheses can be omitted, as in Python.
As in Python, a tuple of length 1 must be distinguished from its contents by a trailing
comma.
Examples:
a = (2.0, "hello"); typeof(a) # ans: Tuple{Float64,String}
b = 1, 2; typeof(b) # Tuple{Int64,Int64}
c = (1); typeof(c) # Int64
d = (1,); typeof(d) # Tuple{Int64}
Dictionaries
Dictionaries work essentially the same as in Python, except that the separator ‘:’ is replaced with ‘=>’:
dict = Dict("a" => 1, "b" => 2, "c" => 3)
Recall, however, that strings in Julia are always enclosed in double quotes.
Sets
The standard set operations union, intersection and difference are performed with
the functions union, intersect and setdiff. Single elements can be added with
‘push!’.
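A small illustration (the printed element order of sets may vary):

A = Set([1, 2, 3]); B = Set([3, 4])
union(A, B) # Set with elements 1, 2, 3, 4
intersect(A, B) # Set([3])
setdiff(A, B) # Set with elements 1, 2
push!(A, 5) # A now also contains 5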
8.5 Composite Types

Unlike Python and C++, Julia does not support object-oriented programming. Data
types can be created with structs as in C, but do not allow access methods to be bun-
dled in encapsulated functional units as in classes. Instead, Julia relies on its key con-
cept of multiple dispatch to specify type-dependent behavior of functions.
Structures
Julia structures are similar to those in C, except that the entry types need not be
specified.
Example 8.18 (Linked Lists) We illustrate the idea with an example already dis-
cussed in the C and C++ chapters, the construction of linked lists for storing sparse
vectors.
Consider the vector
1 v = [0, 2, 0, 0, 8, 10, 0, 14, 16, 18]
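The elided line 2 presumably declares the structure used below:

2 struct node index::Int; value::Int end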
Remark Note that the type is here ‘node’ and not ‘struct node’ as in C, where we
had to apply the typedef operator to get the abbreviated form.
The sparse representation is now defined as a list of all index-value pairs for indices
with non-zero values.
We begin with an empty list, prepared to store entries of type node:
3 sp = node[]
The list is then successively populated with nodes that have non-zero values:
4 for i in 1 : length(v)
5 if v[i] != 0 push!(sp, node(i, v[i])) end end
In the loop, each index-value pair with a non-zero value is stored as a new node and
appended to the list sp.
To test the construction, we print the resulting representation:
6 for nd in sp print("($(nd.index), $(nd.value)) ") end
and obtain
7 (2, 2) (5, 8) (6, 10) (8, 14) (9, 16) (10, 18)
Note that we use the print command (not println), so that all output is printed on
the same line.
In line 6 above, we use string interpolation to insert the node components into the
string to be printed, similar to what we have seen in Python’s f-strings.
In general, string interpolation inserts the value of an expression ‘expr’ into the
string replacing the placeholder ‘$(expr)’.
In the case of individual variables, the parentheses can even be omitted.
Example:
a = 42
println("The answer is $a") # ans: The answer is 42
Note that we have actually encountered string interpolation before, in the definition
of the macro myelapsed at the end of Sect. 8.3.
Mutable Structures
Julia structures are immutable: they cannot be changed once created. However, Julia also provides a mutable variant.
Example:
mutable struct point x::Int; y::Int end
p = point(1, 2)
p.x = 4
println(p) # ans: point(4, 2)
As already mentioned, Julia does not support object-oriented programming. The fol-
lowing example shows how to emulate certain class constructs using structs.
Example 8.19 (Fractions) Consider the definition of the class Fraction in Exam-
ple 3.19 in the Python chapter. We show how to create a corresponding data type
in Julia.
The data structure itself is defined as in Python, now however based on a struct
type:
1 struct fract num::Int; den::Int end
In a class definition, we could now insert object-specific methods into the data struc-
ture. This is not possible in Julia. We have to define the methods globally, however in
such a way that they apply only to fract objects. This is where the multiple-dispatch
approach comes to help.
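The fraction objects a and b used below are created in an elided line; from the outputs in lines 5 and 13–14 they must have been:

2 a = fract(3, 4); b = fract(1, 2)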
First, we define a function ‘add’ to calculate the sum of two fractions:
3 add(x::fract, y::fract) =
4 fract(x.num*y.den+x.den*y.num, x.den*y.den)
Note that the function is now exclusively applicable to arguments of type fract.
We test it:
5 c = add(a,b) # ans: fract(10, 8)
In Python, we could use operator overloading to write addition in the form ‘a + b’.
In Julia, we can do this using the multiple-dispatch feature.
Recall that the ‘+’ operator is a generic function with 208 methods, where the choice
of the actual method depends on the number and type of arguments.
Now we can actually add another method to ‘+’, which applies specifically to fractions.
This is achieved as follows:
6 import Base: +
7 +(x::fract, y::fract) =
8 fract(x.num*y.den + x.den*y.num, x.den*y.den)
9 ans: + (generic function with 209 methods)
In line 6 we import the current definition for the ‘+’ function. The general specifica-
tion declares that ‘+’ can be used as an infix operator.
In line 7, we add a new specific method to be applied whenever the arguments
are of type fract. The output in line 9 shows that the number of methods in the ‘+’
function has now been increased by one.
We test it:
10 a + b # ans: fract(10, 8)
The add definition in lines 3–4 is now no longer needed and can be deleted.
To test fractions for equality, we can proceed similarly:
11 import Base: ==
12 ==(x::fract, y::fract) = x.num*y.den == x.den*y.num
13 a == b # ans: false
14 b == fract(2, 4) # ans: true
Remark In the Python chapter, we also discussed a class Polynomial with a subclass
Parabola. However, a reformulation of the concept of inheritance is not immediately
possible in Julia.
8.6 Linear Algebra

In general, matrices can also be generated using the bracket notation. More precisely,
a matrix can be defined as a sequence of row vectors of equal length separated by
semicolons:
julia> A = [1 2 3; 4 5 6; 7 8 9]
3×3 Matrix{Int64}:
1 2 3
4 5 6
7 8 9
Matrices can also be constructed using comprehension, for example the 3 × 3 Hilbert
matrix with rational number entries as follows:
julia> H3 = [1//(i+j-1) for i = 1:3, j = 1:3]
3×3 Matrix{Rational{Int64}}:
1//1 1//2 1//3
1//2 1//3 1//4
1//3 1//4 1//5
Here is a list of common matrix operations. Recall that in Julia indices begin at 1:
A = [1 2; 3 4]
A[2,1] # entry value 3, recall: indices start at 1
A' # transpose of A, also written as transpose(A)
2 * A # scalar multiplication
B = [5 6; 7 8]
A + B # addition
A * B # matrix multiplication
A .* B # componentwise multiplication
Note that the componentwise multiplication operator is written as ‘.*’. Also note that adding a scalar to each component in the form 2 + A is not supported; the explicit componentwise operator ‘.+’ is required, as in 2 .+ A.
If f is a function defined for single arguments and A is a matrix, then f.(A) is the
matrix that results from componentwise application of f to the entries of A. This is
also known as broadcasting.
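For example:

A = [1 2; 3 4]
sqrt.(A) # componentwise square roots
sin.(A) # componentwise sine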
Julia also offers matrix division from left ‘B\A’ and right ‘A/B’. According to the documentation, this corresponds to the computation of 𝐵⁻¹𝐴 and 𝐴𝐵⁻¹.
Again, if the operations are to be performed componentwise, the operators must
be preceded with a dot as ‘.\’ or ‘./’.
Linear Equations
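The worked example of this subsection is elided; in Julia a linear system 𝐴𝑥 = 𝑏 is solved with the backslash operator, e.g. (the concrete values are assumptions):

A = [4. 1.; 1. 3.]
b = [1., 2.]
x = A \ b # solves A*x = b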
Conjugate Gradient
Note that the matrix 𝐴 above is symmetric and positive definite. We use this as an
opportunity to solve the equation by the conjugate gradient method, discussed in
Example 4.3 in the SciPy chapter.
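The program itself is elided here; a minimal sketch of the conjugate gradient iteration (the function name and the tolerance are assumptions; dot is provided by LinearAlgebra):

using LinearAlgebra
function conjgrad(A, b; tol=1e-10)
    x = zeros(length(b))
    r = b - A*x
    p = copy(r)
    rs = dot(r, r)
    for _ in 1:length(b)
        Ap = A*p
        alpha = rs / dot(p, Ap)
        x += alpha*p
        r -= alpha*Ap
        rs_new = dot(r, r)
        sqrt(rs_new) < tol && break
        p = r + (rs_new/rs)*p
        rs = rs_new end
    return x end

x = conjgrad(A, b) # A, b from above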
Special Matrices
Submatrices
Reshape
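The examples below refer to a matrix A defined in an elided line 1; from the outputs it must have been:

1 A = [1 2 3 4; 5 6 7 8; 9 10 11 12; 13 14 15 16]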
As in NumPy, we can rearrange the entries in a matrix, here using the ‘reshape’
function. For the same matrix A as in line 1 above, we get:
10 reshape(A, (2, 8))
11 ans: 2×8 Matrix{Int64}:
12 1 9 2 10 3 11 4 12
13 5 13 6 14 7 15 8 16
The result is the same as the one we got with the order='F' option in NumPy.
To flatten a matrix into a linear vector we can also use the ‘vec’ function:
14 println(vec(A)) # result represented horizontally:
15 [1, 5, 9, 13, 2, 6, 10, 14, 3, 7, 11, 15, 4, 8, 12, 16]
As usual in Julia, the result is actually a column vector, in order to save printing space
here displayed horizontally. Note again the column-major element order.
Application of vec(A') returns the elements in a row-major order:
16 println(vec(A'))
17 [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16]
So far we have limited ourselves to the built-in linear algebraic tools. For more ad-
vanced constructions we need to use the LinearAlgebra package included in the
standard Julia distribution. It is loaded with
using LinearAlgebra
The LinearAlgebra module offers array arithmetic, matrix factorizations and other
linear algebra related functionality.
Now we can, for example, compute the determinant of a matrix with det(A) and
the inverse with inv(A). Eigenvalues and eigenvectors can be found with eigvals(A)
and eigvecs(A). With the help of the Boolean functions issymmetric(A) and isposdef(A) we can test if a matrix is symmetric or positive definite.
Standard factorization methods are provided by lu(A) for LU, cholesky(A) for
Cholesky, and qr(A) for QR decomposition.
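A small illustration:

using LinearAlgebra
A = [2. 1.; 1. 3.]
det(A) # ans: 5.0
eigvals(A) # eigenvalues as a vector
issymmetric(A) # true
isposdef(A) # true
F = cholesky(A) # Cholesky factorization object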
In the SciPy chapter, we introduced the Poisson matrix used to implement the finite
difference method for solving partial differential equations. For details, we refer to
the discussion at the end of Sect. 4.1. In the following, we present a corresponding
Julia construction.
Example 8.21 (Poisson Matrix) We define a function poisson that constructs the 𝑛² × 𝑛² Poisson matrix for a given number 𝑛:
1 using LinearAlgebra
2 function poisson(n::Int)
3 v = 4*ones(n)
4 w = ones(n-1)
5 D = SymTridiagonal(v, -w)
6 sD = SymTridiagonal(zeros(n), w)
7 Id = Diagonal(ones(n)) # identity matrix
8 A = kron(Id, D) + kron(sD, -Id)
9 return A end
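The elided line 10 presumably constructs a small instance, matching the 9 × 9 output below:

10 A = poisson(3)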
Sparse Matrices
As previously mentioned, using sparse matrices and matrix operations can result in
significant performance gains. In the following, we consider a sparse version of the
Poisson matrix.
To get a sparse version, we could start from scratch and use sparse variants of the
construction methods. However, most of the performance gain comes from its use in
linear equations, not from the construction itself.
We therefore choose to construct our matrix as a dense matrix and then convert
it to a sparse representation.
Example 8.22 (Sparse Poisson Matrix) We continue with the previous example:
11 using SparseArrays # loading module
12 spA = sparse(A) # converting poisson matrix in line 10
In line 11, the standard library SparseArrays is loaded, which in particular contains
the sparse conversion function used to generate the sparse variant spA of the Poisson
matrix in line 12.
The output is
13 9×9 SparseMatrixCSC{Float64,Int64} with 33 stored entries
Line 13 states that the internal representation is in compressed sparse column format
CSC. Line 14 shows that the entries are the ones we want.
Example 8.23 We continue with the program in the last example and compare the performance of dense and sparse Poisson matrices in solving a simple system of linear equations. More precisely, for 𝑛 = 100, we solve 𝐴𝑥 = 𝑏 for the 𝑛² × 𝑛² Poisson matrix 𝐴 and the 1-vector 𝑏 of length 𝑛². Time is again measured with the Julia macro elapsed:
macro elapsed:
15 n = 100
16 A = poisson(n)
17 b = ones(n*n)
18 @elapsed A \ b # ans: 3.548505505
19 spA = sparse(A)
20 @elapsed spA \ b # ans: 0.021005501
8.7 Ordinary Differential Equations

Julia has no built-in tools for solving differential equations, so we need to install an
additional package. Here we use the DifferentialEquations package developed by
Chris Rackauckas at the University of California-Irvine. Extensive documentation
can be found at diffeq.sciml.ai.
We need to download and install the package. This must be done only once:
1 import Pkg # Julia's package manager
2 Pkg.add("DifferentialEquations") # only required once
In line 1, the internal package manager is loaded. In line 2, the Pkg function ‘add’ downloads and installs the external package.
Once the installation is complete, we can use the package like any integrated module:
3 using DifferentialEquations # required once in every session
Example 8.24 As a first example we formulate the Julia solution of a simple initial
value problem
𝑢′ (𝑥) = 𝑥 − 𝑢(𝑥), 𝑥 ∈ [0, 5],
𝑢(0) = 1,
with the exact solution 𝑢(𝑥) = 𝑥 − 1 + 2𝑒⁻ˣ.
We assume that the DifferentialEquations package is already loaded according
to lines 1–3 above.
The problem is then encoded by
4 dudx(u, p, x) = x - u # note additional parameter p
5 bpts = (0.0, 5.0) # interval def by tuple of boundary points
6 u0 = 1.0 # initial value
7 prob = ODEProblem(dudx, u0, bpts)
The p parameter in line 4 is required even if not used. The reason is that the con-
structor in line 7 expects the input function to have three arguments. We will return
to this below.
The interval over which the equation is to be evaluated is specified by a pair of
boundary values in line 5. Line 6 sets the initial value. The ODEProblem constructor
then stores the equation in the prob variable.
The problem is then passed to the solver solve. The solver can use different approx-
imation algorithms. If none is specified as here, a default method is selected:
8 sol = solve(prob)
For the visualization of the solution we need another external package. A popular
choice is the Plots package, which works very well with DifferentialEquations.
Documentation can be found on the official site juliaplots.org.
9 Pkg.add("Plots")
10 using Plots
11 plot(sol, leg=false) # no legend shown
For comparison, we also show the exact solution computed at the evaluation points
selected by the algorithm. Note that we can access the array of evaluation points with
sol.t. As with the solver solve_ivp in SciPy, DifferentialEquations assumes
that the equations evolve over a time span, represented by a variable 𝑡:
12 plot!(sol.t, t -> t-1+2*exp(-t), seriestype=:scatter, leg=false)
Note again that the exclamation mark in plot! indicates that the operation works
in-place, i.e. modifies the original plot. The result is shown in Fig. 8.1 on the facing
page.
The plot can then be saved to a file, e.g. in portable document format PDF, with
13 savefig("odeplot.pdf")
Fig. 8.1 Julia solution of the initial value problem in Example 8.24. Solid dots indicate the exact
solution at the evaluation points selected by the approximation algorithm
Equation Systems
In the previous example, we specified the equation with an explicit definition of the
derivative function dudx. When dealing with equation systems, the derivative, now
commonly called dudt, will be an array. Therefore, we can take advantage of an
implicit in-place specification:
3 function lotvol!(dudt, u, p, t)
4 dudt[1] = p[1]*u[1] ‐ p[2]*u[1]*u[2]
5 dudt[2] = ‐p[3]*u[2] + p[4]*u[1]*u[2] end
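The constructor call shown next additionally refers to an initial vector u0, a time
span tspan and a parameter vector p, whose definitions are not reproduced above; a
minimal sketch, with illustrative values assumed here:
u0 = [1.0, 1.0]           # assumed initial values u1(0), u2(0)
tspan = (0.0, 12.0)       # assumed time span, matching the plot below
p = [1.5, 1.0, 3.0, 1.0]  # assumed coefficients p[1], ..., p[4]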
The specification of the derivative can then be passed to ODEProblem in this way:
6 prob = ODEProblem(lotvol!, u0, tspan, p)
The solution can then be passed to the plotter. The plotter draws the two functions
𝑢1 and 𝑢2 .
8 plot(sol)
The resulting figure shows the two solution functions 𝑢1(𝑡) and 𝑢2(𝑡) over the
interval 0 ≤ 𝑡 ≤ 12.
As a further example, we consider the initial value problem
𝑢″ + 𝑢 = 0 in [0, 2𝜋],
𝑢(0) = 0, 𝑢′(0) = 1,
with the exact solution 𝑢 = sin.
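The Julia encoding again uses the in-place system formulation; the original listing is
not reproduced above, so the following is a minimal sketch with names chosen for
illustration:
function osc!(dudt, u, p, t)
    dudt[1] = u[2]     # u' = v
    dudt[2] = -u[1]    # v' = -u
end
prob = ODEProblem(osc!, [0.0, 1.0], (0.0, 2pi))
sol = solve(prob)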
The result can then be plotted. Note that plot(sol) shows the solutions for both 𝑢
and 𝑢′ .
Boundary Value Problems
Also in Julia, boundary value problems are more difficult to solve than initial value
problems.
As a simple example, we return to the equation from Example 4.16 in the SciPy
chapter:
Example 8.27 Consider the BVP
The boundary values 𝑢(0) = 0, 𝑢(1) = 1 are encoded in a function ‘bc!’ in residual
form, which describes what needs to be subtracted to get a homogeneous system.
During the approximation, bc! will be called repeatedly:
3 function bc!(residual, u, p, x)
4 residual[1] = u[1][1]
5 residual[2] = u[end][1] ‐ 1 end
Note that u[1] in line 4 refers to the current value pair at the beginning of the time
span, where u[1][1] stores the current approximation to 𝑢(0). During the solution
process the value of residual[1] should converge to 0.
Dually, u[end] in line 5 refers to the last element in the value pair array, where
u[end][1] stores the current approximation to 𝑢(1). The correct value is again
reached when residual[2] converges to 0.
To formulate the BVP in Julia, we use the constructor TwoPointBVProblem:
6 bvp = TwoPointBVProblem(u!, bc!, [0, 0], bds)
The constructor expects an initial estimate for the solution. The pair [0,0] declares
the initial values to be constant zero over the entire interval for both functions in-
volved.
The solution is again obtained with the multipurpose solver solve. Presently, how-
ever, for BVPs the solver supports only one particular algorithm, MIRK4. Moreover,
the step size dt for the evaluation grid must be specified:
7 sol = solve(bvp, MIRK4(), dt=0.05) # note: 'dt' with 't' required
The solution for both 𝑢 and 𝑢′ can then be plotted as usual.
Example 8.28 (Bratu Equation) In Example 4.18 in the SciPy chapter, we solved the
Bratu equation
𝑢″(𝑥) = −𝑒^(𝑢(𝑥)), 𝑥 ∈ [0, 1],
𝑢(0) = 𝑢(1) = 0.
Recall that the equation has two solutions. Here is the Julia version of the solution
program:
function u!(dudx, u, p, x) dudx[1]=u[2]; dudx[2] = ‐exp(u[1]) end
bds = (0.0, 1.0)
function bc!(residual, u, p, x)
residual[1] = u[1][1]
residual[2] = u[end][1] end
solinit = [0, 0] # for second solution set solinit = [3, 0]
bvp = TwoPointBVProblem(u!, bc!, solinit, bds)
sol = solve(bvp, MIRK4(), dt=0.05)
plot(sol)
Figure 8.3 shows both solutions, on the left the one based on the initial value [0,0],
on the right the one based on [3,0]. Note that the designator ‘t’ for the argument
parameter cannot be easily changed.
Fig. 8.3 Two solutions of the Bratu equation in Example 8.28. Blue lines show the solution functions,
red lines the derivatives.
8.8 Partial Differential Equations
In Sect. 4.8 in the SciPy chapter, we discussed how partial differential equations
(PDEs) can be solved using the finite difference method, illustrated for 2D Poisson
equations with Dirichlet boundary conditions. In the following, we develop a corre-
sponding Julia program.
Recall that the general form of such a 2D Poisson problem is to find a function
𝑢 ∶ Ω → ℝ, Ω ∶= [0, 1] × [0, 1], such that for some given functions 𝑓, 𝑔 ∶ Ω → ℝ:
(1) −Δ𝑢 = 𝑓 in Ω,
𝑢 = 𝑔 on the boundary 𝜕Ω.
For background information, we refer to the introduction in Sect. 4.8, and turn di-
rectly to the solution program, written as a Julia function.
For convenience, we repeat the definition of the function generating the Poisson ma-
trix from Example 8.21 above:
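The listing begins at line 2; line 1 is presumably the import of the standard library
that provides SymTridiagonal, Diagonal and kron:
1 using LinearAlgebra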
2 function poisson(n::Int)
3 v = 4*ones(n)
4 w = ones(n‐1)
5 D = SymTridiagonal(v, ‐w)
6 sD = SymTridiagonal(zeros(n), w)
7 Id = Diagonal(ones(n))
8 A = kron(Id, D) + kron(sD, ‐Id)
9 return A end
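The surrounding function definition is not reproduced above; judging from the call
in line 32 below, its header (presumably line 10) reads:
function poisson_solver(f, g, m)  # m×m grid including the boundary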
The mesh grid and the distance ℎ between the grid points are given by
11 x = y = range(0, stop=1, length=m)
12 h = 1/(m‐1)
The solution matrix u is initialized with zeros, then modified by the 𝑔-values on the
boundaries:
13 u = zeros(m,m)
14 for i = 1:m, j = 1:m # nested loops
15 if i == 1 || i == m || j == 1 || j == m
16 u[i,j] = g(x[i], y[j]) end end
The boundary adjacent points of F are then modified by the 𝑔 values stored in the
u-boundary:
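The corresponding code lines are not reproduced above; the following is a sketch of
how they could look, assuming the program follows the SciPy version (the names F
and u_inner are taken from the surrounding text; SparseArrays is assumed to be
loaded as in Example 8.22):
n = m - 2                                  # number of inner points per direction
F = [h^2 * f(x[i], y[j]) for i in 2:m-1, j in 2:m-1]
F[1, :]   .+= u[1, 2:m-1]                  # add boundary values of u
F[end, :] .+= u[m, 2:m-1]                  # to the adjacent inner points
F[:, 1]   .+= u[2:m-1, 1]
F[:, end] .+= u[2:m-1, m]
u_inner = sparse(poisson(n)) \ F[:]        # solve the linear system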
The solution u_inner is then inserted in the solution matrix u and the result returned
as a tuple of x, y, u:
27 u[2:n+1, 2:n+1] = reshape(u_inner, n, n) # column-major order
28 return x, y, u
29 end # of poisson_solver definition
We test the program for the source function 𝑓 and boundary function 𝑔 given by
(2) 𝑓(𝑥, 𝑦) = 1.25 ⋅ 𝑒^(𝑥+𝑦/2), 𝑔(𝑥, 𝑦) = 𝑒^(𝑥+𝑦/2)
on a 30 × 30 grid:
30 f(x,y) = 1.25*exp(x + y/2)
31 g(x,y) = exp(x + y/2)
32 sol = poisson_solver(f, g, 30)
33 plot(sol, seriestype=:wireframe)
Fig. 8.4 Julia solution of the Poisson equation (1) with source function 𝑓 and boundary function 𝑔
in (2)
8.9 Working with Files
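The writing part of the introductory example precedes this excerpt; given the file
content read back below, it presumably looks like this:
fh = open("test.txt", "w")                     # open file in writing mode
write(fh, "The parrot is\na Norwegian Blue.")  # write string to file
close(fh)                                      # close the file handler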
The open function opens the file test.txt in writing mode (creating it if it does not
yet exist) and links it to the file handler fh, so that it can be accessed by the program.
The main difference to Python is that the write function now takes the file handler
as argument, while in Python the class method is appended to the handler object in
dot notation.
The file can then be read like this:
5 fh = open("test.txt", "r") # opens file in reading mode
6 s = read(fh, String) # assigns file content to string s
7 println(s) # prints s in two lines
8 ans: The parrot is
9 a Norwegian Blue.
In the output, note that the control character ‘\n’ causes a line break.
As a larger example, we generate an image of a Julia set in the portable graymap
format PGM (cf. Fig. 8.5). The Julia program begins with the definition of the iter-
ation constant:
1 const c = ‐0.8 + 0.156im
We create and open a file julia.pgm in writing mode and connect it to a file handler
fout:
2 fout = open("julia.pgm", "w")
Then we declare the image to be of both width and height 𝑠 = 1000 and write the
entire image specification to the file preamble:
3 s = 1000
4 write(fout, "P2\n# Julia Set image\n$s $s \n255\n")
Basically, the string in line 4 says that each pixel in the 𝑠 × 𝑠 grid will be associated
with a grayscale value, where 0 corresponds to black and 255 to white.
The grayscale values are determined in the following nested for loops:
5 for y in range(2, stop=‐2, length=s) # Im‐rows
6 for x in range(‐2, stop=2, length=s) # Re‐columns
7 z = complex(x,y); n = 0
8 while (abs(z) < 2 && n < 255) z = z*z + c; n += 1 end
9 write(fout, "$(255‐n) ") end
10 write(fout, "\n") end
In line 7, each pixel point (𝑥, 𝑦) is interpreted as a number 𝑧 in the complex plane
and the iteration counter is set to 0. The while loop in line 8 iterates 𝑧 ↦ 𝑧² + 𝑐
until the absolute value of 𝑧 exceeds 2, in which case the point certainly does not
belong to the set, or until 255 steps are reached. The number of steps performed is
recorded in the variable n. In line 9, a corresponding gray value is output to the file,
getting darker with the number of iterations.
The nested for loops beginning in lines 5 and 6 range over all grid points, in the
outer loop from top to bottom for each imaginary part 𝑦, in the inner loop then
horizontally for each real part 𝑥.
Remark Note that in previous Julia versions, we could use a linspace construction
to define the range. However, the linspace function is unfortunately no longer avail-
able.
The file content interpreted as a PGM representation is shown in Fig. 8.5 on the next
page.
Fig. 8.5 Julia set generated with iteration function 𝑧 ↦ 𝑧2 − 0.8 + 0.156 𝑖
Exercises
Exercise 8.1 Write a Julia function that uses a Monte Carlo method to compute the
value of the integral ∫₀¹ 𝑓(𝑥) 𝑑𝑥 for 𝑓∶ ℝ → ℝ with 𝑓([0, 1]) ⊆ [0, 1]. Test the pro-
gram for the argument function 𝑓∶ 𝑥 ↦ 𝑒^(−𝑥).
Continued Fractions
Continued fractions are expressions of the form
𝑎0 + 1/(𝑎1 + 1/(𝑎2 + 1/(⋯ + 1/𝑎𝑛))).
Exercise 8.2 Write a Julia function to compute the rational number 𝑎/𝑏 given by a
continued-fraction representation [𝑎0 , 𝑎1 , … , 𝑎𝑛 ].
Exercise 8.3 Conversely, write a function that converts a rational number 𝑎/𝑏 to a
continued fraction.
Note that the numbers in the continued fraction are precisely the quotients that
would arise, if we used the Euclidean algorithm to compute gcd(𝑎, 𝑏).
Here the Julia function divrem could prove useful, which provides quotient and
remainder of the Euclidean division.
Exercise 8.4 Write a function that computes the continued-fraction representation
[𝑎0 , 𝑎1 , … , 𝑎𝑛 ] of a given real number 𝑥. Here Julia's floor(x) function may be
useful, which returns the greatest integer less than or equal to the float x.
Test the program for the input 𝜋 and 𝑛 = 20.
Differential Equations
Exercise 8.6 In the Python chapter we developed a class for representing polynomi-
als, and then also a subclass for parabolas of degree 2. Recast the class Polynomial
in Julia, using structs and the multiple dispatch feature.
As an advanced challenge, try to reformulate the concept of inheritance, using
Julia’s type hierarchy.
Spline Interpolation
9.1 Basics
Matlab programs are usually run in its dedicated interpreter environment. The main
component of the user interface is the “command window”, in which you can enter a
command at the prompt ‘>>’. This will then be executed immediately.
Example:
>> 2 * 4 % ans = 8
The variable ans corresponds to the anonymous variable ‘_’ in Python. It stores the
last output, which can then be accessed in reading mode. Comments are marked with
the symbol ‘%’.
Note that, as in previous chapters, we use a writing convention to save vertical
printing space. We often write the result on the same line as the input, separated by
the comment symbol.
Numbers can have different types, e.g. int32 for 32-bit integer numbers or double for
64-bit floating-point numbers. If no type is specified, Matlab will automatically
assume double.
With regards to the basic arithmetic operators, it should only be noted that ‘^’
denotes the power operator.
A list of the provided data types can be obtained with ‘help datatypes’.
With ‘format’ the number representation can be changed. By default, ‘format
short’ is set, which leads to an output of sqrt(2) as 1.4142, for example. After setting
‘format long’ it becomes 1.414213562373095.
Variables
As in all the languages considered so far, values are assigned to variables using the
assignment operator ‘=’.
As in Python, a variable has no particular data type.
The type logical contains the truth values true and false. In the output, the truth
values are represented as 1 and 0, whereby it is noted that they stand for logical values.
Boolean expressions can be composed with the operators ‘&’ for and, ‘|’ for or.
Negation is indicated by ‘~’.
Comparisons are denoted by ‘<’ for ‘less than’, ‘<=’ for ‘less than or equal’, equality
and inequality by ‘==’ and ‘~=’.
9.2 Vectors and Matrices
The property ‘Size 1x1’ reveals a basic concept in Matlab. All variables are con-
ceived as matrices, a single number consequently as a 1 × 1 matrix. The property
‘Class double’ confirms our above remark that double is automatically assumed if
no type is specified.
Vectors
In Matlab, the basic way to create a vector is to enclose the elements in square brack-
ets, like this:
>> v = [1 2 3 4] % v = 1 2 3 4
Entries are separated by a space or comma. Access is in the form v(i), with indices
starting at 1, as is usual in mathematics, and also in Julia, and not at 0 as in Python
or C/C++. The length of the vector is returned by length(v).
With the colon operator ‘:’, the input v = [1 2 3 4] can be shortened to v = 1:4.
The colon operator is one of the most important operators in Matlab. With its
help, for example, special vectors can be generated, used for indexing in loops or in
plotting.
The usage is as follows: Starting with an initial number, a fixed number is added
and stored in the vector until a given end is reached or exceeded. The general syntax
is: ‘start:increment:end’, or ‘start:end’ with the default value 1 for the increment.
Another example should make that clear:
>> x = 1: ‐0.3: 0 % x = 1.0000 0.7000 0.4000 0.1000
The colon operator is related to the Python function range, where the latter, however,
generates numbers up to but not including the upper bound.
If instead of specifying the increment you want to generate a number 𝑛 of points with
equal distance between the values 𝑎 and 𝑏, use the linspace(a,b,n) function, just
as in NumPy.
Example:
>> linspace(‐1, 1, 5) % ans = ‐1.0000 ‐0.5000 0 0.5000 1.0000
A vector can be extended dynamically; we simply generate an extension by assigning
a value at some new index position:
>> v(5) = 1 % v = 1 2 0 0 1
The entries filled in at the new intermediate positions are automatically zero initial-
ized.
In fact, we can also create a vector in this way:
>> w(3) = 1 % w = 0 0 1
The comparison operators defined at the end of Sect. 9.1 can also be applied element-
wise to vectors, yielding a logical array:
1 >> x = [‐1 2 3];
2 >> x > 0 % ans = 1×3 logical array 0 1 1
A logical array is an array with entries 1 and 0, but explicitly marked as Boolean
values.
The logical array returned in line 2 can also be defined directly as follows:
>> b = [false true true] % 1×3 logical array 0 1 1
Remark In the introduction to this section, we noticed that a variable holding a num-
ber is automatically conceived as a 1 × 1 matrix. However, also a vector of length 1 is
considered as a 1 × 1 matrix. In fact, a vector of length 1 is identified with its content
in Matlab:
>> x = 1; y = [1];
>> x == y % ans = logical 1
Matrices
A matrix is entered between square brackets, e.g. 'A = [1 2 3; 4 5 6]'. The rows of
the matrix are separated by a semicolon or a line break, the row entries by a space
or comma.
Note that this confirms that the vectors considered above are single-row matri-
ces. A column vector can be defined with entries separated by a semicolon, e.g. 'v =
[1;2;3]'.
Matrix entries are accessed by A(i,j), with indices starting at 1.
Like vectors, matrices can also be defined and extended dynamically:
>> B(2,2) = 1
B = 0 0
0 1
>> B(1,4) = 2
B = 0 0 0 2
0 1 0 0
As in NumPy, every subblock in a matrix can be accessed, not just individual ele-
ments. It is sufficient to replace the numbers with vectors when indexing.
The colon operator is often used to create a submatrix A(p:q,r:s) from a matrix A
consisting of the intersection of the rows p through q and the columns r through s.
Example:
>> A = [1 2 3 4; 5 6 7 8; 9 10 11 12; 13 14 15 16];
>> A(3:4, 2:3)
ans = 10 11
14 15
A special case is a single colon, which selects all rows or columns. For example,
A(:,j) denotes the 𝑗-th column and A(i,:) denotes the 𝑖-th row of 𝐴.
Standard Matrices
Matrices with constant entries can be created with built-in functions such as
zeros(m,n), ones(m,n) and the identity matrix eye(n).
Remark Note the slight difference between Matlab and NumPy. In Matlab, a
function with only one argument, e.g. zeros(n), returns a square matrix, while in
NumPy it returns a vector.
The function ‘rand’ is used to create matrices whose entries consist of pseudo-
random numbers in the interval (0, 1). The syntax is the same as above. Without
argument, rand returns a single random number.
Of course, we can also use all the functions above to create special vectors, such as
>> ones(1,3) % row vector
>> ones(3,1) % column vector
Again, note that the notation is slightly different from that in NumPy.
Special Matrices
Matlab provides functions for creating various special matrices. An example is the
Hilbert matrix whose elements 𝑎𝑖𝑗 have the value 1/(𝑖 + 𝑗 − 1). The matrix is created
with the command hilb(n) and its inverse with invhilb(n), e.g.:
>> invhilb(3)
ans = 9 ‐36 30
‐36 192 ‐180
30 ‐180 180
As an aside, this illustrates one of the remarkable properties of Hilbert matrices: the
inverse consists only of integers.
Over fifty other special and famous matrices can be generated with the command
gallery. Which ones are available can be queried via ‘help gallery’.
Matrix Operations
The usual matrix operations are available. The transpose of a matrix A is denoted by
transpose(A) or A'. Matrix multiplication is computed by A*B, the 𝑛-th power by
A^n, and the inverse of a regular matrix A by inv(A).
We can also use addition between matrices and scalars. For example, ‘A + n’ adds
the number n to each entry of the matrix A.
Division
Matlab also provides division operators for matrices. Example:
>> A = [1 2; 3 4]; B = [5 6; 7 8];
>> C = A / B
ans = 3.0000 ‐2.0000
2.0000 ‐1.0000
>> C*B
ans = 1 2
3 4
A/B is essentially equivalent to multiplying from the right by the inverse inv(B) of B.
In fact, the inverse of an 𝑛 × 𝑛 matrix B can be computed with eye(n)/B, where
eye(n) denotes the 𝑛 × 𝑛 identity matrix.
Dually, the division from left B\A yields the matrix C such that B*C returns A.
Elementwise Operations
Special care should be taken with operations that behave differently when used ele-
mentwise. This concerns multiplication ‘*’, division ‘/’ and exponentiation ‘^’. When
used elementwise they must be preceded by a dot ‘.’:
>> A = [1 2; 3 4]; B = [5 6; 7 8];
>> A .* B % ans = [5 12; 21 32]
>> A.^2 % ans = [1 4; 9 16]
>> A ./ B % ans = [0.2000 0.3333; 0.4286 0.5000]
>> A .\ B % ans = [5.0000 3.0000; 2.3333 2.0000]
9.3 Control Structures
Control structures are again very similar to the ones in Python and Julia.
Conditional Statements
In conditional statements all components can be written on the same line, but then
separated by a comma, or a semicolon if the output is to be suppressed.
Example:
>> if x > y, tmp = y; y = x; x = tmp; end
Loops
The for loop in Matlab is a pure counting loop. The general form is
for start : increment : end, statements , end
Example 9.2 (Machine Epsilon) As in Sect. 3.4 in the Python chapter, the follow-
ing program computes the machine epsilon, i.e. the smallest positive floating point
number 𝜀 for which 1 + 𝜀 > 1:
>> epsi = 1; % spelling avoids conflict with built-in eps
>> while 1 + epsi > 1, epsi = epsi/2; end
>> format long
>> epsi = 2*epsi % epsi = 2.220446049250313e‐16
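The two versions compared in the following remarks are not reproduced above; a
plausible reconstruction of the loop variant and the vectorized variant, timed with
the tic/toc pair discussed next:
>> tic, for t = 0: 0.01: 10, y = sin(t); end, toc
>> tic, t = 0: 0.01: 10; y = sin(t); toc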
The second version will run much faster, on our test machine by the factor 9.5.
On this occasion we also encounter the tic and toc command pair that measures
the time elapsed in the enclosed command block.
Note the different meanings of the expression ‘t = 0: 0.01: 10’ in the two code
snippets. In the first, the number t proceeds from 0 to 10 in steps of length 0.01. In the
second, the variable t denotes a vector of the form 𝑡 = (0, 0.01, 0.02, … , 9.99, 10.0),
to which the sine function is then applied elementwise.
Elementwise function application will be explained more generally below.
9.4 Functions
Many Matlab functions are scalar functions and are executed elementwise when
they are applied to matrices. These include the trigonometric, logarithmic and expo-
nential functions. Use the command ‘help elfun’ to query what is available.
Details of the individual functions can be obtained with ‘help sin’ or more de-
tailed information with ‘doc sin’. The command ‘help specfun’ outputs a list of
other special mathematical functions.
By clicking the 𝑓𝑥 symbol to the left of the current input line, you can retrieve the
total inventory.
Scalar functions are often applied to vectors elementwise. This is especially the case
when function graphs are drawn:
Example 9.3 We plot the so-called sigmoid function
𝜎(𝑥) ∶= 1/(1 + 𝑒^(−𝑥)), 𝑥 ∈ [−6, 6].
For this we introduce a vector of equidistant points with
1 >> x = linspace(‐6, 6);
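The computation of the function values, by the numbering the intervening line 2,
presumably reads:
2 >> y = 1 ./ (1 + exp(-x));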
Here the exponential function is evaluated elementwise over the vector −𝑥. Note the
difference between the function application and the division: the division operator ‘/’
is preceded by a dot to indicate that it is applied elementwise. This is not necessary
for the scalar function exp, which is applied elementwise automatically.
The function 𝜎 can then be plotted:
3 plot(x, y, 'LineWidth', 2);
The function plot in line 3 should be self-explanatory. Just note that an optional line-
width specification is included.
The output is shown in Fig. 9.1.
Fig. 9.1 Matlab plot of the sigmoid function 𝜎(𝑥) = 1/(1 + 𝑒^(−𝑥)) over the interval [−6, 6]
Vector Functions
A second class of Matlab functions are the vector functions. They can be applied to
both row and column vectors, using the same syntax. These functions include max,
min and sum and prod, which compute the sum or product of the entries. length
returns the length of a vector.
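For illustration (values chosen here):
>> v = [3 1 4 1 5];
>> max(v)   % ans = 5
>> sum(v)   % ans = 14
>> prod(v)  % ans = 60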
Matrix Functions
The real strength of Matlab lies in its matrix functions. Some important ones are: rank to
query the rank of a matrix, det for the determinant, inv for computing the inverse.
length returns the length of the larger of the two dimensions for a matrix, numel the
number of elements. size returns the dimensions. A possible usage of the latter is
>> [n,m] = size(A)
9.5 M-Files
Script Files
Script files contain command sequences that are simply read into the system and then
executed as if they had been entered directly. There are no input or output parameters.
Script files are used when you have long command sequences or when you want to
avoid repeated input of frequently used command blocks.
For the development of script files, Matlab offers a comfortable text editor, which
you can access via the window bar or with the command ‘edit filename’ or simply
‘edit’.
From the command line you can then inspect the contents of the file with
>> type scriptdemo
Function Files
Function files store self-written functions, each function into its own file. Function
files can be recognized by the fact that the first line of the M-file contains the word
function. Functions are M-files with input and output parameters. The name of the
M-file and the function should coincide. In case of doubt, Matlab uses the file name.
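The function file discussed next is not reproduced above; given that the call
mv([1 2 3 4 5 6]) below returns 3.5, the file mv.m presumably contains something
like:
function m = mv(x)
% mv computes the mean value of the vector entries
m = sum(x)/length(x);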
Back in the command window, we can display the contents of the file by ‘type mv’.
The comment line is output with the command ‘help mv’.
The function is applied as if it were a normal built-in function:
>> x = [1 2 3 4 5 6];
>> mv(x) % ans = 3.5000
Subfunctions
Every function that is to be called by the user must be in a file of its own. However,
subfunctions that are called only from the main function can be in the same file. Let’s
look at the following example function stat, which is stored in the file stat.m:
1 function [m,s] = stat(x)
2 n = length(x);
3 m = avg(x,n); % subfunction defined below
4 s = sqrt(sum((x ‐ m).^2)/n);
Note again that we precede the power operator in line 4 with a dot ‘.’, since the oper-
ation is to be applied elementwise.
In line 3, a subfunction avg is called, which is defined in the same file:
5 function m = avg(x,n)
6 m = sum(x)/n;
However, we cannot use avg directly because it is located in the stat.m file, not visible
from the outside.
Functions as Arguments
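For example, the plotter fplot accepts a function handle as its first argument (a
sketch; the original command is not reproduced above):
>> fplot(@sin, [-3 3])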
This will draw the sine function in the interval [−3, 3]. Note that the input function
must be preceded by the special function handle symbol ‘@’.
Anonymous Functions
The only exception to the rule that functions must be stored in files of their own, is
that so-called anonymous functions (corresponding to lambda functions in Python)
may be defined in the command window.
Example:
1 >> f = @(x) x.^2 + 2*x; % @ creates function handle
2 >> f(.5) % ans = 1.2500
For the scalar argument in line 2, the elementwise dot operator makes no difference.
However, the dot does matter if we want to plot the function, for example:
>> x = linspace(0,1);
>> plot(x, f(x))
Also in Matlab there are many ways to save and read files. With
>> A = [1 2; 3 4];
>> save('a.txt', 'A', '‐ascii')
the matrix 𝐴 is coded as a character string and written row by row to the file a.txt
in the Matlab default directory. Without the option '‐ascii', the data is stored in
a binary format.
To read the data into a matrix 𝐵 we use the command
>> B = load('a.txt');
For details and further possibilities see ‘help save’ and ‘help load’.
9.6 Linear Algebra
Based on the built-in standard operators and functions, we can already carry out ex-
tensive matrix computations. We consider some examples of linear equation systems
and matrix decompositions.
Matlab offers several possibilities to solve systems of linear equations. The simplest
and most versatile is to use the “division from the left” backslash operator presented
above.
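For example (a small system chosen here for illustration):
>> A = [2 1; 1 3]; b = [3; 5];
>> x = A\b   % x = [0.8000; 1.4000]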
Hidden behind the backslash operator is the most important method for solving sys-
tems of linear equations, Gaussian elimination based on LU decomposition. How-
ever, the decomposition is only used in the course of the calculation and is no longer
available afterwards.
Below we show how to explicitly generate LU decompositions in Matlab.
Matrix Decompositions
LU Decomposition
Recall that the algorithm factorizes a nonsingular square matrix 𝐴 into a unit lower
triangular matrix 𝐿 with only 1s in the diagonal, an upper triangular matrix 𝑈, and
a permutation matrix 𝑃, such that 𝐴 = 𝑃 𝐿 𝑈.
In Sect. 4.4 of the SciPy chapter, we used the lu function to generate such a de-
composition. In Matlab we have a function with the same name, however a slightly
different behavior. It does not return the permutation matrix 𝑃 such that 𝐴 = 𝑃𝐿 𝑈,
but rather the inverse such that 𝑃𝐴 = 𝐿 𝑈:
>> A = [1 2 3; 1 1 1; 3 3 1];
>> [L,U,P] = lu(A)
L = 1.0000 0 0
0.3333 1.0000 0
0.3333 0 1.0000
U = 3.0000 3.0000 1.0000
0 1.0000 2.6667
0 0 0.6667
P = 0 0 1
1 0 0
0 1 0
Cholesky Decomposition
Recall that the Cholesky decomposition factorizes a symmetric positive definite ma-
trix 𝐴 into a product 𝑈𝑇 𝑈 where 𝑈 is an upper triangular matrix.
In Matlab, the Cholesky decomposition is computed as follows:
>> A = [1 2 1; 2 5 2; 1 2 10];
>> U = chol(A)
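For this matrix, the factor (easily checked via U'*U = A) comes out as
U = 1 2 1
    0 1 0
    0 0 3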
with the same result as in the corresponding SciPy example in Sect. 4.4.
QR Decomposition
Note that in Matlab the “economic” form of the qr function, which we used in the
least squares method in Sect. 4.4 in the SciPy chapter, is also available:
>> A = [3 ‐6; 4 ‐8; 0 1]; b = [‐1 7 2]';
>> [Q,R] = qr(A,0) % second arg 0 specifies economic form
Q = ‐0.6000 0
‐0.8000 0
0 ‐1.0000
R = ‐5 10
0 ‐1
>> x = R\Q'*b % x = [5, 2]'
Remark In fact, the least-squares solver is already available directly in Matlab; the
backslash operator can be used for this purpose as well:
>> A = [3 ‐6; 4 ‐8; 0 1]; b = [‐1 7 2]';
>> x = A\b % x = [5.000, 2.000]'
9.7 Ordinary Differential Equations
We have seen that matrix computations are a strength of Matlab for historical rea-
sons. As already mentioned, powerful methods are available today for many other
application areas. This applies not least to the treatment of differential equations, in
particular first order ordinary differential equations (ODEs).
In Sect. 4.7 in the SciPy chapter, we discussed several examples of SciPy programs
for solving ODEs. We show how corresponding programs can be written in Matlab.
For background information, see the discussion in the SciPy chapter.
As a first example, we recall the initial value problem
(1) 𝑢′(𝑥) = 2𝑥, 𝑢(0) = 0, 𝑥 ∈ [0, 1],
with the exact solution 𝑢(𝑥) = 𝑥².
However, for first-order ODEs like (1), Matlab provides highly efficient built-in
solvers, notably the solver function ode45. The first part of the name obviously refers
to ordinary differential equations. The ‘45’ is to indicate that this is a special Runge-
Kutta method of type 4/5, known as the Dormand-Prince method.
Actually the SciPy solver solve_ivp in Sect. 4.7 in Chap. 4 uses exactly this
method by default.
Our ODE (1) above can then be solved as follows:
>> dudx = @(x,u) 2*x; % anonymous function def, see above
>> [x,u] = ode45(dudx, [0, 1], 0);
>> plot(x,u)
Here dudx indicates the derivative 𝑢′ , the second argument the solution interval, the
third the initial value 𝑢0 = 0.
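The plot discussed next and shown in Fig. 9.2 corresponds to the problem
𝑢′(𝑥) = 𝑥 − 𝑢(𝑥), 𝑢(0) = 1 from the Julia chapter; the original listing is not repro-
duced above, so here is a sketch of the corresponding call:
>> dudx = @(x,u) x - u;
>> [x,u] = ode45(dudx, [0, 5], 1);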
Note that ode45 iteratively modifies the vector x of evaluation points until a prede-
termined precision is reached.
This can be seen in the solution plot in Fig. 9.2 if we include the computed nodes:
>> plot(x, u, '‐o', 'LineWidth', 2)
Fig. 9.2 ode45 solution of 𝑢′ (𝑥) = 𝑥 − 𝑢(𝑥), 𝑢(0) = 1 over the interval [0, 5]. Circles indicate the
approximation grid chosen by the solver function
We illustrate how to solve second-order ODEs with ode45. The example below cor-
responds to the pendulum example in the SciPy chapter, but since the techniques are
of general importance, we’ll cover them in detail.
We consider a simple form of an oscillation equation with the solution 𝑢 = sin:
(2) 𝑢″ + 𝑢 = 0, 𝑢(0) = 0, 𝑢′(0) = 1.
ode45 can only handle first order ODEs. We use the standard technique and trans-
form (2) into a system of two first order equations.
𝑣 = 𝑢′ ,
𝑣′ = −𝑢,
𝑢(0) = 0, 𝑣(0) = 1.
This system can now be processed by ode45, if we pass it over in a suitable form.
We collect 𝑢 and 𝑣 in a column vector 𝑦 = (𝑦1, 𝑦2)ᵀ ∶= (𝑢, 𝑣)ᵀ, such that
𝑑𝑦/𝑑𝑥 = (𝑦1′, 𝑦2′)ᵀ = (𝑢′, 𝑣′)ᵀ = (𝑣, −𝑢)ᵀ = (𝑦2, −𝑦1)ᵀ.
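The corresponding Matlab listing is not reproduced above; a minimal sketch (the
component selection referred to as line 5 in the text corresponds to the last line here):
dydx = @(x,y) [y(2); -y(1)];             % right-hand side of the system
[x,y] = ode45(dydx, [0, 2*pi], [0; 1]);  % initial vector [u(0); v(0)]
u = y(:,1);                              % select the solution component u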
The solution vector y contains both solution functions 𝑢 and 𝑣. In line 5, we select
the one we are interested in. The solution can then be plotted with plot(x,u).
As mentioned in the SciPy chapter, boundary value problems (BVPs) are generally
more difficult to handle than initial value problems. This also applies to Matlab.
Consider the following BVP from Example 4.17 in Chap. 4:
𝑢″ = −2,
𝑢(0) = 0, 𝑢(5) = 3.
To process it with Matlab, we first translate it to a system of two first order equations:
1 dydx = @(x,y) [y(2) ‐2]';
The boundary values are stored in the form of residuals: “Which residuum must be
subtracted to get a homogeneous equation?”.
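The residual function itself is missing from this excerpt; given the boundary values
𝑢(0) = 0, 𝑢(5) = 3, it presumably reads:
2 bcfun = @(ya,yb) [ya(1); yb(1) - 3];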
Similar to the Python function solve_bvp, the Matlab solver needs initial values
for the solution process. These are handled by the built-in function bvpinit.
The first argument in bvpinit assumes an initial grid for the solution interval. In
our case, it is sufficient to specify the endpoints of the interval, i.e. the vector [0 5].
The second argument stores an estimate for the solution. In our case, the vector
[0 0] instructs the system to start with the constant value 0 for all grid points for
both functions y(1) and y(2).
These initial assumptions are assigned to a variable solinit:
3 solinit = bvpinit([0 5], [0 0]);
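The solver call, by the numbering line 4, is then presumably:
4 sol = bvp4c(dydx, bcfun, solinit);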
The variable sol contains the solution and additionally a detailed protocol of the
solution process.
We want to plot the solution function 𝑢. For this purpose we specify
5 x = linspace(0, 5); % recall: 100 equally spaced points
and then call the Matlab function deval to extract the function values as follows:
6 y = deval(sol, x, 1);
7 u = y(1,:);
Line 6 provides the value matrix for both y(1) and y(2), from which we extract the
row representing the solution to the original function 𝑢 in line 7.
The result can then again be plotted with plot(x,u).
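The following paragraph refers to a Matlab program for the Bratu equation
𝑢″ = −𝑒^𝑢, 𝑢(0) = 𝑢(1) = 0, which is not reproduced above; a sketch along the lines
of the previous example:
dydx = @(x,y) [y(2); -exp(y(1))];
bcfun = @(ya,yb) [ya(1); yb(1)];
solinit = bvpinit([0 1], [0 0]);   % presumably line 4 of the original listing
sol = bvp4c(dydx, bcfun, solinit);
x = linspace(0, 1); u = deval(sol, x, 1);
plot(x, u)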
Recall that the Bratu equation has two solutions. The above program generates a so-
lution based on an initial estimate 𝑢(𝑥) = 𝑢′ (0) = 0 for all grid points. To generate
the other, it is sufficient to replace the second argument of bvpinit in line 4 with
[3 0] to indicate that we now assume the initial estimate 𝑢(𝑥) = 3 for all grid points.
9.8 Partial Differential Equations
Like SciPy, Matlab has no ready-made solvers for general partial differential equa-
tions. (However, as we shall see below, Matlab provides a versatile graphical toolbox
to aid in the development of solutions.)
Here we discuss a Matlab variant of the SciPy program used for solving 2D Pois-
son equations in Sect. 4.8.
Recall that the basic case is to determine a function 𝑢 ∶ Ω → ℝ, Ω ∶= [0, 1] × [0, 1],
such that
(∗) −Δ𝑢 ≡ 1 in Ω,
𝑢 ≡ 0 on the boundary 𝜕Ω.
For background information, we refer to the introduction in Sect. 4.8, and turn to
the solution program, written as a Matlab script.
We begin with the grid size:
1 n = 100; % n x n inner grid points
2 h = 1/(n+1);
The value matrix in which the solution is to be stored is initialized with zeros:
3 u = zeros(n+2); % recall: square matrix, unlike Python
The zeros on the boundary keep this value, the inner components in u take on the
computed solution values.
Recall that the latter are determined by solving
(1/ℎ²) 𝐴𝑢 = 𝑏
with the 𝑛² × 𝑛² Poisson matrix 𝐴 and the all-ones vector 𝑏 ∶= (1, 1, … , 1)ᵀ of length 𝑛².
Here we do not construct the Poisson matrix as before; it is already available in
the Matlab collection of special matrices:
4 A = gallery('poisson', n); % built‐in Poisson matrix A
5 b = ones(n*n, 1); % needed as column vector
Note that A is provided as a sparse matrix. This can be seen as usual with ‘whos A’.
The system is then solved, and the solution is stored in a column vector u_inner of
length 𝑛2 :
6 u_inner = (A/h^2) \ b;
The column vector is then reshaped into an 𝑛 × 𝑛 matrix and inserted into the solu-
tion matrix as a block of inner points:
7 u(2:n+1, 2:n+1) = reshape(u_inner, n, n); % column-major order
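The solution can then be visualized, for instance with (a sketch, not the original
listing):
[X,Y] = meshgrid(linspace(0, 1, n+2));
mesh(X, Y, u)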
Matlab offers a powerful toolbox for the approximate solution of partial differen-
tial equations. As already mentioned, a corresponding toolbox is unfortunately not
available in Octave.
We illustrate the use for the following Poisson equation, discussed in Sect. 4.8 in the
SciPy chapter:
(∗∗) −Δ𝑢 = 𝑓 = 1.25 ⋅ 𝑒^(𝑥+𝑦/2) in [0, 1] × [0, 1],
𝑢 = 𝑔 = 𝑒^(𝑥+𝑦/2) on the boundary:
• By entering
>> pdeModeler
in the command window we start the graphical user interface.
• In the menu item “Options -> Axes Limits” we change both axes limits to [0, 1].
• By clicking on the rectangle symbol at the top left, we declare that our PDE should
be evaluated over a rectangular area.
• By clicking with the left mouse button on the point (0, 0) and then dragging the
mouse pointer to the point (1, 1) while keeping the button pressed, we determine the
size of the rectangle.
• By double-clicking on the rectangle, input fields appear in which we can adjust the
corner points.
• Next, we set the boundary values. By clicking the 𝜕Ω symbol, the boundary lines
are displayed in red. The boundary conditions can now be set by clicking on the lines.
For each line, we enter exp(x+y/2) for 𝑟 and leave the value ℎ = 1 unchanged.
• The equation itself can now be specified by clicking on the PDE symbol. In our case,
we keep the preselection “Elliptic” for the type of PDE and 𝑐 = −1, 𝑎 = 0. We set 𝑓 to
‐1.25*exp(x+y/2).
• By clicking on the ‘Δ’ icon, we can see the grid on which the solution function is
approximated. Note that pdeModeler chooses a triangular grid pattern. This is be-
cause the underlying approximation method is based on the so-called finite element
method. We will come back to the finite element method in later chapters.
• We click on the symbol ‘=’ to compute the solution.
• In the menu item “Plot -> Parameters” the graphic representation of the solution
can be controlled, e.g. by selecting the two options “Height (3-D plot)” and “Plot in
x-y grid”.
The graph then corresponds to the SciPy solution at the end of Sect. 4.8. The plot is
shown in Fig. 9.4.
Fig. 9.4 Solution of the Poisson equation (∗∗) obtained with the Matlab PDE toolbox
Exercises
Matrix Computations
Exercise 9.1 Compute the regression line 𝑔 = 𝑎1 𝑥 + 𝑎2 through the point cloud in
ℝ2 , generated by
>> x = 0:20;
>> y = x + 2*rand(size(x));
Plot the line together with the cloud. To plot the scattered points, you can use the
plot command with a third argument, say plot( , , '*'). To make sure that both
diagrams appear in the same figure, use ‘hold on’ after the first one.
Exercise 9.2 Use ode45 to solve the third order nonlinear initial value problem
𝑢 ⋅ 𝑢‴ = −1,
𝑢(0) = 1, 𝑢′ (0) = 𝑢″ (0) = 0,
𝑢″ − 2𝑢′ + 𝑢 = 0,
𝑢(0) = 𝑢(1) = 0,
Exercise 9.4 By analogy with the SciPy function poisson_solver at the end of
Sect. 4.8, extend the solver in Sect. 9.8 to a Matlab function to solve general Poisson
equations
−Δ𝑢 = 𝑓 in [0, 1]2 ,
𝑢=𝑔 on the boundary.
Test the function for
(1) 𝑓 = 1.25 ⋅ 𝑒^(𝑥+𝑦/2), 𝑔 = 𝑒^(𝑥+𝑦/2),
(2) 𝑓 ∶= 20 cos(3𝜋𝑥) sin(2𝜋𝑦),
𝑢(0, 𝑦) = 𝑦², 𝑢(1, 𝑦) = 1, 𝑢(𝑥, 0) = 𝑥³, 𝑢(𝑥, 1) = 1 on the boundary.
Exercise 9.5 Solve the following Poisson equation over a non-square rectangle:
𝜕²𝑢/𝜕𝑥² + 𝜕²𝑢/𝜕𝑦² = 𝑥² + 𝑦², (𝑥, 𝑦) ∈ [0, 2] × [0, 1],
𝑢(𝑥, 𝑦) = 𝑥(2 − 𝑥) + 𝑦(1 − 𝑦) on the boundary,
using a program as in the text, or by applying the pdeModeler in the Matlab PDE
toolbox.
In the exercises in the C++ chapter, we looked at the set ℍ of quaternions. Here we
return to a more conventional number set, the complex numbers ℂ. In Sect. 8.9 in
the Julia chapter, we discussed Julia sets, and mentioned that they are related to the
probably better known Mandelbrot set. We take up this topic here.
More specifically, we examine the behavior of the sequences 𝑠(𝑐) given by
𝑧𝑐,𝑛+1 = 𝑧𝑐,𝑛² + 𝑐, 𝑧𝑐,0 = 0, for 𝑐 ∈ ℂ,
and the Mandelbrot set
𝑀 = {𝑐 ∈ ℂ ∣ 𝑠(𝑐) is bounded}.
Representing the complex numbers as ℝ2 , one can visualize the Mandelbrot set as
in Fig. 9.5.
Based on the same ideas as Symbolic Python, Maple (for mathematical manipulation
language) is a commercial computer algebra system for interactive programming in
algebra, analysis, discrete mathematics, graphics, numerical computations, and many
other areas of mathematics. As the name suggests, it was developed in Canada, start-
ing in 1980 at the University of Waterloo.
Initially focused on symbolic computing, Maple was over time developed into a
powerful system also for numerical algorithms.
Maple works well with Matlab and complements the procedures there with the
ability to represent real numbers with any required (but still finite) accuracy. The
book [4] provides a comprehensive discussion of scientific computing with the com-
bination of Matlab and Maple.
10.1 Basics
The main component of the Maple user interface is the “worksheet” for interactive
programming. On the command line, you can enter a Maple command after the
prompt ‘>’ and it will be executed immediately. A command is usually completed
by a semicolon. Entering a colon suppresses the output.
The central feature of all computer algebra systems is that they by default operate with
exact values. Like Python, Maple can represent natural numbers of any size: 2^100
yields the exact result 1267650600228229401496703205376. Note that the power op-
erator is denoted by ‘^’. For rational numbers, entering 115/39 + 727/119 returns
the exact result 42038/4641.
The result of sin(Pi/4) is output symbolically as 1/2 √2, since an exact decimal
representation of the root is not possible. Here Pi obviously denotes the number 𝜋.
A numerical representation of symbolic values is obtained with evalf:
> sin(Pi/4): evalf(%); # Out: 0.7071067810
Note that comments are marked by the ‘#’ symbol, as in Python. As before, we often
mark output with a preceding ‘Out:’, either on the line after the input or as a comment
on the same line. Note that Maple normally prints output centered in the window.
Here we prefer the convention of printing left-justified. We also usually omit the input
prompt symbol, when it is clear from the context.
The Maple symbol ‘%’ denotes the so-called “ditto” operator, i.e. the content of
the previous line. The name of the function evalf stands for evaluate using floating-
point arithmetic. Maple is exact to a number of digits determined by the environment
variable Digits, with default value 10.
Calling kernelopts(maxdigits) one obtains the maximal value for Digits, on
the system used here 38654705646. The desired number of digits can also be passed
to the function evalf as an additional argument, such as evalf(%,5) or equivalently
evalf[5](%).
The root function sqrt may also serve to explain some further Maple features:
sqrt(2) returns the expression √2, whereas the decimal point in sqrt(2.0) indi-
cates that the output is requested in decimal form, such as 1.414213562.
In many cases, you can assist the calculation by providing additional information:
1 sqrt(x^2); # Out: √𝑥 2
2 assume(x, nonnegative):
3 sqrt(x^2); # Out: 𝑥~
In line 1, the system gets stuck because it does not know whether 𝑥 is a positive or
negative value. The annotation in line 2 makes this clear. The tilde in the output marks
that we make an additional assumption about 𝑥.
Maple performs calculations in the complex numbers ℂ by default. For example,
sqrt(‐1) returns the result 𝐼, the Maple symbol for the imaginary unit 𝑖. The term
exp(I*Pi) + 1 evaluates to the correct result 0, thus confirming Euler’s identity.
However, restriction to real-valued computation is possible. Example:
1 sqrt(‐4); # Out: 2𝐼
2 with(RealDomain):
3 sqrt(‐4); # Out: undefined
In line 2, the program package RealDomain is loaded, which among other things re-
sults in sqrt now being interpreted as a real-valued root function. Exactly which
of the Maple components are modified in the package is listed if the command
with(RealDomain) is terminated with a semicolon instead of a colon.
With the ‘?’ operator you can open the help menu and consult the documentation,
e.g. ‘?sqrt’ for the root function.
Variables
Naturally, results can also be assigned to variables. Note, however, that the assignment
operator is ‘:=’ and not ‘=’ as in the languages we have seen so far.
The assignment ‘a := sin(Pi/4)’ stores the symbolic expression 1/2 √2 in the vari-
able a. As before, we can then evaluate the expression numerically by evalf(a) or
make an assignment a:=evalf(a), which then overwrites the original symbolic ex-
pression.
The values of the variables are preserved as long as the worksheet is open. To avoid
possible side effects, it is a good idea to reset the variables before new computations,
e.g. by unassign('a','b','c') or by ‘restart’, which clears all entries in Maple’s
memory.
10.2 Functions
Maple has an extensive collection of mathematical functions, from the abs func-
tion for calculating the absolute value of real or complex numbers to the Riemann-
Hurwitz 𝜁 function zeta. The entire list can be retrieved by ‘?initial’.
There are basically three ways to define functions in Maple, the arrow operator, the
unapply command, and the proc definition.
The arrow operator ‘‐>’ corresponds to the mathematical maps-to operator ‘↦’.
This allows functions to be formulated in a way that corresponds to the lambda
expressions discussed previously, such as
1 f := x ‐> a*x^2;
2 g := (x,h) ‐> (f(x+h) ‐ f(x)) / h;
3 delta := (x,y) ‐> if x = y then 1 else 0 end if;
The functions are then applied like the built-in functions: f(2) yields 4𝑎, g(2, 0.01)
results in 4.010000000 𝑎, and delta(I,sqrt(‐1)) returns the value 1.
Note that the ‘if ... end if’ expression in line 3 is the Maple syntax for the ternary
if-else operator, which has already appeared on several occasions. Note also that the
equality operator is denoted by ‘=’.
The unapply command can be used to convert a term expression into a function.
Consider the following expression:
term := x + x^2:
However, the term is not yet a function in the sense of x ‐> x+x^2. This is what the
unapply command is for:
f := unapply(term, x); # Out: 𝑓 ∶= 𝑥 ↦ 𝑥 + 𝑥 2
The name comes from the fact that a function is applied to an argument and the result
is then a term. The name unapply now suggests the converse. Recall that in SymPy
the corresponding function was called lambdify.
The proc definition (for procedure) corresponds to the function definitions in Python,
C and Julia. It goes far beyond the definition method based on the arrow operator,
since all language constructions can be used in a proc definition, including control
structures such as if-statements and loops.
As a first (trivial) example, the function 𝑓(𝑥) ∶= 𝑥 2 can be defined as follows:
f := proc(x) return x^2 end proc;
The return statement gives the result value of the proc function:
f(2); # Out: 4
If the procedure does not contain an explicit return statement, the value of the last
term is automatically returned. So the above example could be formulated more con-
cisely by
f := proc(x) x^2 end proc;
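The procedure that the following remark refers to is not reproduced above; judging
from the variable names, it is presumably a product loop of this form:
f := proc(n)
  local i, prod;
  prod := 1;
  for i from 1 to n do prod := prod*i end do;
  return prod
end proc;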
The keyword local specifies that the variables i, prod in the function definition are
different from variables in the worksheet environment that may have the same name.
To get a feel for the capabilities of the proc definition, you can put Maple into a
talkative mode with the interface(verboseproc=2) command and then examine
the definitions of the built-in functions with, for example, eval(nextprime).
Remark Here we refrain from detailed explicit syntax considerations of the control
structures. They can be queried from Maple itself. For the sake of completeness, it
should be mentioned that control structures can of course also be used outside of
proc definitions.
Visualization
With the plot command Maple offers extensive possibilities for the graphical repre-
sentation of functions. A function f must generally be specified as a function term,
e.g. f(x).
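The listing discussed next is not reproduced above; a sketch, assuming the standard
Runge function and the interpolation points −1, −1/2, 0, 1/2, 1 mentioned below:
f := x -> 1/(1 + 25*x^2):                   # line 1: Runge function
p := CurveFitting:-PolynomialInterpolation(
       [seq([k/2, f(k/2)], k = -2..2)], x): # line 2: interpolation polynomial
plot({f(x), p}, x = -1..1, size = [500, 500]);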
Line 1 defines the Runge function, line 2 specifies a polynomial as a term expression.
The functions to be plotted are then passed to the plotter enclosed in curly braces.
The size option specifies the aspect ratio between 𝑥 and 𝑦-axes.
Note that 𝑝 was generated as an interpolation polynomial at the mesh points -1, -1/2,
0, 1/2, 1. More details can be found in the exercises.
The result is shown in Fig. 10.1.
Fig. 10.1 Maple plot of Runge function (red) and interpolation polynomial (blue)
The with(plots) command can be used to load many other plots options, e.g. ani‐
mate and animate3d for the representation of temporal processes.
Equation Systems
Maple provides powerful methods for solving systems of equations, both symbolically
with solve, and numerically with fsolve.
Symbolic:
quad := x^2 + 2*x + c = 0:
solve(quad, x); # Out: −1 + √1 − 𝑐, −1 − √1 − 𝑐
Numeric:
eq1 := (sin(x + y))^2 + exp(x)*y + cot(x ‐ y) + cosh(z + x) = 0:
eq2 := x^5 ‐ 8*y = 2:
eq3 := x + 3*y ‐ 77*z = 55:
fsolve({eq1, eq2, eq3});
Out: {𝑥 = −1.543352313, 𝑦 = −1.344549481, 𝑧 = −.7867142955}
10.3 Linear Algebra
Maple knows the standard methods of linear algebra. Both symbolic and numerical
calculations are supported. The LinearAlgebra package must be loaded in order to
take full advantage of these capabilities. But first, let’s see how far we can get with the
standard on-board resources.
A vector is defined as follows:
v := Vector([1, 0, ‐3]): # or equivalently:
w := <1, 0, ‐3>:
In both cases the vector is understood as a column vector (1, 0, −3)𝑇. Components
are accessed in the form v[i], with indices starting at 1, so that v[3] is -3.
A row vector is defined as follows:
v := Vector[row]([1, 0, ‐3]): # or equivalently:
v := <1 | 0 | ‐3>:
Remark Brackets can also be used as subscript operators in general, e.g. to define
a sequence 𝑎1 , 𝑎2 , 𝑎3 by a[1], a[2], a[3]. However, this does not generate ‘a’ as a
vector. From whattype(a) we obtain the answer symbol. For a real row vector v as
above, whattype(v) gives the answer Vectorrow .
Remark Also in general, the ‘~’ symbol is used to indicate elementwise function ap-
plication:
f := x ‐> 1/x:
v := <1 | 2 | 3>: # row vector
f~(v); # Out: [1, 1/2, 1/3]
Maple provides a special mechanism for creating vectors and matrices whose entries
can be described by a function. Consider a vector 𝑣 with entries 𝑖 2 , 𝑖 = 1, … , 10. It
can be generated as follows:
f := i ‐> i^2:
v := Vector(10, f);
The syntax requires the length of the vector and a function that computes the 𝑖-th
component.
A shorthand notation is also allowed:
v := Vector(10, i ‐> i^2);
We verify that the three vectors defined in lines 1–3 below form an orthonormal
basis of ℝ³. In Maple, the verification program looks like this:
1 v1 := <1, 1, 1> / sqrt(3):
2 v2 := <1, 0, ‐1> / sqrt(2):
3 v3 := <1, ‐2, 1> / sqrt(6):
4 v1.v1; # Out: 1
5 v1.v2; # Out: 0
6 x := <a | b | c>: # row vector
7 y:=(x.v1)*v1 + (x.v2)*v2 + (x.v3)*v3;
8 simplify(y);
The vector y returned in line 7 has the same unintelligible form as the one obtained
from SymPy. Again, it is the simplify statement in line 8 that provides the readable
result (𝑎, 𝑏, 𝑐)𝑇.
Example 10.6 In Example 5.3 in the SymPy chapter, we showed that certain polyno-
mials form a basis of the space ℙ2 of polynomials of degree ≤ 2. Here is the Maple
version:
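The listing itself is not reproduced above; a sketch, assuming the shifted Legendre
basis 1, 𝑥 − 1/2, 𝑥² − 𝑥 + 1/6 suggested by the orthogonality exercise in Sect. 10.4:
p[1] := 1: p[2] := x - 1/2: p[3] := x^2 - x + 1/6:  # line 1: basis polynomials
lc := c[1]*p[1] + c[2]*p[2] + c[3]*p[3]:            # line 2: linear combination
q := a*x^2 + b*x + d:                               # line 3: arbitrary polynomial
r := expand(q - lc):                                # line 4: difference polynomial
solve({coeff(r, x, 0) = 0, coeff(r, x, 1) = 0,
       coeff(r, x, 2) = 0}, {c[1], c[2], c[3]});    # lines 5-6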
Line 1 defines the polynomials as term expressions. Line 2 generates a linear combi-
nation of the 𝑝𝑖 with placeholder coefficients 𝑐𝑖 . Line 3 defines an arbitrary but fixed
polynomial 𝑞 of degree 2. Line 4 defines the difference polynomial 𝑟 between 𝑞 and
the linear combination of the 𝑝𝑖 . In lines 5–6, the coefficients 𝑐𝑖 are calculated so that
the difference 𝑟 gives the zero polynomial. The output confirms that such 𝑐𝑖 exist.
The package is loaded with ‘with(LinearAlgebra);’. In response, Maple displays a
list of all functions in the package. As with other Maple commands, the output can
be suppressed by deleting the semicolon or replacing it with a colon.
For example, we can now compute the transpose of a matrix A with Transpose(A).
Other commands include MatrixInverse(A), LinearSolve(A,b) or the construc-
tion of standard matrices like IdentityMatrix(3).
The NullSpace(A) command is often helpful. It determines the subspace of the vec-
tors 𝑣 with 𝐴𝑣 = 0, i.e. the kernel of the linear mapping induced by matrix multipli-
cation.
Example 10.7 As in the SymPy Example 5.4, we show how to check that the follow-
ing vectors are linearly independent:
u1 := <1, 0, 2>: u2 := <0, 1, 1>: u3 := <1, 2, -1>:
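The rest of the example is missing from this excerpt; the check can be completed,
e.g., as follows (a sketch):
with(LinearAlgebra):
A := <u1 | u2 | u3>:  # matrix with the vectors as columns
NullSpace(A);         # Out: {} -- trivial kernel, hence linear independence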
Example 10.8 Finally, we reformulate Example 5.5 from SymPy to obtain a matrix
with a nonempty kernel:
C := Matrix([[1, 3, ‐1, 2], [0, 1, 4, 2],
[2, 7, 2, 6], [1, 4, 3, 4]]):
NullSpace(C);
Out: { (4, −2, 0, 1)ᵀ, (13, −4, 1, 0)ᵀ }
Note that Maple, like SymPy, selects basis vectors with integer entries.
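The setup of the following eigenvalue example is missing from this excerpt; a sketch
with a hypothetical example matrix:
with(LinearAlgebra):
A := Matrix([[2, 1, 0], [1, 2, 1], [0, 1, 2]]):  # hypothetical 3×3 matrix
res := Eigenvectors(A):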
The first component res[1] of the result res is a column vector containing the
eigenvalues, the second res[2] is a matrix whose columns contain the corresponding
eigenvectors. The columns can be accessed with Column(res[2],i).
Here we perform the test for the second eigenvalue:
5 e_vals := res[1]:
6 V := res[2]:
7 A.Column(V,2) ‐ e_vals[2]*Column(V,2):
8 simplify(%);
9 Out: (0, 0, 0)ᵀ
Note that we have suppressed the output in line 7. Again, it is only the simplify
statement in line 8 that makes the result recognizable.
𝑛 × 𝑛 Matrices with 𝑛 ≥ 5
At the end of Sect. 5.3 of the SymPy chapter, we recalled that it is generally impossible
to determine the eigenvalues of a large matrix accurately.
Here is Maple’s attempt to compute the eigenvalues of the 5 × 5 variant of the
Hilbert matrix with Eigenvalues:
1 H5 := Matrix(5, 5, (i,j) ‐> 1/(i+j‐1)):
2 Eigenvalues(H5);
The eigenvalues are returned as complex numbers with vanishing imaginary part.
As already mentioned, one must expect complex solutions when determining eigen-
values. In symmetric matrices like the Hilbert matrix, however, all eigenvalues are
real values in ℝ.
If we give Maple the additional information about the matrix, we finally come to an
acceptable result:
6 unassign('H5'):
7 H5 := Matrix(5, 5, (i,j) ‐> 1/(i+j‐1), shape=symmetric):
8 Eigenvalues(evalf(H5));
9 Out: [3.28793927018769 ⋅ 10⁻⁶,
        0.000305898025240842271,
        0.0114074915726152006,
        0.208534218668012500,
        1.56705069109486050]
10.4 Calculus
Maple has a wealth of knowledge for the formation of limits, and offers a variety of
functions for derivation and integration both on a symbolic level and for numerical
evaluations.
Derivation
Recall the definition of the derivative as the limit of difference quotients:
𝑓′(𝑥) = lim_{ℎ→0} (𝑓(𝑥 + ℎ) − 𝑓(𝑥))/ℎ.
We consider the function 𝑓 ∶ 𝑥 ↦ 𝑎𝑥 2 with a symbolic parameter 𝑎.
Here is the Maple program:
f := x ‐> a*x^2:
delta_f := (x,h) ‐> (f(x+h) ‐ f(x))/h: # difference quotient
f_prime := x ‐> limit(delta_f(x,h), h = 0): # limit
f_prime(3); # Out: 6𝑎
Not surprisingly, Maple itself handles derivatives quite well. The Maple function diff
returns the term expression for the derivative of a function 𝑓 at a point 𝑥. For exam-
ple, for the function f in line 1 above, diff(f(x),x) yields the term 2𝑎𝑥.
The diff command knows all the built-in functions and the usual derivation rules.
For example, for diff(x*sin(cos(x)),x) Maple returns the correct result
sin(cos(𝑥)) − 𝑥 sin(𝑥) cos(cos(𝑥)).
Line 2 computes the first partial derivative 𝜕𝑤/𝜕𝑥, line 3 the second 𝜕²𝑤/𝜕𝑥² by
iterated application. Line 4 returns the second derivative directly. Line 5 yields the
mixed derivative 𝜕⁵𝑤/(𝜕𝑦³𝜕𝑥²).
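The listing these remarks refer to is not reproduced above; a sketch, assuming the
term from the differential-operator example below:
w := x^4*y^4 + x*y^2:   # term expression
diff(w, x);             # line 2: first partial derivative
diff(diff(w, x), x);    # line 3: second derivative, iterated
diff(w, x, x);          # line 4: second derivative directly
diff(w, x$2, y$3);      # line 5: mixed higher derivative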
Note, however, that diff returns the algebraic term expressions of the derivatives,
not the derivative functions themselves. If desired, the corresponding functions can be
recreated using the unapply operator discussed earlier.
It is also possible to generate the derivative functions directly with the differential
operator D. The input D(sin) returns the actual cosine function cos, which can then
also be applied as such:
D(sin); # Out: cos
D(sin) ‐ cos; # Out: 0
The result 0 confirms that D(sin) in fact coincides with the function cos.
The differential operator can also handle higher and partial derivatives. For example,
for the function w:=(x,y) ‐> x^4*y^4 + x*y^2, applying D[1](w) we get the
function
𝜕𝑤/𝜕𝑥 ∶ (𝑥, 𝑦) ↦ 4𝑥³𝑦⁴ + 𝑦².
Here the number 1 enclosed in brackets refers to the first argument of 𝑤. Other uses,
such as D[2](w), D[2,2](w) or D[1$2,2$3](w) should now be clear.
Integration
Maple also offers powerful options for symbolic integration. Definite and indefinite
integrals are obtained with the command int, where, if no integration limits are spec-
ified, an antiderivative with integration constant 0 is computed. int knows the an-
tiderivative functions of all built-in functions (if they exist) and an ever growing set
of sophisticated integration rules.
For ∫sin(𝑥)2 𝑑𝑥 we get
int(sin(x)^2, x); # Out: −1/2 sin (𝑥) cos (𝑥) + 𝑥/2
For the definite integral ∫₀^𝜋 sin(𝑥)² 𝑑𝑥:
int(sin(x)^2, x = 0..Pi); # Out: 𝜋/2
If Maple cannot find a solution, the system returns the expression unchanged:
int(exp(x^3), x = 0..2); # Out: ∫₀² 𝑒^(𝑥³) 𝑑𝑥
Besides the int function, there is also the Int variant (with a capital I), which is used
when you want the integral representation without further transformation attempts.
Integral Norm
Consider the standard integral inner product on the vector space 𝐶[𝑎, 𝑏] of contin-
uous functions over an interval [𝑎, 𝑏], defined by
⟨𝑓, 𝑔⟩ ∶= ∫ₐᵇ 𝑓(𝑥) 𝑔(𝑥) 𝑑𝑥.
Example 10.9 We show that, with respect to this inner product, the functions sin and
cos are orthogonal in the space 𝐶[0, 𝜋]:
int(sin(x)*cos(x), x = 0..Pi); # Out: 0
For the norm of, say, sin, we get ‖sin‖ = 1/2 √(2𝜋) with
sqrt(int(sin(x)*sin(x), x = 0..Pi));
Exercise Show that the basis of the ℙ2 space considered in Example 10.6 is actually
an orthogonal basis, when considered as a function space over the interval [0, 1]. How
can this basis be converted to an orthonormal basis?
The plots are prepared in lines 8 and 10, and output together in line 12.
The result is shown on the left in Fig. 10.2 on the next page, together with a polyg-
onal interpolation explained in the following.
Polygonal Chains
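The generating command is missing from this excerpt; given the piecewise output
below, the data points are (0, 0), (1/3, 1/3), (2/3, 1), (1, 3/4), and the degree 1 spline
is presumably created with:
with(CurveFitting):
spl1 := Spline([[0, 0], [1/3, 1/3], [2/3, 1], [1, 3/4]], x, degree = 1);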
14 Out: 𝑠𝑝𝑙1 ∶= { 𝑥             𝑥 < 1/3
             { −1/3 + 2𝑥     𝑥 < 2/3
             { 3/2 − 3/4 𝑥   otherwise
15 spl1plot := plot(spl1, x = 0..1):
16 display(spl1plot, ptsplot);
The result is shown on the right in Fig. 10.2 on the following page.
Example 10.12 (Hat Functions) We continue with the program in the examples
above. When solving differential equations with the finite element method, among
others, the following degree 1 splines are often used, the so-called hat or tent func-
tions 𝜑𝑖 , which are generated as follows:
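The generating code is not reproduced above; a sketch, again using degree 1 splines
over the grid 0, 1/3, 2/3, 1, where the 𝑖-th hat function takes the value 1 at the 𝑖-th
grid point and 0 at all others:
pts := [0, 1/3, 2/3, 1]:
for i from 1 to 4 do
  phi[i] := CurveFitting:-Spline(
    [seq([pts[j], `if`(j = i, 1, 0)], j = 1..4)], x, degree = 1):
end do: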
Fig. 10.2 Cubic spline interpolation in Example 10.10 (left). Polygonal interpolation in Exam-
ple 10.11 (right)
The graphs are shown in Fig. 10.3. These hat functions form a basis for all polygonal
chains over the grid points 𝑥0 = 0, 𝑥1 = 1/3, 𝑥2 = 2/3, 𝑥3 = 1.
Note that in line 27 we use the function f and not the term f_expr. This is mainly
due to writing economy. Instead of ‘f(0) = 0’ we could just as well have passed
‘eval(f_expr,x=0) = 0’ etc. as inputs to the solver.
Remark In fact, in the present simple case, we do not need Maple to find the coef-
ficients 𝑐𝑖 . At the point 𝑥𝑖 exactly one of the basis hat functions takes a value ≠ 0,
namely 𝜑𝑖 (𝑥𝑖 ) = 1. It is therefore immediately clear that 𝑐𝑖 = 𝑦𝑖 .
The fact that the hat functions vanish outside a small (“finite”) vicinity around the
evaluation points 𝑥𝑖 makes them particularly suitable for the numerical computation
of differential equations. The so-called finite element method is based on exactly this
idea. The method is examined in more detail below and then discussed in depth in
the chapter on the FEniCS project.
Maple offers many ways to deal with ordinary differential equations, both symboli-
cally and numerically.
It is clear that the boundary value problem
(1) 𝑢″(𝑥) = −(1 + 𝑥), 𝑢(0) = 𝑢(1) = 0
can be solved by direct double integration. We have already established that Maple
does not add integration constants, so we explicitly introduce c1 as such. Thus, we
first obtain an expression for 𝑢′ through
1 > int(-(1 + x), x) + c1;
2 Out: −𝑥 − 1/2 𝑥² + 𝑐1
This expression is a term. Integrating once more and adding a second constant c2
yields the general solution, again as a term. It seems more convenient to work with
a function, so we call unapply:
5 > u := unapply(%, x);
6 Out: 𝑥 ↦ −1/2 𝑥² − 1/6 𝑥³ + 𝑐1 𝑥 + 𝑐2
We can let Maple check the result by calculating diff(u(x),x$2) and the boundary
values u(0) and u(1).
Example 10.15 The same result is obtained with the built-in Maple function dsolve.
All we need to do is formulate the ODE and the boundary conditions in Maple and
then apply the solver:
ode := diff(u(x), x, x) = -(1 + x):
bc := u(0) = 0, u(1) = 0:
dsolve({ode, bc});
(2) 𝑓(𝑥, 𝑢) ∶= 𝑢/(1 + 𝑥²), 𝑢0 ∶= 1, 𝑛 = 10.
For later use, we create the representation of the evaluation points in the discrete
solution:
14 with(plots): # contains pointplot, display
15 plot1 := pointplot(u_eul, symbolsize=15):
For comparison, we determine the exact solution using the built-in Maple function
dsolve, which requires the ODE and the initial condition as input arguments:
16 ode := diff(u(x), x) = u(x)/(1 + x^2):
17 ic := u(0) = 1:
18 sol := dsolve({ode, ic}):
19 Out: 𝑠𝑜𝑙 ∶= 𝑢(𝑥) = e^arctan(𝑥)
Finally, we plot the result together with the discrete approximation that we estab-
lished earlier:
21 plot2 := plot(u, x = 0..1):
22 display(plot1, plot2);
Fig. 10.4 Solution of Equation (1) with 𝑓 defined in (2). Dots indicate the discrete solution obtained
with the Euler method. Solid line represents the exact solution obtained with the built-in Maple
function dsolve
10.7 Galerkin Method
In Sect. 5.6 in the SymPy chapter, we explained the Galerkin method to solve
differential equations. In the following, we show how to implement a corresponding
solution program in Maple.
We once again consider the second order boundary value problem
Variational Problem
We assume the same basic functions as in Sect. 5.6 in the SymPy chapter, expressed
in Maple by
1 phi[1] := x*(1 - x):
2 phi[2] := x*(1/2 - x)*(1 - x):
3 phi[3] := x*(1/3 - x)*(2/3 - x)*(1 - x):
4 phi[4] := x*(1/4 - x)*(1/2 - x)*(3/4 - x)*(1 - x):
Next come the specification of the bilinear form 𝑎, the linear form 𝐿, the stiffness
matrix 𝐴, and the load vector 𝑏:
5 a := (u,v) -> -int(diff(u,x,x)*v, x = 0..1):
6 f := x*(x + 3)*exp(x):
7 L := v -> int(f*v, x = 0..1):
8 A := Matrix(4, 4, (i,j) -> a(phi[j], phi[i])):
9 b := Vector(4, j -> L(phi[j])):
Exact Solution
Recalling that Maple does not add any integration constants, we introduce 𝑐1 , 𝑐2 ex-
plicitly:
14 int(f,x) + c1:
15 int(%,x) + c2:
16 u_e := unapply(%,x):
17 sols := solve({u_e(0) = 0, u_e(1) = 0}, {c1, c2}):
18 assign(sols):
19 u_e(x); # Out: −𝑥 e^𝑥 + 𝑥² e^𝑥
As in the corresponding SymPy example, the deviation between the approximate and
the exact solution is so small that it is not noticeable in the plot. As already mentioned,
in such cases it is useful to plot the difference u‐u_e. This is left to the reader.
10.8 Finite Element Method
In Sect. 10.4, we introduced the hat functions in the context of splines. If we choose
suitable hat functions as basis functions in the Galerkin approach, we arrive at the
finite element method (FEM). The FEM will be discussed in depth later in the FEniCS
chapter. Below, we illustrate the basic idea for the BVP (1) in Sect. 10.7.
We would like to proceed in the same way as in the general Galerkin case in the last
section, based directly on the variational form
(1) −∫₀¹ 𝑢″𝑣 = ∫₀¹ 𝑓𝑣.
Integration by parts turns the left-hand side into
(2) −∫₀¹ 𝑢″𝑣 = ∫₀¹ 𝑢′𝑣′ − (𝑢′(1)𝑣(1) − 𝑢′(0)𝑣(0)).
By the assumption 𝑣(0) = 𝑣(1) = 0, the second term on the right-hand side of (2)
vanishes. Putting it all together, we get the following equivalent formulation for (1):
(3) ∫₀¹ 𝑢′𝑣′ = ∫₀¹ 𝑓𝑣.
The equation (3) can now be used as a basis for the Galerkin method. Again, we
denote the left-hand side as 𝑎(𝑢, 𝑣), and the right-hand side as 𝐿(𝑣).
So, we first create a sparse matrix with only zero entries and then populate it with the
values 𝑎(𝜑𝑖 , 𝜑𝑗 ) ≠ 0.
Moreover, the matrix is obviously symmetric, so it is sufficient to specify the lower
triangular part:
13 A := Matrix(n-1, n-1, shape=symmetric, storage=band[1, 0]):
Here storage=band[1,0] means that only the main diagonal and the first subdiag-
onal have to be stored.
The main diagonal in 𝐴 has (𝑛 − 1) entries:
14 for i from 1 to n-1 do A[i,i] := a(phi[i+1], phi[i+1]) end do:
Note that the border-adjacent hat functions 𝜑1 and 𝜑11 are not used. The reason
is simply that the boundary conditions in (1) are homogeneous, i.e. 𝑢 assumes the
value 0 at both boundary points, so hat functions with function value ≠ 0 at the
boundaries cannot contribute to the solution anyway.
We come to the load vector 𝑏, where we again only need to consider the inner hat
functions:
16 b := Vector(n-1, j -> L(phi[j+1])):
We solve the equation system 𝐴𝑐 = 𝑏 to find the coefficient values 𝑐𝑖 for the approx-
imate solution 𝑢 = ∑ 𝑐𝑖 𝜑𝑖 at the inner evaluation points:
17 c := LinearSolve(A,b):
The solution vector 𝑐 is then extended to include the coefficients 𝑐1 = 𝑐11 = 0 for 𝜑1
and 𝜑11 :
18 c := Vector(n+1, [0, c, 0]):
Finally, the solution function 𝑢 is generated and plotted as shown in Fig. 10.5.
19 u := add(c[i]*phi[i], i = 1..11):
20 plot(u, x = 0..1);
Remark As pointed out in Remark 10.5, the hat functions take the value 1 at their
peak point, so strictly speaking it is not necessary to formulate the solution 𝑢 as a sum
𝑢 = ∑ 𝑐𝑖 𝜑𝑖 . From the 𝑐𝑖 values themselves we could generate 𝑢 in line 19 directly as
follows:
u := Spline(x_vec, c, x, degree=1):
Exercises
Exercise 10.1 Use the Maple function dsolve to solve the BVP
𝑢″ (𝑥) = 𝑢(𝑥),
𝑢(0) = 0, 𝑢′ (1) = −1,
already discussed in Exercise 5.7 in the SymPy chapter, and plot the solution.
Hints (1) Use D(u)(1) = -1 to represent the second boundary condition. (2) The
function to be plotted is returned as the right-hand side of the dsolve result.
Exercise 10.2 Show that Maple finds no symbolic solution for the equation
Use dsolve with the option ‘type=numeric’ to retrieve a numerical solution. Plot it
with the odeplot function from the plots package. Note that odeplot works directly
with the dsolve result without the need to extract the right-hand side.
The Weierstrass approximation theorem states that any continuous function defined
on a closed interval [𝑎, 𝑏] can be uniformly approximated by a polynomial function
with arbitrary accuracy: For every function 𝑓 ∈ 𝐶[𝑎, 𝑏] there is a sequence 𝑃𝑛 of
polynomials such that ‖𝑓 − 𝑃_𝑛‖∞ ∶= max_{𝑥∈[𝑎,𝑏]} |𝑓(𝑥) − 𝑃_𝑛(𝑥)| → 0 for 𝑛 → ∞.
Exercise 10.3 (1) Consider the Runge function 𝑓(𝑥) = 1/(1 + 25𝑥²), 𝑥 ∈ [−1, 1], and
interpolate it on (𝑛 + 1) equidistant evaluation points with polynomials 𝑃_𝑛 of degree 𝑛,
for 𝑛 = 4, 𝑛 = 8 and 𝑛 = 16. Draw the graph of 𝑓 and the 𝑃_𝑛. Note that 𝑃4 is actually
the polynomial 𝑝 shown in Fig. 10.1 on page 224.
(2) Make a conjecture as to how the approximation evolves for increasing 𝑛. Support
your conjecture by sketching the graph of the function
Exercise 10.4 Repeat the same reasoning as in the last exercise, with the difference
that now you consider the following Chebyshev nodes instead of the equidistant in-
terpolation points:
𝑥_𝑖 ∶= −cos((𝑖 − 1)𝜋/𝑛), 𝑖 = 1, … , 𝑛 + 1.
Remark In a sense, the Chebyshev nodes 𝑥𝑖 are also based on an equidistant division.
They result from the projection of a sequence of points 𝑝𝑖 = (𝑥𝑖 , 𝑦𝑖 ) on the upper unit
semicircle, equidistantly distributed on the arc.
In Exercise 5.10 in the SymPy chapter, we defined the Legendre polynomials and
showed that they are pairwise orthogonal in the space 𝐶[−1, 1] with respect to the
integral inner product. Here we discuss how they can be used in polynomial inter-
polation.
To recall: The standard integral inner product in 𝐶[−1, 1] is defined by
⟨𝑓, 𝑔⟩ ∶= ∫₋₁¹ 𝑓(𝑥) 𝑔(𝑥) 𝑑𝑥.
Beginning with the monomials 𝑀_𝑛(𝑥) ∶= 𝑥ⁿ, the Legendre polynomials 𝐿_𝑛 are then
inductively defined as
𝐿_0 ∶= 𝑀_0,  𝐿_𝑛 ∶= 𝑀_𝑛 − ∑_{𝑖=0}^{𝑛−1} (⟨𝑀_𝑛, 𝐿_𝑖⟩ / ⟨𝐿_𝑖, 𝐿_𝑖⟩) 𝐿_𝑖.
To approximate a given function 𝑓 ∈ 𝐶[−1, 1], we can then use the polynomials 𝑃𝑛
given by
𝑃_𝑛 ∶= ∑_{𝑖=0}^{𝑛} (⟨𝑓, 𝐿_𝑖⟩ / ⟨𝐿_𝑖, 𝐿_𝑖⟩) 𝐿_𝑖.
In fact, these 𝑃_𝑛 are the optimal approximation polynomials of degree 𝑛 for 𝑓: they
minimize ‖𝑓 − 𝑃‖ over all polynomials 𝑃 of degree 𝑛, where ‖𝑓‖ ∶= ⟨𝑓, 𝑓⟩^{1/2} denotes
the norm induced by the integral inner product.
Exercise 10.5 To illustrate this, we return to the Runge function 𝑓 defined above.
(1) Approximate 𝑓 with some of the 𝑃𝑛 and plot the result.
(2) Compute the sequence 𝜀𝑛 ∶= ||𝑓 − 𝑃𝑛 || and check if it confirms the convergence
assumption.
Spline Interpolation
Exercise 10.6 Without using the package CurveFitting, define a cubic spline to in-
terpolate the points (1,1), (2,4), (3,3).
Note that there are still two degrees of freedom left in the resulting equation system.
Now consider the spline generated by the Maple function Spline and explain how
these degrees of freedom are used there.
Exercise 10.7 Approximate the Runge function by cubic splines on sets of equidis-
tant evaluation points.
Chapter 11
Mathematica
Remark Strictly speaking, Mathematica is now the name for the comprehensive de-
velopment environment. The underlying programming language has been renamed
“Wolfram”. Here we omit this distinction and refer also to the language itself as
Mathematica.
11.1 Basics
Note the square brackets on the right edge of the window. They indicate that the
input and output is organized into cells that can be accessed individually, allowing
changes and reevaluations of cells in any order.
To refer to an earlier output, use the “Out” keyword, followed by the index of the
output, such as for example Out[1], which always refers to the first output of the
current Mathematica session.
Like SymPy and Maple, Mathematica works by default with exact values, for example:
In[2]:= Sin[Pi/4] (* Out: 1/Sqrt[2] *)
Note that keywords, such as Sin and Pi, are capitalized. Function arguments are en-
closed in square brackets.
Here we follow our convention from the previous chapters to print output as a
comment on the input line. Comments are always enclosed in a pair ‘(* *)’.
Note that we typically omit the input and output prompts when they are obvious from
the context.
In many cases, an additional assumption is necessary to obtain an intended result.
Consider for example
s = Sqrt[x^2] (* Out= Sqrt[x^2] *)
The system cannot simplify the expression because it is not clear whether x is a
positive or negative value. We can resolve the problem by specifying
Simplify[s, Assumptions -> x > 0] (* Out= x *)
Variables in Mathematica can hold values of any type. An unassigned variable re-
mains as a symbolic expression.
Because all Mathematica keywords are capitalized, it is recommended that vari-
able names start with a lowercase letter. Underscores are not allowed in names be-
cause, as we will see, they have a special meaning.
Value assignment to variables generally uses the ‘Set’ operator, denoted by ‘=’.
Consider the results of the following inputs:
1 a (* Out= a *)
2 a = Sin[Pi/4] (* Out= 1/Sqrt[2] *)
3 a = a // N (* Out= 0.707107 *)
The result in line 1 is again ‘a’, i.e. the variable a retains its symbol value. In line 2, the
symbolic value 1/√2 is assigned to a. As a result of line 3, the variable a now contains
the numeric value 0.707107.
Once you assign a value to a particular variable, that value persists until you ex-
plicitly change or remove it. Of course, the value disappears when you start a new
Mathematica session.
To avoid possible side effects, it is a good idea to reset variables before new
computations. This can be done e.g. with Clear[a]. After that, the value of ‘a’ is again
the symbol a itself. Clear can also be applied to a comma-separated list of variables.
A radical command is ‘Quit’, which clears memory and starts a new session.
Substitutions in symbolic expressions are made with the replacement operator ‘/.’
(pronounced “slash-dot”), followed by a replacement rule denoted with the arrow
operator ‘->’, for example:
1 expr = (x + y)^2; (* ; suppresses output *)
2 expr /. x -> 1 (* Out= (y + 1)^2 *)
3 expr /. x -> y (* Out= 4 y^2 *)
4 expr /. x -> y + z (* Out= (2 y + z)^2 *)
5 expr /. {x -> 2, y -> 4} (* Out= 36 *)
Control Structures
Program expressions are evaluated sequentially, provided that the control flow is not
changed by loops or branch instructions.
Compound Expressions
A compound expression consists of a sequence of expressions that are either on their
own line or, if combined on one line, separated by a semicolon. The expressions are
then evaluated in order, with the last one providing the result. The semicolon has the
additional effect of suppressing the output.
Loops
Mathematica knows the usual loop constructions. The for loop resembles the one in
the C language:
For[start, condition, increment, body]
Example:
s = 0; For[i = 1, i <= 100, i++, s += i]; s (* Out= 5050 *)
Note that the first three components of the loop correspond to the loop head in the
C version. The last component consists of the loop body. But note that unlike C, the
parts are separated by commas. The body can again be a compound expression.
The do loop is a convenient shortcut when, as here, the iteration variable ranges over
a given sequence:
s = 0; Do[s += i, {i, 1, 100}]; s (* Out= 5050 *)
As with all identifiers in Mathematica, the ‘?’ operator can be used to retrieve
detailed information, here by typing ‘?Do’.
The while loop has the general form
While[condition , body]
Example:
s = 0; i = 1; While[i <= 100, s += i; i++]; s (* Out= 5050 *)
Note that to increase readability, the loop body may be enclosed in parentheses.
Conditional Statements
The ‘If’ function in Mathematica corresponds to the ternary conditional construc-
tions we have already seen on several occasions before:
If[condition, expression1, expression2]
The If function evaluates and returns expression1 if condition evaluates to True, and
expression2 if it evaluates to False. The short form ‘If[condition, expression]’ re-
turns a special Null symbol if condition evaluates to False.
As an example, we consider the computation of the Collatz sequence:
1 count = 0; m = 100;
2 While[m > 1,
3 m = If[EvenQ[m], m/2, 3 m + 1];
4 count++]
5 count (* Out= 25 *)
Notice in line 3 that EvenQ denotes the built-in predicate that holds when the argu-
ment is an even integer.
Also notice that we use the If function to assign different values depending on the
value of the condition. Alternatively we can also use it to select different command
sequences, in the form
If[condition, statements1, statements2]
If we need a conditional expression with else-if clauses, we can use the generalized
Which function:
Which[condition1, value1, condition2, value2, …]
This evaluates each condition𝑖 in turn and evaluates and returns the value𝑖 that cor-
responds to the first condition𝑖 that yields True.
Example:
a = 2; Which[a == 1, x, a == 2, y, a == 3, z] (* Out= y *)
As with the ‘If’ function, the values can also be compound expressions or statement
blocks.
11.2 Functions
Functions are defined by expressions of the form f[x_] = rhs. Recall that function
arguments are always enclosed in square brackets; the underscore indicates that x is
a placeholder for the function argument.
When applying the function f to a term t, each occurrence of x in the expression
rhs on the right-hand side is replaced by t.
This is affirmed by
f[t] == rhs /. x -> t (* Out= True *)
Example:
1 f[x_] = x^2;
2 f[4] (* Out= 16 *)
3 f[a + 2a] (* Out= 9 a^2 *)
Line 3 again shows that the usual simple algebraic transformations have been per-
formed automatically.
Functions can have multiple arguments and also symbolic parameters:
g[x_, y_] = a Sin[x/2] Cos[y]
g[Pi, 0] (* Out= a *)
We can also define a function by providing argument-value pairs. Consider for ex-
ample the logical negation neg ∶ {0, 1} → {0, 1} given by neg(𝑥) ∶= 1 − 𝑥.
In Mathematica it can be implemented either as above in the form neg[x_] = 1 - x,
or alternatively by
1 neg[0] = 1; neg[1] = 0;
2 neg[neg[1]] (* Out= 1 *)
3 neg[t] (* Out= neg[t] *)
Applying neg affects exactly the expressions neg[0] and neg[1]. Line 3 shows that
the expression is returned unevaluated for arguments that do not belong to the func-
tion domain {0, 1}.
The two variants can also be combined. Consider the function
1 f[x_] = Sin[x]/x
2 f[0] (* Indeterminate *)
The error message returned along with the Indeterminate output in line 2 indicates
that the function is not defined for the input 0. However, the limit of 𝑓(𝑥) when 𝑥
approaches 0 exists and is equal to 1. Therefore, we might like to add a second
definition:
3 f[0] = 1
Now the function is completely defined, as can be seen by the query ‘?f’. Mathematica
then returns f[0] = 1, f[x_] = Sin[x]/x.
Mathematica looks for any “special” definitions like f[0]=1 before applying a general
definition involving a placeholder variable.
Returning to the placeholder variant, let’s look at a slightly larger example that we
can use to illustrate some other important function properties.
Remark 11.2 Note in particular the assignment operator ‘:=’ in line 1. The basic dif-
ference between the forms lhs=rhs and lhs:=rhs is the time when the expression
rhs is evaluated. lhs=rhs is an immediate assignment, in which rhs is evaluated at
the time when the assignment is made. lhs:=rhs, on the other hand, is a delayed as-
signment, in which rhs is not evaluated when the assignment is made, but is instead
evaluated each time the value of lhs is requested.
Without going into the subtle details: if you want to be on the safe side, you can
stick to always using ‘:=’.
For illustrative purposes, however, we prefer ‘=’ in the following and use ‘:=’ only
when necessary.
In the present case in Example 11.1, this is necessary. If we replace := with = in line 1,
the assignment in line 3 applies and the initial value 1 is returned in line 6.
Note that we must enclose the compound expression in parentheses here, otherwise
only the first if statement in line 1 would be assigned.
Equivalently, the function can be defined with a single if-else statement:
fac[n_] = If[n == 1, 1, n fac[n - 1]];
It can be expressed even more succinctly with the pointwise function definition ex-
plained above:
1 fac[1] = 1;
2 fac[n_] := n fac[n - 1];
As mentioned earlier, Mathematica first looks for explicit definitions, here the one
in line 1, and uses the recursion expression only if it cannot find a matching explicit
definition.
The notion of pointwise function definition can also help speed up computations by
storing intermediate results:
Example 11.4 Using the same idea as in the last example, we can define the Fibonacci
sequence by
1 fib[1] = fib[2] = 1;
2 fib[n_] := fib[n-1] + fib[n-2]
In line 2, we need to evaluate two recursion branches, one for fib[n‐1] and one for
fib[n‐2]. This, of course, leads to many repeated calculations of the same function
values.
However, if we replace line 2 with
fib[n_] := fib[n] = fib[n-1] + fib[n-2]
the fib[n] value is stored, so that it can simply be looked up the next time it is
needed, just as if it had been assigned directly in the program code. This concept of
storing results and returning them when requested again is known as memoization.
Visualization
Example 11.5 Consider the so-called sigmoid function 𝜎 and its derivative 𝜎 ′ given
by:
𝜎(𝑥) ∶= 1/(1 + 𝑒^{−𝑥}),  𝜎′(𝑥) ∶= 𝑒^{−𝑥}/(1 + 𝑒^{−𝑥})².
A plot can be generated as follows:
1 s[x_] = 1/(1 + E^-x);
2 sp[x_] = E^-x/(1 + E^-x)^2;
3 Plot[{s[x], sp[x]}, {x, -6, 6}]
Fig. 11.1 Sigmoid function 𝜎(𝑥) = 1/(1 + 𝑒^{−𝑥}) (blue line) and its derivative
𝜎′(𝑥) = 𝑒^{−𝑥}/(1 + 𝑒^{−𝑥})² (orange line)
The plot is shown in Fig. 11.2. Note that you can change the appearance of the view
by dragging it with the left mouse cursor.
We begin with nonlinear equations. They can be solved either analytically with Solve
or numerically with NSolve. Both solvers can handle single equations as well as sys-
tems of equations.
Example 11.7 Consider the equation 4𝑥 2 − 12𝑥 = 𝑐. Here is how to solve it in Math-
ematica:
1 s = Solve[4 x^2 - 12 x == c, x]
2 Out= {{𝑥 → (3 − √(𝑐 + 9))/2}, {𝑥 → (3 + √(𝑐 + 9))/2}}
3 s /. c -> 40 (* Out= {{x -> -2}, {x -> 5}} *)
4 s /. c -> 42 // N (* Out= {{x -> -2.07071}, {x -> 5.07071}} *)
5 NSolve[4 x^2 - 12 x == 42, x] (* same as above *)
In line 3, the exact solution for 𝑐 = 40 is returned. The solution is again given in form
of a replacement rule, which specifies what values x can be replaced with to produce
a valid equality.
For 𝑐 = 42 there is no exact solution. In line 4, we request an approximate solution
from the exact expression in line 2.
In line 5, we use NSolve to directly obtain the same numerical solution.
Note that the third solution is the one we found with SciPy on page 72.
Minimization
A whole set of optimization methods is built into the Mathematica language, both
numerical and symbolic.
As a simple example, let’s consider the Minimize symbolic function. We use it to
determine the minimum value of the Rosenbrock function:
1 r[x_, y_] = (1 - x)^2 + 100 (y - x^2)^2;
2 min = Minimize[r[x, y], {x, y}] (* Out= {0, {x -> 1, y -> 1}} *)
The first entry in the list min contains the minimum value 0.
The second entry consists of a list of replacement rules that must be applied to the
arguments 𝑥, 𝑦 to obtain the minimum:
3 s = r[x, y] /. min[[2]] (* Out= 0 *)
11.4 Linear Algebra
Vectors in Mathematica are represented as lists, for example v = {1, 0, 2}. Components
are accessed in double-bracket notation of the form v[[i]], with indices starting
with 1, so v[[2]] is 0.
For example, let’s calculate the dot product between two vectors:
1 v = {1,2,3};
2 w = {4,5,6};
3 Sum[v[[i]] * w[[i]], {i,1,3}] (* Out= 32 *)
4 v.w (* Out= 32 *)
In line 3, we apply the Sum operator, which can be used to sum up any sequence of
values.
In line 4, we get the same result using the built-in ‘.’ operator. Note that a vector
by itself is not characterized as row or column vector.
Matrices can be entered as a list of vectors:
1 A = {{1,2,3}, {4,5,6}, {7,8,9}}
2 MatrixForm[A]
        ( 1 2 3 )
3 Out=  ( 4 5 6 )
        ( 7 8 9 )
Line 2 shows how to represent matrices in the usual 2D form.
The components of the matrix are again accessed using double bracket notation,
with indices starting at 1, e.g. A[[3,1]] is 7.
Access to whole rows is done e.g. via A[[2]], which returns the second row
{4,5,6}. For a column, the corresponding command is A[[All,2]], which returns
the column {2,5,8}.
Vectors and matrices can be created with the Table function. Here is an example:
Example 11.9 We show how to define the basic building block for the Poisson matrix,
as discussed, for example, in Example 4.2 in the SciPy chapter.
First we define the function to generate the entries:
1 f[i_, j_] = Which[
2   i == j, 4, (* diagonal *)
3   Abs[i - j] == 1, -1, (* first super- and subdiagonal *)
4   True, 0] (* otherwise *)
Note that we use the function Which, already briefly discussed at the end of Sect. 11.1.
With the function f we can then define our matrix:
5 A = Table[f[i, j], {i, 1, 3}, {j, 1, 3}] // MatrixForm
        (  4 −1  0 )
6 Out=  ( −1  4 −1 )
        (  0 −1  4 )
We remodel some of the examples in the SymPy and Maple chapters in Mathematica.
For background we refer to Sects. 5.3 and 10.3.
As in SymPy and Maple, it is the Simplify statement in line 8 that produces a read-
able result.
Example 11.11 As in SymPy Example 5.4, we show how to check that the vectors 𝑢𝑖
given by
1 u1 = {1,0,2}; u2 = {0,1,1}; u3 = {1,2,-1};
Note that the dot operator is used here in a matrix-vector product, which means in
particular that both c and x are now interpreted as column vectors.
We can also formulate a direct confirmation:
9 c[[1]] u1 + c[[2]] u2 + c[[3]] u3 == x (* Out= True *)
While we’re at it, let’s show how the dot operator in line 8 extends to matrix-matrix
multiplications as well:
10 A.Inverse[A] (* Out= {{1, 0, 0}, {0, 1, 0}, {0, 0, 1}} *)
Example 11.12 In Example 5.3 in the SymPy chapter, we showed that the three poly-
nomials
𝑝1(𝑥) ∶≡ 1,  𝑝2(𝑥) ∶= 𝑥 − 1/2,  𝑝3(𝑥) ∶= 𝑥² − 𝑥 + 1/6
form a basis of the vector space ℙ2 of polynomials of degree 2. In Mathematica we
proceed as follows:
1 p1 = 1; p2 = x - 1/2; p3 = x^2 - x + 1/6;
2 plin = c1 p1 + c2 p2 + c3 p3;
3 q = a0 + a1 x + a2 x^2;
4 r = q - plin;
5 Solve[{Coefficient[r, x, 0] == 0, Coefficient[r, x, 1] == 0,
6   Coefficient[r, x, 2] == 0}, {c1, c2, c3}];
7 sol = Simplify[%]
8 Out= {{c1 → a0 + a1/2 + a2/3, c2 → a1 + a2, c3 → a2}}
Note that the polynomials do not have to be defined as functions. We can work di-
rectly with the term expressions.
To verify the solution we apply the replacement rules in line 8 to plin in line 2:
9 plin /. sol[[1]]
10 Out: a0 + a1/2 + a2/3 + (𝑥 − 1/2)(a1 + a2) + a2 (𝑥² − 𝑥 + 1/6)
11 Simplify[%] (* Out: a0 + 𝑥 (a1 + a2 𝑥) *)
12 Collect[%, x] (* Out: a0 + a1 𝑥 + a2 𝑥² *)
The Collect function in line 12 collects terms that contain the same powers of x.
As expected, the verification returns the polynomial q specified in line 3.
Sparse Matrices
A construction of the Poisson building block from Example 11.9 as a sparse matrix
presumably reads as follows (a reconstruction from the line references below; the
concrete dimension 3 is an assumption):
1 A = SparseArray[{
2   {i_, i_} -> 4,
3   {i_, j_} /; Abs[i - j] == 1 -> -1},
4   {3, 3}];
5 A // MatrixForm
The notation in line 2 states that all entries with an index pair matching {i_, i_}, i.e.
the entries on the diagonal, are given the value 4.
In line 3, the suffix condition operator ‘/;’ declares that the entries whose index
pairs satisfy the condition ‘Abs[i‐j] == 1’, i.e. all entries on the first super and
subdiagonal, are assigned the value -1. You can read the operator ‘/;’ as “slash‐semi”,
“whenever”, or “provided that”.
Finally, the specification in line 4 defines the matrix size.
The output of line 5 shows that the result corresponds to the previous one.
11.5 Calculus
Like SymPy and Maple, Mathematica can handle derivation and integration both
symbolically and numerically.
Derivation
𝑓′(𝑥) = lim_{ℎ→0} (𝑓(𝑥 + ℎ) − 𝑓(𝑥)) / ℎ.
We consider the function 𝑓∶ 𝑥 ↦ 𝑎𝑥² with a symbolic parameter 𝑎:
1 f[x_] = a x^2;
2 fquot[x_, h_] = (f[x + h] - f[x])/h;
3 flim[x_] = Limit[fquot[x, h], h -> 0];
4 flim[x] (* Out= 2 a x *)
5 flim[x] == f'[x] (* Out= True *)
6 f''[x] (* Out= 2 a *)
Note that in line 3 we use the built-in operator Limit. In lines 5 and 6 we use the
‘prime’ operator ‘'’ to specify first and second derivatives.
We can also write D[f[x],x] and D[f[x],x,2] for f'[x] and f''[x]. Note that the
‘prime’ operator applies to functions, the ‘D’ to terms.
This latter D notation also extends to partial derivatives:
1 f[x_,y_] = 1 + x^2 + 2 y^2
2 D[f[x,y], x] (* Out= 2 x *)
3 D[f[x,y], y] (* Out= 4 y *)
Mixed and higher derivatives are written as D[f[x,y],x,y] for ∂²𝑓(𝑥, 𝑦)/(∂𝑥∂𝑦), or
D[f[x,y],{x,2}] for ∂²𝑓(𝑥, 𝑦)/∂𝑥².
Integration

11.6 Interpolation and Piecewise Functions
In Sect. 10.5 in the Maple chapter, we discussed spline interpolation. Here, we
remodel some typical examples in Mathematica.
Figure 11.3 shows the resulting splines together with the interpolation knots, the cu-
bic spline on the left and the linear spline on the right.
Fig. 11.3 Cubic spline (left). Polygon spline (right). Dots indicate interpolation knots
Note that in lines 4-6 we suppress the immediate output of the individual diagrams.
The Show function in lines 7-8 combines the pairs of diagrams into one and then
displays them.
Remark Note the slight difference between the left cubic spline and the one generated
in Maple (see Fig. 10.2 on page 234), especially around the left interval boundary 0.
In the exercises you will be asked for an explanation.
In particular, when solving partial differential equations with the finite element
method, we take advantage of the fact that polygon splines can be generated as a
sum of hat functions.
Example 11.15 (Hat Functions) We continue with the example above. As in the cor-
responding Maple Example 10.12, we define hat functions 𝜑𝑖 , 𝑖 = 1, … , 4 to represent
polygons over the grid 0, 1/3, 2/3, 1.
The resulting functions look the same as in Maple Fig. 10.3 on page 234.
We represent the polygon curve in Fig. 11.3 as a sum of these hat functions:
From the 4 × 2 matrix pts in line 1 we extract the value vector
12 c = pts[[All, 2]] (* selects column 2 *)
The plot shows that we have regained the polygon spline illustrated in Fig. 11.3 on
the facing page.
Piecewise Functions
When solving partial differential equations using the finite element method, we need
to integrate expressions with hat functions. Unfortunately, the Mathematica function
Integrate cannot satisfactorily handle hat functions when they are defined as inter-
polating functions as above.
We are therefore prompted to consider an alternative construction that works. The
idea is to build hat functions as piecewise functions:
(∗)  𝜑_𝑖(𝑥) ∶= { 3𝑥 − 𝑖 + 2   if (𝑖 − 2)/3 < 𝑥 ≤ (𝑖 − 1)/3
               { −3𝑥 + 𝑖      if (𝑖 − 1)/3 < 𝑥 < 𝑖/3
               { 0            otherwise,
for 𝑖 = 1, … , 4.
This is implemented by
1 Do[phi[i][x_] = Piecewise[{
2   {3 x - i + 2, (i-2)/3 < x <= (i-1)/3},
3   {-3 x + i, (i-1)/3 < x < i/3}},
4   0],
5 {i, 1, 4}]
The plot returns exactly the same polygon chain as in Fig. 11.3 on the preceding page.
For solving differential equations, Mathematica provides the symbolic solver DSolve
and the numerical solver NDSolve, both of which return the results as replacement
rules, similar to the solver Solve for nonlinear equations above.
However, Mathematica also offers the functions DSolveValue and NDSolveValue,
which return the solutions directly. In the following we prefer these latter variants.
Example 11.16 As a first example, we consider an initial value problem presented in
Example 8.24 in the Julia chapter:
𝑢′(𝑥) = 𝑥 − 𝑢(𝑥), 𝑢(0) = 1.
It can be solved by
sol = DSolveValue[{u'[x] == x - u[x], u[0] == 1}, u[x], x] (* Out= -1 + x + 2 E^-x *)
The boundary value problem 𝑢″(𝑥) = −(1 + 𝑥), 𝑢(0) = 𝑢(1) = 0, already solved with
dsolve in Maple Example 10.15, is handled in the same way:
1 ode = {u''[x] == -(1 + x), u[0] == 0, u[1] == 0}
2 sol = DSolveValue[ode, u[x], x] (* Out= 1/6 (4𝑥 − 3𝑥² − 𝑥³) *)
A nonlinear equation such as the boundary value problem 𝑢″(𝑥) = −𝑒^{𝑢(𝑥)},
𝑢(0) = 𝑢(1) = 0, discussed in Sect. 4.8 in the SciPy chapter, is beyond DSolveValue,
but can be solved with the numerical solver NDSolveValue. The output shows the
solution with the peak value at around 0.14. It corresponds to the one in Fig. 4.8 on
page 82 in the SciPy chapter, so we do not repeat it here. However, there seems to be
no obvious way to direct the solver to the other solution with a peak value of about 4.0.
(1)  Δ𝑢 ≡ 6 in Ω = [0, 1]²,
     𝑢(𝑥, 0) = 1 + 𝑥²,  𝑢(𝑥, 1) = 3 + 𝑥²,  𝑢(0, 𝑦) = 1 + 2𝑦²,  𝑢(1, 𝑦) = 2 + 2𝑦²  on the boundary 𝜕Ω.
In line 2, we encode the equation Δ𝑢 ≡ 6 with the built-in function Laplacian. Lines
3–4 take care of the boundary conditions.
Line 5 again specifies the target function u and the argument variables x and y.
Line 6 shows the solution.
In line 7, the function Plot3D generates the graph shown in Fig. 11.4.
Example 11.20 Consider a related equation from Sect. 4.8 in the SciPy chapter:
−Δ𝑢 ≡ 1 in Ω = [0, 1]²,  𝑢 = 0 on the boundary 𝜕Ω.
DSolveValue does not find a solution here. Instead, we search for a numerical solution
with NDSolveValue. However, this solver expects the declaration of the boundary
values to be specified by the DirichletCondition function.
We arrive at
1 sol = NDSolveValue[{-Laplacian[u[x, y], {x, y}] == 1,
2   DirichletCondition[u[x, y] == 0, True]},
3   u[x, y], Element[{x, y}, Rectangle[]]];
4 Plot3D[sol, Element[{x, y}, Rectangle[]]]
11.8 Galerkin Method
In Sect. 5.6 in the SymPy chapter and in Sect. 10.7 in the Maple chapter, we explained
the Galerkin method for solving differential equations. In the following, we show how
to implement a corresponding solution program in Mathematica.
The left and right-hand side of the variational problem are encoded by
5 a[u_, v_] = Integrate[-u''[x] * v[x], {x, 0, 1}];
6 f[x_] = x (x+3) E^x;
7 L[v_] = Integrate[f[x] * v[x], {x, 0, 1}];
The forms in lines 5 and 7 are then used to define the stiffness matrix and the load
vector:
8 A = Table[a[phi[i], phi[j]], {i,1,4}, {j,1,4}];
9 b = Table[L[phi[i]], {i,1,4}];
The plot will look as in Fig. 5.3 on page 114 in the SymPy chapter.
11.9 Finite Element Method
In Sect. 10.8 in the Maple chapter, we showed how to solve the boundary value
problem (1) in Sect. 11.8 with the finite element method. In the following, we discuss
a corresponding Mathematica implementation.
We refer to the Maple chapter for an introduction and get right to work.
We first define hat functions 𝜑𝑖 , 𝑖 = 1, … , 11, over a grid consisting of a sequence
0 = 𝑥0 < 𝑥1 < ⋯ < 𝑥10 = 1 of 11 equidistant support points.
For this, we extend the definition (∗) in Sect. 11.6 and obtain:
𝜑_𝑖(𝑥) ∶= { 10𝑥 − 𝑖 + 2   if (𝑖 − 2)/10 < 𝑥 ≤ (𝑖 − 1)/10
          { −10𝑥 + 𝑖      if (𝑖 − 1)/10 < 𝑥 < 𝑖/10
          { 0             otherwise,
for 𝑖 = 1, … , 11.
Since 𝑢(0) = 𝑢(1) = 0, we do not need to consider the boundary hat functions 𝜑1 and
𝜑11 with peak values 1 at 𝑥0 = 0 and 𝑥10 = 1, so we only determine the coefficients
𝑐_𝑖 for the inner hat functions 𝜑_𝑖, 𝑖 = 2, … , 10.
Again, we collect the integrals on the left-hand side of (2) into a matrix 𝐴:
6 A = Table[Integrate[phi[i]'[x] * phi[j]'[x],
7 {x, 0, 1}], {i, 2, 10}, {j, 2, 10}];
The result corresponds to the one shown in Fig. 10.5 on page 241 in the Maple chapter.
Exercises
Functions
Exercise 11.1 Define a recursive function that, for 𝑛 > 0, returns the number of
steps it takes for the Collatz sequence to reach 1. To speed up the calculation, use
memoization to store intermediate results.
Exercise 11.2 In Exercise 10.3 in the Maple chapter, we discussed the “boundary os-
cillation” problem in polynomial interpolation.
Here you are asked to reformulate the model example in Mathematica.
Consider the Runge function
𝑓(𝑥) = 1/(1 + 25𝑥²), 𝑥 ∈ [−1, 1].
As in Exercise 10.3 in the Maple chapter, interpolate 𝑓 on (𝑛 + 1) equidistant evalua-
tion points −1 = 𝑥1 < ⋯ < 𝑥𝑛 < 𝑥𝑛+1 = 1 with polynomials 𝑃𝑛 of degree 𝑛 for 𝑛 = 4,
𝑛 = 8, and 𝑛 = 16.
To generate the (𝑛 + 1) equidistant points, use Subdivide[-1, 1, n]. Use
InterpolatingPolynomial to generate the polynomials and Expand to verify that
they are of degree 𝑛.
Exercise 11.3 Repeat the same reasoning as in the last exercise, with the difference
that now you consider the Chebyshev nodes
𝑥_𝑖 ∶= −cos((𝑖 − 1)𝜋/𝑛), 𝑖 = 1, … , 𝑛 + 1,
instead of equidistant evaluation points.
Exercise 11.4 Write a program to compute the cubic polynomial 𝑝(𝑥) = 𝑎𝑥³ + 𝑏𝑥² +
𝑐𝑥 + 𝑑 that best fits the sine function in the interval [−𝜋, 𝜋] with respect to the integral
norm. The integral norm is explained e.g. on page 232 at the end of Sect. 10.4 in the
Maple chapter.
Splines
Exercise 11.5 Without using the Interpolation function, define a spline that con-
sists of three piecewise cubic polynomials to interpolate the 4 points
Observe that there are two degrees of freedom left in the resulting equation system.
Consider the spline generated in Example 11.14 in Sect. 11.6 and explain how these
degrees of freedom are used there. Compare this with the solution provided by Maple
in Example 10.10.
Exercise 11.6 In Example 4.15 in the SciPy chapter, we considered the pendulum
equation
(∗) 𝜃″(𝑡) + (1/4) 𝜃′(𝑡) + 5 sin(𝜃(𝑡)) = 0.
Verify that DSolve finds no solution.
Instead, use NDSolveValue to solve the equation numerically over the interval [0, 10]
for initial conditions 𝜃(0) = 1, 𝜃′ (0) = 0.
For small angles 𝜃 we can use sin(𝜃) ≈ 𝜃 and consider the approximation
(∗∗) 𝜃″(𝑡) + (1/4) 𝜃′(𝑡) + 5 𝜃(𝑡) = 0.
Show that DSolveValue can find a solution here.
For comparison, draw the numerical solution for (∗) above together with the analytic
solution for (∗∗).
Exercise 11.7 Consider the solution program according to the finite element method
in Sect. 11.9.
Replace the stiffness matrix 𝐴 in lines 6–7 with a sparse variant that takes into
account that 𝜑_𝑖′ 𝜑_𝑗′ ≡ 0 for |𝑖 − 𝑗| > 1.
Part IV
Distributed Computing
Chapter 12
A Python Approach to Message Passing
Modern high-performance computing owes its success not least to the simultaneous
processing of complex computing tasks on distributed systems, in networks consist-
ing of nodes on separate machines, or on individual machines with multiprocessor
architecture. In fact, most computers today have more than one processor core, so
that distributed processing is also possible on normal laptops, not just on supercom-
puters as in the past.
Two main approaches to distributed computing can be distinguished. The first is
message passing: a distributed group of cooperating workers performs a common task
in a network of nodes, sending and receiving messages over shared channels. The
second is shared memory: communication between independent agents is organized
through simultaneous access to a common local memory.
Distributed computing and the theory of distributed systems in general is still an area
of intensive research in computer science. Pioneering contributions to the theoreti-
cal foundations were made in the late 1950s in the Institute for Instrumental Mathe-
matics at the University of Bonn and later in the GMD (German Research Center for
Information Technology) by the then head of the University Computing Center, Carl
Adam Petri. The author would like to take this opportunity to refer to his biography
[13], in which various aspects of the theoretical foundations are discussed.
In this chapter, we develop the basics of message passing within the user-friendly
Python language.
All examples are provided in a form that can be tested on standard home computers.
Originally developed for the programming languages C, C++ and the venerable
Fortran, the basics of the Message Passing Interface (MPI) have since been adapted
to Python as well. Here we follow that latter path; more specifically, we discuss the
mpi4py implementation developed by Lisandro Dalcin from Argentina.
For background and further reading, the tutorials mpi4py.readthedocs.io on
the official website can be recommended. Another good source can be found on
materials.jeremybejarano.com/MPIwithPython.
Installation
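The details depend on the platform; a typical minimal route (a sketch, assuming that
an MPI implementation such as MPICH or Open MPI is already available, e.g. from
the operating system's package manager) is to install the Python bindings with pip:
$ pip install mpi4py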
Testing
To test the installation, we consider a file hello.py that contains only the single line
print("Hello World"). Assuming that the file is stored in the user’s home direc-
tory, where the Python interpreter can find it directly, we can run the program in the
Terminal window as follows:
$ python hello.py
Hello World
What is new now is that we can run multiple copies (in this case 2) of the program
in parallel in our just installed MPI system by using the command mpiexec with the
option ‘-n 2’:
$ mpiexec -n 2 python hello.py
Hello World
Hello World
In fact, mpiexec launches two independent instances of the Python interpreter, each of
which executes the hello.py program and forwards the output to the shared console
window.
Remark It should be noted, however, that the system is not required to distribute
execution among different processors. The entire development can be illustrated on
a single-processor computer and will also work there. Of course, the real perfor-
mance gain in parallel processing can only be achieved with multiprocessor systems.
As mentioned earlier, this is true for most modern computers today.
Unfortunately, the convenient Spyder programming environment that we used in the
previous Python chapters only allows the simultaneous execution of one interpreter
instance.
In the following, all our MPI programs are run from the Terminal command line in
the form
$ mpiexec -n number-of-processes python file.py
12.2 Communicating Processes
The main idea is that multiple processes should be enabled to interact. The two
Python processes above know nothing of each other.
This changes if we tell each process that it is part of a larger whole.
For this (and much more) we need the Python package mpi4py and there in particular
the class object MPI.COMM_WORLD, which defines a context of interprocess communi-
cation. It contains the essential attributes and methods we will use in the following,
e.g. the variable ‘size’, which stores the number of processes involved in our commu-
nication world, and a number ‘rank’ for each to distinguish the individual processes.
Example 12.1 A first example is the following program:
1 from mpi4py import MPI
2 comm = MPI.COMM_WORLD
3 print("Hello World from process", comm.rank, "of", comm.size)
In line 1, we import the basic module MPI. The communication context is controlled
by the object MPI.COMM_WORLD, which contains the variables size and rank and most
of the methods to be introduced later.
In line 2, we introduce the common abbreviation comm for the communicator
object MPI.COMM_WORLD and use it in line 3 to represent the number of processes
comm.size in the communication system and assign each a specific index number
comm.rank from 0 to comm.size-1.
We assume that the program is stored in a file hello_from.py. If we run the program
for n = 2 processes, we get
$ mpiexec ‐n 2 python hello_from.py
Hello World from process 1 of 2
Hello World from process 0 of 2
Note that the rank numbers do not determine any execution order. Rather, the order
may change between different program runs.
Now it is time to establish communication between processes. Here send and recv
are the most basic – and most important – operations.
Example 12.2 We assume the following code to be stored in a file hello_to.py:
1 from mpi4py import MPI
2 comm = MPI.COMM_WORLD
3 if comm.rank == 0:
4 comm.send("Hello World", dest=1)
5 if comm.rank == 1:
6 msg = comm.recv(source=0)
7 print("Message received:", msg)
Note that the program contains separate pieces of code. Lines 1 and 2 are executed
by all processes, line 4 only in process 0, and the block of lines 6 and 7 in process 1.
Which block refers to which process is controlled by the if-conditions in lines 3 and 5.
This is the standard method for assigning different process behavior in MPI systems.
The operation send in line 4 takes two arguments, the data to send and the desti-
nation to send it to.
In line 6, process 1 waits for data from process 0, and when it arrives, recv stores
it in the variable msg, which is then output to the screen in line 7.
A run for 𝑛 = 2 processes results in
$ mpiexec ‐n 2 python hello_to.py
Message received: Hello World
Deadlocks
It is obvious that a receive operation must wait for the associated send operation.
However, this strict synchronization order between sending and receiving is a con-
stant source of runtime errors in parallel programming. Suppose the following pro-
gram is stored in a file dl.py:
1 from mpi4py import MPI
2 comm = MPI.COMM_WORLD
3 msg = comm.recv(source = comm.size - 1 - comm.rank)
4 comm.send(22, dest = comm.size - 1 - comm.rank)
5 print(msg)
Then, in line 3, each of the two processes waits for the other one to send some data
before it can itself proceed to send its own message in line 4. The program hangs
and must be forced to abort with the key combination “Ctrl-C”.
Such a situation is called a deadlock.
When lines 3 and 4 are exchanged, the program can continue with line 5 and print
the message 22.
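For two processes, the repaired ordering can be written as follows (a sketch; it relies
on MPI buffering the small outgoing message, so that send returns before the matching
recv is posted):
from mpi4py import MPI
comm = MPI.COMM_WORLD
partner = comm.size - 1 - comm.rank  # the respective other process
comm.send(22, dest=partner)          # send first ...
msg = comm.recv(source=partner)      # ... then receive
print(msg)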
Notation Convention
In the examples above we have illustrated the usual convention of writing comm as an
abbreviation for MPI.COMM_WORLD. In the following we decide to use an even shorter
notation.
We print MPI.COMM_WORLD variables and methods like rank, size, send and recv
in bold to indicate that they have been provided according to the pattern
rank = MPI.COMM_WORLD.rank
12.2 Communicating Processes 275
Moreover, we will always assume that the module MPI is loaded in the form
from mpi4py import MPI
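Collected in one place, a possible preamble for the following examples thus looks
like this (the exact selection of abbreviations varies from example to example):
from mpi4py import MPI
comm = MPI.COMM_WORLD
rank, size = comm.rank, comm.size                # process index and count
send, recv = comm.send, comm.recv                # point-to-point operations
bcast, reduce = comm.bcast, comm.reduce          # collective operations
Scatter, Scatterv = comm.Scatter, comm.Scatterv  # array-based distribution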
Summation
As a first “real” application, we consider an MPI program to compute the sum
∑_{𝑖=1}^{𝑛} 𝑖.
Example 12.3 The program begins with a preamble that imports the MPI module
and declares the variable and method abbreviations.
We then fix the upper limit 𝑛 as
1 n = 100
We assume that the number ‘size’ of processes evenly divides the number 𝑛 of sum-
mands. If not, the program should abort with an error message:
2 assert n % size == 0 # prog. aborts if condition not satisfied
Each process determines the bounds of its own segment and locally computes the
sum over it:
3 n_loc = n//size # summands per process
4 k = rank*n_loc + 1; l = k + n_loc # segment bounds
5 sm_loc = sum(range(k,l))
For illustration, let us assume size = 4. For the process with, say, rank = 2, we then
get 𝑘 = 51, 𝑙 = 76. Since in Python the upper bound 𝑙 is not included in the range, in
line 5 the numbers 51, 52, … , 75 are added using the Python function sum and stored
as the contribution of process 2 to the final result.
The root process 0 collects the partial sums from the other processes and adds
them to its own contribution:
6 if rank != 0: send(sm_loc, dest=0) # send from other processes
7 else: # in root
8     sm = sm_loc # computed by root process
9     for i in range(1, size):
10        sm += recv(source=i) # add received contributions
11 if rank == 0: print(sm)
In line 11, we follow the good convention of having only the root process print results.
However, during program development it may be useful to insert print commands at
arbitrary critical points.
Assume that the program is stored in a file sum.py. It can then be distributed among
any number of processes that evenly divides the internal variable 𝑛 = 100, e.g. 4 in
this case:
$ mpiexec ‐n 4 python sum.py
5050
Both send and recv operate on a point-to-point basis: send sends a message to a spe-
cific destination, recv is prepared to receive a message only from a specified source.
However, MPI also provides several collective communication operators that a
process can use, for example, to collect messages from a group of processes or to
broadcast a message to all processes.
In the summation program above, the entire collection step in lines 6–10 can be
replaced by the single command
sm = reduce(sm_loc)
This collects the sm_loc values from all processes, then “reduces” them to a
representation as a single value, namely their sum, and then assigns the sum to the
variable sm in the root process.
In the other processes, the reduce command assigns a special Python value None
to sm. This can be checked by deleting the clause ‘if rank == 0:’ before the
print(sm) command in line 11.
In fact, reduce is even much more flexible. Calculating the sum is only the default
option, with the meaning ‘reduce(., op=MPI.SUM)’. There are several other options
available, such as MPI.PROD or MPI.MAX with their obvious meanings.
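For instance, continuing the summation example, the largest local contribution could
be determined as follows (a sketch using the abbreviations introduced above):
mx = reduce(sm_loc, op=MPI.MAX)  # maximum instead of sum
if rank == 0: print(mx)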
In line 5, the quotient is calculated in process 0 and broadcast to the local variable q
of each process in line 6. Note that the command is executed in each process. Therefore,
the variable q_loc must be declared everywhere. In Python, a declaration without
initialization is not possible, so q_loc must be assigned some (arbitrary) value. This
happens in line 1.
Note that it is quite possible and even very common to use the same name for the
two variables q and q_loc.
Load Balancing
In our summation example, we have assumed that the number of processes evenly
divides the number of summands, so that the workload assigned to the processes is
equal. If this is not the case, some action must be taken.
A common way to deal with the general case is to distribute the remaining entries
one by one to the first processes.
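In code, the segment sizes can be computed in a single line; a minimal sketch (the
values of n and p are hypothetical):
n, p = 10, 4   # entries and processes
sizes = [n//p + 1 if i < n % p else n//p for i in range(p)]
print(sizes)   # [3, 3, 2, 2]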
12.3 Integral Approximation
Let 𝑓∶ [𝑎, 𝑏] → ℝ. To approximate the integral of 𝑓, we divide the interval [𝑎, 𝑏] into
𝑛 subintervals of equal length ℎ and sum the areas ℎ𝑓(𝑥_𝑖), where 𝑥_𝑖 is the midpoint
of the subinterval number 𝑖 = 0, 1, … , 𝑛 − 1.
Formally:
∫ₐᵇ 𝑓(𝑥) 𝑑𝑥 ≈ ℎ ∑_{𝑖=0}^{𝑛−1} 𝑓(𝑎 + ℎ(𝑖 + 1/2)), where ℎ = (𝑏 − 𝑎)/𝑛.
In our parallel program, we want to distribute the summation to the involved pro-
cesses. Note that the summands are independent, so the partition between them is
arbitrary. For a given number 𝑛 of subintervals and 𝑠 processes, we choose to let pro-
cess number 𝑟 < 𝑠 take care of the indices 𝑖 < 𝑛 of the form 𝑖 = 𝑟 + 𝑚𝑠, 𝑚 = 0, 1, … .
For example, for 𝑛 = 10 and 𝑠 = 4, this means that process 0 is responsible for
three indices 𝑖 = 0, 4, 8, process 1 for 𝑖 = 1, 5, 9, process 2 for 𝑖 = 2, 6, and process 3
for the remaining 𝑖 = 3, 7.
Note that the distribution of subintervals is not contiguous. The advantage of this
approach is that load balancing is performed automatically in the manner described
above, and the entries remaining after even distribution are assigned to the first pro-
cesses one by one.
Example 12.7 Assuming the usual preamble is already loaded, here is the MPI
program.
We first fix a test function
1 f = lambda x: 4/(1 + x*x)
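The interval and step size must also be fixed; a minimal sketch of this setup
(the interval [0, 1] and n = 1000 are assumptions not fixed by the surrounding text):
a = 0.0; b = 1.0   # integration interval
n = 1000           # number of subintervals
h = (b - a)/n      # length of each subinterval
With this test function and interval the exact value is ∫₀¹ 4/(1 + 𝑥²) 𝑑𝑥 = 𝜋, so the
printed result can easily be checked.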
Then each process computes its contribution according to the pattern explained
above:
8 add = 0.0
9 for i in range(rank, n, size):
10     x = a + h*(i + 0.5)
11     add += f(x)
12 int_loc = h*add
The local contributions are summed up and the result is printed by the root process:
13 int = reduce(int_loc)
14 if rank == 0: print(int)
The trapezoidal rule is a more efficient approximation technique for definite integrals
of the form ∫ₐᵇ 𝑓(𝑥) 𝑑𝑥.
Suppose a grid over the interval [𝑎, 𝑏] is given by the (𝑛 + 1) equidistant points
𝑥_𝑖 ∶= 𝑎 + 𝑖ℎ with ℎ = (𝑏 − 𝑎)/𝑛, for 𝑖 = 0, … , 𝑛.
Let
𝐼(𝑘, 𝑙) ∶= (ℎ/2) (𝑓(𝑥_𝑘) + 2 ∑_{𝑖=𝑘+1}^{𝑙−1} 𝑓(𝑥_𝑖) + 𝑓(𝑥_𝑙)), 0 ≤ 𝑘 < 𝑙 ≤ 𝑛.
Then 𝐼(0, 𝑛) is the trapezoidal approximation of the integral of 𝑓 over [𝑎, 𝑏].
Now, the crucial observation is that the evaluation of 𝐼 can be distributed to parallel
processes, since always
𝐼(𝑘, 𝑙) = 𝐼(𝑘, 𝑚) + 𝐼(𝑚, 𝑙) for 𝑘 < 𝑚 < 𝑙.
We then define an example function 𝑓, an interval [𝑎, 𝑏] over which to integrate, and
the number of equidistant grid points.
We use the same test function as before:
1 def f(x): return 4/(1 + x*x)
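The grid setup itself is not shown; a minimal sketch matching the segment computation
below (interval [0, 1] and n = 1000 are assumptions, and as before the number of
processes must divide n):
from numpy import linspace
a = 0.0; b = 1.0
n = 1000
assert n % size == 0       # size must divide n
h = (b - a)/n
x = linspace(a, b, n + 1)  # the grid points x_0, ..., x_n
q = n//size                # subintervals per process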
Each process determines its segment and computes the partial integral:
11 k = rank*q; l = k + q
12 int_loc = h/2*(f(x[k]) + 2*sum(f(x[k+1:l])) + f(x[l]))
Note that ‘sum’ here again denotes the standard Python function for adding elements
in a collection.
As usual, the partial results are summed by root and then printed:
13 int = reduce(int_loc)
14 if rank == 0: print(int) # printing controlled by root
12.4 Vector Dot Product
We first consider the special case where all vector segments can be equally distributed
among the processing units. Then we turn to the general case and show how the
segments can be distributed in an economical way.
We develop an MPI program for distributed dot multiplication of vectors, where the
length of the involved vectors is evenly divisible by the number of processes.
More complex communication in mpi4py is based on NumPy arrays, so we import
the needed components:
1 from numpy import array, zeros
The root process initializes two example vectors 𝑣, 𝑤 and prepares for the distribution
to all processes.
As an example, let us take 𝑣 = (1, 2, … , 𝑛 − 1, 𝑛) and 𝑤 = (𝑛, 𝑛 − 1, … , 2, 1):
2 v = w = n_loc = None # vars must be known in all processes
3 if rank == 0:
4     n = 12 # number of processes must divide n ...
5     assert n % size == 0 # ... if not: abort
6     v = array([float(i) for i in range(1, n+1)])
7     w = array([float(i) for i in range(n, 0, -1)])
8     n_loc = n//size # division with remainder 0
In lines 6 and 7, note that only float values can be used in array-based communica-
tion, so we convert the entries accordingly.
The number n_loc calculated in line 8 is broadcast to all processes:
9 n_loc = bcast(n_loc)
Each process prepares (initially empty) buffers of that size n_loc for the storage of
their segment of the vectors 𝑣 and 𝑤:
10 v_loc = zeros(n_loc); w_loc = zeros(n_loc)
The operator Scatter splits both vectors into size many pieces of equal length
n_loc and stores the subvectors in the individual buffers v_loc and w_loc:
11 Scatter(v, v_loc); Scatter(w, w_loc) # capital S, see below
Each process computes the dot product of its local segments:
12 dot_loc = v_loc @ w_loc
and the partial results are then “reduced” to the global result dot in the root process:
13 dot = reduce(dot_loc)
In mpi4py, basic message passing uses methods from the pickle module, discussed
in Sect. 3.8 in the Python chapter, more specifically the dump and load functions.
This allows for very convenient high-level communication.
However, for large data sets, this approach can be slow. In particular, for the com-
munication of arrays, more fine-tuned operations may be more efficient.
The operators can generally be distinguished by their initial letter. pickle com-
munication of Python objects uses operators with lower case initial letters, such as
send and recv, while the corresponding array-based operators are written with cap-
ital initial letters, such as Send and Recv.
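As a minimal illustration of the difference (a sketch; the example data is hypothetical),
the array-based pair requires a preallocated NumPy buffer on the receiving side:
from mpi4py import MPI
from numpy import arange, empty
comm = MPI.COMM_WORLD
if comm.rank == 0:
    data = arange(5.0)           # float array to transmit
    comm.Send(data, dest=1)      # capital S: array-based send
if comm.rank == 1:
    buf = empty(5)               # preallocated receive buffer
    comm.Recv(buf, source=0)     # fills buf in place
    print(buf)                   # [0. 1. 2. 3. 4.]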
In the example above, we used the array-based function Scatter to distribute the
vectors.
It is of course unsatisfactory that so far we can only process vectors whose length 𝑛
is an integer multiple of the number 𝑝 of processes.
As mentioned earlier, a common way to handle the general case is to distribute the
remaining entries one by one to the first processes, so that, for example, for 𝑛 = 10,
𝑝 = 4, processes 0 and 1 each receive 3 entries and the remaining processes 2 and 3
receive only 2. Note that this is exactly the distribution in the integral approximation
in Sect. 12.3.
We begin with the program right away:
1 from numpy import array, zeros
The root process defines example vectors as before and prepares for a balanced dis-
tribution:
2 v = w = sendcts = None
3 if rank == 0:
4     n = 10
5     v = array([float(i) for i in range(1, n+1)])
6     w = array([float(i) for i in range(n, 0, -1)])
7     n_loc = n//size # even divisibility not required
8     sendcts = [n_loc for i in range(size)]
9     for i in range(n % size): sendcts[i] += 1
In the general case, we cannot assume that the number n_loc uniformly describes
the size of the vector segments. Instead, in line 8 we collect the individual sizes into an
integer list sendcts, which is first initialized with the entry n_loc for all processes.
With 𝑝 processes, there are still 𝑟 ∶= 𝑛 mod 𝑝 entries to be distributed. Line 9 prepares
for this by increasing the segment size by 1 for each of the processes 0, 1, … , 𝑟 − 1.
The sendcts list, which now contains the segment sizes for all processes, is broadcast:
10 sendcts = bcast(sendcts)
Each process ‘rank’ then provides appropriately sized local buffers for storing its sub-
vectors:
11 v_loc = zeros(sendcts[rank]); w_loc = zeros(sendcts[rank])
In the previous example we used Scatter to distribute the vectors. Here things are
more complicated, because the splitting points are not determined by a constant off-
set number n_loc, but by the entries in the sendcts list.
This time we have to use a generalized scatter function Scatterv, which however as
first argument requires significantly more information than Scatter:
12 Scatterv([v, sendcts, MPI.DOUBLE], v_loc)
13 Scatterv([w, sendcts, MPI.DOUBLE], w_loc)
The last lines can then be copied from the previous example:
14 dot_loc = v_loc @ w_loc
15 dot = reduce(dot_loc)
16 if rank == 0: print(dot) # Out: 220.0
12.5 Laplace Equation
In all examples so far, communication between processes was limited to the initial
distribution of data and the final recombination of the partial results.
In general, however, nested communication will also be required during a
computation. As an example, we discuss the solution of Laplace equations, a special
class of partial differential equations, by the method of finite differences. For
background information we refer to Sect. 4.8 in the SciPy chapter.
Recall that a Laplace equation is a special Poisson equation Δ𝑢 = 𝑓 with 𝑓 ≡ 0. As
an example we consider the following PDE:
Determine a function 𝑢 ∶ Ω → ℝ, Ω ∶= [0, 1] × [0, 1], such that
(1)  Δ𝑢 = 0 in Ω,  𝑢 = 𝑔 on the boundary 𝜕Ω,
with the boundary function
(2)  𝑔(𝑥, 𝑦) ∶= 𝑥(1 − 𝑥) if 𝑦 = 0, and 𝑔(𝑥, 𝑦) ∶= 0 else.
For our numerical solution we again need a discretization, that of the stencil equation
(4) in Sect. 4.8 in the SciPy chapter. Since in the present case the right-hand side is 0,
after some manipulations we obtain the equivalent formulation
(3)  𝑢_{𝑖𝑗} = (1/4)(𝑢_{𝑖−1,𝑗} + 𝑢_{𝑖,𝑗−1} + 𝑢_{𝑖+1,𝑗} + 𝑢_{𝑖,𝑗+1}),  𝑖, 𝑗 = 1, … , 𝑛.
The crucial observation now is that we can see (3) as a fixed-point solution of the
iteration sequence
(4)  𝑢_{𝑖𝑗}^{(𝑘+1)} ∶= (1/4)(𝑢_{𝑖−1,𝑗}^{(𝑘)} + 𝑢_{𝑖,𝑗−1}^{(𝑘)} + 𝑢_{𝑖+1,𝑗}^{(𝑘)} + 𝑢_{𝑖,𝑗+1}^{(𝑘)}),  𝑖, 𝑗 = 1, … , 𝑛, 𝑘 ≥ 0.
The discretization of the boundary condition provides the starting values for the
iteration:
(5)  𝑢_{𝑖𝑗}^{(0)} ∶= 𝑥_𝑗 (1 − 𝑥_𝑗) with 𝑥_𝑗 ∶= 𝑗/(𝑛 + 1) if 𝑖 = 0, and 𝑢_{𝑖𝑗}^{(0)} ∶= 0 else,  𝑖, 𝑗 = 0, … , 𝑛 + 1.
Distributed Iteration
We show how to distribute the iteration over two processes. Here we assume 𝑛 to be
an even number.
The task is split vertically at the 𝑥-index 𝑚 = 𝑛/2, so that one process 𝑃left computes
the 𝑢_{𝑖𝑗}^{(𝑘+1)} for 𝑗 ≤ 𝑚 and another 𝑃right computes those for 𝑗 ≥ 𝑚 + 1.
We follow the “𝑘 → 𝑘 + 1” iteration step for 𝑃left. The required values from step 𝑘
are supplied by 𝑃left itself, except for the “right-boundary” cases 𝑢_{𝑖,𝑚}^{(𝑘+1)}, which
also require the values 𝑢_{𝑖,𝑚+1}^{(𝑘)}. These are provided by 𝑃right.
Conversely, 𝑃right requires the 𝑢_{𝑖,𝑚}^{(𝑘)} from 𝑃left to compute the 𝑢_{𝑖,𝑚+1}^{(𝑘+1)}.
In MPI jargon, the additional boundary points provided by the neighbor are called
“ghost points”.
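The beginning of the program (lines 1-6) is not shown above; a minimal sketch of the
setup, executed in the root process (the grid size n = 40 is a placeholder, the boundary
values follow (5)):
from numpy import zeros, arange
if rank == 0:
    n = 40                      # grid points per direction, assumed even
    m = n//2                    # vertical split index m = n/2
    u = zeros((n + 2, n + 2))   # iteration matrix including boundary
    xs = arange(n + 2)/(n + 1)  # coordinates x_j = j/(n + 1)
    u[0, :] = xs*(1 - xs)       # starting values (5) on the edge i = 0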
The matrix u is then split vertically and the right half is sent to the helper process 1:
7     ul = u[:, :m] # u evenly divided in left ...
8     ur = u[:, m:] # ... and right part
9     send(ur, dest=1) # ... sent to helper
10 if rank == 1: ur = recv(source=0) # ... and received
The Python function below is used to compute the iteration steps for the inner points
according to the sequence definition (4) above:
11 def stencil_it(u):
12     u_it = u.copy() # copy avoids side effects
13     r, c = u.shape # number of rows, columns
14     for i in range(1, r-1): # inner rows
15         for j in range(1, c-1): # inner columns
16             u_it[i,j] = (u[i+1,j] + u[i-1,j]
17                          + u[i,j+1] + u[i,j-1]) / 4
18     return u_it
We come to the central part of the program, the iterative approximation, divided
between root and helper process. Note that the root process, in addition to the orga-
nizational preparatory work, also performs the left part of the computation.
We need two more NumPy functions to add and delete the columns of ghost points
needed for the computation:
19 from numpy import column_stack, delete    # needed in both processes
We fix the number of iterations and start the loop. Note that the loop body consists
of lines 22 up to 33.
20 iter = 40
21 for i in range(iter): # central iteration loop
Root sends the current ghost points in the last column of ul to the helper, where they are received in line 30. The root block is opened in line 22 with 'if rank == 0:', mirroring the helper block in line 28:
22 if rank == 0:
23     send(ul[:, -1], dest=1)        # '-1' denotes last column
Conversely, root receives the column of ghost points sent by the helper in line 29 and
appends it to its own matrix part:
24     v = recv(source=1)
25     ul_v = column_stack([ul, v])   # append right boundary v
Now the inner grid points can be calculated for the next iteration, and the ghost
points are then discarded:
26     ul_v = stencil_it(ul_v)        # update inner values
27     ul = delete(ul_v, -1, 1)       # delete v
The second argument ‘-1’ in the delete function stands for ‘last’, the third ‘1’ for
‘column’.
We shift focus to the helper process. The procedure is essentially the same as in root,
except that it is mirrored:
28 if rank == 1:
29     send(ur[:, 0], dest=0)         # sends leftmost column
30     v = recv(source=0)
31     v_ur = column_stack([v, ur])   # insert left boundary v
32     v_ur = stencil_it(v_ur)
33     ur = delete(v_ur, 0, 1)        # delete v, 0 = leftmost
Here the second argument ‘0’ in ‘delete’ stands for ‘first’, the third argument ‘1’ again
stands for ‘column’.
This completes the iterative calculation in the for loop started in line 21.
When the number of iterations is reached, the calculation is complete, and process 1 returns its partial result ur to the root process:
34 if rank == 1: send(ur, dest=0)
35 if rank == 0:
36     ur = recv(source=1)
37     u = column_stack([ul, ur])
Fig. 12.1 Approximation to Laplace equation (1) in Sect. 12.5, with boundary function 𝑔 in (2).
Computation distributed to 2 processes
In Sect. 4.2 in the SciPy chapter, we developed the conjugate gradient method for
the solution of linear equation systems 𝐴𝑥 = 𝑏 for symmetric and positive definite
matrices 𝐴.
In the following, we show how the computation in the sequential program in SciPy
Example 4.3 can be distributed to multiple processes.
Analysis of the sequential program shows that by far the largest computational effort lies in the matrix-vector products; hence, in parallelization, the main focus must be on speeding up matrix-vector multiplication.
The Program
As usual, the program begins with our mpi4py declarations. Then we import the
NumPy components:
1 from numpy import array, zeros, sqrt
Next, we introduce some variables that are set in the root process. As mentioned
before, they have to be declared globally in order to access them in the other processes
as well. And since in Python a declaration without initialization is not possible, they
get the special value None:
2 A = n = p = None
Initial values are then set as in the sequential program, except that the commands are
executed within the root process:
3 if rank == 0:
4     A = array([[ 9.,   3., -6.,  12.], [ 3.,  26., -7., -11.],
5                [-6.,  -7.,  9.,   7.], [12., -11.,  7.,  65.]])
6     b = array([18., 11., 3., 73.])
7     n = len(b)
8     x = zeros(n)
9     r = b.copy(); p = r.copy()
10    rs_old = r @ r
We partition 𝐴 into horizontal slices, declare local matrices to store each submatrix,
and distribute 𝐴 among the involved processes:
11 n = bcast(n)
12 n_loc = n // size
13 A_loc = zeros((n_loc, n))
14 Scatter(A, A_loc)
This distribution is required only once before the actual computation begins.
The gradient descent is performed in the following for-loop in lines 15–27.
Each step begins with the distributed computation of the vector 𝐴𝑝:
15 for i in range(n):      # n = max iterations to exact result
16     p = bcast(p)        # current p distributed
17     Ap_loc = A_loc @ p  # local matrix vector product
18     Ap = zeros(n)       # buffer needed to ...
19     Gather(Ap_loc, Ap)  # ... gather local results in root
In line 16, the current value of the search-direction vector 𝑝 is broadcast to all pro-
cesses. Each process computes its segment 𝐴loc 𝑝 of the product 𝐴𝑝 in line 17. In
line 19, the segments are then collected in the root variable Ap.
The Gather operator is basically the converse of Scatter. It merges the local partial
solutions into one overall solution in the root process, respecting the distribution
order set by Scatter in line 14.
The remaining part of the iteration step is then performed by root:
20     if rank == 0:
21         alpha = rs_old / (p @ Ap)
22         x += alpha*p
23         r -= alpha*Ap
24         rs_new = r @ r
25         if sqrt(rs_new) < 1e-10: break
26         p = r + (rs_new / rs_old)*p
27         rs_old = rs_new            # end for loop
In summary, the only part that is executed in parallel is the calculation of the next
iteration values 𝐴𝑝 in lines 16–19.
Exercises
Matrix Operations
Exercise 12.1 Write an mpi4py program to multiply two matrices 𝐴, 𝐵. Split the ma-
trix 𝐵 vertically into two slices 𝐵1 and 𝐵2 , distribute the computation of 𝐴𝐵1 and 𝐴𝐵2
to two processes and finally reassemble the partial products and print the resulting
product matrix.
Test the program for the Hilbert matrix of order 4 and its inverse. The matrix
constructors can be imported from scipy.linalg as hilbert and invhilbert.
Exercise 12.2 Extend the program so that additionally 𝐴 is also divided into two parts 𝐴₁ and 𝐴₂, now however horizontally. Distribute the computation of the partial products 𝐴ᵢ𝐵ⱼ to four processes.
The following exercise illustrates the use of the alltoall operator, which combines
collection and distribution of data.
Assume that in a communication world, each of the ‘size’ many processes in-
volved has a send buffer, say sendbuf, and a receive buffer, say recvbuf, with exactly
‘size’ data objects. The alltoall operator takes the 𝑖-th object from the sendbuf of
process 𝑗 and copies it into the 𝑗-th object of the recvbuf of process 𝑖.
The operation can be thought of as a transpose of the matrix with processes as
columns and data objects as rows.
The syntax of the alltoall method is
recvbuf = alltoall(sendbuf)
As with all ‘lowercase’ operators, the data objects can be of any Python type, as long
as the entries in sendbuf conform to those in recvbuf.
Exercise 12.4 Write an mpi4py program sieve.py to find all prime numbers in the
set 𝐿 = {2, … , 𝑛}, according to the following idea:
The set 𝐿 is divided into parts 𝐿𝑖 and distributed among all processes 𝑖. Each pro-
cess 𝑖 looks for its local minimum 𝑝𝑖 from its subset 𝐿𝑖 and sends it to the root process.
The root process determines the global minimum 𝑝 = min 𝑝𝑖 and broadcasts it to
all processes, which then delete all multiples of 𝑝 from their subsets 𝐿𝑖 . This step is
repeated as long as $p^2 \le n$.
Now each subset 𝐿𝑖 consists only of prime numbers.
In the end, the largest prime number found should be output to the console.
Test the program for 𝑛 = 100, distributed to 4 processes.
Chapter 13
Parallel Computing in C/C++
MPI was originally developed for C, C++ and Fortran. In practice, these platforms
are still the most widespread in parallel programming. Based on the discussion in
the last chapter, we briefly outline how to implement message passing in C/C++. We
focus mainly on C, since message passing relies on elementary C concepts such as
pointers and arrays.
We then turn to another main paradigm in distributed computing, the shared
memory approach.
Installation
In the following we look at the widely used Open MPI implementation (not to be
confused with the shared memory module OpenMP, discussed later).
Currently, there does not seem to be a satisfactory way to develop Open MPI pro-
grams in standard integrated development environments such as Eclipse or Xcode.
In the following, we will rely on the Ubuntu Linux operating system, which was al-
ready mentioned in the chapters on C and C++. More specifically, we will compile
and run our programs in the Ubuntu Terminal application.
Ubuntu can either be installed directly or on a virtual machine within another system.
All examples in this chapter were tested under Ubuntu 20.04 in a virtual environment
in macOS.
Under Ubuntu, Open MPI for C/C++ can be installed with the Terminal command
$ sudo apt-get install openmpi-bin
As a first example, consider the standard introductory program hello.c, which con-
tains the single command printf("Hello World\n"). Note that in the following we
assume that all programs are stored in the user’s home directory, where the compiler
can find them directly.
The source code is compiled to an executable program 'hello' with the MPI-aware compiler mpicc:
$ mpicc hello.c -o hello
In Ubuntu, mpicc is just a wrapper around the standard C compiler gcc that adds some components needed for parallel processing. Details can be obtained by
$ mpicc -showme
The corresponding C++ compiler can be called with mpiCC.
The option '-o hello' means that the compiled program should be output as a file named 'hello'.
It can then be scheduled to 𝑛 = 2 processes, for example, and executed with the
mpiexec command:
$ mpiexec -n 2 hello
Hello World
Hello World
Recall that in Python we had to pass ‘python hello.py’ as an argument to mpiexec,
which launched two instances of the interpreter. Here, ‘hello’ is a ready-to-run
standalone program.
In Python we imported the central MPI module with the instruction ‘from mpi4py
import MPI’. In C/C++ the command ‘#include <mpi.h>’ provides the required
communication components.
In Example 12.7 in the last chapter, we saw that 𝜋 can be approximated by a midpoint
computation of the integral
$\int_0^1 \frac{4}{1+x^2}\,dx.$
In the main function we first initialize the MPI execution environment and the size
and rank values as in Python:
11 int main() {
12     MPI_Init(NULL, NULL);
13     int rank, size;
14     MPI_Comm_size(MPI_COMM_WORLD, &size);
15     MPI_Comm_rank(MPI_COMM_WORLD, &rank);
Note that the MPI_Init function takes two pointer arguments, but their values are not needed, so we can insert the null pointer NULL as a dummy.
The root process then sets the number 𝑛 of interval segments and broadcasts it to
all processes. Here we assume 𝑛 = 10:
16     int n;
17     if (rank == 0) n = 10;
18     MPI_Bcast(&n, 1, MPI_INT, 0, MPI_COMM_WORLD);
The arguments in line 18 are: the address of the buffer array to be broadcast, the
number of entries (here 1, because only the single 𝑛 value is distributed), then the
data-type MPI_INT of the buffer array, followed by the identifier of the sender (here
the root process 0), and finally the so-called communicator, i.e. the communication
context.
Each process computes its partial contribution to the approximation:
19 float pi_loc = partial_pi(n, rank, size);
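The local results are then combined in the root process. Lines 20–22 are identical to the hybrid version repeated in Sect. 13.6 below:
20     float pi;
21     MPI_Reduce(&pi_loc, &pi, 1, MPI_FLOAT,
22                MPI_SUM, 0, MPI_COMM_WORLD);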
In lines 21 and 22, all local values pi_loc are sent to the root process and accumu-
lated in pi. The arguments in position 3 and 4 declare that the send array consists of
one element of type MPI_FLOAT. The next argument MPI_SUM specifies that the values
are to be summed up in pi. The 0 value in the following argument declares that the
receiver is the process with rank 0. The last argument again denotes the communica-
tion context.
The value pi is then printed by the root process:
23 if (rank == 0) printf("%f\n", pi);
Remark This example shows that MPI programming in C is not so different from
Python. The main difference is that the commands are somewhat more verbose, since
in C all arguments must be explicitly specified. The arguments are always identified
by their position, so in particular there is no way to suppress the specification of
default values.
In the Python examples, we made repeated use of the MPI commands Scatter and
Gather. As a simple C example, we apply the corresponding commands in a program
to calculate the average of a sequence of numbers.
Example 13.2 Assume the following program is stored in the file avg.c.
We begin as in the last example:
1 #include <mpi.h>
2 #include <stdio.h>
3 int main() {
4     MPI_Init(NULL, NULL);
5     int size, rank;
6     MPI_Comm_size(MPI_COMM_WORLD, &size);
7     MPI_Comm_rank(MPI_COMM_WORLD, &rank);
We fix an array of length 16:
8 int n = 16;
9 float arr[n];
and let the root process fill it with numbers 0, 1, … , 15:
10 if (rank == 0) for (int i = 0; i < n; i++) arr[i] = i;
Then we prepare for the array to be divided into ‘size’ many parts, where we again
for simplicity assume size to evenly divide 𝑛.
11 int n_loc = n/size;
12 float arr_loc[n_loc];
The input array is scattered evenly to all processes:
13 MPI_Scatter(arr, n_loc, MPI_FLOAT,
14             arr_loc, n_loc, MPI_FLOAT, 0, MPI_COMM_WORLD);
The MPI_Scatter function may look scary, but it is actually only a more verbose
formulation of the one in Python.
Next, the mean values of the numbers within each local process are calculated:
15 float sum_loc = 0.;
16 for (int i = 0; i < n_loc; i++) sum_loc += arr_loc[i];
17 float avg_loc = sum_loc / n_loc;
The individual values avg_loc are then gathered in the array avg_arr and sent to
the root process 0:
18 float avg_arr[size];
19 MPI_Gather(&avg_loc, 1, MPI_FLOAT,
20            avg_arr, 1, MPI_FLOAT, 0, MPI_COMM_WORLD);
The argument value ‘1’ indicates that the transfer uses buffer arrays of length 1.
Finally, the root process computes the overall average from the local averages and
outputs it:
21 if (rank == 0) {
22     float sum = 0.;
23     for (int i = 0; i < size; i++)
24         sum += avg_arr[i];
25     float avg = sum/size;
26     printf("%f\n", avg); }
The following function (sequentially) computes the dot product between two vectors:
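The listing itself is omitted in this excerpt; a minimal sketch consistent with its later use (e.g. dot(n, r, r) in line 16) is:

/* sketch of the sequential dot product helper */
float dot(int n, float v[], float w[]) {
    float sum = 0.;
    for (int i = 0; i < n; i++) sum += v[i]*w[i];
    return sum; }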
In the main function, we first set our example matrix 𝐴, the vector 𝑏, and the initial
value 0 for the solution vector 𝑥:
8 int main() {
9 int n = 4;
10 float A[] = { 9.,  3., -6.,  12.,  3.,  26., -7., -11.,
11              -6., -7.,  9.,   7., 12., -11.,  7.,  65.};
12 float b[] = {18., 11., 3., 73.};
13 float x[] = {0., 0., 0., 0.};
Notice that in lines 10–11 we initialize 𝐴 in a flattened form as a linear vector. There
are two reasons for this. First, in C it is not possible to use the variable 𝑛 from line 9
to declare a matrix in the form ‘A[][n]=’. Second, we want to scatter 𝐴 in contiguous
blocks of rows. For this we need the flattened form anyway, since transfer buffers can
only hold arrays.
Then we initialize the residual vector 𝑟, search-direction 𝑝, and the dot product 𝑟 ⋅ 𝑟:
14 float r[n], p[n];
15 for (int i = 0; i < n; i++) { r[i] = b[i]; p[i] = r[i]; }
16 float rs_old = dot(n,r,r);
We divide 𝐴 into horizontal slices, declare local matrices to store the individual sub-
matrices, and scatter 𝐴 among the processes involved:
21 int n_loc = n/size;
22 float A_loc[n_loc][n];
23 MPI_Scatter(A, n_loc*n, MPI_FLOAT,
24             A_loc, n_loc*n, MPI_FLOAT, 0, MPI_COMM_WORLD);
This distribution is required only once before the actual computation begins.
The iterative approximation is performed in the following for loop from line 25 to 44:
25 for (int it = 0; it < n; it++) { // begin iteration loop
Each step begins with the broadcast of the current value of the search direction vec-
tor 𝑝 to all processes:
26 MPI_Bcast(p, n, MPI_FLOAT , 0, MPI_COMM_WORLD);
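In between (lines 27–32 are not reproduced here), each process computes its local segment of the product; a sketch reusing the dot helper:

float Ap_loc[n_loc];               /* local segment of A*p */
for (int i = 0; i < n_loc; i++)    /* one dot product per local row */
    Ap_loc[i] = dot(n, A_loc[i], p);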
The segments are then gathered together in an array Ap in the root process:
33 float Ap[n];
34 MPI_Gather(Ap_loc, n_loc, MPI_FLOAT,
35            Ap, n_loc, MPI_FLOAT, 0, MPI_COMM_WORLD);
The root process can now set the new values for 𝛼, 𝑥, 𝑟 and 𝑟 ⋅ 𝑟:
36 if (rank == 0) {
37     float alpha = rs_old / dot(n, p, Ap);
38     for (int i = 0; i < n; i++) x[i] += alpha*p[i];
39     for (int i = 0; i < n; i++) r[i] -= alpha*Ap[i];
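The remaining lines 40–44 of the loop body are not reproduced here; a sketch mirroring the Python version in Sect. 12.6 (sqrt requires <math.h>; note that a full program would also have to signal the break to the other processes):

40         float rs_new = dot(n, r, r);
41         if (sqrt(rs_new) < 1e-10) break;    /* convergence test */
42         for (int i = 0; i < n; i++)
43             p[i] = r[i] + (rs_new/rs_old)*p[i];
44         rs_old = rs_new; } }                /* end rank-0 block and loop */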
When the loop is finished, root outputs the result and the MPI session is finalized:
45 if (rank == 0) {
46     for (int i = 0; i < n; i++) printf("%f ", x[i]);
47     printf("\n"); }
48 MPI_Finalize(); }
Hello World
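The program hellomp.c itself is not reproduced in this excerpt; a reconstruction consistent with the line references below (the exact print format is an assumption):

1 #include <stdio.h>
2 #include <omp.h>
3 int main() {
4     int np = omp_get_num_procs();
5     #pragma omp parallel
6     { printf("Hello World from thread %d of %d\n",
7              omp_get_thread_num(), np); } }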
Line 2 provides the OpenMP environment variables and functions used in the pro-
gram.
In line 4, the function omp_get_num_procs returns the number of processors
available in the system.
The #pragma directive in line 5 is the method specified by the C standard for pro-
viding additional information to the compiler, beyond what is conveyed in the lan-
guage itself. Here the pragma ‘omp parallel’ tells the compiler that the commands
in the following block consisting of lines 6 and 7 should be executed in parallel by all
threads, which by default means one thread per processor.
The default value can be changed by num_threads(n) for the desired number 𝑛
of threads.
To compile the program under Ubuntu Linux, assuming that the file is stored in the
home directory, we enter the command line
$ gcc -fopenmp hellomp.c -o hellomp
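Running the program then produces one response per thread; an illustrative sketch for an 8-core machine (the actual order is nondeterministic):

$ ./hellomp
Hello World from thread 2 of 8
Hello World from thread 0 of 8
...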
The ellipsis represents the six remaining responses from the other threads.
Note the similarity between threads in OpenMP and processes in MPI. In fact, we
can also program in “message-passing style” in OpenMP. As a simple example, we
show how to implement a broadcast operation.
Example 13.5 The following program is intended to broadcast the value 42 from
thread number 0 to all threads:
1 #include <stdio.h>
2 #include <omp.h>
3 int main() {
4 int ans = 0;
5 #pragma omp parallel
6 { if (omp_get_thread_num() == 0) ans = 42;
7 printf("%d ", ans); }
8 printf("\n"); }
The code block in lines 6 and 7 is executed by all threads concurrently. The only
effect of the statement in line 6 is that thread 0 assigns the value 42 to the shared
variable ans. Line 7 prints the current value of ans as seen by each thread. Line 8
is outside the parallel block, so the line-break does not occur until the end of the
program.
The result is however not entirely satisfactory. It may randomly change between
program executions, a typical output being
0 0 42 0 0 42 42 0
The reason for this is again that the block is executed concurrently. It is quite possible
that, for example, thread 1 executes line 7 before thread 0 changes the value of ans.
In this case, thread 1 returns the value 0. However, if thread 1 executes line 7 after
the value change, the expected value 42 is output.
If we insert the directive '#pragma omp barrier' between lines 6 and 7 in the last example, then each thread suspends its execution until all others have also processed line 6, i.e. in particular until thread 0 has updated ans, and only then resumes the execution of line 7. Then the result will be the desired one:
42 42 42 42 42 42 42 42
A good candidate for parallel execution is a suitably structured for loop. For example,
it is obvious that in a vector addition like
for (int i = 1; i <= n; i++) c[i] = a[i] + b[i];
the component additions are independent of each other and can therefore be exe-
cuted in parallel.
For the distribution of a for loop to separate threads, OpenMP provides the directive
#pragma omp for
In combination with the ‘omp parallel’ directive the program snippet for starting
a parallel vector addition looks like this:
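The original listing is not reproduced in this excerpt; a sketch consistent with the description below, with the enclosing braces on lines 2 and 5:

1 #pragma omp parallel
2 {    /* reconstructed sketch; block to be executed by all threads */
3 #pragma omp for
4 for (int i = 1; i <= 100; i++) c[i] = a[i] + b[i];
5 }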
Recall that the block of code to be computed in parallel is enclosed in curly braces as
usual (lines 2 and 5). Again, if the block consists of a single statement, the braces can
be omitted.
The combination of the two directives occurs so frequently that an abbreviated form
is provided:
#pragma omp parallel for
for (int i = 1; i <= 100; i++) c[i] = a[i] + b[i];
It is important to note that the compiler does not check whether it is safe to parallelize
the loop. Careless use can lead to quite subtle errors, as the following example shows.
Example 13.7 Suppose we want to sum the first 100 natural numbers. In a sequential
program we could do this with the following code snippet inside the main function:
int sum = 0;
for (int i = 1; i <= 100; i++) sum += i;
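The parallel version in formp.c embeds this loop in OpenMP directives. Its listing is not reproduced in this excerpt; a sketch consistent with the later reference to its line 6 (which carries the bare for-directive) might read:

1 #include <stdio.h>
2 int main() {
3     int sum = 0;
4     #pragma omp parallel
5     {
6     #pragma omp for    /* sketch; line 6 is referred to below */
7     for (int i = 1; i <= 100; i++) sum += i;
8     }
9     printf("%d\n", sum); }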
If we save the program in a file formp.c, compile it with the option '-fopenmp' and run it, the result will in most cases be the expected one:
$ ./formp
5050
However, the correct result is not guaranteed. It may happen, for example, that
$ ./formp
4983
How is such a result to be explained? It is due to the most intricate and subtle error source in shared memory computation, the so-called race condition.
In Example 13.5 above, we saw that the execution order between statements in inde-
pendent threads can lead to undesirable random results. This is exactly what happens
at a more fundamental micro level in Example 13.7 as well.
To see this, let's consider a simplified version. Let sum = 0, and suppose thread 0 increments sum by 1 while thread 1 increments it by 2.
Recall that an assignment is a three-part sequence: (1) accessing the old value, (2) processing that value, and finally (3) storing the new value.
Since the threads are running concurrently, it cannot be ruled out that the se-
quences are interleaved, with both threads first reading the value 0, then increment-
ing the value to 1 or 2, and finally writing it back to sum. Depending on which of the
threads last wrote its computed value to the sum, the result is either 1 or 2, but in
both cases it is not the correct value 3.
The final result of this code depends on which of the threads terminates last and
therefore writes to memory last, which is a race condition.
The atomic directive can be seen as a particular form of the critical directive, which specifically only prevents an assignment from being interrupted at any stage by another thread. So here the directive
#pragma omp atomic
placed immediately before the assignment 'sum += i' ensures that each update is executed as one uninterruptible unit.
The best method to use is ‘reduction’. We can do this by changing line 6 in Exam-
ple 13.7 to
#pragma omp for reduction(+:sum)
The reduction command tells OpenMP that we want each thread to keep track of its
own private sum variable while the loop is running, and add them all up at the end of
the loop. This is the most efficient method, since the entire loop now runs in parallel,
and the only overhead is at the end of the loop when the local sum values are added.
Similarly to the reduce method in MPI, the reduction directive can also be used
with various other arithmetical operators, such as *, max, min.
The main block defines the number 𝑛 of equidistant grid points and the distance ℎ
between them:
3 int main() {
4 int n = 10; float h = 1.0/n;
The variable 𝑥 is used to range over the discrete evaluation points, the piecewise in-
tegral approximations are added up in sum:
5 float x; // every thread gets its own copy, see line 9
6 float sum = 0.0;
In the following loop, the partial integrals are computed concurrently and added up
in sum. The variable x is declared as private, which means that each thread receives a
separate memory space to store its value.
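The loop is embedded in a parallel region; lines 7 and 8, omitted above, presumably read:

7 #pragma omp parallel
8 {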
9 #pragma omp for private(x) reduction(+:sum)
10 for (int i = 0; i < n; i++) {
11 x = h*(i + 0.5);
12 sum += h*f(x); }
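The loop is followed by the output; lines 13 and 14, not shown above, presumably read (the final brace closes the parallel region):

13 #pragma omp single
14 printf("%f\n", sum); }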
In MPI we usually let the root process (i.e. the one with rank 0) print the result. Here we leave it to an arbitrary thread. The thread that reaches the single directive in line 13 first is the one that executes the single block, i.e. the print instruction in line 14.
Note that the for-directive in line 9 automatically ensures that control is only passed
to the commands that follow the loop when all threads have ended. The effect is the
same as if there were an explicit barrier directive after line 12.
Until now, parallelization was organized in the main function. However, it is also
possible to outsource the parallel code to external function modules.
As a first example, we define a function omp_dot that calculates the dot vector product
in parallel. It can then be used within an otherwise sequential program.
Example 13.9 After the usual
1 #include <stdio.h>
we define a function for the distributed computation of the vector dot product:
2 float omp_dot(int n, float v[], float w[]) {
3 float sum = 0;
4 #pragma omp parallel for reduction(+:sum)
5 for (int i = 0; i < n; i++) sum += v[i]*w[i];
6 return sum; }
The parallel dot product is then tested inside a sequential main module:
7 int main() {
8 int n = 10;
9 float a[n], b[n];
10 for (int i=0; i < n; i++){
11 a[i] = i; b[i] = n‐i; }
12 float a_dot_b = omp_dot(n, a, b); // computed in parallel
13 printf("%f\n", a_dot_b); }
The program must of course be compiled with the '-fopenmp' option for the parallelization to take effect:
$ gcc -fopenmp dotmp.c -o dotmp
$ ./dotmp
165.000000
Recursive Functions
Recursive functions can also be formulated in parallel code and outsourced to exter-
nal modules. Here the OpenMP directive ‘task’ can be used to schedule recursive
calls in a “to do list”.
The call to fib(n) generates two tasks, indicated by the ‘task’ directive. One of the
tasks computes fib(n‐1) and the other computes fib(n‐2). The return values are
added together to produce the value returned by fib(n).
Each of the calls to fib(n‐1) and fib(n‐2) will in turn generate two tasks. The tasks
are generated recursively until the argument passed to fib is less than 3.
The tasks will then be executed in reverse order, storing the intermediate results
in the variables i and j.
The ‘taskwait’ directive ensures that the two tasks generated in an invocation of
fib are completed (that is, i and j have received their values) before that invocation
of fib returns.
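The listing itself is omitted in this excerpt; a sketch consistent with the description (and the call in line 10 below) might read:

1 int fib(int n) {
2     if (n < 3) return 1;           /* recursion ends below 3 */
3     int i, j;
4     #pragma omp task shared(i)     /* schedule fib(n-1) as a task */
5     i = fib(n-1);
6     #pragma omp task shared(j)     /* schedule fib(n-2) as a task */
7     j = fib(n-2);
8     #pragma omp taskwait           /* wait until i and j are set */
9     return i + j; }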
The function is then called like this:
10 #include <stdio.h>
11 int main() {
12 int n = 20;
13 #pragma omp parallel
14 {
15 #pragma omp single
16 printf("%d\n",fib(n)); } } // Out: 6765
The parallel directive in line 13 specifies that the following code block is to be
executed by all threads. The single directive in line 15 states that actually only one of
the threads initiates the computation and prints the result. However, the computation
itself is spread across all threads.
It is entirely possible to write hybrid parallel programs that use both message-passing
and shared-memory techniques.
As a first example, let’s revisit the 𝜋 approximation in Examples 13.1 and 13.8.
Assume the program is stored in a file parmp.c.
We begin as before:
1 #include <stdio.h>
2 #include <mpi.h>
3 float f(float x) { return 4.0/(1.0 + x*x); }
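Lines 4–10 with the OpenMP helper are omitted in this excerpt. A sketch consistent with Example 13.8 might look as follows; the cyclic distribution of the loop indices across the MPI processes is an assumption:

4 float omp_partial_pi(int n, int rank, int size) {
5     float h = 1.0/n, sum = 0.0, x;
6     #pragma omp parallel for private(x) reduction(+:sum)
7     for (int i = rank; i < n; i += size) {   /* assumed index split */
8         x = h*(i + 0.5);
9         sum += h*f(x); }
10    return sum; }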
In the following main function, the only change from Example 13.1 is that in line 19
our new parallel helper function omp_partial_pi replaces the sequential function
partial_pi. For convenience we repeat the complete definition:
11 int main() {
12     MPI_Init(NULL, NULL);
13     int rank, size;
14     MPI_Comm_size(MPI_COMM_WORLD, &size);
15     MPI_Comm_rank(MPI_COMM_WORLD, &rank);
16     int n;
17     if (rank == 0) n = 10;
18     MPI_Bcast(&n, 1, MPI_INT, 0, MPI_COMM_WORLD);
19     float pi_loc = omp_partial_pi(n, rank, size);  // shared
20     float pi;
21     MPI_Reduce(&pi_loc, &pi, 1, MPI_FLOAT,
22                MPI_SUM, 0, MPI_COMM_WORLD);
23     if (rank == 0) { printf("%f\n", pi); }
24     MPI_Finalize(); }
Since the program contains both MPI and OpenMP components, it must be com-
piled with the extended gcc compiler mpicc using the ‘‐fopenmp’ option and then
executed by mpiexec:
$ mpicc -fopenmp parmp.c -o parmp
$ mpiexec -n 2 parmp
3.142426
Exercises
Message passing in C/C++
Exercise 13.4 Write a C++ version of the program in Exercise 12.4. Here you might consider using the C++ data type vector to store the elements.
Shared Memory
Exercise 13.5 Based on the program in Example 6.9 on page 128 in the C chapter,
write a recursive function det(n,A) that computes the determinant of an 𝑛 × 𝑛 ma-
trix 𝐴 distributed to 8 threads.
Exercise 13.6 Write a distributed program to compute the maximum norm $\|A\|_\infty = \max_{i,j} |a_{i,j}|$ of a matrix $A = (a_{i,j})$. Partition $A$ into submatrices $A_1, \dots, A_m$, and distribute the parts to separate MPI processes. In each process, distribute the computation of the local maximum to multiple threads.
Chapter 14
Distributed Processing in Julia
Basics
For example, Julia can be started with the Terminal command 'julia -p 8', which then automatically loads the Distributed module and initializes it with addprocs(8).
In Julia, the distinguished main process 1 plays a role similar to the root process in
MPI.
Basic communication is initiated by the remotecall function, which delegates a
function (in the following example, the square root function) along with an argument
(here 2.0) to a worker (here worker 2) and retrieves the result as follows:
r = remotecall(sqrt, 2, 2.0);
fetch(r) # ans: 1.4142135623730951
There is also a variant '@spawn' of '@spawnat' that leaves the choice of the worker to the system. Example:
r = @spawn sqrt(2.0);
The macro @distributed declares that the term evaluations in the loop should be distributed to all workers. The '(+)' reduction parameter indicates that the individual results should be combined and their total sum returned.
Remark We mention that in the above examples, the index values are distributed to
the worker processes in such a way that load balancing is automatically ensured.
This can be seen, for example, in the loop
@distributed for i = 1:10 println("index $i processed here") end;
Example 14.3 A “real-world” example for the use of distributed for loops is the com-
putation of the dot product of vectors 𝑣 ⋅ 𝑤:
v = collect(1.:10.); w = collect(10.:-1.:1.);  # example
@distributed (+) for i = 1:10 v[i] * w[i] end # ans: 220.0
Monte Carlo methods are generally well suited for distributed implementation.
We illustrate the idea with our standard example, the approximation of 𝜋.
Example 14.4 The first approach is to formulate a sequential program, which is then
distributed to all workers, and finally recombine the partial results.
Here is the sequential program from Example 8.5 in the Julia chapter, embedded
in a function:
1 function pi_loc(samples)
2     hits = 0
3     for _ = 1:samples
4         x, y = rand(), rand()
5         if (x^2 + y^2 <= 1) hits += 1 end end
6     return 4*hits/samples end
For later comparison with the distributed computation, we measure the performance:
7 samples = 1_000_000_000;
8 @elapsed pi_loc(samples) # ans: 2.378840125
We want to distribute the computation to all worker processes. Here is a naive ap-
proach:
9 function pi_dist(samples, p)
10     r = @distributed (+) for i = 1:p pi_loc(samples/p) end
11     return r/p end
We test it:
12 p = nworkers()
13 pi_dist(samples, p)  # ans: Error!
The detailed error message states that pi_loc is unknown in the worker processes.
The reason is that the function definition in its current form can only be used in
the main process 1 and not, as requested in line 10, also by the workers.
To make it generally applicable, we need to embed line 1 in the everywhere macro.
The new line 1 now reads
@everywhere function pi_loc(samples) # now distributable
The everywhere macro ensures that the enclosed code block is also distributed to all
worker processes.
Now the command pi_dist(samples,p) in line 13 works.
Timing shows that there is a significant performance gain, but how large depends of
course on the number of physical processors. On our test machine with 8 cores we
get:
14 @elapsed pi_dist(samples, p)  # ans: 0.332873125
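The listing of a more compact variant is omitted in this excerpt; a sketch consistent with the following remarks (line 3 holding the combined expression), with pi_mc a hypothetical name:

1 function pi_mc(samples)                    # hypothetical name
2     hits = @distributed (+) for i = 1:samples
3         Int(rand()^2 + rand()^2 <= 1) end  # 1 for a hit, 0 otherwise
4     return 4*hits/samples end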
Note that line 3 simply combines the effect of lines 4 and 5 in the sequential program.
The expression ‘rand()^2+rand()^2 <= 1’ results in true or false, which is then
converted into an integer 1 or 0 that can be added. (Actually, the explicit conversion
is even superfluous, since Boolean values correspond to integers 1 and 0 anyway.)
If you want to time the execution, you can enclose lines 2 and 3 in a block:
@elapsed begin
# code goes here
end
In Example 13.10 of the last chapter, we discussed an OpenMP program for recur-
sively computing the Fibonacci function. Here we develop a distributed Julia imple-
mentation:
Example 14.6 We start with the sequential program, controlled by the everywhere
macro, so that the fib function is also accessible to all workers:
1 @everywhere function fib(n)
2     return (n < 3) ? 1 : fib(n-1) + fib(n-2) end
Note that here the everywhere macro is required to enable recursive function calls
by the worker processes in line 5.
In distributed computing, there is generally a tradeoff between the performance
gain from multiple processors and the slowdown from communication overhead
when data must be moved.
In the Fibonacci example, one tradeoff is to compute values for small 𝑛 within a
single process, as in line 4. Also, note in lines 5 and 7 that only one of the branches
of the recursion calls is delegated to another process.
For the input value 𝑛 = 50, we obtain a performance gain by a factor of 6.94 on our
8-core test machine:
8 @elapsed fib(50) # ans: 52.752892042
9 @elapsed fib_parallel(50) # ans: 7.603331375
As shown in Example 14.3, Julia arrays defined in the main process can also be read
by the distributed workers. But workers cannot modify standard arrays of type Array.
However, Julia provides the special data type SharedArray for arrays, to which the
workers can also write.
The SharedArray type is provided in the SharedArrays module. In the following we
assume that the module is loaded by
using SharedArrays # loads the required module
The first example illustrates the difference between the array types:
Example 14.7 We first define a standard Julia vector
1 v = [0, 0];
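Lines 2 and 3 are omitted here; a sketch consistent with the output below, where a worker writes to both vectors:

2 sv = SharedVector{Int}(2);                 # shared counterpart, initialized to 0
3 fetch(@spawnat 2 (v[1] = 1; sv[1] = 1));   # worker 2 assigns to both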
For the standard vector v, the new value is only assigned to a local copy that is not
visible from the outside. For the shared vector sv, on the other hand, the change is
global:
4 println("$v, $sv") # ans: [0, 0], [1, 0]
Example 14.8 (Laplace Equation) Shared arrays provide a convenient way to imple-
ment shared memory programming. For illustration, we write a distributed program
to compute the Laplace equation, formulated in Python in Sect. 12.5 on page 283.
Assuming the SharedArrays module is loaded, we define a shared matrix S to store
the iterative approximations:
1 n = 10;
2 S = SharedMatrix{Float64}((n,n));
Again, SharedMatrix is an alias for the SharedArray type of dimension 2. The tuple (n,n) indicates that it is an 𝑛 × 𝑛 matrix. Note that S is automatically initialized to 0.
Still on the global level, we work the boundary conditions into the matrix:
3 x = collect(range(0, stop=1, length=n));
4 S[1,:] = x .* (1 .- x);
The iterative computation is embedded in a for loop as before, here however a distributed loop:
5 for _ = 1:40
6     T = copy(S)
7     @sync @distributed for i in 2:n-1
8         for j in 2:n-1
9             S[i,j] = (T[i+1,j] + T[i-1,j] + T[i,j+1] + T[i,j-1]) / 4
10 end end end
In line 6, the values in S from the last iteration round are stored in the temporary
matrix T. Note that we copy the values to avoid side effects.
In line 7, we initiate the nested for loops to update each grid point. The @distributed macro ensures that the computation in the outer for loop is distributed across all workers. The @sync macro causes the next iteration round to start only after all worker contributions to the current round have been included in S.
The solution can then be plotted:
11 using Pkg; Pkg.add("Plots") # only needed if not yet installed
12 using Plots
13 plot(x, x, S, seriestype=:wireframe) # x defined in line 3
Fig. 14.1 Solution of the Laplace equation in Sect. 12.5 in Chap. 12. Computation distributed to 8
worker processes
Distributed arrays are sort of the opposite of shared arrays. The idea is to split a given
array and send the parts to different workers for further processing.
We need the DistributedArrays module. It is not included in the Julia standard
library, so we need to install it before using it for the first time:
import Pkg; # Julia's package manager
Pkg.add("DistributedArrays") # only required once
The vector du is prepared for division into local parts, one for each worker process:
4 for i in workers()
5     r = @spawnat i localpart(du)
6     lu = fetch(r)
7     println("$i: $lu") end;
In line 5, each worker extracts its own part and prepares it for delivery in the r vari-
able. Note that the localpart function belongs to the DistributedArrays module
and can be called by the workers only because the module was loaded under the
control of the everywhere macro.
In line 6, the local part is fetched by the main process and printed in line 7. The output (here written on one line)
2: [1, 2] 3: [3, 4] 4: [5, 6] 5: [7, 8] 6: [9] 7: [10]
shows that the standard pattern of load balance is followed again automatically.
Remark When the number of workers exceeds the length of the vector 𝑢, the last
workers receive an empty part, such as for example for u = collect(1:4):
2: [1] 3: [2] 4: [3] 5: [4] 6: Int64[] 7: Int64[]
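The listing of the distributed dot product itself is omitted in this excerpt; a sketch consistent with the following remark might read (the line numbering of the original may differ):

1 function dot_dist(u, v)
2     du = distribute(u)
3     dv = distribute(v)
4     r = [(@spawnat p
5           localpart(du)' * localpart(dv))   # transposed left factor
6          for p in procs(du)]
7     return sum(fetch.(r)) end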
Recall that vectors are conceived as column vectors. For the dot multiplication in
line 5 to go through, the left vector must be transposed into a row vector.
We test the function:
u = collect(1.:10.); v = collect(10.:-1.:1.);
dot_dist(u,v) # ans: 220.0
Matrix-Vector Product
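The listing is omitted in this excerpt; a sketch consistent with the line-by-line description below, where the keyword arguments of distribute are an assumption:

1 function mat_vec_dist(A, v)
2     w = Float64[]                           # partial products appended here
3     m = min(nworkers(), size(A, 1))         # cf. the remark above
4     dA = distribute(A, procs=workers()[1:m], dist=[m, 1])
5     for p in procs(dA)
6         r = @spawnat p localpart(dA) * v    # local slice times v
7         w = vcat(w, fetch(r))               # append partial result
8     end
9     return w end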
In line 2 we start with an initially empty vector to which the partial products are then
to be appended. In line 3, we specify the number of worker processes according to
the above remark.
In line 4, the option dist specifies the number of partitions in each dimension. We
split the matrix into 𝑚 horizontal slices. The second entry ‘1’ means that no splitting
of the matrix along the column dimension is requested.
In line 6, we calculate the product of the local matrix slice and the vector v in
each of the participating workers, and then append the partial result to the vector w
in line 7.
We test the function:
10 A = [ 9.  3. -6.  12.;  3.  26. -7. -11.;
11      -6. -7.  9.   7.; 12. -11.  7.  65.];
12 v = [1 1 1 1]';
13 mat_vec_dist(A,v)' # ans: 18.0 11.0 3.0 73.0
Note that we are still in an environment with 8 worker processes, but the matrix 𝐴 has
only four rows, so our extra precaution in the definition of mat_vec_dist applies.
Exercises
Exercise 14.1 Write a distributed version of the program for computing integrals in
Exercise 8.1 in the Julia chapter.
Exercise 14.3 Based on the program in Example 6.9 in the C chapter, write a dis-
tributed recursive function that computes the determinant of regular matrices.
Exercise 14.4 In Sect. 8.9 of the Julia chapter, we developed a program that generates
Julia set fractals. Write a distributed version of this program.
Exercise 14.5 Based on the program in Example 8.29 in the Julia chapter, write a
distributed program to generate the colored version of the Julia set shown on the
book cover. To do this, generate a 1000 × 1000 shared matrix 𝐽 and distribute the
computation of the matrix entries among the workers.
The result can then be visualized with the command imshow(J, cmap = "hot")
from the package PyPlot.
Part V
Specialized Programming Frameworks
Chapter 15
Automated Solution of PDEs with FEniCS
FEniCS is an open source computing platform for solving partial differential equations (PDEs) using the finite element method (FEM).
The FEniCS project was launched at the University of Chicago in 2003 and is now being developed in collaboration with various international research groups.
The CS in FEniCS stands for Computer System, while "ni" is supposed to deliver a euphonious word. The name may also be a reference to the University of Chicago's mascot, a phoenix risen from the ashes.
Currently, there are two coexisting versions, the classic FEniCS library in the final
stable version 2019.1.0 and a new version called FEniCSx, which reportedly has a
number of important improvements over the old library. However, as the developers
point out, it is still at a very early stage of development. This is also reflected in the
version number 0.3.0.
In this chapter we base the discussion on the classic FEniCS.
The FEniCS library can be embedded in the programming languages Python or
C++. Here we choose the user-friendly Python.
For an introduction and general information the ebook [6] can be recommended. It
can be downloaded free of charge from fenicsproject.org/tutorial.
A good source for further studies is the 800-page book Automated Solution of
Differential Equations by the Finite Element Method [7].
However, it should be mentioned that neither book covers the current FEniCS version. The introductory examples are outdated and will not run under the 2019.1.0 version without changes. The changes may be minor, but they can be a serious obstacle for the beginner.
Installation
All examples in this chapter were tested on FEniCS 2019.1.0 based on Python ver-
sion 3.8 in a virtual environment Ubuntu 20.04 on the macOS platform.
In Ubuntu 20.04, FEniCS can be installed using the Terminal command:
$ sudo apt install fenics
The finite element method (FEM) was mainly developed for partial differential equations. However, the central ideas can already be illustrated with ordinary differential equations.
In fact, we’ve already covered the basic principles in previous chapters. In Sect. 5.6 in
the SymPy chapter, we developed the variational form of such equations and showed
how to solve them using the Galerkin method.
In Sect. 10.7 in the Maple chapter we continued the development, and in Sect. 10.8
introduced special first-degree splines, called hat functions, as the basis for the Galer-
kin method.
And to put it simply: The Galerkin method with hat functions is the heart of FEM.
Variational Form
Galerkin Method
The general approach now is to specify a finite set of basis functions 𝜑𝑖 and then
attempt to establish equation (2), however replacing the exact goal function 𝑢𝑒 with a
function 𝑢 = ∑ 𝑐𝑖 𝜑𝑖 (in FEniCS called trial function) formed as a linear combination
of the basis functions, with the coefficients 𝑐𝑖 to be determined. This is the general
Galerkin method.
Hat Functions
If we now choose hat functions as trial functions 𝜑𝑖 , we run into a difficulty, already
discussed in Sect. 10.8 in the Maple chapter. For 𝑢 = ∑ 𝑐𝑖 𝜑𝑖 , the second derivative 𝑢″
referred to in (2) makes no sense, because hat functions do not have any sensible sec-
ond derivative. Hat functions are only piecewise differentiable, without the derivatives
being continuous. But in fact, continuity is not necessary. What is important for us here is that the derivatives $\varphi_i'$ are integrable, and with them also the linear combination $u' = \sum_i c_i \varphi_i'$.
What we need is an equivalent formulation for (2), which however does not refer
to second derivatives.
This is achieved through integration by parts of the left-hand side in (2). We get

(3) $-\int_0^1 u'' v = \int_0^1 u' v' - \bigl[u' v\bigr]_0^1$

as an equivalent variational form for the differential equation $-u'' = f$. This is the one we will use in the following.
As mentioned in Sect. 5.6 in the SymPy chapter, it is common to collect the left-
hand side of the variational form in a bilinear form 𝑎(𝑢, 𝑣) and write the right-hand
side, which does not depend on 𝑢, as a linear form 𝐿(𝑣).
Discretization
We formulate (4) for the special case that the trial function 𝑢 is a linear combination
∑𝑖 𝑐𝑖 𝜑𝑖 of the basis functions. By the linearity of differentiation and integration, we
then obtain the following approximate formulation for equation (4):
(5) $\sum_i c_i \int_0^1 \varphi_i' v' = \int_0^1 f v.$
As a final step in the discretization, we also restrict the test functions to the basis
functions 𝑣 = 𝜑𝑗 themselves.
Manual Solution
But before we hand it over to the computer, let’s briefly illustrate by hand how the
solution process for our example works. For this purpose, we define an equidistant
sequence 𝑥0 = 0, 𝑥1 = 0.25, 𝑥2 = 0.5, 𝑥3 = 0.75, 𝑥4 = 1. Let 𝜑𝑖 , 𝑖 = 1, ..., 3, be the hat
functions with 𝜑𝑖 (𝑥𝑖 ) = 1, and 𝜑𝑖 (𝑥) = 0 for 𝑥 ≤ 𝑥𝑖−1 and 𝑥 ≥ 𝑥𝑖+1 .
A simple computation shows that

$a_{ij} := \int_0^1 \varphi_i' \varphi_j' = \begin{cases} 2 & \text{if } i = j \\ -1 & \text{if } |i - j| = 1 \\ 0 & \text{else.} \end{cases}$

With $A = (a_{ij})_{1 \le i,j \le 3}$, $b := (b_j)^T_{1 \le j \le 3}$, and $c := (c_1, c_2, c_3)^T$ we thus get the equation system

(7) $Ac = b$

with the solution $c = (0.75, 1, 0.75)$. Figure 15.1 shows the approximation $u = \sum_i c_i \varphi_i$ represented by the solid line, the exact solution $u_e$ by the dashed line.
represented by the solid line, the exact solution 𝑢𝑒 by the dashed line.
1.0
0.8
0.6
0.4
0.2
0.0
0.0 0.2 0.4 0.6 0.8 1.0
Fig. 15.1 FEM solution of equation (1) (solid line) and exact solution (dashed line)
Using the dolfin function UnitIntervalMesh we create a mesh for the unit interval,
and divide it into 4 equal subintervals:
2 mesh = UnitIntervalMesh(4)
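Over this mesh, the space spanned by the basis functions is declared in line 3, which is omitted above and presumably reads:

3 V = FunctionSpace(mesh, 'CG', 1)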
The second argument in FunctionSpace specifies the type of the basis functions,
the third argument the degree. Here we specify continuous Galerkin functions of
degree 1. Note that these are precisely our hat functions. As a synonym for ‘CG’ we
can also write ‘Lagrange’.
The following statements specify that our unknown solution function 𝑢 (declared
here as a trial function, a term commonly used in FEniCS) as well as all test functions 𝑣
are to be constructed within this function space:
4 u = TrialFunction(V)
5 v = TestFunction(V)
Note that the spaces of trial and test functions will often coincide, although this is
not mandatory.
Next comes the definition of the source function 𝑓 ≡ 8 on the right-hand side of our
equation (1):
6 f = 8
In the present example, the constant value 0 is assigned to the entire boundary (which
in our one-dimensional case consists precisely of the two boundary points 0 and 1 of
the unit interval) with the FEniCS keyword ‘on_boundary’.
Then the differential equation to be solved is formulated as a discrete variational
problem:
8 a = u.dx(0)*v.dx(0)*dx
9 L = f*v*dx
The term 'u.dx(0)' denotes the derivative $\frac{\partial u}{\partial x}$ with respect to the first argument (remember we are back in a programming environment where indices begin at 0). In our case, this simply means $u'$, and correspondingly for $v$.
The term '*dx' indicates that the usual integral is to be computed over the entire interval. Note that it has to be precisely 'dx' with the specific letter 'x'.
The variables a and L now represent the left and right sides in the discrete varia-
tional form (6) above.
We come to the solution of the linear equation system 𝐴𝑐 = 𝑏 in (7). To do this,
we declare a function variable u_sol to which the solution can be assigned:
10 u_sol = Function(V)
Then the discrete variational problem is solved with the dolfin function solve and
the result stored in u_sol:
11 solve(a == L, u_sol, bc)
Note that it is entirely possible (and common) to use the same identifier ‘u’ instead
of u_sol.
A graphical representation of the solution can then be generated using the standard
Python library Matplotlib.
The plot is prepared with the dolfin command
12 plot(u_sol) # returns a matplotlib object
The program can be run by entering the following command in the Terminal win-
dow:
$ python3 "onedim.py"
Note that the ‘python3’ command specifically calls the Python version that was in-
stalled together with FEniCS.
Alternatively, within an interactive python3 session, a command such as exec(open('onedim.py').read()) can be used to execute the file content. Another possibility is to type the code line by line interactively.
Remark The program is executed as follows: The Python code references various
modules written in C++. These must be compiled into machine code before they
can be used. This is done at runtime by a so-called just-in-time compiler (JIT). This
can cause a program to run slower the first time, but on subsequent runs the program
can reuse code that has already been compiled.
Remark The solve function is the workhorse of FEniCS. Depending on the form
of the input, it can apply various solution methods. Here the first argument is of the
form ‘a == L’. Therefore solve expects the left-hand side to be a bilinear form built
from a TrialFunction and a TestFunction, and the right-hand side to be a linear
form in the TestFunction. Having created the form in lines 8 and 9, we could now
redeclare ‘u’ as a Function in line 10 to store the solution in line 11. However, for
clarity, in this introductory example we refrain from doing so.
Remark 15.1 To better understand how FEniCS works, it can be helpful to review
the program step by step. For this purpose, after line 9, you can define the matrix A
and the vector b used in the solution process with
A = assemble(a); b = assemble(L)
Note that A is a dolfin matrix, and b is a vector, which somewhat strangely require
different methods for conversion to NumPy arrays.
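For example, assuming the classic dolfin API:

A_np = A.array()       # dolfin Matrix -> 2D NumPy array
b_np = b.get_local()   # dolfin Vector -> 1D NumPy array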
Note also that for 𝐴 we get a 5 × 5 matrix and for 𝑏 we get the vector ‘array([1.,
2., 2., 2., 1.])’.
The interested reader is asked to explain where the differences from our manual
approach in Sect. 15.1 come from.
To see how the boundary conditions are included in the equation, enter
bc.apply(A,b)
As a model example for a partial differential equation PDE, we consider the following
Poisson equation with Dirichlet boundary conditions, sometimes referred to as the
“Hello World” of PDEs.
The general form of such a Poisson equation is

(1) $-\Delta u = f$ in a region $\Omega \subset \mathbb{R}^2$, $u = g$ on the border $\partial\Omega$,

where we here assume the special case $\Omega = [0,1]^2$. As usual, $\Delta u$ is an abbreviation for the sum of the second order partial derivatives $\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2}$.
Remark The method of manufactured solutions has the collateral benefit that the
highly non-trivial question whether a differential equation can be solved at all is an-
swered in the affirmative by definition.
(2) $-\int_\Omega \Delta u \, v = \int_\Omega f v.$
Similar to the 1D case, a partial integration of the left-hand side in (2) again allows for
an equivalent formulation in terms of first-order derivatives, which in turn allows for
the use of “less smooth” basis functions in the representation of the trial function 𝑢.
(3) $-\int_\Omega \Delta u \, v = \int_\Omega \nabla u \cdot \nabla v - \int_{\partial\Omega} \frac{\partial u}{\partial \mathbf{n}}\, v.$
Here the rightmost term denotes the integral evaluation on the boundary 𝜕Ω. But as
long as we are only concerned with Dirichlet boundary conditions, we can also in
the multidimensional case assume that all test functions 𝑣 vanish on the boundary,
so that the rightmost term is always 0. We will come back to the details later, in the
context of Neumann boundary conditions.
For now, let's focus on the constructions behind $\nabla u$, and the dual for $v$. '$\nabla$' (pronounced nabla or sometimes del) is an operator that denotes the vector-valued gradient of a function, in our case a two-argument function $w : \mathbb{R}^2 \to \mathbb{R}$:

$\nabla w := \Bigl(\frac{\partial w}{\partial x}, \frac{\partial w}{\partial y}\Bigr) : \mathbb{R}^2 \to \mathbb{R}^2.$
The product ∇𝑢 ⋅ ∇𝑣 is then to be understood as the dot product between the two
vectors, which again results in a scalar-valued function.
To summarize: In the variational form of the problem (1) we are looking for a func-
tion 𝑢 ∶ Ω → ℝ, 𝑢 = 𝑔 on 𝜕Ω, such that
(4) $\int_\Omega \nabla u \cdot \nabla v = \int_\Omega f v$
for all sufficiently smooth test functions 𝑣 ∶ Ω → ℝ with 𝑣 ≡ 0 on the boundary 𝜕Ω.
As mentioned earlier, it is common to collect the left-hand side of the variational
form in a bilinear form 𝑎(𝑢, 𝑣) and write the right-hand side, which does not depend
on 𝑢, as a linear form 𝐿(𝑣).
Regular Triangulation
is divided into 𝑛 × 𝑚 rectangles of equal size, each of which is in turn divided into
two triangles. Figure 15.2 shows the structure for the 3 × 4 case.
Fig. 15.2 Regular triangulation of the unit square, divided into 3 × 4 rectangles, each split into two triangles
In the present case (and typical for two-dimensional partitions), these triangles are
used to form the finite elements.
Here we choose
2 mesh = UnitSquareMesh(32,32)
Remark FEniCS provides many other useful base classes for mesh generation, such
as RectangleMesh in the 2D case and UnitCubeMesh and BoxMesh for 3D problems.
In addition, FEniCS offers powerful methods to generate more general mesh
structures. We’ll come back to that later.
Analogously to the hat functions in the 1D case, we assign a piecewise linear pyramid
function 𝜑𝑖 (𝑥, 𝑦) to each interior node 𝑖 of the mesh. The function takes the value 1
at node 𝑖 and vanishes at all other nodes. This is shown in Fig. 15.3 on the next page.
The space 𝑉 then consists of all linear combinations 𝑣 = ∑𝑖 𝑐𝑖 𝜑𝑖 of the basis functions.
Note that each 𝑣 ∈ 𝑉 represents a locally linear function 𝑎 + 𝑏𝑥 + 𝑐𝑦 on each of
the finite regions where it does not vanish.
As in the 1D case, the variables for the solution function 𝑢 and the test functions 𝑣
are declared over the function space 𝑉:
4 u = TrialFunction(V)
5 v = TestFunction(V)
Source Function
From our manufactured solution, we derived the source function 𝑓 ≡ −6. In the
program we can specify it as before with ‘f=‐6’ or as an instance of the Expression
class by ‘f=Expression('‐6')’. For a constant, however, the specification
6 f = Constant(‐6.0)
is most efficient because it tells the underlying C++ compiler that the value does not
need to be recalculated during program execution.
Boundary Condition
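The boundary function from the manufactured solution (cf. Fig. 15.4: $g = 1 + x^2 + 2y^2$) is presumably defined as an Expression with a degree hint (line 7):

7 g = Expression('1 + x[0]*x[0] + 2*x[1]*x[1]', degree=2)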
The degree information is used to help FEniCS to interpolate the expression on the
mesh and thus increase the computation accuracy. However, in this simple example,
there doesn’t seem to be any reason to bother.
The tool class Expression can also be used to construct more complex expres-
sions, as can be seen from help(Expression).
The boundary condition itself is then again implemented as an instance of the class
DirichletBC:
8 bc = DirichletBC(V, g, 'on_boundary')
As in the 1D case, the discrete solution is based on the construction of the matrix and vector from the bilinear form a and the linear form L, presumably declared as follows (lines 9 and 10),
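9  a = inner(grad(u), grad(v))*dx   # cf. lines 12-13 of the time-dependent version
10 L = f*v*dx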
where grad denotes the gradient operator ‘∇’ defined earlier, and inner denotes the
inner integral product.
The computation of the solution then follows the same line as before:
11 u = Function(V)
12 solve(a == L, u, bc)
Again, note that the boundary conditions are introduced into the equation only dur-
ing the solution process.
The solution can then be visualized graphically. As in the 1D case, we use the dolfin
function plot to prepare the solution and pass it to Matplotlib:
13 plot(u, mode="warp")
14 from matplotlib.pyplot import show
15 show()
Fig. 15.4 FEniCS solution of PDE (1) with 𝑓(𝑥, 𝑦) ≡ −6, 𝑔(𝑥, 𝑦) = 1 + 𝑥 2 + 2𝑦 2
(1) $\begin{aligned} \frac{\partial u}{\partial t}(s,t) - \Delta u(s,t) &= f(s,t), && s \in \Omega,\; t > 0, \\ u(s,t) &= g(s,t), && s \in \partial\Omega,\; t > 0, \\ u(s,0) &= u_0(s), && s \in \Omega. \end{aligned}$
Here 𝑓, 𝑔, 𝑢0 are given functions. The solution 𝑢 = 𝑢(𝑠, 𝑡), the source function 𝑓 =
𝑓(𝑠, 𝑡) and the boundary condition 𝑔 = 𝑔(𝑠, 𝑡), 𝑡 > 0, can change in space 𝑠 and time 𝑡.
The initial value 𝑢0 depends only on space.
A Discrete-Time Approach
We choose a time step $h > 0$, set $t_n := nh$, and approximate

(2) $u_n(s) \approx u(s, t_n), \qquad \frac{\partial u}{\partial t}(s, t_{n+1}) \approx \frac{u_{n+1} - u_n}{h}(s).$
Rearranging the first line in (2), we get the following equation in the unknown func-
tion 𝑢𝑛+1 on the left-hand side:
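(3) $u_{n+1} - h\,\Delta u_{n+1} = u_n + h\,f_{n+1}$, where $f_{n+1}(s) := f(s, t_{n+1})$.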
Integration by parts taking into account the boundary condition for 𝑣 yields
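(4) $\int_\Omega u_{n+1}\,v + h \int_\Omega \nabla u_{n+1} \cdot \nabla v = \int_\Omega u_n\,v + h \int_\Omega f_{n+1}\,v$ for all test functions $v$ vanishing on $\partial\Omega$.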
Manufactured Solution
As in the stationary case, we construct a test problem whose solution we can easily
check.
We first fix the exact solution

$u_e = 1 + x^2 + 3y^2 + 1.2\,t,$

which at time $t = 0$ gives the initial value

(5) $u_0 = 1 + x^2 + 3y^2.$

Substituting $u_e$ into the first line of the heat equation (1) yields

$f = \frac{\partial u}{\partial t} - \Delta u = 1.2 - 2 - 2 \times 3 = -6.8.$
Note that then also all functions 𝑓𝑛 , 𝑛 > 0, assume the same constant value −6.8.
The boundary condition is once again determined by the exact solution:

(6) $g = u_e = 1 + x^2 + 3y^2 + 1.2\,t.$
FEniCS Implementation
We can now formulate a FEniCS program that for a fixed time point $T$ approximates $u(s, T)$ with a function $u_T(s)$. Observe that here $u_T$ depends only on space, i.e. the time parameter does not appear as a variable.
We first define the start and end times 0 and 1 and the discrete time steps ℎ = 0.1 for
the time progression:
1 t = 0.; h = 0.1; T = 1.0
As usual we need:
2 from dolfin import *
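Mesh and function space are set up as in the stationary case; lines 3 and 4, omitted here, presumably read:

3 mesh = UnitSquareMesh(32, 32)
4 V = FunctionSpace(mesh, 'CG', 1)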
The variables u_sol and v to store the solution functions 𝑢𝑛 and test functions 𝑣 are
then declared over this function space 𝑉:
5 u_sol = Function(V) # not TrialFunction!
6 v = TestFunction(V)
Note that u_sol is declared as a general Function, not as TrialFunction. The reason
is explained in the remark below.
The right-hand side of the PDE (1), hence also the value of all 𝑓𝑛 in (4), is given
by the constant expression computed above:
7 f = Constant(‐6.8)
To encode the boundary condition according to (6), we first define a Python string
8 g_str = '1 + x[0]*x[0] + 3.0*x[1]*x[1] + 1.2*t'
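from which the boundary Expression with its time parameter is then built (line 9, presumably):

9 g = Expression(g_str, t=t, degree=2)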
For the initial value 𝑢0 of our function sequence according to (5), we can use the same
expression g. However, we must first convert it to a function in the discrete function
space 𝑉.
This is achieved through
10 u_n = project(g,V) # initial value u0 according to (5)
In the following loop, the functions 𝑢𝑛 are determined iteratively according to (4):
11 while t <= T:
12     a = u_sol*v*dx + h*inner(grad(u_sol), grad(v))*dx
13     L = u_n*v*dx + h*f*v*dx
14     g.t = t                        # assign new t-value to boundary expression
15     bc = DirichletBC(V, g, 'on_boundary')
16     solve(a - L == 0, u_sol, bc)   # nonlinear solver
17     u_n.assign(u_sol)              # next in u_n sequence
18     t += h                         # next time step
Lines 12 and 13 represent the left and right sides in (4). The variable u_n contains the
current last sequence entry 𝑢𝑛 .
Note, however, that ‘a’ is not a valid bilinear form accepted by the solver because
u_sol is not declared as a TrialFunction. In such cases it is necessary to set up the
equation in a form with 0 on the right-hand side, i.e. here in the form ‘a - L == 0’.
Again, see the remark below for details.
The assignment g.t = t in line 14 gives the expression g the new t value, which
is required to update the boundary condition bc in line 15.
Line 16 generates the actual solution.
In line 17, the result is assigned to the variable u_n by the method assign, so that
in the next iteration round it is available as the new last sequence entry.
The final result for T = 1 can then be plotted; it is shown in Fig. 15.5. Note, however, that the plot does not show the complete temporal evolution.
Fig. 15.5 FEniCS solution of the time-dependent Poisson equation (1) at time T = 1
Remark The variable u_sol plays a double role, firstly in specifying the variational
form in line 12, and secondly in the solution process for storing the result in line 16.
For the latter, it must be of type Function, not TrialFunction. But a variable of type
Function is not accepted by the solver in an argument of the form ‘a == L’.
Here the form ‘F == 0’ with ‘F = a - L’ comes to the rescue, since the solver can
process a general Function expression if it appears in an argument of this form. The
disadvantage is that the solver then uses a more complex nonlinear solution method.
15.5 Nonlinear Equations

We now consider a nonlinear boundary value problem,

(1)  ((1 + u)² u′)′ = 0 in [0, 1],  u(0) = 0, u(1) = 1,

with the exact solution uₑ = (7x + 1)^{1/3} − 1. The term (1 + u)² makes it nonlinear in the unknown function u.
Variational Form
Picard Iteration
We solve (2) with the so-called Picard iteration. The idea is to inductively construct
a sequence of functions 𝑢0 , 𝑢1 , … , 𝑢𝑛 , such that in every step, 𝑢𝑛+1 is chosen as the
solution 𝑢 of the following variational form, which is now actually linear in 𝑢:
(3)  ∫₀¹ q(uₙ) u′ v′ = 0,  where q(uₙ) := (1 + uₙ)².
However, we postpone the declaration of a variable to store the solution function. The
reason is that we want to reuse the present code block in the next program below.
We introduce the nonlinear factor as:
5 def q(u): return (1 + u)**2
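The omitted lines 6–10 define the boundary conditions; a plausible reconstruction, based on the description that follows and on the boundary values uₑ(0) = 0, uₑ(1) = 1 of the exact solution:

6 def left(x): return near(x[0], 0)
7 def right(x): return near(x[0], 1)
8 bc_l = DirichletBC(V, Constant(0), left)
9 bc_r = DirichletBC(V, Constant(1), right)
10 bc = [bc_l, bc_r]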
Here the dolfin function near in lines 6 and 7 denotes ‘indiscernible within machine
precision’. Line 6 essentially specifies that a floating point number 𝑥 belongs to the
left boundary if it is indistinguishable from 0. Line 7 specifies that a point belongs to
the right boundary if it is indistinguishable from 1.
Lines 8 and 9 define the two boundary conditions, which are then combined into
a single condition in line 10.
We come to the definition of the Picard iteration.
The variable u_n is used to inductively store the functions 𝑢𝑛 in (3), beginning
with the initial value 𝑢0 ≡ 0:
11 u_n = project(Constant(0), V) # u_n is of type 'Function'
We now need a function variable ‘u’ to which the solver can assign the new iterative
solution 𝑢𝑛+1 , depending on the previous 𝑢𝑛 currently stored in u_n:
12 u = Function(V) # not TrialFunction
The iteration is repeated until the supremum-norm distance between successive approximations falls below a tolerance eps; however, we abort the process after 25 steps to avoid non-terminating program runs. To calculate the supremum norm, we use the NumPy function ‘norm’ with the option ‘ord=inf’. We get:
13 eps = 1.0e-6
14 from numpy import inf; from numpy.linalg import norm
15 for _ in range(25):
16 F = (q(u_n)*u.dx(0)*v.dx(0))*dx
17 solve(F == 0, u, bc)
18 diff = u.vector().get_local() - u_n.vector().get_local()
19 d = norm(diff, ord=inf)
20 if d < eps: break
21 u_n.assign(u)
The variational problem (3) is formulated in line 16. In line 17 it is passed to the solver
in the form ‘F == 0’, which accepts a variable of type Function.
In line 18, the difference between the new and the previous approximation is cal-
culated. Note that the suffix ‘vector().get_local()’ denotes the method that as-
signs the values of the FEniCS Function object to a NumPy array.
In line 19, the maximum of the value differences is assigned to the variable ‘d’ via the
NumPy function norm with the maximum option.
In line 20, the iteration is halted when the desired accuracy is reached. Otherwise,
the result is assigned to u_n and a new iteration is initiated.
After the loop ends, the final result is contained in u. As usual, it can be plotted by using the dolfin plot function to pass the result to Matplotlib. The result is shown in Fig. 15.6, superimposed on the exact solution u = (7x + 1)^{1/3} − 1. The deviation from the exact solution is barely visible, if at all.
Fig. 15.6 FEniCS solution of the nonlinear PDE (1) together with the exact solution u = (7x + 1)^{1/3} − 1
Direct Solution
Remark Note that the variational form is defined in such a way that it extends directly
to the multidimensional case. In the present one-dimensional case, the expression
‘inner(grad(u),grad(v))’ is automatically evaluated as ‘u.dx(0)*v.dx(0)’.
Once the variational form has been built, we can redeclare ‘u’ as a Function variable
to represent the solution:
13 u = Function(V)
In the present case, the solution algorithm is based on the Newton method (instead
of a Picard iteration). For this, we need the derivative of the expression F in line 12.
To compute it, the command ‘action(F,u)’ is first used to transform F into a discrete
vector of the function values of u on the mesh nodes, from which the derivative is then generated in the form of a Jacobian matrix J:
14 F = action(F,u)
15 J = derivative(F,u)
The solution process still needs a start value for u. In general, this choice may be
critical, but here again the following is sufficient:
16 u_init = project(Constant(0), V)
17 u.assign(u_init)
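The omitted line 18 presumably constructs the corresponding problem object from the residual form F, the solution variable u, the boundary conditions and the Jacobian:

18 problem = NonlinearVariationalProblem(F, u, bc, J)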
and instantiate a solver object specifically adapted for such nonlinear problems:
19 solver = NonlinearVariationalSolver(problem)
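The computation itself is then presumably triggered by

20 solver.solve()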
The solution u can then be plotted as in the Picard-version above. In fact, if you look
closely, you might notice that the graph of the solution obtained by Newton’s method
is already included as an overlay in Fig. 15.6 on the preceding page.
So far, we have only considered Dirichlet boundary conditions where the value of the
unknown function 𝑢 is given on the boundary.
In contrast, a Neumann boundary condition specifies the derivative of the solution as
a function 𝑔:
∂u/∂n(x) = g(x),
where n(x) denotes the outward-pointing unit normal vector of the boundary ∂Ω at the boundary point x.
If we limit ourselves, as we will do below, to PDEs over the 2D domain Ω = [0, 1]², then the directional derivatives can be obtained directly from the partial derivatives ∂u/∂x and ∂u/∂y.
We first give an example in which both Dirichlet and Neumann conditions occur.
So let Ω = [0, 1]2 . Let Γ𝐷 = {(𝑥, 𝑦) ∈ 𝜕Ω ∣ 𝑥 = 0 or 𝑥 = 1} be the vertical component
of the boundary 𝜕Ω, and Γ𝑁 ∶= 𝜕Ω − Γ𝐷 be the horizontal part, where however the
corner points belong to Γ𝐷 .
Consider the Poisson equation
Manufactured Solution
Variational Form
For FEniCS processing, the PDE must again be formulated as a variational problem.
In Sect. 15.3 we developed the general variational form
(2)  ∫_Ω ∇u · ∇v = ∫_Ω f v + ∫_∂Ω (∂u/∂n) v,
and noted that as long as we only considered Dirichlet conditions, we could limit
ourselves to test functions that vanished on the boundary. This is why the second
term on the right-hand side could always be assumed to be 0.
This time we cannot omit the boundary term arising from the integration by parts.
The test functions 𝑣 can only be assumed to be 0 on Γ𝐷 .
Therefore, our model example, when expressed in variational form, becomes:
Determine 𝑢, such that, observing the Dirichlet condition above
(3)  ∫_Ω ∇u · ∇v = ∫_Ω f v + ∫_{Γ_N} g v.
FEniCS Implementation
We first note that in FEniCS a Neumann condition automatically holds for exactly
that part of the boundary for which no Dirichlet conditions are given.
The FEniCS implementation begins as usual:
1 from dolfin import *
2 mesh = UnitSquareMesh(32,32)
3 V = FunctionSpace(mesh, 'CG', 1)
4 u = TrialFunction(V)
5 v = TestFunction(V)
The boundary region Γ𝐷 for the Dirichlet condition can be defined as follows, again
with the dolfin function ‘near’:
6 def dir_boundary(x): return near(x[0], 0) or near(x[0], 1)
We can now define the Dirichlet condition for this boundary region:
7 g_D = Expression('1 + x[0]*x[0] + 2*x[1]*x[1]', degree=2)
8 bc = DirichletBC(V, g_D, dir_boundary)
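The omitted lines 9–12 define the Neumann datum, the source term, and the variational forms. A plausible reconstruction for the manufactured solution u = 1 + x² + 2y² (for which ∂u/∂n vanishes on the bottom edge and equals 4 on the top edge, so that g = 4y works on all of Γ_N):

9 g = Expression('4*x[1]', degree=1)
10 f = Constant(-6.0)
11 a = inner(grad(u), grad(v))*dx
12 L = f*v*dx + g*v*ds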
In line 12, it is important to note that FEniCS interprets the ‘*dx’ operator as a command to evaluate the integral over the entire domain, while the ‘*ds’ operator indicates that it is to be computed along the boundary.
Note that the integration ‘*ds’ is done over the entire boundary, including the
Dirichlet boundary. However, since the test functions v vanish on the Dirichlet
boundary (as a consequence of specifying a DirichletBC object), the integral will
only include the contribution from the Neumann boundary.
After these preparations, the solution is now straightforward:
13 u = Function(V)
14 solve(a == L, u, bc)
Plotting the result shows that the solution coincides with the one shown in Fig. 15.4
on page 332 in Sect. 15.3.
We consider an equation where all boundary conditions are of Neumann type. Note
that such a problem is only well posed if a function value for 𝑢 is given in undiffer-
entiated form somewhere in the equation, because specifications referring only to
derivatives can only determine a solution up to an additive constant.
We take the approach of including u itself in the equation; concretely, we consider −Δu + u = f in Ω, together with the homogeneous Neumann condition ∂u/∂n = g ≡ 0 on ∂Ω.
Manufactured Solution
Variational Form
The variational form has the same shape as in (3) above, except that the term ‘+𝑢𝑣’
is added on the left-hand side, while on the right-hand side the term referring to the
integral over the boundary is not needed, since by assumption it is 𝑔 ≡ 0.
The variational form thus reads: Determine 𝑢, such that
(6)  ∫_Ω (∇u · ∇v + u v) = ∫_Ω f v.
FEniCS Implementation
So far, we have considered PDEs in a single scalar-valued unknown function. We now turn to a system consisting of two functions, a vector-valued function u and a scalar-valued p.
More precisely, we consider the following simple steady-state variant of the so-called
Stokes equation:
Mathematical Preliminaries
div u := ∂u₁/∂x₁ + ∂u₂/∂x₂.
Variational Problem
When transforming (1) into a variational problem, it is useful to use a mixed formu-
lation.
We start with the equation in the first line in (1). Let 𝑉 be a space of sufficiently
smooth test functions 𝑣 ∶ Ω → ℝ2 with 𝑣 ≡ 0 on 𝜕Ω.
As usual, we multiply the equation by a test function 𝑣 and then take the integral
over Ω, reducing terms with second derivatives to first order expressions by integra-
tion by parts.
For the leftmost term −Δ𝑢 we obtain
(2)  −∫_Ω Δu · v = ∫_Ω ∇u : ∇v,
(3)  ∫_Ω ∇p · v = −∫_Ω (div v) p.
Now we turn to the second line in (1). Again, we use a set Q of sufficiently smooth functions q: Ω → ℝ as test functions. The equation written in variational form is then

(4)  ∫_Ω (div u) q = 0.
FEniCS Implementation
By the method of manufactured solutions, we again first construct a test problem for
which we can easily verify the solution.
For this purpose we define the exact solution by

(6)  uₑ(x, y) := (cos(πy), sin(πx))ᵀ,  pₑ(x, y) := π cos(πx) cos(πy).
The Program
Visualizing
For the pressure p, on the other hand, things are a bit more complicated. Note that in the system of equations (1), the function p appears only in differentiated form, as ∇p, which means that the solution is unique only up to an additive constant. A standard way
to get around this problem is to include a normalizing specification in the equation
system, e.g. ∫Ω 𝑝 = 0. Recall that this condition holds for our original manufactured
solution defined above.
In the present case, the normalization can be achieved as follows: Let 𝑐 be the
array of coefficients 𝑐𝑖 of the basis functions 𝜑𝑖 in the representation of the solution
function p found by FEniCS. If we then define a new function 𝑝norm ∶= ∑𝑖 (𝑐𝑖 −𝑐avg ) 𝜑𝑖 ,
where 𝑐avg is the average over all 𝑐𝑖 , then this function has the desired properties. It
is a solution of the equation system (1), where now also ∫Ω 𝑝norm = 0.
In FEniCS, the following instruction converts the original solution to such a 𝑝norm :
22 normalize(p.vector())
Figure 15.8 on the facing page shows the solution for the vector field 𝑢 and the scalar
pressure 𝑝.
Fig. 15.8 FEniCS solution for Stokes equation (1) with 𝑓 as in (7), 𝑔 = 𝑢𝑒 in (6). Velocity field (left).
Pressure 𝑝 (right)
The goal is an approximation uₕ whose deviation from the exact solution u, measured in a quantity of interest, is below a prescribed bound,

(2)  |M(u) − M(uₕ)| ≤ tol,

for a measure function M, e.g. M(u) := ∫_Ω u, and a tolerance level tol > 0. The solver function solve can include the additional arguments tol and M, and then iterate the solution process on finer and finer meshes until (2) holds.
The main observation is that solve only refines the mesh 𝐺𝑛 locally around the nodes
that promise the greatest gain in accuracy.
FEniCS Implementation
7 f = Expression('100*exp(-100*(pow(x[0]-0.5, 2)
8 + pow(x[1]-0.3, 2)))', degree=1)
9 a = inner(grad(u), grad(v))*dx
10 L = f*v*dx
11 u = Function(V)
We define the measure function 𝑀 and the tolerance level tol as explained above:
12 M = u*dx
13 tol = 1.e-3 # for illustration only - in practice 1.e-6 is better
The adaptive solution process is then started with these values inserted into the ar-
gument placeholders in solve:
14 solve(a == L, u, bc, tol=tol, M=M)
The plots in Fig. 15.9 on the next page show the solutions on the initial and the adaptively refined meshes, as well as the meshes themselves:
15 from matplotlib.pyplot import show, figure
16 figure(); plot(u.root_node(), mode='warp')
17 figure(); plot(u.leaf_node(), mode='warp')
18 figure(); plot(mesh)
19 figure(); plot(mesh.leaf_node()) # FEniCS 1.7.2
20 show()
Remark Note that line 19 was executed in FEniCS version 1.7.2. The generator
method ‘leaf_node’ is no longer available in later versions.
So far we have dealt with meshes from the basic FEniCS collection. However, it is ev-
ident that more complex structures are generally required when modeling real-world
problems. In the following, we present some useful methods for mesh generation.
The mshr package provides tools for constructing different domains based on simple
geometric shapes such as Circle or Rectangle. These can then be combined into
more complex shapes using the Boolean operations intersection, union and difference. The resulting regions can be overlaid with different types of meshes using the
generate_mesh operator.
Fig. 15.9 Adaptive mesh refinement. (a) shows the solution for the original coarse mesh, (b) that
for the mesh adaptively refined by the FEniCS program, (c) and (d) the underlying meshes
Mesh Generation
We consider an annulus Ω where the outer circle has a radius of 1 and the inner circle
has a radius of 0.5.
Using tools from mshr, we first create a FEniCS representation of the Ω domain:
1 from dolfin import *
2 from mshr import *
3 c1 = Circle(Point(0.0), 1)
4 c0 = Circle(Point(0.0), .5)
5 annulus = c1 - c0
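The mesh itself is generated by the omitted line 6, presumably with the generate_mesh operator (the resolution value 30 is an assumption):

6 mesh = generate_mesh(annulus, 30)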
The result is shown in Fig. 15.10 on the following page (for illustration with a coarser
node density).
7 plot(mesh)
8 from matplotlib.pyplot import show
9 show()
The mesh can then be used in FEniCS programs like any other ready-made mesh, see
the example below.
For later use we store the mesh in an xml-file:
10 file = File('annulus.xml')
11 file << mesh # note: output operator as in C++
Application Example
We illustrate the use of mesh in a program for solving the Laplace equation
Δ𝑢 = 0 in Ω,
(1) 𝑢 = sin(5𝑥) + cos(5𝑦) on outer boundary,
∇𝑢 ⋅ n = 0 on inner boundary.
Remark Alternatively, we could also continue with the above mesh-generation pro-
gram and append the following program part directly after line 6.
The solution:
13 u = Function(V)
14 solve(a == L, u, bc)
Fig. 15.11 FEniCS solution of Laplace equation (1) over annulus mesh
In Sect. 15.8 we saw how adaptive mesh refinement can be performed automatically
by the FEniCS system. However, the techniques can also be applied directly by the
user. The following FEniCS program illustrates the idea:
1 from dolfin import *
2 from mshr import *
Consider a domain made from a rectangle from which a circular segment has been
subtracted:
3 dom = Rectangle(Point(0,0), Point(1.2, .4)) \
4 - Circle(Point(.3,.2), .15) # note the line-continuation character in line 3
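The initial mesh is then generated in the omitted line 5, presumably along the lines of (the resolution value is an assumption):

5 mesh = generate_mesh(dom, 15)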
which we then want to refine, first for the entire mesh and then restricted to the left
half of the rectangle for illustration.
The refinement for the entire mesh is obtained by
6 mesh_fine = refine(mesh)
To refine only the left part, we need to indicate the triangles involved by markers:
7 markers = MeshFunction('bool', mesh, 2) # 2 = mesh‐dimension
8 markers.set_all(False) # initialize all cells as unmarked
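The omitted lines 9 and 10 presumably mark the cells in the left half of the rectangle, e.g. by their midpoints:

9 for c in cells(mesh):
10 if c.midpoint().x() < 0.6: markers[c] = True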
Each of the marked triangles is then refined. This results in the mesh
11 mesh_partial_fine = refine(mesh, markers)
Mesh Libraries
In fact, there is already a wealth of meshes freely available for many standard situations, but also for more exotic ones. A good collection can be found at fenicsproject.org, for example.
However, when it comes to large real-world projects, the built-in FEniCS mesh con-
struction tools become quite tedious. Then it makes more sense to use other special-
ized tools and import the resulting meshes back into FEniCS.
One such tool, often used in conjunction with FEniCS, is Gmsh, which can be downloaded for free from the gmsh.info website.
15.10 Final Note: Parallel Processing in FEniCS

We have seen that an important part of the finite element method is solving matrix equations. This naturally raises the question of whether FEniCS can benefit from distributing such calculations among several processes.
In fact, we can run any FEniCS code in parallel with MPI, as explained in Chap. 12,
by simply launching the computation for multiple instances of the Python interpreter.
Example 15.2 Assume the code in Sect. 15.3 to be stored in a file poisson.py.
The computation can then be distributed to four processes by
$ mpiexec -n 4 python3 poisson.py
This will return 4 partial results, including separate plots for each.
At the moment, however, there seems to be no obvious elementary method to
combine the partial results and diagrams into a common overall solution.
Exercises
Exercise 15.1 Write FEniCS programs to solve the following differential equations:
(2)  u′(x) = u(x)/(1 + x²) in [0, 1],  u(0) = 1.
𝑢(0) = 𝑢(1) = 0.
and write a solution program. The exact solution is the same as in case (1).
(1) Embed the solution program into a function that contains the mesh size 𝑛 as
argument.
(2) Extend the program to one that additionally estimates the error by computing
the difference from the exact solution with respect to the 𝐿2 integral norm. To do this,
use the dolfin function errornorm. For details see help(errornorm).
(3) Solve the problem for 𝑛 = 10, 20, 30.
Exercise 15.6 Solve the same PDE (∗), now however using quadratic spline functions
instead of hat functions.
Exercise 15.7 Solve (∗), now however considering the 3D domain Ω = [0, 1]3 instead
of the 2D version [0, 1]2 . For testing, select a new “manufactured solution” and adapt
the program accordingly.
Exercise 15.8 We consider the heat equation in Sect. 15.4. For the computation of the iterative solution, we used a “backward Euler” method, which determined the current iterate uₙ₊₁ implicitly from the equation

uₙ₊₁ − h Δuₙ₊₁ = uₙ + h fₙ₊₁.

(1) Change the code so that it now uses the explicit forward Euler method

uₙ₊₁ = uₙ + h Δuₙ + h fₙ.
What happens?
(2) Return to the original formulation and change the manufactured solution to u = 1 + x² + 3y² + 1.2 t so that the source function f for the right-hand side becomes time dependent. Adapt the solution program accordingly.
For the time-dependent Poisson equation in Sect. 15.4, we applied the FEniCS solver
solve with a nonlinear solution method (in line 16 of the program). However, solve
can operate in yet another mode: A linear system 𝐴𝑥 = 𝑏 can be solved by calling
solve(A,x,b), where A is a matrix and x and b are vectors.
Exercise 15.9 Modify the solution program accordingly. To do this, declare u_sol as
a function of type TrialFunction, create the bilinear form ‘a’ and the corresponding
matrix A by applying the techniques from Remark 15.1 in Sect. 15.2. Then, u_sol is
redeclared as a Function object. This is done outside of the while loop.
Inside the loop, construct the vector ‘b’, work the boundary condition into the equa-
tion, and pass it to solve, where the solution is inserted into the function u_sol in
vector form u_sol.vector().
A Neumann Challenge
Exercise 15.10 Develop a variant of the Poisson equation in Sect. 15.6, based as much
as possible on pure Neumann boundary conditions.
Chapter 16
Scientific Machine Learning with PyTorch
Overview
In Sect. 16.1 we introduce linear regression and solver methods based on gradient
descent, a standard approach in machine learning. In Sect. 16.2 we explain various
basic PyTorch tools and show how to implement gradient descent in regression.
Section 16.3 is concerned with general aspects of backpropagation, the central
training technique in machine learning. Topics include optimization, nonlinear equa-
tions, and matrix computations.
Section 16.4 discusses the theoretical foundations of neural networks, in particu-
lar the fundamental universal approximation theorem.
In Sect. 16.5 we begin with the application of neural networks in scientific com-
puting and first examine how they can be used in function approximation.
In Sect. 16.6 we show how to generate a versatile differentiation operator with Py-
Torch’s automatic gradient arithmetic tools. This operator then becomes our work-
horse in Sect. 16.7 on integration, in Sect. 16.8 on first order ordinary differential
equations and in Sect. 16.9 on second order equations and finally in the solution of
partial differential equations in Sect. 16.10.
In two supplementary sections, we discuss the representation of Boolean logical units
with neural networks and present a typical example of machine learning in artificial
intelligence, the recognition of handwritten characters.
Installation
The site pytorch.org lists several ways to install PyTorch. We assume that the Python
distribution Anaconda is installed. Then the necessary PyTorch components can be
downloaded and installed with the Terminal command
$ conda install -c pytorch pytorch
All examples in this chapter were tested on macOS for PyTorch version 1.11 in the
Spyder environment, which was already used in the basic Python chapters in Part II.
The standard criterion for the deviation measure is the mean squared error

(1)  ℓ(w, b) := (1/n) Σᵢ₌₁ⁿ (r(xᵢ) − yᵢ)².
Gradient Descent
There are many ways to determine the values of 𝑤 and 𝑏 that minimize ℓ. The cen-
tral method used in neural networks is to approach the solution by gradient descent,
similar to what we saw in the conjugate gradient method in previous chapters.
The idea is to iteratively define a sequence pₖ = (wₖ, bₖ)ᵀ, starting with some initial value p₀ = (w₀, b₀)ᵀ:

pₖ₊₁ := pₖ − η ∇ℓ(pₖ),

where ∇ℓ denotes the gradient of ℓ, so that −∇ℓ points in the direction of steepest descent, and η is some chosen scalar that sets the step length.
For later reference we note that

(2)  ∇ℓ = (∂ℓ/∂w, ∂ℓ/∂b)ᵀ = ((2/n) Σᵢ₌₁ⁿ (wxᵢ + b − yᵢ) xᵢ, (2/n) Σᵢ₌₁ⁿ (wxᵢ + b − yᵢ))ᵀ.
Example 16.1 As running example throughout this and the next section we consider the regression line wx + b through the point cloud in ℝ² given by the points (1, 1), (2, 2), (3, 2), with the exact solution w = 1/2, b = 2/3. The regression line together with the cloud points is shown in Fig. 16.1.
We turn to the minimization of ℓ(𝑤, 𝑏) by gradient descent. Figure 16.2 shows the
landscape in which the descent is carried out.
Fig. 16.2 Mean squared error ℓ(𝑤, 𝑏). Minimum indicated by blue dot
The graph has a slightly bowl-like shape. The goal of gradient descent is to “roll down the hill” along the error function ℓ(w, b) in (1). The bottom of the bowl is the point where the gradient ∇ℓ is 0, indicated by the solid dot. This point represents the minimum error and thus determines the optimal values of the parameters w and b.
The general idea is to take repeated steps in the opposite direction of the gradient of
the error function at the current point. For the linear regression this means: make
an error prediction based on the current parameter values, determine the gradient
of the error, and then adjust the parameter values by taking a step along the steepest
descent.
Remark 16.2 Note the convex shape of the bowl in Fig. 16.2. Convexity is essential for reaching the correct minimum. To see what can go wrong, let’s turn to a simplified example.
Consider the graph in Fig. 16.3, actually the graph of the function −𝑥 sin(𝑥).
Fig. 16.3 Local (blue dot) vs. global minimum (red dot)
Starting in the point with 𝑥 = 4 we can reach the local minimum for 𝑥 ≈ 2 by gradient
descent, but not the global one for 𝑥 ≈ 8.
Remark Another remark concerns the step length 𝜂. It is clear that small steps lead
to slow convergence. On the other hand, too large steps can prevent the convergence
altogether. Here is an example:
Assume the task is to approximate the minimum 0 of the function f(x) = x² with gradient descent xₖ₊₁ = xₖ − η f′(xₖ), starting from x₀ = 1. For η = 0.1 we need 31 steps to get below 0.001, for η = 0.4 only 5. However, for η = 1 the sequence diverges.
Continuing with the example above we show how to implement gradient-descent ap-
proximation in PyTorch. We develop a solution program and introduce some central
concepts along the way.
Example 16.3 The reader is advised to save a copy of the program to a file. Through-
out the section we will make successive changes to the code.
To store the point cloud we use the main data type in PyTorch, the tensor:
1 from torch import tensor
2 inputs = tensor([1., 2., 3.])
3 targets = tensor([1., 2., 2.])
On its own, a PyTorch tensor is very similar to a NumPy array. It can be used to store vectors, matrices, and also higher-dimensional arrays. Additionally, it can also hold scalar values. We come to more specific tensor properties later.
We initialize the parameters 𝑤 and 𝑏, in machine learning commonly called weight
and bias, with some reasonable values:
4 w, b = 1, 0 # initial parameter values
We set the step length 𝜂, in machine learning commonly called learning rate:
5 lr = 0.1 # sets learning rate
We follow the approximation towards the exact parameter values for a defined num-
ber of iterations, in machine learning commonly known as epochs:
6 epochs = 500 # defines number of iterations
Within the loop, first the predicted outputs are computed based on the current pa-
rameter values. This phase is called forward propagation:
8 preds = w * inputs + b # forward propagation
Then the deviation from the desired goal values is determined:
9 errors = preds - targets
Based on the error values, the direction of the steepest descent is calculated according
to equation (2) in the last section:
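The omitted lines 10 and 11 presumably compute the gradient components exactly as in (2):

10 w_grad = 2 * (errors * inputs).mean() # partial derivative wrt w
11 b_grad = 2 * errors.mean() # partial derivative wrt b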
Here ‘mean’ denotes the PyTorch tensor method with the obvious meaning of return-
ing the mean value over all elements in the tensor.
Next, the gradient is scaled by the learning rate and subtracted from the parame-
ters, in order to update the values to a better approximation. This step is called back
propagation:
12 w -= lr * w_grad # back propagation
13 b -= lr * b_grad
Remark One final remark concerns the concept of epochs. In practical numerical
computation, it certainly makes more sense to iterate the approximation until the
error falls below a specified tolerance level. In this chapter, however, the emphasis
is primarily on understanding the basic techniques, so we continue to use the fixed
number of iterations approach.
Automatic Differentiation
For the composite function z = 3y with y = x², the chain rule gives

dz/dx = (dz/dy) · (dy/dx) = 3 · 2x.
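The running example, only partially preserved here, presumably begins with a scalar tensor x:

1 >>> from torch import tensor
2 >>> x = tensor(1., requires_grad=True)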
The value True in the attribute ‘requires_grad’ tells the system to record every com-
putation involving x to allow backpropagating the gradient.
Now consider the computation sequence
3 >>> y = x**2; z = 3*y
The system records these computations, storing the values together with an attribute denoting the type of function with which they were computed. Internally the system keeps track of the full dependency graph.
The backpropagation of the gradient is initiated with
6 >>> z.backward()
The result, here tensor(6.), is stored in the attribute x.grad. Note, however, that any subsequent gradient computation will simply add the new gradient to the one already stored.
To obtain the correct result from the new operation we have to discard the previous
‘grad’ content by inserting
>>> x.grad.zero_() # Out: tensor(0.)
before line 8 to clear the gradient buffer for the new computation.
Then the commands in lines 8 and 9 store the correct answer ‘tensor(2.)’ in the
attribute x.grad.
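Lines 8 and 9 themselves are not preserved; they presumably repeat a gradient computation along the lines of

8 >>> y = x**2
9 >>> y.backward()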
Also in general, whenever we use gradients to update the parameters, we need to zero
the gradients before the next operation. And that is exactly what the ‘zero_’ method
does. Note that in PyTorch a trailing underscore is usually used in names of in-place
operations.
Example 16.5 (Automatic Differentiation) We show how the gradient computation
in our running example can be simplified by automatic differentiation.
In order not to lose the overview, we give the entire program:
1 from torch import tensor , no_grad
2 from torch.nn import MSELoss
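The omitted lines 3 and 4 presumably fix the training data and the hyperparameters as in Example 16.3:

3 inputs = tensor([1., 2., 3.]); targets = tensor([1., 2., 2.])
4 learning_rate = 0.1; epochs = 500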
Then we declare the so-called weight and bias parameters 𝑤 and 𝑏, again as scalar
tensors, but now with the additional attribute that the gradients are to be tracked:
5 w = tensor(1., requires_grad=True)
6 b = tensor(0., requires_grad=True)
Since we want to leave the gradient computation to the system, we need the mean-
squared-error function in undifferentiated form. However we do not have to define
it ourselves. It is delivered ready to use in the package torch.nn:
7 loss_fn = MSELoss() # contained in torch.nn
Note that the identifier ‘nn’ reminds us that the techniques were developed primarily
for neural networks. In addition, note that in machine learning it is more common
to speak of loss instead of error. From now on we follow this convention.
The iteration loop is specified in lines 8-16:
8 for _ in range(epochs):
9 preds = w * inputs + b
10 loss = loss_fn(preds, targets) # replaces manual computation
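The omitted line 11 is the backward call that the following sentence refers to:

11 loss.backward()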
This initializes the gradient computation and stores the resulting values in the tensor
attributes w.grad and b.grad.
The parameter values are then updated in lines 13 and 14:
12 with no_grad():
13 w -= learning_rate * w.grad
14 b -= learning_rate * b.grad
Line 12 calls a context-manager that temporarily disables gradient calculation for the
following statement block. In fact, we would never need to backtrack those compu-
tations.
Finally, as illustrated in Example 16.4, the gradient values for the parameters are
reset to 0 to prepare for the next iteration round:
15 w.grad.zero_()
16 b.grad.zero_()
This ends the iteration loop. The result is the same as before:
17 print(w.detach(), b.detach())
18 Out: tensor(0.5000) tensor(0.6667)
Note that the detach method returns the tensors detached from the gradient graph,
so that only the stored value is printed.
Optimizer
In Example 16.5 we delegated the computation of the parameter gradients to the sys-
tem, but the actual update in lines 13 and 14 was still carried out manually.
In the ‘optim’ package, PyTorch implements various optimization algorithms to facilitate the update steps. The de facto standard is stochastic gradient descent (SGD) or one of its variants. Below we use the default form, which behaves exactly like the manual update in the last example. In particular, despite its name, it makes no stochastic choices itself; stochasticity only enters through the way the training data is supplied.
Example 16.6 We show how the PyTorch SGD algorithm can be used in our linear
regression program.
First, we add the module optim to the list of torch components to be imported:
1 from torch import tensor , optim
Note that the no_grad declaration is no longer needed. The optimizer automatically
controls when to record gradients.
Lines 2-7 are then copied literally from the previous example above.
Then we continue with a new line 8, where we initialize our gradient-descent pro-
cedure. Note that we pass the parameter list and the value of the learning rate:
8 optimizer = optim.SGD([w,b], lr=learning_rate)
We then let the optimizer replace the manual parameter update step:
13 optimizer.step() # updates parameters
14 optimizer.zero_grad() # zero parameter gradients
In line 14 the parameter attributes w.grad and b.grad are again reset to 0 to prepare
for the next iteration round.
The final result is then again the same as before:
15 print(w.detach(), b.detach())
16 Out: tensor(0.5000) tensor(0.6667)
So far we have defined our predictor model explicitly via the parameters in the form
preds = w * x + b
However, in the torch.nn package, PyTorch offers a module Linear as a basic build-
ing block precisely for such linear models.
Example 16.7 We formulate our standard example with Linear instead of the self-
defined predictor function 𝑤𝑥 + 𝑏.
We start by loading the torch components:
1 from torch import tensor , optim
2 from torch.nn import Linear , MSELoss
We define our model. Note that the module Linear is designed for linear and affine
transformations in general. Here we only consider the most basic case, in which
both input and output consist of vectors of length 1. This is specified by the argu-
ments (1, 1):
3 model = Linear(1,1)
Note that the weight and bias parameters are initialized with some random values.
In the following we could start our approximation with these values, or, as we prefer
here, assign our usual start values.
This can be accomplished with:
4 from torch.nn import Parameter
5 model.weight = Parameter(tensor([[1.]]))
6 model.bias = Parameter(tensor([0.]))
We again fix the training data. Note however, that Linear expects the arguments to
be column vectors, more precisely one-column matrices:
7 inputs = tensor([[1.], [2.], [3.]]) # column vectors
8 targets = tensor([[1.], [2.], [2.]])
When defining the optimizer, note that weight and bias are no longer specified ex-
plicitly, but with the ‘parameters’ method, which returns the parameters together
with their current values:
11 optimizer = optim.SGD(model.parameters(), lr=learning_rate)
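The training loop then presumably follows the same pattern as before (the omitted lines 12–14):

12 for _ in range(epochs):
13 preds = model(inputs)
14 loss = loss_fn(preds, targets)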
15 loss.backward()
16 optimizer.step()
17 optimizer.zero_grad()
Machine learning as discussed here rests on two pillars: A solution predictor con-
taining some parameters to be determined and a learning phase where the model’s
current parameter values are iteratively compared to some target values and updated
accordingly. The update process is controlled by gradient descent towards the target.
In addition to the learnable parameters, the specific solution algorithm also depends on the choice of the so-called hyperparameters, here the loss function, the optimizer, the learning rate and the number of iteration epochs. In more complex cases, the design of the model architecture itself is also an important issue. Choosing optimal hyperparameters is usually a matter of trial and error based on experience.
In the next section we will mainly look at backpropagation and the gradient descent
algorithms in general, without focusing on the concept of neural networks. We will
discuss the networks in more detail later.
16.3 Backpropagation
Optimization
The idea of gradient descent to a desired goal actually boils down to an optimization
problem. We illustrate this with a backpropagation algorithm to determine the local
minimum value of the function in Remark 16.2:
Example 16.8 Let 𝑓(𝑥) ∶= −𝑥 sin(𝑥). We compute the local minimum starting from
the point 𝑥 = 4.
First the components:
1 from torch import tensor , sin, optim
The function argument 𝑥 is the parameter that needs to be changed during training.
So we must record the gradient:
2 x = tensor(4., requires_grad = True)
Since we want to minimize the function, we can use the function itself as the loss function:

3 f = lambda x: -x*sin(x)
We take the opportunity to present another optimizer, which seems to perform better here: the very popular Adam optimizer, introduced in 2015. The name Adam is derived from “adaptive moment estimation”.
It extends the gradient descent so that it cleverly adapts the learning rate in the
course of computation. The term momentum means that the next update is deter-
mined as a combination of the gradient and the previous update:
4 optimizer = optim.Adam([x], lr=0.05)
Note that the learning-rate and the number of epochs below were again chosen by
trial and error, so as to yield an acceptable result.
We come to the minimization procedure:
5 for _ in range(200):
6 optimizer.zero_grad()
7 loss = f(x)
8 loss.backward()
9 optimizer.step()
The result matches the one we can obtain with the SciPy operator minimize consid-
ered in Sect. 4.6 in the SciPy chapter.
Example 16.9 (Rosenbrock Function) The same idea also works for two dimen-
sional functions. We illustrate this with a backpropagation algorithm to determine
the minimum value of the Rosenbrock function
f(x, y) := (1 − x)² + 100 (y − x²)².
For background we refer to Sect. 4.6 in the SciPy chapter. Recall that 𝑓 assumes its
minimum value 0 at the point (1, 1).
Again the function arguments 𝑥, 𝑦 are the parameters that need to be changed during
training. So we must record the gradients:
2 x = tensor(0., requires_grad = True) # initial value 0
3 y = tensor(0., requires_grad = True)
Note that here the minimum loss value corresponds to the desired minimum 0 itself.
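The rest of the program is not preserved here; a minimal sketch of the descent loop, analogous to Example 16.8 (the optimizer choice, learning rate and epoch count are assumptions chosen by trial and error):

4 f = lambda x, y: (1 - x)**2 + 100*(y - x**2)**2
5 optimizer = optim.Adam([x, y], lr=0.05)
6 for _ in range(5000):
7 optimizer.zero_grad()
8 loss = f(x, y) # the function itself serves as loss
9 loss.backward()
10 optimizer.step()
11 print(x.detach(), y.detach()) # should approach tensor(1.), tensor(1.)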
Nonlinear Equations
In the SciPy chapter we saw that the solution of nonlinear equations can be viewed as
a search for a minimum. We illustrate how gradient descent in PyTorch can be used
to approximate such solutions.
Some experimenting shows that the basic optimizer SGD gives a reasonable result:
4 optimizer = optim.SGD([x], lr=0.1)
As mentioned, we set ℓ(𝑥) ∶= 𝑓(𝑥)2. This is all we need:
5 for _ in range(20):
6 loss = f(x)**2
7 loss.backward()
8 optimizer.step()
9 optimizer.zero_grad()
10 print(x.detach()) # Out: tensor(-1.0299)
Remark Experimenting with different values for the hyperparameters shows that the solution is very sensitive to changes. For instance, if we increase the learning rate to 0.2, we get a different, wrong result (even if we increase the number of epochs); for a learning rate of 1 the computation diverges. If we start with x₀ = 2 as in the Newton approximation, we also get a wrong result.
Example 16.11 Systems of equations can also be solved by gradient descent. Consider
the same system as in Sect. 4.6 in the SciPy chapter:
y − x³ − 2x² + 1 = 0,
y + x² − 1 = 0.
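The solution program is not preserved here; a minimal sketch, minimizing the sum of the squared residuals by gradient descent (the initial values and all hyperparameters are assumptions):

from torch import tensor , optim
x = tensor(0., requires_grad=True)
y = tensor(0., requires_grad=True)
optimizer = optim.Adam([x, y], lr=0.05)
for _ in range(2000):
    optimizer.zero_grad()
    loss = (y - x**3 - 2*x**2 + 1)**2 + (y + x**2 - 1)**2
    loss.backward()
    optimizer.step()
print(x.detach(), y.detach())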
Linear Algebra
Backpropagation and gradient descent can also be used to solve linear equations in
matrix form. The standard way to solve equations 𝐴𝑥 = 𝑏 by gradient descent is to
minimize ||𝐴𝑥 − 𝑏||2 for the Euclidean norm.
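The program, only partially preserved here, presumably begins with the torch components, as in Example 16.13 below:

1 from torch import tensor , zeros, optim, mm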
Note that ‘zeros’ returns a zero-vector of desired length, ‘mm’ stands for matrix-
multiplication.
We fix an example with a matrix 𝐴 and vector 𝑏 given by
3 A = tensor([[1., 2., 3.], [1., 1., 1.], [3., 3., 1.]])
4 b = tensor([[2.], [2.], [0.]])
The parameters along which we descend are the x-values themselves. We initialize x
as a zero-vector of length 3. The unsqueeze method with argument 1 converts it to
a vector along dimension 1, i.e. a column vector:
5 x = zeros(3).unsqueeze(1)
6 x.requires_grad=True # needed for automatic gradient computation
We come to the optimizer. Here again some experimentation indicates that the Adam
optimizer with initial learning rate 1.0 gives a useable result:
8 optimizer = optim.Adam([x], lr=1.)
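The descent loop itself is omitted in the source; a sketch, with the squared Euclidean norm of the residual as loss (the epoch count is an assumption):

for _ in range(500):
    optimizer.zero_grad()
    loss = ((mm(A, x) - b)**2).sum()
    loss.backward()
    optimizer.step()
print(x.detach().T) # should approach the exact solution (5, -6, 3)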
The next two examples show that the choice of loss criterion can have a major effect.
Example 16.13 We consider an equation 𝐴𝑥 = 𝑏 from Example 4.3 in the SciPy
chapter.
First the torch components:
1 from torch import tensor , zeros, optim, mm
2 from torch.nn import MSELoss
We zero initialize the solution vector and prepare for gradient descent:
6 x = zeros(4).unsqueeze(1)
7 x.requires_grad=True
The solution:
16 print(x.detach().T)
17 Out: tensor([[0.9951, 0.9999, 0.9954, 1.0014]])
Example 16.14 To implement the idea in PyTorch, we only have to make some minor
changes in the program in Example 16.13 above.
Lines 1-7 remain as before.
The loss function from line 13 is replaced by the quadratic form, the initial learn-
ing rate is changed to 0.09, and, most importantly, the number of epochs significantly
reduced to 160:
8 optimizer = optim.Adam([x], lr=0.09)
9 for _ in range(160):
10 optimizer.zero_grad()
11 loss = 1/2 * mm(x.T, mm(A, x)) - mm(x.T, b)
12 loss.backward()
13 optimizer.step()
14 print(x.detach().T) # Out: tensor([[1.000, 1.000, 1.000, 1.000]])
Remark Recall that in the SciPy chapter we used conjugate gradients to reach the
exact result in 4 steps. A PyTorch implementation of the conjugate gradient method
is mentioned in the documentation, but is unfortunately not yet available.
Example 16.15 In this context our linear regression function 𝑥 ↦ 𝑤𝑥+𝑏 in Sect. 16.1
is a simple degenerate example with identity as activation function.
A basic activation function is the threshold function

θ(x) := 1 if x ≥ 0, and 0 else.
Theorem 16.17 Let f: I → ℝ be a continuous function on a closed interval I = [a, b]. Then f can be approximated uniformly by functions of the form

(1)  fₙ(x) := Σᵢ₌₁ⁿ vᵢ · θ(x + bᵢ),  n → ∞.
For n evaluation points xᵢ ∈ I with associated subintervals Iᵢ, define

(2)  fₙ,ᵢ(x) := f(xᵢ) if x ∈ Iᵢ, and 0 else.
For 𝑓 = sin this is illustrated in Fig. 16.4 on the next page, on the left for 𝑓10 , on the
right for 𝑓100 .
It is then clear that, with respect to the usual supremum norm, the sums

(3)  fₙ := Σᵢ₌₁ⁿ fₙ,ᵢ

converge to f.
Note that the development so far applies in general and has nothing to do with neural
networks.
Fig. 16.4 Approximation of the sine function by piecewise constant functions. Left with 10 evalu-
ation points, right with 100. Dots mean that the left end values belong to the line segments
The crucial observation is now that each fₙ,ᵢ in (2) can be expressed by a pair of threshold terms, more precisely as

(5)  fₙ,ᵢ(x) = f(xᵢ) (θ(x − xᵢ) − θ(x − xᵢ₊₁)).
By reordering the terms in (5) we get the following expression for (3):

(6)  fₙ(x) = f(x₁) θ(x − x₁) + Σᵢ₌₂ⁿ (f(xᵢ) − f(xᵢ₋₁)) θ(x − xᵢ).
With 𝑣1 ∶= 𝑓(𝑥1 ) and 𝑣𝑖 ∶= 𝑓(𝑥𝑖 ) − 𝑓(𝑥𝑖−1 ) for 2 ≤ 𝑖 ≤ 𝑛, and all 𝑏𝑖 ∶= −𝑥𝑖 we finally
get the claim from (1), (6) and (4).
This concludes the proof.
Note that the proof is in fact constructive. For a given function f, equation (6) shows how to define weights and biases to generate the approximation functions.
Example 16.18 Consider the function 𝑓 = sin over the interval [0, 2𝜋]. The follow-
ing program shows how to implement approximation functions 𝑓𝑛 .
We begin with the necessary torch components:
1 from torch import tensor , linspace , heaviside , sin
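The omitted line 2 presumably defines the threshold function θ via heaviside:

2 theta = lambda x: heaviside(x, tensor(1.))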
The second argument specifies the value for the argument 0. According to our defi-
nition above it must be 1.
We fix the target function that is to be approximated:
3 f = sin
The following function fn(x,n) is the heart of the program. It implements the ap-
proximation functions 𝑓𝑛 (𝑥):
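The definition itself is omitted in the source; it can be reconstructed from equation (6) and from the sigmoid variant in Example 16.21 below:

4 def fn(x, n):
5 xn = linspace(0, 6.283, n) # evaluation points
6 res = f(x[0]) * theta(x - xn[0])
7 for i in range(n-1):
8 res += (f(xn[i+1]) - f(xn[i])) * theta(x - xn[i+1])
9 return res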
It is immediately verified that lines 6-8 correspond to equation (6) in the proof.
Note that the PyTorch linspace generator in line 5 is basically the same as its
NumPy counterpart.
We illustrate our construction by plotting 𝑓10 and 𝑓100 :
10 x = linspace(0, 6.283, 200)
11 from matplotlib.pyplot import plot
12 plot(x, fn(x, 10)); plot(x, fn(x, 100))
The result is shown in Fig. 16.6 on page 380 below, together with the corresponding
result obtained with sigmoid networks, to which we turn now.
Sigmoid Neurons
Fig. 16.5 Sigmoid function 𝜎 (blue line), Threshold function 𝜃 (orange line). Dot indicates that the
point belongs to the line segment
Note that the only structural difference to the formulation in Theorem 16.17 is that
now the first layer consists of sigmoid neurons 𝑦𝑖 = 𝜎(𝑤𝑖 𝑥 + 𝑏𝑖 ) for which moreover
a weight 𝑤𝑖 must also be taken into account.
To see that the claim holds, in light of the universal approximation theorem for threshold functions it is sufficient to note that the threshold function θ(x) can be approximated as closely as desired by σ(wx) for w → ∞. In fact, in our standard graphic resolution, for w = 50 the function x ↦ σ(wx) can hardly be distinguished graphically from θ, apart of course from the almost vertical line connecting the lower and upper pieces.
Recall that in the threshold version, the argument terms had the form 𝑥+𝑏 without
any weight coefficient 𝑤 for the variable 𝑥. In the present case this is needed however,
since we have to replace 𝜃(𝑥+𝑏) with 𝜎(𝑤𝑥+𝑤𝑏). Actually also the value of 𝑏 changes.
We use the same name 𝑏 once again instead of 𝑤𝑏.
This concludes the proof.
Example 16.21 To illustrate the construction we continue with Example 16.18 and
compute the sigmoid approximations 𝑓𝑛 for 𝑛 = 10 and 𝑛 = 100, where we assume a
multiplication factor 𝑤 = 50.
We just need to make a few minor changes to the program above.
In line 1 the function sigmoid is imported instead of heaviside. Line 2 is deleted.
In the definition of fn(x,n), in lines 6-8 we replace computations of the form 𝜃(𝑥)
with 𝜎(50𝑥), as exemplified in the proof above:
res = f(x[0]) * sigmoid(50*(x - xn[0]))
for i in range(n-1):
res += (f(xn[i+1]) - f(xn[i])) * sigmoid(50*(x - xn[i+1]))
The result is illustrated on the right in Fig. 16.6 on the following page.
Fig. 16.6 Approximation of f = sin with threshold (left) and sigmoid (right) neural networks, f₁₀ (blue) and f₁₀₀ (orange). For the sigmoid network a weight factor w = 50 is assumed
The universal approximation theorem implies that neural networks can represent
any continuous function on a closed interval with any desired precision. Simple two-
layer networks are actually sufficient. Moreover, since the network is based on dif-
ferentiable functions we can use backpropagation algorithms like gradient descent
to determine parameter values.
In the following sections we examine how gradient descent algorithms can be im-
plemented in neural networks in various problem areas with PyTorch, especially in
function approximation and in ordinary and partial differential equations.
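Definition 16.22 itself is not preserved here; from the surrounding references, the canonical two-layer network with n hidden neurons presumably reads:

1 from torch.nn import Sequential , Linear , Sigmoid
2 N = Sequential(
3 Linear(1, n), # first layer: n sigmoid neurons
4 Sigmoid(),
5 Linear(n, 1, bias=False)) # output layer: weighted sum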
In line 1, all required building blocks are imported from the package torch.nn.
The Sequential constructor in line 2 collects the modules in lines 3–5 and exe-
cutes them sequentially, passing the output of one layer as input to the next.
Note that the components in N can be accessed just as in lists: N[0] refers to the
module in line 3, N[2] to the one in line 5.
Remark Conversely we can also assign parameter values manually, for instance by:
>>> from torch import tensor
>>> from torch.nn import Parameter
>>> N[0].weight = Parameter(tensor([[1.], [2.]])) # assume n = 2
As a first example we train a neural network to approximate the Runge function, al-
ready discussed several times in previous chapters.
Example 16.23 Consider the Runge function
f(x) := 1/(1 + 25x²),  x ∈ [−1, 1].
We start by importing the necessary components:
1 from torch import linspace , optim
2 from torch.nn import Sequential , Linear , Sigmoid , MSELoss
As mentioned, the linspace generator is basically the same as its NumPy counter-
part. Again we need the input vector as a column vector, i.e. as a 1-column matrix.
As seen before, the transformation is accomplished by the unsqueeze method for the
dimension number 1.
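A plausible reconstruction of the omitted lines 3 and 4 (the number of sample points is an assumption):

3 x = linspace(-1, 1, 50).unsqueeze(1)
4 f = lambda x: 1/(1 + 25*x**2)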
For the neural network we choose one with the canonic form in Definition 16.22:
5 N = Sequential(Linear(1,30), Sigmoid(), Linear(30, 1, bias=False))
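The omitted lines 6 and 7 presumably define the loss function and the optimizer; the closure construction below indicates the LBFGS optimizer (the learning rate is an assumption):

6 loss_fn = MSELoss()
7 optimizer = optim.LBFGS(N.parameters(), lr=0.1)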
In the iteration loop, only note that the gradient computation is embedded in the
function closure, which is then called by the optimizer in line 14.
8 for _ in range(100):
9 def closure():
10 optimizer.zero_grad()
11 loss = loss_fn(N(x), f(x))
12 loss.backward()
13 return loss
14 optimizer.step(closure)
The result is so close to the exact one, that the difference does not show up. In Fig. 16.7
on the next page we thus plot the values predicted by the model, and additionally the
deviation from the exact solution.
Fig. 16.7 Left: Runge function computed by a neural network (blue line), deviation from the exact
function (orange line). Right: Deviation in higher resolution.
We construct the domain with the PyTorch cartesian_prod operator with its obvi-
ous meaning:
4 n = 30
5 x = linspace(-2, 2, n)
6 y = linspace(-2, 4, n)
7 xy = cartesian_prod(x,y)
Again we choose a canonic model implied by the approximation theorem. Note that
here we introduce one weight parameter for each dimension:
9 N = Sequential(
10 Linear(2, n*n), # Two input variables
11 Sigmoid(),
12 Linear(n*n, 1, bias=False))
In line 22 we use the PyTorch version of the meshgrid function. It returns tensors,
which are however understood by Matplotlib. Note that the indexing='ij' option
is necessary to match its NumPy counterpart.
In line 23, the detach method extracts the result from the network output N(xy),
which is then converted into a NumPy array and reshaped so that it corresponds to
the shape of the grid inputs obtained from the tensors in lines 5 and 6.
The rest is then pure Matplotlib code:
25 from matplotlib.pyplot import figure
26 fig = figure()
27 ax = fig.add_subplot(projection='3d')
28 ax.plot_surface(X, Y, Z, cmap="jet")
29 ax.plot_wireframe(X, Y, Z - Z_ex, color='#ff7f0e')
30 fig = figure()
31 ax = fig.add_subplot(projection='3d')
32 ax.plot_surface(X, Y, Z - Z_ex, cmap="jet")
16.6 Derivation
So far we have used automatic derivation to control the gradient descent towards
optimal solutions. However, the underlying mechanisms can also be used directly.
For this the function grad in the module autograd is the workhorse. We’ll briefly
discuss how it works.
Fig. 16.8 Rosenbrock function computed by a neural network. Left the approximation, the devia-
tion from the exact function shown in orange. Right the error function in higher resolution
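Example 16.25 itself is omitted in the source; from the references to it, it presumably reads along these lines:

1 from torch import tensor , autograd
2 x = tensor(1., requires_grad=True)
3 u = x**3
4 dudx, = autograd.grad(u, x) # Out: tensor(3.)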
Note the comma after the output variable dudx. The reason is that grad outputs a
tuple, in this case a 1-tuple. Recall that in Python a 1-tuple is always written with a
trailing comma.
Derivative Functions
Consider, for example, the function u(x₁, x₂) := (x₁³, x₂³) with the Jacobian matrix

J(x₁, x₂) = (∂xᵢ³/∂xⱼ)₁≤i,j≤2 = [3x₁², 0; 0, 3x₂²].
Note that the desired values are positioned on the main diagonal, where they can be
retrieved through left multiplication with a one-vector 𝑣 = (1, 1):
(∗)  v J = (1, 1) [3x₁², 0; 0, 3x₂²] = (3x₁², 3x₂²).
With the declaration in line 4, the grad function with option ‘grad_outputs = v’ in
line 5 implements (∗), as desired.
This is all we need to approximate first derivative functions in PyTorch. For higher-
order derivatives some care must be taken.
Higher-Order Derivatives
We get an error message. The reason is that dudx is returned as a simple tensor(3.), so that the system cannot see how it was computed and thus cannot backtrack the gradient.
If we replace line 4 in Example 16.25 with
4 dudx, = autograd.grad(u, x, create_graph=True)
we get the result tensor(3., grad_fn=<MulBackward0>) for dudx, which can then
in line 7 be used to compute the correct value for d2udx2.
We put it all together and define a differential operator that we will use extensively in
the further course of the chapter.
Definition 16.28 (diff Operator) The following differential operator will be central:
1 def diff(u, x):
2 from torch import autograd , ones_like
3 grad, = autograd.grad(outputs = u, # function u derived by
4 inputs = x, # argument tensor x
5 grad_outputs = ones_like(u),
6 create_graph = True)
7 return grad
The input tensor x must be provided as a column vector, which also has the requires_grad option set to True. u is expected to be a function of x.
Note that in line 2 we have collected all torch components that we need to make
the function standalone.
In lines 3 and 4 we use a more verbose formulation to specify in- and outputs.
As in Example 16.26 above, in line 5 we declare the vector by which we multiply the
Jacobian. The vector-generator ‘ones_like’ supplies a tensor of the same shape as
the argument u, filled with the scalar value 1.0.
As in Example 16.27, the option in line 6 ensures that the operator can also be
used for higher-order derivatives.
In fact, we’re going to use a second order derivative operator as well, the operator
diff2 defined by
def diff2(u,x): return diff(diff(u,x), x)
For later use we store both operators in a file derivate.py, from where they can be
retrieved e.g. with
from derivate import diff, diff2
Example 16.29 We compute first and second derivative of the arctangent function:
1 from torch import linspace , atan
2 from derivate import diff, diff2
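The omitted lines 3 and 4 presumably set up the argument vector (the interval [−6, 6] matches the plot in Fig. 16.9; the number of points is an assumption):

3 x = linspace(-6, 6, 100).unsqueeze(1)
4 x.requires_grad = True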
The unsqueeze method in line 3 converts the linspace vector to a vector along di-
mension 1, i.e. a column vector. Line 4 sets the gradient requirement.
This is all we need. In line 5 we introduce our function, compute the first derivative
in line 6 and the second in line 7:
5 u = atan(x) # base function
6 dudx = diff(u, x) # first derivative
7 d2udx2 = diff2(u, x) # second derivative
Note that the detach method in lines 9-11 extracts the pure data parts from the ten-
sors, essentially converting them to NumPy arrays, so that they can be handled by
the matplotlib ‘plot’ command.
The result is shown in Fig. 16.9 on the next page.
Fig. 16.9 atan(x) together with its first derivative d/dx atan(x) and its second derivative d2/dx2 atan(x)
Partial Derivatives
We can also use our differential operator diff to compute partial derivatives. As an
example that will be useful to us later, we show how to compute the Laplacian Δ𝑓 of
a function 𝑓∶ ℝ2 → ℝ. For details we refer to Sect. 4.8 in the SciPy chapter.
Definition 16.30 (Laplace Operator) The definition is based on the diff operator
in Definition 16.28:
1 def laplace(u, xy):
2 grads = diff(u, xy)
3 dudx, dudy = grads[:, 0], grads[:, 1]
4 du2dx2 = diff(dudx, xy)[:, 0]
5 du2dy2 = diff(dudy, xy)[:, 1]
6 return (du2dx2 + du2dy2).unsqueeze(1)
For the arguments in line 1 we assume that xy is a cartesian product of x and y values,
which also satisfies the gradient requirement. The first argument u is again assumed
to be a function of xy.
For later use the operator is saved together with the operators diff and diff2 in the
file derivate.py.
Example 16.31 As an application we consider the function f: [0, 1]² → ℝ given by

f(x, y) := 1 + x² + 2y²,

for which Δf = 2 + 4 = 6.
We fix the domain. For this we again use the cartesian_prod operator with its ob-
vious meaning:
3 x = linspace(0,1,50); y = linspace(0,1,50)
4 xy = cartesian_prod(x,y)
5 xy.requires_grad=True
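The evaluation itself is omitted; a plausible reconstruction, which should print the constant value 6 at every grid point:

6 u = 1 + xy[:, 0]**2 + 2*xy[:, 1]**2
7 print(laplace(u, xy)) # constant value 6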
The output confirms the discussion in Sect. 15.3 in the FEniCS chapter.
16.7 Integration
Now that we have the basic derivation tools in place, let’s continue with neural net-
work computations.
First we want to use machine learning to determine definite integrals of the form
u(x) = ∫_c^x f(s) ds
for some constant 𝑐 in some interval [𝑎, 𝑏]. Note that 𝑢(𝑐) = 0.
A naive approach would be to follow the idea in Example 16.23 and 16.24 and train
a network 𝑁 that outputs the integral function directly.
Note 16.32 However, it turns out to be more efficient to exclude the approximation to u(c) = 0 from the training process altogether, and rather introduce a modifiable trial function u_t such that by construction u_t(c) = 0 always holds. This can be accomplished as follows:

(1) Take a trial function of the form u_t(x) := (x − c) N(x), which vanishes at c by construction (cf. the analogous constructions in the following examples).

(2) Minimize the loss between the current values of u_t′ and the integrand f.
Example 16.33 We consider the function 𝑓(𝑥) ∶= 1/(1 + 𝑥 2 ) and 𝑐 = 0, with the
known solution 𝑢 = arctan.
We use our diff operator introduced in Definition 16.28.
Our neural network again has the simplest form, as in Definition 16.22, in accordance with the universal approximation theorem:
6 N = Sequential(Linear(1,50), Sigmoid(), Linear(50, 1, bias=False))
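The rest of the program is only sketched here, assuming the grid x is prepared as in Example 16.29 and optim and MSELoss are imported in the preamble; the learning rate and epoch count are assumptions:

f = lambda x: 1/(1 + x*x)  # the integrand
loss_fn = MSELoss()
optimizer = optim.LBFGS(N.parameters(), lr=0.1)
for _ in range(50):
    def closure():
        optimizer.zero_grad()
        global u
        u = x * N(x)  # trial function from Note 16.32, with c = 0
        loss = loss_fn(diff(u, x), f(x))  # goal (2): match u_t' with f
        loss.backward()
        return loss
    optimizer.step(closure)

The resulting approximation and its deviation from the exact solution are shown in Fig. 16.10.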
Fig. 16.10 Left: approximation function. Right: deviation from the exact solution arctangent
Example 16.34 Extending the idea, we consider the following initial value problem, discussed in Example 4.13 in the SciPy chapter:

𝑢′ = 𝑥 − 𝑢, 𝑢(0) = 1.

As before, we train a network 𝑁 such that

(1) 𝑢𝑡 ∶= (1 − 𝑥) + 𝑥 𝑁 → 𝑢,
because then the initial value 𝑢𝑡 (0) = 1 is always correct, regardless of the current
parameter values in 𝑁.
The training is controlled by:
(2) Minimize the loss between the current values of 𝑢′𝑡 and 𝑥 − 𝑢𝑡 .
The program begins as usual with the import of the required components:
1 from torch import linspace, optim
2 from torch.nn import Sequential, Linear, Sigmoid, MSELoss
3 from derivate import diff
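Line 4 sets up the sample grid; a plausible form (the interval and sample count are assumptions):

4 x = linspace(0., 1., 50, requires_grad=True).unsqueeze(1)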
We stay with the same network architecture as in the above integration example:
5 N = Sequential(Linear(1,50), Sigmoid(), Linear(50, 1, bias=False))
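The program is completed by the training loop, which follows the same pattern as in the integration example; a minimal sketch (loss function, optimizer settings and epoch count are assumptions):

loss_fn = MSELoss()
optimizer = optim.LBFGS(N.parameters(), lr=0.1)
for _ in range(50):
    def closure():
        optimizer.zero_grad()
        global u
        u = (1 - x) + x * N(x)  # trial function (1), u_t(0) = 1
        loss = loss_fn(diff(u, x), x - u)  # goal (2): match u_t' with x - u_t
        loss.backward()
        return loss
    optimizer.step(closure)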
The resulting solution graph looks exactly like the one in Fig. 4.4 on page 77 in the SciPy chapter, so we don't repeat it here.
In the SciPy and Julia chapters we discussed systems of coupled equations, using the
Lotka-Volterra equations as a prime example.
Example 16.35 We illustrate the main idea with a significantly simpler example,
given by
𝑢′ = 𝑣, 𝑣′ = −𝑢, 𝑥 ∈ [0, 2𝜋],
𝑢(0) = 0, 𝑣(0) = 1,
with the obvious solution 𝑢 = sin, 𝑣 = cos.
This time we introduce two neural nets 𝑁𝑢, 𝑁𝑣, and train them so that

(1) 𝑢𝑡 ∶= 𝑥 𝑁𝑢 → 𝑢, 𝑣𝑡 ∶= (1 − 𝑥) + 𝑥 𝑁𝑣 → 𝑣.
(2) Minimize the losses between 𝑢′𝑡 and 𝑣𝑡, and between 𝑣′𝑡 and −𝑢𝑡.
Here is the PyTorch program. As usual we begin with the import of the components:
1 from torch import linspace, optim
2 from torch.nn import Sequential, Linear, Sigmoid, MSELoss
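Lines 3 and 4 import the diff operator and set up the grid over [0, 2𝜋]; a plausible form (the sample count is an assumption):

3 from derivate import diff
4 x = linspace(0., 6.283, 50, requires_grad=True).unsqueeze(1)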
The new approach is to approximate 𝑢 and 𝑣 using two connected networks. Note
that both are shaped according to our standard architecture:
5 N_u = Sequential(Linear(1,50), Sigmoid(), Linear(50,1,bias=False))
6 N_v = Sequential(Linear(1,50), Sigmoid(), Linear(50,1,bias=False))
In the present case it turns out to be advantageous to sum up the elementwise losses
instead of taking the mean. This is achieved by adding a corresponding option to the
MSELoss function:
7 loss_fn = MSELoss(reduction="sum")
We want to use our LBFGS optimizer. To do this, we first merge both parameter lists
into one, which we can then pass to the optimizer:
8 params = list(N_u.parameters()) + list(N_v.parameters())
9 optimizer = optim.LBFGS(params , lr=0.1)
Recall that the optimizer has to perform the loss computation repeatedly within each
epoch. As before, we embed it in a closure function:
10 for _ in range(150):
11     def closure():
12         optimizer.zero_grad()
13         global u, v  # needed for plotting
14         u = x * N_u(x)  # see (1) above
15         v = (1-x) + x * N_v(x)  # ditto
16         dudx = diff(u, x)
17         dvdx = diff(v, x)
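The closure is completed by summing the two losses from (2) and returning the result; a plausible form of the remaining lines:

18         loss = loss_fn(dudx, v) + loss_fn(dvdx, -u)
19         loss.backward()
20         return loss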
The optimizer then adjusts the parameters using the closure function:
21 optimizer.step(closure)
The result is the pair of sine and cosine functions, so it does not need to be shown here.
Remark In the exercises you will be asked to solve the Lotka-Volterra equation from
Example 4.14 in the SciPy chapter. Note that this is actually a challenge that may
not be easily solvable with the currently available optimizers and loss functions in
PyTorch.
16.9 Second Order ODEs

Second order ODEs can be solved directly as single equations in neural networks. The difference from first order equations is that we now control the approximation with second derivatives.
Example 16.36 We consider the following 2-point boundary value problem with (homogeneous) Dirichlet boundary conditions, discussed in Sect. 5.6 in the SymPy and Sect. 10.7 in the Maple chapter:

−𝑢″ = 𝑓 in [0, 1], 𝑢(0) = 𝑢(1) = 0.

As in the previous examples we factor out the boundary conditions from the training process. More precisely, we train a network 𝑁 such that

(1) 𝑢𝑡 = (1 − 𝑥) 𝑥 𝑁 → 𝑢,

so that 𝑢𝑡(0) = 𝑢𝑡(1) = 0 by construction. The training goal is then:

(2) Minimize the loss between −𝑢″𝑡 and 𝑓.
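The program preamble follows the familiar pattern; a minimal sketch, where the grid and the placeholder source term f are assumptions (the actual 𝑓 is the one chosen in Sect. 5.6):

1 from torch import linspace, optim
2 from torch.nn import Sequential, Linear, Sigmoid, MSELoss
3 from derivate import diff2
4 x = linspace(0., 1., 50, requires_grad=True).unsqueeze(1)
5 f = lambda x: x*(1 - x)  # hypothetical placeholder for the source term
6 N = Sequential(Linear(1,50), Sigmoid(), Linear(50, 1, bias=False))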
Loss function and optimizer are as usual, just note that we’re including the ‘sum’ op-
tion in the loss function:
7 loss_fn = MSELoss(reduction="sum")
8 optimizer = optim.LBFGS(N.parameters(), lr=0.01)
In the loop, observe that line 13 implements (1) above, and line 15 implements (2).
The ‘global’ specification in line 12 is only used to make the resulting ‘u’ value ac-
cessible to the plot function in line 20:
9 for _ in range(50):
10     def closure():
11         optimizer.zero_grad()
12         global u
13         u = (1-x) * x * N(x)  # implements (1)
14         d2udx2 = diff2(u,x)
15         loss = loss_fn(-d2udx2, f(x))  # implements (2)
16         loss.backward()
17         return loss
18     optimizer.step(closure)
The plot:
19 from matplotlib.pyplot import plot
20 plot(x.detach(), u.detach())
The graph looks exactly like the ones in Fig. 5.3 on page 114, calculated with SymPy, and in Fig. 10.5 on page 241, calculated with Maple, so we will not repeat it here.
Remark In previous chapters we discussed the Bratu equation, see e.g. Example 4.18
or Example 8.28. Recall that it has two solutions, as shown in Fig. 4.8 on page 82. In
the exercises you will be asked to find both. In fact, it is possible that PyTorch may
not be quite ready for this just yet.
In a second order initial value problem we specify the function value 𝑢(𝑎) = 𝛼 for the start value 𝑎 as well as the first derivative 𝑢′(𝑎) = 𝛾. As a canonical trial function 𝑢𝑡 we set

(∗) 𝑢𝑡(𝑥) = 𝛼 + (𝑥 − 𝑎) 𝛾 + (𝑥 − 𝑎)² 𝑁,

which satisfies 𝑢𝑡(𝑎) = 𝛼 and 𝑢′𝑡(𝑎) = 𝛾 regardless of the network parameters.
Example 16.37 Consider the following initial value problem with the obvious solu-
tion 𝑢 = sin:
𝑢″ = −𝑢, 𝑥 ∈ [0, 2𝜋],
𝑢(0) = 0, 𝑢′ (0) = 1.
Observing the implementation of (∗) in line 12, the program is then straightforward:
1 from torch import linspace, optim
2 from torch.nn import Sequential, Linear, Sigmoid, MSELoss
3 from derivate import diff2
4 x = linspace(0., 6.283, 50, requires_grad=True).unsqueeze(1)
5 N = Sequential(Linear(1,50), Sigmoid(), Linear(50, 1, bias=False))
6 loss_fn = MSELoss()
7 optimizer = optim.LBFGS(N.parameters())
8 for _ in range(20):
9     def closure():
10         optimizer.zero_grad()
11         global u
12         u = x + x*x * N(x)  # implements (*), alpha = 0, gamma = 1
13         d2udx2 = diff2(u, x)
14         loss = loss_fn(d2udx2, -u)
15         loss.backward()
16         return loss
17     optimizer.step(closure)
As already mentioned, note that the computation may diverge due to the random
initialization of the network. When it converges, the remaining loss after 20 epochs
is typically around 0.0002.
Remark Note that the example is only intended as a proof of concept. In practice,
the current PyTorch tools do not seem to be sufficient to guarantee solutions in gen-
eral cases. In the exercises you will be asked to solve the pendulum equation from
Example 4.15 in the SciPy chapter as a challenge.
16.10 Partial Differential Equations

Example 16.38 As a first PDE example we consider a Poisson problem with homogeneous Dirichlet boundary conditions on the unit square Ω = (0, 1)², discussed in Sect. 4.8 in the SciPy chapter:

−Δ𝑢 = 𝑓 ≡ 1 in Ω,
𝑢 = 𝑔 ≡ 0 on the boundary 𝜕Ω.

As before we factor out the boundary conditions and train a network 𝑁 such that

(∗) 𝑢𝑡 ∶= 𝐵 𝑁 → 𝑢, where 𝐵(𝑥, 𝑦) ∶= 𝑥 (1 − 𝑥) 𝑦 (1 − 𝑦)

vanishes on 𝜕Ω. Approximation is this time controlled by comparing the Laplacian −Δ𝑢𝑡 with the source function 𝑓.
Our PyTorch solution program begins with the preamble:
1 from torch import linspace, optim, cartesian_prod, meshgrid, ones
2 from torch.nn import Sequential, Linear, Sigmoid, MSELoss
3 from derivate import laplace
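The remainder of the program follows the now familiar pattern; a minimal sketch (grid size n, network width, learning rate and epoch count are assumptions):

n = 50
xs = linspace(0, 1, n); ys = linspace(0, 1, n)
xy = cartesian_prod(xs, ys)
xy.requires_grad = True
N = Sequential(Linear(2, 50), Sigmoid(), Linear(50, 1, bias=False))
loss_fn = MSELoss(reduction="sum")
optimizer = optim.LBFGS(N.parameters(), lr=0.01)
f = ones(n*n, 1)  # source term f = 1 on the grid
B = (xy[:,0]*(1 - xy[:,0])*xy[:,1]*(1 - xy[:,1])).unsqueeze(1)  # (*) factor
for _ in range(50):
    def closure():
        optimizer.zero_grad()
        global u
        u = B * N(xy)  # trial function (*), vanishes on the boundary
        loss = loss_fn(-laplace(u, xy), f)
        loss.backward()
        return loss
    optimizer.step(closure)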
As a result we get something like this: a remaining loss of size 0.0127, and a peak
value of 0.0735 for 𝑢, compared to 0.0736 obtained in the SciPy and Matlab ap-
proximations.
Note once more that the program execution may diverge on some runs. This be-
havior is again due to the random initialization of the parameter values.
We prepare the data for the plot:
23 X, Y = meshgrid(xs, ys)
24 Z = u.reshape(n,n).detach().numpy()
We skip the plot itself. It will look just as the one in Fig. 4.9 on page 86 in the SciPy
chapter.
For nonhomogeneous boundary conditions we extend the trial function (∗) to

(∗∗) 𝑢𝑡 ∶= 𝐶 + 𝐵 𝑁,

where 𝐶 = 𝐶(𝑥, 𝑦) now is a function that assumes the correct values on the boundaries. What it does in the inner region is not essential. Such effects on 𝑢𝑡 are compensated for during training anyway.
Below we provide a simple example to illustrate the idea. Again, current PyTorch
tools seem generally inadequate for handling more complex cases. We refer to the
exercises.
Example 16.39 We consider a Laplace equation studied on various occasions before, e.g. in Sect. 12.5 in Chap. 12:

Δ𝑢 = 0 in Ω = (0, 1)², 𝑢 = 𝑔 on 𝜕Ω,

where

𝑔(𝑥, 𝑦) ∶= 𝑥 (1 − 𝑥) if 𝑦 = 0, and 0 else.
We take the approach as before, with 𝐵 as in the last example. Additionally we include
a term 𝐶 to take care of the nonhomogeneous boundary conditions according to
(∗∗). We set
𝐶(𝑥, 𝑦) ∶= (1 − 𝑦) 𝑥 (1 − 𝑥) .
We see that 𝐶 corresponds to 𝑔 on the boundary. As mentioned, what it does inside
the domain is of no concern. The training of the network cancels out such effects.
We come to the program. It is only a slight modification of the last one in Exam-
ple 16.38. In line 1, we only have to include the ‘zeros’ tensor constructor in the
import list. Lines 2–9 can then be copied literally.
We then add a line to encode the boundary value function 𝐶:
C = ((1 - y)*x*(1 - x)).unsqueeze(1)
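Here x and y are assumed to denote the coordinate columns of xy from the copied preamble. The training loop then differs from the previous example only in the trial function and in the zero right-hand side; a minimal sketch:

zero = zeros(n*n, 1)  # right-hand side of the Laplace equation
for _ in range(50):
    def closure():
        optimizer.zero_grad()
        global u
        u = C + B * N(xy)  # trial function (**)
        loss = loss_fn(laplace(u, xy), zero)
        loss.backward()
        return loss
    optimizer.step(closure)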
The result can then be plotted with the standard Matplotlib methods. Here we only
prepare the inputs, and leave the details to the reader:
X, Y = meshgrid(xs, ys)
Z = u.detach().reshape(n,n).numpy()
Fig. 16.11 Laplace equation. Boundary value 𝑥 (1 − 𝑥) if 𝑦 = 0 and 0 otherwise
Exercises
Curve Fitting
Fig. 16.12 Blue: sine curve. Orange: best fit third degree polynomial
Activation Functions
Exercise 16.2 In addition to the sigmoid function, there are other frequently used
activation functions in neural networks, including the hyperbolic tangent
tanh(𝑥) = (𝑒^𝑥 − 𝑒^(−𝑥)) / (𝑒^𝑥 + 𝑒^(−𝑥))
and the rectified linear unit
𝑅(𝑥) = max(0, 𝑥).
Check whether the universal approximation theorem 16.20 also applies to these.
Derivation
𝑓(𝑥, 𝑦) ∶= 1 + 𝑥² + 2𝑦²
Integration
Exercise 16.4 Write a program that computes the Gauss error function
erf(𝑥) ∶= (2/√𝜋) ∫_0^𝑥 𝑒^(−𝑠²) 𝑑𝑠, 𝑥 ∈ [−3, 3].
In Example 16.33 we could test the approximation by comparing it with the exact solution. This is not possible in the present case, because the error function has no representation in terms of elementary functions.
Instead, write a program to test whether the derivative of the error function re-
turns the integrand. If your program reports an error, it could be useful to take a
closer look at line 20 in the program in Example 16.38.
Differential Equations
Exercise 16.5 In extension of Example 16.35, write a program that solves the Lotka-
Volterra equation
𝑢′ (𝑡) = 𝑢(𝑡) (1 − 𝑣(𝑡)) ,
𝑣′ (𝑡) = −𝑣(𝑡) (1 − 𝑢(𝑡)) , 𝑡 ∈ [0, 4𝜋],
𝑢(0) = 1, 𝑣(0) = 4,
discussed in Example 4.14 in the SciPy chapter.
Exercise 16.6 In Example 4.18 in the SciPy chapter and Example 8.28 in the Julia chapter we discussed the Bratu equation. Recall from the Remark after Example 16.36 that it has two solutions. Write a neural network program that finds both.
In Example 16.37 we solved a simple second order initial value problem. We extend
that approach to compute the angle 𝜃 of a pendulum that is subjected to gravity with
friction, as in Example 4.15 in the SciPy chapter:
Exercise 16.7 Write a neural network program to determine 𝜃 such that
𝜃″(𝑡) + (1/4) 𝜃′(𝑡) + 5 sin(𝜃(𝑡)) = 0,
𝜃(0) = 𝜋 − 0.1, 𝜃′(0) = 0.
Note that the angle function 𝜃 here should not be confused with the Heaviside thresh-
old function in Sect. 16.4.
Exercise 16.8 Using the ideas from Example 16.39, extend the homogeneous Poisson equation in Example 16.38 to a nonhomogeneous one, where now the boundary-value function is given by 𝑔 = 𝑒^(𝑥+𝑦/2) and the source function by 𝑓 = 1.25 𝑒^(𝑥+𝑦/2). For
details and illustration see Sect. 4.8 in the SciPy chapter.
In Sect. 16.4 we showed that neural networks are universal function approximators.
Another way to examine their power in general computations is to see how they can
be used to implement the elementary logical operators underlying computation, the
Boolean operators like “not”, “and” and “or”.
In this section, we take a closer look at the mechanisms of neural networks and
discuss the representation of logical operators by threshold and learnable sigmoid
networks.
Boolean Operators
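The operators we consider are summarized in the following truth table; the values follow directly from the definitions explained below:

𝑥 𝑦 | ¬𝑥  𝑥 ∧ 𝑦  𝑥 ∨ 𝑦  𝑥 ∣ 𝑦  𝑥 ⊻ 𝑦
0 0 |  1    0      0      1      0
0 1 |  1    0      1      1      1
1 0 |  0    0      1      1      1
1 1 |  0    1      1      0      0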
Here 0 and 1 stand for the truth values false and true. The operators ‘¬’, ‘∧’, and ‘∨’ de-
note the logical operations negation, conjunction, and disjunction. The Sheffer stroke ‘∣’
denotes a logical operation that is equivalent to the negation of the conjunction op-
eration, expressed in ordinary language as “not both”. It is also called nand. The ‘⊻’
operator stands for “either or, but not both”. It is often referred to as exclusive dis-
junction and denoted ‘xor’.
Except for exclusive disjunction, each of the above operators can be easily expressed
by a single threshold neuron. Possible solutions are for example:
(∗) ¬𝑥 = 𝜃(−𝑥 + 0.5),
    𝑥 ∧ 𝑦 = 𝜃(𝑥 + 𝑦 − 1.5),
    𝑥 ∨ 𝑦 = 𝜃(𝑥 + 𝑦 − 0.5),
    𝑥 ∣ 𝑦 = 𝜃(−𝑥 − 𝑦 + 1.5).
(The inequalities (1)-(4) referred to here arise from evaluating a single threshold neuron 𝜃(𝑤₁𝑥 + 𝑤₂𝑦 + 𝑏) on the four Boolean input pairs, assuming it computes the exclusive disjunction.) From (2) and (4) we get 𝑤₁ < 0, which implies 𝑏 > 0 by (3), in contradiction to (1). This concludes the proof.
We can however represent exclusive disjunction by a neural network with two layers of threshold neurons. Since 𝑥 ⊻ 𝑦 = (𝑥 ∨ 𝑦) ∧ (𝑥 ∣ 𝑦), we can use the equalities in (∗) above to compute

(∗∗) 𝑥 ⊻ 𝑦 = 𝜃(𝜃(𝑥 + 𝑦 − 0.5) + 𝜃(−𝑥 − 𝑦 + 1.5) − 1.5).
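A quick numerical check of (∗) and (∗∗) in PyTorch; this sketch assumes the heaviside function with the value 1 chosen at the jump point 0:

from torch import tensor, heaviside

one = tensor(1.)
def theta(t): return heaviside(t, one)

x = tensor([0., 0., 1., 1.]); y = tensor([0., 1., 0., 1.])
print(theta(x + y - 1.5))  # conjunction: tensor([0., 0., 0., 1.])
print(theta(x + y - 0.5))  # disjunction: tensor([0., 1., 1., 1.])
print(theta(theta(x + y - 0.5) + theta(-x - y + 1.5) - 1.5))  # xor: [0., 1., 1., 0.]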
Linear Separability
Fig. 16.13 For the Boolean disjunction (left) and exclusive disjunction (right), red dots represent the value 0, green the value 1. The blue line on the left is given by 𝑥 + 𝑦 − 1/2 = 0
For the disjunction ‘∨’ the pair of argument sets mapped to 0 and 1, respectively, can
be separated by a line, here in fact the line given by 𝑥 + 𝑦 − 1/2 = 0, as in (∗) above.
In other words, the ‘∨’ operator induces a binary classifier of the sets 𝐶0 = {(0, 0)}
and 𝐶1 = {(0, 1), (1, 0), (1, 1)}.
On the other hand, it is obvious that there can be no line that separates the sets of
green and red dots for the ‘⊻’ operator on the right. The sets 𝐶0 = {(0, 0), (1, 1)} and
𝐶1 = {(0, 1), (1, 0)} are not linearly separable.
In fact, also in arbitrary dimension 𝑛, a single threshold neuron can serve as a binary classifier for a pair of sets 𝐶0, 𝐶1 ⊆ ℝⁿ if and only if the pair is linearly separable by a hyperplane.
Using sigmoid activation, we can train neural networks by gradient descent to compute the above logical functions. However, since the sigmoid function never reaches the values 1 or 0 exactly, the result can only be an approximation. We therefore interpret an output > 0.99 as 1 and an output < 0.01 as 0.
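As a concrete example, consider training a single sigmoid neuron to compute negation. A minimal sketch of the setup in lines 1-7 (the optimizer choice and learning rate are assumptions; line 6 contains the Linear(1,1) declaration referred to below):

1 from torch import tensor, optim
2 from torch.nn import Sequential, Linear, Sigmoid, MSELoss
3 inputs = tensor([[0.], [1.]])  # the two Boolean inputs
4 targets = tensor([[1.], [0.]])  # negation: 0 -> 1, 1 -> 0
5 loss_fn = MSELoss()
6 N = Sequential(Linear(1, 1), Sigmoid())
7 optimizer = optim.Adam(N.parameters(), lr=0.5)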
The network parameters are then iteratively updated in the following loop:
8 for _ in range(50):
9     optimizer.zero_grad()  # zero-clear gradients
10     preds = N(inputs)  # make predictions
11     loss = loss_fn(preds, targets)  # compute loss
12     loss.backward()  # compute gradients
13     optimizer.step()  # update parameters
When the loop is finished, the result can be printed out with
14 print(preds.detach().T) # Out: tensor([[0.9973, 0.0031]])
Remark Note again that the model parameters are automatically initialized by some
random values. This implies that the output results will usually vary between program
runs. Also the number of epochs needed to reach the desired accuracy may vary.
The binary operators can be trained in the same way. For the conjunction this amounts to extending the inputs to all four Boolean pairs with targets tensor([[0.], [0.], [0.], [1.]]), adapting the Linear declaration Linear(1,1) in line 6 to the size 2 for input samples by Linear(2,1), and finally increasing the number of epochs to, say, 500.
For the disjunction and negated conjunction, the only additional modification is to
specify the targets as
targets = tensor([[0.], [1.], [1.], [1.]]) # disjunction
targets = tensor([[1.], [1.], [1.], [0.]]) # negated conjunction
For the exclusive disjunction, which as shown above cannot be computed by a single neuron, the program again begins with the usual imports in lines 1 and 2. Then we formulate the specific input and target vectors, where the inputs are as in the last example:
3 inputs = tensor([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
4 targets = tensor([[0.], [1.], [1.], [0.]])
The network architecture is designed according to the two layer threshold formula-
tion (∗∗) above:
5 N = Sequential(
6 Linear(2, 2), Sigmoid(),
7 Linear(2, 1), Sigmoid())
In line 6, the 𝑥 and 𝑦 inputs are processed in two parallel neurons, which return the
intermediate results in two variables, both of which are then passed to the sigmoid
function. In line 7 the outputs are then processed and the weighted sum plus bias
passed through another sigmoid activation function.
We use our standard loss function
8 loss_fn = MSELoss()
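Line 9 declares the optimizer; a plausible form (the optimizer and learning rate are assumptions, with optim imported in line 1):

9 optimizer = optim.SGD(N.parameters(), lr=1.0)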
The loop itself is then as usual. Note however that we choose to run 6000 iterations
to reach an acceptable accuracy:
10 for _ in range(6_000):  # 6_000 epochs
11     optimizer.zero_grad()  # zero-clear gradients
12     preds = N(inputs)  # make predictions
13     loss = loss_fn(preds, targets)  # compute loss
14     loss.backward()  # compute gradients
15     optimizer.step()  # update parameters
In this chapter, we have explored the possibilities of using neural networks in nu-
merical computing. However, we also pointed out that the original incentive to study
such networks was primarily in artificial intelligence applications, such as computer
vision and natural language processing.
Although somewhat off-topic for our concern, in this section we at least present
one typical example for the use of machine learning in artificial intelligence: the
recognition of handwritten text.
More specifically, we show how to train a neural network to classify handwrit-
ten digits from 0 to 9. For this purpose, we consider a classic image recognition
dataset called MNIST (for Modified National Institute of Standards and Technology
Database).
The dataset consists of handwritten images that earlier served as the basis for
benchmarking classification algorithms. Today, the learning capabilities of neural
networks are so advanced that the MNIST set is seen more as a toy example, some-
times referred to as the “Hello World” of machine learning.
Nevertheless, it serves as a good introduction to the central techniques.
The Database
MNIST is a labeled dataset that pairs images of handwritten numerals with the name
of the respective numeral.
It consists of 60,000 training samples and a disjoint set of 10,000 test samples.
Samples consist of an image together with its label, i.e. the true digit represented by
the image.
Fig. 16.14 shows some typical sample images.
We will write a PyTorch neural network program that will be trained to classify the
images in the MNIST dataset based on the digit presented.
We need the package 'torchvision'. It can be installed with the terminal command:
$ conda install -c pytorch torchvision
Data Preparation
[Figure: a sample image from the training set, shown as a 28 × 28 pixel grid; the running example below uses an image showing the digit 3]
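The data preparation and the model declaration in lines 5-9 can be summarized in the following sketch; the hidden-layer widths and the index of the sample image are assumptions:

from torch import optim, utils
from torch.nn import Sequential, Flatten, Linear, Sigmoid, Softmax
from torchvision import datasets, transforms

trainset = datasets.MNIST('./', train=True, download=True,
                          transform=transforms.ToTensor())
image7, label7 = trainset[7]  # a sample image together with its label

model = Sequential(
    Flatten(),  # line 7: flatten the 28 x 28 image to 784 entries
    Linear(784, 128), Sigmoid(),  # line 8: two layers of sigmoid neurons
    Linear(128, 64), Sigmoid(),
    Linear(64, 10), Softmax(dim=1))  # line 9: 10 outputs through softmax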
In line 7, the 28 × 28 input images are flattened into a vector of 784 = 28 × 28 entries,
and then passed through two layers of sigmoid neurons in line 8.
The output layer in line 9 returns a vector with 10 entries, one for each of the
possible digit values 𝑖 = 0, … , 9, which is then passed through the softmax activation
function, given by
𝑆(𝑥₀, … , 𝑥₉) = (𝑝₀, … , 𝑝₉), where 𝑝ᵢ = 𝑒^(𝑥ᵢ) / ∑_{𝑗=0}^{9} 𝑒^(𝑥ⱼ), 𝑖 = 0, … , 9.
The softmax function normalizes its input vector to a distribution consisting of val-
ues 𝑝𝑖 in the interval (0, 1), which also add up to 1, and hence can be interpreted as
probabilities.
The classification model now works like this: from an input image it computes the
output values 𝑝𝑖 , and the index 𝑖 with the maximal value is interpreted as the model’s
predicted result. We define a corresponding function:
10 def predicted(image):
11     output = model(image)
12     _, pred = output.max(dim=1)
13     return pred
The max method in line 12 returns a tuple consisting of the maximum value itself and
its index. Here we just need the index and return it as the function result.
Note that ‘output’ in line 11 is a tensor of shape (1, 10). The dim=1 option indicates
that we are looking for the maximum along the column dimension.
Example Consider the image in our example above. Recall that the model is already
automatically initialized with some random parameter values.
For illustration we apply the untrained model and get:
>>> output7 = model(image7); output7.detach()
Out: tensor([[0.0588, 0.1087, 0.1133, 0.0870, 0.1736,
0.1322, 0.0671, 0.0861, 0.0905, 0.0826]])
>>> predicted(image7) # Out: 4
Recall that the correct label value is 3. Here, the initial probability estimate for it is a
meager output7[0][3].detach() = 0.0870.
Training
The idea is to train the net so that it predicts as many correct values as possible.
First we need a loss criterion. Assume the label of an image is 𝑘. Then we would ideally
aim at an output where 𝑝𝑘 is close to 1, and all other 𝑝𝑖 are close to 0. The loss to be
minimized then becomes 1 − 𝑝𝑘 .
Example In our example the initial loss is:
>>> 1 - output7[0][3].detach()  # Out: tensor(0.913)
Note that for efficiency reasons we do not adjust the parameters for each individual
loss gradient. Rather we bundle a set of image-label pairs together, and use the average
loss over the set to update parameters.
Bundling is done by partitioning the training set into so-called mini-batches. A
small power of 2 is usually considered a good choice for the size of a batch. Here we
choose 64:
14 batches = utils.data.DataLoader(trainset, batch_size=64)
The loop initiated in line 16 runs through all mini-batches once in every iteration.
As mentioned above, the loss-average over the batch is determined in lines 20-23,
and then used to apply the loss-gradient in line 24.
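A plausible reconstruction of that loop (the optimizer settings and the number of passes over the training set are assumptions):

from torch import arange
optimizer = optim.SGD(model.parameters(), lr=0.1)
for _ in range(5):  # several passes over the training set
    for images, labels in batches:  # line 16: run through all mini-batches
        optimizer.zero_grad()
        outputs = model(images)
        # lines 20-23: average loss 1 - p_k over the mini-batch
        loss = (1 - outputs[arange(len(labels)), labels]).mean()
        loss.backward()
        optimizer.step()  # line 24: apply the loss gradient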
On our test machine the running time for the training was about 1.5 minutes.
Example We check how the now trained model behaves with respect to our example:
>>> output7 = model(image7); output7.detach()
tensor([[2.273e‐10, 2.482e‐08, 1.422e‐07, 1.000e+00, 1.721e‐13,
4.491e‐08, 2.657e‐14, 4.818e‐07, 2.023e‐07, 6.305e‐09]])
>>> predicted(image7) # Out: 3
>>> output7[0][3].detach() # Out: tensor(1.0000)
The prediction is correct, and the probability estimate the best we could hope for.
Evaluation
We can now test the accuracy of our model on the test set of 10,000 new images.
We begin by introducing and preparing the set:
26 testset = datasets.MNIST('./', train=False,
27     transform=transforms.ToTensor())
Here the option train=False specifies to load the test set instead of the training set.
We then simply pass through all entries in the test set and record which ones are
predicted correctly by the model:
28 correct = 0
29 for image, label in testset:
30 if predicted(image) == label: correct += 1
31 print("Correct:", correct) # 9715
Remark Recall that in our model we flatten the image representations to one-di-
mensional vectors. This, of course, has the undesirable effect of losing all information
about connected curves and lines used to draw digits.
In fact, the prediction accuracy can be further increased to over 99% by instead
using more sophisticated models based on so-called convolutional neural networks
that take precisely such shape information into account.
For details we refer to the official PyTorch site.