Julia High Performance - Sample Chapter
Avik Sengupta
for many years, mostly using Java interspersed with snippets of the exotic R and K
languages. This experience left him wondering whether there were better things out
there. Avik's quest came to a happy conclusion with the appearance of Julia in 2012.
He has been happily coding in Julia and contributing to it ever since.
Preface
When I first learned about Julia in early 2012, it was clear to me that this is a
language that I've wanted for many years. The use of multiple dispatch made it
very easy to express mathematical concepts, while the speed of the language made
it feasible to express them in Julia. I came for the elegance and stayed for the
performance. On the other hand, some users come to Julia for the performance and
stay for the elegance. Either way, in order to fully appreciate the power and beauty
of the language, it needs to live up to its promise of high performance.
I hope this book will help Julia programmers at all levels to learn the design
techniques and paradigms that produce fast Julia code. One of the nice things
about Julia is that its performance characteristics are simple and easy to reason about.
I hope this book will provide you with a framework to think about and analyze the
performance of your own code.
Chapter 6, Fast Arrays, describes ways to use multidimensional arrays in the fastest
possible way.
Chapter 7, Beyond the Single Processor, provides an introduction to Julia's distributed
computing facilities.
Julia is Fast
In many ways, the history of programming languages has been driven by, and
certainly intertwined with, the needs of numerical and scientific computing. The first
high-level programming language, Fortran, was created with scientific computing
in mind, and continues to be important in the field even to this day. In recent years,
the rise of data science as a specialty has brought additional focus to scientific
computing, particularly for statistical uses. In this area, somewhat counterintuitively,
both specialized languages such as R and general-purpose languages such as Python
are in widespread use. The rise of Hadoop and Spark has spread the use of Java and
Scala, respectively, among this community. In the midst of all this, Matlab has had
a strong niche within engineering communities, while Mathematica remains
unparalleled for symbolic operations.
A new language for scientific computing therefore has a very high barrier to overcome.
It's been only a few short years since the Julia language was introduced into the
world. In this time, its innovative features, which make it a dynamic language based
on multiple dispatch as its defining paradigm, have created a growing niche within the
numerical computing world. However, it's the claim of high performance that excited
its early adopters the most.
This, then, is a book that celebrates writing high-performance programs. With Julia,
this is not only possible, but also reasonably straightforward, within a low-overhead,
dynamic language.
As a reader of this book, you have likely already written your first few Julia
programs. We will assume that you have successfully installed Julia, and have a
working programming environment available. We expect you are familiar with
very basic Julia syntax, but we will discuss and review many of those concepts
throughout the book as we introduce them.
Secondly, when running code routinely written in two languages, there can be severe
and unforeseen performance pitfalls. When you can drop down to C code quickly,
everything is fine. However, if, for whatever reason, your code cannot call into a C
routine, you'll find your program taking hundreds or even thousands of times
longer than you expected.
Julia is the first modern language to make a reasonable effort to solve the "two
language" problem. It is a high-level, dynamic language with powerful features
that make for a very productive programmer. At the same time, code written in Julia
usually runs very fast, almost as fast as code written in statically typed languages.
The rest of this chapter describes some of the underlying design decisions that
make Julia such a fast language. We also see some evidence of the performance
claims for Julia.
The rest of the book shows you how to write your Julia programs in a way that
optimizes their time and memory usage. We will discuss how to
measure and reason about performance in Julia, and how to avoid potential performance
pitfalls.
Throughout this book, we will illustrate each point with small
and simple programs. We hope that this will enable you to grasp the crux of the issue
without getting distracted by unnecessary elements of a larger program. We expect
that this methodology will give you an intuition for
Julia's performance profile.
Julia has a refreshingly simple performance model, and thus writing fast Julia
code is a matter of understanding a few key elements of computer architecture
and how the Julia compiler interacts with them. We hope that, by the end of this book,
your instincts will be well enough developed to design and write your own Julia code with the
fastest possible performance.
Versions of Julia
Julia is a fast-moving project, with an open development process.
All the code and examples in this book are targeted at version 0.4
of the language, which is the current release at the time
of publication. Check Packt's website for changes and errata for
future versions of Julia.
Types
We will have much more to say about types in Julia throughout this book. At this
stage, suffice it to say that Julia's concept of types is a key ingredient in its performance.
The Julia compiler tries to infer the type of all data used in a program, and compiles
different versions of functions specialized to particular types of its arguments. To take
a simple example, consider the sqrt function. This function can be called with integer
or floating-point arguments. Julia will compile two versions of the code, one for integer
arguments, and one for floating point arguments. This means that, at runtime, fast,
straight-line code without any type checks will be executed on the CPU.
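As a minimal sketch of this behavior, the snippet below calls sqrt with an integer and with a floating-point argument, and then does the same with a small user-defined function. The name hypot2 is hypothetical and exists only for this illustration; the snippet has not been checked against version 0.4 specifically.

```julia
# The same generic function, called with two different argument types.
# Julia compiles a separate specialized version for each on first use.
a = sqrt(4)      # Int argument
b = sqrt(4.0)    # Float64 argument

println(a, " ", typeof(a))
println(b, " ", typeof(b))

# A user-defined function is specialized in the same way.
# `hypot2` is a hypothetical name used only for this illustration.
hypot2(x, y) = sqrt(x * x + y * y)

println(hypot2(3, 4))      # called with Int arguments
println(hypot2(3.0, 4.0))  # called with Float64 arguments
```

Nothing in the source of hypot2 mentions a type; the specialization happens per call signature, not per definition.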
The ability of the compiler to reason about types is due to the combination of a
sophisticated dataflow-based algorithm, and careful language design that allows
this information to be inferred from most programs before execution begins. Put
another way, the language is designed to be easy to analyze statically.
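One way to observe the result of this inference, sketched here under the assumption that the reflection helper Base.return_types is available in your Julia version, is to ask the compiler what return type it derives from the argument types alone. The function addone is a hypothetical example.

```julia
# `addone` is a hypothetical example function with no type annotations.
addone(x) = x + 1

# Ask the compiler which return type it infers, given only the argument types.
# No call to addone is executed here; this is purely static analysis.
int_result = Base.return_types(addone, (Int,))
float_result = Base.return_types(addone, (Float64,))

println(int_result)
println(float_result)
```

Each call returns a one-element list holding the inferred type: an integer type for the integer signature, and Float64 for the floating-point one.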
If there is a single reason for Julia being such a high-performance language, this
is it. This is why Julia is able to run at C-like speeds while still being a dynamic
language. Type inference and code specialization are as close to a secret sauce as Julia
gets. It is notable that, outside this type inference mechanism, the Julia compiler is
quite simple. It does not include many of the advanced just-in-time optimizations that
Java and JavaScript compilers are known to use. When the compiler has enough
information about the types within the code, it can generate optimized, straight-line
code without many of these advanced techniques.
It is useful to note here that unlike some other optionally typed dynamic languages,
simply adding type annotations to your code does not usually make Julia go any faster.
Type inference means that the compiler is, in most cases, able to figure out the types
of variables when necessary. Hence you can usually write high-level code without
fighting with the compiler about types, and still achieve superior performance.
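To make this concrete, here is a small sketch with two hypothetical functions, sumsq_plain and sumsq_annotated, that differ only in their type annotations. It compares what the compiler infers for each, using Base.return_types (assumed to be available in your version). It deliberately makes no timing claims.

```julia
# Two hypothetical versions of the same loop: one without annotations,
# one with explicit type annotations on the argument and the accumulator.
function sumsq_plain(xs)
    total = zero(eltype(xs))
    for x in xs
        total += x * x
    end
    return total
end

function sumsq_annotated(xs::Vector{Float64})
    total::Float64 = 0.0
    for x in xs
        total += x * x
    end
    return total
end

data = [1.0, 2.0, 3.0]

# Both versions compute the same value...
println(sumsq_plain(data), " ", sumsq_annotated(data))

# ...and the compiler infers the same return type for both.
plain_type = Base.return_types(sumsq_plain, (Vector{Float64},))
annotated_type = Base.return_types(sumsq_annotated, (Vector{Float64},))
println(plain_type == annotated_type)
```

The annotations in the second version restrict which arguments are accepted, but they tell the compiler nothing it had not already worked out for a Vector{Float64} argument.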
\sum_{n=1}^{1000} \frac{1}{n^2}
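A Julia rendering of this series can be sketched as a loop-for-loop counterpart of the C version that follows. The loop bounds (500 repetitions, terms 1 to 10000) are taken from that C listing, and the function is assumed to correspond to the pisum micro benchmark.

```julia
# Sum of inverse squares, written as a loop-for-loop counterpart of the C version.
# Loop bounds are taken from the C listing; this is a sketch, not the book's listing.
function pisum()
    sum = 0.0
    for j = 1:500          # repeat the same summation 500 times
        sum = 0.0
        for k = 1:10000    # terms 1/k^2 for k = 1..10000
            sum += 1.0 / (k * k)
        end
    end
    return sum
end

println(pisum())
```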
You will notice that this code contains no type annotations. It should look quite
familiar to users of any modern dynamic language. The same algorithm implemented in
C would look something like this:
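double pisum() {
    double sum = 0.0;
    /* The outer loop repeats the same summation 500 times. */
    for (int j=0; j<500; ++j) {
        sum = 0.0;
        /* Inner loop: add 1/k^2 for k = 1..10000. */
        for (int k=1; k<=10000; ++k) {
            sum += 1.0/(k*k);
        }
    }
    return sum;
}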
By timing this code, and its re-implementation in many other languages (all of which
are available at https://github.com/JuliaLang/julia/tree/master/test/perf/micro),
we can note that Julia's performance claims are certainly borne out in
this limited test. Julia can perform at a level similar to C and other statically typed
and compiled languages.
This is of course a micro benchmark, and should therefore not be extrapolated
too much. However, I hope you will agree that it is possible to achieve excellent
performance in Julia. The rest of the book will attempt to show how you can
achieve performance close to this standard in your code.
Summary
In this chapter, you saw that Julia is a language that is built from the ground up
for high performance. Its design and implementation have always been focused on
providing the highest possible performance on the modern CPU.
The rest of the book will show you how to use the power of Julia to the maximum,
to write the fastest possible code in this language. In the next chapter, we will discuss
how to measure the speed of Julia code, and identify performance bottlenecks.
You will learn some of the tools that are built into Julia for this purpose.