Codility Lessons
Iterations
In programming, iterating means repeating some part of your program. This lesson presents
the basic programming constructs that allow iterations to be performed: “for” and “while”
loops.
The for loop repeats loop_body for each value in turn from the range_of_values, with
the current value assigned to some_variable. In its simplest form, the range of values can
be a range of integers, denoted by: range(lowest, highest + 1). For example, the
following loop prints every integer from 0 to 99:
for i in range(0, 100):
    print(i)
Looping over a range of integers starting from 0 is a very common operation. (This is
mainly because arrays and Python lists are indexed by integers starting from 0; see Chapter 2
Arrays for more details.) When specifying the range of integers, if the starting value equals
zero then you can simply skip it. For example, the following loop produces exactly the same
result as the previous one:
for i in range(100):
    print(i)
Example: We are given some positive integer n. Let’s compute the factorial of n and assign
it to the variable factorial. The factorial of n is n! = 1 · 2 · . . . · n. We can obtain it by
starting with 1 and multiplying it by all the integers from 1 to n.
factorial = 1
for i in range(1, n + 1):
    factorial *= i
© Copyright 2020 by Codility Limited. All Rights Reserved. Unauthorized copying or publication prohibited.
Example: Let’s print a triangle made of asterisks (‘*’) separated by spaces. The triangle
should consist of n rows, where n is a given positive integer, and consecutive rows should
contain 1, 2, . . . , n asterisks. For example, for n = 4 the triangle should appear as follows:
*
* *
* * *
* * * *
We need to use two loops, one inside the other: the outer loop prints one row in
each step and the inner loop prints one asterisk in each step.
for i in range(1, n + 1):
    for j in range(i):
        print('*', end=' ')
    print()
The range function can also accept one more argument specifying the step with which the
iterated values progress. More formally, range(start, stop, step) is a sequence of
values beginning with start, whose every consecutive value is increased by step, and that
contains only values smaller than stop (for positive step; or greater than stop for negative
step). For example, range(10, 0, -1) represents the sequence 10, 9, 8, . . . , 1. Note that we
cannot omit start when we specify step.
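A quick check of the step argument (a minimal sketch):

```python
# range(10, 0, -1) counts down from 10 to 1
countdown = list(range(10, 0, -1))
print(countdown)  # [10, 9, 8, 7, 6, 5, 4, 3, 2, 1]
```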
Example: Let’s print a triangle made of asterisks (‘*’) separated by spaces and consisting
of n rows again, but this time upside down, and make it symmetrical. Consecutive rows should
contain 2n − 1, 2n − 3, . . . , 3, 1 asterisks and should be indented by 0, 2, 4, . . . , 2(n − 1)
spaces. For example, for n = 4 the triangle should appear as follows:
* * * * * * *
  * * * * *
    * * *
      *
The triangle should have n rows, where n is some given positive integer.
This time we will use three loops: one outer and two inner loops. The outer loop in
each step prints one row of the triangle. The first inner loop is responsible for printing the
indentations, and the second for printing the asterisks.
for i in range(n, 0, -1):
    for j in range(n - i):
        print(' ', end=' ')
    for j in range(2 * i - 1):
        print('*', end=' ')
    print()
What if the values over which we loop are generated one by one, and are thus not known
in advance? In such a case, we have to use a different kind of loop, called a “while” loop,
which repeats loop_body as long as some_condition holds. Before each step of the loop,
some_condition is computed. As long as its value is true, the body of the loop is executed.
Once it becomes false, we exit the loop without executing loop_body.
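A minimal runnable sketch of the while syntax (the names are illustrative):

```python
i = 1
while i <= 5:   # some_condition, computed before each step
    i *= 2      # loop_body, executed while the condition is true
print(i)  # 8: the first value of i that breaks the condition
```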
Example: Given a positive integer n, how can we count the number of digits in its decimal
representation? One way to do it is convert the integer into a string and count the characters.
Here, though, we will use only arithmetical operations instead. We can simply keep dividing
the number by ten and count how many steps are needed to obtain 0.
result = 0
while n > 0:
    n = n // 10
    result += 1
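Wrapped as a function (the name is assumed), the loop above might read:

```python
def count_digits(n):
    # repeatedly divide a positive integer by ten, counting the steps
    result = 0
    while n > 0:
        n = n // 10
        result += 1
    return result
```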
Example: The Fibonacci numbers form a sequence of integers defined recursively in the
following way. The first two numbers in the Fibonacci sequence are 0 and 1, and each
subsequent number is the sum of the previous two. The first few elements of this sequence
are: 0, 1, 1, 2, 3, 5, 8, 13. Let’s write a program that prints all the Fibonacci numbers not
exceeding a given integer n.
We can keep generating and printing consecutive Fibonacci numbers until we exceed n.
In each step it’s enough to store only two consecutive Fibonacci numbers.
a = 0
b = 1
while a <= n:
    print(a)
    c = a + b
    a = b
    b = c
Example: The following program loops over a set of values:
days = set(['Monday', 'Tuesday', 'Wednesday', 'Thursday',
            'Friday', 'Saturday', 'Sunday'])
for day in days:
    print(day)
The order in which the elements of a set are visited is arbitrary, so the program may print,
for example:
Monday
Tuesday
Friday
Wednesday
Thursday
Sunday
Saturday
Looping over a dictionary means looping over its set of keys. Again, the order in which
the keys are processed is arbitrary.
Chapter 2
Arrays
An array is a data structure that can be used to store many items in one place. Imagine that we
have a list of items; for example, a shopping list. We don’t keep all the products on separate
pages; we simply list them all together on a single page. Such a page is conceptually similar
to an array. Similarly, if we plan to record air temperatures over the next 365 days, we would
not create lots of individual variables, but would instead store all the data in just one array.
For example, a shopping list with three products might be written as (the product names
here are only illustrative):
shopping = ['bread', 'butter', 'milk']
(that is, shopping is the name of the array and every product within it is separated by
a comma). Each item in the array is called an element. Arrays can store any number of
elements (assuming that there is enough memory). Note that a list can also be empty:
shopping = []
If planning to record air temperatures over the next 365 days, we can create in advance
a place to store the data. The array can be created in the following way:
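A minimal sketch (the variable name is assumed): an array of 365 zeros, one slot per day.

```python
temperatures = [0] * 365  # 365 elements, all initially 0
```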
2.3. Modifying array values
We can change array elements as if they were separate variables; that is, each array element
can be assigned a new value independently. For example, let’s say we want to record that on
the day with index 42 the air temperature was 25 degrees. This can be done with a single
assignment:
temperatures[42] = 25
If there was one more product to add to our shopping list, it could be appended as follows:
shopping += ['eggs']
The index of that element will be the next integer after the last (in this case, 3). The length
of an array can be obtained with the len function:
N = len(shopping)
Let’s write a function that counts the number of days with negative air temperature.
Instead of iterating over indexes, we can iterate over the elements of the array. To do this,
we can simply write:
1 for item in array:
2 ...
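A sketch of such a solution (the function name is assumed):

```python
def negative_days(temperatures):
    # count the temperatures that are below zero
    result = 0
    for temperature in temperatures:
        if temperature < 0:
            result += 1
    return result
```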
Such a solution increases the count of negative days whenever a temperature lower than
zero is encountered.
2.5. Basic array operations
There are a few basic operations on arrays that are very useful, such as the length operation:
len([1, 2, 3]) == 3
2.6. Exercise
Problem: Given array A consisting of N integers, return the reversed array.
Solution: We can iterate over the first half of the array and exchange the elements with
those in the second part of the array.
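One possible implementation of this idea (in-place swapping of symmetric elements):

```python
def reverse(A):
    n = len(A)
    # swap A[i] with its mirror element A[n - 1 - i] in the first half
    for i in range(n // 2):
        A[i], A[n - 1 - i] = A[n - 1 - i], A[i]
    return A
```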
Python is a very rich language and provides many built-in functions and methods. It turns
out that there is already a built-in method, reverse, that solves this exercise. Using this
method, array A can be reversed simply by:
A.reverse()
Chapter 3
Time complexity
Use of time complexity makes it easy to estimate the running time of a program. Performing
an accurate calculation of a program’s operation time is a very labour-intensive process
(it depends on the compiler and the type of computer or speed of the processor). Therefore, we
will not make an accurate measurement; just a measurement of a certain order of magnitude.
Complexity can be viewed as the maximum number of primitive operations that a program
may execute. Regular operations are single additions, multiplications, assignments etc. We
may leave some operations uncounted and concentrate on those that are performed the largest
number of times. Such operations are referred to as dominant.
The number of dominant operations depends on the specific input data. We usually want
to know how the performance time depends on a particular aspect of the data. This is most
frequently the data size, but it can also be the size of a square matrix or the value of some
input variable.
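As a concrete illustration (a reconstruction; the original listing is not preserved in this
extract), consider a function whose loop body is the dominant operation:

```python
def count_operations(n):
    result = 0
    for i in range(n):
        result += 1  # the dominant operation: executed n times
    return result
```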
The dominant operation here is the one performed inside the loop, and it will be executed
n times. The complexity is described in Big-O notation: in this case O(n) — linear complexity.
The complexity specifies the order of magnitude within which the program will perform
its operations. More precisely, in the case of O(n), the program may perform c · n opera-
tions, where c is a constant; however, it may not perform n2 operations, since this involves
a different order of magnitude of data. In other words, when calculating the complexity we
omit constants: i.e. regardless of whether the loop is executed 20 · n times or n/5 times, we still
have a complexity of O(n), even though the running time of the program may vary. When
analyzing the complexity we must look for specific, worst-case examples of data that the
program will take a long time to process.
3.1. Comparison of different time complexities
Let’s compare some basic time complexities.
3.2: Constant time — O(1).
def constant(n):
    result = n * n
    return result
3.3: Logarithmic time — O(log n).
How long does a program take to execute if the value of n is halved on each iteration of its
loop? If n = 2^x then log n = x, so such a loop performs O(log n) iterations.
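Such a halving loop might be sketched as follows (an assumed reconstruction, since the
original listing is not preserved here):

```python
def logarithmic(n):
    # halve n until it reaches 1; the number of steps is about log n
    result = 0
    while n > 1:
        n = n // 2
        result += 1
    return result
```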
3.4: Linear time — O(n).
def linear(n, A):
    for i in range(n):
        if A[i] == 0:
            return 0
    return 1
Let’s note that if the first value of array A is 0 then the program will end immediately. But
remember, when analyzing time complexity we should check for worst cases. The program
will take the longest time to execute if array A does not contain any 0.
3.5: Quadratic time — O(n^2).
def quadratic(n):
    result = 0
    for i in range(n):
        for j in range(i, n):
            result += 1
    return result
Exponential and factorial time
It is worth knowing that there are other types of time complexity, such as factorial time O(n!)
and exponential time O(2^n). Algorithms with such complexities can solve problems only for
very small values of n, because they would take too long to execute for large values of n.
The expected time complexity can often be guessed from the limits on the input; for example, if:
• n ≤ 1 000 000, the expected time complexity is O(n) or O(n log n),
Of course, these limits are not precise. They are just approximations, and will vary depending
on the specific task.
3.4. Exercise
Problem: You are given an integer n. Compute the sum 1 + 2 + . . . + n.
Solution: The task can be solved in several ways. Someone who knows nothing about
time complexity may implement an algorithm in which the result is repeatedly incremented
by 1:
3.7: Slow solution — time complexity O(n^2).
def slow_solution(n):
    result = 0
    for i in range(n):
        for j in range(i + 1):
            result += 1
    return result
Another person may increment the result respectively by 1, 2, . . . , n. This algorithm is much
faster:
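A sketch of this O(n) idea (the function name is assumed):

```python
def medium_solution(n):
    # add 1, 2, ..., n directly
    result = 0
    for i in range(1, n + 1):
        result += i
    return result
```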
But the third person’s solution is even quicker. Let us write the sequence 1, 2, . . . , n and
repeat the same sequence underneath it, but in reverse order. Then just add the numbers
from the same columns:
  1     2     3   . . .  n−1    n
  n    n−1   n−2  . . .    2     1
 n+1   n+1   n+1  . . .   n+1   n+1
The result in each column is n + 1, so we can easily count the final result:
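There are n columns, each summing to n + 1, so the doubled total is n · (n + 1), and halving
it gives the answer in O(1) time. A sketch:

```python
def fast_solution(n):
    # n columns, each summing to n + 1; every number was counted twice
    return n * (n + 1) // 2
```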
Chapter 4
Counting elements
A numerical sequence can be stored in an array in various ways. In the standard approach,
the consecutive numbers a0 , a1 , . . . , an−1 are usually put into the corresponding consecutive
indices of the array:
We can also store the data in a slightly different way, by making an array of counters. Each
number may be counted in the array by using an index that corresponds to the value of the
given number.
a:        a0 a1 a2 a3 a4 a5
value:     0  0  4  2  4  5

count[]:   2  0  1  0  2  1
value:     0  1  2  3  4  5
Notice that we do not place elements directly into a cell; rather, we simply count their
occurrences. It is important that the array in which we count elements is sufficiently large.
If we know that all the elements are in the set {0, 1, . . . , m}, then the array used for counting
should be of size m + 1.
4.1: Counting elements — O(n + m).
def counting(A, m):
    n = len(A)
    count = [0] * (m + 1)
    for k in range(n):
        count[A[k]] += 1
    return count
The limitation here may be available memory. Usually, we are not able to create arrays of
10^9 integers, because this would require more than one gigabyte of available memory.
Counting the number of negative integers can be done in two ways. The first method is
to add some big number to each value, so that all values become greater than or equal to
zero. That is, we shift the representation of zero by some arbitrary amount to accommodate
all the negative numbers we need. In the second method, we simply create a second array for
counting negative numbers.
4.1. Exercise
Problem: You are given an integer m (1 ≤ m ≤ 1 000 000) and two non-empty, zero-indexed
arrays A and B of n integers each: a0, a1, . . . , an−1 and b0, b1, . . . , bn−1 respectively (0 ≤ ai, bi ≤ m).
The goal is to check whether there is a swap operation which can be performed on these
arrays in such a way that the sum of elements in array A equals the sum of elements in
array B after the swap. By swap operation we mean picking one element from array A and
one element from array B and exchanging them.
Solution O(n^2): The simplest method is to swap every pair of elements and calculate the
totals. Using that approach gives us O(n^3) time complexity. A better approach is to calculate
the sums of elements at the beginning, and check only how the totals change during the swap
operation.
Solution O(n + m): The best approach is to count the elements of array A and calculate
the difference d between the sums of the elements of array A and B.
For every element of array B, we assume that we will swap it with some element from
array A. The difference d tells us the value from array A that we are interested in swapping,
because only one value will cause the two totals to be equal. The occurrence of this value can
be found in constant time from the array used for counting.
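A sketch of this O(n + m) approach (the function names are illustrative; counting follows
listing 4.1):

```python
def counting(A, m):
    # occurrences of each value 0..m in A
    count = [0] * (m + 1)
    for x in A:
        count[x] += 1
    return count

def swap_exists(A, B, m):
    d = sum(B) - sum(A)
    if d % 2 != 0:
        return False
    d //= 2
    count = counting(A, m)
    # swapping a from A with b from B changes sum(A) by b - a,
    # so we need an element a = b - d to occur in A
    for b in B:
        if 0 <= b - d <= m and count[b - d] > 0:
            return True
    return False
```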
Chapter 5
Prefix sums
There is a simple yet powerful technique that allows for the fast calculation of sums of
elements in given slices (contiguous segments of an array). Its main idea uses prefix sums,
which are defined as the consecutive totals of the first 0, 1, 2, . . . , n elements of an array:

p0 = 0,  p1 = a0,  p2 = a0 + a1,  p3 = a0 + a1 + a2,  . . . ,  pn = a0 + a1 + . . . + an−1
We can easily calculate the prefix sums in O(n) time complexity. Notice that pk equals
pk−1 + ak−1, so each consecutive value can be calculated in constant time.
Similarly, we can calculate suffix sums, which are the totals of the k last values. Using prefix
(or suffix) sums allows us to calculate the total of any slice of the array very quickly. For
example, assume that you are asked about the totals of m slices [x..y] such that 0 ≤ x ≤ y < n,
where the total is the sum ax + ax+1 + . . . + ay−1 + ay.
The simplest approach is to iterate through the whole array for each result separately;
however, that requires O(n · m) time. The better approach is to use prefix sums. If we calculate
the prefix sums then we can answer each question directly in constant time: let’s subtract px
from the value py+1.
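A sketch of the prefix-sum computation and the slice query (the names are illustrative):

```python
def prefix_sums(A):
    n = len(A)
    P = [0] * (n + 1)
    for k in range(1, n + 1):
        P[k] = P[k - 1] + A[k - 1]   # p_k = p_{k-1} + a_{k-1}
    return P

def count_total(P, x, y):
    # total of the slice a_x + ... + a_y, in O(1)
    return P[y + 1] - P[x]
```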
We have calculated the total of ax + ax+1 + . . . + ay−1 + ay in O(1) time. Using this approach,
the total time complexity is O(n + m).
5.1. Exercise
Problem: You are given a non-empty, zero-indexed array A of n (1 ≤ n ≤ 100 000) integers
a0, a1, . . . , an−1 (0 ≤ ai ≤ 1 000). This array represents the number of mushrooms growing on
consecutive spots along a road. You are also given integers k and m (0 ≤ k, m < n).
A mushroom picker is at spot number k on the road and should perform m moves. In
one move she moves to an adjacent spot. She collects all the mushrooms growing on spots
she visits. The goal is to calculate the maximum number of mushrooms that the mushroom
picker can collect in m moves.
For example, consider the following array A:

value: 2 3 7 5 1 3 9
index: 0 1 2 3 4 5 6
The mushroom picker starts at spot k = 4 and should perform m = 6 moves. She might
move to spots 3, 2, 3, 4, 5, 6 and thereby collect 1 + 5 + 7 + 3 + 9 = 25 mushrooms. This is the
maximal number of mushrooms she can collect.
Solution O(m^2): Note that the best strategy is to move in one direction optionally followed
by some moves in the opposite direction. In other words, the mushroom picker should not
change direction more than once. With this observation we can find the simplest solution.
Make the first p = 0, 1, 2, . . . , m moves in one direction, then the next m − p moves in the
opposite direction. This is just a simple simulation of the moves of the mushroom picker,
which requires O(m^2) time.
Solution O(n+m): A better approach is to use prefix sums. If we make p moves in one direc-
tion, we can calculate the maximal opposite location of the mushroom picker. The mushroom
picker collects all mushrooms between these extremes. We can calculate the total number of
collected mushrooms in constant time by using prefix sums.
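One possible implementation of this idea (a sketch; the prefix-sum helpers are repeated here
so the block is self-contained):

```python
def prefix_sums(A):
    n = len(A)
    P = [0] * (n + 1)
    for k in range(1, n + 1):
        P[k] = P[k - 1] + A[k - 1]
    return P

def count_total(P, x, y):
    # total of the slice A[x..y]
    return P[y + 1] - P[x]

def mushrooms(A, k, m):
    n = len(A)
    result = 0
    P = prefix_sums(A)
    # first go left for p moves, then turn around and go right
    for p in range(min(m, k) + 1):
        left_pos = k - p
        right_pos = min(n - 1, max(k, k + m - 2 * p))
        result = max(result, count_total(P, left_pos, right_pos))
    # first go right for p moves, then turn around and go left
    for p in range(min(m + 1, n - k)):
        right_pos = k + p
        left_pos = max(0, min(k, k - (m - 2 * p)))
        result = max(result, count_total(P, left_pos, right_pos))
    return result
```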
Chapter 6
Sorting
Sorting is the process of arranging data in a certain order. Usually, we sort by the value of
the elements. We can sort numbers, words, pairs, etc. For example, we can sort students by
their height, and we can sort cities in alphabetical order or by their numbers of citizens. The
most-used orders are numerical order and alphabetical order. Let’s consider the simplest set,
an array consisting of integers:
5 2 8 14 1 16
0 1 2 3 4 5
We want to sort this array into numerical order to obtain the following array:
1 2 5 8 14 16
0 1 2 3 4 5
There are many sorting algorithms, and they differ considerably in terms of their time com-
plexity and use of memory. Here we describe some of them.
6.2. Counting sort
The idea: First, count the elements in the array of counters (see Chapter 4, Counting
elements). Next, just iterate through the array of counters in increasing order.
Notice that we have to know the range of the sorted values. If all the elements are in the
set {0, 1, . . . , k}, then the array used for counting should be of size k + 1. The limitation here
may be available memory.
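One possible implementation of counting sort (a sketch):

```python
def counting_sort(A, k):
    # A contains only integers from {0, 1, ..., k}
    count = [0] * (k + 1)
    for x in A:
        count[x] += 1
    result = []
    for value in range(k + 1):
        result += [value] * count[value]  # emits n elements in total
    return result
```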
The time complexity here is O(n + k). We need additional memory O(k) to count all the
elements. At first sight, the time complexity of the above implementation may appear greater;
however, the operations that write out the sorted elements are performed not more than O(n)
times in total.
In Python we can also simply use the built-in sorted function or the sort method of a list;
the time complexity of this built-in sorting is O(n log n). Generally, sorting algorithms use
very interesting ideas which can be used in other problems. It is worth knowing how they
work, and it is also worth implementing them yourself at least once. In the future you can
use the built-in sorting functions, because their implementations will be faster and they make
your code shorter and more readable.
6.5. Exercise
Problem: You are given a zero-indexed array A consisting of n > 0 integers; you must return
the number of unique values in array A.
Solution O(n log n): First, sort array A; similar values will then be next to each other.
Finally, just count the number of distinct pairs in adjacent cells.
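A sketch of this solution (the function name is assumed; n > 0 as the problem guarantees):

```python
def distinct(A):
    n = len(A)
    A = sorted(A)
    result = 1                 # the first element is always a new value
    for k in range(1, n):
        if A[k] != A[k - 1]:   # a new value starts here
            result += 1
    return result
```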
The time complexity is O(n log n), in view of the sorting time.
Chapter 7
Stacks and queues
This chapter describes two structures used for the storage of elements. Both structures
provide two operations: push (inserting a new element into the structure) and pop (removing
some element from the structure).
7.1. Stack
The stack is a basic data structure in which the insertion of new elements takes place at
the top and deletion of elements also takes place from the top. The idea of the stack can
be illustrated by plates stacked on top of one another. Each new plate is placed on top of
the stack of plates (operation push), and plates can only be taken off the top of the stack
(operation pop).
[Figure: a stack initially containing 7, 4, 8 (with 7 at the bottom); push(6) places 6 on top,
two pop operations remove 6 and then 8, push(3) places 3 on top, and a final pop removes it.]
The stack can be represented by an array that stores the elements. Apart from the array, we
should also remember the size of the stack, and we must be sure to declare sufficient space
for the array (in the following implementation we can store N elements).
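A minimal array-based sketch (the capacity N is assumed):

```python
N = 100                 # assumed maximum number of elements
stack = [0] * N
size = 0                # current number of elements on the stack

def push(x):
    global size
    stack[size] = x
    size += 1

def pop():
    global size
    size -= 1
    return stack[size]
```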
The push function adds an element to the stack. The pop function removes and returns the
most recently pushed element from the stack. We shouldn’t perform a pop operation on an
empty stack.
7.2. Queue
The queue is a basic data structure in which new elements are inserted at the back but old
elements are removed from the front. The idea of the queue can be illustrated by a line of
customers in a grocery store. New people join the back of the queue and the next person to
be served is the first one in the line.
[Figure: a queue initially containing 4, 8 (with 4 at the front); push(6) adds 6 at the back,
two pop operations remove 4 and then 8, and push(3) adds 3 at the back, leaving 6, 3.]
The queue can be represented by an array that stores the elements. Apart from the array, we
should also remember the front (head) and back (tail) of the queue. We must be sure to
declare sufficient space for the array (in the following implementation we can store N − 1
elements).
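A minimal cyclic-buffer sketch (the capacity N is assumed; at most N − 1 elements can be
stored):

```python
N = 100                 # assumed buffer size; holds at most N - 1 elements
queue = [0] * N
head = 0                # index of the front element
tail = 0                # index of the first free slot

def push(x):
    global tail
    queue[tail] = x
    tail = (tail + 1) % N

def pop():
    global head
    result = queue[head]
    head = (head + 1) % N
    return result

def size():
    return (tail - head + N) % N

def empty():
    return size() == 0
```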
Notice that in the above implementation we used a cyclic buffer (you can read more about it
at http://en.wikipedia.org/wiki/Circular_buffer).
The push function adds an element to the queue. The pop function removes and returns
an element from the front of the queue (we shouldn’t perform a pop operation on an empty
queue). The empty function checks whether the queue is empty, and the size function returns
the number of elements in the queue.
7.3. Exercises
Problem: You are given a zero-indexed array A consisting of n integers: a0 , a1 , . . . , an−1 .
Array A represents a scenario in a grocery store, and contains only 0s and/or 1s:
• 0 represents the action of a new person joining the line in the grocery store,
• 1 represents the action of the person at the front of the queue being served and leaving
the line.
The goal is to count the minimum number of people who should have been in the line before
the above scenario, so that the scenario is possible (it is not possible to serve a person if the
line is empty).
Solution O(n): We should remember the size of the queue and carry out a simulation of
people arriving at and leaving the grocery store. If at some moment the size of the queue
becomes a negative number, its absolute value is a lower limit on the number of people who
must have been in the line beforehand. The answer is therefore the largest such deficit
reached during the whole simulation.
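A sketch of this simulation (the function name is assumed):

```python
def grocery_store(A):
    size = 0     # queue size relative to the unknown initial line
    result = 0   # largest deficit seen so far
    for event in A:
        if event == 0:
            size += 1            # a person joins the line
        else:
            size -= 1            # a person is served
            result = max(result, -size)
    return result
```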
The total time complexity of the above algorithm is O(n). The space complexity is O(1)
because we don’t store people in the array, but only remember the size of the queue.
Chapter 8
Leader
Let us consider a sequence a0, a1, . . . , an−1. The leader of this sequence is the element whose
value occurs more than n/2 times.

value: 6 8 4 6 8 6 6
index: 0 1 2 3 4 5 6
In the picture the leader is highlighted in gray. Notice that the sequence can have at most one
leader: if there were two leaders, their total number of occurrences would be more than
2 · n/2 = n, but we only have n elements.
The leader may be found in many ways. We describe some methods here, starting with
trivial, slow ideas and ending with very creative, fast algorithms. The task is to find the value
of the leader of the sequence a0, a1, . . . , an−1, such that 0 ≤ ai ≤ 10^9. If there is no leader,
the result should be −1.
8.2. Solution with O(n log n) time complexity
If the sequence is presented in non-decreasing order, then identical values are adjacent to
each other.
value: 4 6 6 6 6 8 8
index: 0 1 2 3 4 5 6
Having sorted the sequence, we can easily count slices of the same values and find the leader
in a smarter way. Notice that if the sequence contains a leader, then the leader must occur
at index n/2 (the central element). This is because the leader occurs more than n/2 times,
so there are more occurrences of the leader than can fit entirely on either side of the central
element.
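A sketch of this solution (the function name is assumed):

```python
def leader_sorted(A):
    n = len(A)
    B = sorted(A)
    candidate = B[n // 2]             # a leader, if any, occupies the centre
    if B.count(candidate) > n // 2:   # verify the candidate
        return candidate
    return -1
```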
The time complexity of the above algorithm is O(n log n) due to the sorting time.
8.3. Solution with O(n) time complexity
Notice that if the sequence contains a leader, then after removing a pair of elements of
different values the remaining sequence still has the same leader: the number of occurrences
of the leader decreases by at most one, while the length of the sequence decreases by two.
For example, removing the pair (a0, a1) = (4, 6) from the sorted sequence above leaves the
elements a2, . . . , a6 = 6, 6, 6, 8, 8, whose leader is still 6.
Removing pairs of different elements is not trivial. Let’s create an empty stack onto which
we will be pushing consecutive elements. After each such operation we check whether the two
elements at the top of the stack are different. If they are, we remove them from the stack.
This is equivalent to removing a pair of different elements from the sequence (in the picture
below, different elements being removed are highlighted in gray).
[Figure: the state of the stack after pushing each consecutive element of the sequence
4, 6, 6, 6, 6, 8, 8; whenever the two elements at the top of the stack differ, both are removed.]
In fact, we don’t need to remember all the elements from the stack, because all the values
below the top are always equal. It is sufficient to remember only the values of elements and
the size of the stack.
At the beginning we notice that if the sequence contains a leader, then after the removal
of different elements the leader will not have changed. After removing all pairs of different
elements, we end up with a sequence containing all the same values. This value is not
necessarily the leader; it is only a candidate for the leader. Finally, we should iterate through
all the elements and count the occurrences of the candidate; if the count is greater than n/2
then we have found the leader; otherwise the sequence does not contain a leader.
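One possible implementation of this O(n) algorithm (a sketch):

```python
def leader(A):
    n = len(A)
    size = 0
    value = -1
    for k in range(n):
        if size == 0:
            size += 1
            value = A[k]        # the stack now holds one element
        elif value != A[k]:
            size -= 1           # remove a pair of different elements
        else:
            size += 1
    candidate = value if size > 0 else -1
    # verify the candidate by counting its occurrences
    count = 0
    for k in range(n):
        if A[k] == candidate:
            count += 1
    if count > n // 2:
        return candidate
    return -1
```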
The time complexity of this algorithm is O(n) because every element is considered only
once. The final counting of occurrences of the candidate value also works in O(n) time.
Chapter 9
Maximum slice problem
Let’s define a problem relating to maximum slices. You are given a sequence of n integers
a0, a1, . . . , an−1, and the task is to find the slice with the largest sum. More precisely, we are
looking for two indices p, q such that the total ap + ap+1 + . . . + aq is maximal. We assume
that the slice can be empty, in which case its sum equals 0.
value: 5 -7 3 5 -2 4 -1
index: 0  1 2 3  4 5  6
In the picture, the slice with the largest sum is highlighted in gray. The sum of this slice
equals 10 and there is no slice with a larger sum. Notice that the slice we are looking for may
contain negative integers, as shown above.
9.1. Solution with O(n^3) time complexity
Analyzing all possible slices requires considering O(n^2) pairs of indices, and for each of them
we compute the total in O(n) time, which gives O(n^3) time complexity overall. It is the most
straightforward solution; however, it is far from optimal.
9.2. Solution with O(n^2) time complexity
We can easily improve our last solution. Notice that the prefix sum allows the sum of any
slice to be computed in constant time; with this approach, the time complexity of the
whole algorithm reduces to O(n^2). We assume that pref is an array of prefix sums
(prefi = a0 + a1 + . . . + ai−1).
We can also solve this problem without using prefix sums, within the same time complexity.
Assume that we know the sum of slice (p, q), so s = ap + ap+1 + . . . + aq . The sum of the slice
with one more element (p, q + 1) equals s + aq+1 . Following this observation, there is no need
to compute the sum each time from the beginning; we can use the previously calculated sum.
The problem can also be solved in O(n) time; the fastest algorithm is the one with the
simplest implementation, although it is conceptually more difficult. It uses a very popular
and important technique: based on the solution for a shorter sequence, we can find the
solution for a longer one. For every position, it is enough to know the largest sum of a slice
ending at that position; the answer is the maximum of these values.
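A sketch of this O(n) solution:

```python
def golden_max_slice(A):
    max_ending = 0   # largest sum of a slice ending at the current position
    max_slice = 0    # largest sum seen so far (the empty slice gives 0)
    for a in A:
        max_ending = max(0, max_ending + a)
        max_slice = max(max_slice, max_ending)
    return max_slice
```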
Chapter 10
Prime and composite numbers
People have been analyzing prime numbers since time immemorial, but still we continue to
search for fast new algorithms that can check the primality of numbers. A prime number is
a natural number greater than 1 that has exactly two divisors (1 and itself). A composite
number has more than two divisors.
2 3 4 5 6 7 8 9 10 11 12 13 14
In the above picture the primes are highlighted in white and composite numbers are shown
in gray.
Every divisor d of n that is smaller than √n has a symmetric divisor n/d that is greater
than √n. For example, the divisors of 36 are:
1 2 3 4 6 9 12 18 36
and they form the symmetric pairs (1, 36), (2, 18), (3, 12), (4, 9) around √36 = 6.
Thus, iterating through all the numbers from 1 to √n allows us to find all the divisors. If
number n is of the form k^2, then the symmetric divisor of k is also k. This divisor should be
counted just once.
10.1: Counting the number of divisors — O(√n).
def divisors(n):
    i = 1
    result = 0
    while i * i < n:
        if n % i == 0:
            result += 2
        i += 1
    if i * i == n:
        result += 1
    return result
We assume that 1 is neither a prime nor a composite number, so the above algorithm works
only for n ≥ 2.
10.3. Exercises
Problem: Consider n coins aligned in a row. Each coin is showing heads at the beginning.
[Figure: ten coins numbered 1 to 10, all showing heads.]
Then, n people turn over coins as follows: person i reverses the coins whose numbers are
multiples of i. That is, person i flips coins i, 2 · i, 3 · i, . . . until no more appropriate
coins remain. The goal is to count the number of coins showing tails at the end. In the above
example (n = 10), the coins left showing tails are those at positions 1, 4 and 9 (the perfect
squares):
[Figure: ten coins numbered 1 to 10; coins 1, 4 and 9 show tails.]
Solution O(n log n): We can simulate the results of each person reversing coins.
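A sketch of the simulation (each person i flips every i-th coin; the total number of flips is
n/1 + n/2 + . . . + n/n = O(n log n)):

```python
def coins(n):
    result = 0
    coin = [0] * (n + 1)   # 0 = heads, 1 = tails
    for i in range(1, n + 1):
        k = i
        while k <= n:
            coin[k] = (coin[k] + 1) % 2   # person i flips coin k
            k += i
        result += coin[i]  # coin i will not be flipped by later people
    return result
```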
Chapter 11
Sieve of Eratosthenes
The Sieve of Eratosthenes is a very simple and popular technique for finding all the prime
numbers in the range from 2 to a given number n. The algorithm takes its name from the
process of sieving—in a simple way we remove multiples of consecutive numbers.
Initially, we have the set of all the numbers {2, 3, . . . , n}. At each step we choose the
smallest number in the set and remove all its multiples. Notice that every composite number
has a divisor of at most √n; in particular, it has such a divisor which is a prime number. It
is therefore sufficient to remove only multiples of prime numbers not exceeding √n. In this
way, all composite numbers will be removed.
[Figure: the numbers 2 to 17 shown after each of the first steps of sieving.]
The above illustration shows steps of sieving for n = 17. The elements of the processed set
are in white, and removed composite numbers are in gray. First, we remove multiples of the
smallest element in the set, which is 2. The next element remaining in the set is 3, and we
also remove its multiples, and so on.
[Figure: the numbers 2 to 17 after sieving is complete; only the primes 2, 3, 5, 7, 11, 13
and 17 remain.]
The above algorithm can be slightly improved. Notice that we needn’t cross out multiples
of i which are less than i^2: such multiples are of the form k · i, where k < i, and these have
already been removed by one of the prime divisors of k. After this improvement, we obtain
the following implementation:
def sieve(n):
    sieve = [True] * (n + 1)
    sieve[0] = sieve[1] = False
    i = 2
    while i * i <= n:
        if sieve[i]:
            k = i * i
            while k <= n:
                sieve[k] = False
                k += i
        i += 1
    return sieve
Let’s analyse the time complexity of the above algorithm. For each prime number pj ≤ √n
we cross out at most n/pj numbers, so we get the following number of operations:

n/2 + n/3 + n/5 + . . . = Σ (n/pj) = n · Σ (1/pj),  summing over all primes pj ≤ √n.   (11.1)
The sum of the reciprocals of the primes pj ≤ √n is asymptotically O(log log n). So the
overall time complexity of this algorithm is O(n log log n). The proof is not trivial, and is
beyond the scope of this article.
11.1. Factorization
Factorization is the process of decomposition into prime factors. More precisely, for a given
number x we want to find primes p1 , p2 , . . . , pk whose product equals x.
Use of the sieve enables fast factorization. Let’s modify the sieve algorithm slightly: for
every crossed-out number we will remember the smallest prime that divides it.
    x:                      2  3  4  5  6  7  8  9  10 11 12 13 14 15 16 17 18 19 20
    smallest prime factor:  0  0  2  0  2  0  2  3  2  0  2  0  2  3  2  0  2  0  2
With this approach we can factorize numbers very quickly. If we know that one of the prime
factors of x is p, then all the prime factors of x are p plus the prime factors of x/p.
A number x cannot have more than log₂ x prime factors, because every prime factor is ≥ 2.
Factorization by the above method therefore works in O(log x) time. Note that consecutive
factors are produced in non-decreasing order.
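The modified sieve and the factorization it enables can be sketched as follows (Python 3 syntax; the array name F and the function names are illustrative, not taken from the original listing):

```python
def smallest_factors(n):
    # F[x] holds the smallest prime factor of composite x, or 0 if x is prime.
    F = [0] * (n + 1)
    i = 2
    while i * i <= n:
        if F[i] == 0:  # i is prime
            k = i * i
            while k <= n:
                if F[k] == 0:
                    F[k] = i
                k += i
        i += 1
    return F

def factorization(x, F):
    # Divide out the smallest prime factor until the remaining x is prime.
    result = []
    while F[x] > 0:
        result.append(F[x])
        x //= F[x]
    result.append(x)
    return result
```

For example, factorization(20, smallest_factors(20)) yields the factors 2, 2, 5 in non-decreasing order.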
Chapter 12
Euclidean algorithm
The Euclidean algorithm is one of the oldest numerical algorithms still in common use.
It solves the problem of computing the greatest common divisor (gcd) of two positive
integers.
Let’s estimate this algorithm’s time complexity (in terms of n = a + b). The number of steps
can be linear, e.g. for gcd(x, 1), so the time complexity is O(n). This is also the worst-case
complexity, because the value a + b decreases with every step.
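The subtraction-based variant analysed above can be sketched as follows (Python 3 syntax; a sketch, since the original listing is not reproduced here):

```python
def gcd_by_subtraction(a, b):
    # Replace the larger value by the difference until both are equal.
    # The worst case, e.g. gcd(x, 1), performs a linear number of steps.
    while a != b:
        if a > b:
            a -= b
        else:
            b -= a
    return a
```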
• if b | a, then gcd(a, b) = b;
• otherwise, gcd(a, b) = gcd(b, a mod b).
Let’s prove that gcd(a, b) = gcd(b, r), where r = a mod b and a = b · t + r:
• Firstly, let d = gcd(a, b). Then d | (b · t + r) and d | b, so d | r.
Therefore gcd(a, b) | gcd(b, r).
• Conversely, let d = gcd(b, r). Then d | b and d | r, so d | (b · t + r) = a.
Therefore gcd(b, r) | gcd(a, b), and the two values are equal.
Denote by (a_i, b_i) the pairs of values a and b for which the above algorithm performs i steps.
Then b_i ≥ Fib_{i−1} (where Fib_i is the i-th Fibonacci number). Inductive proof:
3. for more steps, (a_{k+1}, b_{k+1}) → (a_k, b_k) → (a_{k−1}, b_{k−1}); then a_k = b_{k+1}, a_{k−1} = b_k,
b_{k−1} = a_k mod b_k, so a_k = q · b_k + b_{k−1} for some q ≥ 1, and hence b_{k+1} ≥ b_k + b_{k−1}.
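The division-based algorithm whose number of steps is bounded by this argument can be written compactly as (Python 3 syntax; a sketch using the identity gcd(a, b) = gcd(b, a mod b)):

```python
def gcd(a, b):
    # Each step replaces (a, b) by (b, a mod b); stops when the remainder is 0.
    while b:
        a, b = b, a % b
    return a
```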
This algorithm is superior to the previous one for very large integers, when it cannot be
assumed that the arithmetic operations used here take constant time. With the binary
representation, each operation takes time linear in the length of that representation, even
for very big integers. On the other hand, the modulo operation applied in algorithm 10.2
has worse time complexity: it exceeds O(log n · log log n), where n = a + b.
Denote by (a_i, b_i) the pairs of values a and b for which the above algorithm performs i steps.
We have a_{i+1} ≥ a_i and b_{i+1} ≥ b_i, with a_1, b_1 > 0. In the first three cases, a_{i+1} · b_{i+1} ≥ 2 · a_i · b_i.
In the fourth case, a_{i+1} · b_{i+1} ≥ 2 · a_{i−1} · b_{i−1}, because the difference of two odd numbers
is an even number. By induction we get:

    a_i · b_i ≥ 2^⌊(i−1)/2⌋        (12.2)

Thus, the time complexity is O(log(a · b)) = O(log a + log b) = O(log n). For very large
integers it is O((log n)²), since each arithmetic operation can be done in O(log n) time.
Knowing how to compute gcd(a, b) in O(log(a + b)) time, we can also compute lcm(a, b)
in the same time complexity, because:

    lcm(a, b) = (a · b) / gcd(a, b)
12.5. Exercise
Problem: Michael, Mark and Matthew collect coins of face values a, b and c respectively
(each boy has only one kind of coin). The boys want to find the minimum positive amount
of money that every one of them can pay exactly using only his own coins.
Solution: It is easy to note that we want to find the least common multiple of the three
integers, i.e. lcm(a, b, c). The problem can be generalized to the lcm of exactly n integers,
using the following relation:

    lcm(a_1, a_2, . . . , a_n) = lcm(a_1, lcm(a_2, . . . , a_n))

We simply apply the two-argument lcm n − 1 times, and each step works in logarithmic time.
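This reduction can be sketched as follows (Python 3 syntax; function names are illustrative):

```python
def gcd(a, b):
    # Euclidean algorithm.
    while b:
        a, b = b, a % b
    return a

def lcm(a, b):
    # lcm(a, b) = (a * b) / gcd(a, b)
    return a * b // gcd(a, b)

def lcm_of_list(values):
    # lcm(a1, ..., an) = lcm(a1, lcm(a2, ..., an)), folded left to right.
    result = 1
    for v in values:
        result = lcm(result, v)
    return result
```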
Chapter 13
Fibonacci numbers
The Fibonacci numbers form a sequence of integers defined recursively in the following way.
The first two numbers in the Fibonacci sequence are 0 and 1, and each subsequent number
is the sum of the previous two.
    F_n = 0                  for n = 0,
    F_n = 1                  for n = 1,
    F_n = F_{n−1} + F_{n−2}  for n > 1.

The first Fibonacci numbers together with their indices:

    n:   0  1  2  3  4  5  6  7   8   9   10  11
    F_n: 0  1  1  2  3  5  8  13  21  34  55  89
Notice that recursive enumeration as described by the definition is very slow. The definition
of Fn repeatedly refers to the previous numbers from the Fibonacci sequence.
The above algorithm performs Fn additions of 1, and, as the sequence grows exponentially,
we get an inefficient solution.
The Fibonacci numbers can be enumerated faster using simple dynamic programming: we
calculate the values F_0, F_1, . . . , F_n from the previously calculated numbers (it is sufficient
to remember only the last two values).
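A minimal sketch of this dynamic-programming enumeration (Python 3 syntax):

```python
def fibonacci_dynamic(n):
    # Keep only the last two values of the sequence; O(n) additions.
    previous, current = 0, 1
    for _ in range(n):
        previous, current = current, previous + current
    return previous
```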
13.1. Faster algorithms for Fibonacci numbers
Fibonacci numbers can be found in O(log n) time. However, for this purpose we have to use
matrix multiplication and the following formula:
    [ 1  1 ]^n   [ F_{n+1}   F_n     ]
    [ 1  0 ]   = [ F_n       F_{n−1} ]    for n ≥ 1.
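The logarithmic-time method can be sketched by raising this matrix to the n-th power with repeated squaring (Python 3 syntax; helper names are illustrative):

```python
def mat_mul(A, B):
    # Multiply two 2x2 matrices given as nested tuples.
    return ((A[0][0] * B[0][0] + A[0][1] * B[1][0],
             A[0][0] * B[0][1] + A[0][1] * B[1][1]),
            (A[1][0] * B[0][0] + A[1][1] * B[1][0],
             A[1][0] * B[0][1] + A[1][1] * B[1][1]))

def fibonacci_fast(n):
    # Compute [[1, 1], [1, 0]]^n by repeated squaring; F(n) is the [0][1] entry.
    if n == 0:
        return 0
    result = ((1, 0), (0, 1))  # identity matrix
    base = ((1, 1), (1, 0))
    while n > 0:
        if n % 2 == 1:
            result = mat_mul(result, base)
        base = mat_mul(base, base)
        n //= 2
    return result[0][1]
```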
13.2. Exercise
Problem: For all the given numbers x_0, x_1, . . . , x_{n−1}, such that 1 ≤ x_i ≤ m ≤ 1 000 000,
check whether they may be presented as the sum of two Fibonacci numbers.
Solution: Notice that only a few tens of Fibonacci numbers are smaller than the maximal
m (exactly 31). We consider all pairs of them. If some pair sums to k ≤ m, then we mark
index k in an array to denote that the value k can be presented as the sum of two Fibonacci
numbers.
In summary, for each number xi we can answer whether it is the sum of two Fibonacci
numbers in constant time. The total time complexity is O(n + m).
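The whole solution can be sketched as follows (Python 3 syntax; names are illustrative):

```python
def sum_of_two_fibs(X, m):
    # Generate all Fibonacci numbers not exceeding m (only a few tens of them).
    fibs = [0, 1]
    while fibs[-1] + fibs[-2] <= m:
        fibs.append(fibs[-1] + fibs[-2])
    # Mark every k <= m that is the sum of two Fibonacci numbers.
    is_sum = [False] * (m + 1)
    for i in range(len(fibs)):
        for j in range(i, len(fibs)):
            if fibs[i] + fibs[j] <= m:
                is_sum[fibs[i] + fibs[j]] = True
    # Each query is then answered in constant time.
    return [is_sum[x] for x in X]
```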
Chapter 14
Binary search
The binary search is a simple and very useful algorithm whereby many linear algorithms
can be optimized to run in logarithmic time.
14.1. Intuition
Imagine the following game. The computer selects an integer between 1 and 16, and our
goal is to guess this number with the minimum number of questions. For each guess, the
computer states whether our number is equal to, greater than or less than the hidden number.
Checking all the successive values 1, 2, . . . , 16 one by one is linear, because each question
reduces the set of candidates by only one element.
The goal is to ask questions that reduce the set of candidates maximally. The best
option is to ask about the middle element, as doing so halves the set of candidates each
time. With this approach, we ask at most a logarithmic number of questions.
14.2. Implementation
In a binary search we use the information that all the elements are sorted. Let’s try to solve
the task in which we ask for the position of a value x in a sorted array a_0 ≤ a_1 ≤ . . . ≤ a_{n−1}.
Let’s see how the number of candidates is reduced, for example for the value x = 31.

    value: 12 15 15 19 24 31 53 59 60
    index:  0  1  2  3  4  5  6  7  8

[Illustration: the same array shown three times; at each step the slice of remaining
candidates is halved until the position of 31 (index 5) is found.]
For every step of the algorithm we should remember the beginning and the end of the
remaining slice of the array (respectively, variables beg and end). The middle element of the
slice can easily be calculated as mid = ⌊(beg + end) / 2⌋.
14.1: Binary search in O(log n).

def binarySearch(A, x):
    n = len(A)
    beg = 0
    end = n - 1
    result = -1
    while beg <= end:
        mid = (beg + end) // 2
        if A[mid] <= x:
            beg = mid + 1
            result = mid
        else:
            end = mid - 1
    return result
The above algorithm finds the position of the largest element which is less than or equal
to x. In subsequent iterations the number of candidates is halved, so the time complexity
is O(log n). It is noteworthy that the above implementation is universal: it is enough to
modify only the condition inside the while loop.
14.4. Exercise
Problem: You are given n binary values x_0, x_1, . . . , x_{n−1}, such that x_i ∈ {0, 1}. This array
represents holes in a roof (1 is a hole). You are also given k boards of the same size. The goal
is to choose the optimal (minimal) size of the boards that allows all the holes to be covered
by boards.
Solution: The size of the boards can be found with a binary search. If size x is sufficient to
cover all the holes, then we know that sizes x + 1, x + 2, . . . , n are also sufficient. On the other
hand, if we know that x is not sufficient to cover all the holes, then sizes x − 1, x − 2, . . . , 1
are also insufficient.
    else:
        beg = mid + 1
    return result
There is the question of how to check whether size x is sufficient. We can go through all the
indices from the left to the right and greedily count the boards. We add a new board only if
there is a hole that is not covered by the last board.
The total time complexity of such a solution is O(n log n) due to the binary search time.
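The whole solution can be sketched as follows (Python 3 syntax; the greedy helper boards_needed and the function names are assumptions, not the original listing):

```python
def boards_needed(A, size):
    # Greedily place a new board at the first hole not covered by the last one.
    boards = 0
    last_covered = -1
    for i in range(len(A)):
        if A[i] == 1 and i > last_covered:
            boards += 1
            last_covered = i + size - 1
    return boards

def min_board_size(A, k):
    # Binary search over the board size; feasibility is monotonic in the size.
    beg, end = 1, len(A)
    result = -1
    while beg <= end:
        mid = (beg + end) // 2
        if boards_needed(A, mid) <= k:
            result = mid
            end = mid - 1
        else:
            beg = mid + 1
    return result
```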
Chapter 15
Caterpillar method
The Caterpillar method is a likeable name for a popular means of solving algorithmic tasks.
The idea is to check elements in a way that’s reminiscent of movements of a caterpillar.
The caterpillar crawls through the array. We remember the front and back positions of the
caterpillar, and at every step either of them is moved forward.
    index: a_0 a_1 a_2 a_3 a_4 a_5 a_6
    value:  6   2   7   4   1   3   6
Each position of the caterpillar will represent a different contiguous subsequence in which
the total of the elements is not greater than s. Let’s initially set the caterpillar on the first
element. Next we will perform the following steps:
• if we can, we move the right end (front) forward and increase the size of the caterpillar;
• otherwise, we move the left end (back) forward and decrease the size of the caterpillar.
In this way, for every position of the left end we know the longest caterpillar that covers
elements whose total is not greater than s. If there is a subsequence whose total of elements
equals s, then there certainly is a moment when the caterpillar covers all its elements.
    return False
Let’s estimate the time complexity of the above algorithm. At first glance we have two
nested loops, which suggests quadratic time. However, notice that at every step we move
either the front or the back of the caterpillar forward, and their positions never exceed n.
Thus we actually get an O(n) solution.
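The procedure described above can be sketched as follows (Python 3 syntax; the function name is illustrative, and the guard for a single element exceeding s is our addition):

```python
def caterpillar_method(A, s):
    # Check whether some contiguous subsequence of A (positive numbers)
    # sums exactly to s.
    n = len(A)
    front, total = 0, 0
    for back in range(n):
        # Move the front forward while the total stays within s.
        while front < n and total + A[front] <= s:
            total += A[front]
            front += 1
        if total == s:
            return True
        if front == back:
            front += 1  # single element larger than s; skip it
        else:
            total -= A[back]  # move the back forward
    return False
```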
The above estimation of time complexity is based on amortized cost, which will be ex-
plained more precisely in future lessons.
15.2. Exercise
Problem: You are given n sticks (of lengths 1 ≤ a_0 ≤ a_1 ≤ . . . ≤ a_{n−1} ≤ 10^9). The goal is
to count the number of triangles that can be constructed using these sticks. More precisely,
we have to count the number of triplets of indices x < y < z such that a_x + a_y > a_z.
Solution O(n²): For every pair x, y we can find the largest stick z that can be used to
construct a triangle. Every stick k such that y < k ≤ z can also be used, because the
condition a_x + a_y > a_k still holds. We can add all these triangles at once.
If the value z is found from scratch every time, we get an O(n³) time complexity solution.
However, we can instead use the caterpillar method: when increasing the value of y, we can
increase (as far as possible) the value of z.
The time complexity of the above algorithm is O(n²), because for every stick x the values
of y and z increase at most O(n) times.
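A sketch of this caterpillar solution (Python 3 syntax; assumes the lengths are already sorted non-decreasingly):

```python
def count_triangles(A):
    # A is sorted non-decreasingly; count triplets x < y < z with A[x] + A[y] > A[z].
    n = len(A)
    result = 0
    for x in range(n):
        z = 0
        for y in range(x + 1, n):
            # Advance z as far as the triangle condition holds.
            while z < n and A[x] + A[y] > A[z]:
                z += 1
            # Sticks at indices y+1, ..., z-1 all work as the third side.
            result += z - y - 1
    return result
```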
Chapter 16
Greedy algorithms
We consider problems in which a result comprises a sequence of steps or choices that have
to be made to achieve the optimal solution. Greedy programming is a method by which
a solution is determined based on making the locally optimal choice at any given moment.
In other words, we choose the best decision from the viewpoint of the current stage of the
solution.
Depending on the problem, the greedy method may or may not be the best approach. If it
is not, it often returns a result which is approximately correct but suboptimal; in such cases
dynamic programming or brute-force can give the optimal answer. On the other hand, when
the greedy method works correctly, its running time is usually faster than that of dynamic
programming or brute-force.
The function returns a list of pairs (denomination, number of coins). The time complexity
of the above algorithm is O(n), as the number of coins is computed once for each denomination.
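The greedy change-making routine can be sketched as follows (Python 3 syntax; a sketch consistent with the description above, assuming a coin system for which the greedy choice is correct and the amount can be paid exactly, e.g. denomination 1 is present):

```python
def greedy_coin_changing(M, k):
    # Process denominations from the largest down, taking as many coins as fit.
    result = []
    for m in sorted(M, reverse=True):
        result.append((m, k // m))
        k %= m
    return result
```

For example, greedy_coin_changing([1, 2, 5], 11) returns [(5, 2), (2, 0), (1, 1)].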
16.3. Exercise
Problem: There are n > 0 canoeists weighing respectively 1 ≤ w_0 ≤ w_1 ≤ . . . ≤ w_{n−1} ≤ 10^9.
The goal is to seat them in the minimum number of double canoes whose displacement (the
maximum load) equals k. You may assume that w_i ≤ k.
Solution A O(n): The task can be solved by using a greedy algorithm. The heaviest canoeist
is called heavy. Other canoeists who can be seated with heavy in the canoe are called light.
All the other remaining canoeists are also called heavy.
The idea is that, for the heaviest heavy, we should find the heaviest light who can be
seated with him/her. So we seat together the heaviest heavy and the heaviest light. Note
that the lighter the heaviest heavy is, the heavier the lights can be. Thus the division
between heavy and light changes over time: as the heaviest remaining heavy gets lighter,
some heavies join the pool of lights.
Proof of correctness: There exists an optimal solution in which the heaviest heavy h and
the heaviest light l are seated together. If there were a better solution in which h sat alone,
then l could be seated with him/her anyway. If h were seated with some light x ≤ l, then
x and l could just be swapped. And if l had a companion y, then x and y would fit together,
as y ≤ h.
The seating for the first canoe is optimal, so the problem reduces to seating the remaining
canoeists in the minimum number of canoes.
The total time complexity of this solution is O(n). The outer while loop performs O(n) steps,
since in each step one or two canoeists are seated in a canoe. The inner while loop in each
step changes a heavy into a light. As there are O(n) heavies at the beginning, and each step
of the outer while loop turns at most one light back into a heavy, the overall number of
steps of the inner while loop is O(n).
Solution B O(n): The heaviest canoeist is seated with the lightest, as long as their total
weight does not exceed k. Otherwise, the heaviest canoeist is seated alone in the canoe.
The time complexity is O(n), because with each step of the loop at least one canoeist is
seated.
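Solution B can be sketched with two indices moving toward each other (Python 3 syntax; assumes W is sorted non-decreasingly and the function name is illustrative):

```python
def canoes(W, k):
    # j points at the lightest unseated canoeist, i at the heaviest.
    j = 0
    count = 0
    for i in range(len(W) - 1, -1, -1):
        if i < j:
            break
        if i > j and W[i] + W[j] <= k:
            j += 1  # the lightest joins the heaviest in one canoe
        count += 1  # one canoe is used in either case
    return count
```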
Proof of correctness: Analogous to solution A. If the heaviest canoeist is seated alone, it is
not possible to seat anybody with him/her. If there exists a solution in which the heaviest
canoeist h is seated with some other canoeist x, we can swap x with the lightest canoeist l:
l can sit in place of x, since l ≤ x; and x can sit in place of l, because if l has any companion
y, we have y ≤ h, so x + y ≤ x + h ≤ k.
Chapter 17
Dynamic programming
• no coins are needed to pay a zero amount: dp[i, 0] = 0 (for all i);
• if there are no denominations and the amount is positive, there is no solution, so for
convenience the result can be infinite in this case: dp[0, j] = ∞ (for all j > 0);
• if the amount to be paid is smaller than the highest denomination c_i, this denomination
can be discarded: dp[i, j] = dp[i − 1, j] (for all i > 0 and all j such that c_i > j);
• otherwise, we should consider two options and choose the one requiring fewer coins:
either we use a coin of the highest denomination, and a smaller amount to be paid
remains, or we don’t use coins of the highest denomination (and the denomination can
thus be discarded): dp[i, j] = min(dp[i, j − c_i] + 1, dp[i − 1, j]) (for all i > 0 and all j
such that c_i ≤ j).
The following table shows all the solutions to sub-problems considered for the example data.
dp[i, j] 0 1 2 3 4 5 6
∅ 0 ∞ ∞ ∞ ∞ ∞ ∞
{1} 0 1 2 3 4 5 6
{1, 3} 0 1 2 1 2 3 2
{1, 3, 4} 0 1 2 1 1 2 2
Implementation
Consider n denominations, 0 < c_0 ≤ c_1 ≤ . . . ≤ c_{n−1}. The algorithm processes the respective
denominations and calculates the minimum number of coins needed to pay every amount
from 0 to k. When considering each successive denomination, we use the previously calculated
results for the smaller amounts.
17.1: The dynamic algorithm for finding change.

def dynamic_coin_changing(C, k):
    n = len(C)
    # create a two-dimensional array filled with zeros
    dp = [[0] * (k + 1) for i in xrange(n + 1)]
    dp[0] = [0] + [MAX_INT] * k
    for i in xrange(1, n + 1):
        # min() guards against a denomination larger than the amount k
        for j in xrange(min(C[i - 1], k + 1)):
            dp[i][j] = dp[i - 1][j]
        for j in xrange(C[i - 1], k + 1):
            dp[i][j] = min(dp[i][j - C[i - 1]] + 1, dp[i - 1][j])
    return dp[n]
Both the time complexity and the space complexity of the above algorithm is O(n · k). In the
above implementation, memory usage can be optimized. Notice that, during the calculation
of dp, we only use the previous row, so we don’t need to remember all of the rows.
17.2: The dynamic algorithm for finding change with optimized memory.

def dynamic_coin_changing(C, k):
    n = len(C)
    dp = [0] + [MAX_INT] * k
    for i in xrange(1, n + 1):
        for j in xrange(C[i - 1], k + 1):
            dp[j] = min(dp[j - C[i - 1]] + 1, dp[j])
    return dp
17.2. Exercise
Problem: A small frog wants to get from position 0 to k (1 ≤ k ≤ 10 000). The frog can
jump over any one of n fixed distances s_0, s_1, . . . , s_{n−1} (1 ≤ s_i ≤ k). The goal is to count the
number of different ways in which the frog can jump to position k. To avoid overflow, it is
sufficient to return the result modulo q, where q is a given number.
We assume that two patterns of jumps are different if, in one pattern, the frog visits
a position which is not visited in the other pattern.
Solution O(n · k): The task can be solved by using dynamic programming. Let’s create an
array dp consisting of k + 1 elements, such that dp[j] is the number of ways in which the
frog can jump to position j.
We update consecutive cells of array dp. There is exactly one way for the frog to jump to
position 0, so dp[0] = 1. Next, consider some position j > 0.
The number of ways in which the frog can jump to position j with a final jump of s i is
dp[j − si ]. Thus, the number of ways in which the frog can get to position j is increased by
the number of ways of getting to position j − si , for every jump si .
More precisely, dp[j] is increased by the value of dp[j − s_i] (for all s_i ≤ j), modulo q.
The time complexity is O(n · k) (all cells of array dp are visited for every jump) and the
space complexity is O(k).
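The whole solution can be sketched as follows (Python 3 syntax; the function name is illustrative):

```python
def count_jump_patterns(S, k, q):
    # dp[j] = number of ways to reach position j, modulo q.
    dp = [0] * (k + 1)
    dp[0] = 1  # exactly one way to stand at position 0
    for j in range(1, k + 1):
        for s in S:
            if s <= j:
                dp[j] = (dp[j] + dp[j - s]) % q
    return dp[k]
```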