TRUST NO ONE Homomorphic Encryption and Its Applications
Department of Informatics
February, 2023
Abstract
Acknowledgements
I would like to thank Chunlei Li, my thesis advisor. Completing my master's thesis without
him would have been much harder. Thanks also to Anya Helene Bagge, my supervisor for the
bachelor thesis, for valuable help. That work laid the foundations for my master's
thesis. I would also like to thank my mom, dad, sister, and brother for encouraging me
through all my ups and downs while working on my thesis.
Knut Storvestre
Wednesday 1st February, 2023
Motivation
1 Introduction
2 Types of encryption
2.1 Symmetric encryption (SE)
2.1.1 Stream Ciphers
2.1.2 Block Ciphers
2.2 Asymmetric encryption
2.2.1 RSA encryption
2.2.2 Diffie-Hellman key exchange
2.3 Homomorphic encryption (HE)
2.4 Types of homomorphic encryption
2.4.1 Partially homomorphic encryption
2.4.2 Leveled fully homomorphic encryption
2.4.3 Fully homomorphic encryption (FHE)
2.4.4 Areas of Applications
3 Preliminaries
3.1 Basic notation
3.2 Problems complexity
3.3 Lattice Theory
3.4 Shortest Vector Problem (SVP)
3.5 Closest Vector Problem (CVP)
3.5.1 Solving CVP with a good basis
3.5.2 Solving CVP with bad basis
3.5.3 Crypto system based on CVP
3.6 Learning With Errors (LWE)
3.6.1 Search LWE
3.6.2 Decisional LWE
3.7 Ring Learning With Errors (RLWE)
3.7.1 RLWE crypto system
4 CKKS Scheme
4.1 Message
4.2 Encoding
4.2.1 Embedding
4.2.2 Inverse Natural Projection
4.2.3 Scaling
4.2.4 Projection to the lattice
4.2.5 Decoding
4.2.6 Encoding and decoding example
4.3 Leveled Homomorphic Encryption
4.3.1 Key Generation
4.3.2 Encryption
4.3.3 Decryption
4.3.4 Evaluation
Bibliography
Chapter 1
Introduction
Fully Homomorphic Encryption (FHE) has become increasingly popular in recent years
due to the growing reliance on cloud computing. In cloud computing, a professional provider
offers computer system resources, in particular data storage and computing power.
Examples of providers include Amazon, Microsoft and Google. Their distributed data
centers allow customers to share facilities, which reduces the customers' capital expenses. This
explains cloud computing's popularity. The popularity of third-party solutions brings an
increasing concern about privacy. FHE is important because it allows you to use cloud
computing without compromising your privacy.
Cryptography is a dynamic area of research in which new developments appear all the
time. One of the more recent developments in FHE is a public key crypto system, the CKKS
scheme. It is named after the scientists Cheon, Kim, Kim, and Song, who
introduced it in 2016 [5]. A scheme is defined as a large-scale systematic
plan, or arrangement, for attaining a particular object or putting a particular idea into
effect.
language but is significantly slower than C++. Later, when I discovered the
advantages of Java, it became the preferred alternative, and the scheme was
re-programmed in Java. Java has several benefits. Firstly, it is easy to understand.
Secondly, it is extensively used on the backend. The backend is where the processes and
operations take place. This is “behind the scenes”, where the cryptographic data
is processed. Lastly, Java is about three times faster than Python, a commonly used
language. In sum, using Java simplifies the code and speeds up the processing time.
However, despite these benefits, the CKKS scheme has not yet been implemented in Java. In
this thesis I provide such an implementation. In addition, I have developed a Graphical User Interface
(GUI). This greatly improves the user-friendliness of the scheme and lowers the threshold
for the user. The low threshold opens this cryptographic scheme to users with limited
knowledge of Java.
Chapter 2
Types of encryption
First of all, I would like to introduce some central concepts in this thesis. Cryptography
comes from ancient Greek and means “secret writing”. The field of cryptography is based on
some core criteria: data confidentiality, data integrity, authentication and non-repudiation.
Cryptographic algorithms should therefore follow these principles or be set up
in an environment where they are not vulnerable to exploitation.
A common misconception is that we should only use cryptographic algorithms that are
completely impenetrable. Modern ciphers are designed to achieve sound computational
security, meaning that it is intractable to develop an attack against the cipher
with complexity lower than the brute-force attack, which exhausts all possible keys for the
target cipher. Almost all cryptographic algorithms in use today are in theory vulnerable
to brute-force attacks. Creating a cipher that is not vulnerable to brute-force attacks
is generally considered to be an impossible task, except for the One-Time Pad (OTP).
However, OTP has a significant shortcoming, which I will discuss in Subsection 2.1.1. The
thinking behind many cryptographic algorithms is therefore not that they should be impenetrable.
Even though many of the ciphers that are commonly used today are vulnerable in theory,
they are generally secure in practice.
2.1 Symmetric encryption (SE)
Stream ciphers encrypt data in a continuous stream, while block ciphers encrypt
data block by block. Stream ciphers are generally faster than block ciphers in hardware
implementations, but they are also generally less complex and less secure.
An early recorded example of symmetric encryption is the Caesar cipher [17]. This stream
cipher encrypts by moving each letter of the message a fixed number of positions forward
in the alphabet. The receiver must move each of the letters the same number of positions
back in the alphabet to decrypt. The number of positions that the user has to move each
letter back in the alphabet is known as the private key.
Example. If you want to encrypt the message “attack at morning” it will become “nggnpx
ng zbeavat” if you use the private key of 13.
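The example above can be reproduced with a few lines of Python. This is a minimal sketch that assumes a lowercase 26-letter English alphabet and leaves spaces untouched; the function names are illustrative and do not come from the thesis implementation.

```python
import string

ALPHABET = string.ascii_lowercase  # assumption: plain 26-letter English alphabet


def caesar_shift(text: str, key: int) -> str:
    """Move every lowercase letter `key` positions forward; keep other characters."""
    shifted = []
    for ch in text:
        if ch in ALPHABET:
            shifted.append(ALPHABET[(ALPHABET.index(ch) + key) % 26])
        else:
            shifted.append(ch)
    return "".join(shifted)


def caesar_encrypt(plaintext: str, key: int) -> str:
    return caesar_shift(plaintext, key)


def caesar_decrypt(ciphertext: str, key: int) -> str:
    # Decryption moves each letter the same number of positions back.
    return caesar_shift(ciphertext, -key)


ciphertext = caesar_encrypt("attack at morning", 13)
print(ciphertext)                      # nggnpx ng zbeavat
print(caesar_decrypt(ciphertext, 13))  # attack at morning
```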
Today's stream ciphers typically use bits instead of letters. A stream cipher operates
on two bit streams: one stream for the data we want to encrypt, called the plaintext
stream, and one stream for the key we encrypt the data with, called the key stream.
plaintext stream $X = x_1, x_2, \ldots, x_n$
key stream $K = k_1, k_2, \ldots, k_n$
Definition 2.1.1 (Stream cipher encryption and decryption) The plaintext, the
ciphertext and the key stream consist of individual bits, i.e., $x_i, y_i, k_i \in \{0, 1\}$ [24].
Encryption: $y_i \equiv x_i + k_i \pmod 2$
Decryption: $x_i \equiv y_i + k_i \pmod 2$
We add the key stream bit to encrypt, and subtract the key stream bit to
decrypt. Because we work modulo 2, $2k_i \equiv 0 \pmod 2$, so
subtraction and addition are identical.
Stream ciphers differ in the way they generate the key stream.
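The bitwise encryption and decryption described above can be sketched as follows. The bit values are arbitrary illustrative data, and the key stream is simply given rather than generated by a particular cipher.

```python
def add_mod2(bits, key_stream):
    """Add two equally long bit streams position by position, modulo 2."""
    if len(bits) != len(key_stream):
        raise ValueError("plaintext and key stream must have the same length")
    return [(b + k) % 2 for b, k in zip(bits, key_stream)]


plaintext = [1, 0, 1, 1, 0, 0, 1, 0]    # illustrative plaintext stream X
key_stream = [0, 1, 1, 0, 1, 0, 1, 1]   # illustrative key stream K

ciphertext = add_mod2(plaintext, key_stream)
recovered = add_mod2(ciphertext, key_stream)  # adding K again undoes the encryption

print(ciphertext)  # [1, 1, 0, 1, 1, 0, 0, 1]
print(recovered)   # [1, 0, 1, 1, 0, 0, 1, 0]
```

Encryption and decryption are the same function because adding the same key bit twice contributes $2k_i \equiv 0 \pmod 2$.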
One-Time Pad (OTP)
OTP is the only stream cipher that fully satisfies the confidentiality criterion and keeps the
data secret from attackers. OTP is unconditionally secure. It cannot be broken even
if the attacker has unlimited computational resources and time. This stream cipher
achieves this by generating a key stream that is completely random and in which each
key stream bit is used only once. One major drawback is that the key has to have
the same length as the message. If you want to encrypt a film of 2 gigabits, you have to
use a cryptographic key of 2 gigabits. OTP is rarely used today because of its extensive
use of memory for key material.
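A minimal sketch of the One-Time Pad on bytes is given below. It assumes that the operating system's random source, accessed through Python's `secrets` module, is an acceptable stand-in for a truly random key stream; the function names are illustrative.

```python
import secrets


def otp_encrypt(message: bytes):
    """Encrypt with a fresh random key that is exactly as long as the message."""
    key = secrets.token_bytes(len(message))
    ciphertext = bytes(m ^ k for m, k in zip(message, key))
    return key, ciphertext


def otp_decrypt(key: bytes, ciphertext: bytes) -> bytes:
    if len(key) != len(ciphertext):
        raise ValueError("the key must be as long as the ciphertext")
    return bytes(c ^ k for c, k in zip(ciphertext, key))


message = b"attack at morning"
key, ciphertext = otp_encrypt(message)
print(len(key) == len(message))        # the key is as long as the message
print(otp_decrypt(key, ciphertext))    # b'attack at morning'
```

The key must never be reused: a second message encrypted with the same key would no longer be unconditionally secure.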
The linear-feedback shift register (LFSR) generates a key stream bit by pushing each bit one space to the right. The
rightmost bit is removed and added to the key stream. This bit is also XORed
with several other bits in the LFSR, and the result is put in the first slot. In Figure 2.1, the rightmost
bit, which is the bit at slot 16, becomes the first bit of the key stream. This bit is
also XORed with the bits at slots 14, 13 and 11 before being put in slot 1. If the
sender and the receiver share the initial state of the LFSR, they are
able to generate the same key stream.
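The shifting and feedback described above can be sketched as follows, assuming a 16-slot register in which list index 0 holds slot 1 and index 15 holds slot 16. The initial state is an arbitrary illustrative value, not one taken from the thesis.

```python
def lfsr_key_stream(initial_state, length):
    """Generate `length` key stream bits from a 16-slot LFSR with taps 16, 14, 13, 11."""
    if len(initial_state) != 16 or not any(initial_state):
        raise ValueError("the state must hold 16 bits and must not be all zeros")
    state = list(initial_state)  # state[0] is slot 1, state[15] is slot 16
    stream = []
    for _ in range(length):
        out = state[15]                                           # rightmost bit
        feedback = state[15] ^ state[13] ^ state[12] ^ state[10]  # slots 16, 14, 13, 11
        stream.append(out)
        state = [feedback] + state[:15]                           # shift right, refill slot 1
    return stream


seed = [1, 0, 1, 0, 1, 1, 0, 0, 1, 1, 1, 0, 0, 0, 0, 1]  # illustrative shared initial state
sender = lfsr_key_stream(seed, 32)
receiver = lfsr_key_stream(seed, 32)
print(sender == receiver)  # True: the same initial state gives the same key stream
```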
An example of an LFSR cipher used in practice is the A5/1 cipher. This cipher was a part
of the Global System for Mobile communications (GSM). This cellular telephone standard
describes the protocols used in 2G and was developed by the European Telecommunications
Standards Institute (ETSI). 2G was first commercially launched in December 1991 in
Finland [14].
There are many block cipher encryption algorithms. I will cover AES.
• AES-128
• AES-192
• AES-256
Each of the numbers represents the bit length of the encryption key. The U.S. government
uses AES-192 or AES-256 to store top secret information. AES-192 and AES-256 are
therefore widely assumed to be secure. AES operates on blocks of plaintext data
that are 128 bits long. If a block is smaller than 128 bits, the remaining bits are filled
with padding. AES is widely used in a variety of security protocols, including Transport
Layer Security (TLS)/Secure Sockets Layer (SSL), Secure Shell (SSH) and many Wi-Fi
encryption standards.
Mode of operation
A mode of operation is an algorithm that describes how to repeatedly apply a block cipher
such as AES to transform a message into ciphertext while
using the same cryptographic key. Different modes of operation can be used to improve the
security of a cipher, so that it becomes more difficult for an attacker to guess the key used to
encrypt the message. Even though we use AES, the encrypted data is not guaranteed to
be secure if we use an insecure mode of operation. Some common modes of operation
include electronic codebook (ECB) mode and cipher block chaining (CBC) mode.
Even though we use AES-256 as the block cipher, the encrypted data can still
in some cases be recovered by an attacker. An example of this is encrypting a penguin
image using ECB. This example is illustrated in Figure 2.4.
Figure 2.4: Encrypted penguin with Electronic Codebook mode
You can still recognise that the pattern is a penguin because the same plaintext is always
mapped to the same ciphertext. The attacker in this case does not have to do anything
since the human eye can recognise the information.
As you can see in Figure 2.6 CBC does not have the pattern of a penguin.
Figure 2.6: Encrypted penguin with Cipher Block Chaining mode
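The difference between the two penguin images can be reproduced on a small scale. The sketch below is hedged in two ways: the Python standard library has no AES, so a toy four-round Feistel network built from SHA-256 stands in for the block cipher (it is not secure and is not the cipher used for the figures), and the key, IV and data are illustrative values. In CBC each plaintext block is XORed with the previous ciphertext block (the IV for the first block) before it is encrypted.

```python
import hashlib

BLOCK = 16  # bytes, matching the 128-bit block size of AES


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def toy_encrypt_block(key: bytes, block: bytes) -> bytes:
    """Toy 4-round Feistel network standing in for AES. NOT secure."""
    left, right = block[:8], block[8:]
    for round_number in range(4):
        f = hashlib.sha256(key + bytes([round_number]) + right).digest()[:8]
        left, right = right, xor(left, f)
    return left + right


def split(data: bytes):
    return [data[i:i + BLOCK] for i in range(0, len(data), BLOCK)]


def ecb_encrypt(key: bytes, data: bytes) -> bytes:
    # Every block is encrypted independently of the others.
    return b"".join(toy_encrypt_block(key, block) for block in split(data))


def cbc_encrypt(key: bytes, iv: bytes, data: bytes) -> bytes:
    previous, out = iv, []
    for block in split(data):
        previous = toy_encrypt_block(key, xor(block, previous))  # chain the blocks
        out.append(previous)
    return b"".join(out)


key = b"demo key"                   # illustrative key
iv = bytes(range(16))               # illustrative IV
data = b"same 16B block!!" * 3      # three identical plaintext blocks

print(len(set(split(ecb_encrypt(key, data)))))      # 1: ECB repeats the pattern
print(len(set(split(cbc_encrypt(key, iv, data)))))  # 3: CBC hides the repetition
```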
CBC can be vulnerable to the same pattern recognition if the attacker uses brute force
to test all possible IVs. The security of CBC is therefore reliant on the length
of the IV. CBC is also sensitive to noise interference. If one bit of a ciphertext block is flipped before
decryption, that block decrypts to garbage and the corresponding bit of the following block is flipped as well. There are
other, more advanced modes of operation, such as Counter mode, which is able to generate a
different output given the same plaintext without chaining. Even though CBC has some
shortcomings, they are by far outweighed by its advantages. CBC is used in the
Transport Layer Security (TLS) protocol (up to version 1.2), which provides privacy while communicating
over the Internet.
Many modern applications use different modes of operation in conjunction with each other to
compensate for each other's weaknesses.
A big problem with symmetric encryption is the difficulty of exchanging
the private key in a secure way. Asymmetric encryption solves this problem by using two
keys: a public key used for encryption and a private key used for decryption. You can
therefore publish the public key. If anyone wants to send you a secret message, they can
safely do so by encrypting their message with the public key before sending it.
There are many cases where asymmetric and symmetric encryption work hand in hand.
We can use asymmetric cryptography to transfer symmetric keys. Asymmetric encryption
was first developed in 1973 by the English mathematician Clifford Cocks at the Government
Communications Headquarters (GCHQ). The system was secret and classified until
1997.
In the meantime, the computer scientists Rivest, Shamir, and Adleman developed the
“RSA algorithm” to perform public key cryptography. It was first described in 1977 and
later published in the paper “A method for obtaining digital signatures and public-key
cryptosystems” in 1978 [26]. RSA is based on the hardness of factoring large integers.
The RSA algorithm is described below.
RSA encryption:
$c = m^e \pmod n$
RSA decryption:
$m = c^d \pmod n$
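The two formulas can be tried out with toy numbers. The sketch below assumes the usual textbook key generation, which is not spelled out in the text above: two primes p and q, the modulus n = p·q, and a private exponent d that is the inverse of e modulo (p − 1)(q − 1). The concrete values are illustrative and far too small to be secure.

```python
# Toy parameters (assumptions for illustration only).
p, q = 61, 53
n = p * q                   # modulus, 3233
phi = (p - 1) * (q - 1)     # 3120
e = 17                      # public exponent, coprime to phi
d = pow(e, -1, phi)         # private exponent (needs Python 3.8+), 2753


def rsa_encrypt(m: int) -> int:
    return pow(m, e, n)     # c = m^e mod n


def rsa_decrypt(c: int) -> int:
    return pow(c, d, n)     # m = c^d mod n


c = rsa_encrypt(65)
print(c)                # 2790
print(rsa_decrypt(c))   # 65
```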
2.2.2 Diffie-Hellman key exchange
This key-exchange scheme allows two parties that have no prior knowledge of each other
to establish a shared secret key over an insecure channel. The key exchange is based on
a hard mathematical problem called the “discrete logarithm problem”. This problem is
believed to be hard for classical computers, although it is not known to be NP-Complete. As an example, suppose that two persons, Alice
and Bob, want to share a common secret.
1. Alice and Bob agree on two integer values: a modulus p and a generator g.
Note that g is a generator of $\mathbb{Z}_p^*$ if for every $a \in \mathbb{Z}_p^*$ we have $g^k \equiv a \pmod p$ for
some k.
2. Alice chooses a secret integer a and sends Bob $A \equiv g^a \pmod p$. The secret can
also be referred to as the discrete logarithm of A with respect to the base g.
3. Bob chooses a secret integer b and sends Alice $B \equiv g^b \pmod p$.
4. Alice computes $B^a \pmod p$ and Bob computes $A^b \pmod p$.
We know that Alice and Bob have the same secret because
$B^a \equiv (g^b)^a \equiv g^{ab} \equiv (g^a)^b \equiv A^b \pmod p$
Figure 2.7 provides an illustration of how the Diffie-Hellman key exchange works.
The common paint represents the variables g and p. The secret colors represent the
variables a for Alice and b for Bob. Even though the color generated from the mixture of
the common and secret paint is public knowledge, it is hard for the attacker to separate
the secret colors from the common paint. The hardness of this is known as the “Discrete
Logarithm Problem”. This problem is believed to be hard, but it is not known to be NP-Complete. The NP-Complete
group is discussed in more detail in Section 3.2.
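The exchange can be followed with toy numbers. The sketch below assumes the standard protocol in which Bob chooses his own secret b and sends $B \equiv g^b \pmod p$, and both parties raise the received value to their own secret. The values p = 23, g = 5, a = 6 and b = 15 are illustrative and far too small for real use.

```python
# Public values (the "common paint"); illustrative toy parameters.
p, g = 23, 5

# Secret values (the "secret colors").
a = 6    # Alice's secret
b = 15   # Bob's secret

A = pow(g, a, p)   # Alice sends A = g^a mod p
B = pow(g, b, p)   # Bob sends B = g^b mod p

alice_secret = pow(B, a, p)   # Alice computes B^a mod p
bob_secret = pow(A, b, p)     # Bob computes A^b mod p

print(A, B)                       # 8 19
print(alice_secret, bob_secret)   # 2 2
```

An eavesdropper sees p, g, A and B, and would have to recover a or b from them, which is the discrete logarithm problem.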
When the idea of homomorphic encryption was first presented by Rivest et al. in 1978
[25], they discovered that RSA has a multiplicative homomorphism. To explain this idea,
imagine two values x and y. We encrypt each of them with the public exponent e
by raising it to the power e, as you can recall from Subsection
2.2.1.
$\mathrm{encrypt}(x) \equiv x^e \pmod n$
$\mathrm{encrypt}(y) \equiv y^e \pmod n$
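The multiplicative homomorphism means that you can multiply the encrypted values
and get the product of the plaintexts when you decrypt the result:
$\mathrm{encrypt}(x) \cdot \mathrm{encrypt}(y) \equiv x^e \cdot y^e \equiv (x \cdot y)^e \equiv \mathrm{encrypt}(x \cdot y) \pmod n$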
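This property can be checked numerically. The sketch below uses illustrative toy RSA parameters (n = 3233, e = 17, d = 2753), which are assumptions for the demonstration; the product x · y must stay below n for the decrypted value to equal the plain product.

```python
# Toy RSA parameters (illustrative; far too small to be secure).
n, e, d = 3233, 17, 2753


def encrypt(m: int) -> int:
    return pow(m, e, n)


def decrypt(c: int) -> int:
    return pow(c, d, n)


x, y = 7, 3
product_of_ciphertexts = (encrypt(x) * encrypt(y)) % n

# Multiplying the ciphertexts corresponds to multiplying the plaintexts.
print(decrypt(product_of_ciphertexts))         # 21
print(product_of_ciphertexts == encrypt(x * y))  # True
```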
Designing a crypto system that has both additive and multiplicative homomorphism is
much harder. Solving this problem was long considered the holy grail of cryptography.
A breakthrough occurred 30 years after the idea was first presented, when Craig Gentry
published the first ever fully homomorphic encryption scheme in his PhD thesis in 2009
[10].
Craig Gentry and Shai Halevi later implemented the scheme and published their work
in a proceedings paper in 2011 [11]. They showed how the scheme's security and performance
work in practice. They made a rough estimate and found that the security parameter
λ, which is the same as the dimension of the lattice, should be at least $2^{13}$ to $2^{15}$ to
be considered secure. A lattice is defined in Section 3.3. They ran the implementation
on a powerful IBM System x3500 server, featuring a 64-bit quad-core Intel Xeon E5450
processor running at 3 GHz, with 12 MB L2 cache and 24 GB of RAM. The implementation
used 2.2 hours to generate the cryptographic keys and 31 minutes to recrypt, which is
an operation that reduces the noise by treating decryption as an evaluation process. Gentry
admitted that the scheme was atrociously slow. It would not be usable for a
long time, even with the help of Moore's law. Since then, the speed of FHE schemes
has increased around 8 times each year, but performance is still a major obstacle.
There are multiple levels of homomorphic encryption. The levels are distinguished by the
types and number of operations you can perform on the encrypted data.
Partially homomorphic encryption only works for one type of algebraic operation on encrypted data: either addition,
encrypt(a + b), or multiplication, encrypt(a · b). RSA [26] is an example of a
partially homomorphic encryption scheme.
2.4.2 Leveled fully homomorphic encryption
This allows both types of algebraic operations, addition and multiplication, on encrypted
data, but only for a limited number of operations. The advantage of using
leveled homomorphic encryption is that it is fast when you know how many operations
you are going to perform. Each mathematical operation increases the noise.
The noise slowly corrupts the data, making it less and less accurate, until the data is
unusable.
FHE is an increasingly attractive method for computation on sensitive genomic data. FHE
can be applied to the evaluation of various algorithms, such as machine learning, on encrypted
financial, medical, or genomic data [15][20][16][21].
Chapter 3
Preliminaries
A good crypto system should be easy to construct and implement, but hard to crack.
This can be compared to building a chest for keeping secrets. It is a delicate tradeoff
between how expensive the chest is to build and its security level against intruders. The
era of quantum computers has not yet arrived. However, quantum algorithms capable
of solving number-theoretic hard problems, such as integer factorization and discrete
logarithm, in polynomial time have already been developed. This entails that the
widely used RSA cryptosystem and the Diffie-Hellman key exchange will be vulnerable once quantum
computers are brought from theory to reality. To prevent successful attacks, we must
therefore base new crypto systems on mathematical problems that remain hard for quantum computers.
We define $\lfloor \cdot \rceil$ as rounding to the closest integer; if the value is exactly between two
integers we round upwards. We use $\mathbb{Z}_q$ to denote the set of all integers modulo q. We use
$\langle \cdot, \cdot \rangle$ to denote the dot product of two vectors. We use Ω(·) as a symbol for the average-case
time complexity that the fastest known algorithm uses to solve a problem. We use
O(·) as the worst-case time complexity an algorithm uses given all possible inputs.
3.2 Problems complexity
This image shows the different classifications of problems, ranging from the least complex
problems at the bottom to the most complex problems at the top. This image is based on
the popular assumption that $P \neq NP$.
P
P is the set of problems that can be solved in polynomial time.
NP
NP stands for non-deterministic polynomial time. Solutions to problems in NP can be verified in
polynomial time. However, we do not necessarily know whether we can find a solution in polynomial
time.
NP-Complete
NP-Complete problems are the hardest problems in NP: every problem in NP can be reduced to them
in polynomial time. In particular, all NP-Complete problems can be reduced to each other in polynomial time. This
means that if you are able to solve one of them in polynomial time, you can solve all of
them in polynomial time.
NP-Hard
The class of problems π such that all problems in NP can be reduced to π in polynomial time. A problem π
in this class is not necessarily in NP itself.
Cryptosystems
If you base your cryptosystem on a very complex problem, it becomes resilient against
attacks. By properly using problems that are NP-Hard we could design a resilient crypto
system. When it comes to cryptography, it is important that the problem we use is
Ω(NP-Hard).
A lattice is similar to a vector space, except that the coefficients are restricted to integers. Each of the points in the lattice is an integer linear
combination of the basis vectors:
$L(B) = \{ a_1 \vec{b}_1 + a_2 \vec{b}_2 + \cdots + a_n \vec{b}_n : a_1, \ldots, a_n \in \mathbb{Z} \}$
Here we refer to B as the basis of the lattice. We can take the integer-scaled sum of the basis vectors
and create other vectors that are also in the lattice.
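3.4 Shortest Vector Problem (SVP)
Lattices give rise to many problems that are NP-hard. The Shortest Vector Problem (SVP),
which is defined in 3.4.1, was proven to be NP-hard by Daniele Micciancio [22]. Here we
use the Euclidean norm, which is defined in 3.4.2.
Definition 3.4.1 (Shortest Vector Problem) Given a basis $B = \{\vec{b}_1, \vec{b}_2, \ldots, \vec{b}_n\} \in \mathbb{Z}^{m \times n}$,
the shortest vector problem is to find a nonzero vector $\vec{v}$ in the lattice generated by B satisfying
$\|\vec{v}\| = \min \{ \|\vec{u}\| : \vec{u} \text{ is a nonzero vector of the lattice generated by } B \}$
Example. The Shortest Vector Problem might be represented as this. For simplicity, we
base our lattices on the integers.
[Figure 3.2: a lattice with the basis vectors $\vec{b}_1$, $\vec{b}_2$ and the shortest vector $\vec{v}$]
In Figure 3.2 we illustrate a lattice $L \subset \mathbb{R}^2$ with the basis $B = \{[-27, 13], [-41, 17]\}$.
In this case, the shortest vector $\vec{v}$ is [1, 5] because:
$\vec{v} = 3\vec{b}_1 - 2\vec{b}_2 = [1, 5]$ with $\|\vec{v}\| = \sqrt{1^2 + 5^2} = \sqrt{26}$, and no other nonzero lattice vector is shorter.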
This might look like an easy problem to solve, given that you only have two dimensions.
However, the problem's difficulty increases rapidly as the number of dimensions increases.
The Lenstra–Lenstra–Lovász lattice basis reduction algorithm [18] runs in polynomial time, but it only
guarantees a vector whose length is within a factor exponential in n of the shortest one. The fastest known
algorithms that solve the problem exactly have a time complexity of $\Omega(2^n)$, where n is the dimension of the lattice. SVP has also been
proven to be Ω(NP-Hard) by Miklós Ajtai in his proceedings in 1998 [3]. As n grows,
this becomes too time-consuming for existing computers to solve, and no quantum algorithm
has yet been discovered that can solve it efficiently.
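For the two-dimensional example with the basis {[−27, 13], [−41, 17]}, the shortest vector can still be found by exhaustive search. The sketch below simply tries all integer coefficient pairs within a small bound, which is an assumption that is large enough for this example; the number of combinations grows exponentially with the dimension, which is why this approach does not scale.

```python
from itertools import product


def shortest_vector(b1, b2, bound=10):
    """Brute-force search over all c1*b1 + c2*b2 with |c1|, |c2| <= bound."""
    best, best_norm_sq = None, None
    for c1, c2 in product(range(-bound, bound + 1), repeat=2):
        if (c1, c2) == (0, 0):
            continue  # the zero vector does not count
        v = (c1 * b1[0] + c2 * b2[0], c1 * b1[1] + c2 * b2[1])
        norm_sq = v[0] ** 2 + v[1] ** 2
        if best_norm_sq is None or norm_sq < best_norm_sq:
            best, best_norm_sq = v, norm_sq
    return best, best_norm_sq


v, norm_sq = shortest_vector((-27, 13), (-41, 17))
print(norm_sq)  # 26, the squared length of [1, 5] (the search may return [-1, -5])
```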
The SVP problem is hard to solve given a bad basis, and easy if you have a good basis.
A good basis is one in which the vectors are reasonably orthogonal to one another. The solution to SVP on a
good basis is just the shortest basis vector. This is illustrated in Figure 3.3, where $\vec{b}_1$ is
the shortest vector.
[Figure 3.3: a lattice with a good basis $\vec{b}_1$, $\vec{b}_2$]
3.5 Closest Vector Problem (CVP)
CVP is closely related to SVP. Given a lattice L and a target point x, CVP asks us to
find the closest lattice point to the target [23]. A mathematical definition is given
in Definition 3.5.1. Goldreich et al. showed that any hardness of SVP implies the same
hardness for CVP [12]. Solving CVP has a Ω(NP-hard) time complexity if you have a
bad basis. On the other hand, if you have a good basis, CVP can be solved in polynomial
time using Babai's algorithm [4].
Definition 3.5.1 (Search Closest Vector Problem) For any approximation parameter
$\gamma = \gamma(n) \geq 1$, the search problem $\mathrm{CVP}_\gamma$ is defined as follows.
The input is a basis $B \subset \mathbb{R}^n$ for a lattice L and a vector $t \in \mathbb{R}^n$, the target. The goal is to
output a vector $y \in L$ satisfying
$\|t - y\| \leq \gamma \cdot \mathrm{dist}(t, L)$
In general, if the basis vectors are close to orthogonal to each other, CVP can easily be
solved using Babai's closest vertex algorithm (Algorithm 1). László Babai was able to show that in this case you
can solve CVP with polynomial time complexity.
Algorithm 1 (Babai's closest vertex algorithm), for a basis $v_1, \ldots, v_n$ and a target vector $w$:
Write $w = t_1 v_1 + t_2 v_2 + \cdots + t_n v_n$ with $t_1, \ldots, t_n \in \mathbb{R}$.
Set $a_i = \lfloor t_i \rceil$ for $i = 1, 2, \ldots, n$.
Return the vector $v = a_1 v_1 + a_2 v_2 + \cdots + a_n v_n$.
3.5.1 Solving CVP with a good basis
Suppose we want to solve CVP with a good basis as illustrated in Figure 3.4.
[Figure 3.4: a lattice with the good basis $\vec{b}_1$, $\vec{b}_2$ and the closest lattice point $\vec{y}$]
Problem: To solve this CVP we must find the closest lattice vector to the target point [31, −2].
This point is marked as a blue triangle.
Solution: Since the basis vectors are reasonably orthogonal, we can solve CVP by using
Babai's algorithm (Algorithm 1), and we write the equations:
$1a_1 + 14a_2 = 31$
$5a_1 - 4a_2 = -2$
The system has the following solution:
$a_1 = \frac{48}{37} \approx 1$
$a_2 = \frac{157}{74} \approx 2$
Rounding gives $1 \cdot [1, 5] + 2 \cdot [14, -4] = [29, -3]$.
[29, −3] is the closest lattice point and is marked as a green dot in Figure 3.4.
When we solve CVP for the same target point [31, −2] but are given the bad
basis from Figure 3.2, we write the equations:
$-27a_1 - 41a_2 = 31$
$13a_1 + 17a_2 = -2$
$a_1 = \frac{445}{74} \approx 6$
$a_2 = -\frac{349}{74} \approx -5$
Rounding gives the lattice point $6 \cdot [-27, 13] - 5 \cdot [-41, 17] = [43, -7]$.
This is not the closest lattice point. We can see that the lattice point found with the good basis,
marked with a green dot in Figure 3.4, is significantly closer than the point we got from
the bad basis (marked as a red dot) in Figure 3.5.
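[Figure 3.5: the lattice with the bad basis $\vec{b}_1$, $\vec{b}_2$ and the point $\vec{y}$ found by rounding]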
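Both calculations can be repeated with a short program. This is a minimal two-dimensional sketch of Babai's rounding step: it solves the 2×2 system with Cramer's rule using exact fractions, and rounds halves upwards as defined in the notation section. The function names are illustrative.

```python
from fractions import Fraction
import math


def coordinates(basis, target):
    """Solve a1*b1 + a2*b2 = target exactly with Cramer's rule."""
    (b1x, b1y), (b2x, b2y) = basis
    tx, ty = target
    det = b1x * b2y - b2x * b1y
    a1 = Fraction(tx * b2y - b2x * ty, det)
    a2 = Fraction(b1x * ty - b1y * tx, det)
    return a1, a2


def babai(basis, target):
    """Round each coordinate to the closest integer and return the lattice point."""
    (b1x, b1y), (b2x, b2y) = basis
    a1, a2 = (math.floor(t + Fraction(1, 2)) for t in coordinates(basis, target))
    return (a1 * b1x + a2 * b2x, a1 * b1y + a2 * b2y)


good_basis = ((1, 5), (14, -4))
bad_basis = ((-27, 13), (-41, 17))
target = (31, -2)

print(coordinates(good_basis, target))  # (Fraction(48, 37), Fraction(157, 74))
print(babai(good_basis, target))        # (29, -3)
print(babai(bad_basis, target))         # (43, -7)
```

The point found with the good basis lies at distance √5 from the target, while the point found with the bad basis lies at distance 13.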
As shown in the previous subsection, solving the same CVP problem with a bad basis is
hard, while solving it with a good basis is easy. We can use this idea to create a crypto
system where we use the bad basis as a public key and the good basis as a private key.
3. Bob sends the bad basis to Alice with each of the vectors enumerated $\vec{b}_1, \vec{b}_2, \ldots, \vec{b}_n$.
4. Alice encrypts her numbers by multiplying them with the bad basis, which generates
a point x in the lattice.
5. Alice then adds a small error to x; let's call this new point z.
7. Bob can use his good basis to quickly solve the CVP problem by finding the lattice point x
closest to z.
8. Bob can then multiply x with the inverse of the bad basis and get Alice's secret
numbers.
Alice sends z instead of x because an attacker who sees x can easily find Alice's secret
numbers in the same way as Bob does in step 8.
Lattice-based cryptosystems today essentially use the above idea, but they are more
elaborate.
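The steps above can be sketched with the two-dimensional bases from the CVP examples. The secret numbers (5, −4) and the error [2, 1] are illustrative choices made for this sketch; they are picked so that the ciphertext becomes the target point [31, −2] from the earlier examples. A two-dimensional lattice offers no real security.

```python
from fractions import Fraction
import math

GOOD = ((1, 5), (14, -4))       # Bob's private key (good basis)
BAD = ((-27, 13), (-41, 17))    # Bob's public key (bad basis)


def coordinates(basis, point):
    """Solve c1*b1 + c2*b2 = point exactly with Cramer's rule."""
    (b1x, b1y), (b2x, b2y) = basis
    px, py = point
    det = b1x * b2y - b2x * b1y
    return (Fraction(px * b2y - b2x * py, det), Fraction(b1x * py - b1y * px, det))


def combine(basis, coeffs):
    (b1x, b1y), (b2x, b2y) = basis
    c1, c2 = coeffs
    return (c1 * b1x + c2 * b2x, c1 * b1y + c2 * b2y)


def babai(basis, target):
    rounded = tuple(math.floor(t + Fraction(1, 2)) for t in coordinates(basis, target))
    return combine(basis, rounded)


def encrypt(numbers, error):
    x = combine(BAD, numbers)                  # lattice point x
    return (x[0] + error[0], x[1] + error[1])  # z = x + small error


def decrypt(z):
    x = babai(GOOD, z)                         # closest lattice point via the good basis
    m1, m2 = coordinates(BAD, x)               # exact integers, since x is in the lattice
    return (int(m1), int(m2))


z = encrypt((5, -4), (2, 1))
print(z)                                     # (31, -2)
print(decrypt(z))                            # (5, -4)
# An attacker who rounds with the public bad basis lands on the wrong lattice point.
print(coordinates(BAD, babai(BAD, z)))       # (Fraction(6, 1), Fraction(-5, 1))
```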
The LWE problem was first introduced by Oded Regev in 2005 [1]. LWE is based on
solving a linear system of equations with errors. There are two types of LWE problems:
search LWE and decisional LWE.
Search LWE asks us to recover a secret vector $s \in \mathbb{Z}_q^n$ when we are given m samples of
$(a_i, \langle a_i, s \rangle + e_i)$,
where $a_i \in \mathbb{Z}_q^n$ is sampled from a uniform distribution and $e_i$ is sampled from a discrete
Gaussian distribution χ.
Example 1 (Search LWE) This problem asks us to recover the secret $s \in \mathbb{Z}_{21}^5$ from the
following system of equations. Each of the equations has an error e of at most ±1.
In this problem s = (19, 9, 2, 5, 16). If it were not for the error, this problem could easily
be solved in polynomial time using Gaussian elimination. But in LWE the Gaussian
elimination amplifies the error to such an extent that we would be unable to recover
the information in s. This makes the problem significantly harder.
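To make the shape of such a problem concrete, the sketch below generates LWE samples for q = 21 and the secret s = (19, 9, 2, 5, 16). The samples are randomly generated here and are not the equations of the example; as a simplifying assumption the error is drawn uniformly from {−1, 0, 1} instead of from a discrete Gaussian distribution.

```python
import random

q = 21
s = (19, 9, 2, 5, 16)  # the secret from the example


def lwe_sample(secret, rng):
    """Return one sample (a, <a, s> + e mod q) with a small error e."""
    a = tuple(rng.randrange(q) for _ in secret)
    e = rng.choice((-1, 0, 1))  # assumption: uniform small error, not Gaussian
    b = (sum(ai * si for ai, si in zip(a, secret)) + e) % q
    return a, b


def residual(a, b, candidate):
    """Centered value of b - <a, candidate> mod q; small for the right secret."""
    r = (b - sum(ai * ci for ai, ci in zip(a, candidate))) % q
    return r - q if r > q // 2 else r


rng = random.Random(0)  # fixed seed so the run is repeatable
samples = [lwe_sample(s, rng) for _ in range(8)]

for a, b in samples:
    print(a, b, residual(a, b, s))  # the residual is always -1, 0 or 1
```

Anyone who knows s can check a sample, but recovering s from the samples alone is the search LWE problem.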
The Decisional LWE problem gives as input samples (a, v), where $a \in \mathbb{Z}_q^n$ is chosen
from a uniform probability distribution. The difference between search and decisional
LWE is that we must determine, with non-negligible advantage, whether v is
chosen uniformly from $\mathbb{Z}_q$ or whether v is chosen to be $\langle a, s \rangle + e$, where $s \in \mathbb{Z}_q^n$ is uniformly
chosen and e is chosen from χ.
The RLWE problem was first introduced by Vadim Lyubashevsky, Chris Peikert, and
Oded Regev in 2013 [19]. RLWE is a variant of LWE, but instead of using vectors in $\mathbb{Z}_q^n$
it works with polynomials in the ring $\mathbb{Z}_q[X]/(X^n + 1)$. RLWE can be defined similarly
to LWE, with samples $(a_i, a_i \cdot s + e_i)$, but the variables are polynomials drawn from $\mathbb{Z}_q[X]/(X^n + 1)$.
Each of the variables uses the same probability distributions as in LWE.
RLWE has multiple advantages over LWE when it comes to homomorphic encryption.
One is that multiplication between polynomials can be done a lot quicker by using the
Fast Fourier Transform (FFT) algorithm. FFT has a time complexity of $O(n \log n)$.
This is significantly faster than the matrix-vector multiplication used in LWE, which has a time complexity of
$O(n^2)$.
By applying RLWE we can develop a crypto system: Let the public key be $pk = (a, -a \cdot s + e) \in (\mathbb{Z}_q[X]/(X^n + 1))^2$,
the secret key be $s \in \mathbb{Z}_q[X]/(X^n + 1)$ and the plaintext be
$m \in \mathbb{Z}_q[X]/(X^n + 1)$.
Encryption:
$\text{ciphertext} = (0, m) + pk = (a, m - a \cdot s + e) = (c_0, c_1)$
Decryption:
$m' = c_1 + c_0 \cdot s = m - a \cdot s + e + a \cdot s = m + e \approx m$
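The encryption and decryption formulas can be checked with a small program. The sketch below follows the simplified system exactly as written above (the ciphertext is the public key plus the message, with no fresh randomness per encryption). The parameters n = 8 and q = 7681, the ternary secret and error, and the schoolbook polynomial multiplication are illustrative assumptions; a real implementation would use an FFT-based multiplication.

```python
import random


def poly_add(a, b, q):
    return [(x + y) % q for x, y in zip(a, b)]


def poly_mul(a, b, q):
    """Schoolbook multiplication in Z_q[X]/(X^n + 1), using X^n = -1."""
    n = len(a)
    result = [0] * n
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            k = i + j
            if k < n:
                result[k] += ai * bj
            else:
                result[k - n] -= ai * bj  # wrap around with a sign change
    return [c % q for c in result]


def keygen(n, q, rng):
    s = [rng.choice((-1, 0, 1)) for _ in range(n)]  # small secret key
    a = [rng.randrange(q) for _ in range(n)]        # uniform polynomial
    e = [rng.choice((-1, 0, 1)) for _ in range(n)]  # small error
    minus_as = [(-c) % q for c in poly_mul(a, s, q)]
    pk = (a, poly_add(minus_as, e, q))              # pk = (a, -a*s + e)
    return pk, s, e                                 # e is returned only for the check below


def encrypt(pk, m, q):
    a, b = pk
    return a, poly_add(m, b, q)                     # (c0, c1) = (a, m - a*s + e)


def decrypt(ciphertext, s, q):
    c0, c1 = ciphertext
    return poly_add(c1, poly_mul(c0, s, q), q)      # c1 + c0*s = m + e


n, q = 8, 7681
rng = random.Random(1)
pk, s, e = keygen(n, q, rng)
m = [100 * i for i in range(n)]
m_noisy = decrypt(encrypt(pk, m, q), s, q)
noise = [((x - y + q // 2) % q) - q // 2 for x, y in zip(m_noisy, m)]
print(noise == e)  # True: the decrypted value differs from m only by the small error
```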
Chapter 4
CKKS Scheme
In this chapter I describe the different stages of the CKKS scheme. CKKS allows you to
perform approximate fully homomorphic operations on complex floating point numbers
at a high speed. CKKS was introduced as an approximate encryption scheme in 2016
by Cheon et al. [5]. “Approximate” means that you will not get the exact message you
originally encrypted, but a number close to it, depending on the encryption parameters.
The CKKS scheme only encrypts numbers. Numbers can represent everything from
letters and symbols to entire computer programs. The authors of CKKS argue that all
real-world data will have some error, so recovering the exact number you encrypted is
not essential. CKKS is a public key crypto system. This implies that you have a public key
for encryption and a private key for decryption. The secret key should be kept private.
In this section I give a brief introduction to the mathematical background of CKKS. This
includes the message, the encoding and the encryption part.
4.1 Message
The CKKS scheme supports approximate arithmetic over complex numbers. The input
message of the crypto system must be a vector from the space $\mathbb{C}^{n/2}$, where n is some
power-of-two integer. This vector is defined as $\vec{z} = (z_1, z_2, \ldots, z_{n/2}) \in \mathbb{C}^{n/2}$.
4.2 Encoding
Encoding is the process of changing the data, or the message, into a new format called
plaintext, so that it can be interpreted and encrypted by the CKKS scheme.
In this section I follow the encoding technique for packing messages from [5].
In CKKS we map the message $\vec{z} \in \mathbb{C}^{n/2}$ to a polynomial m(X) in the cyclotomic polynomial ring with
integer coefficients, $m(X) \in \mathbb{Z}[X]/(X^n + 1)$, where n is the degree of the polynomial
modulus. For simplicity and security n will be a power of 2. The encoding algorithm receives
two parameters: the message and the scaling factor ∆ > 1. Having a large ∆ gives us
more accuracy but reduces the number of homomorphic operations we can perform.
The easiest way to understand CKKS encoding is to explain each of the steps
individually.
4.2.1 Embedding
The inverse embedding operation defined as σ −1 is the last step of the encoding process.
We will start with this step because it will give us a better understanding of the steps
leading up to it and the properties to embedding. The embedding works as a map between
polynomials and vectors, which is an isometric ring homomorphism.
σ −1
Cn −−→ C[X]/(X n + 1)
σ
C[X]/(X n + 1) →
− Cn
Embedding
The embedding is easier to understand than the embedding-1 . I will therefore start
with the embedding.
Inverse embedding
Finding the polynomial that when evaluated on the roots of unity maps to the embedded
29
vector ~z is harder. The polynomial must satisfy m(X) = Σn−1 i n
i=0 αi X ∈ C[X]/(X + 1),
given the vector ~z, where α is the coefficients of the polynomial that we need to find.
Σn−1
j=0 αj (ζ
2i−1 j
) = zi , i = 1, . . . , n
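This can be solved as a system of linear equations $A\alpha = \vec{z}$, where A is the $n \times n$ Vandermonde matrix of $\zeta^{2i-1}$ for $i = 1, \ldots, n$. Since the polynomial has n coefficients, the exponents in each row run from 0 to $n-1$.
$$A = \begin{pmatrix}
(\zeta)^0 & (\zeta)^1 & (\zeta)^2 & \cdots & (\zeta)^{n-1} \\
(\zeta^3)^0 & (\zeta^3)^1 & (\zeta^3)^2 & \cdots & (\zeta^3)^{n-1} \\
(\zeta^5)^0 & (\zeta^5)^1 & (\zeta^5)^2 & \cdots & (\zeta^5)^{n-1} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
(\zeta^{2n-1})^0 & (\zeta^{2n-1})^1 & (\zeta^{2n-1})^2 & \cdots & (\zeta^{2n-1})^{n-1}
\end{pmatrix}$$
$$\alpha = A^{-1}\vec{z}$$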
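The system $A\alpha = \vec{z}$ can be sketched in Java for small n. This is only an illustration, not the encoder used in my library: it assumes $\zeta = e^{i\pi/n}$ and uses the fact that the columns of A are orthogonal, so that $A^{-1} = \frac{1}{n}\overline{A}^{T}$ and no general linear solver is needed. The class and method names are hypothetical, and complex numbers are stored as separate arrays of real and imaginary parts.

```java
/** Minimal sketch of the inverse embedding for small n (O(n^2), no FFT). */
final class InverseEmbeddingSketch {

    /**
     * Solves A * alpha = z, where row i of A holds the powers of zeta^(2i-1)
     * and zeta = exp(i*pi/n). Returns {real parts, imaginary parts} of alpha.
     */
    static double[][] inverseEmbed(double[] zRe, double[] zIm) {
        int n = zRe.length;
        double[] aRe = new double[n];
        double[] aIm = new double[n];
        for (int j = 0; j < n; j++) {
            for (int i = 0; i < n; i++) {
                // conj(zeta^((2i+1) * j)) with 0-based i: an entry of conj(A)^T
                double angle = -Math.PI * (2 * i + 1) * j / n;
                double c = Math.cos(angle);
                double s = Math.sin(angle);
                aRe[j] += zRe[i] * c - zIm[i] * s;
                aIm[j] += zRe[i] * s + zIm[i] * c;
            }
            aRe[j] /= n; // A^{-1} = conj(A)^T / n
            aIm[j] /= n;
        }
        return new double[][] {aRe, aIm};
    }

    public static void main(String[] args) {
        // The vector from the example in this subsection.
        double[][] alpha = inverseEmbed(new double[] {4.44, 4, 2, 0}, new double[] {2, 0, 1, 2});
        for (int j = 0; j < 4; j++) {
            System.out.printf("alpha_%d = %.4f%+.4fi%n", j + 1, alpha[0][j], alpha[1][j]);
        }
    }
}
```

The quadratic loop is chosen for readability; an FFT-based variant would be preferable for realistic n.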
Example
1) Inverse embedding
Suppose you want to map the vector $\vec{z} = [4.44 + 2i,\ 4,\ 2 + i,\ 2i]$ to a polynomial in $\mathbb{C}[X]/(X^4 + 1)$.
We now have the following system $A\alpha = \vec{z}$.
$$\begin{pmatrix}
1 & 0.7071 + 0.7071i & i & -0.7071 + 0.7071i \\
1 & -0.7071 + 0.7071i & -i & 0.7071 + 0.7071i \\
1 & -0.7071 - 0.7071i & i & 0.7071 - 0.7071i \\
1 & 0.7071 - 0.7071i & -i & -0.7071 - 0.7071i
\end{pmatrix} \cdot \begin{pmatrix} \alpha_1 \\ \alpha_2 \\ \alpha_3 \\ \alpha_4 \end{pmatrix} = \begin{pmatrix} 4.44 + 2i \\ 4 \\ 2 + i \\ 2i \end{pmatrix}$$
$\alpha_1 = 2.61 + 1.25i$
$\alpha_2 = -0.4525 - 0.6081i$
$\alpha_3 = 0.25 - 0.61i$
$\alpha_4 = 0.0990 - 1.6688i$
2) Additive homomorphism
To show the embedding's additive homomorphism we create a new polynomial $m'(X)$.
3) Embedding
We now map the polynomial back to a vector $\vec{z}'$.
$$\mathbb{C}[X]/(X^4 + 1) \xrightarrow{\sigma} \vec{z}' \in \mathbb{C}^4$$
Here we can see that $\vec{z}' = \vec{z} + \vec{z}$. This shows the additive homomorphism from step 2.
4.2.2 Inverse Natural Projection
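In 4.2.1 we mapped a vector $\vec{z} \in \mathbb{C}^n$ to $\mathbb{C}[X]/(X^n + 1)$. CKKS maps $\vec{z} \in \mathbb{C}^{n/2}$ to the cyclotomic polynomial ring $\mathbb{Z}[X]/(X^n + 1)$. We therefore have to modify the values of $\vec{z} \in \mathbb{C}^{n/2}$ so that they can be embedded in $\mathbb{Z}[X]/(X^n + 1)$. One of these steps is expanding $\vec{z} \in \mathbb{C}^{n/2}$ by using the inverse natural projection $\pi^{-1}$.
$$\mathbb{C}^{n/2} \xrightarrow{\pi^{-1}} H$$
such that
$$[z_1, \ldots, z_{n/2}] \xrightarrow{\pi^{-1}} [z_1, \ldots, z_{n/2}, z_{n/2+1}, \ldots, z_n]$$
where the appended entries are the complex conjugates of the original entries in reverse order, so that $z_n = \overline{z_1}$ and $z_{n/2+1} = \overline{z_{n/2}}$.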
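The expansion can be sketched in a few lines of Java. The class and method names are hypothetical, and complex numbers are stored as separate arrays of real and imaginary parts; the conjugate-in-reverse layout follows the worked encoding example later in this chapter.

```java
import java.util.Arrays;

/** Minimal sketch of the natural projection pi and its inverse. */
final class NaturalProjectionSketch {

    /** pi^{-1}: keeps z and appends its complex conjugates in reverse order. */
    static double[][] expand(double[] re, double[] im) {
        int half = re.length;
        double[] outRe = new double[2 * half];
        double[] outIm = new double[2 * half];
        for (int j = 0; j < half; j++) {
            outRe[j] = re[j];
            outIm[j] = im[j];
            outRe[2 * half - 1 - j] = re[j];   // mirrored position
            outIm[2 * half - 1 - j] = -im[j];  // conjugate
        }
        return new double[][] {outRe, outIm};
    }

    /** pi: drops the redundant conjugate half again. */
    static double[][] project(double[] re, double[] im) {
        int half = re.length / 2;
        return new double[][] {Arrays.copyOf(re, half), Arrays.copyOf(im, half)};
    }

    public static void main(String[] args) {
        double[][] h = expand(new double[] {2.25, 5}, new double[] {3.4, 9.1});
        System.out.println(Arrays.toString(h[0]) + " " + Arrays.toString(h[1]));
    }
}
```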
4.2.3 Scaling
This step is simple. We multiply the vector by the scaling factor $\Delta$. The size of the scaling factor determines the accuracy of the encoding.
$$H \xrightarrow{\Delta \cdot H} V$$
We will now project the vector $\vec{z} \in V$ onto the lattice $\sigma(\mathbb{Z}[X]/(X^n + 1))$. We compute the coordinates of $\vec{z}$ with respect to the orthogonal lattice basis. The orthogonal lattice basis is the good basis as discussed in Section 3.5.
$$V \xrightarrow{\lfloor \cdot \rceil_{\sigma(\mathbb{Z}[X]/(X^n + 1))}} \sigma(\mathbb{Z}[X]/(X^n + 1))$$
$$\vec{z} \in V = [z_1, z_2, \ldots, z_n] \to \vec{a} = (a_1, a_2, \ldots, a_n) = \left( \frac{\langle \vec{z}, V_1 \rangle}{\langle V_1, V_1 \rangle}, \frac{\langle \vec{z}, V_2 \rangle}{\langle V_2, V_2 \rangle}, \ldots, \frac{\langle \vec{z}, V_n \rangle}{\langle V_n, V_n \rangle} \right)$$
$V_i$ represents the i-th column of the Vandermonde matrix A defined in Subsection 4.2.1. We then round each of the coordinate values to its closest integer and multiply the result with the Vandermonde matrix.
$$\vec{a} \to \lfloor \vec{a} \rceil \cdot V$$
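The coordinate computation and the rounding can be sketched as follows. This is a minimal illustration rather than my library's encoder: it assumes $\zeta = e^{i\pi/n}$, uses the Hermitian inner product, and relies on $\langle V_j, V_j \rangle = n$ because every entry of A has modulus 1. The class and method names are hypothetical.

```java
/** Minimal sketch of the projection onto the basis V_1..V_n and the rounding step. */
final class ProjectionSketch {

    /**
     * Coordinates a_j = <z, V_j> / <V_j, V_j>, where V_j is column j of A.
     * Only the real part is returned; for z in H the imaginary part cancels.
     */
    static double[] coordinates(double[] zRe, double[] zIm) {
        int n = zRe.length;
        double[] a = new double[n];
        for (int j = 0; j < n; j++) {
            double sum = 0.0;
            for (int i = 0; i < n; i++) {
                // V_j[i] = zeta^((2i+1) * j) with zeta = exp(i*pi/n), 0-based i
                double angle = Math.PI * (2 * i + 1) * j / n;
                // real part of z_i * conj(V_j[i])
                sum += zRe[i] * Math.cos(angle) + zIm[i] * Math.sin(angle);
            }
            a[j] = sum / n; // <V_j, V_j> = n
        }
        return a;
    }

    /** Rounds every coordinate to the nearest integer. */
    static long[] round(double[] a) {
        long[] r = new long[a.length];
        for (int j = 0; j < a.length; j++) {
            r[j] = Math.round(a[j]);
        }
        return r;
    }

    public static void main(String[] args) {
        // Scaled vector from the worked encoding example (Delta = 64).
        double[] a = coordinates(new double[] {144, 320, 320, 144},
                                 new double[] {217.6, 582.4, -582.4, -217.6});
        System.out.println(java.util.Arrays.toString(round(a)));
    }
}
```

The rounded coordinates are the integer coefficients of the encoded polynomial, which is why the rounding error shows up as a small deviation after decoding.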
4.2.5 Decoding
$$\vec{z} = \pi \circ \sigma(\Delta^{-1} \cdot m)$$
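The decoding formula can be sketched directly: divide the coefficients by $\Delta$, evaluate the polynomial at $\zeta^1, \zeta^3, \ldots$, and keep only the first half of the values. The sketch below assumes $\zeta = e^{i\pi/n}$ and uses a plain quadratic evaluation instead of an FFT; the class and method names are hypothetical.

```java
/** Minimal sketch of CKKS decoding: z = pi(sigma(m / Delta)). */
final class DecodeSketch {

    /** Returns {real parts, imaginary parts} of the n/2 decoded slots. */
    static double[][] decode(long[] m, double delta) {
        int n = m.length;
        int half = n / 2;
        double[] re = new double[half];
        double[] im = new double[half];
        for (int i = 0; i < half; i++) {          // pi: only the first n/2 roots
            for (int j = 0; j < n; j++) {
                // (zeta^(2i+1))^j with zeta = exp(i*pi/n), 0-based i
                double angle = Math.PI * (2 * i + 1) * j / n;
                double coeff = m[j] / delta;      // inverse scaling
                re[i] += coeff * Math.cos(angle);
                im[i] += coeff * Math.sin(angle);
            }
        }
        return new double[][] {re, im};
    }

    public static void main(String[] args) {
        double[][] z = decode(new long[] {232, 221, -182, 345}, 64);
        for (int i = 0; i < 2; i++) {
            System.out.printf("z_%d = %.3f%+.3fi%n", i + 1, z[0][i], z[1][i]);
        }
    }
}
```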
$\vec{z} = [2.25 + 3.4i,\ 5 + 9.1i]$, $\Delta = 64$
Encoding
1) Inverse Natural Projection
$$[2.25 + 3.4i,\ 5 + 9.1i] \xrightarrow{\pi^{-1}} [2.25 + 3.4i,\ 5 + 9.1i,\ 5 - 9.1i,\ 2.25 - 3.4i]$$
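2) Scaling
$$[2.25 + 3.4i,\ 5 + 9.1i,\ 5 - 9.1i,\ 2.25 - 3.4i] \xrightarrow{H \cdot \Delta} [144 + 217.6i,\ 320 + 582.4i,\ 320 - 582.4i,\ 144 - 217.6i]$$
3) Projection
$$\left[ \frac{\langle \vec{z}, V_1 \rangle}{\langle V_1, V_1 \rangle}, \frac{\langle \vec{z}, V_2 \rangle}{\langle V_2, V_2 \rangle}, \frac{\langle \vec{z}, V_3 \rangle}{\langle V_3, V_3 \rangle}, \frac{\langle \vec{z}, V_4 \rangle}{\langle V_4, V_4 \rangle} \right] = [232,\ 220.61731573,\ -182.4,\ 345.06810922]$$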
4) Inverse Embedding
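We now have $A\alpha = \vec{z}$, where $\vec{z}$ is the vector obtained by rounding the coordinates from step 3 and multiplying them with the Vandermonde matrix.
$$\begin{pmatrix}
1 & 0.7071 + 0.7071i & i & -0.7071 + 0.7071i \\
1 & -0.7071 + 0.7071i & -i & 0.7071 + 0.7071i \\
1 & -0.7071 - 0.7071i & i & 0.7071 - 0.7071i \\
1 & 0.7071 - 0.7071i & -i & -0.7071 - 0.7071i
\end{pmatrix} \cdot \begin{pmatrix} \alpha_1 \\ \alpha_2 \\ \alpha_3 \\ \alpha_4 \end{pmatrix} = \begin{pmatrix} 144.319 + 218.222i \\ 319.681 + 582.222i \\ 319.681 - 582.222i \\ 144.319 - 218.222i \end{pmatrix}$$
$\alpha_1 = 232$
$\alpha_2 = 221$
$\alpha_3 = -182$
$\alpha_4 = 345$
We have now completed the encoding, and we now have the polynomial:
$$m(X) = 232 + 221X - 182X^2 + 345X^3$$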
Decoding
1) Inverse Scaling
$$p(x) = 3.625 + 3.453125x - 2.84375x^2 + 5.390625x^3$$
2) Embedding
$$[p(\zeta^1), p(\zeta^3), p(\zeta^5), p(\zeta^7)] = [2.255 + 3.410i,\ 4.995 + 9.097i,\ 4.995 - 9.097i,\ 2.255 - 3.410i]$$
3) Natural Projection
$$[2.255 + 3.410i,\ 4.995 + 9.097i,\ 4.995 - 9.097i,\ 2.255 - 3.410i] \xrightarrow{\pi} [2.255 + 3.410i,\ 4.995 + 9.097i]$$
4.3 Leveled Homomorphic Encryption
This section covers encryption and the homomorphic operations that can be performed on encrypted data, also referred to as ciphertext.
We will also introduce a new sampling distribution $HWT(h)$, which draws from the set of vectors in $\{-1, 0, 1\}^n$ with a Hamming weight of h. This means that each of the vectors has exactly h non-zero values. For simplicity, I will write $R$ instead of $\mathbb{Z}[X]/(X^n + 1)$ and $R_Q$ instead of $\mathbb{Z}_Q[X]/(X^n + 1)$. The notation $a \leftarrow R_Q$ means that a is a sample from the uniform probability distribution over the set $R_Q$. $\chi$ is a discrete Gaussian distribution.
Secret key
The secret key is generated from a sample $s \leftarrow HWT(h)$.
$$sk = (1, s) \in R_Q^2$$
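Sampling from $HWT(h)$ can be sketched as follows. The class name is hypothetical and `java.util.Random` is used only to keep the example reproducible; it is not a cryptographically secure generator, so a real key generator would pass a `java.security.SecureRandom` (a subclass of `Random`) instead.

```java
import java.util.Random;

/** Minimal sketch of sampling a vector in {-1, 0, 1}^n with exactly h non-zero entries. */
final class HwtSampler {

    static int[] sample(int n, int h, Random rng) {
        if (h < 0 || h > n) {
            throw new IllegalArgumentException("Hamming weight must be between 0 and n");
        }
        int[] s = new int[n];
        int placed = 0;
        while (placed < h) {
            int idx = rng.nextInt(n);
            if (s[idx] == 0) {                       // skip positions that are already set
                s[idx] = rng.nextBoolean() ? 1 : -1; // random sign
                placed++;
            }
        }
        return s;
    }

    public static void main(String[] args) {
        System.out.println(java.util.Arrays.toString(sample(16, 4, new Random())));
    }
}
```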
Public key
The CKKS encryption uses variables from the RLWE problem, which we discussed in 3.7, as the public key pk. The variables are generated as $a \leftarrow R_Q$ and $e \leftarrow \chi$.
We use $-a$ in the second polynomial because it makes the decryption slightly easier (see 4.3.3).
Switching key
We generate the switching key from two inputs, $sk$ and $s' \in R_Q$. We sample $a \leftarrow R_Q$ and $e \leftarrow \chi$.
$$swk = (a, -as + e + Q \cdot s') \in R_{Q^2}^2$$
Relinearization key
The relinearization key is generated with the help of the switching key generation method.
The variable s is the same as in sk.
4.3.2 Encryption
We will encrypt the message $m \in R$ into a ciphertext ct by using pk. For encryption, we will introduce a new probability distribution $ZO(p)$. For real $0 \leq p \leq 1$, the distribution $ZO(p)$ draws vectors from $\{-1, 0, 1\}^n$, where each value in the drawn vector has a probability of $1 - p$ of being zero and $p/2$ of being each of $-1$ and $+1$.
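A minimal Java sketch of $ZO(p)$ is shown below. The class name is hypothetical, and as with any key material a production implementation would use a cryptographically secure random source rather than a plain `java.util.Random`.

```java
import java.util.Random;

/** Minimal sketch of the ZO(p) distribution over {-1, 0, 1}^n. */
final class ZoSampler {

    static int[] sample(int n, double p, Random rng) {
        if (p < 0.0 || p > 1.0) {
            throw new IllegalArgumentException("p must lie in [0, 1]");
        }
        int[] v = new int[n];
        for (int i = 0; i < n; i++) {
            double u = rng.nextDouble();  // uniform in [0, 1)
            if (u < p / 2) {
                v[i] = -1;                // probability p/2
            } else if (u < p) {
                v[i] = 1;                 // probability p/2
            }                             // otherwise 0, probability 1 - p
        }
        return v;
    }

    public static void main(String[] args) {
        System.out.println(java.util.Arrays.toString(sample(16, 0.5, new Random())));
    }
}
```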
Encryption
Let $v \leftarrow ZO(0.5)$ be an ephemeral value, $e_0, e_1 \leftarrow \chi$ and the public key $pk = (a, b)$.
$$ct = v \cdot pk + (m + e_0, e_1) = (v \cdot a + e_0 + m, v \cdot b + e_1) \in R_q^2$$
4.3.3 Decryption
$$m' = c_0 + c_1 \cdot s = m - a \cdot s + e + a \cdot s \pmod{q} = m + e \approx m$$
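The cancellation in the decryption formula can be checked with a toy example. The sketch below is not my library's implementation: it uses a tiny ring degree and modulus, draws every small polynomial (secret, ephemeral value and errors) from $\{-1, 0, 1\}$ as a stand-in for $HWT(h)$, $ZO(0.5)$ and $\chi$, and assumes that the first public-key component is $-a \cdot s + e$ and the second is $a$, which is the arrangement under which $c_0 + c_1 \cdot s$ cancels as above. All names are hypothetical.

```java
import java.util.Random;

/** Toy RLWE encryption and decryption in Z_Q[X]/(X^N + 1); for illustration only. */
final class ToyRlweSketch {
    static final int N = 8;          // toy ring degree, far too small for real security
    static final long Q = 1L << 20;  // toy modulus

    static long mod(long x) {
        long r = x % Q;
        return r < 0 ? r + Q : r;
    }

    static long[] add(long[] f, long[] g) {
        long[] out = new long[N];
        for (int i = 0; i < N; i++) out[i] = mod(f[i] + g[i]);
        return out;
    }

    /** Product in Z_Q[X]/(X^N + 1): X^N is replaced by -1. */
    static long[] mul(long[] f, long[] g) {
        long[] out = new long[N];
        for (int i = 0; i < N; i++) {
            for (int j = 0; j < N; j++) {
                long term = mod(f[i]) * mod(g[j]) % Q;
                int k = i + j;
                if (k < N) out[k] = mod(out[k] + term);
                else out[k - N] = mod(out[k - N] - term);
            }
        }
        return out;
    }

    /** Coefficients in {-1, 0, 1}; assumed stand-in for HWT(h), ZO(0.5) and chi. */
    static long[] small(Random rng) {
        long[] out = new long[N];
        for (int i = 0; i < N; i++) out[i] = rng.nextInt(3) - 1;
        return out;
    }

    /** pk = (-a*s + e, a), the assumed component ordering. */
    static long[][] publicKey(long[] s, Random rng) {
        long[] a = new long[N];
        for (int i = 0; i < N; i++) a[i] = mod(rng.nextLong());
        long[] minusAs = mul(a, s);
        for (int i = 0; i < N; i++) minusAs[i] = mod(-minusAs[i]);
        return new long[][] {add(minusAs, small(rng)), a};
    }

    /** ct = v * pk + (m + e0, e1). */
    static long[][] encrypt(long[] m, long[][] pk, Random rng) {
        long[] v = small(rng);
        long[] c0 = add(add(mul(v, pk[0]), m), small(rng));
        long[] c1 = add(mul(v, pk[1]), small(rng));
        return new long[][] {c0, c1};
    }

    /** m' = c0 + c1 * s, with coefficients centred into (-Q/2, Q/2]. */
    static long[] decrypt(long[][] ct, long[] s) {
        long[] r = add(ct[0], mul(ct[1], s));
        for (int i = 0; i < N; i++) if (r[i] > Q / 2) r[i] -= Q;
        return r;
    }

    public static void main(String[] args) {
        Random rng = new Random();
        long[] s = small(rng);
        long[] m = {1000, -2000, 3000, 0, 0, 0, 0, 0};
        long[] d = decrypt(encrypt(m, publicKey(s, rng), rng), s);
        System.out.println(java.util.Arrays.toString(d)); // close to m, off by a small error
    }
}
```

With these toy distributions the decrypted coefficients differ from the message by at most $2N + 1$, which is why the message must be scaled by $\Delta$ before encryption to keep the error below the precision of interest.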
4.3.4 Evaluation
In the evaluation step we perform homomorphic operations on encrypted data. We cannot perform operations on ciphertexts that have different moduli q. The ciphertexts we will operate on are $ct = (a, b)$ and $ct' = (a', b')$.
Addition(ct, ct'):
When performing addition, the ciphertexts must have the same scaling factor.
Multiplication(ct, ct'):
$$c_0 = a \cdot a' \pmod{q}$$
$$c_1 = a \cdot b' + b \cdot a' \pmod{q}$$
$$c_2 = b \cdot b' \pmod{q}$$
We will use $rlk = (r_0, r_1)$ to reduce the number of polynomials to two.
The resulting ciphertext $ct_{mult} = (c'_0, c'_1)$ will have the scaling factor $\Delta_{mult} = \Delta \cdot \Delta'$.
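Why three polynomials appear can be seen from a small sketch: if $ct = (a, b)$ decrypts as $a + b \cdot s$, then the product of two decryptions equals $c_0 + c_1 \cdot s + c_2 \cdot s^2$. The Java example below checks this identity in a toy ring; it does not perform relinearization, and the class and method names are hypothetical.

```java
/** Minimal sketch of the three-component ciphertext product in Z_q[X]/(X^n + 1). */
final class CiphertextProductSketch {

    static long mod(long x, long q) {
        long r = x % q;
        return r < 0 ? r + q : r;
    }

    static long[] add(long[] f, long[] g, long q) {
        long[] out = new long[f.length];
        for (int i = 0; i < f.length; i++) out[i] = mod(f[i] + g[i], q);
        return out;
    }

    /** Negacyclic product: X^n wraps around as -1. Assumes q < 2^31. */
    static long[] mul(long[] f, long[] g, long q) {
        int n = f.length;
        long[] out = new long[n];
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < n; j++) {
                long term = mod(f[i], q) * mod(g[j], q) % q;
                int k = i + j;
                if (k < n) out[k] = mod(out[k] + term, q);
                else out[k - n] = mod(out[k - n] - term, q);
            }
        }
        return out;
    }

    /** Returns (c0, c1, c2) for ct = (a, b) and ct' = (a2, b2). */
    static long[][] tensor(long[] a, long[] b, long[] a2, long[] b2, long q) {
        long[] c0 = mul(a, a2, q);
        long[] c1 = add(mul(a, b2, q), mul(b, a2, q), q);
        long[] c2 = mul(b, b2, q);
        return new long[][] {c0, c1, c2};
    }

    /** Evaluates c0 + c1 * s + c2 * s^2. */
    static long[] decrypt3(long[][] c, long[] s, long q) {
        long[] s2 = mul(s, s, q);
        return add(add(c[0], mul(c[1], s, q), q), mul(c[2], s2, q), q);
    }

    public static void main(String[] args) {
        long q = 65537;
        long[] a = {1, 2, 3, 4}, b = {5, 6, 7, 8}, a2 = {9, 10, 11, 12}, b2 = {13, 14, 15, 16};
        long[] s = {1, 0, -1, 1};
        long[] viaTensor = decrypt3(tensor(a, b, a2, b2, q), s, q);
        long[] direct = mul(add(a, mul(b, s, q), q), add(a2, mul(b2, s, q), q), q);
        System.out.println(java.util.Arrays.equals(viaTensor, direct));
    }
}
```

Decrypting the triple requires $s^2$, which is the reason the relinearization key is needed to get back to a two-component ciphertext.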
Chapter 5
I have implemented the CKKS scheme with a Graphical User Interface (GUI) and a code-based interface. The GUI is made to give the user an easy and intuitive introduction to the CKKS scheme without requiring programming or mathematical experience. One example is that it renders error messages to help the user write valid input. The code-based interface gives the user more flexibility and room to experiment. The downside is that it does not have the same restrictions as the GUI, which makes it easier to make mistakes and generate invalid output or crash the program. I have implemented the scheme using Java version 15.
5.1 Architecture
The user sets the parameters: polynomial degree n, small modulus q, big modulus Q, scaling factor $\Delta$, prime number bit size $\theta$ and Miller-Rabin iterations $\beta$.
5.1.1 Parameters
The purpose of Parameters is to store the values necessary for encoding, encryption and key generation. This is an efficient way to store information from operation to operation.
Parameters creates an instance of the ChineseRemainderTheorem class (CRT). CRT generates $2 + \log_2(n) + \frac{4 \cdot \log_2(Q)}{\theta}$ primes, where each of the primes p satisfies $p \equiv 1 \pmod{2n}$. Each of the primes is found with the Miller-Rabin primality test: the search starts by testing whether $10^{\theta-1} + 1$ is prime and adds 2n to the candidate in each iteration until all the necessary primes have been created. I add 2n so that every candidate keeps the property $p \equiv 1 \pmod{2n}$.
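The stepping search can be sketched as follows. This is an illustration rather than the code of my CRT class: the starting value is a caller-supplied assumption, and Java's built-in probabilistic test `BigInteger.isProbablePrime` stands in for a hand-written Miller-Rabin routine. The class and method names are hypothetical.

```java
import java.math.BigInteger;
import java.util.ArrayList;
import java.util.List;

/** Minimal sketch: collect primes p with p = 1 (mod 2n) by stepping in increments of 2n. */
final class NttPrimeSketch {

    /**
     * @param start     first candidate, must already satisfy start = 1 (mod 2n)
     * @param n         polynomial degree
     * @param count     number of primes to collect
     * @param certainty passed to isProbablePrime; error probability is at most 2^-certainty
     */
    static List<BigInteger> primes(BigInteger start, int n, int count, int certainty) {
        BigInteger step = BigInteger.valueOf(2L * n);
        if (!start.mod(step).equals(BigInteger.ONE)) {
            throw new IllegalArgumentException("start must be congruent to 1 modulo 2n");
        }
        List<BigInteger> found = new ArrayList<>();
        // Adding 2n keeps every candidate congruent to 1 modulo 2n.
        for (BigInteger c = start; found.size() < count; c = c.add(step)) {
            if (c.isProbablePrime(certainty)) {
                found.add(c);
            }
        }
        return found;
    }

    public static void main(String[] args) {
        System.out.println(primes(BigInteger.valueOf(17), 8, 3, 64));
    }
}
```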
CRT then creates an instance of the Number Theoretic Transform (NTT) for each of the primes. Each of the NTTs then finds a primitive element $\psi$ of its prime $\iota$, as defined in 5.1.1. NTT then finds the root of unity $\kappa = \psi^{\frac{\iota - 1}{2n}} \pmod{\iota}$ and the inverse root of unity $\rho = \kappa^{\iota - 2} \pmod{\iota}$. NTT then calculates the powers of the root of unity $[\kappa^0, \kappa^1, \kappa^2, \ldots, \kappa^{n-1}]$ and the powers of the inverse root of unity $[\rho^0, \rho^1, \rho^2, \ldots, \rho^{n-1}]$.
A primitive root (mod q) is an element $g \in \mathbb{Z}_q^*$ whose powers generate all of $\mathbb{Z}_q^*$. That is, every element $b \in \mathbb{Z}_q^*$ can be written as $g^x \pmod{q}$ for some integer x.
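The computation of $\psi$, $\kappa$ and $\rho$ can be sketched for a small prime. The brute-force search for a primitive root below is an assumption made for illustration and is only practical for small primes; it is not necessarily how my NTT class finds $\psi$. The class and method names are hypothetical.

```java
import java.math.BigInteger;
import java.util.ArrayList;
import java.util.List;

/** Minimal sketch of finding a 2n-th root of unity modulo a prime p = 1 (mod 2n). */
final class RootOfUnitySketch {

    static long pow(long base, long exp, long mod) {
        return BigInteger.valueOf(base)
                .modPow(BigInteger.valueOf(exp), BigInteger.valueOf(mod))
                .longValue();
    }

    /** Smallest primitive root modulo a prime p, found by brute force (small p only). */
    static long primitiveRoot(long p) {
        // Distinct prime factors of p - 1 by trial division.
        List<Long> factors = new ArrayList<>();
        long rest = p - 1;
        for (long f = 2; f * f <= rest; f++) {
            if (rest % f == 0) {
                factors.add(f);
                while (rest % f == 0) rest /= f;
            }
        }
        if (rest > 1) factors.add(rest);
        // g is primitive iff g^((p-1)/f) != 1 for every prime factor f of p - 1.
        for (long g = 2; g < p; g++) {
            boolean generatesAll = true;
            for (long f : factors) {
                if (pow(g, (p - 1) / f, p) == 1) {
                    generatesAll = false;
                    break;
                }
            }
            if (generatesAll) return g;
        }
        throw new IllegalArgumentException("no primitive root found; is p an odd prime?");
    }

    /** kappa = psi^((p-1)/(2n)) mod p. */
    static long rootOfUnity(long p, int n) {
        return pow(primitiveRoot(p), (p - 1) / (2L * n), p);
    }

    /** rho = kappa^(p-2) mod p, the inverse of kappa by Fermat's little theorem. */
    static long inverse(long kappa, long p) {
        return pow(kappa, p - 2, p);
    }

    /** [x^0, x^1, ..., x^(n-1)] mod p; plain long arithmetic, so small p only. */
    static long[] powers(long x, int n, long p) {
        long[] out = new long[n];
        out[0] = 1;
        for (int i = 1; i < n; i++) out[i] = out[i - 1] * x % p;
        return out;
    }

    public static void main(String[] args) {
        long p = 17;
        int n = 8;
        long kappa = rootOfUnity(p, n);
        System.out.println("kappa = " + kappa + ", rho = " + inverse(kappa, p));
    }
}
```

For a valid $\kappa$ we have $\kappa^n \equiv -1 \pmod{\iota}$, which is the property the negacyclic NTT relies on.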
5.1.2 Key generator
The secret key will be created with a Hamming weight of $\frac{n}{4}$, which means that the security of the ciphertext depends on the size of the encrypted vector. When generating the error e for the public key $(a, -as + e)$ and the switching key $(a, -as + e + Q \cdot s')$, I used $ZO(0.5)$, which was introduced in 4.3.2, as the error distribution.
5.1.3 Encoder
Before we can start encoding, the program must first construct the encoder class, which creates an instance of the Fast Fourier Transform (FFT). FFT precomputes the roots of unity and the inverse roots of unity of the cyclotomic polynomial $\Phi_{2n}(X)$, as discussed in 4.2.1. The implemented encoder is more advanced than the one discussed in 4.2. My encoder is based on the latest developments in the HEAAN library [6].
5.1.5 Evaluator
Described in 4.3.4.
The purpose of the GUI is to offer a basic introduction to the user. The user will be guided through the different steps of the CKKS scheme, so no programming knowledge is needed. The GUI is divided into four frames that the user fills out sequentially.
In the first frame, the user must fill in the length and size of the vectors. The first frame is illustrated in Figure 5.4.
Figure 5.4: Frame one
The next step is to fill in the vector values, which later are available for cryptographic
operations. This frame is illustrated in Figure 5.5.
In the third step, the user sets the parameters, which are discussed in 5.1. The program gives the user some suggestions, and the user can then make their own choices.
In the final step, the user can perform encoding, encryption and homomorphic operations like multiplication, addition and subtraction on the vectors from the second step. This frame is illustrated in Figure 5.7.
Before encrypting vectors, the user must first press the "Generate keys" button. All the vectors and cryptographic key values will be printed to the terminal when the user presses one of the "Show" buttons.
Figure 5.7: Operation frame
To perform evaluation operations, the user must press the "Add" button to the right of one of the vectors, then select one of the algebraic operations, add another vector and finally press "Evaluate". When the user presses "Evaluate", the result will be printed to the terminal and the resulting vector will be added under "Results".
A problem I faced while creating the GUI, was showing the vectors and keys to the
user. It would have been too messy and cluttered to have them in the GUI. Adding
scrolling would not help with the user-friendliness. As a solution to this challenge, I
chose to combine the GUI with the terminal. This implies that I print these values to
the terminal.
Another problem is that when you encode or decode a vector, a small error is added. If the user pressed the wrong button, the user would have to go back and set the vector again. To prevent this, the program remembers when a vector that was previously encoded has been decoded again. It then restores the previously saved state.
Chapter 6
In this master thesis I implemented a Java library based on the CKKS scheme. This
library supports approximate arithmetic on encrypted data consisting of real numbers.
This master thesis is relevant because cloud computing is becoming increasingly popular due to its low price and convenience. However, cloud computing is problematic because it can allow third parties to access the stored information. Homomorphic encryption is a good preventive measure. Moreover, we must develop more resilient cryptosystems based on complex mathematical problems to defend ourselves in the coming era of quantum computers.
The field of homomorphic encryption is developing at a fast pace, and this does not seem likely to slow down anytime soon. A cutting-edge development is the CKKS encryption scheme, which was first published in 2016 [5]. It has been under continuous development ever since, and this is where this master thesis contributes.
Moreover, in this master thesis I describe various types of encryption and the criteria a good cryptosystem should fulfil. A fundamental issue is the tradeoff between efficiency and security. I have described the mathematical concept of homomorphism, introduced NP-hard problems based on lattices, and shown how these can be applied to cryptography.
Further, I have explained the CKKS scheme and its methods for encoding, encryption and evaluation. However, in addition to the descriptions, the main contribution of this master thesis is my implementation of the CKKS scheme in Java. Java is a popular language, less complex than C++ and faster than Python. To my knowledge, an implementation in Java has never been done before. In addition, I created a graphical user interface (GUI) to give the user a basic overview of the CKKS scheme. Its user-friendliness also makes the scheme accessible to users with limited programming experience. A future research avenue is to find a way to set up the library for server-client mode.
Bibliography
[1] STOC ’05: Proceedings of the Thirty-Seventh Annual ACM Symposium on Theory
of Computing, New York, NY, USA, 2005. Association for Computing Machinery.
ISBN 1581139608.
[3] Miklós Ajtai. The shortest vector problem in L2 is NP-hard for randomized reductions.
In Proceedings of the thirtieth annual ACM symposium on Theory of computing,
pages 10–19, 1998.
[4] László Babai. On Lovász' lattice reduction and the nearest lattice point problem.
Combinatorica, 6(1):1–13, 1986.
[5] Jung Hee Cheon, Andrey Kim, Miran Kim, and Yongsoo Song. Homomorphic en-
cryption for arithmetic of approximate numbers. Cryptology ePrint Archive, 2016.
[6] Jung Hee Cheon, Kyoohyung Han, Andrey Kim, Miran Kim, and Yongsoo
Song. Bootstrapping for approximate homomorphic encryption. In Advances in
Cryptology–EUROCRYPT 2018: 37th Annual International Conference on the The-
ory and Applications of Cryptographic Techniques, Tel Aviv, Israel, April 29-May 3,
2018 Proceedings, Part I 37, pages 360–384. Springer, 2018.
[7] Joan Daemen and Vincent Rijmen. The design of Rijndael: AES — the Advanced
Encryption Standard. page 238, 2002.
[9] Whitfield Diffie and Martin Hellman. New directions in cryptography. 1976.
[10] Craig Gentry. A fully homomorphic encryption scheme. Stanford university, 2009.
[11] Craig Gentry and Shai Halevi. Implementing Gentry's fully-homomorphic encryption
scheme. In Annual International Conference on the theory and applications of
cryptographic techniques, pages 129–148. Springer, 2011.
[12] Oded Goldreich, Daniele Micciancio, Shmuel Safra, and J-P Seifert. Approximat-
ing shortest lattice vectors is not harder than approximating closest lattice vectors.
Information Processing Letters, 71(2):55–61, 1999.
[13] Jeffrey Hoffstein, Jill Pipher, and Joseph H. Silverman. An
introduction to mathematical cryptography. page 380, 2008.
[15] M. Kim, J.H. Cheon, and K. Lauter. Homomorphic computation on edit distance.
International Conference on Financial Cryptography and Data Security, pages 194–212,
2015.
[18] Arjen K. Lenstra, Hendrik Willem Lenstra, and László Lovász. Factoring polynomials
with rational coefficients. Mathematische Annalen, 261:515–534,
1982.
[19] Vadim Lyubashevsky, Chris Peikert, and Oded Regev. On ideal lattices and learning
with errors over rings. Journal of the ACM (JACM), 60(6):1–35, 2013.
[20] M. Kim, Y. Song, and J.H. Cheon. Secure searching of biomarkers through hybrid
homomorphic encryption scheme. BMC Medical Genomics, 10(2):42, 2017.
[23] Daniele Micciancio. Closest Vector Problem, pages 212–214. Springer US, Boston,
MA, 2011. ISBN 978-1-4419-5906-5. doi: 10.1007/978-1-4419-5906-5_399.
URL: https://doi.org/10.1007/978-1-4419-5906-5_399.
[24] Christof Paar and Jan Pelzl. Understanding cryptography: a textbook for students
and practitioners. page 31, 2010.
[25] Ronald L. Rivest, Len Adleman, Michael L. Dertouzos, et al. On data banks and
privacy homomorphisms. Foundations of secure computation, 4(11):169–180, 1978.
[26] Ronald L. Rivest, Adi Shamir, and Leonard Adleman. A method for obtaining
digital signatures and public-key cryptosystems. Communications of the ACM, 21
(2):120–126, 1978.
[28] A.J. Han Vinck. Illustration of the idea of the Diffie-Hellman key exchange, 2011.
URL: https://commons.wikimedia.org/wiki/File:Diffie-Hellman_Key_Exchange.svg.
Accessed: 2022-11-8.