Elliptic Curve Hierarchical Deterministic Private Key Sequences: Bitcoin Standards and Best Practices
MASTER THESIS
Author: Daniele FORNARO
Supervisors: Prof. Daniele MARAZZINA, Prof. Ferdinando M. AMETRANO
19 April 2018
Bitcoin Blockchain
POLITECNICO DI MILANO
Abstract
Industrial and Information Engineering
Department of Mathematics
Mathematical Engineering
Acknowledgements
First of all, I would like to give my sincere gratitude to professor Ferdinando Ame-
trano of the Politecnico di Milano, who transmitted to me the passion of the subject
and who has dedicated a large part of his time in order to bring me on the right path.
I would like to thank professor Daniele Marazzina of Politecnico di Milano for his
supervision to this work and for his many tips. Then I would like to give my thanks
to all the friends and colleagues of Deloitte Blockchain Lab Italy for the stimulating
and innovative environment in which I was able to write this thesis; in particular
Paolo Mazzocchi, Stefano Leone, Raffaele Nicodemo and Calogero Mandracchia,
for support and suggestions. Furthermore, I would thank Leonardo Comandini for
the mutual support and for his help on this work.
Thank you.
Contents
Abstract
Acknowledgements
Contents
Introduction
1 Cryptography
1.1 HASH function
1.1.1 Other functions
1.2 Elliptic Curve over F_p
1.2.1 Symmetry
1.2.2 Point addition
1.2.3 Scalar multiplication
1.2.4 Discrete Logarithm Problem
1.2.5 Group order
1.2.6 Bitcoin private-public key cryptography
2 Wallet
2.1 Nondeterministic (random) Wallet
Pros and Cons
2.2 Deterministic Wallets
2.2.1 Deterministic Wallet type-1
Pros and Cons
2.2.2 Deterministic Wallet type-2
Pros and Cons
2.2.3 Deterministic Wallet type-3
4 Mnemonic phrase
4.1 BIP 39
4.1.1 Mnemonic Generation
4.1.2 From Mnemonic to Seed
4.2 Electrum Mnemonic
4.2.1 Mnemonic Generation
4.2.2 From Mnemonic to Seed
4.3 Comparison
Conclusion
B Python code
B.1 Deterministic Wallet
B.1.1 Type-1
B.1.2 Type-2
B.2 Hierarchical Deterministic Wallet - BIP 32
B.3 Mnemonic phrase
B.3.1 BIP 39
B.3.2 Electrum
Introduction
This thesis aims to analyze in detail the principal techniques used for the derivation
of public-private key pairs in the Bitcoin framework.
The first chapter explains the basic concepts needed for this work.
Two fundamental elements are used: hash functions and the Elliptic Curve. The
former are irreversible algorithms: although the image of such functions is a limited
set, there is no computationally feasible procedure for the inverse. The only
way to invert a hash function is by trial and error, and this takes too much
time, due to high computational costs, making the operation infeasible. The latter,
the Elliptic Curve, is defined by an equation over a specific field and, in this context,
it is a plane algebraic curve. In this type of cryptography a point on the curve is
called public key and the integer number used to obtain the point is called private
key. This chapter explains the most important properties of this curve.
The second chapter analyzes in detail the principal techniques used to
generate private and public key pairs. In particular, we will see four types of
derivation. The first, naive method consists of randomly extracting a number
and considering it as a private key, from which the corresponding public key is
derived, each time a new pair is requested. The other three methods are the so-called
deterministic ones: in order to generate a whole set of keys, a single datum,
called seed, is sufficient. These three methods are of increasing
difficulty and complexity, and we will see their principal advantages and
disadvantages. The last type of derivation is the most used because it derives the
keys in a hierarchical way. This method is covered in the next chapter.
The third chapter focuses on the analysis of the Hierarchical Deterministic
Wallet, the most sophisticated type of derivation used up to now. It is defined by
BIP32 [1] and it is used by most Bitcoin wallets. This derivation is deterministic,
since a seed is needed, and it is hierarchical. From the seed it is possible to derive an
arbitrarily large number of keys, and each of these keys can derive new keys in the same
way, and so on. This procedure can be iterated as long as desired, leaving the user a
wide choice in the derivation of these numbers.
The fourth chapter analyzes two possible ways to store the
seed: one was proposed by BIP39 [2] and is the most used in the Bitcoin framework;
the other is the one used by Electrum [3], one of the principal Bitcoin wallets.
Both of them use a mnemonic phrase, a sentence composed of a certain number of
words from which it is possible to derive the seed. Nevertheless, they have some
differences, and we will analyze them. The principal difference lies in the
way used to verify the correctness of the mnemonic phrase. With BIP39 it is only
possible to check whether the phrase is plausible, while with Electrum it is possible to assign
a version to the seed that will be generated from the mnemonic phrase, giving a
purpose to the keys derived from it.
The fifth chapter focuses on some possible applications of the Hierarchical
Deterministic Wallet proposed by BIP32. In particular, we will see the standard
way to write a path, in order to easily understand how to generate particular keys
from the seed. We will also analyze one of the standards used by most Bitcoin
wallets: BIP43 [4]. The purpose of this BIP is to give a particular meaning to some
branches of the tree. We will then describe two important applications: the
multi-coin wallet of BIP44 [5] and the SegWit addresses of BIP49 [6].
Along with this thesis, we provide the GitHub link to the repository of Python code
for the course of professor F. Ametrano. In this repository I have replicated in Python
all the procedures and methods presented and described in this thesis, leaving out
the parts that are not relevant to it and writing the important ones in a concise
and essential way. The most relevant parts of those scripts are reported in
appendix B.
https://github.com/fametrano/BitcoinBlockchainTechnology
Chapter 1
Cryptography
In order to have a clear understanding of this thesis, it is necessary to know the basic
concepts of:
• HASH function.
• Elliptic Curve.
Together, these two elements describe most of the cryptography behind
Bitcoin.
The input data is called message and the output data is called hash value.
A good and secure hash function must have at least these six properties:
(i) It is deterministic: if the message remains unchanged, the hash value is the
same.
(ii) It is quick: it should not take too much time to compute the hash value from
the message.
(iv) It is collision free: it is infeasible to find two messages with the same hash
value, even if it is theoretically possible.
(v) It has the avalanche effect: a very small change in the input message, even
flipping a single bit, produces a completely different hash value.
(vi) It has fixed size output and could have input messages of any size.
In this thesis, we will treat hash functions as black boxes, with all the properties
described above.
There are various kinds, but for our purpose, the main difference lies in the number
of bits of the hash value. Among all the possible hash functions, in Bitcoin cryptog-
raphy three functions are used:
One important hash function used in Bitcoin cryptography is the so-called HASH160.
It is simply the composition of SHA256 and RIPEMD160:
$\text{HASH160}(m) = \text{RIPEMD160}(\text{SHA256}(m)).$
Since the last operation performed to compute the HASH160 function
is RIPEMD160, the output size is 160 bits.
HMAC is a function that performs some computation involving a hash function.
This algorithm provides better immunity against length extension attacks, namely
attacks in which an adversary who knows the hash of a message and the length of
that message can compute the hash of an extended message without knowing the
original message itself.
It receives 3 inputs:
where H is the hash function, k is the key, m is the message, opad(•) and ipad(•) are
two padding functions applied to the key k, and || is a symbol that denotes
concatenation.
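As an illustration of the roles of H, k, m, opad and ipad, the following is a minimal sketch of the standard HMAC construction H((k ⊕ opad) || H((k ⊕ ipad) || m)), compared against Python's `hmac` module. The function name `hmac_manual` and the sample key and message are assumptions made for this example only.

```python
import hashlib
import hmac

def hmac_manual(key: bytes, msg: bytes, hash_name: str = "sha512") -> bytes:
    """HMAC written out from its definition: H((k ^ opad) || H((k ^ ipad) || m))."""
    block_size = hashlib.new(hash_name).block_size  # 128 bytes for SHA512
    # Keys longer than one block are hashed first; shorter keys are zero-padded.
    if len(key) > block_size:
        key = hashlib.new(hash_name, key).digest()
    key = key.ljust(block_size, b"\x00")
    opad = bytes(b ^ 0x5C for b in key)  # outer padded key
    ipad = bytes(b ^ 0x36 for b in key)  # inner padded key
    inner = hashlib.new(hash_name, ipad + msg).digest()
    return hashlib.new(hash_name, opad + inner).digest()

# Hypothetical inputs, used only to compare with the standard library.
key, msg = b"example key", b"example message"
same = hmac_manual(key, msg) == hmac.new(key, msg, hashlib.sha512).digest()
```

In practice the library function should be preferred; the hand-written version only makes the two nested hash calls visible.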
times. A particular string of bytes, called salt, is also fed into
the computation of the hash. This algorithm requires more computational work
than a single hash function, and so it reduces the risk of a brute force
attack.
$y^2 = x^3 + ax + b$ over $F_p$, (1.1)
where $F_p$ is the finite field defined over the set of integers modulo p and a and b are
the coefficients of the curve.
$y^2 = x^3 + ax + b \mod p.$ (1.2)
Figure 1.1 shows some examples of Elliptic Curve over $F_p$ with a = −7 and b = 10.
1.2.1 Symmetry
The elliptic curve has an important property: the line y = p/2 is an axis of symmetry
for the curve.
This can be shown by proving that the point P(x, y) belongs to the Elliptic Curve
(EC) if and only if the point Q(x, p − y) belongs to the curve too:
$P(x, y) \in EC \iff Q(x, p - y) \in EC.$
Proof:
$P(x, y) \in EC \implies y^2 = x^3 + ax + b \mod p,$
$Q(x, p - y) \in EC \iff (p - y)^2 = x^3 + ax + b \mod p.$
Since the right-hand sides of the two equations are equal, we only
need to prove that:
$(p - y)^2 = y^2 \mod p.$
This is true, indeed:
$(p - y)^2 = p^2 - 2py + y^2 = y^2 \mod p,$
because
$p \cdot k = 0 \mod p \quad \forall k \in \mathbb{N}.$
The other implication (⇐=) follows the same logic.
Q.E.D.
Having shown the symmetry property, it is useful to denote the point P(x, y) as
the opposite of Q(x, p − y):
P = −Q =⇒ P + Q = 0,
where + is a binary operation between two points on the EC that will be
explained below. The 0 in this context is the point at infinity.
Having defined when points on the EC have zero sum, it is possible to calculate the
equations for point addition:
A = (x1, y1), B = (x2, y2).
Let’s define A + B := (x3, y3).
$A + B = 0 = (\infty, \infty).$
$x_3 = s^2 - x_1 - x_2 \mod p,$
$y_3 = s(x_1 - x_3) - y_1 \mod p.$
Once we have s, the values x3 and y3 are obtained from these simple formulas.
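The formulas for x3 and y3 can be tried out on a small curve. The sketch below assumes the usual slope definitions, s = (y2 − y1)/(x2 − x1) for distinct points and s = (3·x1² + a)/(2·y1) for doubling, with division meaning multiplication by the modular inverse. The prime p = 97 and the names `ec_add`, `is_on_curve` and `INF` are arbitrary choices for this example; a = −7 and b = 10 are taken from Figure 1.1.

```python
# Toy curve y^2 = x^3 + a*x + b over F_p (p = 97 is an arbitrary small prime).
P_MOD, A, B = 97, -7, 10
INF = None  # the point at infinity, i.e. the "0" of the group

def is_on_curve(pt):
    if pt is INF:
        return True
    x, y = pt
    return (y * y - (x ** 3 + A * x + B)) % P_MOD == 0

def ec_add(p1, p2):
    """Add two curve points whose coordinates are already reduced mod P_MOD."""
    if p1 is INF:
        return p2
    if p2 is INF:
        return p1
    x1, y1 = p1
    x2, y2 = p2
    if x1 == x2 and (y1 + y2) % P_MOD == 0:
        return INF  # opposite points sum to the point at infinity
    if p1 == p2:
        s = (3 * x1 * x1 + A) * pow(2 * y1, -1, P_MOD) % P_MOD  # tangent slope
    else:
        s = (y2 - y1) * pow((x2 - x1) % P_MOD, -1, P_MOD) % P_MOD  # chord slope
    x3 = (s * s - x1 - x2) % P_MOD
    y3 = (s * (x1 - x3) - y1) % P_MOD
    return (x3, y3)

total = ec_add((1, 2), (3, 4))   # two distinct points on the curve
doubled = ec_add((1, 2), (1, 2))  # point doubling
```

The three-argument `pow` with exponent −1 computes a modular inverse and requires Python 3.8 or later.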
$n \cdot P = \underbrace{P + P + \cdots + P}_{n \text{ times}}.$
$53 \cdot P = 110101_2 \cdot P = 2^5 \cdot P + 2^4 \cdot P + 2^2 \cdot P + P$
Computing the common sub-terms only once, we obtain a total of 5 doubling and
3 addition operations, far fewer than the 52 addition operations of the naive sum.
This algorithm is even more efficient if the scalar is a very large number.
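The operation count for 53 · P can be checked with a short sketch of the double-and-add idea. To keep the example independent of any curve, the group is passed in as two functions; here the integers under ordinary addition stand in for the curve points, which is an assumption of this example only. The function name `double_and_add` is also hypothetical.

```python
def double_and_add(n, point, add, double):
    """Compute n * point by scanning the bits of n from the most significant one.

    Returns the result together with the number of doublings and additions used.
    """
    if n < 1:
        raise ValueError("n must be a positive integer")
    bits = bin(n)[2:]
    result = point  # the leading bit of n is always 1
    doublings = additions = 0
    for bit in bits[1:]:
        result = double(result)
        doublings += 1
        if bit == "1":
            result = add(result, point)
            additions += 1
    return result, doublings, additions

# Stand-in group: integers under addition, so that n * P can be checked directly.
result, doublings, additions = double_and_add(
    53, 7, add=lambda x, y: x + y, double=lambda x: 2 * x
)
```

With elliptic curve point addition and doubling passed as `add` and `double`, the same loop computes the scalar multiplication on the curve.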
Q = n · P,
Suppose we know Q and P. With this information there exists only one
possible n ∈ N such that n < order, where order will be defined below, and such that the
equation above holds true. Even so, this number n is infeasible to find for large values
of order.
This is due to the fact that no efficient algorithm is known that is able to
compute n given P and Q. In practice, n can only be found by search and, as already
mentioned, this becomes infeasible if the number of values that n can assume (order)
is too large.
If p is a very large number, it is not trivial to count all the points of the curve over
that field, but there are algorithms, like Schoof’s algorithm, that allow the order of
the group to be calculated in a fast and efficient way.
$n \cdot G + m \cdot G = \underbrace{G + \cdots + G}_{n \text{ times}} + \underbrace{G + \cdots + G}_{m \text{ times}} = \underbrace{G + \cdots + G}_{n+m \text{ times}} = (n + m) \cdot G.$
So multiples of G are closed under addition and this is enough to prove that the set
of the multiples of G is a cyclic subgroup of the group formed by the elliptic curve.
Remark The order of the subgroup generated by G is linked to the order of the elliptic curve
by Lagrange’s theorem, which states that the order of a subgroup is a divisor of the order of
the parent group.
Remark If the order of the group is a prime number, all the points belonging to the EC
generate a subgroup with the same order of the group or with order 1.
All this preliminary information is needed in order to introduce the private-public
key cryptography used by Bitcoin.
$y^2 = x^3 + 7 \mod p,$ (1.3)
where mod p (modulo a prime number) indicates that this curve is over a finite
field of prime order $p = 2^{256} - 2^{32} - 2^9 - 2^8 - 2^7 - 2^6 - 2^4 - 1$.
The order of this Elliptic Curve is a very large prime number, close to $2^{256}$ but smaller
than p.
x=
79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798
y=
483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8
Since the order of the group is a prime number, the order of any
nontrivial subgroup is equal to the order of the entire group. In particular, the order of the
subgroup generated by G is equal to order.
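The parameters above can be checked numerically. In the sketch below, p and the coordinates of G are those given in the text, while the numerical value of `order` does not appear in this section and is taken from the published secp256k1 parameters, so it should be read as an assumption of the example.

```python
# Field prime, written as in the text.
p = 2**256 - 2**32 - 2**9 - 2**8 - 2**7 - 2**6 - 2**4 - 1

# Group order: published secp256k1 value (not stated numerically in the text).
order = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141

# Coordinates of the generator G, as given above.
Gx = 0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798
Gy = 0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8

# G must satisfy the curve equation y^2 = x^3 + 7 mod p.
g_on_curve = (Gy * Gy - (Gx**3 + 7)) % p == 0

# The order is close to 2^256 but smaller than p.
order_below_p = order < p < 2**256
```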
We have now all the elements necessary to define the private and the public key.
Definition A private key is a number chosen in the range between 1 and (order − 1).
Definition A public key W is a point in the Bitcoin EC, derived from a private key k in the
following way:
W = k · G, (1.4)
where the multiplication between k and G is the scalar multiplication defined in the previous sections.
This is a one-way function: it is simple to compute the scalar multiplication knowing
the private key, but it is infeasible to do the opposite.
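Equation (1.4) can be sketched in a few lines of Python. This is a minimal illustration, not an implementation suitable for real keys: it is not constant-time, and the names `ec_add`, `ec_mul` and `public_key` are chosen for this example. The value of `ORDER` is the published secp256k1 group order, which is assumed here because the text does not state it numerically.

```python
P = 2**256 - 2**32 - 977  # field prime of the Bitcoin curve
ORDER = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)

def ec_add(p1, p2):
    """Add two points of y^2 = x^3 + 7 over F_P; None is the point at infinity."""
    if p1 is None:
        return p2
    if p2 is None:
        return p1
    (x1, y1), (x2, y2) = p1, p2
    if x1 == x2 and (y1 + y2) % P == 0:
        return None
    if p1 == p2:
        s = 3 * x1 * x1 * pow(2 * y1, -1, P) % P  # a = 0 on this curve
    else:
        s = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P
    x3 = (s * s - x1 - x2) % P
    return x3, (s * (x1 - x3) - y1) % P

def ec_mul(k, point):
    """Scalar multiplication k * point with the double-and-add method."""
    result = None
    while k > 0:
        if k & 1:
            result = ec_add(result, point)
        point = ec_add(point, point)
        k >>= 1
    return result

def public_key(k):
    """W = k * G for a private key k in the range 1 .. ORDER - 1."""
    if not 1 <= k < ORDER:
        raise ValueError("private key out of range")
    return ec_mul(k, G)
```

Going from `public_key(k)` back to k is the discrete logarithm problem described earlier, which is why the public key can be shared.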
The purpose of having defined the private and public keys is to use them to
cryptographically sign a message. Explaining how a message is signed is beyond the
scope of this thesis, but it is at least necessary to know the principal properties of a signed
message.
Let’s suppose we have a message that needs to be signed; in Bitcoin this
message is usually a transaction.
• Knowing the public key associated with the private key that signs the
message, it is possible to verify that the message was signed using the corresponding
private key (without knowing it).
For this reason the keys are called private and public: the former is supposed to be
kept secret because it is able to sign a message, while the latter is supposed to be
shared, in order to let everyone else verify that whoever signed the message is in
possession of the corresponding private key.
Chapter 2
Wallet
The way keys are stored is essential in private-public key cryptography. For this
reason, different types of wallets have been designed.
Definition A wallet is a piece of software used to store keys. It is also able to sign messages with
the private key, but in this framework we will only consider wallets as key containers.
• Deterministic Wallet.
Remark Bitcoin wallets contain keys, not coins. Coins are in the Blockchain.
$X \sim U(S),$
where S is the finite set of natural numbers in the range from 1 to (order − 1).
(ii) Take some realizations $k_1, k_2, \ldots, k_n$ of X using enough entropy to make these
numbers (private keys) impossible to guess.
$k_1 = X(\omega_1), \quad k_2 = X(\omega_2), \quad \ldots, \quad k_n = X(\omega_n).$
(iii) Go back to point (i) every time new private keys are needed.
With this procedure it is impossible to compute the public key without having already
computed the private key.
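A minimal sketch of this procedure, assuming Python's `secrets` module as the source of randomness (the text does not prescribe a specific generator) and the published secp256k1 value for `ORDER`:

```python
import secrets

# Published secp256k1 group order (assumed; not stated numerically in the text).
ORDER = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141

def new_private_key() -> int:
    """Draw a private key uniformly from the set S = {1, ..., ORDER - 1}."""
    # randbelow(ORDER - 1) is uniform on 0 .. ORDER - 2; adding 1 shifts it onto S.
    return secrets.randbelow(ORDER - 1) + 1

# Each key is an independent draw, so every one of them must be backed up.
keys = [new_private_key() for _ in range(3)]
```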
Random Wallet
Pros Cons
The use of random wallet is strongly discouraged for anything other than simple tests.
There are different types of deterministic wallets; in this text we will analyze three
main types:
(i) Generate a seed (only once), a random number from a Discrete Uniform Random
Variable
$\text{seed} = X(\omega), \quad X \sim U(S),$
where S is the finite set of natural numbers in the range from 1 to (order − 1).
(ii) Consider the numbers seed and n as strings and concatenate n to seed, obtaining
a value:
value = seed|n.
(iii) Apply the SHA256 function to value and obtain the nth private key.
(iv) Go back to point (ii) every time new private keys are needed with n = n + 1.
With this procedure it is impossible to compute the public key without having already
computed the private key.
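The type-1 steps can be sketched as follows. The text only says that seed and n are treated as strings; writing them as lowercase hexadecimal strings is an assumption of this example, as is the final reduction mod order, which is added so that the result falls in the valid range for private keys. The function name `type1_private_key` is hypothetical.

```python
from hashlib import sha256

# Published secp256k1 group order (assumed; not stated numerically in the text).
ORDER = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141

def type1_private_key(seed: int, n: int) -> int:
    """n-th private key of a type-1 deterministic wallet."""
    value = f"{seed:x}{n:x}"  # value = seed|n, as hexadecimal strings (assumption)
    digest = sha256(value.encode("ascii")).digest()
    return int.from_bytes(digest, "big") % ORDER

seed = 0x1F2E3D4C5B6A79  # toy seed, far too small for real use
first_keys = [type1_private_key(seed, n) for n in range(3)]
```

Only the seed needs to be backed up: calling the function again with the same seed and index reproduces the same key.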
Pros Cons
The use of this type of wallet is not recommended for everyday use, but it could be
used to store Bitcoin in a safe place: cold wallet.
Master private key (mp): a random number, generated from a Discrete Uniform
Random Variable
$mp = X(\omega), \quad X \sim U(S),$
where S is the finite set of natural numbers in the range from 1 to (order − 1).
The master private key must be kept secret.
Master public key (MP): a point on the EC, obtained from mp:
$MP = mp \cdot G,$
Public random number (r): a random number, generated from a Discrete Uniform
Random Variable
$r = X(\omega), \quad X \sim U(S),$
This number can be considered non-secret.
In order to obtain the corresponding public key Pn, it is possible to compute the
standard multiplication:
Pn = pn · G.
It is also possible to compute Pn without knowing pn, using only non-secret
information: hn|r and MP.
(i) Compute V:
V = hn|r · G,
where V can be seen as the public key of hn|r and can be considered non-secret.
(ii) Add MP to V:
Pn = MP + V,
where the sum in this context is the one defined between two points on the EC.
Pn = pn · G
= (mp + hn|r ) · G
= (mp · G ) + (hn|r · G )
= MP + V.
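The identity above can be verified numerically. The sketch below assumes that hn|r is the SHA256 hash of n and r written as hexadecimal strings and that pn = (mp + hn|r) mod order, which is what the derivation implies; the string encoding, the function names and the toy values of mp and r are assumptions of this example. `ORDER` is the published secp256k1 group order.

```python
from hashlib import sha256

P = 2**256 - 2**32 - 977
ORDER = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)

def ec_add(p1, p2):
    """Point addition on y^2 = x^3 + 7 over F_P; None is the point at infinity."""
    if p1 is None:
        return p2
    if p2 is None:
        return p1
    (x1, y1), (x2, y2) = p1, p2
    if x1 == x2 and (y1 + y2) % P == 0:
        return None
    if p1 == p2:
        s = 3 * x1 * x1 * pow(2 * y1, -1, P) % P
    else:
        s = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P
    x3 = (s * s - x1 - x2) % P
    return x3, (s * (x1 - x3) - y1) % P

def ec_mul(k, point):
    """Double-and-add scalar multiplication."""
    result = None
    while k > 0:
        if k & 1:
            result = ec_add(result, point)
        point = ec_add(point, point)
        k >>= 1
    return result

def h(n: int, r: int) -> int:
    """hn|r: hash of the concatenation n|r (hex-string encoding is an assumption)."""
    return int.from_bytes(sha256(f"{n:x}{r:x}".encode("ascii")).digest(), "big")

def private_key_n(mp: int, r: int, n: int) -> int:
    return (mp + h(n, r)) % ORDER          # requires the secret mp

def public_key_n(MP, r: int, n: int):
    return ec_add(MP, ec_mul(h(n, r), G))  # Pn = MP + V, no secret needed

mp, r = 0x123456789ABCDEF, 0xFEDCBA987654321  # toy values for illustration
MP = ec_mul(mp, G)
```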
Pros Cons
The type-2 deterministic wallet is an improvement on type-1 because it has the same
benefits (except for the need to back up two numbers instead of only one), but with a
great advantage: it is possible to generate new addresses, obtained from the public
keys, even in an unsafe environment, having only r and MP.
A thief who steals only MP and r can only compromise your privacy. In fact, he is able
to see which messages (transactions) you have signed, without being able to
sign new ones and spend your coins.
(i) Generate a seed, a random number from a Discrete Uniform Random Variable,
unique for each wallet.
seed = X (ω ), X ∼ U ( S ),
(ii) Generate a master private key from the seed, using a stretching function: PBKDF2.
(iii) From this master private key it is possible to generate $2^{32}$ private keys using an
irreversible hash function: SHA512.
(iv) Each of these "children" private keys can derive $2^{32}$ private keys, and each of these
"grandchildren" can derive as many.
This procedure can produce a huge number of keys. From an outside point of view
they seem independent: it is infeasible to tell that two keys are derived from the
same seed.
Chapter 3
Hierarchical Deterministic Wallet
3.1 Elements
First, let us focus on the main elements of the Wallet:
Seed.
Extended keys.
3.1.1 Seed
The entire Wallet is based on a seed.
seed = X (ω ), X ∼ U ( S ),
where S is the finite set of natural numbers in the range from 1 to an arbitrary value.
Obviously the greater the set from which the number can be extracted, the better it
is for the security of the seed itself.
Once it is decoded we will obtain exactly 78 bytes, with a specific meaning and
order:
• 4 bytes are used for the fingerprint. It is a value that identifies the parent.
Compute the HASH160 function on the "parent" public key in compressed
form and then take the first 4 bytes.
• 32 bytes are used for the chain code. The chain code is used in order to
introduce entropy in the children generation. We will see below how it works.
An extended key is called Extended Private Key if the last 33 bytes are used to
specify the private key; it is called Extended Public Key if they are used to specify
the public key.
First of all we need to convert the seed into a string of bytes, where the most sig-
nificant bytes come first (big endian). In order to do so, we need to know the length
of the string of bytes.
byte_string1 = 00 00 00 07,
byte_string2 = 00 00 07,
byte_string3 = 00 07,
byte_string4 = 07.
These 4 byte strings are obtained from the same seed, seed = 7; the only
difference is the length of the string.
Remark A different string length produces a different master private key, even if the
seed is the same number.
In Python:
    byte_string = seed.to_bytes(seed_bytes, 'big')
where seed is an integer number and seed_bytes is the number of bytes that the byte_string
should have.
It is essential to specify the length of the byte string; otherwise, different
wallets will be obtained.
Once we obtain a string of bytes, we compute the HMAC algorithm. The hash
function used for HMAC is SHA512 and the key is a particular string of bytes:
b"Bitcoin seed". In Python the implementation is the following:
    from hashlib import sha512
    from hmac import HMAC

    # HMAC-SHA512 of the seed bytes, keyed with the constant b"Bitcoin seed"
    hashValue = HMAC(b"Bitcoin seed", byte_string, sha512).digest()
Now we have obtained a hashValue of 512 bits, i.e. 64 bytes. Consider the first 32
bytes as the master private key and the next 32 bytes as the master chain code. A
Python implementation is the following:
    private_key_bytes = hashValue[0:32]
    chain_code_bytes = hashValue[32:64]
Now we have two byte strings, one for the master private key and the other for the
master chain code.
It is important to remember that a private key must be in the range between 1 and
(order − 1), so the byte string for the private key should be converted to an integer
and then reduced mod order. In Python we have:
    private_key = int(private_key_bytes.hex(), 16) % order
Finally, we will concatenate all the information obtained in order to form a Master
Extended Private Key (in bytes format):
• depth = b'\x00',
Remark SHA512 is an irreversible function, so it is infeasible to obtain the seed
knowing the master key. (It is also useless, because with the master key you can derive all the keys
in the wallet).
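The snippets above can be put together into one function. This is a sketch that follows the steps in the text; the function name `master_key` is chosen for this example, and the numerical value of `ORDER` is the published secp256k1 group order, assumed here because the text does not state it.

```python
from hashlib import sha512
from hmac import HMAC

# Published secp256k1 group order (assumed; not stated numerically in the text).
ORDER = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141

def master_key(seed: int, seed_bytes: int):
    """Return (master private key, master chain code) for an integer seed."""
    byte_string = seed.to_bytes(seed_bytes, "big")  # big endian, fixed length
    hash_value = HMAC(b"Bitcoin seed", byte_string, sha512).digest()
    private_key = int.from_bytes(hash_value[:32], "big") % ORDER
    chain_code = hash_value[32:]
    return private_key, chain_code

# The same number encoded with different lengths gives different wallets.
key_4_bytes, chain_4_bytes = master_key(7, 4)  # 00 00 00 07
key_1_byte, chain_1_byte = master_key(7, 1)    # 07
```

The last two lines reproduce the remark about seed = 7: the integer is the same, but the two byte strings differ, and so do the master keys.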
• Normal.
• Hardened.
Both methods have some advantages and disadvantages that we will discuss later. For
every situation, it is essential to use the method that fits best.
For both methods the derivation starts from an extended private key. From this
key some essential information is needed:
• Chain code.
• Private key.
A number is also required, used in order to specify the index of the child. This
number should be in the range between 0 and 4294967295 ($2^{32} - 1$). This is due to the fact
that in any extended key there are 4 bytes used to specify the index of the child:
In fact, it would be possible to generate an even greater number of children from the same
parent, but it would not be possible to write the corresponding extended key in the
format described above.
After that we concatenate this 33 byte string to the 4 byte string representing the
index number:
msg = compressed public key | index,
where msg is now a string of 37 bytes.
Now we split this string of bytes in two: the last 32 are the child chain code. Then we
take the first 32 bytes, convert them into an integer number and sum it to the parent
private key (mod order), obtaining the child private key.
First, we concatenate the 33 bytes of parent private key, considering also the 00 byte,
with the 4 byte string representing the index number.
Remark In order to better distinguish the hardened derivation from the normal one, the
numbering of the hardened indices starts from the number $2^{31}$.
Now we split this string of bytes in two (in the same way as the normal method):
the last 32 are the child chain code. Finally, we take the first 32 bytes, convert them
into an integer number and sum it to the parent private key (mod order), obtaining
the child private key.
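The hardened derivation of a child private key can be sketched as follows. The HMAC-SHA512 call keyed with the parent chain code follows BIP32; the function name `hardened_child`, the toy parent values and the numerical value of `ORDER` (the published secp256k1 group order) are assumptions of this example.

```python
from hashlib import sha512
from hmac import HMAC

# Published secp256k1 group order (assumed; not stated numerically in the text).
ORDER = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141

def hardened_child(parent_private_key: int, parent_chain_code: bytes, index: int):
    """Return (child private key, child chain code) for a hardened index."""
    if not 2**31 <= index < 2**32:
        raise ValueError("hardened indices lie in the range 2^31 .. 2^32 - 1")
    # 33 bytes: the 00 byte followed by the 32-byte parent private key,
    # then the 4-byte big-endian index.
    data = (b"\x00" + parent_private_key.to_bytes(32, "big")
            + index.to_bytes(4, "big"))
    digest = HMAC(parent_chain_code, data, sha512).digest()
    child_private_key = (int.from_bytes(digest[:32], "big")
                         + parent_private_key) % ORDER
    child_chain_code = digest[32:]
    return child_private_key, child_chain_code

# Toy parent values, for illustration only.
child_key, child_chain = hardened_child(0x1234, bytes(32), 2**31)
```

Because the parent private key itself enters the HMAC input, this derivation cannot be repeated by someone who only holds the parent extended public key.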
• Chain code.
• Index.
First, we apply the HMAC algorithm to the same inputs used for the normal
derivation:
The output of this function is the same as in the normal derivation with the extended
private key. The last 32 bytes form the child chain code, while the first 32 bytes
can be read as a special number: q.
Now we multiply the generator G by the integer number q and we obtain Q, a point
on the EC:
$Q = q \cdot G.$
Finally, we compute the sum between two points on the elliptic curve: Q and $P_{parent}$,
where $P_{parent}$ is the parent public key.
$Q + P_{parent} = P_{child},$
We will now prove that the child public key obtained in this way, $P_{child2}$, is the same
as that obtained starting from the private key, $P_{child1}$:
Both procedures start from q, the number obtained from the first 32 bytes of the
HMAC function. Let’s call $p_{parent}$ the parent private key and $p_{child}$ the child private
key.
$P_{child1} = p_{child} \cdot G = (q + p_{parent}) \cdot G = (q \cdot G) + (p_{parent} \cdot G) = Q + P_{parent} = P_{child2}$
Q.E.D.
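The equality between the two child public keys can also be checked numerically. The sketch below implements the normal derivation from the private side and from the public side; keying HMAC-SHA512 with the parent chain code follows BIP32, while the function names, the toy parent values and the numerical value of `ORDER` (the published secp256k1 group order) are assumptions of this example.

```python
from hashlib import sha512
from hmac import HMAC

P = 2**256 - 2**32 - 977
ORDER = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)

def ec_add(p1, p2):
    """Point addition on y^2 = x^3 + 7 over F_P; None is the point at infinity."""
    if p1 is None:
        return p2
    if p2 is None:
        return p1
    (x1, y1), (x2, y2) = p1, p2
    if x1 == x2 and (y1 + y2) % P == 0:
        return None
    if p1 == p2:
        s = 3 * x1 * x1 * pow(2 * y1, -1, P) % P
    else:
        s = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P
    x3 = (s * s - x1 - x2) % P
    return x3, (s * (x1 - x3) - y1) % P

def ec_mul(k, point):
    """Double-and-add scalar multiplication."""
    result = None
    while k > 0:
        if k & 1:
            result = ec_add(result, point)
        point = ec_add(point, point)
        k >>= 1
    return result

def compress(point) -> bytes:
    """33-byte compressed public key: prefix 02 (even y) or 03 (odd y), then x."""
    x, y = point
    return bytes([2 + (y & 1)]) + x.to_bytes(32, "big")

def q_and_chain_code(parent_public, chain_code: bytes, index: int):
    msg = compress(parent_public) + index.to_bytes(4, "big")  # 37 bytes
    digest = HMAC(chain_code, msg, sha512).digest()
    return int.from_bytes(digest[:32], "big"), digest[32:]

def child_from_private(p_parent: int, chain_code: bytes, index: int):
    q, child_chain = q_and_chain_code(ec_mul(p_parent, G), chain_code, index)
    return (q + p_parent) % ORDER, child_chain

def child_from_public(P_parent, chain_code: bytes, index: int):
    q, child_chain = q_and_chain_code(P_parent, chain_code, index)
    return ec_add(ec_mul(q, G), P_parent), child_chain

# Toy parent key and chain code, for illustration only.
p_parent, chain_code = 0xABCDEF, bytes(range(32))
p_child, chain_1 = child_from_private(p_parent, chain_code, 0)
P_child2, chain_2 = child_from_public(ec_mul(p_parent, G), chain_code, 0)
P_child1 = ec_mul(p_child, G)
```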
The HMAC-SHA512 function has three inputs: the parent chain code,
the parent public key and the child index. The first two pieces of information can be taken
from the parent extended public key, while the child index can be taken from the
child extended private key.
After that, we consider the first 32 bytes of the result of this function and read them
as an integer number, q.
Remembering that the child private key is obtained by computing a sum with
the parent private key, it is possible to reverse the process.
Let’s call pchild and pparent the private keys of the child and the parent respectively.
So we have derived the private key of the parent. Graphically this derivation can
be shown in Figure 3.5.
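The reversal can be sketched in a few lines: since the child private key is q plus the parent private key (mod order), the parent key is the child key minus q (mod order). To keep the example free of elliptic curve arithmetic, the demonstration uses the parent private key 1, whose public key is the generator G itself; this choice, the function name and the toy chain code are assumptions of this example, and `ORDER` is the published secp256k1 group order.

```python
from hashlib import sha512
from hmac import HMAC

# Published secp256k1 group order (assumed; not stated numerically in the text).
ORDER = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
GX = 0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798

def q_from_public(parent_public_compressed: bytes, chain_code: bytes, index: int) -> int:
    """First 32 bytes of HMAC-SHA512, using only data in the extended public key."""
    msg = parent_public_compressed + index.to_bytes(4, "big")
    return int.from_bytes(HMAC(chain_code, msg, sha512).digest()[:32], "big")

def recover_parent_private_key(child_private_key: int,
                               parent_public_compressed: bytes,
                               chain_code: bytes, index: int) -> int:
    q = q_from_public(parent_public_compressed, chain_code, index)
    return (child_private_key - q) % ORDER

# Toy parent: private key 1, so the public key is G (its y is even: prefix 02).
parent_private = 1
parent_public = b"\x02" + GX.to_bytes(32, "big")
chain_code, index = bytes(32), 0

# Normal derivation of the child private key, as done by the wallet owner.
child_private = (q_from_public(parent_public, chain_code, index) + parent_private) % ORDER

# What an attacker holding child_private and the parent extended public key obtains.
recovered = recover_parent_private_key(child_private, parent_public, chain_code, index)
```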
In fact, the inputs of HMAC-SHA512 are different for the two derivations. While for
the normal derivation the information in the extended public key is sufficient,
for the hardened derivation the parent private key is needed. This makes it impossible
to derive a public key from a public key, but it also makes it impossible to derive the private
key of the parent knowing the private key of the child and the public key of the parent.
However, if all the child keys are used by the same person and you need to
generate a new public key, with this derivation it is possible to do so even in a "hot
place". Suppose you have stored only the extended public key in a device: you
can then receive payments to your public keys, but it is impossible to spend those
coins as long as the private keys are hidden. If someone steals your device, the only
problem is a loss of privacy, because it is possible, by examining the blockchain,
to discover all transactions signed with the private keys associated with the public
keys of the wallet, but it is impossible to sign new transactions without having the
private keys.
As a best practice, it is always advisable to use the hardened method for the first
derivation from the extended master private key. A hardened key can have both
hardened and normal children, but from a normal child it is not reasonable to derive
a hardened one, because it makes no sense to increase the security of the wallet only at the
last level.
Chapter 4
Mnemonic phrase
We have seen how it is possible to generate keys starting from a seed. But a seed is
a long number, difficult to remember and not easy to write down on paper. You
may make typos while transcribing it, and this can compromise the entire wallet.
Remark Mistyping a single digit in the seed produces completely different keys.
In order to work around this problem, some solutions were implemented. Among
them, the most widespread is the one described by BIP39, Bitcoin Improvement
Proposal number 39. This is not the only one: in this chapter we will also see
another solution, proposed by Electrum, one of the most famous Bitcoin wallets.
Both these solutions use a Mnemonic phrase, from which the seed is obtained. This
phrase is designed to avoid typing errors while maintaining the same level of
security and entropy.
4.1 BIP 39
First, we will see how to generate a mnemonic phrase in the framework of BIP39 [2]
and then how it is possible to obtain a seed from it.
Let us call ENT the number of binary digits of the given entropy. Then ENT should
belong to a given set:
ENT ∈ {128, 160, 192, 224, 256}.
The reason for a given length for the entropy will be clear in a moment.
Now we write the entropy in bytes format, obtaining a string of ENT/8 bytes.
Then we compute its SHA256 hash and consider only the first ENT/32 bits
as a checksum. Finally, we append these bits to the end of the entropy, obtaining an
integer number, called entropy_checked, expressed in binary format of length equal to:
ENT + ENT/32.
In Python:
where entropy_bin and checksum_bin are strings of bits that can be concatenated.
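A minimal sketch of this computation, using the names entropy_bin and checksum_bin mentioned above. The function names `entropy_with_checksum` and `word_indices` are chosen for this example, and the lookup of the actual words is left out because it requires the dictionary file.

```python
from hashlib import sha256

def entropy_with_checksum(entropy: int, ENT: int) -> str:
    """Return entropy_checked as a string of ENT + ENT/32 bits."""
    if ENT not in (128, 160, 192, 224, 256):
        raise ValueError("ENT must be one of 128, 160, 192, 224, 256")
    entropy_bytes = entropy.to_bytes(ENT // 8, "big")
    entropy_bin = format(entropy, f"0{ENT}b")
    digest = sha256(entropy_bytes).digest()
    digest_bin = format(int.from_bytes(digest, "big"), "0256b")
    checksum_bin = digest_bin[: ENT // 32]  # first ENT/32 bits of the hash
    return entropy_bin + checksum_bin

def word_indices(entropy_checked: str) -> list:
    """Split the bit string into groups of 11 bits, each an index in 0 .. 2047."""
    return [int(entropy_checked[i:i + 11], 2)
            for i in range(0, len(entropy_checked), 11)]

# Entropy used in the summary scheme of this section (ENT = 128).
entropy_checked = entropy_with_checksum(0xF012003974D093EDA670121023CD03BB, 128)
indices = word_indices(entropy_checked)
```

For ENT = 128 the result has 132 bits, which split into 12 indices and therefore into a phrase of 12 words.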
Now the reason for the constraint on the length of the input entropy is clear:
Point (i) is due to the structure proposed by BIP39. It is only a convention to take
the first ENT/32 bits as a checksum. However, it is essential that the final length of
the entropy plus the checksum be a multiple of 11, since the
dictionary is a set of $2^{11}$ words.
Point (ii) is just a suggestion, because taking less entropy could reduce
security. It will be easier for an attacker to guess your mnemonic phrase by trying
out all the possible combinations if fewer words are involved. It is important to
remember that adding even a single bit of entropy doubles the difficulty of guessing it.
Point (iii) is another suggestion. A private key is a number smaller than $2^{256}$;
therefore, it would be useless to generate a seed starting from an entropy with more
than 256 bits.
Each of these strings represents an integer number that can take values in the range
between 0 and 2047, i.e. $2^{11} - 1$. Associate each of these numbers with a word in the
chosen dictionary; suppose we consider the English one, sorted alphabetically. Write
down all these words, separated by a space, and obtain the Mnemonic Phrase.
All these steps can be summarized with the following scheme (ENT = 128):
entropy (base 16) = f012003974d093eda670121023cd03bb
⇕
entropy (base 2) = 1111000000010010000000...0111011
⇓ SHA256
0010 (checksum) 010001000001001... (ignored)
The function used is PBKDF2. It is used in order to hinder brute force attacks:
its output has exactly the same length as that of a standard
hash function, but it takes more time to calculate because it
computes the same hash function many times.
It receives as input:
Message: Mnemonic phrase.
Salt: ’mnemonic’ + passphrase.
Number of iterations: 2048.
Digest-module: SHA512.
Mac-module: HMAC.
Summing up, it calculates the same hash function (HMAC-SHA512)
2048 times.
Although a human being is a poor source of randomness, the passphrase is usually
chosen by the user. This is acceptable because the passphrase is not meant to add
entropy: it prevents attacks based on rainbow tables and lets the user have different
wallets with the same mnemonic phrase.
Remark The randomness should be guaranteed by the input entropy used to generate the
mnemonic phrase, not by the passphrase.
where mnemonic is the mnemonic phrase previously computed and passphrase is cho-
sen by the user (if not specified it is empty).
Remark This procedure always produces a seed of fixed length: 512 bits. This is always
enough, because every private key takes values in a smaller set (from 1 to order − 1).
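The seed derivation described in this section can be reproduced with the Python standard library alone. The sketch below is an illustration under some assumptions: the function name bip39_seed is chosen here, and the NFKD normalization of both strings follows the BIP39 specification rather than the text above.

```python
import hashlib
import unicodedata


def bip39_seed(mnemonic: str, passphrase: str = "") -> bytes:
    """Derive the 512-bit seed from a mnemonic phrase with PBKDF2-HMAC-SHA512."""
    # BIP39 normalizes both strings with NFKD before encoding them as UTF-8
    message = unicodedata.normalize("NFKD", mnemonic).encode("utf-8")
    salt = unicodedata.normalize("NFKD", "mnemonic" + passphrase).encode("utf-8")
    # 2048 iterations, 64 bytes (512 bits) of output
    return hashlib.pbkdf2_hmac("sha512", message, salt, 2048, dklen=64)


phrase = "abandon " * 11 + "about"
print(bip39_seed(phrase).hex())
print(bip39_seed(phrase, "TREZOR").hex())
```

The two calls use the same mnemonic phrase but produce unrelated seeds, which is how a single phrase can back several wallets.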
The main difference lies in how the mnemonic phrase is generated and in its purpose.
Electrum assigns a version to the seed, so that it is possible to recognize the purpose
of the keys and the way to generate them.
To be consistent with the BIP39 section, consider the entropy as a large integer and
call ENT the number of its binary digits. ENT must then be a multiple of 11 if the
chosen dictionary is the same as in BIP39. However, the choice of the dictionary is
not binding.
The first important difference from the BIP39 mnemonic is that the checksum is not
computed on the entropy but directly on the mnemonic phrase. In order to obtain a
valid mnemonic phrase, the following instructions must be followed:
2. Associate each string with a word from the chosen dictionary of 2048 words.
3. Write down all these words, separated by a space and obtain a candidate Mnemonic
phrase.
5. If the initial digits are different, increase entropy by one and then go back to
point 1, otherwise, you have obtained a valid mnemonic phrase.
Point 5 usually has the simple effect of changing a single word of the mnemonic
phrase, which leads to a completely different HASH. This is repeated until the first
digits match the required version digits.
Let’s see an example: suppose we are looking for a standard type seed, starting with
ENT = 132:
Mnemonic = usage orchard lift online melt replace budget indoor table twenty issue shock
HASH(Mnemonic) = 3d5d23737859601eeabe32d1e1...
Mnemonic = usage orchard lift online melt replace budget indoor table twenty issue shoe
Mnemonic = usage orchard lift online melt replace budget indoor table twenty issue worry
So we have finally found a valid Mnemonic phrase for the standard type seed.
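The search for a valid phrase can be sketched as follows. Several elements are assumptions made for this illustration: the dictionary is a hypothetical stand-in of 2048 placeholder words, the HASH is taken to be HMAC-SHA512 keyed with "Seed version" and the standard version digits are taken to be "01", as in Electrum's own implementation; the function names are chosen here.

```python
import hashlib
import hmac

# Hypothetical stand-in dictionary: 2048 distinct placeholder words.
WORDS = ["w%04d" % i for i in range(2048)]


def entropy_to_phrase(entropy: int, n_words: int = 12) -> str:
    """Split the entropy into 11-bit strings and map each one to a word."""
    digits = [(entropy >> (11 * (n_words - 1 - i))) & 0x7FF for i in range(n_words)]
    return " ".join(WORDS[d] for d in digits)


def seed_version_hash(phrase: str) -> str:
    # Assumption: HMAC-SHA512 with key "Seed version", as done by Electrum.
    return hmac.new(b"Seed version", phrase.encode("utf-8"), hashlib.sha512).hexdigest()


def find_valid_phrase(entropy: int, version_digits: str = "01"):
    """Increase the entropy by one until the HASH starts with the version digits."""
    while True:
        phrase = entropy_to_phrase(entropy)
        if seed_version_hash(phrase).startswith(version_digits):
            return phrase, entropy
        entropy += 1


start = 2**131 + 12345          # arbitrary 132-bit starting entropy
phrase, found = find_valid_phrase(start)
print(found - start, "increments:", phrase)
```

Since the version digits "01" cover 8 bits, about 256 attempts are needed on average, and each increment normally changes only the last word of the candidate phrase.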
The salt used for the PBKDF2 function does not contain the word ’mnemonic’; instead,
it contains the word ’electrum’. It is always concatenated with a passphrase chosen
by the user.
Once again, if the passphrase is not specified by the user, it is left empty.
4.3 Comparison
Having described how to generate the mnemonic phrase in each of the two principal
proposals, let’s analyze their advantages and disadvantages.
Both BIP39 and Electrum are secure against a so-called brute-force attack, since the
PBKDF2 function is used to generate the seed from the mnemonic phrase, and so it is
infeasible to guess a mnemonic randomly generated by another user. In fact, each time
a valid phrase is found, the attacker has to compute the seed, then the first child
keys, and then look at the public ledger, the blockchain, to see whether any of these
private keys has been used to sign a transaction. All these steps are computationally
expensive, and it would take too much time to try even a small part of all the possible
combinations. Even with all the most powerful computers in the world working together,
it would take many orders of magnitude longer than the age of the universe itself.
These two methods are both secure, but there is a difference. Suppose we are looking
for a 12-word mnemonic already in use: with BIP39 it is "only" necessary to try 128
bits of entropy, since the other 4 bits, the checksum, are obtained through a HASH
function; with Electrum, instead, it is necessary to try all 132 bits of entropy and
then check, through a HASH function, whether they are valid. In this example the
difference is 4 bits, but it increases with the number of words:
Words   BIP39 (bits)   Electrum (bits)   Difference (bits)
12      128            132               4
15      160            165               5
18      192            198               6
21      224            231               7
24      256            264               8
The first column represents the number of words in a mnemonic phrase; the second and
the third represent, respectively, the bits of entropy that need to be checked in
order to find a specific BIP39 or Electrum mnemonic phrase. The last column is simply
the difference between the previous two. Although this additional difficulty is not
necessary, the difference between the two methods is not negligible. In fact, finding
a specific Electrum phrase with 12 words requires 16 ($2^4$) times the attempts needed
to find a phrase with the same number of words in the BIP39 framework. With 24 words,
the number of attempts would be 256 ($2^8$) times as many.
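The figures discussed above follow from two facts: every word encodes 11 bits, and in BIP39 the checksum takes ENT/32 of them. A short sketch (the function name bits_to_search is chosen here for illustration) reproduces them for every phrase length allowed by BIP39:

```python
def bits_to_search(n_words: int):
    """Return (BIP39 bits, Electrum bits, difference) for a phrase of n_words."""
    total_bits = 11 * n_words             # every word encodes 11 bits
    bip39_bits = total_bits * 32 // 33    # ENT, since ENT + ENT/32 = 11 * n_words
    return bip39_bits, total_bits, total_bits - bip39_bits


for n in (12, 15, 18, 21, 24):
    bip39, electrum, diff = bits_to_search(n)
    print(n, bip39, electrum, diff, "extra attempts factor:", 2 ** diff)
```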
Another difference is in the way a phrase is considered valid. BIP39 uses a checksum
based on the input entropy. This means that knowledge of the mnemonic phrase alone is
not enough to tell whether it is valid: the fixed dictionary is always required to map
the words back to bits. With Electrum, on the other hand, knowledge of the dictionary
is not needed, because the validation is based directly on the HASH of the mnemonic
phrase. Furthermore, Electrum allows the use of any kind of dictionary, not only the
standard ones.
Finally, the most important difference is Electrum’s introduction of a version type
for the seed. While BIP39 only checks whether a mnemonic is valid, Electrum checks the
validity of a phrase by verifying that it corresponds to a specific version. With
Electrum it is possible to understand, directly from the mnemonic, how to derive all
the required keys and what their purpose is. With BIP39, on the other hand, it is
impossible to tell a priori the purpose of the keys derived from that seed.
To avoid this problem a new BIP was proposed: BIP43. In order to identify a particular
purpose for a set of keys, a particular derivation scheme is used:
m / purpose’ / *
Changing the derivation path also changes the purpose of the keys. For more
information see the next chapter.
Chapter 5
How to use a HD Wallet
Let’s denote the extended master private key with m and the extended master public
key with M.
The first normal private child derived from m will be denoted by the number 0:
m/0
The fourth normal private child of the first normal private child of the master private
key will use the following notation:
m/0/3
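A hardened child is denoted by an apostrophe after its index. The first hardened
private child of m is therefore:
m/0’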
It is possible to mix hardened and normal derivation. For example, the 15th hardened
child of the 37th normal child of the 6th normal child of the 1019th hardened child of
m will be represented in the following way:
m/1018’/5/36/14’
Remark Although it is not recommended to use hardened derivation after a normal deriva-
tion, it is always possible to do so.
This nomenclature can also be used to represent extended public keys. In fact, with
normal derivation it is possible to derive an extended public key from the parent
extended public key. The notation is the usual one: for example, the sixth public key
derived from the third public key derived from M is represented in this way:
M/2/5
Remark There is no need to specify that children derived from M are normal and not
hardened: it is impossible to use hardened derivation to derive a public key from a
parent public key.
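The notation can be turned into the numeric child indices used by the derivation functions. The sketch below is only illustrative (the name parse_path is chosen here); it assumes that hardened indices are obtained by adding $2^{31}$ and uses the ASCII apostrophe as the hardened marker.

```python
HARDENED_OFFSET = 2 ** 31  # hardened indices start at 0x80000000


def parse_path(path: str) -> list:
    """Convert a path such as "m/1018'/5/36/14'" into a list of child indices."""
    root, *levels = path.split("/")
    if root not in ("m", "M"):
        raise ValueError("a path must start with m or M")
    indices = []
    for level in levels:
        hardened = level.endswith("'")
        number = int(level[:-1] if hardened else level)
        if not 0 <= number < HARDENED_OFFSET:
            raise ValueError("child number out of range")
        if hardened and root == "M":
            # public-to-public derivation cannot be hardened
            raise ValueError("hardened derivation requires the parent private key")
        indices.append(number + HARDENED_OFFSET if hardened else number)
    return indices


print(parse_path("m/1018'/5/36/14'"))
print(parse_path("M/2/5"))
```

Rejecting hardened levels under M mirrors the remark above: a hardened child can only be derived from a parent private key.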
5.2 BIP 43
This BIP introduces a "Purpose Field" in the Hierarchical Deterministic wallet [4].
The first child of the extended master private key specifies the purpose of the entire
branch.
m / purpose’ / *
For example, purpose = 44 means that this is a multi-coin wallet, while purpose = 49
means that the generated keys follow the BIP49 specification (P2WPKH nested in
P2SH).
• Purpose: it must be equal to 44’ (or 0x8000002C) and it indicates that the subtree
of this node is used according to this specification. Hardened derivation is
used at this level.
• Coin_type: this level creates a separate subtree for every cryptocoin, which avoids
reusing addresses across cryptocoins and improves privacy. Coin type is a constant,
set for each cryptocoin. Hardened derivation is used at this level.
Some examples:
Path                 Cryptocoin
m / 44’ / 0’         Bitcoin (mainnet)
m / 44’ / 1’         Bitcoin (testnet)
m / 44’ / 2’         Litecoin
m / 44’ / 3’         Dogecoin
m / 44’ / 60’        Ethereum
m / 44’ / 128’       Monero
m / 44’ / 144’       Ripple
m / 44’ / 1815’      Cardano
...                  ...
The complete list of cryptocoins is available, and continuously updated, at:
https://github.com/satoshilabs/slips/blob/master/slip-0044.md
• Account: this level splits the key space into independent user identities, so the
wallet never mixes the coins across different accounts. Users can use these
accounts to organize the funds in the same fashion as bank accounts; for do-
nation purposes, for saving purposes, for common expenses etc. Hardened
derivation is used at this level.
• Change: it can take only two values: 0 and 1. 0 is used for addresses that are
meant to be visible outside of the wallet (e.g. for receiving payments). 1 is
used for addresses which are not meant to be visible outside of the wallet, i.e.
for the change returned by a transaction. Normal derivation is used at this level.
• Index: this is the last derivation level, used to have many keys for each
cryptocoin. It takes values starting from 0, in sequentially increasing order.
Normal derivation is used at this level.
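The five levels described above can be combined into a single path. The following sketch is illustrative only: the function name bip44_path and the small COIN_TYPES dictionary are chosen here, with the coin types copied from the table of examples.

```python
def bip44_path(coin_type: int, account: int = 0, change: int = 0, index: int = 0) -> str:
    """Build the path m / 44' / coin_type' / account' / change / index."""
    if change not in (0, 1):
        raise ValueError("change must be 0 (external) or 1 (internal)")
    if min(coin_type, account, index) < 0:
        raise ValueError("levels must be non-negative")
    # purpose, coin_type and account are hardened; change and index are normal
    return "m/44'/%d'/%d'/%d/%d" % (coin_type, account, change, index)


COIN_TYPES = {"Bitcoin": 0, "Litecoin": 2, "Ethereum": 60}

print(bip44_path(COIN_TYPES["Bitcoin"]))             # first receiving address
print(bip44_path(COIN_TYPES["Ethereum"], 1, 1, 7))   # eighth change address, second account
```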
The principal advantage of this BIP is the possibility of easily backing up all the
cryptocoins of the user just by remembering a particular set of words (the BIP39
mnemonic phrase). If this method is used to store more than one coin, take care not to
lose the master private key; otherwise, all coins will be lost.
The logic of BIP44 is followed, which gives the user the possibility of having a
multi-coin wallet just by choosing a specific coin_type.
Remark In order to make SegWit a soft fork (backward compatible), a SegWit transaction
is nested in a pay-to-script-hash, so the corresponding address must begin with ’3’.
Conclusion
The main purpose of this work has been the analysis of methods used to generate a
sequence of private and public keys and to store the seed from which the sequence
is derived.
First, we briefly described some simple deterministic derivations, then we analyzed
the Hierarchical Deterministic Wallet. An extended key can be derived in two ways:
normal and hardened. Normal derivation allows public-to-public derivation, that is,
the derivation of a sequence of public keys from an extended public key without access
to any private key; however, the entire wallet is compromised if both a parent
extended public key and a child extended private key are stolen. Hardened derivation
prevents this problem, but it does not allow public-to-public derivation.
Then we focused on the two most widely used methods to generate the seed: the version
proposed by BIP39 and the one proposed by Electrum. Both of them start from a given
entropy to generate a mnemonic phrase, which is then used to obtain a seed. The two
methods are very similar, but with some subtle differences. In this thesis these
differences have been analyzed, showing the pros and cons of each method.
It was not a goal of this work to single out a better proposal, but to provide a
complete and detailed overview of the various ways to generate asymmetric
cryptographic keys.
"We often fear what we do not understand. Our best defense is knowledge."
Appendix A
Bitcoin keys representation and addresses
In order to make it easy to store and recognize keys, some encodings were designed in
the Bitcoin framework.
In this appendix, we briefly describe the possible ways to write down a public key and
a private key, and finally how to obtain a Bitcoin address (Pay-to-Public-Key-Hash).
All the examples below will start from the following private key (expressed in hex-
adecimal digits):
2AFEED53F26EF06521E7E825F83CB36A4632791A070A782E353230EAE71EBDD3.
• Uncompressed.
• Compressed.
Both these encodings contain the same information, and it is possible to obtain one
from the other and vice versa.
A.1.1 Uncompressed
An uncompressed public key is a string of hexadecimal digits, obtained by concatenating
the x coordinate with the y coordinate (64 hexadecimal digits each). The prefix 04 is
added at the beginning of the string, for a total of 130 hexadecimal digits.
A.1.2 Compressed
A compressed public key is a string of hexadecimal digits, obtained by taking the
x coordinate and adding 02 at the beginning if the y coordinate is even, 03 otherwise.
Its length is 66 hexadecimal digits.
Remark The symmetry property of the EC allows us to write down only the x coordinate.
The y coordinate can be derived from the equation of the EC, which gives 2 possible
values of y. The choice between them is made based on the first two digits of the
compressed public key.
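The two encodings, and the recovery of y described in the remark, can be sketched as follows. This is an illustration and not the code of this thesis: the function names are chosen here, the constants are the standard secp256k1 field prime and generator coordinates, and the square root uses the shortcut available because the prime is congruent to 3 modulo 4.

```python
# secp256k1 field prime and generator point (standard curve parameters)
P = 2**256 - 2**32 - 977
GX = 0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798
GY = 0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8


def encode_uncompressed(x: int, y: int) -> str:
    return "04" + format(x, "064x") + format(y, "064x")


def encode_compressed(x: int, y: int) -> str:
    return ("02" if y % 2 == 0 else "03") + format(x, "064x")


def decompress(pubkey_hex: str):
    """Recover (x, y) from a compressed public key."""
    prefix, x = pubkey_hex[:2], int(pubkey_hex[2:], 16)
    if prefix not in ("02", "03"):
        raise ValueError("not a compressed public key")
    y_squared = (pow(x, 3, P) + 7) % P       # curve equation y^2 = x^3 + 7
    y = pow(y_squared, (P + 1) // 4, P)      # square root, valid since P % 4 == 3
    if pow(y, 2, P) != y_squared:
        raise ValueError("x does not correspond to a point of the curve")
    if (y % 2 == 0) != (prefix == "02"):
        y = P - y                            # take the other root, by symmetry
    return x, y


compressed_g = encode_compressed(GX, GY)
print(compressed_g)
print(decompress(compressed_g) == (GX, GY))
```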
In order to obtain a WIF private key, the following procedure must be used:
• Write down the private key in hexadecimal format (64 digits).
• Add two digits as a version number (80 for Bitcoin) in front of the private key.
This is done in order to recognize the purpose of the key.
• Add 01 at the end of the private key for a compressed WIF, nothing for an
uncompressed WIF. The difference between these two types is that a compressed
public key is expected to be derived from a compressed private key, and an
uncompressed public key from an uncompressed private key.
• Add a checksum at the end: apply the SHA256 function twice to the string
previously obtained, take the first 4 bytes (8 hexadecimal digits) of the result
and append them to the string.
• Finally, compute the base 58 encoding, obtaining a string of 51 characters if
uncompressed or 52 characters if compressed.
A compressed WIF private key starts with K or L, and an uncompressed one starts
with 5.
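The procedure above can be sketched with the standard library only. The helper names b58encode and to_wif are chosen here for illustration; the private key is the one used throughout this appendix.

```python
from hashlib import sha256

ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def b58encode(data: bytes) -> str:
    """Base 58 encoding; each leading zero byte becomes a leading '1'."""
    number = int.from_bytes(data, "big")
    encoded = ""
    while number > 0:
        number, remainder = divmod(number, 58)
        encoded = ALPHABET[remainder] + encoded
    leading_zeros = len(data) - len(data.lstrip(b"\x00"))
    return "1" * leading_zeros + encoded


def to_wif(private_key_hex: str, compressed: bool = True) -> str:
    payload = b"\x80" + bytes.fromhex(private_key_hex)     # version 80 in front
    if compressed:
        payload += b"\x01"                                  # compressed marker
    checksum = sha256(sha256(payload).digest()).digest()[:4]
    return b58encode(payload + checksum)


key = "2AFEED53F26EF06521E7E825F83CB36A4632791A070A782E353230EAE71EBDD3"
print(to_wif(key, compressed=False))
print(to_wif(key, compressed=True))
```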
A.3 Address
Among Bitcoin transactions, one of the most used is Pay-to-Public-Key-Hash, meaning
that the transaction does not contain the public key directly, but the hash of that
public key.
The hash function used in this framework is HASH160, applied to the compressed
public key. The result is a PubkeyHash and, since HASH160 is an irreversible function,
it is infeasible to obtain the public key starting from the PubkeyHash.
• Add one byte as version (00 for Bitcoin) in front of the PubkeyHash.
• Add a checksum at the end: apply the SHA256 function twice to the string
previously obtained, take the first 4 bytes of the result and append them to
the string.
Example of an address:
1DFvgrsFE6qVfgX83E35SbLdpjiSFffY2q.
Remark This is not the only type of address in the Bitcoin framework, but it is the simplest
one, derived directly from the public key.
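A sketch of the address construction is given below, assuming that the final step is the same base 58 encoding used for WIF private keys. The function names are chosen here; hash160 is taken to be RIPEMD160 applied to SHA256, and it relies on hashlib.new("ripemd160"), which is not available in every OpenSSL build, so it is defined but not called in the example.

```python
import hashlib

ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def b58encode(data: bytes) -> str:
    """Base 58 encoding; each leading zero byte becomes a leading '1'."""
    number = int.from_bytes(data, "big")
    encoded = ""
    while number > 0:
        number, remainder = divmod(number, 58)
        encoded = ALPHABET[remainder] + encoded
    leading_zeros = len(data) - len(data.lstrip(b"\x00"))
    return "1" * leading_zeros + encoded


def hash160(data: bytes) -> bytes:
    # Assumption: RIPEMD160(SHA256(data)); needs ripemd160 support in hashlib.
    return hashlib.new("ripemd160", hashlib.sha256(data).digest()).digest()


def p2pkh_address(pubkey_hash: bytes) -> str:
    if len(pubkey_hash) != 20:
        raise ValueError("a PubkeyHash is 20 bytes long")
    payload = b"\x00" + pubkey_hash                          # version 00 in front
    checksum = hashlib.sha256(hashlib.sha256(payload).digest()).digest()[:4]
    return b58encode(payload + checksum)


# address of the all-zero PubkeyHash, used here only as a simple input
print(p2pkh_address(bytes(20)))
```

Because the version byte is 00, the encoded string always begins with 1, as in the example address above.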
Appendix B
Python code
This appendix shows the most relevant parts of the Python code made for this thesis.
Remark These scripts only work if they are placed in the repository of professor
Ferdinando M. Ametrano [13].
B.1.1 Type-1
This is the script used to generate private and public keys, using the first type of
deterministic derivation.
B.1.2 Type-2
This is the script used to generate private and public keys, using the second type of
deterministic derivation.
import random
from hashlib import sha256

# order, G, pointAdd and pointMultiply are provided by the repository [13]

# secret master private key (a valid key lies between 1 and order - 1)
mp = random.randint(1, order - 1)
print('\nsecret master private key:\n', hex(mp), '\n')

# public random number
r = random.randint(0, order - 1)
print('public ephemeral key:\n', hex(r))

# Master PublicKey:
MP = pointMultiply(mp, G)
print('Master Public Key:\n', hex(MP[0]), '\n', hex(MP[1]), '\n')

# number of key pairs to generate
nKeys = 3
p = [0] * nKeys
P = [(0, 0)] * nKeys

# PubKeys can be calculated without using privKeys
for i in range(0, nKeys):
    # H(i|r)
    H_i_r = int(sha256((hex(i) + hex(r)).encode()).hexdigest(), 16) % order
    P[i] = pointAdd(MP, pointMultiply(H_i_r, G))

# check that PubKeys match with privKeys
for i in range(0, nKeys):
    # H(i|r)
    H_i_r = int(sha256((hex(i) + hex(r)).encode()).hexdigest(), 16) % order
    p[i] = (mp + H_i_r) % order
    assert P[i] == pointMultiply(p[i], G)
    print('prKey#', i, ':\n', hex(p[i]), sep='')
    print('PubKey#', i, ':\n', hex(P[i][0]), '\n', hex(P[i][1]), '\n', sep='')
# [...]
        assert key[0] in (2, 3)
    elif (vbytes in PRIVATE):
        assert key[0] == 0
    else:
        raise Exception("invalid key[0] prefix '%s'" % type(key[0]).__name__)
    assert int.from_bytes(key[1:33], 'big') < order, "invalid key"
    assert len(depth) == 1, "wrong length for depth"
    assert len(fingerprint) == 4, "wrong length for fingerprint"
    assert len(index) == 4, "wrong length for index"
    assert len(chain_code) == 32, "wrong length for chain_code"

def bip32_parse_xkey(xkey):
    decoded = b58decode_check(xkey)
    assert len(decoded) == 78, "wrong length for decoded xkey"
    info = {"vbytes": decoded[:4],
            "depth": decoded[4:5],
            "fingerprint": decoded[5:9],
            "index": decoded[9:13],
            "chain_code": decoded[13:45],
            "key": decoded[45:]
            }
    bip32_isvalid_xkey(info["vbytes"], info["depth"], info["fingerprint"],
                       info["index"], info["chain_code"], info["key"])
    return info

def bip32_compose_xkey(vbytes, depth, fingerprint, index, chain_code, key):
    bip32_isvalid_xkey(vbytes, depth, fingerprint, index, chain_code, key)
    xkey = vbytes + depth + fingerprint + index + chain_code + key
    return b58encode_check(xkey)

def bip32_xprvtoxpub(xprv):
    decoded = b58decode_check(xprv)
    assert decoded[45] == 0, "not a private key"
    p = int.from_bytes(decoded[46:], 'big')
    P = pointMultiply(p, G)
    # compressed public key: 02 if y is even, 03 otherwise
    P_bytes = (b'\x02' if (P[1] % 2 == 0) else b'\x03') + P[0].to_bytes(32, 'big')
    network = PRIVATE.index(decoded[:4])
    xpub = PUBLIC[network] + decoded[4:45] + P_bytes
    return b58encode_check(xpub)

def bip32_master_key(seed, seed_bytes, vbytes=PRIVATE[0]):
    hashValue = HMAC(b"Bitcoin seed", seed.to_bytes(seed_bytes, 'big'), sha512).digest()
    p_bytes = hashValue[:32]
    p = int(p_bytes.hex(), 16) % order
    p_bytes = b'\x00' + p.to_bytes(32, 'big')
    chain_code = hashValue[32:]
    xprv = bip32_compose_xkey(vbytes, b'\x00', b'\x00\x00\x00\x00',
                              b'\x00\x00\x00\x00', chain_code, p_bytes)
    return xprv

# Child Key Derivation
def bip32_ckd(extKey, child_index):
    parent = bip32_parse_xkey(extKey)
    depth = (int.from_bytes(parent["depth"], 'big') + 1).to_bytes(1, 'big')
    if parent["vbytes"] in PRIVATE:
        network = PRIVATE.index(parent["vbytes"])
# [...]
B.3.1 BIP 39
These are the functions used to generate a valid BIP 39 Mnemonic phrase and the
related seed.
import hmac
from hashlib import sha256, sha512
from pbkdf2 import PBKDF2  # third-party package, assumed from the call below

def from_entropy_to_mnemonic_int(entropy, ENT):
    entropy_bytes = entropy.to_bytes(ENT // 8, byteorder='big')
    checksum = sha256(entropy_bytes).digest()
    checksum_int = int.from_bytes(checksum, byteorder='big')
    # binary strings, left-padded with zeros to their full length
    checksum_bin = format(checksum_int, '0256b')
    entropy_bin = format(entropy, '0%db' % ENT)
    # entropy followed by the first ENT/32 bits of the checksum
    entropy_checked = entropy_bin + checksum_bin[:ENT // 32]
    assert len(entropy_checked) % 11 == 0
    number_mnemonic = len(entropy_checked) // 11
    mnemonic_int = [0] * number_mnemonic
    for i in range(0, number_mnemonic):
        mnemonic_int[i] = int(entropy_checked[i * 11:(i + 1) * 11], 2)
    return mnemonic_int

def from_mnemonic_int_to_mnemonic(mnemonic_int, dictionary_txt):
    # one word per line in the dictionary file
    with open(dictionary_txt, 'r') as f:
        dictionary = f.read().split()
    return ' '.join(dictionary[j] for j in mnemonic_int)

def generate_mnemonic_bip39(entropy, number_words=24, dictionary='English_dictionary.txt'):
    ENT = number_words * 32 // 3
    mnemonic_int = from_entropy_to_mnemonic_int(entropy, ENT)
    mnemonic = from_mnemonic_int_to_mnemonic(mnemonic_int, dictionary)
    return mnemonic

def from_mnemonic_to_seed(mnemonic, passphrase=''):
    PBKDF2_ROUNDS = 2048
    return PBKDF2(mnemonic, 'mnemonic' + passphrase, iterations=PBKDF2_ROUNDS,
                  macmodule=hmac, digestmodule=sha512).read(64).hex()
B.3.2 Electrum
These are the functions used to generate a valid Electrum Mnemonic phrase for a
chosen version and the related seed.
Bibliography
[13] Ferdinando M. Ametrano, Material for the Bitcoin & Blockchain Technology course,
https://github.com/fametrano/BitcoinBlockchainTechnology