Cryptographic Hash Functions
1 author:
Wahome Macharia
University of London
All content following this page was uploaded by Wahome Macharia on 25 May 2021.
Kelvin W. Macharia
kwm14@student.london.ac.uk
1. Introduction
A hash function is a mathematical algorithm that takes data of arbitrary length as input and
maps it to a fixed-length bit string as output. This output is called a message digest, a
hash value, a hash code or simply a hash.
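As a minimal sketch of this definition, the snippet below hashes inputs of very different lengths with SHA-256 from Python's standard hashlib module. The choice of SHA-256 and the helper name digest_hex are assumptions made for illustration only.

```python
import hashlib

def digest_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as a hexadecimal string."""
    return hashlib.sha256(data).hexdigest()

# Inputs of very different lengths
short_digest = digest_hex(b"")
long_digest = digest_hex(b"a" * 1_000_000)

# Both digests are 256 bits long, i.e. 64 hexadecimal characters
print(len(short_digest), len(long_digest))
```

Whatever the input length, the digest length is fixed by the hash function.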
A hash function that satisfies some additional requirements so that it can be used for
cryptographic applications is known as a cryptographic hash function. These functions are
essential constructs that have a variety of use cases. The main fields of their application are
protection of stored passwords, message authentication, digital signatures, and therefore
certificates.
Cryptographic hash functions are broadly classified into two classes: unkeyed hash
functions, also known as Manipulation Detection Codes (MDCs), with a single parameter
(an input message), and keyed hash functions, also known as Message Authentication
Codes (MACs), with two distinct inputs (an input message and a secret key). Generally, the
term hash functions refers to unkeyed hash functions [8].
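The difference between the two classes can be sketched with Python's standard library. HMAC is used here as one example of a keyed hash function. The message, the key values and the helper name verify are hypothetical.

```python
import hashlib
import hmac

message = b"transfer 100 to account 42"
secret_key = b"hypothetical-shared-key"  # illustrative value only

# Unkeyed (MDC): anyone can recompute the digest from the message alone
mdc = hashlib.sha256(message).hexdigest()

# Keyed (MAC): the tag depends on both the message and the secret key
mac = hmac.new(secret_key, message, hashlib.sha256).hexdigest()
mac_other_key = hmac.new(b"another-key", message, hashlib.sha256).hexdigest()

def verify(key: bytes, msg: bytes, tag: str) -> bool:
    """Recompute the tag and compare it in constant time."""
    expected = hmac.new(key, msg, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, tag)

print(verify(secret_key, message, mac))      # correct key
print(verify(b"another-key", message, mac))  # wrong key
```

Only a party holding the secret key can produce or verify a valid tag, whereas the unkeyed digest can be recomputed by anyone.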
▪ The SHA (Secure Hash Algorithm) family - published by the National Institute
of Standards and Technology (NIST) as a U.S. Federal Information Processing
Standard (FIPS) [2]. This family designates six different hash functions: SHA-0, SHA-
1, SHA-224, SHA-256, SHA-384, and SHA-512 [2, 3]. The first four operate on 512-bit
message blocks divided into 32-bit words and the last two on 1024-bit blocks divided
into 64-bit words. Bitcoin, the original and largest cryptocurrency (at the time of
writing), uses the SHA-256 hash function.
▪ The MD (Message Digest) family - comprises MD2, MD4, MD5 and MD6,
authored by Ronald Rivest for RSA Security. Of these, MD5 is specified in
RFC 1321 [4].
▪ RIPEMD (RACE Integrity Primitives Evaluation Message Digest) – a family
of cryptographic hash functions developed by Hans Dobbertin, Antoon Bosselaers,
and Bart Preneel at the COSIC research group at the Katholieke Universiteit Leuven
and based upon the design principles used in MD4. RIPEMD-160 produces a hash
digest of 160 bits (20 bytes).
▪ Whirlpool – designed by Vincent Rijmen and Paulo S. L. M. Barreto, this hash
function is based on a substantially modified version of the Advanced Encryption
Standard (AES). Whirlpool produces a hash digest of 512 bits (64 bytes).
▪ BLAKE – a hash function submitted to the NIST hash function competition by Jean-
Philippe Aumasson, Luca Henzen, Willi Meier, and Raphael C.-W. Phan. It is based
on Dan Bernstein's ChaCha stream cipher, but a permuted copy of the input block,
XORed with round constants, is added before each ChaCha round.
▪ Curl-P – a hash function formerly used in the IOTA Signature Scheme (ISS). IOTA is a
cryptocurrency designed for use with the Internet of Things (IoT) and automotive
ecosystems. ISS is based on Winternitz One-Time Signatures but, unlike traditional
Winternitz, IOTA users sign the hash of a message. Thus, the security of ISS relies
on its cryptographic hash function, which was Curl-P-27.
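The digest and block sizes quoted for the SHA family can be checked with a short sketch using Python's hashlib. Only the SHA variants that hashlib guarantees on every platform are listed. SHA-0 is not provided by hashlib, and RIPEMD-160 and Whirlpool are omitted because their availability depends on the local OpenSSL build.

```python
import hashlib

# SHA variants that hashlib guarantees on every platform
SHA_NAMES = ("sha1", "sha224", "sha256", "sha384", "sha512")

# Map each name to (digest size in bits, block size in bits)
sizes = {}
for name in SHA_NAMES:
    h = hashlib.new(name)
    sizes[name] = (h.digest_size * 8, h.block_size * 8)

for name, (digest_bits, block_bits) in sizes.items():
    print(f"{name:7s} digest: {digest_bits:3d} bits   block: {block_bits:4d} bits")
```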
A cryptographic hash function is expected to have the following properties that guarantee its
effectiveness and security:
• Target collision resistance (2nd pre-image resistance) – this property
requires that given a hash function H and any input m, it should be computationally
infeasible to find another input m’ such that m’ ≠ m and H(m) = H(m’).
• Collision resistance - this property requires that given a hash function H, it should
be computationally infeasible to find two inputs m and m’ such that m ≠ m’ and H(m)
= H(m’). Due to the fixed size of hash values compared to the much larger – and
arbitrary – size of inputs, collisions are expected to exist in hash functions. However,
they must be computationally intractable to find.
• Deterministic – this property requires that a hash function H should consistently
map a given input m to a hash value h. It should also be public and computable.
• Avalanche effect – this property requires that a change in just one bit of the input
data should result in a large change in the output. This “diffusion” ensures that any
inference about the input from the output is infeasible; this property is therefore also
sometimes described as unruliness.
• Hash speed – an ideal property of a cryptographic hash function is its ability to
operate at a reasonable speed. In many situations, a hashing algorithm should
compute hash values rather quickly. However, it’s worthwhile to note that faster is
not always better or more secure.
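The deterministic and avalanche properties listed above can be observed with a small experiment. This is a sketch that assumes SHA-256 as the hash function. The helper names flip_bit and hamming_distance are hypothetical.

```python
import hashlib

def flip_bit(data: bytes, bit_index: int) -> bytes:
    """Return a copy of data with exactly one bit inverted."""
    buf = bytearray(data)
    buf[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(buf)

def hamming_distance(a: bytes, b: bytes) -> int:
    """Count the bit positions in which two equal-length byte strings differ."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))

message = b"cryptographic hash functions"
tweaked = flip_bit(message, 0)  # differs from message in a single bit

d1 = hashlib.sha256(message).digest()
d2 = hashlib.sha256(tweaked).digest()
changed = hamming_distance(d1, d2)

print(f"{changed} of 256 output bits changed")
# Deterministic: hashing the same input again gives the same digest
print(hashlib.sha256(message).digest() == d1)
```

For a function with good diffusion, roughly half of the output bits are expected to change.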
Attacking a cryptographic hash function implies breaking one of its security properties. For
example, breaking pre-image resistance means that an attacker can create a message that
hashes to a specific hash [5]. Attacks on hash functions may focus on either the structure of
the hash function or on the algorithm of the compression function used to condense
arbitrary size input into a fixed size hash value.
Over the years, a significant number of cryptographic hash functions have been broken and
proven to be vulnerable to security attacks. The main target of these attacks is the collision
resistance of hash functions. For instance, in August 2004 collisions were found in several
then-popular hash functions, including MD5 [6].
A hash function can be termed “broken” when at least one of its properties can be overcome
with fewer evaluations of the function than the brute force attack complexity estimated by
its designer, regardless of whether that effort is computationally feasible. For instance,
suppose that it requires $2^{90}$ evaluations to find a collision for a 256-bit hash function.
The hash function is broken because this is less than the $2^{128}$ evaluations required by
the Birthday attack, even though $2^{90}$ evaluations remain computationally
intractable [7].
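The arithmetic behind this example can be written out as follows, as a worked illustration for the hypothetical 256-bit hash function described above (decimal approximations are rounded):

```latex
\begin{aligned}
\text{generic birthday bound:}\quad & 2^{n/2} = 2^{256/2} = 2^{128},\\
\text{hypothetical attack:}\quad & 2^{90} \approx 1.2 \times 10^{27} \text{ evaluations},\\
\text{speed-up over brute force:}\quad & 2^{128} / 2^{90} = 2^{38} \approx 2.7 \times 10^{11}.
\end{aligned}
```

The attack is faster than the generic bound by a factor of about $2.7 \times 10^{11}$, yet the remaining work is still far beyond practical reach.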
Attacks on hash functions can be classified into two broad categories – Brute Force attacks
and Cryptanalytical attacks. Figure 1 below illustrates this classification:
The resistance of a hash function to Brute Force attacks depends solely on the size of
its output hash value, because such attacks treat the function as a black box. For a
hash output of length n bits, the effort required to mount the various classical brute
force attacks can be expressed as follows:
• One-way function inversion (pre-image resistance) attack – the
effort required to find an input m which maps to h by H, given a challenge h,
equals $2^n$ because, for a given n-bit hash h of the hash function H, an attacker
would evaluate H with every plausible input m until the desired output hash
value h is obtained.
• 2nd pre-image resistance attack – the effort required, for a given input m,
to find a second input m’ ≠ m that is mapped to the same output by H equals
$2^n$. In this variation of the brute force attack, an attacker would evaluate the
hash function H with every possible input m’ ≠ m until the hash value
h = H(m) is obtained.
• Collision attack – the effort required, for a given hash function H, to find
two inputs m and m’ such that m ≠ m’ and H(m) = H(m’) equals $2^{n/2}$. An
attacker who fixes one input and tries others until its hash value matches
would, on average, have to try $2^{n-1}$ (i.e., $2^n / 2$) inputs. However, in
what is referred to as a Birthday attack, which is based on the Birthday
Paradox, a chosen plain text attack is possible: the attacker is free to choose
both inputs, in which case the effort required for a collision in a hash function
is about $2^{n/2}$ as opposed to $2^{n-1}$ [9].
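The gap between these efforts can be observed on a deliberately weakened function. The sketch below assumes a toy hash, SHA-256 truncated to 16 bits, so that both searches finish quickly. The names toy_hash, second_preimage and birthday_collision are hypothetical and the candidate inputs are simply decimal counters.

```python
import hashlib
from itertools import count

N_BITS = 16  # toy output length; real hash functions use 160 bits or more

def toy_hash(data: bytes) -> int:
    """SHA-256 truncated to its first N_BITS bits (toy function for illustration)."""
    full = int.from_bytes(hashlib.sha256(data).digest(), "big")
    return full >> (256 - N_BITS)

def second_preimage(m: bytes) -> tuple[bytes, int]:
    """Try candidate inputs until one differs from m but hashes to H(m)."""
    target = toy_hash(m)
    for i in count():
        candidate = str(i).encode()
        if candidate != m and toy_hash(candidate) == target:
            return candidate, i + 1

def birthday_collision() -> tuple[bytes, bytes, int]:
    """Remember every hash value seen so far and stop at the first repeat."""
    seen: dict[int, bytes] = {}
    for i in count():
        candidate = str(i).encode()
        h = toy_hash(candidate)
        if h in seen:
            return seen[h], candidate, i + 1
        seen[h] = candidate

m2, tries_2nd = second_preimage(b"hello")
a, b, tries_coll = birthday_collision()

print(f"2nd pre-image after {tries_2nd} evaluations (2^n = {2 ** N_BITS})")
print(f"collision after {tries_coll} evaluations (2^(n/2) = {2 ** (N_BITS // 2)})")
```

The birthday search trades memory for time: it stores every hash value it has computed, which is what lets it stop after roughly $2^{n/2}$ evaluations.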
Further extensions of these classic Brute Force attacks have been studied by various
authors. These include the K-way collision attack for K ≥ 2, whose aim is to
find K different inputs $m_1, \dots, m_K$ such that $H(m_1) = \dots = H(m_K)$ [5], and the
K-way 2nd pre-image resistance attack for K ≥ 1 where, given an input m, a hash
value h, and a hash function H such that h = H(m), the aim is to find K different
inputs $m_i$ with $H(m_i) = h$ and $m_i \neq m$.
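A K-way collision search can be sketched by bucketing inputs by hash value until one bucket holds K entries. As before, this assumes a toy 16-bit truncation of SHA-256 so that the search terminates quickly. The names toy_hash and k_way_collision are hypothetical.

```python
import hashlib
from collections import defaultdict
from itertools import count

N_BITS = 16  # toy output length chosen so the search is fast

def toy_hash(data: bytes) -> int:
    """SHA-256 truncated to its first N_BITS bits (toy function for illustration)."""
    full = int.from_bytes(hashlib.sha256(data).digest(), "big")
    return full >> (256 - N_BITS)

def k_way_collision(k: int) -> list[bytes]:
    """Return k distinct inputs that all map to the same toy hash value."""
    buckets: dict[int, list[bytes]] = defaultdict(list)
    for i in count():
        m = str(i).encode()          # distinct counter values give distinct inputs
        bucket = buckets[toy_hash(m)]
        bucket.append(m)
        if len(bucket) == k:
            return bucket

triple = k_way_collision(3)
print(triple, {toy_hash(m) for m in triple})
```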
Fig. 2 General structure of most hash algorithms (IJERT [12])
The hash function takes an input message and partitions it into L fixed size blocks of
b bits each. The final block can be padded to b bits if necessary and may also include
the value of the total length of the input to the hash function which makes the job of
an attacker more difficult. They must either find two messages of equal length that
hash to the same value or two messages of differing lengths that, with each of their
respective lengths appended, hash to the same value [12].
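The partitioning and padding step can be sketched as follows. The layout used here (a single 1 bit, zero bits, then the message length as a 64-bit big-endian integer, with b = 512) follows the padding scheme of SHA-256 and is an assumption, since other hash functions use different block sizes and length encodings. The names md_pad and split_blocks are hypothetical.

```python
import struct

BLOCK_BYTES = 64  # b = 512 bits, as in SHA-256

def md_pad(message: bytes) -> bytes:
    """Pad message to a multiple of BLOCK_BYTES and append its bit length."""
    bit_length = len(message) * 8
    padded = message + b"\x80"                        # a single 1 bit, then zero bits
    padded += b"\x00" * ((56 - len(padded)) % BLOCK_BYTES)
    padded += struct.pack(">Q", bit_length)           # 64-bit big-endian length field
    return padded

def split_blocks(padded: bytes) -> list[bytes]:
    """Partition the padded message into L blocks of b bits each."""
    return [padded[i:i + BLOCK_BYTES] for i in range(0, len(padded), BLOCK_BYTES)]

blocks = split_blocks(md_pad(b"abc"))
print(len(blocks), blocks[-1][-8:].hex())
```

Because the length is part of the final block, two messages of different lengths never share the same padded form, even if one is a prefix of the other.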
In their respective papers, Damgård [10] and Merkle [11] provide theorems showing
that if there exists a fixed-length input collision resistant compression function
$f : \{0,1\}^b \times \{0,1\}^t \to \{0,1\}^t$, then one can design a variable-length input
collision resistant hash function $H : \{0,1\}^* \to \{0,1\}^t$ by iterating that
compression function [7]. In other
words, if the compression function is collision resistant, then so is the resultant
iterated hash function. If the compression function is vulnerable to an attack, the
iterated hash function may be vulnerable as well, but the converse of this result is not
necessarily true in general.
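The iteration itself can be written out as below. The symbols $IV$ (a fixed initial value), $m_i$ (the i-th b-bit block of the padded message) and $h_i$ (the t-bit chaining values) are notation assumed here for illustration and do not come from the cited theorems verbatim.

```latex
\begin{aligned}
M   &= m_1 \,\|\, m_2 \,\|\, \dots \,\|\, m_L, \qquad m_i \in \{0,1\}^b,\\
h_0 &= IV \in \{0,1\}^t,\\
h_i &= f(m_i, h_{i-1}), \qquad i = 1, \dots, L,\\
H(M) &= h_L.
\end{aligned}
```

The idea behind the theorems is a backward argument: if two distinct padded messages produce the same $h_L$, then walking back from the last block one must reach a step where two different inputs to $f$ give the same output, which is a collision in $f$.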
Generic attacks are those that work on a general hash function construction. These
attacks are applicable even if the underlying compression algorithm is replaced by
some abstract oracle [8]. For example, attacks on the Merkle-Damgård construction
that work on all hash functions designed using that approach are generic attacks.
Length extension attacks, Joux's multi-collision attacks, multi (2nd) pre-image attacks
such as the one based on fixed points, correcting block attacks, herding attacks, and
meet-in-the-middle attacks are examples of generic cryptanalytical attacks.
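A length extension attack illustrates why such attacks are generic: it uses only the iterated structure, not any weakness of the compression function. The sketch below assumes a toy Merkle-Damgård hash whose compression function is simply SHA-256 over the chaining value and the block. The all-zero initial value and the names compress, pad, iterate and toy_md are hypothetical and do not correspond to any real hash function.

```python
import hashlib
import struct

BLOCK = 64        # block size b in bytes
IV = bytes(32)    # hypothetical all-zero initial value

def compress(block: bytes, chaining: bytes) -> bytes:
    """Toy compression function f, treated as a black box by the attack."""
    return hashlib.sha256(chaining + block).digest()

def pad(total_len: int) -> bytes:
    """Padding for a message of total_len bytes: 0x80, zeros, 64-bit bit length."""
    zeros = (BLOCK - 9 - total_len) % BLOCK
    return b"\x80" + b"\x00" * zeros + struct.pack(">Q", total_len * 8)

def iterate(data: bytes, state: bytes) -> bytes:
    """Feed block-aligned data through f, starting from the given chaining value."""
    for i in range(0, len(data), BLOCK):
        state = compress(data[i:i + BLOCK], state)
    return state

def toy_md(message: bytes) -> bytes:
    return iterate(message + pad(len(message)), IV)

# Victim side: the message is secret, only its digest and length leak
secret = b"unknown-to-attacker"
digest = toy_md(secret)

# Attacker side: uses only digest and len(secret), never the secret itself
suffix = b";admin=true"
glue = pad(len(secret))
forged_len = len(secret) + len(glue) + len(suffix)
forged_digest = iterate(suffix + pad(forged_len), digest)

# Check against an honest computation over the extended message
expected = toy_md(secret + glue + suffix)
print(forged_digest == expected)
```

The attacker resumes the iteration from the published digest, which is exactly the chaining value after the last block of the padded secret.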
Specific attacks are those that work on a specific hash function or on the algorithm
underlying its compression function, typically by applying differential [19],
linear [20,21] or rotational [22] cryptanalysis. Examples are the collision attacks on
MD4 [13], MD5 [14,15], SHA-0 [16,17], and SHA-1 [16,18]. If the hash function’s underlying
compression function is implemented using an encryption algorithm, then
weaknesses in that encryption algorithm, such as a complementation property,
weak keys or fixed points, can be exploited to attack the hash function.
4. Conclusion
Cryptanalytical attacks on hash functions, just like with encryption algorithms, seek to
exploit some property of the algorithm to perform some attack other than an exhaustive
search.
Hash functions are, however, practically easier to attack than encryption algorithms because
the attacker does not need to assume any secrets and the maximum computational effort
required to attack the hash function is upper bounded only by the attacker's resources and
not by the user’s gullibility. This is not the case with block ciphers, where the maximum
practical count of executions of the block algorithm is limited by how much computational
effort the attacker can get the user to do [7].
References
[1] Rompay, B. V. (2004) “Analysis and Design of Cryptographic Hash functions, MAC
algorithms and Block Ciphers”. Ph.D. thesis, Leuven, Belgium: Electrical Engineering
Department, Katholieke Universiteit.
[2] National Institute of Standards and Technology (NIST). (1995) “Secure Hash Standard”.
FIPS Publication 180-1.
[3] National Institute of Standards and Technology (NIST). (2002) “Secure Hash Standard”.
FIPS Publication 180-2.
[4] Rivest, R. (1992) “The MD5 Message-Digest Algorithm”. Request for Comments: 1321,
Network Working Group.
[5] Lucks, S. (2004) “Design Principles for Iterated Hash Functions”. Cryptology ePrint
Archive, Report 2004/253, pp. 253. https://eprint.iacr.org/2004/253
[6] Wang, X. Feng, D. Lai, X. Yu, H. (2004) “Collisions for Hash Functions MD4, MD5,
HAVAL-128, and RIPEMD”. Jinan 250100, China: The School of Mathematics and System
Science, Shandong University. https://eprint.iacr.org/2004/199.pdf
[9] Bellare, M. Kohno, T. (2004) "Hash Function Balance and Its Impact on Birthday
Attacks". In EUROCRYPT, pp.401-418.
[10] Damgård, I. (1989) “A design principle for hash functions”. In Gilles Brassard, editor,
Advances in Cryptology: CRYPTO 89, volume 435 of Lecture Notes in Computer Science,
pages 416-427. Springer-Verlag.
[11] Merkle, R. (1989) “One Way Hash Functions and DES”. In Gilles Brassard, editor,
Advances in Cryptology: CRYPTO 89, volume 435 of Lecture Notes in Computer Science,
pages 428-446. Springer-Verlag.
[13] Wang, X. Lai, X. Feng, D. Chen, H. Yu, X. (2005) "Cryptanalysis of the Hash Functions
MD4 and RIPEMD". In EUROCRYPT, pp.1-18.
[14] Wang, X. Lai, X. Feng, D. Yu, X. (2004) "Collisions for Hash Functions MD4, MD5,
HAVAL-128 and RIPEMD". IACR Cryptology ePrint Archive, pp. 199.
[15] Wang, X. Yu, X. (2005) "How to Break MD5 and Other Hash Functions”. In
EUROCRYPT, pp. 19-35.
[17] Wang, X. Yu, H. Yin, Y. L. (2005) "Efficient Collision Search Attacks on SHA-0". In
CRYPTO, pp.1-16.
[18] Wang, X. Yin, Y. L. Yu, H. (2005) "Finding Collisions in the Full SHA-1". In CRYPTO,
pp.17-36.
[21] Matsui, M. (1993) “Linear Cryptanalysis methods for DES Cipher”. In EUROCRYPT, pp.
386-397.
[23] Muller F. (2004) “The MD2 Hash Function Is Not One-Way”. In: Lee P.J. (eds)
Advances in Cryptology - ASIACRYPT 2004. ASIACRYPT 2004. Lecture Notes in Computer
Science, vol 3329. Springer, Berlin, Heidelberg.
https://doi.org/10.1007/978-3-540-30539-2_16