Finding The Greedy, Prodigal, and Suicidal Contracts at Scale


Ivica Nikolić, School of Computing, NUS, Singapore
Aashish Kolluri, School of Computing, NUS, Singapore
Ilya Sergey, University College London, United Kingdom
Prateek Saxena, School of Computing, NUS, Singapore
Aquinas Hobor, Yale-NUS College and School of Computing, NUS, Singapore
arXiv:1802.06038v1 [cs.CR] 16 Feb 2018

Abstract

Smart contracts (stateful executable objects hosted on blockchains like Ethereum) carry billions of dollars worth of coins and cannot be updated once deployed. We present a new systematic characterization of a class of trace vulnerabilities, which result from analyzing multiple invocations of a contract over its lifetime. We focus attention on three example properties of such trace vulnerabilities: finding contracts that either lock funds indefinitely, leak them carelessly to arbitrary users, or can be killed by anyone. We implemented MAIAN, the first tool for precisely specifying and reasoning about trace properties, which employs inter-procedural symbolic analysis and a concrete validator for exhibiting real exploits. Our analysis of nearly one million contracts flags 34,200 (2,365 distinct) contracts as vulnerable, in 10 seconds per contract. On a subset of 3,759 contracts which we sampled for concrete validation and manual analysis, we reproduce real exploits at a true positive rate of 89%, yielding exploits for 3,686 contracts. Our tool finds exploits for the infamous Parity bug that indirectly locked 200 million dollars worth of Ether, which previous analyses failed to capture.

1 Introduction

Cryptocurrencies feature a distributed protocol for a set of computers to agree on the state of a public ledger called the blockchain. Prototypically, these distributed ledgers map accounts or addresses (the public half of a cryptographic key pair) to quantities of virtual “coins”. Miners, or the computing nodes, facilitate recording the state of a payment network, encoding transactions that transfer coins from one address to another. A significant number of blockchain protocols now exist, and as of writing the market value of the associated coins is over $300 billion US, creating a lucrative attack target.

Smart contracts extend the idea of a blockchain to a compute platform for decentralized execution of general-purpose applications. Contracts are programs that run on blockchains: their code and state are stored on the ledger, and they can send and receive coins. Smart contracts have been popularized by the Ethereum blockchain. Recently, sophisticated applications of smart contracts have arisen, especially in the area of token management due to the development of the ERC20 token standard. This standard allows the uniform management of custom tokens, enabling, e.g., decentralized exchanges and complex wallets. Today, over a million smart contracts operate on the Ethereum network, and this count is growing.

Smart contracts offer a unique combination of security challenges. First, once deployed they cannot be upgraded or patched,1 unlike traditional consumer device software. Second, they are written in a new ecosystem of languages and runtime environments, the de facto standard for which is the Ethereum Virtual Machine and its programming language called Solidity. Contracts are relatively difficult to test, especially since their runtimes allow them to interact with other smart contracts and external off-chain services; they can be invoked repeatedly by transactions from a large number of users. Third, since coins on a blockchain often have significant value, attackers are highly incentivized to find and exploit bugs in contracts that process or hold them directly for profit. The attack on the DAO contract cost the Ethereum community $60 million US; and several more recent ones have had impact of a similar scale [1].

1 Other than by “hard forks”, which are essentially decisions of the community to change the protocol and are extremely rare.

In this work, we present a systematic characterization of a class of vulnerabilities that we call trace vulnerabilities. Unlike many previous works that have applied static and dynamic analyses to find bugs in contracts automatically [2–5], our work focuses on detecting vulnerabilities across a long sequence of invocations of a contract. We label vulnerable contracts with three categories (greedy, prodigal, and suicidal), which either lock funds indefinitely, leak them to arbitrary users, or
be susceptible to being killed by any user. Our precisely defined properties capture many well-known examples of anecdotal bugs [1, 6, 7], but broadly cover a class of examples that were not known in prior work or public reports. More importantly, our characterization allows us to concretely check for bugs by running the contract, which aids in determining confirmed true positives.

We build an analysis tool called MAIAN for finding these vulnerabilities directly from the bytecode of Ethereum smart contracts, without requiring source code access. In total, across the three categories of vulnerabilities, MAIAN has been used to analyze 970,898 contracts live on the public Ethereum blockchain. Our techniques are powerful enough to find the infamous Parity bug that indirectly caused 200 million dollars worth of Ether to be locked, which is not found by previous analyses. A total of 34,200 (2,365 distinct) contracts are flagged as potentially buggy, directly carrying the equivalent of millions of dollars worth of Ether. As in the case of the Parity bug, they may put a larger amount at risk, since contracts interact with one another. For 3,759 contracts we tried to concretely validate, MAIAN has found over 3,686 confirmed vulnerabilities with 89% true positive rate. All vulnerabilities are uncovered on average within 10 seconds of analysis per contract.

Contributions. We make the following contributions:
• We identify three classes of trace vulnerabilities, which can be captured as properties of execution traces (potentially infinite sequences of invocations of a contract). Previous techniques and tools [3] are not designed to find these bugs because they only model behavior for a single call to a contract.
• We provide formal high-order properties to check which admit a mechanized symbolic analysis procedure for detection. We fully implement MAIAN, a tool for symbolic analysis of smart contract bytecode (without access to source code).
• We test close to one million contracts, finding thousands of confirmed true positives within a few seconds of analysis time per contract. Testing trace properties with MAIAN is practical.

2 Problem

We define a new class of trace vulnerabilities, showing three specific examples of properties that can be checked in this broader class. We present our approach and tool to reason about the class of trace vulnerabilities.

2.1 Background on Smart Contracts

Smart contracts in Ethereum run on the Ethereum Virtual Machine (EVM), a stack-based execution runtime [8]. Different source languages compile to the EVM semantics, the predominant of them being Solidity [9]. A smart contract embodies the concept of an autonomous agent, identified by its program logic, its identifying address, and its associated balance in Ether. Contracts, like other addresses, can receive Ether from external agents, storing it in their balance field; they can also send Ether to other addresses via transactions. A smart contract is created by the owner who sends an initializing transaction, which contains the contract bytecode and has no specified recipient. Due to the persistent nature of the blockchain, once initialized, the contract code cannot be updated. Contracts live perpetually unless they are explicitly terminated by executing the SUICIDE bytecode instruction, after which they are no longer invocable and are called dead. When alive, contracts can be invoked many times. Each invocation is triggered by sending a transaction to the contract address, together with input data and a fee (known as gas) [8]. The mining network executes separate instances of the contract code and agrees on the outputs of the invocation via the standard blockchain consensus protocol, i.e., Nakamoto consensus [10, 11]. The result of the computation is replicated via the blockchain and grants a transaction fee to the miners as per block reward rates established periodically.

The EVM allows contract functions to have local state, while the contracts may have global variables stored on the blockchain. Contracts can invoke other contracts via message calls; outputs of these calls, considered to be a part of the same transaction, are returned to the caller during the runtime. Importantly, calls are also used to send Ether to other contracts and non-contract addresses. The balance of a contract can be read by anyone, but is only updated via calls from other contracts and externally initiated transactions.

Contracts can be executed repeatedly over their lifetime. A transaction can run one invocation of the contract, and an execution trace is a (possibly infinite) sequence of runs of a contract recorded on the blockchain. Our work shows the importance of reasoning about execution traces of contracts with a class of vulnerabilities that has not been addressed in prior works, and provides an automatic tool to detect these issues.

2.2 Contracts with Trace Vulnerabilities

While trace vulnerabilities are a broader class, we focus our attention on three example properties of contract traces to check. Specifically, we flag contracts which (a) can be killed by arbitrary addresses, (b) have no way to release Ether after a certain execution state, and (c) release Ether to arbitrary addresses carelessly.
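Properties (a) and (b) have coarse bytecode-level approximations that can serve as a first filter before any reasoning about traces. The sketch below is an illustration under stated assumptions and is not MAIAN's implementation: the function names are hypothetical, the opcode values are the standard EVM ones, the linear sweep assumes the input holds only code (trailing data or metadata would be misread as instructions), and CREATE, which can also endow a new contract with Ether, is ignored for brevity. A contract containing none of CALL, CALLCODE, DELEGATECALL, or SUICIDE lacks the instructions usually associated with releasing Ether, and a contract without SUICIDE cannot be killed at all.

```python
# Standard EVM opcode values (assumed to cover only the cases of interest here).
PUSH1, PUSH32 = 0x60, 0x7F
CALL, CALLCODE, DELEGATECALL, SUICIDE = 0xF1, 0xF2, 0xF4, 0xFF
ETHER_RELEASING = {CALL, CALLCODE, DELEGATECALL, SUICIDE}


def opcodes(bytecode: bytes):
    """Yield opcodes in order, skipping the immediate bytes of PUSH1..PUSH32."""
    pc = 0
    while pc < len(bytecode):
        op = bytecode[pc]
        yield op
        immediate = op - PUSH1 + 1 if PUSH1 <= op <= PUSH32 else 0
        pc += 1 + immediate


def lacks_ether_release(bytecode: bytes) -> bool:
    """True if no Ether-releasing opcode occurs (hypothetical greedy pre-filter)."""
    return ETHER_RELEASING.isdisjoint(opcodes(bytecode))


def may_contain_suicide(bytecode: bytes) -> bool:
    """True if SUICIDE occurs as an instruction (necessary for being killable)."""
    return SUICIDE in set(opcodes(bytecode))


# PUSH1 0xff; STOP: the byte 0xff is push data here, not a SUICIDE instruction.
print(lacks_ether_release(bytes.fromhex("60ff00")))
# CALLER; SUICIDE: the contract contains a kill instruction.
print(may_contain_suicide(bytes.fromhex("33ff")))
```

Such a filter only rules contracts in or out cheaply; deciding whether an arbitrary sender can actually reach the instruction still requires the trace analysis described in the rest of the paper.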

Note that any characterization of bugs must be taken with a grain of salt, since one can always argue that the exposed behavior embodies intent, as was debated in the case of the DAO bug [6]. Our characterization of vulnerabilities is based, in part, on anecdotal incidents reported publicly [6, 7, 12]. To the best of our knowledge, however, our characterization is the first to precisely define checkable properties of such incidents and measure their prevalence. Note that there are several valid reasons for contracts being killable, holding funds indefinitely under certain conditions, or giving them out to addresses not known at the time of deployment. For instance, a common security best practice is that when under attack, a contract should be killed and should return funds to a trusted address, such as that of the owner. Similarly, benign contracts such as bounties or games often hold funds for long periods of time (until a bounty is awarded) and release them to addresses that are not known statically. Our characterization admits these benign behaviors and flags egregious violations described next, for which we are unable to find justifiable intent.

Prodigal Contracts. Contracts often return funds to owners (when under attack), to addresses that have sent Ether to them in the past (e.g., in lotteries), or to addresses that exhibit a specific solution (e.g., in bounties). However, when a contract gives away Ether to an arbitrary address (one which is not an owner, has never deposited Ether in the contract, and has provided no data that is difficult to fabricate by an arbitrary observer), we deem this a vulnerability. We are interested in finding such contracts, which we call prodigal.

Consider the Bounty contract with code fragment given in Figure 1. This contract collects Ether from different sources and rewards bounty to a selected set of recipients. In the contract, the function payout sends to a list of recipients specified amounts of Ether. It is clear from the function definition that the recipients and the amounts are provided as inputs, and anybody can call the function (i.e., the function does not have restrictions on the sender). The message sender of the transaction is not checked; the only check is on the size of the lists. Therefore, any user can invoke this function with a list of recipients of her choice, and completely drain its Ether.

    function payout(address[] recipients,
                    uint256[] amounts) {
      require(recipients.length == amounts.length);
      for (uint i = 0; i < recipients.length; i++) {
        /* ... */
        recipients[i].send(amounts[i]);
      }
    }

Figure 1: Bounty contract; payout leaks Ether.

The above contract requires a single function invocation to leak its Ether. However, there are examples of contracts which need two or more invocations (calls with specific arguments) to cause a leak. Examples of such contracts are presented in Section 5.

Suicidal Contracts. A contract often enables a security fallback option of being killed by its owner (or trusted addresses) in emergency situations, such as when being drained of its Ether due to attacks, or when malfunctioning. However, if a contract can be killed by any arbitrary account, which would make it execute the SUICIDE instruction, we consider it vulnerable and call it suicidal.

     1  function initMultiowned(address[] _owners,
     2                          uint _required) {
     3    if (m_numOwners > 0) throw;
     4    m_numOwners = _owners.length + 1;
     5    m_owners[1] = uint(msg.sender);
     6    m_ownerIndex[uint(msg.sender)] = 1;
     7    m_required = _required;
     8    /* ... */
     9  }
    10
    11  function kill(address _to) {
    12    uint ownerIndex = m_ownerIndex[uint(msg.sender)];
    13    if (ownerIndex == 0) return;
    14    var pending = m_pending[sha3(msg.data)];
    15    if (pending.yetNeeded == 0) {
    16      pending.yetNeeded = m_required;
    17      pending.ownersDone = 0;
    18    }
    19    uint ownerIndexBit = 2**ownerIndex;
    20    if (pending.ownersDone & ownerIndexBit == 0) {
    21      if (pending.yetNeeded <= 1)
    22        suicide(_to);
    23      else {
    24        pending.yetNeeded--;
    25        pending.ownersDone |= ownerIndexBit;
    26      }
    27    }
    28  }

Figure 2: Simplified fragment of ParityWalletLibrary contract, which can be killed.

The recent Parity fiasco [1] is a concrete example of such a contract. A supposedly innocent Ethereum user [13] killed a library contract on which the main Parity contract relies, thus rendering the latter non-functional and locking all its Ether. To understand the suicidal side of the library contract, focus on the shortened code fragment of this contract given in Figure 2. To kill the contract, the user invokes two different functions: one to set the ownership,2 and one to actually kill the contract. That is, the user first calls initMultiowned, providing an empty array for _owners, and zero for _required. This effectively means that the contract has no owners and that nobody has to agree to execute a specific contract function. Then the user invokes the function kill. This function needs _required number of owners to agree to kill the contract, before the actual suicide command at line 22 is executed. However, since in the previous call to initMultiowned the value of _required was set to zero, suicide is executed, and thus the contract is killed.

2 The bug would have been prevented had the function initMultiowned been properly initialized by the authors.
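The two-call trace can be replayed on a toy model of Figure 2. The sketch below is a Python transliteration made under explicit assumptions, not the wallet's real code: the class and method names are hypothetical, sha3(msg.data) is replaced by a tuple key, and gas, storage layout, and the elided parts of the figure are ignored. It illustrates that kill has no effect for a stranger until that same stranger has called initMultiowned.

```python
class WalletLibrary:
    """Toy model of the Figure 2 fragment (hypothetical, for illustration only)."""

    def __init__(self, balance=0):
        self.num_owners = 0
        self.required = 0
        self.owner_index = {}   # address -> owner index; missing means 0 (not an owner)
        self.pending = {}       # operation key -> [yet_needed, owners_done]
        self.balance = balance
        self.alive = True

    def init_multiowned(self, sender, owners, required):
        if self.num_owners > 0:
            raise RuntimeError("already initialized")   # models `throw`
        self.num_owners = len(owners) + 1
        self.owner_index[sender] = 1                    # the caller becomes owner 1
        self.required = required

    def kill(self, sender, to):
        index = self.owner_index.get(sender, 0)
        if index == 0:
            return None                                 # not an owner: nothing happens
        key = ("kill", to)                              # stands in for sha3(msg.data)
        pending = self.pending.setdefault(key, [0, 0])
        if pending[0] == 0:
            pending[0], pending[1] = self.required, 0
        bit = 2 ** index
        if pending[1] & bit == 0:
            if pending[0] <= 1:
                self.alive = False                      # models suicide(_to)
                paid, self.balance = self.balance, 0
                return (to, paid)
            pending[0] -= 1
            pending[1] |= bit
        return None


lib = WalletLibrary(balance=100)
stranger = "0xstranger"                                 # hypothetical address
print(lib.kill(stranger, stranger), lib.alive)          # first attempt is ignored
lib.init_multiowned(stranger, [], 0)                    # invocation 1: take ownership
print(lib.kill(stranger, stranger), lib.alive)          # invocation 2: contract dies
```

The vulnerability is therefore invisible to any analysis that considers kill in isolation; it only appears when the state left behind by the first invocation is carried into the second.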

Greedy Contracts. We refer to contracts that remain alive and lock Ether indefinitely, allowing it to be released under no conditions, as greedy. In the example of the Parity contract, many other multisigWallet-like contracts which held Ether used functions from the Parity library contract to release funds to their users. After the Parity library contract was killed, the wallet contracts could no longer access the library, and thus became greedy. This vulnerability resulted in locking of $200M US worth of Ether indefinitely!

Greedy contracts can arise out of more direct errors as well. The most common such errors occur in contracts that accept Ether but either completely lack instructions that send Ether out (e.g. send, call, transfer), or have such instructions but cannot reach them. An example of a contract that lacks commands that release Ether, and that has already locked Ether, is given in Figure 3.

    contract AddressReg {
      address public owner;
      mapping(address => bool) isVerifiedMap;
      function setOwner(address _owner) {
        if (msg.sender == owner)
          owner = _owner;
      }
      function AddressReg() { owner = msg.sender; }
      function verify(address addr) {
        if (msg.sender == owner)
          isVerifiedMap[addr] = true;
      }
      function deverify(address addr) {
        if (msg.sender == owner)
          isVerifiedMap[addr] = false;
      }
      function hasPhysicalAddress(address addr)
          constant returns (bool) {
        return isVerifiedMap[addr];
      }
    }

Figure 3: AddressReg contract locks Ether.

Posthumous Contracts. When a contract is killed, its code and global variables are cleared from the blockchain, thus preventing any further execution of its code. However, all killed contracts continue to receive transactions. Although such transactions can no longer invoke the code of the contract, if Ether is sent along with them, it is added to the contract balance, and similarly to the above case, it is locked indefinitely. We call posthumous any killed contract, or any contract that contains no code but holds non-zero Ether. It is the onus of the sender to check if the contract is alive before sending Ether, and evidence shows that this is not always the case. Because posthumous contracts require no further static analysis beyond that for identifying suicidal contracts, we do not treat this as a separate class of bugs. We merely list all posthumous contracts on the live Ethereum blockchain we have found in Section 5.

2.3 Our Approach

Each run of the contract, called an invocation, may exercise an execution path in the contract code under a given input context. Note that prior works have considered bugs that are properties of one invocation, ignoring the chain of effects across a trace of invocations [2, 5, 14–17].

We develop a tool that uses systematic techniques to find contracts that violate specific properties of traces. The violations are either:
(a) of safety properties, asserting that there exists a trace from a specified blockchain state that causes the contract to violate certain conditions; and
(b) of liveness properties, asserting whether some actions cannot be taken in any execution starting from a specified blockchain state.

We formulate the three kinds of vulnerable contracts as these safety and liveness trace properties in Section 3. Our technique of finding vulnerabilities, implemented as a tool called MAIAN and described in Section 4, consists of two major components: symbolic analysis and concrete validation. The symbolic analysis component takes contract bytecode and analysis specifications as inputs. The specifications include the vulnerability category to search for and the depth of the search space, which we further refer to as invocation depth, along with a few other analysis parameters we outline in Section 4. To develop our symbolic analysis component, we implement a custom Ethereum Virtual Machine, which facilitates symbolic execution of contract bytecode [3]. With every contract candidate, our component runs possible execution traces symbolically, until it finds a trace which satisfies a set of predetermined properties. The input context to every execution trace is a set of symbolic variables. Once a contract is flagged, the component returns concrete values for these variables. Our final step is to run the contract concretely and validate the result for true positives; this step is implemented by our concrete validation component. The concrete validation component takes the inputs generated by the symbolic analysis component and checks the exploit of the contract on a private fork of the Ethereum blockchain. Essentially, it is a testbed environment used to confirm the correctness of the bugs. As a result, at the end of validation the candidate contract is determined to be a true or false positive, but the contract state on the main blockchain is not affected since no changes are committed to the official Ethereum blockchain.

Figure 4: MAIAN. [Diagram with the elements Bytecode, Analysis Specifications, Symbolic Analysis, Sample Exploit, Concrete Validation, and Result.]

3 Execution Model and Trace Properties

A life cycle of a smart contract can be represented by a sequence of the contract’s states, which describe the values of the contract’s fields, as well as its balance, interleaved with instructions and irreversible actions it performs modifying the global context of the blockchain, such as transferring Ether or committing suicide. One can consider a contract to be buggy with respect to a certain class of unwelcome high-level scenarios (e.g., “leaking” funds) if some of its finite execution traces fail to satisfy a certain condition. Trace properties characterised this way are traditionally qualified as trace-safety ones, meaning that “during a finite execution nothing bad happens”. Proving the absence of some other high-level bugs will, however, require establishing a statement of a different kind, namely, “something good must eventually happen”. Such properties are known as liveness ones and require reasoning about progress in executions. An example of such a property would be an assertion that a contract can always execute a finite number of steps in order to perform an action of interest, such as transferring money, in order to be considered non-greedy.

In this section, we formally define the execution model of Ethereum smart contracts, allowing one to pinpoint the vulnerabilities characterised in Section 2.2. The key idea of our bug-catching approach is to formulate the erroneous behaviours as predicates of observed contract traces, rather than individual configurations and instruction invocations, occurring in the process of an execution. By doing so, we are able to (a) capture the prodigal/suicidal contracts via conditions that relate the unwelcome agents gaining, at some point, access to a contract’s funds or suicide functionality by finding a way around a planned semantics, and (b) reason about repeating behavioural patterns in the contract life cycles, allowing us to detect greedy contracts.

3.1 EVM Semantics and Execution Traces

We begin with defining contract execution traces by adopting a low-level execution semantics of an EVM-like language in the form of an EtherLite-like calculus [2]. EtherLite implements a small-step stack machine, operating on top of a global configuration of the blockchain, which is used to retrieve contract codes and ascribe Ether balance to accounts, as well as manipulations with the local contract configuration. As customary in Ethereum, such an agent is represented by its address id, and might be a contract itself. For the purpose of this work, we simplify the semantics of EtherLite by eliding the executions resulting in exceptions, as reasoning about such is orthogonal to the properties of interest. Therefore, the configurations $\delta$ of the EtherLite abstract machine are defined as follows:

\[
\begin{array}{lrcl}
\text{Configuration} & \delta & \triangleq & \langle A, \sigma \rangle \\
\text{Execution stack} & A & \triangleq & \langle M, id, pc, s, m \rangle \cdot A \mid \varepsilon \\
\text{Message} & m & \triangleq & \{ sender \mapsto id;\ value : \mathbb{N};\ data \mapsto \ldots \} \\
\text{Blockchain state} & \sigma & \triangleq & id \mapsto \{ bal : \mathbb{N};\ code? \mapsto M;\ f? \mapsto v \}
\end{array}
\]

That is, a contract execution configuration consists of an activation record stack $A$ and a blockchain context $\sigma$. An activation record stack $A$ is a list of tuples $\langle M, id, pc, s, m \rangle$, where $id$ and $M$ are the address and the code of the contract currently being executed, $pc$ is a program counter pointing to the next instruction to be executed, $s$ is a local operand stack, and $m$ is the last message used to invoke the contract execution. Among other fields, $m$ stores the identity of the sender, the amount value of the Ether being transferred (represented as a natural number), as well as auxiliary fields (data) used to provide additional arguments for a contract call, which we will be omitting for the sake of brevity. Finally, a simplified context $\sigma$ of a blockchain is encoded as a finite partial mapping from an account $id$ to its balance and contract code $M$ and its mutable state, mapping the field names $f$ to the corresponding values,3 which both are optional (hence, marked with ?) and are only present for contract-storing blockchain records. We will further refer to the union of a contract’s field entries $f \mapsto v$ and its balance entry $bal \mapsto z$ as a contract state $\rho$.

Figure 5 presents selected rules for a smart contract execution in EtherLite.4 The rules for storing and loading values to/from a contract’s field $f$ are standard. Upon calling another account, a rule CALL is executed, which requires the amount of Ether $z$ to be transferred to be not larger than the contract $id$’s current balance, and changes the activation record stack and the global blockchain context accordingly. Finally, the rule SUICIDENONEMPTYSTACK provides the semantics for the SUICIDE instruction (for the case of a non-empty activation record stack), in which case all funds of the terminated contract $id$ are transferred to the caller’s $id'$.

3 For simplicity of presentation, we treat all contract state as persistent, eliding operations with auxiliary memory, such as MLOAD/MSTORE.
4 The remaining rules can be found in the work by Luu et al. [2].

\[
\textsc{SStore}\quad
\frac{M[pc] = \mathtt{SSTORE} \qquad \sigma' = \sigma[id][f \mapsto v]}
     {\langle \langle M, id, pc, f \cdot v \cdot s, m \rangle \cdot A, \sigma \rangle
      \xrightarrow{sstore(f, v)}
      \langle \langle M, id, pc + 1, s, m \rangle \cdot A, \sigma' \rangle}
\]

\[
\textsc{SLoad}\quad
\frac{M[pc] = \mathtt{SLOAD} \qquad v = \sigma[id][f]}
     {\langle \langle M, id, pc, f \cdot s, m \rangle \cdot A, \sigma \rangle
      \xrightarrow{sload(f, v)}
      \langle \langle M, id, pc + 1, v \cdot s, m \rangle \cdot A, \sigma \rangle}
\]

\[
\textsc{Call}\quad
\frac{\begin{array}{c}
        M[pc] = \mathtt{CALL} \qquad \sigma[id][bal] \geq z \\
        s = id' \cdot z \cdot args \cdot s' \qquad a = \langle M, id, pc + 1, s', m \rangle \\
        m' = \{ sender \mapsto id;\ value \mapsto z;\ data \mapsto args \} \qquad M' = \sigma[id'][code] \\
        \sigma' = \sigma[id][bal \mapsto \sigma[id][bal] - z] \qquad \sigma'' = \sigma'[id'][bal \mapsto \sigma'[id'][bal] + z]
      \end{array}}
     {\langle \langle M, id, pc, s, m \rangle \cdot A, \sigma \rangle
      \xrightarrow{call(id', m')}
      \langle \langle M', id', 0, \varepsilon, m' \rangle \cdot a \cdot A, \sigma'' \rangle}
\]

\[
\textsc{SuicideNonEmptyStack}\quad
\frac{\begin{array}{c}
        M[pc] = \mathtt{SUICIDE} \qquad s = id' \cdot s' \qquad a = \langle M', id', pc', s'', m' \rangle \\
        \sigma' = \sigma[id'][bal \mapsto (\sigma[id'][bal] + \sigma[id][bal])] \qquad \sigma'' = \sigma'[id][bal \mapsto 0]
      \end{array}}
     {\langle \langle M, id, pc, s, m \rangle \cdot a \cdot A, \sigma \rangle
      \xrightarrow{suicide(id')}
      \langle \langle M', id', pc', 1 \cdot s'', m' \rangle \cdot A, \sigma'' \rangle}
\]

Figure 5: Selected execution rules of EtherLite.

An important addition we made to the semantics of EtherLite are execution labels, which allow one to distinguish between specific transitions being taken, as well as their parameters, and are defined as follows:

\[
\ell \triangleq sstore(f, v) \mid sload(f, v) \mid call(id, m) \mid suicide(id) \mid \ldots
\]

For instance, a transition label of the form $call(id, m)$ captures the fact that a currently running contract has transferred control to another contract $id$, by sending it a message $m$, while the label $suicide(id)$ would mean a suicide of the current contract, with transfer of all of its funds to the account (a contract’s or not) $id$.

With the labelled operational semantics at hand, we can now provide a definition of partial contract execution traces as sequences of interleaved contract states $\rho_i$ and transition labels $\ell_j$ as follows:

Definition 3.1 (Projected contract trace). A partial projected trace $t = \hat{\tau}_{id}(\sigma, m)$ of a contract $id$ in an initial blockchain state $\sigma$ and an incoming message $m$ is defined as a sequence $[\langle \rho_0, \ell_0 \rangle, \ldots, \langle \rho_n, \ell_n \rangle]$, such that for every $i \in \{0 \ldots n\}$, $\rho_i = \sigma_i[id]|_{bal, f}$, where $\sigma_i$ is the blockchain state at the $i$th occurrence of a configuration of the form $\langle \langle \bullet, id, \bullet, \bullet, \bullet \rangle, \sigma_i \rangle$ in an execution sequence starting from the configuration $\langle \langle \sigma[id][code], id, 0, \varepsilon, m \rangle \cdot \varepsilon, \sigma \rangle$, and $\ell_i$ is a label of an immediate next transition.

In other words, $\hat{\tau}_{id}(\sigma, m)$ captures the states of a contract $id$, interleaved with the transitions taken “on its behalf” and represented by the corresponding labels, starting from the initial blockchain $\sigma$ and triggered by the message $m$. The notation $\sigma[id]|_{bal, f}$ stands for a projection to the corresponding components of the contract entry in $\sigma$. States and transitions of contracts other than $id$ that are involved in the same execution are, thus, ignored.

Given a (partial) projected trace $\hat{\tau}_{id}(\sigma, m)$, we say that it is complete if it corresponds to an execution whose last configuration is $\langle \varepsilon, \sigma' \rangle$ for some $\sigma'$. The following definition captures the behaviors of multiple subsequent transactions with respect to a contract of interest.

Definition 3.2 (Multi-transactional contract trace). A contract trace $t = \tau_{id}(\sigma, \overline{m_i})$, for a sequence of messages $\overline{m_i} = m_0, \ldots, m_n$, is a concatenation of single-transaction traces $\hat{\tau}_{id}(\sigma_i, m_i)$, where $\sigma_0 = \sigma$, $\sigma_{i+1}$ is a blockchain state at the end of an execution starting from a configuration $\langle \langle \sigma[id][code], id, 0, \varepsilon, m_i \rangle \cdot \varepsilon, \sigma_i \rangle$, and all traces $\hat{\tau}_{id}(\sigma_i, m_i)$ are complete for $i \in \{0, \ldots, n - 1\}$.

As stated, the definition does not require a trace to end with a complete execution at the last transaction. For convenience, we will refer to the last element of a trace $t$ by $last(t)$ and to its length as $length(t)$.

3.2 Characterising Safety Violations

The notion of contract traces allows us to formally capture the definitions of buggy behaviors, described previously in Section 2.2. First, we turn our attention to the prodigal/suicidal contracts, which can be uniformly captured by the following higher-order trace predicate.

Definition 3.3 (Leaky contracts). A contract with an address $id$ is considered to be leaky with respect to predicates $P$, $R$ and $Q$, and a blockchain state $\sigma$ (denoted $leaky_{P,R,Q}(id, \sigma)$) iff there exists a sequence of messages $\overline{m_i}$, such that for a trace $t = \tau_{id}(\sigma, \overline{m_i})$:
1. the precondition $P(\sigma[id][code], t_0, m_0)$ holds,
2. the side condition $R(t_i, m_0)$ holds for all $i < length(t)$,
3. the postcondition $Q(t_n, m_0)$ holds for $t_n = last(t)$.

Definition 3.3 of leaky contracts is relative with respect to a current state of a blockchain: a contract that is currently leaky may stop being such in the future. Also, notice that the “triggering” initial message $m_0$ serves as an argument for all three parameter predicates. We will now show how two behaviors observed earlier can be encoded via specific choices of $P$, $R$, and $Q$.5

5 In most of the cases, it is sufficient to take $R \triangleq True$, but in Section 6 we hint at certain properties that require a non-trivial side condition.

Prodigal contracts. A contract is considered prodigal if it sends Ether, immediately or after a series of transitions (possibly spanning multiple transactions), to an arbitrary sender. This intuition can be encoded via the following choice of $P$, $R$, and $Q$ for Definition 3.3:

\[
\begin{array}{rcl}
P(M, \langle \rho, \ell \rangle, m) & \triangleq & m[sender] \notin im(\rho) \wedge m[value] = 0 \\
R(\langle \rho, \ell \rangle, m) & \triangleq & True \\
Q(\langle \rho, \ell \rangle, m) & \triangleq & \ell = call(m[sender], m') \wedge m'[value] > 0 \\
 & & \vee\ \ell = delegatecall(m[sender]) \\
 & & \vee\ \ell = suicide(m[sender])
\end{array}
\]
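Read operationally, Definition 3.3 combines a check over one candidate trace with a search for a message sequence that produces such a trace. The sketch below covers only the check, using the prodigal choice of P, R, and Q; it is a hypothetical Python encoding in which a trace is a list of (state, label) pairs, and the names Label, Step, Message, and leaky_on_trace are illustrative assumptions rather than part of MAIAN. The existential search over message sequences is not modelled.

```python
from typing import Any, Dict, List, NamedTuple, Optional


class Label(NamedTuple):
    kind: str                     # "call", "delegatecall", "suicide", "sstore", ...
    target: Optional[str] = None  # recipient address, when the label has one
    value: int = 0                # Ether carried by a call


class Step(NamedTuple):
    state: Dict[str, Any]         # contract state rho: field values plus "bal"
    label: Label                  # label of the transition taken from that state


class Message(NamedTuple):
    sender: str
    value: int


def leaky_on_trace(P, R, Q, code, trace: List[Step], m0: Message) -> bool:
    """Conditions 1-3 of Definition 3.3 for one given, non-empty trace."""
    return (bool(trace)
            and P(code, trace[0], m0)
            and all(R(step, m0) for step in trace)
            and Q(trace[-1], m0))


def prodigal_P(code, step: Step, m: Message) -> bool:
    # The sender is not in the image of the state and contributes no Ether.
    return m.sender not in step.state.values() and m.value == 0


def prodigal_R(step: Step, m: Message) -> bool:
    return True


def prodigal_Q(step: Step, m: Message) -> bool:
    label = step.label
    if label.target != m.sender:
        return False
    return ((label.kind == "call" and label.value > 0)
            or label.kind in ("delegatecall", "suicide"))


# A one-step trace in the style of the Bounty contract of Figure 1.
attacker = Message(sender="0xattacker", value=0)      # hypothetical address
drain = [Step({"bal": 10}, Label("call", "0xattacker", 10))]
print(leaky_on_trace(prodigal_P, prodigal_R, prodigal_Q, b"", drain, attacker))
```

Swapping in different predicate functions is all that is needed to check another instance of the definition on the same trace representation, which mirrors the higher-order style of the formal definition.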

is triggered by a message m, whose sender does not appear in the contract’s state (m[sender] ∉ im(ρ)), i.e., it is not the owner, and the Ether payload of m is zero. To expose the erroneous behavior of the contract, the postcondition checks that the transition of a contract is such that it transfers funds or control (i.e., corresponds to CALL, DELEGATECALL or SUICIDE instructions [8]) with the recipient being the sender of the initial message. In the case of sending funds via CALL we also check that the amount being transferred is non-zero. In other words, the initial caller m[sender], unknown to the contract, got himself some funds without any monetary contribution! In principle, we could ensure minimality of a trace, subject to the property, by imposing a non-trivial side condition R, although this does not affect the class of contracts exposed by this definition.

Suicidal contracts. A definition of a suicidal contract is very similar to the one of a prodigal contract. It is delivered by the following choice of predicates:

$$
\begin{aligned}
P(M, \langle \rho, \ell \rangle, m) &\triangleq \mathtt{SUICIDE} \in M \wedge m[\mathit{sender}] \notin \mathrm{im}(\rho) \\
R(\langle \rho, \ell \rangle, m) &\triangleq \mathit{True} \\
Q(\langle \rho, \ell \rangle, m) &\triangleq \ell = \mathit{suicide}(m[\mathit{sender}])
\end{aligned}
$$

That is, a contract is suicidal if its code M contains the SUICIDE instruction and the corresponding transition can be triggered by a message sender that does not appear in the contract’s state at the moment of receiving the message, i.e., at the initial moment m[sender] ∉ im(ρ).

3.3 Characterising Liveness Violations

A contract is considered locking at a certain blockchain state σ, if any execution originating from σ prohibits certain transitions from being taken. Since disproving liveness properties of this kind with a finite counterexample is impossible in general, we formulate our definition as an under-approximation of the property of interest, considering only final traces up to a certain length:

Definition 3.4 (Locking contracts). A contract with an address id is considered to be locking with respect to predicates P and R, the transaction number k, and a blockchain state σ (denoted $\mathit{locking}_{P,R,k}(id, \sigma)$) iff for all sequences of messages $m_i$ of length less than or equal to k, the corresponding trace $t = \tau_{id}(\sigma, m_i)$ satisfies:
1. the precondition $P(\sigma[id][code], t_0, m_0)$,
2. the side condition $R(t_i, m_0)$ for all i ≤ length(t).

Notice that, unlike Definition 3.3, this Definition does not require a postcondition, as it is designed to under-approximate potentially infinite traces, up to a certain length k,⁶ so the “final state” is irrelevant.

⁶ We discuss viable choices of k in Section 5.

Greedy contracts. In order to specify a property asserting that in an interaction with up to k transactions, a contract does not allow its funds to be released, we instantiate the predicates from Definition 3.4 as follows:

$$
\begin{aligned}
P(M, \langle \rho, \ell \rangle, m) &\triangleq \rho[\mathit{bal}] > 0 \\
R(\langle \rho, \ell \rangle, m) &\triangleq \neg \left(
\begin{array}{l}
\ell = \mathit{call}(m[\mathit{sender}], m') \wedge m'[\mathit{value}] > 0 \\
\vee\ \ell = \mathit{delegatecall}(m[\mathit{sender}]) \\
\vee\ \ell = \mathit{suicide}(m[\mathit{sender}])
\end{array}
\right)
\end{aligned}
$$

Intuitively, the definition of a greedy contract is dual to the notion of a prodigal one, as witnessed by the above formulation: at any trace starting from an initial state, where the contract holds a non-zero balance, no transition transferring the corresponding funds (i.e., matched by the side condition R) can be taken, no matter what the sender’s identity is. That is, this definition covers the case of the contract’s owner as well: no one can withdraw any funds from the contract.

4 The Algorithm and the Tool

MAIAN is a symbolic analyzer for smart contract execution traces, for the properties defined in Section 3. It operates by taking as input a contract in its bytecode form and a concrete starting block value from the Ethereum blockchain as the input context, flagging contracts that are outlined in Section 2.2. When reasoning about contract traces, MAIAN follows the ETHERLITE rules, described in Section 3.1, executing them symbolically. During the execution, which starts from a contract state satisfying the precondition of the property of interest (cf. Definitions 3.3 and 3.4), it checks if there exists an execution trace which violates the property and a set of candidate values for input transactions that trigger the property violation. For the sake of tractability of the analysis, it does not keep track of the entire blockchain context σ (including the state of other contracts), treating only the contract’s transaction inputs and certain block parameters as symbolic. To reduce the number of false positives and confirm concrete exploits for vulnerabilities, MAIAN calls its concrete validation routine, which we outline in Section 4.2.

4.1 Symbolic Analysis

Our work concerns finding properties of traces that involve multiple invocations of a contract. We leverage static symbolic analysis to perform this step in a way that allows reasoning across contract calls and across multiple blocks. We start our analysis given a contract bytecode and a starting concrete context capturing values of the blockchain. MAIAN reasons about values read from

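The bounded flavour of Definition 3.4 with the greedy side condition can be sketched as an exhaustive check over all message sequences of length at most k. This is a toy model, not the tool's algorithm: the contract is a hypothetical `step` function returning a new state and a label, messages are plain strings, and the side condition is simplified to "the label is not one of the releasing kinds", ignoring recipients and call values.

```python
from itertools import product
from typing import Callable, Dict, List, Tuple

State = Dict[str, int]
Step = Callable[[State, str], Tuple[State, str]]  # (state, msg) -> (state', label)

RELEASING = {"call", "delegatecall", "suicide"}  # labels that move funds or control


def locking(step: Step, initial: State, messages: List[str], k: int) -> bool:
    """Bounded check: no trace of at most k messages takes a releasing step."""
    if initial["bal"] <= 0:  # precondition P: the contract holds some balance
        return False
    for n in range(1, k + 1):
        for seq in product(messages, repeat=n):
            state = dict(initial)
            for msg in seq:
                state, label = step(state, msg)
                if label in RELEASING:  # side condition R is violated
                    return False
    return True


def hoarder(state: State, msg: str) -> Tuple[State, str]:
    """Toy contract that never releases anything."""
    return state, "noop"


def two_step(state: State, msg: str) -> Tuple[State, str]:
    """Toy contract that releases funds only after 'unlock' then 'withdraw'."""
    if msg == "unlock":
        return {**state, "open": 1}, "noop"
    if msg == "withdraw" and state.get("open"):
        return state, "call"
    return state, "noop"


msgs = ["unlock", "withdraw"]
print(locking(hoarder, {"bal": 5}, msgs, 3))
print(locking(two_step, {"bal": 5}, msgs, 1), locking(two_step, {"bal": 5}, msgs, 2))
```

The `two_step` contract looks locking at k = 1 but not at k = 2, which is the sense in which a bounded check only under-approximates the liveness property.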
input transaction fields and block parameters⁷ in a symbolic way; specifically, it denotes the set of all concrete values that the input variable can take as a symbolic variable. It then symbolically interprets the relationship of other variables computed in the contract as a symbolic expression over symbolic variables. For instance, the code y := x + 4 results in a symbolic value for y if x is a symbolic expression; otherwise it is executed as a concrete value. Conceptually, one can imagine the analysis as maintaining two memories mapping variables to values: one is a symbolic memory mapping variables to their symbolic expressions, the other mapping variables to their concrete values.

⁷ Those being CALLVALUE, CALLER, NUMBER, TIMESTAMP, BLOCKHASH, BALANCE, ADDRESS, and ORIGIN.

Execution Path Search. The symbolic interpretation searches the space of all execution paths in a trace with a depth-first search. The search is a best effort to increase coverage and find property violating traces. Our goal is neither to be sound, i.e., search all possible paths at the expense of false positives, nor to be provably complete, i.e., have only true positives at the expense of coverage [18]. From a practical perspective, we make design choices that strike a balance between these two goals.

The symbolic execution starts from the entry point of the contract, and considers all functions which can be invoked externally as an entry point. More precisely, the symbolic execution starts at the first instruction in the bytecode, proceeding sequentially until the execution path ends in a terminating instruction. Such an instruction can be valid (e.g., STOP, RETURN), in which case we assume that we have reached the end of some contract function, and thus restart the symbolic execution again from the first bytecode instruction to simulate the next function call. On the other hand, the terminating instruction can be invalid (e.g., non-existing instruction code or invalid jump destination), in which case we terminate the search down this path and backtrack in the depth-first search procedure to try another path. When execution reaches a branch, MAIAN concretely evaluates the branch condition if all the variables used in the conditional expression are concrete. This uniquely determines the direction for continuing the symbolic execution. If the condition involves a symbolic expression, MAIAN queries an external SMT solver to check for the satisfiability of the symbolic conditional expression as well as its negation. Here, if the symbolic conditional expression as well as its negation are satisfiable, both branches are visited in the depth-first search; otherwise, only the satisfiable branch is explored in the depth-first search. On occasions, the satisfiability of the expression cannot be decided within a pre-defined timeout used by our tool; in such a case, we terminate the search down this path and backtrack in the depth-first search procedure to try another path. We maintain a symbolic path constraint which captures the conditions necessary to execute the path being analyzed in a standard way. MAIAN implements support for 121 out of the 133 bytecode instructions in Ethereum’s stack-based low-level language.

At a call instruction, control is transferred to the target. If the target of the transfer is a symbolic expression, MAIAN backtracks in its depth-first search. Calls outside a contract, however, are not simulated and returns are marked symbolic. Therefore, MAIAN’s depth-first search is inter-procedural, but not inter-contract.

Handling data accesses. The memory mappings, both symbolic and concrete, record all the contract memory as well as blockchain storage. During the symbolic interpretation, when a global or blockchain storage is accessed for the first time on a path, its concrete value is read from the main Ethereum blockchain into local mappings. This ensures that subsequent reads or writes are kept local to the path being presently explored.

The EVM supports a flat byte-addressable memory, and each address has a bit-width of 256 bits. The accesses are in 32-byte sized words which MAIAN encodes as bit-vector constraints to the SMT solver. Due to unavailability of source code, MAIAN does not have any prior information about higher-level datatypes in the memory. All types default to 256-bit integers in the encoding used by MAIAN. Furthermore, MAIAN attempts to recover more advanced types such as dynamic arrays by using the following heuristic: if a symbolic variable, say x, is used in constant arithmetic to create an expression (say x + 4) that loads from memory (as an argument to the CALLDATALOAD instruction), then it detects such an access as a dynamic memory array access. Here, MAIAN uses the SMT solver to generate k concrete values for the symbolic expression, making the optimistic assumption that the size of the array is an integer in the range [0, k]. The parameter k is configurable, and defaults to 2. Apart from this case, whenever accesses in the memory involve a symbolic address, MAIAN makes no attempt at alias analysis and simply terminates the path being searched and backtracks in its depth-first search.

Handling non-deterministic inputs. Contracts have several sources of non-deterministic inputs such as the block timestamp, etc. While these are treated as symbolic, these are not exactly under the control of the external users. MAIAN does not use their concrete values as it needs to reason about multiple invocations of the contract, i.e., at different blocks.

Flagging Violations. Finally, when the depth-first search in the space of the contract execution reaches a state where the desired property is violated, it flags the contract as a buggy candidate. The symbolic path constraint, along with the necessary property conditions, is then checked for satisfiability by the SMT solver.

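The branch handling and violation flagging described in this section can be illustrated with a deliberately small model. The sketch below is not MAIAN's implementation: the control-flow graph `CFG`, the single 8-bit symbolic input, the brute-force `solve` function standing in for an SMT solver, and the `max_steps` bound are all assumptions made for illustration.

```python
from typing import Callable, List, Optional

Pred = Callable[[int], bool]
DOMAIN = range(256)  # toy stand-in for 256-bit words


def solve(constraints: List[Pred]) -> Optional[int]:
    """Toy stand-in for an SMT solver: brute-force a witness over DOMAIN."""
    for x in DOMAIN:
        if all(c(x) for c in constraints):
            return x
    return None


# node -> ("branch", condition, then_node, else_node) or ("halt", tag)
CFG = {
    "entry": ("branch", lambda x: x > 200, "check", "stop"),
    "check": ("branch", lambda x: x % 2 == 0, "leak", "stop"),
    "leak": ("halt", "violation"),
    "stop": ("halt", "ok"),
}


def dfs(node: str, constraints: List[Pred], steps: int, max_steps: int = 60) -> Optional[int]:
    """Depth-first path search; returns a concrete input reaching a violation."""
    if steps > max_steps:
        return None  # bound exceeded: give up on this path and backtrack
    entry = CFG[node]
    if entry[0] == "halt":
        # At a violating state, ask the solver for concrete input values.
        return solve(constraints) if entry[1] == "violation" else None
    _, cond, then_node, else_node = entry
    negated = lambda x, c=cond: not c(x)
    for target, c in ((then_node, cond), (else_node, negated)):
        extended = constraints + [c]
        if solve(extended) is not None:  # follow only satisfiable branches
            witness = dfs(target, extended, steps + 1, max_steps)
            if witness is not None:
                return witness
    return None


witness = dfs("entry", [], 0)
print(witness)
```

Each branch extends the path constraint with either the condition or its negation, and only satisfiable extensions are explored; the value returned at the violating node plays the role of the concrete transaction input handed to validation.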
We use Z3 [19] as our solver, which provides concrete values that make the input formula satisfiable. We use these values as the concrete data for our symbolic inputs, including the symbolic transaction data.

Bounding the path search space. MAIAN takes the following steps to bound the search in the (potentially infinite) path space. First, the call depth is limited to the constant called max_call_depth, which defaults to 3 but can be configured for empirical tests. Second, we limit the total number of jumps or control transfers on one path explored to a configurable constant max_cfg_nodes, default set to 60. This is necessary to avoid being stuck in loops, for instance. Third, we set a timeout of 10 seconds per call to our SMT solver. Lastly, the total time spent on a contract is limited to a configurable constant max_analysis_time, default set to 300 seconds.

Pruning. To speed up the state search, we implement pruning with memoization. Whenever the search encounters a particular configuration (i.e., contract storage, memory, and stack) that has been seen before, it does not further explore that part of the path space.

4.2 Concrete Validation

In the concrete validation step, MAIAN creates a private fork of the original Ethereum blockchain with the last block as the input context. It then runs the contract with the concrete values of the transactions generated by the symbolic analysis to check if the property holds in the concrete execution. If the concrete execution fails to exhibit a violation of the trace property, we mark the contract as a false positive; otherwise, the contract is marked as a true positive. To implement the validating framework, we added a new functionality to the official go-ethereum package [20] which allows us to fork the Ethereum main chain at a block height of our choice. Once we fork the main chain, we mine on that fork without connecting to any peers on the Ethereum network, and thus we are able to mine our own transactions without committing them to the main chain.

Prodigal Contracts. The validation framework checks if a contract indeed leaks Ether by sending to it the transactions with inputs provided by the symbolic analysis engine. The transactions are sent by one of our accounts created previously. Once the transactions are executed, the validation framework checks whether the contract has sent Ether to our account. If a contract being verified does not have Ether, our framework first sends Ether to the contract and only then runs the exploit.

Suicidal Contracts. In a similar fashion, the framework checks if a contract can be killed after executing the transactions provided by the symbolic analysis engine on the forked chain. Note, once a contract is killed, its bytecode is reset to ’0x’. Our framework uses precisely this test to confirm the correctness of the exploit.

Greedy Contracts. A strategy similar to the above two cannot be used to validate the exploits on contracts that lock Ether. However, during the bug finding process, our symbolic execution engine checks firsthand whether a contract accepts Ether. The validation framework can, thus, check if a contract is a true positive by confirming that it accepts Ether and does not have CALL, DELEGATECALL, or SUICIDE opcodes in its bytecode. In Section 5 we give examples of such contracts.

5 Evaluation

We analyzed 970,898 smart contracts, obtained by downloading the Ethereum blockchain from the first block until block number 4,799,998, which is the last block as of December 26, 2017. The Ethereum blockchain has only contract bytecodes. To obtain the original (Solidity) source codes, we refer to the Etherscan service [21] and obtain source for 9,825 contracts. Only around 1% of the contracts have source code, highlighting the utility of MAIAN as a bytecode analyzer.

Recall that our concrete validation component can analyze a contract from a particular block height where the contract is alive (i.e., initialized, but not killed). To simplify our validation process for a large number of contracts flagged by the symbolic analysis component, we perform our concrete validation at block height of 4,499,451, further denoted as BH. At this block height, we find that most of the flagged contracts are alive, including the Parity library contract [1] that our tool successfully finds. This contract was killed at a block height of 4,501,969. All contracts existing on the blockchain at a block height of 4,499,451 are tested, but only contracts that are alive at BH are concretely validated.⁸

⁸ We also concretely validate the flagged candidates which were killed before BH as well.

Experimental Setup and Performance. MAIAN supports parallel analysis of contracts, and scales linearly in the number of available cores. We run it on a Linux box, with 64-bit Ubuntu 16.04.3 LTS, 64GB RAM and 40 CPUs Intel(R) Xeon(R) E5-2680 v2@2.80GHz. In most of our experiments we run the tool on 32 cores. On average, MAIAN requires around 10.0 seconds to analyze a contract for the three aforementioned bugs: 5.5 seconds to check if a contract is prodigal, 3.2 seconds for suicidal, and 1.3 seconds for greedy.

Contract Characteristics. The number of contracts has increased tenfold from Dec, 2016 to Dec, 2017 and 176-fold since Dec, 2015. However, the distribution of Ether balance across contracts follows a skewed distribution. Less than 1% of the contracts have more than 99% of the Ether in the ecosystem. This suggests that a vulnerability in any one of these high-profile contracts can affect a

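The search bounds and the memoized pruning of Section 4.1 can be summarised in a few lines. The sketch below is an illustration under assumptions: the default values and the names max_call_depth, max_cfg_nodes and max_analysis_time come from the text, while `solver_timeout`, the `Pruner` class, and the choice to treat the limits as inclusive are hypothetical.

```python
from dataclasses import dataclass
from typing import Set, Tuple


@dataclass(frozen=True)
class Bounds:
    max_call_depth: int = 3       # default quoted in the text
    max_cfg_nodes: int = 60       # jumps or control transfers per path
    solver_timeout: int = 10      # seconds per solver call (hypothetical name)
    max_analysis_time: int = 300  # seconds per contract


class Pruner:
    """Remembers configurations (storage, memory, stack) already explored."""

    def __init__(self) -> None:
        self._seen: Set[Tuple] = set()

    def should_explore(self, storage: dict, memory: bytes, stack: list) -> bool:
        key = (frozenset(storage.items()), bytes(memory), tuple(stack))
        if key in self._seen:
            return False  # seen before: prune this part of the path space
        self._seen.add(key)
        return True


def within_bounds(b: Bounds, call_depth: int, cfg_nodes: int, elapsed: float) -> bool:
    """True while a path respects the depth, node and time limits."""
    return (call_depth <= b.max_call_depth
            and cfg_nodes <= b.max_cfg_nodes
            and elapsed <= b.max_analysis_time)


pruner = Pruner()
print(pruner.should_explore({"a": 1}, b"\x00", [1, 2]))
print(pruner.should_explore({"a": 1}, b"\x00", [1, 2]))
print(within_bounds(Bounds(), call_depth=4, cfg_nodes=1, elapsed=0.0))
```

Hashing the whole configuration keeps the lookup cheap but only prunes exact repeats; two configurations that differ in any storage slot, memory byte or stack entry are still explored separately.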
large fraction of the entire Ether balance. Note that contracts interact with each other, therefore, a vulnerability in one contract may affect many others holding Ether, as demonstrated by the recent infamous Parity library which was used by wallet contracts with $200 million US worth of Ether [1].

5.1 Results

Table 1 summarizes the contracts flagged by MAIAN. Given the large number of flagged contracts, we select a random subset for concrete validation, and report on the true positive rates obtained. We report the number of distinct contracts, calculated by comparing the hash of the bytecode; however, all percentages are calculated on the original number of contracts (with duplicates).

Category    #Candidates flagged (distinct)    Candidates without source    #Validated    % of true positives
Prodigal    1,504 (438)                       1,487                        1,253         97
Suicidal    1,495 (403)                       1,487                        1,423         99
Greedy      31,201 (1,524)                    31,045                       1,083         69
Total       34,200 (2,365)                    34,019                       3,759         89

Table 1: Final results using invocation depth 3 at block height BH. Column 1 reports number of flagged contracts, and the distinct among these. Column 2 shows the number of flagged which have no source code. Column 3 is the subset we sampled for concrete validation. Column 4 reports true positive rates; the total here is the average TP rate weighted by the number of validated contracts.

Prodigal contracts. Our tool has flagged 1,504 candidate contracts (438 distinct) which may leak Ether to an arbitrary Ethereum address, with a true positive rate of around 97%. At block height BH, 46 of these contracts hold some Ether. The concrete validation described in Section 4.2 succeeds for exploits for 37 out of 46; these are true positives, whereas 7 are false positives. The remaining 2 contracts leak Ether to an address different from the caller’s address. Note that all of the 37 true positive contracts are alive as of this writing. For ethical reasons, no exploits were done on the main blockchain.

Of the remaining 1,458 contracts which presently do not have Ether on the public Ethereum blockchain, 24 have been killed and 42 have not been published (as of block height BH). To validate the remaining alive contracts (in total 1,392) on a private fork, first we send them Ether from our mining account, and find that 1,183 contracts can receive Ether.⁹ We then concretely validate whether these contracts leak Ether to an arbitrary address. A total of 1,156 out of 1,183 (97.72%) contracts are confirmed to be true positives; 27 (2.28%) are false positives.

⁹ These are live and we could update them with funds in testing.

For each of the 24 contracts killed by the block height BH, the concrete validation proceeds as follows. We create a private test fork of the blockchain, starting from a snapshot at a block height where the contract is alive. We send Ether to the contract from one of our addresses, and check if the contract leaks Ether to an arbitrary address. We repeat this procedure for each contract, and find that all 24 candidate contracts are true positives.

Suicidal contracts. MAIAN flags 1,495 contracts (403 distinct), including the ParityWalletLibrary contract, as found susceptible to being killed by an arbitrary address, with a nearly 99% true positive rate. Out of 1,495 contracts, 1,398 are alive at BH. Our concrete validation engine on a private fork of Ethereum confirms that 1,385 contracts (or 99.07%) are true positives, i.e., they can be killed by any arbitrary Ethereum account, while 13 contracts (or 0.93%) are false positives. The list of true positives includes the recent ParityWalletLibrary contract which was killed at block height 4,501,969 by an arbitrary account. Of the 1,495 contracts flagged, 25 have been killed by BH; we repeat the procedure described previously and confirm all of them as true positives.

Greedy contracts. Our tool flags 31,201 greedy candidates (1,524 distinct), which amounts to around 3.2% of the contracts present on the blockchain. The first observation is that MAIAN deems all but these as accepting Ether but having states that release them (not locking indefinitely). To validate a candidate contract as a true positive one has to show that the contract does not release/send Ether to any address for any valid trace. However, concrete validation may not cover all possible traces, and thus it cannot be used to confirm if a contract is greedy. Therefore, we take a different strategy and divide them into two categories:

(i) Contracts that accept Ether, but in their bytecode do not have any of the instructions that release Ether (such instructions include CALL, SUICIDE, or DELEGATECALL).
(ii) Contracts that accept Ether, and in their bytecode have at least one of CALL, SUICIDE or DELEGATECALL.

MAIAN flagged 1,058 distinct contracts from the first category. We validate that these contracts can receive Ether (we send Ether to them in a transaction with input data according to the one provided by the symbolic execution routine). Our experiments show that 1,057 out of 1,058 (i.e., 99.9%) can receive Ether and thus are true positives. On the other hand, the tool flagged 466 distinct contracts from the second category, which are harder to confirm by testing alone. We resort to manual analysis for a subset of these which have source code. Among these, only 25 have Solidity source code. With manual inspection we find that none of them are true positives: some traces can reach the CALL code, but MAIAN failed to reach it in its path exploration. The reasons for these are mentioned in Section 5.3. By extrapolation (weighted average across 1,083 validated), we obtain a true positive rate among greedy contracts of 69%.

Posthumous Contracts. Recall that posthumous contracts are contracts that are dead on the blockchain (have been killed) but still have non-zero Ether balance. We can find such contracts by querying the blockchain, i.e., by collecting all contracts without executable code, but with non-zero balance. We found 853 contracts at a block height of 4,799,998 that do not have any compiled code on the blockchain but have positive Ether balance. Interestingly, among these, 294 contracts have received Ether after they became dead.

5.2 Case Studies: True Positives

Apart from examples presented in Section 2.2, we now present true and false positive case studies. Note that we only present the contracts with source code for readability. However, the fraction of flagged contracts with source codes is very low (1%).

Prodigal contracts. In Figure 6, we give an example of a prodigal contract. The function tap seems to lock Ether because the condition at line 4, semantically, can never be true. However, the compiler optimization of Solidity allows this condition to pass when an input greater than 20 bytes is used to call the function tap. Note, on a bytecode level, the EVM can only load chunks of 32 bytes of input data. At line 3 in tap the first 20 bytes of nickname are assigned to the global variable prev, while neglecting the remaining 12 bytes. The error occurs because the EVM, at line 4, correctly nullifies the 12 bytes in prev, but not in nickname. Thus if nickname has non-zero values in these 12 bytes then the inequality is true. This contract so far has lost 5.0001 Ether to different addresses on the real Ethereum blockchain.

    1 bytes20 prev;
    2 function tap(bytes20 nickname) {
    3   prev = nickname;
    4   if (prev != nickname) {
    5     msg.sender.send(this.balance);
    6   }
    7 }

Figure 6: A prodigal contract.

A contract may also leak Ether by getting killed since the semantics of the SUICIDE instruction force it to send all of its balance to an address provided to the instruction. In Figure 7, the contract Thing [22] is inherited from a base contract Mortal. The contract implements a review system in which the public reviews an ongoing topic. Among others, the contract has a kill function inherited from its base contract which is used to send its balance to its owner if it is killed. The function mortal, supposedly a constructor, is misspelled, and thus anyone can call mortal to become the owner of the contract. Since the derived contract Thing inherits functions from contract Mortal, this vulnerability in the base contract allows an arbitrary Ethereum account to become the owner of the derived contract, to kill it, and to receive its Ether.

    contract Mortal {
      address public owner;
      function mortal() {
        owner = msg.sender;
      }
      function kill() {
        if (msg.sender == owner) {
          suicide(owner);
        }
      }
    }
    contract Thing is Mortal { /* ... */ }

Figure 7: The prodigal contract Thing, derived from Mortal, leaks Ether to any address by getting killed.

Suicidal contracts. A contract can be killed by exploiting an unprotected SUICIDE instruction. A trivial example is a public kill function which hosts the suicide instruction. Sometimes, SUICIDE is protected by a weak condition, such as in the contract Dividend given in Figure 8. This contract allows users to buy shares or withdraw their investment. The logic of withdrawing investment is implemented by the withdraw function. However, this function has a selfdestruct instruction which can be executed once the last investment has been made more than 4 weeks ago. Hence, if an investor calls this function after 4 weeks of the last investment, all the funds go to the owner of the contract and all the records of investors are cleared from the blockchain. Though the Ether is safe with the owner, there would be no record of any investment for the owner to return Ether to investors.

    function withdraw() public returns (uint) {
      Record storage rec = records[msg.sender];
      uint balance = rec.balance;
      if (balance > 0) {
        rec.balance = 0;
        msg.sender.transfer(balance);
        Withdrawn(now, msg.sender, balance);
      }
      if (now - lastInvestmentTime > 4 weeks) {
        selfdestruct(funder);
      }
      return balance; }

Figure 8: The Dividend contract can be killed by invoking withdraw if the last investment has been made at least 4 weeks ago.

In the previous example, one invocation of the withdraw function was sufficient to kill the contract. There are, however, contracts which require two or more function invocations to be killed. For instance, the contract Mortal given in Figure 7 checks whether it is the owner that calls the kill function. Hence, it requires an attacker to become the owner of the contract to kill it. So, this contract requires two invocations to be killed: one call to the function mortal used to become an owner of the contract and one call to the function kill to kill the contract. A more secure contract would turn the mortal function into a constructor so that the function is called only once when the contract is deployed. Note, the recent Parity bug similarly also requires two invocations [1].

Greedy contracts. The contract SimpleStorage, given in Figure 9, is an example of a contract that locks Ether indefinitely. When an arbitrary address sends Ether along with a transaction invoking the set function, the contract balance increases by the amount of Ether sent. However, the contract does not have any instruction to release Ether, and thus locks it on the blockchain.

    contract SimpleStorage {
      uint storedData; address storedAddress;
      event flag(uint val, address addr);

      function set(uint x, address y) {
        storedData = x; storedAddress = y;
      }
      function get() constant
        returns (uint retVal, address retAddr) {
        return (storedData, storedAddress);
      }
    }

Figure 9: A contract that locks Ether.

The payable keyword has been introduced in Solidity recently to prevent functions from accepting Ether by default, i.e., a function not associated with the payable keyword throws if Ether is sent in a transaction. However, although this contract does not have any function associated with the payable keyword, it accepts Ether since it had been compiled with an older version of the Solidity compiler (with no support for payable).

5.3 Case Studies: False Positives

We manually analyze cases where MAIAN’s concrete validation fails to trigger the necessary violation with the produced concrete values, if source code is available.

Prodigal and Suicidal contracts. In both of the classes, false positives arise due to two reasons:

(i) Our tool performs inter-procedural analysis within a contract, but does not transfer control in cross-contract calls. For calls from one contract to a function of another contract, MAIAN assigns symbolic variables to the return values. This is imprecise, because real executions may only return one value (say true) when the call succeeds.
(ii) MAIAN may assign values to symbolic variables related to block state (e.g., timestamp and blocknumber) in cases where these values are used to decide the control flow. Thus, we may get false positives because those values may be different at the concrete validation stage. For instance, in Figure 11, the _guess value depends on the values of block parameters, which cannot be forced to take on the concrete values found by our analyzer.

    function confirmTransaction(uint tId)
        ownerExists(msg.sender) {
      confirmations[tId][msg.sender] = true;
      executeTransaction(tId);
    }
    function executeTransaction(uint tId) {
      // In case of majority
      if (isConfirmed(tId)) {
        Transaction tx = transactions[tId];
        tx.executed = true;
        if (tx.destination.call.value(tx.value)(tx.data))
          /* .... */
    }}

Figure 10: False positive, flagged as a greedy contract.

    1 function RandomNumber() returns (uint) {
    2   /* .... */
    3   last = seed ^ (uint(sha3(block.blockhash(
    4     block.number), nonces[msg.sender])) * 0x000b0007000500030001);
    5 }
    6 function Guess(uint _guess) returns (bool) {
    7   if (RandomNumber() == _guess) {
    8     if (!msg.sender.send(this.balance)) throw;
    9     /* .... */
    10  } /* .... */ }

Figure 11: False positive, flagged as a prodigal contract.

Greedy contracts. The large share of false positives is attributed to three causes:

(i) Detecting a trace which leads to release of Ether may need three or more function invocations. For instance, in Figure 10, the function confirmTransaction has to be executed by the majority of owners for the contract to execute the transaction. Our default invocation depth is the reason for missing a possible reachable state.
(ii) Our tool is not able to recover the subtype for the generic bytes type in the EVM semantics.
(iii) Some contracts release funds only if a random number (usually generated using transaction and block parameters) matches a predetermined value, unlike in the case of the contract in Figure 11. In that contract the variable _guess is also a symbolic variable, hence, the solver can find a solution for the condition on line 7. If there is a concrete value in place of _guess, the solver times out since the constraint involves a hash function (hard to invert by the SMT solver).

5.4 Summary and Observations

The symbolic execution engine of MAIAN flags 34,200 contracts. With the concrete validation engine or manual inspection, we have confirmed that around 97% of prodigal, 99% of suicidal and 69% of greedy contracts are true positives. The importance of analyzing the bytecode of the contracts, rather than Solidity source code, is demonstrated by the fact that only 1% of all contracts have source code. Further, among all flagged contracts, only 181 have verified source codes according to the widely

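The percentages quoted in Section 5.1 can be re-derived from the reported counts. The short check below uses only numbers stated in the text and in Table 1; the variable names are illustrative, and the weighted total uses the rounded per-category rates, so it only approximates how the Total row was obtained.

```python
# Counts and rates as reported in Table 1.
validated = {"prodigal": 1253, "suicidal": 1423, "greedy": 1083}
tp_rate = {"prodigal": 0.97, "suicidal": 0.99, "greedy": 0.69}

# Average true positive rate weighted by the number of validated contracts.
total_validated = sum(validated.values())
weighted_tp = sum(validated[c] * tp_rate[c] for c in validated) / total_validated

# Rates quoted in the running text of Section 5.1.
prodigal_private_fork = round(100 * 1156 / 1183, 2)  # confirmed prodigal on the fork
suicidal_alive = round(100 * 1385 / 1398, 2)         # confirmed suicidal alive at BH
greedy_share = round(100 * 31201 / 970898, 1)        # greedy candidates among all

print(total_validated, round(100 * weighted_tp, 1))
print(prodigal_private_fork, suicidal_alive, greedy_share)
```

With the rounded rates the weighted average comes out slightly below 90%, which is in line with the 89 reported in the Total row.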
used platform Etherscan, or in percentages only 1.06%, 0.47% and 0.49%, in the three categories of prodigal, suicidal, and greedy, respectively. We refer the reader to Table 1 for the exact summary of these results.

Furthermore, the maximal amount of Ether that could have been withdrawn from prodigal and suicidal contracts, before the block height BH, is nearly 4,905 Ether, or 5.9 million US dollars¹⁰ according to the exchange rate at the time of this writing. In addition, 6,239 Ether (7.5 million US dollars) is locked inside posthumous contracts currently on the blockchain, of which 313 Ether (379,940 US dollars) have been sent to dead contracts after they have been killed.

Inv. depth    Prodigal    Suicidal    Greedy
1             131         127         682
2             156         141         682
3             157         141         682
4             157         141         682

Table 2: The table shows number of contracts flagged for various invocation depths. This analysis is done on a random subset of 25,000–100,000 contracts.

Finally, the analysis given in Table 2 shows the number of flagged contracts for different invocation depths from 1 to 4. We tested 25,000 contracts for greedy, and 100,000 for the remaining categories, inferring that increasing depth improves results marginally, and an invo-

Reasoning about smart contracts. OYENTE [2, 3] was the first symbolic execution-based tool that provided analysis targeting several specific issues: (a) mishandled exceptions, (b) transaction-ordering dependence, (c) timestamp dependence and (d) reentrancy [29], thus remedying the corner cases of Solidity/EVM semantics (a) as well as some programming anti-patterns (b)–(d).

Other tools for symbolic analysis of EVM and/or EVM have been developed more recently: MANTICORE [17], MYTHRILL [15, 16], SECURIFY [5], and KEVM [30, 31], all focusing on detecting low-level safety violations and vulnerabilities, such as integer overflows, reentrancy, and unhandled exceptions, etc., none of them requiring reasoning about contract execution traces. A very recent work by Grossman et al. [32], similar to ours in spirit and providing a dynamic analysis of execution traces, focuses exclusively on detecting non-callback-free contracts (i.e., prone to reentrancy attacks), a vulnerability that is by now well studied.

Concurrently with our work, Kalra et al. developed ZEUS [4], a framework for automated verification of smart contracts using abstract interpretation and symbolic model checking, accepting user-provided policies to verify for. Unlike MAIAN, ZEUS conducts policy checking at a level of LLVM-like intermediate representation of a contract, obtained from Solidity code, and leverages a suite of standard tools, such as off-the-shelf
cation depth of 3 is an optimal tradeoff point. constraint and SMT solvers [19, 33, 34]. Z EUS does not
provide a general framework for checking trace proper-
ties, or under-approximating liveness properties.
6 Related Work
Various versions of EVM semantics [8] were imple-
mented in Coq [35], Isabelle/HOL [36, 37], F? [38],
Dichotomy of smart contract bugs. The early work by
Idris [39], and Why3 [40, 41], followed by subsequent
Delmolino et al. [24] distinguishes the following classes
mechanised contract verification efforts. However, none
of problems: (a) contracts that do not refund their users,
of those efforts considered trace properties in the spirit
(b) missing encryptions of sensitive user data and (c) lack
of what we defined in Section 3.
of incentives for the users to take certain actions. The
Several contract languages were proposed recently
property (a) is the closest to our notion of greedy. While
that distinguish between global actions (e.g., sending
that outlines the problem and demonstrates it on series
Ether or terminating a contract) and instructions for ordi-
of simple examples taught in a class, they do not provide
nary computations [42,43], for the sake of simplified rea-
a systematic approach for detection of smart contracts
soning about contract executions. For instance, the work
prone to this issue. Later works on contract safety and
on the contract language S CILLA [43] shows how to en-
security identify potential bugs, related to the concurrent
code in Coq [44] and formally prove a property, which is
transactional executions [25], mishandled exceptions [2],
very similar to a contract being non-leaky, as per Defini-
overly extensive gas consumption [14] and implementa-
tion 3.3 instantiated with a non-trivial side condition R.
tions of fraudulent financial schemes [26].11
In contrast to all those work, which focus on bad im- 7 Conclusion
plementation practices or misused language semantics,
we believe, our characterisation of several classes of con- We characterize vulnerabilities in smart contracts that
tract bugs, such as greedy, prodigal, etc, is novel, as they are checkable as properties of an entire execution trace
are stated in terms of properties execution traces rather (possibly infinite sequence of their invocations). We
than particular instructions taken/states reached. show three examples of such trace vulnerabilities, lead-
ing to greedy, prodigal and suicidal contracts. Analyzing
10 Calculated at 1, 210 USD/Eth [23]. 970, 898 contracts, our new tool M AIAN flags thousands
11 See the works [27, 28] for a survey of known contract issues. of contracts vulnerable at a high true positive rate.
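
As a quick check of the invocation-depth observation in Section 5.4, the following Python sketch recomputes how many additional contracts each extra invocation adds. The counts are copied from Table 2; the names TABLE_2 and marginal_gains are hypothetical and are not part of MAIAN.

```python
# Flagged-contract counts copied from Table 2, keyed by invocation depth.
# The dictionary layout and all names here are illustrative assumptions.
TABLE_2 = {
    1: {"prodigal": 131, "suicidal": 127, "greedy": 682},
    2: {"prodigal": 156, "suicidal": 141, "greedy": 682},
    3: {"prodigal": 157, "suicidal": 141, "greedy": 682},
    4: {"prodigal": 157, "suicidal": 141, "greedy": 682},
}


def marginal_gains(table):
    """For each depth after the first, return the number of additional
    contracts flagged per category relative to the previous depth."""
    depths = sorted(table)
    gains = {}
    for prev, cur in zip(depths, depths[1:]):
        gains[cur] = {
            category: table[cur][category] - table[prev][category]
            for category in table[cur]
        }
    return gains


if __name__ == "__main__":
    for depth, gain in marginal_gains(TABLE_2).items():
        print(f"depth {depth}: {gain}")
```

On these figures, depth 2 adds 25 prodigal and 14 suicidal contracts, depth 3 adds a single prodigal contract, and depth 4 adds nothing, which is consistent with the choice of depth 3 as the tradeoff point.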

References

[1] A. Akentiev, “Parity multisig github.” [Online]. Available: https://github.com/paritytech/parity/issues/6995
[2] L. Luu, D. Chu, H. Olickel, P. Saxena, and A. Hobor, “Making smart contracts smarter,” in CCS. ACM, 2016, pp. 254–269.
[3] “Oyente: An Analysis Tool for Smart Contracts,” 2018. [Online]. Available: https://github.com/melonproject/oyente
[4] S. Kalra, S. Goel, M. Dhawan, and S. Sharma, “Zeus: Analyzing safety of smart contracts,” in NDSS, 2018, to appear.
[5] “Securify: Formal Verification of Ethereum Smart Contracts,” 2018. [Online]. Available: http://securify.ch/
[6] M. del Castillo, “The DAO Attacked: Code Issue Leads to $60 Million Ether Theft,” June 17, 2016.
[7] “Governmental’s 1100eth jackpot payout is stuck because it uses too much gas.” [Online]. Available: https://www.reddit.com/r/ethereum/comments/4ghzhv/
[8] G. Wood, “Ethereum: A secure decentralised generalised transaction ledger.” [Online]. Available: https://ethereum.github.io/yellowpaper/paper.pdf
[9] Solidity: High-Level Language for Implementing Smart Contracts. [Online]. Available: http://solidity.readthedocs.io/
[10] S. Nakamoto, “Bitcoin: A peer-to-peer electronic cash system,” 2008. [Online]. Available: http://bitcoin.org/bitcoin.pdf
[11] G. Pîrlea and I. Sergey, “Mechanising blockchain consensus,” in CPP. ACM, 2018, pp. 78–90.
[12] J. Alois, “Ethereum Parity Hack May Impact ETH 500,000 or $146 Million,” 2017.
[13] “The guy who blew up parity didn’t know what he was doing.” [Online]. Available: https://www.reddit.com/r/CryptoCurrency/comments/7beos3/
[14] T. Chen, X. Li, X. Luo, and X. Zhang, “Under-optimized smart contracts devour your money,” in IEEE 24th International Conference on Software Analysis, Evolution and Reengineering, SANER, 2017, pp. 442–446.
[15] “Mythril,” 2018. [Online]. Available: https://github.com/b-mueller/mythril/
[16] B. Mueller, “How Formal Verification Can Ensure Flawless Smart Contracts,” January 2018. [Online]. Available: https://goo.gl/9wUFE1
[17] “Manticore,” 2018. [Online]. Available: https://github.com/trailofbits/manticore
[18] P. Godefroid, “Higher-order test generation,” in Proceedings of the 32nd ACM SIGPLAN Conference on Programming Language Design and Implementation, ser. PLDI ’11, 2011.
[19] L. M. de Moura and N. Bjørner, “Z3: an efficient SMT solver,” in TACAS, ser. LNCS, vol. 4963. Springer, 2008, pp. 337–340.
[20] “Go-ethereum.” [Online]. Available: https://github.com/ethereum/go-ethereum
[21] “Etherscan verified source codes.” [Online]. Available: https://etherscan.io/contractsVerified
[22] “Contract mortal.” [Online]. Available: https://etherscan.io/address/0x4671ebe586199456ca28ac050cc9473cbac829eb#code
[23] “Etherscan.” [Online]. Available: https://etherscan.io/
[24] K. Delmolino, M. Arnett, A. E. Kosba, A. Miller, and E. Shi, “Step by step towards creating a safe smart contract: Lessons and insights from a cryptocurrency lab,” in FC 2016 International Workshops, ser. LNCS, vol. 9604. Springer, 2016, pp. 79–94.
[25] I. Sergey and A. Hobor, “A Concurrent Perspective on Smart Contracts,” in 1st Workshop on Trusted Smart Contracts, ser. LNCS, vol. 10323. Springer, 2017, pp. 478–493.
[26] M. Bartoletti, S. Carta, T. Cimoli, and R. Saia, “Dissecting ponzi schemes on ethereum: identification, analysis, and impact,” CoRR, vol. abs/1703.03779, 2017.
[27] N. Atzei, M. Bartoletti, and T. Cimoli, “A Survey of Attacks on Ethereum Smart Contracts (SoK),” in POST, ser. LNCS, vol. 10204. Springer, 2017, pp. 164–186.
[28] ConsenSys Diligence, “Ethereum Smart Contract Security Best Practices,” 2018. [Online]. Available: https://consensys.github.io/smart-contract-best-practices
[29] E. G. Sirer, “Reentrancy Woes in Smart Contracts.” [Online]. Available: http://hackingdistributed.com/2016/07/13/reentrancy-woes/
[30] E. Hildenbrandt, M. Saxena, X. Zhu, N. Rodrigues, P. Daian, D. Guth, and G. Rosu, “KEVM: A Complete Semantics of the Ethereum Virtual Machine,” Tech. Rep., 2017.
[31] G. Rosu, “ERC20-K: Formal Executable Specification of ERC20,” December 2017. [Online]. Available: https://runtimeverification.com/blog/?p=496
[32] S. Grossman, I. Abraham, G. Golan-Gueta, Y. Michalevsky, N. Rinetzky, M. Sagiv, and Y. Zohar, “Online detection of effectively callback free objects with applications to smart contracts,” PACMPL, vol. 2, no. POPL, pp. 48:1–48:28, 2018.
[33] A. Gurfinkel, T. Kahsai, A. Komuravelli, and J. A. Navas, “The SeaHorn Verification Framework,” in CAV, Part I, ser. LNCS, vol. 9206. Springer, 2015, pp. 343–361.
[34] K. L. McMillan, “Interpolants and Symbolic Model Checking,” in VMCAI, ser. LNCS, vol. 4349. Springer, 2007, pp. 89–90.
[35] Y. Hirai, “Ethereum Virtual Machine for Coq (v0.0.2),” Published online on 5 March 2017. [Online]. Available: https://goo.gl/DxYFwK
[36] Y. Hirai, “Defining the Ethereum Virtual Machine for Interactive Theorem Provers,” in 1st Workshop on Trusted Smart Contracts, ser. LNCS, vol. 10323. Springer, 2017, pp. 520–535.
[37] S. Amani, M. Bégel, M. Bortin, and M. Staples, “Towards Verifying Ethereum Smart Contract Bytecode in Isabelle/HOL,” in CPP. ACM, 2018, pp. 66–77.
[38] K. Bhargavan, A. Delignat-Lavaud, C. Fournet, A. Gollamudi, G. Gonthier, N. Kobeissi, N. Kulatova, A. Rastogi, T. Sibut-Pinote, N. Swamy, and S. Zanella-Béguelin, “Formal verification of smart contracts: Short paper,” in PLAS. ACM, 2016, pp. 91–96.
[39] J. Pettersson and R. Edström, “Safer Smart Contracts through Type-Driven Development,” Master’s thesis, Chalmers University of Technology, Sweden, 2016.
[40] C. Reitwiessner, “Formal verification for solidity contracts,” 2015. [Online]. Available: https://forum.ethereum.org/discussion/3779/formal-verification-for-solidity-contracts
[41] J. Filliâtre and A. Paskevich, “Why3 - Where Programs Meet Provers,” in ESOP, ser. LNCS, vol. 7792. Springer, 2013, pp. 125–128.
[42] “Bamboo,” 2018. [Online]. Available: https://github.com/pirapira/bamboo
[43] I. Sergey, A. Kumar, and A. Hobor, “Scilla: a smart contract intermediate-level language,” CoRR, vol. abs/1801.00687, 2018.
[44] Coq Development Team, The Coq Proof Assistant Reference Manual - Version 8.7, 2018. [Online]. Available: http://coq.inria.fr/
