
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI
10.1109/ACCESS.2020.3006383, IEEE Access

Digital Object Identifier 10.1109/ACCESS.2020.3006383

A graph model based blockchain implementation for increasing performance
and security in decentralized ledger systems
Konstantinos Tsoulias1, Georgios Palaiokrassas1, Georgios Fragkos2, Antonios Litke1 and Theodora Varvarigou1
1 National Technical University of Athens, Zografou Campus, 15773 Athens, Greece
2 Dept. of Electrical and Computer Engineering, University of New Mexico, Albuquerque, NM, 87131, USA
Corresponding author: Georgios Palaiokrassas (e-mail: geopal@mail.ntua.gr).
The research leading to these results has received funding from the European Commission under the H2020 Programme’s project
M-Sec (grant agreement nr. 814917).

ABSTRACT Blockchains have recently been used as a supporting technology framework
for decentralized applications requiring functionalities such as exchange of value through
tokens, cryptocurrency and smart contracts. In this paper, we have developed a
decentralized application model in Python, where blockchain data are stored in a Neo4j
graph database. Following the basic principles of the Ethereum blockchain network, we
implemented a Casper-like consensus mechanism and tested its effectiveness in achieving
finality. For block proposing, we employed both Proof of Work and Proof of Stake
protocols and examined how participants' incentives and consensus criteria differ
under each one. A major part of this work is to incorporate the graph model in the
functionality of the blockchain and its components, while also exploiting its benefits in
data analysis by finding relationships between data and extracting their true value.
Through this approach, we were able to monitor and visualize changes in blockchain data
in various use case scenarios. Lastly, we ran a series of simulated experiments to test the
efficiency of the implemented technologies and mechanisms in preventing the most
common blockchain attacks, such as the 51% Attack, Catastrophic Crashes and the Attack
from dynamic validator sets. We show how modelling the blockchain data as a
distributed graph can assist protocol operations, enhance their security, and facilitate the
application of analytical methods to the stored information through path-dependent
queries.

INDEX TERMS Blockchains, Proof of Work, Proof of Stake, consensus mechanisms,
graph databases, blockchain security, Casper, Neo4j.

This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/.

I. INTRODUCTION
Blockchains are regarded as both public and private ledgers containing transactional data within
their decentralized data structures, which form a series of tightly connected, timestamped blocks
[1]. Their unique architecture makes blockchain systems immutable in the sense that transactions
cannot be tampered with once they are officially validated and registered in a block of the chain
(a series of connected blocks that starts on the Genesis Block). Based on cryptographic proof,
blockchain technology abolishes the need for a trusted third party, enabling reliable and robust
decentralized applications, implemented on open and trustless networks of peers [2]. Blockchains
have been used as the underlying technology for many cryptocoins and tokens [3], setting the ground
for disrupting the future Internet [4] as well as the traditional business model by providing new
means for exchanging value. Thus, the research on various features of blockchains has become
very important in order to be able to enhance the technological framework with characteristics
that can pave the way for a wider adoption of blockchain technology. One such feature is finality,
which needs to be achieved through consensus protocols [5], assuring that cryptocurrency
transactions cannot be changed, reversed, or canceled after being published in the blockchain.
Finality in Bitcoin's [6]

blockchain is achieved with the Proof of Work (PoW) protocol that requires users' CPU power to
link new blocks of transactions to the existing blockchain, thereby forming a continuous record
that cannot be altered without redoing all the work. In the case of a fork, this process, also
known as mining [7], encourages users to always mine on top of the longest chain, since it came
from the largest pool of CPU power and so it is the most difficult to reproduce.
The inherent characteristics of blockchain architecture, like transparency, verifiability,
privacy, and anonymity, have since encouraged various industries and operational domains to
further explore its numerous benefits and applications [8]. Blockchain technology also has its
drawbacks, with scalability [9], security, and energy consumption [10] problems being the most
significant. Nonetheless, new protocols and solutions are continually being developed [11]-[14]
to address these problems and to consolidate the blockchain technology and the decentralized
model, potentially transforming the way people choose to transact globally [15]. One such example
is the Proof of Stake (PoS) protocol [16] that attempts to restrict PoW's wastefulness by using
tokens, instead of computational work, as a scarce and well-distributed resource to prevent cheap
attacks on the blockchain. However, PoS stakeholders’ incentives [17] differ from those of PoW
miners in a way that may compromise the network’s security. Virtually the most profitable tactic
for a stakeholder is to vote on every branch of the blockchain tree (the forking of chains in the
ledger results in a tree-like structure rooted at the Genesis Block), thus making it harder to
identify the most reliable chain and reach a clear consensus. To tackle the so-called
Nothing-at-Stake problem [18], Ethereum [19] developers created a partial consensus mechanism,
called Casper [20], that combines the PoS research and Byzantine Fault Tolerance (BFT) [21]
consensus theory. Casper overlays an existing blockchain and offers the appropriate tools and
regulations to readjust participants' incentives [22], so that they always consent to the most
secure chain. This technology is so recent that it has yet to be tested in a real cryptocurrency,
leaving some associated problems still open.
Along with the troubleshooting, efforts are also being made to involve new tools and test new
approaches in blockchain technology [23]-[25], expanding its capabilities and applications. In
this context, and because of the high interconnection of blockchain data, the representation of
blockchain as a distributed graph database is far from absurd. Relationships between its data
keep blockchain coherent, and may bear information of great analytical value. Only a database that
natively embraces relationships is able to store, process, and query those connections
efficiently. While other databases compute relationships at query time through expensive JOIN
operations, a graph database stores connections alongside the data in the model, allowing millions
of connections per second to be traversed.
In this paper, we have developed a decentralized application model in Python that is connected to
a Neo4j database [26], where blockchain data are stored. Following the basic principles of Buterin
and Griffith’s original paper [20], we implemented a Casper-like consensus mechanism to function
alongside the most popular block proposal mechanisms: the PoW and the PoS protocols. A major part
of our work was to incorporate Neo4j in the functionality of the above mechanisms and ultimately
improve their performance. For that reason, we developed a versatile Graph Model for our
blockchain database that allows for a multilevel viewing of the stored data and, by extension,
numerous ways of accessing them. From the Neo4j Desktop [27] application, we were able to monitor
and visualize changes in the deployed graph database in various use case scenarios. Another
advantage of the blockchain graph database is the ease in applying analytical methods to the
stored data and to the relationships between them with graph analysis tools. This innovation could
solidify the blockchain analytics field by facilitating the evaluation of blockchain’s components
and the behavior of the network’s nodes. For this reason, we ran a series of simulated experiments
and, by utilizing the annotated graph model, we tested the efficiency of the implemented
technologies and mechanisms in preventing the most common blockchain attacks, namely the 51%
Attack, Catastrophic Crashes, and the Attack from dynamic validator sets.
The rest of the paper is structured as follows: In Section II we present the theoretical
background for the tools and mechanisms developed in this paper. In Section III we briefly discuss
the published work that technically relates to blockchain and the ideas proposed in our paper. In
Section IV we describe the architecture of the decentralized application, while details about the
implementation and the functionality of its components are given in Section V. In Section VI we
run our application and test the performance of the employed blockchain data model and the
security of the implemented protocols against the most common blockchain attacks. Section VII is
the conclusion of this paper, where we summarize our findings and suggest possible applications
for the mechanisms we developed.

II. BACKGROUND

A. CASPER CONSENSUS MECHANISM
Casper is a partial consensus mechanism combining Proof of Stake algorithm research and Byzantine
fault-tolerant consensus theory. Casper’s operations are backed by a group of particular nodes,
the validators [28], who are responsible for voting on checkpoints and finalizing transactions. A
checkpoint is simply a regular block whose height in the blockchain tree is an exact multiple of a
number. In Ethereum, for instance, this number is set to 100, so through the resultant checkpoint
tree, validators can finalize every 100 blocks at once, rather than voting on every single block.
Every node can become a validator by depositing at least the predetermined minimum amount of
tokens. The number


of tokens deposited also represents the stake of the validator, which rises and falls with rewards
and penalties. A node’s voting power is determined by his share of the number of tokens deposited
by all validators. Hence, when we say “2/3 of validators”, we are referring to the
deposit-weighted fraction. To exit the validator sets and collect his share, a node must publish a
withdraw message. After exiting, the node is forever forbidden to re-enter the sets.
Validators can broadcast a vote message containing four pieces of information: two checkpoints
$s$ and $t$ of the same subchain (a series of connected blocks that starts on a fork of a chain),
together with their respective heights $h(s)$ and $h(t)$. Therefore a vote can be represented with
a link from a source to a target checkpoint.

If at least 2/3 of the validators (by deposit) have published the same vote with source $s$ and
target $t$, then $s \rightarrow t$ is called a supermajority link.
A checkpoint $c$ is called justified if (1) it is the root, or (2) there exists a supermajority
link $c' \rightarrow c$ where checkpoint $c'$ is justified.
A checkpoint $c$ is called finalized if (1) it is the root or (2) it is justified, and there is a
supermajority link $c \rightarrow c'$ where $c'$ is a direct child of $c$.

FIGURE 1. Example of justifying and finalizing checkpoints in the checkpoint tree

Casper's proper function precludes two checkpoints of different subchains from being both
finalized (Figure 1). To achieve this, all validators must comply with the following rules:

An individual validator must not publish two different votes
$\{s_1, t_1, h(s_1), h(t_1)\}$ and $\{s_2, t_2, h(s_2), h(t_2)\}$
such that either:
I. $h(t_1) = h(t_2)$.
Equivalently, a validator must not publish two distinct votes for the same target height.
or
II. $h(s_1) < h(s_2) < h(t_2) < h(t_1)$.
Equivalently, a validator must not vote within the span of his other votes.

Breach of any of the above rules results in the slashing of the offending validators (Figure 2),
that is, their permanent withdrawal from the validator sets and the deletion of their deposits. In
case of a rule violation, Casper guarantees that all relevant evidence can be found, and the
offenders can be identified.
For the mathematical proof of the above proposition we will be working on the checkpoint tree.
Given two finalized checkpoints $x_m$ and $y_n$ on two conflicting subchains, there are two
distinct chains of supermajority links from a common starting checkpoint $s$ (whether that is the
Genesis Block or not) to $x_m$ and $y_n$ respectively:
$$s \rightarrow y_0 \rightarrow y_1 \rightarrow \dots \rightarrow y_n \rightarrow y_{n+1}$$
and
$$s \rightarrow x_0 \rightarrow x_1 \rightarrow \dots \rightarrow x_m \rightarrow x_{m+1}$$
where $x_{m+1}$ and $y_{n+1}$ are the children of $x_m$ and $y_n$ respectively, since $x_m$ and
$y_n$ are finalized (finalization rule). The heights of all checkpoints $x_j$, $y_i$ in the above
chains should be different, otherwise rule I is violated. Without loss of generality we assume that
$h(x_m) > h(y_n)$, hence that $h(x_m) > h(y_{n+1})$, since $h(x_j) \neq h(y_i)$. Let $k$ be the
lowest integer such that $h(x_k) > h(y_{n+1})$; then $h(x_{k-1}) < h(y_n)$ (or
$h(x_{k-1}) = h(y_n)$, which again violates rule I). This implies the existence of a supermajority
link $x_{k-1} \rightarrow x_k$, where $h(x_{k-1}) < h(y_n) < h(y_{n+1}) < h(x_k)$, thus violating
rule II. If two conflicting supermajority links $l_1$ and $l_2$ exist, we can conclude that at
least 1/3 of the validators violated the slashing conditions, since at least 2/3 of the validators
have published $l_1$ and at least 2/3 of the validators have published $l_2$.

FIGURE 2. Proof of the effectiveness of Casper's slashing conditions.

In the case of a fork, miners/stakeholders are incentivized to always build on the branch that
contains the highest justified checkpoint. This correct-by-construction fork choice rule [29],
besides being the optimal strategy for nodes, also prevents pathological scenarios from occurring:
by following the longest-chain fork choice rule, Casper can get “stuck” in a state where any blocks
built atop the longest chain cannot be finalized without some validators getting slashed. So, this
rule is to be followed by every miner/stakeholder since it ensures the liveness of the consensus
protocol.
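The definitions of supermajority links, justified checkpoints and finalized checkpoints given in this subsection can be illustrated with a minimal Python sketch. It is not the paper's implementation: the `Vote` field names, the function names, the integer deposits and the four-checkpoint toy tree are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Vote:
    validator: str   # identifier of the voter (hypothetical field name)
    source: str      # source checkpoint s
    target: str      # target checkpoint t
    h_source: int    # h(s)
    h_target: int    # h(t)


def supermajority_links(votes, deposits):
    """Return every link (s, t) whose voters hold at least 2/3 of the total deposit."""
    total = sum(deposits.values())
    voters = {}
    for vote in votes:
        voters.setdefault((vote.source, vote.target), set()).add(vote.validator)
    return {
        link
        for link, names in voters.items()
        if 3 * sum(deposits[name] for name in names) >= 2 * total
    }


def justified_checkpoints(root, links):
    """The root is justified; t is justified if a link s -> t exists with s justified."""
    justified = {root}
    changed = True
    while changed:
        changed = False
        for source, target in links:
            if source in justified and target not in justified:
                justified.add(target)
                changed = True
    return justified


def finalized_checkpoints(root, links, parent):
    """The root is finalized; a justified c is finalized if a link c -> c' exists
    where c' is a direct child of c in the checkpoint tree."""
    justified = justified_checkpoints(root, links)
    finalized = {root}
    for source, target in links:
        if source in justified and parent.get(target) == source:
            finalized.add(source)
    return finalized


# Assumed toy checkpoint tree: g <- a <- b <- c (parent maps child to parent).
parent = {"a": "g", "b": "a", "c": "b"}
deposits = {"v1": 40, "v2": 35, "v3": 25}
votes = [
    Vote("v1", "g", "a", 0, 1), Vote("v2", "g", "a", 0, 1),
    Vote("v1", "a", "b", 1, 2), Vote("v2", "a", "b", 1, 2), Vote("v3", "a", "b", 1, 2),
    Vote("v3", "b", "c", 2, 3),
]
links = supermajority_links(votes, deposits)
justified = justified_checkpoints("g", links)
finalized = finalized_checkpoints("g", links, parent)
```

In this toy run the link from `b` to `c` is backed by only a quarter of the deposit, so `b` ends up justified but not finalized. The comparison `3 * backing >= 2 * total` avoids floating-point rounding when testing the 2/3 threshold.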

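The two slashing conditions of the Casper subsection (rules I and II) can likewise be checked pairwise over the votes each validator has published. The following is only a sketch under assumed names (`Vote`, `find_slashable`, the example validators); how the paper's application stores and scans votes is not shown in this section.

```python
from collections import namedtuple
from itertools import combinations

# Hypothetical vote record: the four pieces of vote information plus the voter.
Vote = namedtuple("Vote", "validator source target h_source h_target")


def violates_rule_one(v1, v2):
    """Rule I: two distinct votes for the same target height."""
    return v1 != v2 and v1.h_target == v2.h_target


def violates_rule_two(v1, v2):
    """Rule II: one vote lies strictly within the span of the other."""
    return (v1.h_source < v2.h_source < v2.h_target < v1.h_target
            or v2.h_source < v1.h_source < v1.h_target < v2.h_target)


def find_slashable(votes):
    """Return the validators that published a pair of votes breaking rule I or II."""
    by_validator = {}
    for vote in votes:
        by_validator.setdefault(vote.validator, []).append(vote)
    offenders = set()
    for validator, own_votes in by_validator.items():
        for a, b in combinations(own_votes, 2):
            if violates_rule_one(a, b) or violates_rule_two(a, b):
                offenders.add(validator)
    return offenders


example_votes = [
    # alice votes for two different targets at the same height (rule I)
    Vote("alice", "s", "t", 100, 200), Vote("alice", "s", "u", 100, 200),
    # bob's second vote sits inside the span of his first (rule II)
    Vote("bob", "s", "w", 100, 400), Vote("bob", "p", "q", 200, 300),
    # carol's votes follow one another and break neither rule
    Vote("carol", "s", "t", 100, 200), Vote("carol", "t", "v", 200, 300),
]
offenders = find_slashable(example_votes)
```

Because both rules are defined over pairs of votes from the same validator, grouping by validator first keeps the check quadratic only in each validator's own vote count.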

B. NEO4J GRAPH DATABASE
A graph database (GDB) [30] is a database designed to treat the relationships between data as
equally important to the data themselves. Graph databases are part of the NoSQL databases created
to address the limitations of the existing relational databases. This is achieved by using graph
structures for semantic queries with nodes, edges, and properties to represent and store data. A
graph database is intended to hold data without constricting them to a predefined model by storing
connections alongside the data in the model, while other databases compute relationships at query
time through expensive JOIN operations. Hence, accessing nodes and relationships in a native graph
database is an efficient, constant-time operation.
Neo4j [26] is an open-source, NoSQL, highly scalable native graph database that provides an
ACID-compliant transactional backend for developing applications. This means that it efficiently
implements the property graph model down to the storage level while using pointers to navigate and
traverse the graph. Performance-wise, Neo4j delivers consistent, real-time efficiency for
multi-hop queries on large, interconnected datasets. Moreover, it offers a versatile property
graph model that allows for fluidly evolving solutions to meet users’ requirements. Cypher [31], a
declarative query language similar to SQL, but optimized for graphs, is now used by other databases
like SAP HANA [32] Graph and Redis graph [33] via the openCypher project [34].
The property graph model of Neo4j organizes data as nodes, relationships and properties. Nodes are
the entities in the graph that can hold any number of attributes (key-value pairs), called
properties. They can also be categorized into labels, each of which represents a specific role for
the nodes tagged with it. Two semantically-relevant nodes can be linked with a directed
relationship. Relationships are characterized by their type and, like nodes, they too can hold
properties. Additionally, due to the efficient way in which they are stored, any number or type of
relationships can be shared by two nodes without sacrificing performance.
Neo4j also offers a growing, open library of graph algorithms [35] that are optimized for fast
results. With little to no coding required, these algorithms reveal the hidden patterns and
structures in the stored connecting data around pathfinding, centrality and community detection.
Lastly, Neo4j Browser, a graphical user interface (GUI) that can be run through a web browser,
allows for querying, visualization, and data interaction. All these capabilities make Neo4j the
ideal tool to employ in representing, visualizing and analyzing the cumbersome and highly
connected blockchain data.

III. RELATED WORK
A consensus protocol is a fault-tolerant set of rules that ensures all nodes agree on the order in
which entries are appended to the blockchain, despite the malicious or ambiguous behavior of
individual nodes among them. The CPU voting consensus that Nakamoto suggested with Proof of Work
(PoW) [6] encouraged a multitude of new mechanisms based on proof of concepts [36] to try and
tackle PoW's problems while maintaining a similar level of security. Proof of Stake (PoS), the
most popular and energy-saving alternative to PoW [10] [37], requires participants to prove the
ownership of the amount of currency, expecting a strong correlation between a node's wealth and
its fidelity. Our application incorporates the above protocols as block proposal schemes and
highlights the adjustments needed for them to work appropriately under a Casper-like consensus
mechanism.
Other protocols such as Proof of Activity (PoA) [38] combine useful elements from both PoW and
PoS. Operating PoA requires building blocks from miners via PoW, which are controlled and signed by
active network stakeholders. The hash of any new block header solved by a miner is mapped to one of
the satoshis in the network. Then a procedure is followed to track its owner, who then is
responsible for signing the new block header. This process is repeated N times for the new block
to be published. As can be understood, like in a PoS scheme, the more tokens a node possesses, the
more chances he has to be elected. The protocol is called Proof of Activity because it also
requires the N stakeholders to be active; otherwise, another block header (with different N
stakeholders) will be the first to sign.
Casper [20], the partial consensus mechanism, is perhaps the most advanced PoS algorithm; its
innovation is so new that it has yet to be thoroughly tested in a large scale environment.
Recently, Ethereum’s developers announced the first release [39] of Casper Friendly Finality
Gadget (FFG) and the code was made available to researchers, auditors and client developers, to
start testing the software. Essentially, Casper FFG is a simplified version of a Byzantine fault
tolerant protocol [21], with “votes” for checkpoints taking the place of prepares and commits.
Ethereum 2.0 [40] is expected to be launched shortly, which will include Casper CBC [41], an
upgraded version of Casper FFG that will complete the transition from PoW to PoS consensus.
However, researchers have already been examining the effectiveness of Casper's principles, the
incentives involved and mechanisms through individual decentralized application models. Such an
example is presented in the work of Moindrot et al. [42] where a simulation of a blockchain
application in Python was developed to familiarize the reader with Casper's basic operations, as
well as to examine the impact of latency and disconnected nodes on the protocol's finality.
However, this simulation is far from a working DApp since it does not involve active users, and
some key blockchain and Casper components, like consensus protocols and dynamic validator sets,
were not implemented.
Furthermore, the Ethereum project has adopted the GHOST protocol, which suggests an alternative to
the longest-chain rule of common PoW protocols; that is,


selecting the heaviest sub tree rooted at each fork. By doing so, developers aim to avoid scenarios
where an attacker's chain can grow longer without him having the majority of the network's
computing power. An example of that is when larger blocks are created that take longer to
propagate through the network, thus resulting in more forks. In that case, the Greedy
Heaviest-Observed Sub-Tree rule proposes that those blocks off the main chain can still contribute
to its validity. That idea of using graphs as a way to optimize performance and security of
distributed ledgers was further examined in the cryptocurrency space. A well-known blockchain
protocol that introduces a unique graph based model in blockchain is IOTA's tangle. This protocol
is entirely based on a DAG, which is used for storing and verifying transactions by connecting them
to others, already confirmed. Nevertheless, this implementation differs in many ways from the
typical blockchain structures since it doesn't use blocks to store transactions, it combines the
roles of transaction issuers and transaction approvers and, unlike most protocols, it doesn't
include monetary rewards. Acknowledging the tremendous benefits that graph solutions can provide in
distributed ledgers, we propose a model that stores and connects blockchain's digital entities in
a Neo4j database.
Modeling blockchain as a graph database is not a novel idea [43]. Several studies [44]-[46] have
highlighted the analytical value within the blockchain data and the relationships between them
that can be optimally exploited through a high-fidelity blockchain graph model. In [44]
specifically, this was done by parsing and deserializing the Bitcoin raw binary data files into a
suitable format for importing into Neo4j. Then, they ran the annotated graph through a
graph-analysis framework that uses path-dependent Cypher queries to extract and summarize useful
statistics. This implementation paves the way for a blockchain analytics field that focuses on
identifying and even predicting behaviors in both the nodes and their published messages. In our
paper we extend the idea of a graph blockchain database by also incorporating the graph model into
the core functions of blockchain and its mechanics. Furthermore, we suggest a blockchain model that
is both flexible and lean and that negates the need for a locking-unlocking graph mechanism by
being stored alongside the traditional blockchain for a low memory overhead. This implementation
intends to access data used by consensus protocols and block-proposal schemes at much greater
speeds than the traditional way, ultimately resulting in higher performance decentralized systems.

IV. SYSTEM OVERVIEW

A. P2P NETWORK
Every decentralized application is supported by a P2P network [47] where members can interact with
one another without the need for a trusted authority. In our model, we simulate such a network by
utilizing the Python Flask Microframework. In particular, every node is implemented as a separate
Web application of the same structure, to ensure equality. For the purposes of this paper, we
deployed the P2P network on a single machine by having each node run on a different port of
Python's local development server. This implementation allowed us to uniquely identify each node
by its port number so that it functions as the node's address.

FIGURE 3. Transaction broadcasting sequence diagram.

Communication between nodes is enabled through the Flask-RESTful extension; each node stores its
peers' port addresses and can transmit messages to them by merely invoking the suitable API
Resources with a supported HTTPS method. Newcomer nodes query one or more IP addresses hardcoded
into their scripts that act like DNS seeds, by storing and transmitting peers’ IP addresses. The
procedure followed when a node broadcasts a transaction to the network can be visualized in the
sequence diagram of Figure 3.
Lastly, every peer initializes and utilizes a Neo4j distributed graph database, in which
blockchain data are stored and dynamically accessed.

B. BLOCKCHAIN GRAPH MODEL
Representing common blockchain data in a Graph Database can be arranged in a straightforward
manner; a node can be labeled either as Block or as Transaction, with dedicated attributes in each
case. Two consecutive blocks are linked with a "CHILD_OF" relationship, while transactions are
connected to the Blocks they belong to, with an "INCLUDED_IN" relationship. Following this model,
we can depict any data broadcast in the network, that is stored in a Block and has attributes, as
a separate node or label in the Neo4j database.
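As an illustration of the Block and Transaction labels and the "CHILD_OF" and "INCLUDED_IN" relationships just described, the sketch below builds parameterized Cypher statements in Python. The property names (`hash`, `height`, `id`), the helper names and the ancestor query are assumptions for this example rather than the paper's schema, and the statements have not been run against a live database. `store` only expects an object with a `run(query, parameters)` method, which is the call shape offered by a session of the official Neo4j Python driver.

```python
def block_statements(block_hash, parent_hash, height, tx_ids):
    """Build (cypher, params) pairs that store one block and its transactions."""
    statements = [
        ("MERGE (b:Block {hash: $hash}) SET b.height = $height",
         {"hash": block_hash, "height": height}),
    ]
    if parent_hash is not None:
        # Link the new block to its parent; the genesis block has no parent.
        statements.append((
            "MATCH (child:Block {hash: $hash}), (parent:Block {hash: $parent}) "
            "MERGE (child)-[:CHILD_OF]->(parent)",
            {"hash": block_hash, "parent": parent_hash},
        ))
    for tx_id in tx_ids:
        # Attach every transaction of the block with an INCLUDED_IN relationship.
        statements.append((
            "MATCH (b:Block {hash: $hash}) "
            "MERGE (t:Transaction {id: $tx_id}) "
            "MERGE (t)-[:INCLUDED_IN]->(b)",
            {"hash": block_hash, "tx_id": tx_id},
        ))
    return statements


# Path-dependent query: walk CHILD_OF relationships from a block to its ancestors.
ANCESTORS_QUERY = (
    "MATCH (tip:Block {hash: $hash})-[:CHILD_OF*]->(ancestor:Block) "
    "RETURN ancestor.hash AS hash, ancestor.height AS height "
    "ORDER BY height DESC"
)


def store(session, statements):
    """Send each statement through any object exposing run(query, parameters)."""
    for cypher, params in statements:
        session.run(cypher, params)
```

Using `MERGE` instead of `CREATE` keeps the statements idempotent, so a block received twice from different peers would not be duplicated in the graph.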


However, the benefits of this implementation are not limited to presenting data in an organized and
comprehensible way. Neo4j offers the ability to quickly access stored information by utilizing
graph theory algorithms as well as to apply analytics [48] on blockchain data, by executing graph
algorithms through complex Cypher queries. Nevertheless, not all of the information is worth
storing in the Neo4j graph database. The separation criteria are related to the usability of the
stored data in blockchain functions and their analytical value. The efficiency of Neo4j is further
optimized when the graph model expedites the retrieval of inaccessible native data, resulting in a
higher performance system.
Hence, the graph database design is not absolute but rather is to be considered as a versatile
tool, completely dedicated to its blockchain, containing the information of value and connecting
them according to the needs of its mechanisms and protocols. One such example could be calculating
a user's balance, which would require finding all transactions that he participates in by crawling
each block in the blockchain tree. A blockchain that values such a metric should store and connect
users’ transactions in an optimal way. Another practice might involve the handling of smart
contracts [49][50], perhaps in an e-shop application. In that case a useful indicator could be the
credibility score of a user, calculated by retrieving the contracts they were involved in and by
considering the credibility scores of the other parties as well as the method of the contract’s
resolution [51] (agreement, dispute, use of mediator etc.). Hence, the graph model of such an
application should include smart contract nodes, link them with the users that participate in them
and store a resolution method property.
Bearing in mind the above principles, we have designed a graph model that facilitates the most
common blockchain operations as well as those of Casper’s consensus mechanism, while allowing for
path-dependent queries to be applied and information to be retrieved in a resourceful manner. The
design presented in Figure 4 takes advantage of blockchain’s highly interconnected data and offers
a simplified architecture that can be stored alongside the original blockchain with an additional
charge of 33% (for additional details please check the Running the System subsection) in terms of
space complexity, to assist the operation and the evaluation of its components. Our model
classifies blockchain data into six distinct node labels. Users are linked to the Transactions they
participate in and to the messages they publish. Messages are associated with validator activities
and are categorized into deposit, withdraw, and slashing messages. Transactions, messages and votes
are linked to the blocks in which they are added. Each Block is connected with its parent-block,
while Checkpoint Blocks are double-labeled and additionally linked to the previous Checkpoint in
the checkpoint tree.

FIGURE 4. Blockchain database Graph Model.

V. IMPLEMENTATION DETAILS

A. BLOCK PROPOSAL MECHANISMS
1) PROOF OF WORK
In Proof-of-Work blockchains, nodes compete with each other to solve a cryptographic puzzle, like
producing hashes with specific patterns. This procedure, known as mining, is implemented in our
application and uses three main components: a hash function, a random number generator, and a
winner verification method.
Every prospective miner first initiates a subprocess for mining the next block. The procedure
begins by constructing the new block for the miner’s selected chain as an object of class Block.
For this block to be published, the miner must first solve it by appending random numbers to its
header and hashing the resulting string. When a SHA-256 hash [52] with a specific number of zeros
is produced, the block is considered solved and is broadcast in the network. The number of zeros
required is defined by the difficulty parameter, with which we can adjust the average block time.
After receiving and verifying the new block, the other miners terminate any ongoing mining
processes and update their local data. Python's multiprocessing library allows the mining process
to be executed in parallel, without impeding the rest of the Flask Server operations.

2) PROOF OF STAKE
The Proof of Stake block proposal system simulates the mining process by using stake, instead of
computational effort, as proof of the block constructors’ reliability. Here, users who possess a
larger amount of tokens have a higher chance of being selected, as they have stronger incentives
to protect the network and the value of the cryptocurrency.
In our implementation, a node can apply to become a stakeholder by broadcasting an HTTPS message to
the network. Nodes of the same chain will store the received application with the applicant's
public key and stake. The election process is triggered by sending a "start_pos" HTTPS request to
one of its nodes and is essentially a stake-weighted random choice between applicants. If the
"start_pos" requests are sent to randomly selected nodes,


This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/.

eventually, the chain with the most online nodes will grow the longest. However, this affects neither the security nor the finality of the consensus protocol, since here, unlike PoW, chain length is not a measure of difficulty.

Instead, we use the staking of each block to determine which chain was the hardest to create and thus the hardest to reproduce. The problem that emerges here is that stakeholder incentives run counter to this security measure, since the optimal tactic for them is to bet on the blocks of every branch. Thus, staking-wise, the weaker blocks are made indistinguishable from the stronger ones, and the staking measure becomes unreliable. Casper aims to alter these incentives by assigning consensus to the validators.

B. CASPER-LIKE CONSENSUS PROTOCOL
1) DYNAMIC VALIDATOR SETS
To achieve finality in our Proof of Stake protocol, we implemented a Casper-like consensus protocol that can work over any block proposal scheme as well. This mechanism should also allow the switching of validators without compromising the blockchain's security. Our design incorporates the key principles [53] laid down by Casper's creators into one straightforward implementation. Specifically, we suggest the use of two dynamic validator sets responsible for voting on checkpoints: the "current_front" and the "current_rear" validator sets. In addition to these two, we utilize the "new_front" structure to store the next state of the "current_front" validator set, thus greatly simplifying the process. Every block stores and handles these structures, recognizing in that way its expected voters and the future state of all three validator sets. A node can become or quit being a validator by broadcasting a deposit or a withdraw HTTPS message, respectively, to the rest of the network. Every block inherits its parent's structures and subsequently, depending on the deposits and withdrawals it includes, adds and removes nodes from its "new_front" set. Changes in the voting validator sets take effect after the finalization of a checkpoint. So, if checkpoint C gets finalized, the next block B created in the same subchain, that is also a child of block A, will have its validator sets modified as shown in Figure 5.

FIGURE 5. New block’s validator sets’ configuration example.

In this way, current validators authorize the next by finalizing specific blocks. After a validator's withdrawal gets finalized, the validator will remain in the "current_rear" set until another checkpoint gets finalized. Further, by having both front and rear validators agree for a checkpoint to get finalized, we ensure that all new validators vote in line with their predecessors, achieving a continuous line of consensus throughout the blockchain. The final measure that we need to adopt is preventing out-of-order checkpoint finalization. This step is crucial for Casper's safety since otherwise scenarios can occur where conflicting justification and finalization votes have been sent by disjoint validator sets, making it impossible to trace and punish the offenders.

2) VOTING
Given the changes in the operation of the validator sets and the new security risks arising from them, we redefine a supermajority link, a justified and a finalized checkpoint as follows:

An ordered pair of blocks (s, t) has a supermajority link if at least 2/3 of checkpoint t's "current_front" validator set have published votes s → t and at least 2/3 of checkpoint t's "current_rear" validator set have published votes s → t.

Given an ordered pair of checkpoints (s, t), t is considered justified if there is a supermajority link s → t.

Given a checkpoint s' and its direct child t', s' is considered finalized if s' is justified, there is a supermajority link s' → t', and all votes justifying and finalizing checkpoint s' are included in the subchain before the creation of the next checkpoint.

As with the implementation of the validator sets, we use three auxiliary data structures that are stored in each block and facilitate the voting process. In the "front_votes" and "rear_votes" structures, we store the sent links with the


majority rates received, while in the "justified" list we store the justified checkpoints.

A node broadcasts a vote s → t via a "submit_vote" HTTPS message. For this vote to be valid, it must contain the required information: "source_hash", "source_height", "target_hash", "target_height" and "public_key". When importing such a vote into a block b, a series of tasks is performed by the miner/stakeholder: i) checking that blocks b and t belong to the same subchain; ii) calculating the sender's stake participation in each of the validator sets of block t and confirming that at least one of them is greater than 0%; iii) adding the calculated percentages to the total vote rates that the s → t link has already received in the "front_votes" and "rear_votes" sets; iv) including t in the "justified" list in the case of s → t being a justification link that has achieved a supermajority of at least 66%.

These data structures depict the votes included in the current block's ancestors and not in the whole blockchain tree. Hence, each block entry can be uniquely identified by the block's height alone.

Finally, when creating a new checkpoint, the miner will check its stored data in order to determine whether the previous checkpoint in the chain got finalized. If the finalization supermajority link exists and the previous checkpoint also appears justified, we understand that all votes required have arrived in time and are stored in blocks of the same subchain. In this case, the miner notifies peers that the previous checkpoint is to be considered finalized.

VI. EVALUATION
A. RUNNING THE SYSTEM
While blockchain owes many of its advantages to the way that it organizes its data, there is scope for improvement in accessing them. A key point of our research was to suggest a graph model that abolishes the need for crawling each block in the blockchain and allows for the otherwise cumbersome information to be optimally retrieved. In this section, we run the decentralized application and examine how the employed graph model enhances the performance of the blockchain and its components. The Neo4j Desktop application we are using runs the Neo4j Browser version 4.0.1 and the Neo4j Server version 3.4.10, providing an environment to visualize data and work on our local Neo4j databases. Figure 6 shows the versatility of our graph model, which can be traversed in multiple ways depending on the type of information requested. We present two instances in which we recorded a significant performance improvement by utilizing the Neo4j Graph Database: calculating balances and tracing offending validators.

FIGURE 6. Versatility of the Blockchain Graph Model.

The calculation of a user's balance in the blockchain tree can be performed rapidly by pinpointing the corresponding Neo4j node under the "Users" label and parsing its "To" and "From" relationships. For the same calculation to be done along only one branch, we execute a directional path-finding algorithm from the final block b to the user node and keep all "From" and "To" relationships included in those paths. Our design allows each block to connect only to its previous block through a backwards "CHILD_OF" relationship. Since each block can have only one parent block, we can isolate a single chain from block b to the Genesis Block with a path consisting of "CHILD_OF" relationships.

The following Cypher query returns the total number of tokens sent to "ADDRESS" with transactions that are included in the chain extending from the block of hash "HASH" to the Genesis block:

MATCH (b:Blocks {hash: 'HASH'})-[:CHILD_OF*0..]->()
      -[k]-(t:Transactions)-[:TO]->(u:Users {hex: 'ADDRESS'})
RETURN SUM(t.amount)

Unlike the classic blockchain structure, the graphical model here allows for bidirectional traversing of entities in


the requested path. We can evaluate and analyze any query through Neo4j by looking at its execution plan: the PROFILE command runs the given statement while keeping track of how many rows pass through each operator and how much each operator needs to interact with the storage layer to retrieve the necessary data. Profiling the above query reveals how the Neo4j planner takes advantage of blockchain's data model and optimizes the match by taking the node degree into account when checking for the connection, starting on the smaller side while caching internally. Specifically, it avoids crawling each block and examining the hundreds of transactions contained in them; instead it follows only the transactions that the requested node participates in.

Casper bases its effectiveness on its ability to identify and punish malicious validators. In our Casper-like protocol, we incentivize the network's nodes to track and report those offenders by offering them financial rewards in the case of a successful slashing. Furthermore, all evidence for a rule violation can be discovered and recovered by any node, as all the sent votes are stored publicly in the blockchain.

The incorporation of Neo4j into our application and the complex Cypher queries it allows further facilitate the detection of such offending votes in the blockchain. Specifically, we can request all distinct pairs of conflicting votes sent by the same validator with two simple Cypher queries:

MATCH (v1:Vote), (v2:Vote)
WHERE v1.r_from = v2.r_from
AND v1.target_height = v2.target_height
AND ID(v1) < ID(v2)
RETURN v1.r_from, v1, v2
Returns all distinct pairs of votes v1, v2 sent by the same validator with targets at the same height.

MATCH (v1:Vote), (v2:Vote)
WHERE v1.r_from = v2.r_from
AND v1.target_height > v2.target_height
AND v1.source_height < v2.source_height
AND NOT ID(v1) = ID(v2)
RETURN v1.r_from, v1, v2
Returns all distinct pairs of votes v1, v2 sent by the same validator, in which one vote is within the span of the other.

The above queries apply to all published votes in the blockchain tree and not in a specific branch. The process of tracking offenders is greatly simplified since sent votes are indexed with the "Vote" label and serial block access is no longer required.

If measured by traditional DB criteria, a traditional blockchain seems poor: throughput is only a few transactions per second, capacity is a few GB and, most importantly, it has essentially no querying abilities, thus making it unsuitable for applying statistics on its data. Several efforts [54] have been made to improve the traditional blockchain database in terms of performance, scalability and queryability. However, these implementations follow a NoSQL approach, meaning that they store sets of disconnected aggregates, which makes it difficult to use them for connected data. The most common strategy for adding relationships to such stores is to embed an aggregate's identifier inside the field belonging to another aggregate, effectively introducing foreign keys. But this requires joining aggregates at the application level, which, given blockchain's high interconnectivity, quickly becomes prohibitively expensive. We find that graph databases optimally exploit the benefits of blockchain's unique architecture and create versatile data structures that can be traversed in real time.

The performance evaluation of our graph tool compared to that of the document-oriented approach is in agreement with the results of several studies [55][56] that suggest the overall superiority of graph databases regarding querying time of the connected information. Specifically, in Figure 7 we present, in logarithmic scale, our findings regarding the average query execution time for both our private blockchain application that follows a document-oriented database architecture and our Neo4j blockchain database. The query used in this case was a simple balance calculation for a specific user. To have a fair comparison, memory consumption in Neo4j should not exceed 13 GB of RAM, which is what an Ethereum full node uses. To achieve this, we set both heap and page cache to 4 GB each, assuring that when combined with the extra memory that the JVM needs to function correctly, Neo4j's process memory consumption will not grow beyond the desired levels.

FIGURE 7. Average query execution time for Neo4j and Blockchain databases.

While the graph model and the appropriate Cypher queries can simplify the procedures performed by the protocols and mechanisms that function in the blockchain, the space complexity of our implementation should also be considered. We can calculate the Neo4j stored records' sizes as follows: 15 B for Nodes, 34 B for Relationships, 41 B for Properties for nodes and relationships. In our graph model, we suggest that a Transaction consists of the


following contents: the Transaction Node, 3 Relationships (FROM, TO, IN) and 2 Properties (amount, hash). Hence a published Transaction occupies 1 × 15 + 3 × 34 + 2 × 41 = 199 B of disk space. Comparing this number to the average Transaction size in Bitcoin, which is 600 B, we understand that storing and utilizing our graph model alongside the traditional blockchain data would mean a 33% increase of the total blockchain size.

The simulated results are gathered with the aid of a Python script that executes Cypher queries against the Neo4j database, where the native blockchain data are stored. Thus, we once more showcase the benefit of the blockchain graph model in quickly pinpointing and extracting high-value analytics regarding, in this case, the evaluation of the blockchain's mechanisms as well as the behavior of its participants.
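As a quick check of the storage arithmetic for a single transaction, the following Python sketch recomputes the per-transaction footprint from the quoted Neo4j record sizes (15 B per node, 34 B per relationship, 41 B per property) and compares it with the 600 B average Bitcoin transaction. The helper name record_bytes and the constant names are illustrative assumptions, not part of the described application.

```python
# Record sizes quoted in the text (bytes per stored Neo4j record).
NODE_BYTES = 15
RELATIONSHIP_BYTES = 34
PROPERTY_BYTES = 41

# Average Bitcoin transaction size quoted in the text (bytes).
BITCOIN_AVG_TX_BYTES = 600


def record_bytes(nodes: int, relationships: int, properties: int) -> int:
    """Disk space taken by a group of nodes, relationships and properties."""
    return (nodes * NODE_BYTES
            + relationships * RELATIONSHIP_BYTES
            + properties * PROPERTY_BYTES)


# One Transaction node, 3 relationships (FROM, TO, IN), 2 properties (amount, hash).
tx_bytes = record_bytes(nodes=1, relationships=3, properties=2)

# Size of the graph representation relative to the native transaction.
overhead = tx_bytes / BITCOIN_AVG_TX_BYTES

print(f"graph transaction size: {tx_bytes} B")
print(f"relative overhead: {overhead:.1%}")
```

Changing the counts passed to record_bytes gives a rough idea of how an additional property or relationship per transaction would shift this overhead.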
For the computation, memory configuration depends on how much virtual memory the JVM for Neo4j will use at runtime. Thus, the more closely the memory allocated matches the size of the database, the less swapping of data will occur at runtime, ultimately resulting in higher performance. Storing the 258 GB of Bitcoin's data [57] in our graph model would require roughly 258 × 0.33 ≈ 85 GB of memory. However, the memory used by the Neo4j instance is the collection of data requested by the client. For path-dependent queries, a precise calculation of that can be complicated since it may involve duplicate nodes and relationships. On the contrary, we can make a reasonable estimation of the memory required for the two simpler queries used to spot offending validators. In that case, we request all pairs among N votes, each consisting of one node and four properties. Hence, the memory required for this query to be optimally executed is N × N × (15 + 4 × 41) = 179N² B. To further reduce the space complexity of our implementation and improve its effectiveness, we are exploring additional graph models and tools that will entirely base their operation on them.

FIGURE 7. Consensus with Fixed and Fluctuating Inactivity Leak.
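The two offender-detection queries and the 179N² memory bound given for them can be mirrored in plain Python to make the pairwise comparison explicit. This is only a sketch under stated assumptions: the Vote class, the function names and the sample votes are hypothetical, the fields reuse the property names from the Cypher queries (r_from, source_height, target_height), and the application itself runs these checks inside Neo4j rather than in application code.

```python
from dataclasses import dataclass
from itertools import combinations

NODE_BYTES = 15      # Neo4j node record size quoted in the text
PROPERTY_BYTES = 41  # Neo4j property record size quoted in the text


@dataclass(frozen=True)
class Vote:
    """Hypothetical in-memory stand-in for a Vote node."""
    r_from: str          # validator that sent the vote
    source_height: int
    target_height: int


def same_target_height(votes):
    """Unordered pairs by one validator whose targets share a height."""
    return [(a, b) for a, b in combinations(votes, 2)
            if a.r_from == b.r_from and a.target_height == b.target_height]


def surrounding(votes):
    """Ordered pairs (a, b) by one validator where a spans b."""
    return [(a, b) for a in votes for b in votes
            if a.r_from == b.r_from
            and a.target_height > b.target_height
            and a.source_height < b.source_height]


def worst_case_bytes(n_votes: int, properties_per_vote: int = 4) -> int:
    """Bound from the text: N x N x (15 + 4 x 41) = 179 N^2 bytes."""
    per_vote = NODE_BYTES + properties_per_vote * PROPERTY_BYTES
    return n_votes * n_votes * per_vote


# Made-up sample data for illustration only.
votes = [
    Vote("val_a", 0, 10),
    Vote("val_a", 5, 10),  # same target height as the vote above
    Vote("val_a", 6, 8),   # lies within the span of both votes above
    Vote("val_b", 0, 10),  # different validator, so never paired with val_a
]

double_votes = same_target_height(votes)
surround_votes = surrounding(votes)

print(len(double_votes), "pair(s) with equal target height")
print(len(surround_votes), "pair(s) where one vote spans the other")
print(worst_case_bytes(len(votes)), "bytes worst case for", len(votes), "votes")
```

For the four sample votes the scan flags one pair with equal target heights and two pairs in which one vote spans the other. The quadratic growth of worst_case_bytes is the reason the estimate is stated per pair of votes rather than per vote.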
B. PREVENTING ATTACKS
In this section we test the robustness of the mechanisms developed against the most renowned blockchain attacks. In the case of Casper, security against several types of attacks is provided by its nature. Casper can tolerate 1/3 of the validators being malicious in achieving finality; any percentage larger than that can stop the network from finalizing any new checkpoints. On the other hand, its security is only compromised when the dishonest validators achieve a supermajority in both validator sets, thus being able to fully control the finalization of new checkpoints. Sybil attacks are prevented as Casper operates in a Proof of Stake manner; the size of a validator's deposit determines their voting power. To further reduce the impact of multiple-address users, Casper requires a large number of tokens to be deposited to become a validator.

While we saw how Neo4j can assist in tracking Casper's offending validators, safety under other types of attacks is to be examined through a series of simulated experiments on our application. The vulnerability of PoS Casper systems to 51% attacks and the optimization of parameters for a consensus protocol resistant to catastrophic crashes will be the main focal points of this section.

The simulation process is enabled through a batch script that initializes nodes and performs the basic functions of miners and validators. The above process also provides for the existence of side-chains that can be created in a pseudo-random manner; that is, an adjustable parameter determines the probability of a fork occurring on each block.

1) CATASTROPHIC CRASHES
If more than 1/3 of the validators are disconnected from the network due to computer failures, network partitioning, or risky behavior, it is virtually impossible to finalize new blocks. In this case, the Inactivity Leak can help the blockchain recover by gradually decreasing the stake of the offline validators and thus weakening their voting power. The Inactivity Leak's value can either be fixed or fluctuating, and the money deducted from the inactive validators can either be erased or returned to them sometime after they get back online. In our study, we focus on optimizing the role of the Inactivity Leak in Casper's security. In other words, penalties should be adjusted so that the network can effectively and quickly overcome a catastrophic crash, while voting remains ultimately profitable for validators with short absences.

We examine the efficacy of a fixed and a fluctuating Inactivity Leak in achieving consensus after a Catastrophic Crash occurs in which 50% of current validators disconnect. To simulate this, we initialized 1000 validators, of which only 500 were online and able to cast a vote. The consensus rates on justification and finalization votes can be stored in the checkpoint nodes as a separate property without significantly affecting the overall space complexity of our Neo4j implementation, since checkpoints are sparsely distributed throughout the blockchain tree. In both cases of Inactivity Leaks, penalties are initially set to -1% and are applied every 10 checkpoints. Should consensus not be reached during that period, the non-fixed Inactivity Leak decreases by 5% until at least one block gets finalized. On


Figure 7, we can observe the change in the percentage of active validators, justified and finalized checkpoints for a stable and a fluctuating (Figure 7) Inactivity Leak.

If we now consider the existence of additional candidate main chains in the network, we can assume that the probability of a checkpoint being voted for is proportional to its subchain's relative strength. The relative strength of a checkpoint or a subchain derives from the criteria that validators use when voting. Thus, in a PoS blockchain with honest validators, the relative strength of X could be interpreted as the total amount of tokens staked on X compared with that on checkpoints at the same height. Now, we simulate the previous voting process, while taking into account different probabilities of the main chain being voted for.

FIGURE 8. Consensus with Fixed and Fluctuating Inactivity Leaks for subchains of varying strengths.

The results displayed in Figure 8 for both types of Inactivity Leak suggest that the fluctuating Inactivity Leak diminishes the influence that the relative strengths and the number of subchains have on the speed of reaching consensus, since in every case the first finalized block appears around epoch 9. In the instance of the fixed inactivity penalty, consensus is highly dependent on the strength of the candidate blocks, delaying the first checkpoint finalization by as long as 30 epochs in some cases. However, the acceleration of consensus that a Fluctuating Leak offers associates additional risk with the role of
validators, as it shrinks the time window within which validators can recover from a crash without suffering extensive losses on their deposits.

2) 51% ATTACKS
The 51% attack refers to a blockchain attack performed by a group of miners that control more than 50% of the network's mining or computing power and could potentially control new transactions' confirmation to double-spend coins. Here we examine whether such an attack could be feasible in our Proof of Stake-Casper model and whether the inherent features of these mechanisms favor Voting Power centralization amongst stakeholders and validators, respectively.

Regarding PoS, Voting Power centralization can be checked using the PoS scheme we have implemented in our network. According to this, stakeholders have a chance of being selected proportional to their share of capital. The winning node will be rewarded with a fixed amount of cryptocurrency. In our simulation we initialized 1000 nodes with the voting power distribution being similar to that of real PoS networks. Then, we initiate the stakeholder election process for 1,000,000 simulated blocks and check again for potential wealth accumulation. To calculate the total stake of each stakeholder in each case, we take advantage of the Cypher queries presented in Section A, which greatly simplify the process. The initial distribution of hashrate as well as the distribution after the election process is presented in Figure 9 (a) in a detailed and simplified form where only PoS stakeholders that possess more than 3% of the total tokens are shown. The two distributions appear almost identical after 1,000,000 blocks, which means that the model followed maintains any financial differences between the nodes of the network without expanding them percentage-wise.

Like many other BFT protocols, Casper uses 1/3 as the maximum fraction of faulty nodes it can tolerate. Given n total nodes, of which f are Byzantine, we need at least t nodes to agree to reach consensus. Assuming that the remaining n - f nodes are split into two equally sized groups of (n - f)/2, we want to make sure that the influence of the Byzantine nodes, which may act arbitrarily, is not enough to achieve consensus. Hence t > (n - f)/2 + f, ensuring that the two groups cannot decide different things and cause a safety failure. For liveness, we make sure that the n - f nodes can come to a consensus without the cooperation of the f Byzantine nodes. Thus n - f ≥ t. Combining the two constraints gives n - f > (n - f)/2 + f, that is, n/3 > f, as the fault tolerance threshold of Casper.

In this way, to adequately control Casper's finalization process, the attackers would have to acquire at least 67% of the total deposits in both validator sets. Still, 34% in just one validator set would be enough to block the network from finalizing any new checkpoints. Another fundamental parameter is that Casper's structure should be such that it does not amplify economic differences between validators, jeopardizing the sets' decentralized character. Other than being the backbone of Casper's operation, the adopted reward-punishment system also dictates the motives in consensus groups and hence the possibility of centralization


of power in them. To underline the significance of rewards, we tested two different approaches in our Casper-like protocol: a system that encourages consensus and another that encourages participation.

The results in each case were gathered with the aid of Cypher queries, for quickly pinpointing each validator's votes through the "Vote" label and retrieving their voting percentages in each set from the node's properties. While the initial distribution of PoS voting power depicted previously can be set similar to that of the real PoS cryptocurrencies, the novelty of Casper's ideas imposes several restrictions when selecting appropriate data inputs. However, the fact that Ethereum requests 1500 ETH as the minimum validator's deposit, a demand that only 5000 addresses can fulfill at the moment, in combination with the risk associated with being a validator, allows for a good estimation of the front and rear sets' sizes. By extension, the distribution of voting power can be directed by that of the stake distribution of the top 5000 Ethereum addresses.

(a) Example of decentralized voting power distribution in PoS before and after 1,000,000 blocks
(b) Example of centralization of voting power in a validator set before and after 2,000 checkpoints
FIGURE 9. Examining distribution of power in consensus groups.

To recover from a dead-end situation where consensus was not reached for several checkpoints, the protocol can favor groups of validators with consensus rates that exceed a regulated percentage, through bonus rewards. This scheme suggests a gradual tradeoff between the system's liveness and security, where the strongest consensus group will eventually prevail over the rest of the set. However, rewarding consensus carries a danger of voting power centralization. By implementing this scheme for 250 validators and running it for 2000 checkpoints, as demonstrated in Figure 9 (b), we saw that even with all validators being honest, those with larger deposit shares are more likely to receive this bonus and further increase their power. Thus, economic differences are amplified exponentially. Moreover, those powerful nodes would have a stronger influence on the rest of the validators, who would be incentivized to follow the majority to reach consensus and earn the bonus.

The danger of voting power centralization lurking in consensus-enforcing reward systems leads us to the participation-valued approach of the original Casper. This can be achieved with a deposit-equivalent reward given to those who have cast at least N votes during a predetermined period with targets of height > h, where h is the height of the highest justified checkpoint, so that the broadcasted votes are relevant to the checkpoint finalization process. By doing so, powerful validators have no advantages over the rest of the set as long as everyone participates in the voting process. This rewarding system is deceptively more similar to that of the PoS than the previous one as long as Casper is responsible for the finalization, the only difference being that in PoS non-participating nodes face 0 projected profits instead of Casper's negative penalties. Finally, to resolve dead-ends, the validator's minimum deposit is made expensive, while participation rewards are gradually lowered when no consensus is reached for several periods. With this amendment, a consensus-blocking attack would be costly and less profitable for the attackers than participating honestly in the voting process.

VII. CONCLUSION AND FUTURE WORK
The establishment of decentralized applications and the widespread adoption of blockchains in mainstream financial technology applications require the refinement of the current consensus mechanisms and approaches involved, thus overcoming blockchain's safety, efficiency, and scaling barriers. In our work, we developed a fully customizable blockchain application that enabled the integration of new technologies and the evaluation of up-to-date mechanisms in the blockchain. We have shown how the modeling of the burdensome blockchain data as a distributed graph can assist protocols' operations, enhance their security, and facilitate the application of analytical methods to the stored information through path-dependent queries. Besides, through the tangible representation of the data provided by graph databases such as Neo4j, we were able to monitor the fundamental processes of the consensus protocols and block proposal schemes developed.

Furthermore, we adapted this implementation so that it serves the most up-to-date blockchain consensus mechanisms. Focusing on Casper's BFT consensus protocol, we showcased how the annotated model enhances the network's security in deterring attacks from dynamic validator sets by quickly pinpointing conflicting votes and punishing the offenders. Finally, we ran a series of simulations that tested our approach's resilience to the most widespread blockchain attacks. In particular, we have examined through Cypher queries how the configuration of the Inactivity Leak affects finality's recovery from a catastrophic crash, whether a 51% attack on Casper-PoS blockchains is possible, and how adjusting Casper's


penalties and rewards can prevent hashrate centralization scenarios.

Our simplified model allows for a lean and versatile blockchain implementation that exploits the benefits of graph databases over SQL approaches in both storing and accessing the interconnected blockchain data. At the same time, blockchain data analysis is enabled through graph analytics and social network analysis over numerous graph representations of the stored data, to accurately evaluate the operation of the involved mechanisms as well as the behaviours of the network's agents. The promising results of our research provide a clear direction for studying and developing other memory-efficient graph models that optimally exploit the benefits of this technology at both the operating and the analytical level. However, it is left as future research to implement and deploy them in real blockchain applications, which is always the final measure of evaluation.

Concurrently, other innovations are continually being developed in the blockchain field, which may become the solutions to blockchain's most critical challenges. Most notable is the "Lightning Network" [23] payment protocol, which overlays an existing blockchain and tackles cryptocurrencies' scaling problems. The incorporation of a graph model into the operation of such a micropayment system, to monitor fraudulent transactions amongst all channels, may eliminate the need for outsourcing trust to 'watchtower' nodes and thus relax its limitations.

It is also left for future research to examine how the proposed reference implementation can be applied in upper-level transaction consolidation frameworks such as the Lightning Network, and to explore whether it can enhance the security features and the analytical methods for path-dependent queries and relationship analytics of transactions in the blocks.

REFERENCES
[1] S. Haber and W. Stornetta, “How to time-stamp a digital document,” Journal of Cryptology, vol. 3, no. 2, 1991.
[2] K. F. Buford, H. H. Yu, and E. K. Lua, P2P networking and applications. Amsterdam: Elsevier/Morgan Kaufmann, 2009.
[3] G. Hileman and M. Rauchs, “2017 Global Cryptocurrency Benchmarking Study,” SSRN Electronic Journal, 2017.
[4] J. Hendler, “Web 3.0 Emerging,” Computer, vol. 42, no. 1, pp. 111–113, 2009.
[5] N. Chowdhury, “Consensus Mechanisms of Blockchain,” Inside Blockchain, Bitcoin, and Cryptocurrencies, pp. 49–60, 2019.
[6] S. Nakamoto, “Bitcoin: A peer-to-peer electronic cash system,” Tech. Rep., 2008.
[7] W. Wang, D. T. Hoang, P. Hu, Z. Xiong, D. Niyato, P. Wang, Y. Wen, and D. I. Kim, “A survey on consensus mechanisms and mining strategy management in blockchain networks,” IEEE Access, vol. 7, pp. 22328–22370, 2018.
[8] D. Tapscott and A. Tapscott, Blockchain revolution: how the technology behind Bitcoin is changing money, business and the world. UK: Portfolio Penguin, 2018.
[9] K. Croman, C. Decker, I. Eyal, A. E. Gencer, A. Juels, A. Kosba, A. Miller, P. Saxena, E. Shi, E. G. Sirer, D. Song, and R. Wattenhofer, “On Scaling Decentralized Blockchains,” Financial Cryptography and Data Security, Lecture Notes in Computer Science, pp. 106–125, 2016.
[10] “Bitcoin Energy Consumption Index,” Digiconomist. [Online]. Available: https://digiconomist.net/bitcoin-energy-consumption.
[11] S. Kim, Y. Kwon, and S. Cho, “A survey of scalability solutions on blockchain,” in Proc. Int. Conf. Inf. Commun. Technol. Converg. (ICTC), Oct. 2018, pp. 1204–1207.
[12] J. Kwon, “Tendermint: Consensus without mining,” Tech. Rep., May 2014.
[13] A. Kosba, A. Miller, E. Shi, Z. Wen, and C. Papamanthou, “Hawk: The blockchain model of cryptography and privacy-preserving smart contracts,” in Proc. IEEE Symp. Secur. Privacy (SP), May 2016, pp. 839–858.
[14] I. Eyal, A. E. Gencer, E. G. Sirer, and R. van Renesse, “Bitcoin-NG: A scalable blockchain protocol,” in Proc. 13th USENIX Symp. Netw. Syst. Design Implement. (NSDI), 2016, pp. 45–59.
[15] M. Swan, Blockchain: Blueprint for a New Economy. Newton, MA, USA: O’Reilly Media, 2015.
[16] S. King and S. Nadal, “Ppcoin: Peer-to-peer crypto-currency with proof-of-stake,” Tech. Rep., Aug. 2012.
[17] C. Buragohain, D. Agrawal, and S. Suri, “A game theoretic framework for incentives in P2P systems,” Proceedings Third International Conference on Peer-to-Peer Computing (P2P2003), Oct. 2003.
[18] V. Buterin, “On Stake,” Ethereum Blog, Jul-2014. [Online]. Available: https://blog.ethereum.org/2014/07/05/stake/?source=post_page.
[19] G. Wood, “Ethereum: A secure decentralised generalised transaction ledger,” Ethereum Project Yellow Paper, vol. 151, pp. 1–32, Apr. 2014.
[20] V. Buterin and V. Griffith, “Casper the Friendly Finality Gadget,” 2017, arXiv:1710.09437. [Online]. Available: https://arxiv.org/abs/1710.09437
[21] M. Castro and B. Liskov, “Practical Byzantine fault tolerance,” in Proc. OSDI, vol. 99, 1999, pp. 173–186.
[22] V. Buterin, D. Reijsbergen, S. Leonardos, and G. Piliouras, “Incentives in Ethereum’s Hybrid Casper Protocol,” 2019 IEEE International Conference on Blockchain and Cryptocurrency (ICBC), 2019.
[23] J. Poon and T. Dryja, “The Bitcoin Lightning Network.” [Online]. Available: http://lightning.network/lightning-network-paper.pdf.
[24] “IBM Research: Behind the Architecture of Hyperledger Fabric,” IBM Research Blog, 08-Feb-2019. [Online]. Available: https://www.ibm.com/blogs/research/2018/02/architecture-hyperledger-fabric/.
[25] M. Samaniego and R. Deters, “Blockchain as a service for IoT,” 2016 IEEE International Conference on Internet of Things (iThings), pp. 433–436, 2016.
[26] “Neo4j Database,” Neo4j Graph Database Platform. [Online]. Available: https://neo4j.com/neo4j-graph-database/?ref=home-banner/.
[27] “Neo4j Desktop User Interface Guide,” Neo4j Graph Database Platform. [Online]. Available: https://neo4j.com/developer/neo4j-desktop/.
[28] NKB Group, “Ethereum releases Casper v0.1: A short description for validators,” Medium, 15-May-2018. [Online]. Available: https://medium.com/@theNKBGroup/ethereum-releases-casper-v0-1-a-short-description-for-validators-3e0a7676d286
[29] V. Buterin, “Immediate message-driven GHOST as FFG fork choice rule,” Ethereum Research, 14-Jul-2018. [Online]. Available: https://ethresear.ch/t/immediate-message-driven-ghost-as-ffg-branch-choice-rule/2561.
[30] I. Robinson, J. Webber, and E. Eifrem, Graph Databases: New Opportunities for Connected Data. Sebastopol, CA: O’Reilly & Associates, 2015.
[31] O. Panzarino, Learning Cypher. Birmingham, United Kingdom: Packt Publishing, 2014.
[32] “What is SAP HANA? An unrivaled data platform for the digital age,” SAP. [Online]. Available: https://www.sap.com/products/hana.html?infl=32095c59-c617-45d7-a13d-8af08c419145.
[33] “RedisGraph,” Redis Labs. [Online]. Available: https://redislabs.com/redis-enterprise/redis-graph/.
[34] Neo4j, “openCypher,” openCypher · openCypher. [Online]. Available: http://www.opencypher.org/.
[35] M. Needham and A. E. Hodler, Graph algorithms: practical examples in Apache Spark and Neo4j. Sebastopol, CA: O’Reilly Media, 2019.
[36] C. Cachin and M. Vukolić, “Blockchain Consensus Protocols in the Wild,” arXiv.org, 07-Jul-2017. [Online]. Available: https://arxiv.org/abs/1707.01873
[37] T. Swanson, “How much electricity is consumed by Bitcoin, Bitcoin Cash, Ethereum, Litecoin, and Monero?,” 2018. [Online]. Available: https://www.ofnumbers.com/2018/08/26/how-much-electricity-is-consumed-by-bitcoin-bitcoin-cash-ethereum-litecoin-and-monero/
[38] I. Bentov, C. Lee, A. Mizrahi, and M. Rosenfeld, “Proof of Activity,” ACM SIGMETRICS Performance Evaluation Review, vol. 42, no. 3, pp. 34–37, Aug. 2014.
[39] Ethereum, “First release,” GitHub. [Online]. Available: https://github.com/ethereum/casper/releases/tag/v0.1.0
[40] V. Buterin, “Ethereum 2.0 Mauve Paper.” [Online]. Available: https://www.win.tue.nl/~mholende/seminar/references/ethereum_mauve.pdf.
[41] V. Buterin, “A CBC Casper Tutorial,” Dec-2018. [Online]. Available: https://vitalik.ca/general/2018/12/05/cbc_casper.html.
[42] O. Moindrot and C. Bournhonesque, “Proof of Stake Made Simple with Casper,” Stanford University, 2017. [Online]. Available: http://www.scs.stanford.edu/17au-cs244b/labs/projects/moindrot_bournhonesque.pdf.
[43] “Bitcoin to Neo4j,” GitHub. [Online]. Available: https://github.com/in3rsha/bitcoin-to-neo4j.
[44] D. Mcginn, D. Mcilwraith, and Y. Guo, “Towards open data blockchain analytics: a Bitcoin perspective,” Royal Society Open Science, vol. 5, no. 8, p. 180298, 2018.
[45] M. Bartoletti, S. Lande, L. Pompianu, and A. Bracciali, “A general framework for blockchain analytics,” Proceedings of the 1st Workshop on Scalable and Resilient Infrastructures for Distributed Ledgers - SERIAL 17, 2017.
[46] H. Kalodner, S. Goldfeder, A. Steven, M. Möser, and A. Narayanan, “BlockSci: Design and applications of a blockchain analysis platform,” arXiv.org, 08-Sep-2017. [Online]. Available: https://arxiv.org/abs/1709.02489.
[47] R. Schollmeier, “A definition of peer-to-peer networking for the classification of peer-to-peer architectures and applications,” Proceedings First International Conference on Peer-to-Peer Computing, Sep. 2001.
[48] A B M Moniruzzaman and S. A. Hossain, “NoSQL Database: New Era of Databases for Big data Analytics - Classification, Characteristics and Comparison,” arXiv:1307.0191, 30-Jun-2013. [Online]. Available: https://arxiv.org/abs/1307.0191.
[49] V. Buterin, “A Next-Generation Smart Contract and Decentralized Application Platform,” github. [Online]. Available: https://github.com/ethereum/wiki/wiki/White-Paper. [Accessed: 14-May-2020].
[50] M. Bartoletti, T. Cimoli, and R. Zunino, “Fun with Bitcoin Smart Contracts,” Lecture Notes in Computer Science, Leveraging Applications of Formal Methods, Verification and Validation. Industrial Practice, pp. 432–449, 2018.
[51] M. Bartoletti and R. Zunino, “BitML,” Proceedings of the 2018 ACM SIGSAC Conference on Computer and Communications Security, 2018.
[52] “US Secure Hash Algorithms (SHA and HMAC-SHA),” IETF Tools. [Online]. Available: https://tools.ietf.org/html/rfc4634. [Accessed: 15-May-2020].
[53] V. Buterin, “Safety Under Dynamic Validator Sets,” Medium, 11-Jun-2017. [Online]. Available: https://medium.com/@VitalikButerin/safety-under-dynamic-validator-sets-ef0c3bbdf9f6.
[54] McConaghy, Marques, Muller, De Jonghe, McConaghy, McMullen, Henderson, Bellemare, and Granzotto, “BigchainDB: A Scalable Blockchain Database.” [Online]. Available: https://mycourses.aalto.fi/pluginfile.php/378362/mod_resource/content/1/bigchaindb-whitepaper.pdf. [Accessed: 14-May-2020].
[55] R. Hecht and S. Jablonski, “NoSQL evaluation: A use case oriented survey,” 2011 International Conference on Cloud and Service Computing, 2011.
[56] C. Weinberger, “Benchmark: MongoDB, PostgreSQL, OrientDB, Neo4j and ArangoDB,” ArangoDB, 10-Mar-2020. [Online]. Available: https://www.arangodb.com/2018/02/nosql-performance-benchmark-2018-mongodb-postgresql-orientdb-neo4j-arangodb/. [Accessed: 14-May-2020].
[57] “Blockchain Size,” Blockchain.com. [Online]. Available: https://www.blockchain.com/el/charts/blocks-size. [Accessed: 22-Jan-2020].

KONSTANTINOS TSOULIAS received his diploma from the School of Electrical and Computer Engineering of the National Technical University of Athens (NTUA) in 2019, after completing his thesis on “Developing a consensus mechanism in blockchain trees”. His research interests include distributed systems, machine learning, social network analysis and blockchain.

Dr. GEORGIOS PALAIOKRASSAS received his diploma from the Dept. of Electrical and Computer Engineering of the National Technical University of Athens (NTUA) in 2013 and his PhD in 2019 from the same department, where he is currently a postdoctoral researcher and senior Research Associate. He has participated in numerous EU-funded projects and his research interests include social networks, blockchain, machine learning, and IoT.

GEORGIOS FRAGKOS is a PhD candidate and research assistant in the Department of Electrical and Computer Engineering, University of New Mexico. He received his Diploma in Electrical and Computer Engineering from the National Technical University of Athens in 2018. His main research interests include deep reinforcement learning, game theory, optimization, contract theory, and blockchain.

Dr. ANTONIOS LITKE has more than 18 years of experience as a professional ICT engineer. He has participated in R&D teams of over 15 research projects (EC and nationally funded) and has led technical teams for highly demanding commercial IT projects. Dr. Litke received the diploma from the Dept. of Computer Engineering and Informatics of the University of Patras, Greece in 1999, and the PhD from Electrical and Computer Engineering Department of National Technical University of Athens in 2006 (his thesis had been awarded by Thomaidis Foundation). He is the author of more than 30 scientific articles with over 500 citations and a reviewer of several international journals and conferences. His research interests include parallel and distributed computing, service oriented architectures, blockchains and cybersecurity.

Theodora A. Varvarigou is a professor of computer science at the National Technical University of Athens. Her research interests include parallel algorithms and architectures, fault-tolerant computation, optimization algorithms, and content management. Professor Varvarigou received a PhD in computer science from Stanford University.

This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/.
