Bank Fraud Detection Using Community Detection Algorithm

Download as pdf or txt
Download as pdf or txt
You are on page 1of 5

Proceedings of the Second International Conference on Inventive Research in Computing Applications (ICIRCA-2020)

IEEE Xplore Part Number: CFP20N67-ART; ISBN: 978-1-7281-5374-2

Bank Fraud Detection using Community Detection


Algorithm
Dhiman Sarma Wahidul Alam Ishita Saha
Department of Computer Science and Department of Computer Science and Department of Computer Science and
Engineering Engineering Engineering
Rangamati Science and Technology University of Science and Technology University of Science and Technology
University Chittagong Chittagong
Rangamati, Bangladesh Chittagong, Bangladesh Chittagong, Bangladesh
dhiman001@yahoo.com wahidulalam@ustc.ac.bd ishita_saha05@yahoo.com

Mohammad Nazmul Alam Mohammad Jahangir Alam Sohrab Hossain


Department of Computer Science and Department of Computer Science and Department of Computer Science and
Engineering Engineering Engineering
Royal University of Dhaka Southern University Bangladesh East Delta University
Dhaka, Bangladesh Chittagong , Bangladesh Chittagong, Bangladesh
mna235@yahoo.com Jahangir@southern.edu.bd sohrab.h@eastdelta.edu.bd

Abstract- Bank fraud is a federal crime that involves fraudulent used to identify fraud scenarios and other similar types of
attempts aims for monetary gains by deceiving financial scams by the industry. It is also capable to stop fraud
institutions. Every year, banks and financial institutions lose premises in real-time. Still, there has been no concrete
billions due to fraud. Fraudsters tempt bankers through scams
system developed yet to prevent fraud. But the best practice
to gain financial assets. The most common types of bank fraud
is to develop methods for fraud detection. It can be achieved
include debit and credit card fraud, account fraud, insurance
fraud, money laundering fraud, etc. Bankers are obliged to
by using the individual data points and the connection
safeguard their financial assets as well as institutional integrity among them. Sometimes these connections go unperceived
to armored the global financial system. Anti-fraud guard until it is in the final stage. The connections among various
systems are regularly circumvented by fraudsters’ dodging data are important for identifying fraud because it can
techniques. This paper proposed a system to detect bank fraud behold the best clues. Understanding the connection of
using a community detection algorithm that identifies the various data and deriving meaning from these links can be
patterns that can lead to fraud occurrences. An agile method drawn from one’s existing data. The graph is used to
was used to design the web-based application to detect the
rearrange the problem and to formulate a new look. Unlike
fraud. The application functioned as a central hub between the
most other ways of looking at data, the graph is designed to
banks and customers. Neo4j, a graph database, was used for
creating and representing the database, and the Cypher query
expose the connections. Graph databases are used to observe
was used as a graph query language. The proposed system patterns very rapidly but they are difficult to detect using
successfully detected all frauds presented during the test common symbolizes such as tables. At present, a large
experiment. This paper will assist bankers to combat fraud by number of companies are using graph databases to solve
detecting and preventing similar occurrences. different types of data related problems, including fraud
detection.
Keywords— Fraud Detection; Bank Fraud Detection;
In paper [5], researchers used data from the bank credit
Community Detection Algorithm; Machine Learning; Neo4j;
cardholders’ dataset. The detection technique was developed
Cypher Query Language
based on the transactional data of credit card accounts.
The artificial neural network (ANN) was tested on
I. INTRODUCTION various types of scam such as stolen credit card information,
solicitation scam [6], camouflaged scam, and non-received
Banks and insurance companies are lucrative targets of
issue scam. The ANN algorithm capable of detecting fraud
fraudsters. They successfully seize billions worth financial
more significantly with fewer false positives over fuzzy
assets every year. Fraud detection [1-4] plays an important
rule-based scam detection methods. The paper measured the
role in minimizing these losses. Nevertheless, fraudsters
network performance for this data set for accuracy and got
continue their invasion by defeating all the existing and
the nearest to scam detection. The application was run on an
newly developed anti-fraudulent techniques with their clever
dodgings. Among the effective tools, graph databases are

978-1-7281-5374-2/20/$31.00 ©2020 IEEE 642

Authorized licensed use limited to: National Inst of Training & Indust Eng - Mumbai. Downloaded on August 12,2021 at 08:04:25 UTC from IEEE Xplore. Restrictions apply.
Proceedings of the Second International Conference on Inventive Research in Computing Applications (ICIRCA-2020)
IEEE Xplore Part Number: CFP20N67-ART; ISBN: 978-1-7281-5374-2

IBM 3080 computer and was recently in use for scam A. Training Module
detection on that bank's data portfolio [7-12]. A training module was used for making step-by-step
The objectives of these papers [13-15] were to develop processes to present a more realistic view of the information
a successful scam detection model to detect online banking flow. The training module consists of raw data given by the
scams and classification. Besides, the scope was to express users to detect if there are any fraudsters exist or not. NID,
its impact on the rapid and trustworthy detection of any Passport number, Mobile Number, Address, etc. were being
“abnormal” transactions. These abnormal transactions were passed through the module and later, data were stored via
considered outliers for the banking system and treated as the storage module, and finally processed by the detection
fraud activities. module.
Detecting communities play an important role in
network science with multiple applications. Community
detection is widely used in the area of machine learning
applications. Because it helps to detect the presence of an
internal network organization at a harsh level and detect
further internal special affairs among the various nodes
which are very hardtop to detect from direct empirical tests.
Moreover, it allows researchers to comprehend the attributes
of vigorous procedures taking place in a network. As
syntagmatic examples, spreading processes of an outbreak
are significantly affected by the community structure of the
graph [16-19]. Data driven approaches are extremely useful
to detect pattern in the dataset [20-23]. Security in banking
Fig. 2. Training module
sector can be achieved by using encryption and data driven
based fraud detection system [24-27].
B. Storage Module
II. METHODOLOGY
The storage module is a system module to store the data
The proposed model for detecting fraud is presented in
from the training dataset. It stores the data in the system and
figure 1. The proposed architecture in this paper consists of
sends the data to the training module while necessary. In
three major modules such as a training module, a storage
this system, neo4j detailed with a bolt driver to store the
module, and a detection module. The input data were
data and reprocess the data to the training module if
entered by the legitimate bankers or users and captured by
necessary. Besides, the neo4j cluster was being used to
the training module. The training module passed test data to
process the data in the storage module (figure 2).
the storage module to store the data. The retrieval of data by
the training module occurred when necessary. After then
data was processed to the detection module to detect the
fraud by using a cipher query builder and community
system algorithm. The modules of figure 1 are described
below:

Fig. 3. Storage Module

C. Detection Module
Figure 4 shows the inside structure of the detection
module. The core elements of the module are test data,
cipher query builder, community detection algorithm, and
response builder. Raw data were taken by the test data as a
training dataset. The raw data were searched by the cypher
Fig. 1. Architectural of the proposed system query builder. Then the data passed through a community
detection algorithm to check the connection between each

978-1-7281-5374-2/20/$31.00 ©2020 IEEE 643

Authorized licensed use limited to: National Inst of Training & Indust Eng - Mumbai. Downloaded on August 12,2021 at 08:04:25 UTC from IEEE Xplore. Restrictions apply.
Proceedings of the Second International Conference on Inventive Research in Computing Applications (ICIRCA-2020)
IEEE Xplore Part Number: CFP20N67-ART; ISBN: 978-1-7281-5374-2

node of the raw data and showed the result set via the
response builder.

Fig. 6. New Account.

Fig. 4. Detection Module

V. RESULT AND DISCUSSION

A web-based application was developed to detect


fraud in this paper. The application performs as a central
hub between the users and the banks to easily detect fraud.
The application was built as user friendly.

Fig. 7. Account List.

Later if any new customers want to open a new account


you can check the data whether the information is valid or
not by clicking on the “Search” button. There are three
most important attributes provided which cannot be the
same as two or more persons, these are national
identification (NID) number, passport number, and mobile
number.

Fig. 5. Home page.

Here, in figure 5, the data manager of every bank has


the facility to input data of the account holder manually by
clicking on the “New Account” button and the information
of the clients are immediately stored in the database which
could be viewed later by “Account List” button. But only
the bank admins could have access to this panel to enter the
input and visualize the list. This is the initial point where
raw data were generated for the training module and storage
module. Each time the data which were entered recently,
can be retrieved from the storage module and viewed by the Fig. 9. Search and the three most important attributes.
account list form (figure 6).

978-1-7281-5374-2/20/$31.00 ©2020 IEEE 644

Authorized licensed use limited to: National Inst of Training & Indust Eng - Mumbai. Downloaded on August 12,2021 at 08:04:25 UTC from IEEE Xplore. Restrictions apply.
Proceedings of the Second International Conference on Inventive Research in Computing Applications (ICIRCA-2020)
IEEE Xplore Part Number: CFP20N67-ART; ISBN: 978-1-7281-5374-2

In case of any one of the attributes gets matched then it


is detected immediately by the application and suspected as
a fraud. Then Bank can initiate further investigation and
take legal actions on them. Neo4j was used for creating and
representing the database graphically as it makes it easier to
detect by viewing the connections.

Fig. 10. Fraud Scene.

In another scenario, if a new customer’s data are always


checked as valid and do not match or linked with any other
account holder. Hence the customer’s remarks pointed as
clean.

Fig. 11. For example two frauds of the same National Identification (NID)
number.

Fig. 12. Clean account holder.

VI. CONCLUSION

Fraud detection is a challenging and intelligent task in


the digital era. Expert systems and intelligent software are
employed to defend fraud. This paper presented a fraud
detection system using a community detection algorithm. It
Fig. 8. Graphical representation of detecting fraud using Neo4j.
was developed through a web-based application that acts as
an intermodal hub between the bankers and the customers.
We tested another type of fraud in our experiment.
Neo4j, a graph database system, was implemented for
Suppose, “Jishan”, who had an account in a bank named
searching and filtering fraud cases. Dataset, entered by the
„ABC Bank Limited‟ two years ago and got bankrupt. This
authorized banker, was passed through the training module
bankrupt information was updated in the bank's server. After
to the storage module and can be retrieved by the training
two years later when he went to another bank to open a new
module on demand. These test data were then processed
account to apply for a new loan. Our system checked the
through a detection module where cypher query builder
attributes and found the previous details. It showed the
responded through the response builder and detect fraud.
bankrupt information and detected the event as fraud which
The test experiments were successfully detected by the
restricted the application of “Jishan” to perform.
application system and graphically represented to the users.

978-1-7281-5374-2/20/$31.00 ©2020 IEEE 645

Authorized licensed use limited to: National Inst of Training & Indust Eng - Mumbai. Downloaded on August 12,2021 at 08:04:25 UTC from IEEE Xplore. Restrictions apply.
Proceedings of the Second International Conference on Inventive Research in Computing Applications (ICIRCA-2020)
IEEE Xplore Part Number: CFP20N67-ART; ISBN: 978-1-7281-5374-2

In the future, we will predict fraud associated with known issues," (in English), Financial and Credit Activity-Problems of
Theory and Practice, Article vol. 1, no. 28, pp. 36-45, 2019.
fraudsters and their past activities by using a convolution
[19] G. S. Carnivali, A. B. Vieira, A. Ziviani, and P. A. A. Esquef,
neural network (CNN). "CoVeC: Coarse-grained vertex clustering for efficient community
detection in sparse complex networks," (in English), Information
Sciences, Article vol. 522, pp. 180-192, Jun 2020.
REFERENCES [20] F. Ahmed, et al., "A Combined Belief Rule based Expert System to
[1] S. Daliri, "Using Harmony Search Algorithm in Neural Networks to Predict Coronary Artery Disease," in 2020 International Conference
Improve Fraud Detection in Banking System," (in English), on Inventive Computation Technologies (ICICT), 2020, pp. 252-257.
Computational Intelligence and Neuroscience, Article vol. 2020, p. 5, [21] S. Hossain, et al., "A Belief Rule Based Expert System to Predict
Feb 2020, Art. no. 6503459. Student Performance under Uncertainty," in 2019 22nd International
[2] S. M. Darwish, "An intelligent credit card fraud detection approach Conference on Computer and Information Technology (ICCIT), 2019,
based on semantic fusion of two classifiers," (in English), Soft pp. 1-6.
Computing, Article vol. 24, no. 2, pp. 1243-1253, Jan 2020. [22] S. Hossain et al. "A Critical Comparison between Distributed
[3] E. A. Minastireanu and G. Mesnita, "Methods of Handling Database Approach and Data Warehousing Approach." International
Unbalanced Datasets in Credit Card Fraud Detection," (in English), Journal of Scientific & Engineering Research, Article 5.1 (2014):
Brain-Broad Research in Artificial Intelligence and Neuroscience, 196-201.
Article vol. 11, no. 1, pp. 131-143, Mar 2020. [23] K. Noor et al., "Performance analysis of a surveillance system to
[4] A. Somasundaram and S. Reddy, "Parallel and incremental credit card detect and track vehicles using Haar cascaded classifiers and optical
fraud detection model to handle concept drift and data imbalance," (in flow method," 2017 12th IEEE Conference on Industrial Electronics
English), Neural Computing & Applications, Article vol. 31, pp. 3-14, and Applications (ICIEA), Siem Reap, 2017, pp. 258-263.
Jan 2019. [24] D Sarma., 2012. Security of Hard Disk Encryption. Masters Thesis.
[5] Y. Wu, Y. J. Xu, and J. Y. Li, "Feature construction for fraudulent Identifiers: urn:nbn:se:kth:diva-98673(URN), Royal Institute of
credit card cash-out detection," (in English), Decision Support Technology.
Systems, Article vol. 127, p. 11, Dec 2019, Art. no. 113155. [25] S. Hossain, D. Sarma, R. J. Chakma, W. Alam, M. M. Hoque, and I.
[6] S. Arora and M. P. S. Bhatia, "Fingerprint Spoofing Detection to H. Sarker, "A Rule-Based Expert System to Assess Coronary Artery
Improve Customer Security in Mobile Financial Applications Using Disease Under Uncertainty," in Computing Science, Communication
Deep Learning," (in English), Arabian Journal for Science and and Security, Singapore, 2020, pp. 143-159: Springer Singapore.
Engineering, Article vol. 45, no. 4, pp. 2847-2863, Apr 2020. [26] S. Hossain, A. Abtahee, I. Kashem, M. M. Hoque, and I. H. Sarker,
[7] O. Ata and L. Hazim, "Comparative Analysis of Different "Crime Prediction Using Spatio-Temporal Data," in Computing
Distributions Dataset by Using Data Mining Techniques on Credit Science, Communication and Security, Singapore, 2020, pp. 277-289:
Card Fraud Detection," (in English), Tehnicki Vjesnik-Technical Springer Singapore.
Gazette, Article vol. 27, no. 2, pp. 618-626, Apr 2020. [27] H. Alqahtani et al., "Cyber Intrusion Detection Using Machine
[8] A. Eshghi and M. Kargari, "Introducing a new method for the fusion Learning Classification Techniques," in Computing Science,
of fraud evidence in banking transactions with regards to Communication and Security, Singapore, 2020, pp. 121-131: Springer
uncertainty," (in English), Expert Systems with Applications, Article Singapore.
vol. 121, pp. 382-392, May 2019.
[9] R. Seidlova, J. Pozivil, and J. Seidl, "Marketing and business
intelligence with help of ant colony algorithm," (in English), Journal
of Strategic Marketing, Article vol. 27, no. 5, pp. 451-463, 2019.
[10] X. L. Xu, N. Hu, M. Trovati, J. Ray, F. Palmieri, and H. M. Pandey,
"DLCD-CCE: A Local Community Detection Algorithm for Complex
IoT Networks," (in English), Ieee Internet of Things Journal, Article
vol. 7, no. 5, pp. 4607-4615, May 2020.
[11] A. Zamir et al., "Phishing web site detection using diverse machine
learning algorithms," (in English), Electronic Library, Article vol. 38,
no. 1, pp. 65-80, Jan 2020.
[12] Z. K. Zandian and M. R. Keyvanpour, "Feature Extraction Method
Based on Social Network Analysis," (in English), Applied Artificial
Intelligence, Article vol. 33, no. 8, pp. 669-688, Jul 2019.
[13] M. Zamini and S. M. H. Hasheminejad, "A comprehensive survey of
anomaly detection in banking, wireless sensor networks, social
networks, and healthcare," (in English), Intelligent Decision
Technologies-Netherlands, Article vol. 13, no. 2, pp. 229-270, 2019.
[14] F. Karimi, S. Lotfi, and H. Izadkhah, "Multiplex community detection
in complex networks using an evolutionary approach," (in English),
Expert Systems with Applications, Article vol. 146, p. 20, May 2020,
Art. no. 113184.
[15] U. Fiore, A. De Santis, F. Perla, P. Zanetti, and F. Palmieri, "Using
generative adversarial networks for improving classification
effectiveness in credit card fraud detection," (in English), Information
Sciences, Article vol. 479, pp. 448-455, Apr 2019.
[16] R. B. Xu, Y. Che, X. M. Wang, J. X. Hu, and Y. Xie, "Stacked
autoencoder-based community detection method via an ensemble
clustering framework," (in English), Information Sciences, Article
vol. 526, pp. 151-165, Jul 2020.
[17] I. Gonzalez-Carrasco, J. L. Jimenez-Marquez, J. L. Lopez-Cuadrado,
and B. Ruiz-Mezcua, "Automatic detection of relationships between
banking operations using machine learning," (in English), Information
Sciences, Article vol. 485, pp. 319-346, Jun 2019.
[18] M. Pohoretskyi, D. Serhieieva, and Z. Toporetska, "The proof of the
event of a financial resources fraud in the banking sector: problematic

978-1-7281-5374-2/20/$31.00 ©2020 IEEE 646

Authorized licensed use limited to: National Inst of Training & Indust Eng - Mumbai. Downloaded on August 12,2021 at 08:04:25 UTC from IEEE Xplore. Restrictions apply.

You might also like

pFad - Phonifier reborn

Pfad - The Proxy pFad of © 2024 Garber Painting. All rights reserved.

Note: This service is not intended for secure transactions such as banking, social media, email, or purchasing. Use at your own risk. We assume no liability whatsoever for broken pages.


Alternative Proxies:

Alternative Proxy

pFad Proxy

pFad v3 Proxy

pFad v4 Proxy