Conference Paper · June 2019

3rd INTERNATIONAL CONFERENCE ON QUANTITATIVE, SOCIAL, BIOMEDICAL & ECONOMIC ISSUES 2019 - ICQSBEI 2019

RISK ANALYSIS AND DECISION SUPPORT SYSTEMS WITH DATA

MINING TECHNIQUES IN CUSTOMS ADMINISTRATORS
Konstantia Tachtsidou
Alexander Technological Educational Institute of Thessaloniki
e-mail: kontax80@gmail.com

Efstathios Kirkos
Alexander Technological Educational Institute of Thessaloniki
e-mail: stkirk@acc.teithe.gr

Abstract
Customs administrations develop automated systems so that their limited inspection force can
optimally inspect a rapidly increasing number of transactions. The literature suggests that data
mining techniques outperform traditional methods, especially in the domain of risk analysis and
fraud detection. Data mining is the process of analyzing vast datasets in order to reveal valuable
information. It is also a key tool for the analysis of the structured and unstructured data known
as Big Data. This paper reviews research studies conducted worldwide in the Customs Risk
Management domain using data mining techniques. Various combinations of keywords were used,
and the search yielded a sample of 16 relevant articles that deal with different problems that
Customs face, such as the profiling of economic operators and the efficient use of
manpower. The largest single share of them were conducted in China. It is remarkable that very few
studies were conducted in European countries and none in Greece. The majority of the
studies, about 63%, deal with the detection of smuggling and miscoding, two of the
biggest problems faced by Customs. Of the 24 data mining techniques used in the articles
analyzed, the Decision Trees model appears to be the leading one in detecting fraud. In general,
supervised learning tools have been used more frequently than unsupervised ones. The
best-performing technique reported for a Customs Risk Management model is the combination of
Neural Networks and Decision Trees. Most of the models achieved high accuracy, above 90%. In most
studies, 10 out of 16, the data come from customs declarations. This review summarizes the current
trends in the field of Customs Risk Management, reveals opportunities and needs, constitutes
a source of knowledge for relevant research and aspires to stimulate the interest of future
researchers and practitioners in Customs administrations.

Keywords: data mining, customs administrations, smuggling, fraud detection, customs risk
management

JEL Classification: C38, C45, C53, C55, H26

1. INTRODUCTION

Customs administrations are responsible for implementing a wide range of policies in areas
as diverse as the collection of duties and taxes, the protection of security and safety, and
compliance with trade regulations. Customs also often support the work of other services such as
police and immigration authorities. The main problem that customs face is smuggling, that is, the
clandestine import of goods or the evasion of taxes by circumvention of customs controls.
The consequences are the loss of customs duties and taxes and considerable
disruption to trade due to the increase in unfair competition. The detection of smuggling
requires physical examination of goods. These controls must be quick and effective in order
not to disturb trade flows in a fast-moving economy. However, due to the large increase in
trade over the last few years and the limited manpower and material resources of customs, the
physical examination of all shipments is impossible.
For this reason, selectivity and risk targeting methods are applied through automated risk
analysis systems, so as to maintain a proper balance between customs controls and the facilitation
of legitimate trade. The existing systems are based on simple selectivity criteria, focusing
on the goods, the importer, the exporter and the carrier, plus a random target. They are also
strengthened by the exchange of information between customs authorities. However, they
have not proven to be particularly effective. They require customs officers to control a large
number of transactions, which frequently results in very low recorded offence rates. For
example, in Greece in 2017, 79,513 checks were carried out and only 5,608 of them revealed
actual infringements, i.e. only 7%, while for Excise Duty products the rate was much lower,
i.e. only 3.3% [1]. Also, the existing systems cannot make use of unstructured data and therefore
ignore a large part of the available data.
Thus, many customs administrations are turning to other practices, such as data mining, in order
to build more efficient automated systems and improve their targeting/selectivity processes.
Data mining refers to extracting or “mining” knowledge from large amounts of
data and can be used to develop systems that support strategic decisions in customs risk management.
Data mining techniques are already being used in many areas that face the risk of
fraud, whose detection is one of the fundamental applications of data mining. Examples include
insurance fraud, credit card fraud, telecommunications fraud, and check forgery. Data mining is
also widespread in sales promotion as an important tool for improving competitiveness, and it
is even used in medicine to recognize predisposition to illness.
The World Customs Organization and the European Union have already underlined the
importance of data mining in the domain of Customs risk management, something that has
also been recognized by the Greek customs administration.
This paper reviews research studies conducted worldwide in the Customs Risk
Management domain using data mining techniques and is structured in five sections. Following
this introduction, Section 2 describes customs controls and customs risk management.
Section 3 takes a theoretical approach to data mining in order to consolidate the
relevant knowledge discovery processes. Section 4 presents sixteen studies that have
been published in scientific journals and conference proceedings around the world, describing
the customs problems addressed, the datasets used and the data mining models
developed. A description of the algorithm of each model is given, their
accuracy is reported and a comparative analysis of all studies is made. Finally, Section 5
sets out the conclusions, the contributions of the survey and suggestions for future research.

2. CUSTOMS ADMINISTRATIONS AND RISK

2.1 CUSTOMS CONTROLS

Customs controls are all acts performed by the customs authorities in order to ensure
compliance with the customs legislation and other legislation governing the movement of
goods from and to non-EU countries [18]. For the purposes of customs controls, customs
authorities verify the accuracy and completeness of the declaration, which is the document
that lists the details of the imported or exported product. In particular, the elements of the
declaration to be controlled are:
a) Origin of goods: the country or territory where the goods were obtained.
b) Code of goods. All imported/exported products are classified under a 10-digit tariff code
that carries information about the customs duty rates and non-tariff measures. The EU
classification system consists of the Combined Nomenclature (CN), which serves the EU’s
common customs tariff, plus the Integrated Tariff (TARIC), which provides information on all
trade policy and tariff measures applicable to specific goods in the EU (e.g. suspension of
duties, tariff quotas, tariff preferences, anti-dumping measures) [20]. The Combined
Nomenclature is based on the Harmonized Commodity Description and Coding System
(HS) developed by the WCO. The classification of goods also determines the applicable taxes. The
most important taxes by share of revenue are VAT and Excise Duty. VAT is
calculated as a percentage of the taxable amount. Its share of total customs revenue in Greece
is quite high: 38.57% in 2017, which corresponds to 4,896.0 million euro. Excise Duties cover
alcohol, alcoholic drinks, energy products, electricity, and tobacco products, and their
share of total Greek customs revenue in 2017 was 55.61% [3]. Most cases of
smuggling are also recorded in Excise Duty products. In 2017 huge quantities of products
were seized in Greece, with the total amount of tax evaded reaching 81,835,712.60 euro [2].
c) Value of goods: the economic value of the goods declared for importation.
All the above elements constitute the “identity” of the goods, which provides the basis for
assessing the duty to be paid.
Other significant elements of the declaration are the economic operators, such as the importer,
the exporter and the carrier, i.e. the person who brings the goods, or assumes responsibility for
their carriage, into or out of the customs territory of the Union.
Customs controls, other than random checks, shall primarily be based on risk analysis using
electronic data-processing techniques, with the purpose of identifying and evaluating the
risks.

2.2 RISK

Risk is the probability of non-compliance with customs laws. Customs risk management
comprises all practices that provide customs with the information necessary to identify and
effectively deal with high-risk transactions. The EU has gradually developed a common risk
management framework (CRMF) that provides an equivalent level of protection at its external
borders and is implemented by all Member States. It is based upon the exchange of risk
information (RIF) and risk analysis results between customs administrations. It also establishes
common risk criteria (CRC) and priority control areas (PCA) that include indicators of risk,
the nature and duration of customs controls, and the types of goods, transport routes or
economic operators that are subject to increased levels of risk analysis [19].
However, the existing system is not used effectively enough. According to the European Court
of Auditors [8], there are weaknesses in the information exchange tools, both in their
content and in their use. Too much information, inappropriate feedback from the Member States
and many messages about relatively small and local risks were reported. All of this led to an
excess of information and to difficulty in identifying the key risks.

3. DATA MINING

Data are divided into two categories: structured data, collected from business information
monitoring systems, and unstructured data, which come from additional sources such as
social media, email, customer feedback on a company’s products, smartphones, in-vehicle
infotainment devices, etc. Together, these two categories constitute Big Data, which grows to a
great volume.
Big Data can be analyzed for insights that lead to better decisions and strategic business
moves. Some Customs administrations have already embarked on Big Data initiatives,
leveraging the power of analytics, ensuring the quality of data (regarding cargoes, shipments
and conveyances) and widening the scope of the data they can use for analytical purposes to
ensure that better-informed and smarter decisions are taken [15]. For example, the New
Zealand and Hong Kong customs have created joint repositories that combine sets of data
from various government authorities. The United Kingdom aims to gain access to commercial
data flows in the supply chain, and Canada is focusing on integrating additional resources and
information into its existing Enterprise Data Warehouse, so as to improve its ability to
assess the risk posed by goods and people through biometric examination, facial recognition, and
automated lie detection.
The analysis of Big Data requires the application of new, efficient methods such as
Knowledge Discovery in Databases, in other words data mining. Data mining combines
methodologies from statistics, machine learning, artificial intelligence and other disciplines and
provides useful knowledge by revealing trends, patterns, exceptions and relationships in
vast datasets [12]. The steps of processing Big Data are a) storage, b) cleaning (removing
seemingly abnormal data that may distort the results), and c) analysis (applying the
technique appropriate to the problem to be solved and mining the
information needed to solve it). There are several data mining techniques, which are divided
into supervised learning techniques (the target is known in advance and the algorithm tries to
find the relationship between the data and the target) and unsupervised learning techniques (the
target is not known in advance).

The main techniques of supervised learning are classification and regression. Both
learn the relationship between the input data and a known target. The algorithm "learns" from a
training set and then predicts the target for the data of a test set according to the knowledge it
has acquired. Regression is used for numerical targets, while classification is used for nominal
class values. A well-known example of a classification problem is the approval of bank loans.
Methods of classification are:
a) Neural Networks: they use a set of nodes connected to each other like the neurons of the
human brain and learn from experience, as humans do.
b) Decision Trees: they represent the relationships in the data in the form of a tree.
c) Bayesian Networks: probabilistic models based on conditional probability, i.e. the
probability of case A given that B is true.
d) k-nearest neighbor: the classification is based on the k nearest points of the training
set.
e) Support Vector Machines: they construct a hyperplane that optimally separates
the data, represented in a multidimensional space.
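
As a minimal sketch of method (d), the k-nearest-neighbor rule can be written in a few lines of Python. The two features (declared value and gross weight, both assumed to be already scaled to the range 0 to 1), the labels and the toy records are hypothetical illustrations and are not taken from any of the reviewed studies.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among its k nearest training points.

    `train` is a list of (features, label) pairs; features are numeric tuples.
    """
    # Sort the training records by Euclidean distance to the query point.
    by_distance = sorted(train, key=lambda rec: math.dist(rec[0], query))
    # Majority vote over the labels of the k closest records.
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


# Hypothetical, already-scaled features: (declared value, gross weight).
training_set = [
    ((0.10, 0.90), "fraud"),
    ((0.15, 0.80), "fraud"),
    ((0.20, 0.85), "fraud"),
    ((0.80, 0.75), "legitimate"),
    ((0.85, 0.90), "legitimate"),
    ((0.90, 0.80), "legitimate"),
]

print(knn_predict(training_set, (0.12, 0.88)))
print(knn_predict(training_set, (0.88, 0.82)))
```

The first query lies next to the three records labelled "fraud" and the second next to the three labelled "legitimate", so each receives the label of its neighbors. In practice the choice of k and of the distance measure would have to be validated on real declaration data.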

The unsupervised learning techniques are the following:

Clustering: the algorithm groups the data according to their common attributes. It is mainly
applied to promotional campaigns tailored to the specific features of each group. The best-known
clustering algorithm is k-means.
Association Pattern Discovery: the algorithm identifies relationships among the data. It
is mainly applied to achieve cross-selling. The most popular algorithm is Apriori.
Anomaly detection: attempts to identify exceptions. It is an important field of data mining,
particularly for detecting fraud, for example the detection of a stolen credit card.
Time Series Analysis: this technique analyzes patterns that develop over time. Monitoring
the fluctuation of a business's sales over time is an example of such a technique.
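
To make the clustering idea concrete, the following is a minimal sketch of k-means in plain Python. It is an illustration only: the two-dimensional points are hypothetical, and the initial centers are simply taken to be the first k points so that the run is reproducible (implementations normally choose them at random).

```python
import math


def k_means(points, k, iterations=20):
    """Lloyd's algorithm: alternate assignment and center-update steps."""
    # Assumption for reproducibility: the first k points seed the centers.
    centers = [points[i] for i in range(k)]
    clusters = []
    for _ in range(iterations):
        # Assignment step: attach every point to its nearest center.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centers[i]))
            clusters[nearest].append(p)
        # Update step: move each center to the mean of its cluster.
        new_centers = []
        for i, members in enumerate(clusters):
            if members:
                new_centers.append(
                    tuple(sum(coord) / len(members) for coord in zip(*members))
                )
            else:
                new_centers.append(centers[i])  # keep the center of an empty cluster
        if new_centers == centers:  # no center moved: converged
            break
        centers = new_centers
    return centers, clusters


# Hypothetical 2-D data forming two well-separated groups.
data = [(1.0, 1.0), (9.0, 9.0), (1.5, 2.0), (8.0, 8.5), (1.0, 1.5), (9.5, 8.0)]
centers, clusters = k_means(data, k=2)
print(centers)
print(clusters)
```

On this toy data the three points near (1, 1) end up in one cluster and the three near (9, 9) in the other. With real declarations, the attributes would need to be scaled first, since k-means relies on Euclidean distance.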

4. CRITICAL ANALYSIS OF THE LITERATURE

4.1 CUSTOMS PROBLEMS

The literature suggests that data mining may improve the efficiency and rationality of customs
decision-making, favoring policies that involve the least risk and the best probable outcomes. This
paper reviews research studies that use data mining techniques to resolve the main problems
customs face during customs controls. Various combinations of keywords were
used to identify the pertinent articles, and the search yielded a sample of 16 relevant articles.
The table below presents the customs problems addressed by applying data mining
techniques in the sixteen studies, which are the main problems customs administrations face
in conducting controls. The "Studies" column shows how many articles of this
survey dealt with each problem, and the last column shows the countries
where the studies were carried out.

Table 1. Main problems Customs face during customs controls

| No | Customs problem | Studies | Country |
|----|-----------------|---------|---------|
| 1 | Targeting declarations for control | 4 | Netherlands, Iran, United States, Ivory Coast |
| 2 | Detection of smuggling vessels | 1 | Taiwan |
| 3 | Empty container verification | 1 | United Kingdom |
| 4 | Classification of economic operators in groups of risk/reliability | 3 | Ethiopia, China |
| 5 | Development of structured database for the Customs | 2 | Brazil, China |
| 6 | Automated drug detection | 1 | China |
| 7 | Prediction of ships route | 1 | Malta |
| 8 | Association of import-export commodities with other commercial factors | 1 | Ethiopia |
| 9 | Detection of fraud declarations, using a dataset where the sample of fraudulent declarations is too small compared to the non-fraudulent one (1-2%) | 2 | China, India |

The majority of these studies develop models for fraud-related detection tasks:
smuggling, misclassification, drug detection, empty container verification and the detection of
smuggling vessels. Other studies present methodologies for building a structured database for
Customs, capable of improving the accuracy of fraud detection systems. Three of the studies
deal with the classification of economic operators into groups of risk/reliability for the direct
identification of potential changes in compliance behavior and of emerging risks. Another study
proposes a model that predicts ship routes, and in the last study a data mining model
identifies the association of import-export commodities with other commercial factors, such
as market price and exchange rate. This knowledge allows Customs to understand variations
in the demand for services, so that they can reallocate resources more efficiently.
China is the most frequently represented country in these studies, while one study was conducted
in the United States. It is remarkable that very few studies were conducted in European countries
and none in Greece.

4.2 REPORT OF STUDIES

The following tables summarize the 16 articles by type of data mining technique. They also
present the datasets used, the algorithms applied in each case and the
performance of each algorithm.

Table 2. Summary of published articles that used supervised learning techniques

| No | Customs problem | Country | Algorithm | Dataset | Accuracy | Author |
|----|-----------------|---------|-----------|---------|----------|--------|
| 1 | Classification of customs risk | China | Fuzzy logic/Back Propagation | 50 declarations of 2010 (30 as a train set & 20 as a test set) | 70% | Duan (2013) [5] |
| 2 | Detection of smuggling vessels | Taiwan | Neural Network/Logistic Regression | 150,000 declarations of the period 2002-2009 (70% as a train set & 30% as a test set) | Most accurate is Neural Network with 76.3% | Wena et al. (2012) [22] |
| 3 | Detection of fraud declarations, using a dataset where the sample of fraudulent declarations is too small compared to the non-fraudulent one (1%) | China | Decision Trees C4.5, multidimensional criterion, chi-square | Unknown number of import/export declarations | High accuracy | Shao et al. (2002) [10] |
| 4 | Detection of fraud declarations, using a dataset where the sample of fraudulent declarations is too small compared to the non-fraudulent one (1-2%) | India | Decision trees, with the predictions of this classification then fed into a Neural Network model | 329,137 declarations of 2004 as a train set and 57,075 of 2005 as a test set | 93.87% | Kumar & Nagadevara (2006) [13] |
| 5 | Automated verification of empty containers | United Kingdom | Decision trees-Random forest/Basic Image Feature/Intensity domains | 2.8 × 10^5 images of containers | 99.93%, and 93% detection of 1.25 grams of cocaine | Jaccard, Rogers, Morton & Griffin (2016) [11] |
| 6 | Classification of economic operators in groups of risk/reliability | India | Decision trees-J48/Neural Network/10-fold cross validation | 46,748 declarations | 99.95% J48, 99.71% NN | Bezabeh (2017) [4] |
| 7 | Detection of miscoding declaration and detection of smuggling | Netherlands | Naïve Bayes/Tree Augmented Bayes/Markov blanket | 10,154 declarations and information of 625 transactions (75% as a train set & 25% as a test set) | 55.7% & 31% Naïve Bayes; 65% & 52% Tree Augmented Bayes | Triepels, Feelders & Daniels (2015) [21] |
| 8 | Structured Customs database | Brazil | Interpretation Rules/Bayes Networks/N-grams | Brazilian Federal Revenue’s database | The strategies adopted within the Harpia project | Roman et al. (2009) [17] |
| 9 | Drug detection in milk powder cans | China | SVM/stacked auto encoder/5-fold cross validation/Back propagation | Images of milk powder cans | 91% | Zhu, Wang & Zhang (2018) [25] |
| 10 | Prediction of ships route | Malta | k-nn | 600,000 messages of 841 ships (22-26 October 2016) | 93%, and 100% in a real incident | Duca, Bacciu & Marchetti (2017) [6] |

Table 3. Summary of published articles that used unsupervised learning techniques

| No | Customs problem | Country | Algorithm | Dataset | Accuracy | Author |
|----|-----------------|---------|-----------|---------|----------|--------|
| 1 | Detection of fraud declarations | China | k-means/Dynamic k-means/Logistic Regression | 300,825 declarations (200,000 as a train set & 100,825 as a test set) | Very high | Xiao, Xiao & Wang (2016) [23] |
| 2 | Detection of fraud declarations | China | Q-cluster/Pareto law | 8,615 declarations of 2002 (1,506 as a train set) | 100% | Yan-hai & Lin-yan (2005) [14] |
| 3 | Detection of fraud declarations | Iran | X-means/Mahalanobis/k-nn/density based method | Unknown dataset | 73% | Rad et al. (2015) [16] |
| 4 | Association of import-export commodities with other commercial factors | Ethiopia | Apriori/Tertius/PredictiveApriori/Filtered Apriori | 704,573 declarations | The most accurate are Apriori (100 rules in 2.9 s) & Filtered Apriori (100 rules in 2.90 s) | Gebru (2018) [9] |
| 5 | Detection of fraud declarations | Ivory Coast | Apriori | 6,854 results of controls (May 2016-May 2018) | 57% | Zehero, Soro, Gondo, Brou & Asseu (2018) [24] |
| 6 | Detection of fraud declarations | United States | GBAD-MDL, GBAD-P, GBAD-MPS | Artificial dataset created with subgen tool | All algorithms with 100% | Eberle & Holder (2007) [7] |

The best-performing technique reported for a Risk Management model is the combination of Neural
Networks and Decision Trees. It achieved very good accuracy (93.87%) while also handling the
problem of imbalanced ("out-of-balance") datasets, where the sample of fraudulent declarations is
very small compared to the non-fraudulent one.
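
The difficulty posed by such imbalanced datasets can be illustrated with a short, hypothetical calculation. The sketch below assumes 1,000 declarations of which 2% are fraudulent (figures chosen only for illustration, in line with the 1-2% share mentioned in the tables) and compares a model that never flags anything with one that finds most of the fraud. The function name `evaluate` and both sets of predictions are invented for this example.

```python
def evaluate(actual, predicted, positive="fraud"):
    """Return (accuracy, recall) for the `positive` class."""
    pairs = list(zip(actual, predicted))
    correct = sum(a == p for a, p in pairs)
    true_pos = sum(a == positive and p == positive for a, p in pairs)
    actual_pos = sum(a == positive for a in actual)
    accuracy = correct / len(actual)
    recall = true_pos / actual_pos if actual_pos else 0.0
    return accuracy, recall


# Hypothetical sample: 1,000 declarations, 2% of them fraudulent.
actual = ["fraud"] * 20 + ["legitimate"] * 980

# A useless model that never flags anything still looks accurate.
never_flag = ["legitimate"] * 1000
acc_baseline, rec_baseline = evaluate(actual, never_flag)
print(f"never flag:   accuracy={acc_baseline:.1%}, recall={rec_baseline:.1%}")

# A model that catches 15 of the 20 frauds at the cost of 30 false alarms.
useful = ["fraud"] * 15 + ["legitimate"] * 5 + ["fraud"] * 30 + ["legitimate"] * 950
acc_model, rec_model = evaluate(actual, useful)
print(f"useful model: accuracy={acc_model:.1%}, recall={rec_model:.1%}")
```

The never-flag baseline reaches 98% accuracy while detecting no fraud at all, whereas the second model has slightly lower accuracy but detects three quarters of the fraudulent declarations. This is why accuracy figures on imbalanced customs data are best read together with a measure such as recall.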
In most studies, 10 out of 16, the data come from customs declarations. The
information held in customs databases therefore contributes significantly to the detection of
fraud through data mining. Of course, this does not reduce the importance of information
from other sources, such as X-ray images or ship routes, since they yield very high accuracy
rates.
Most of the models achieved high accuracy, above 90%. Thus, by adopting data mining
techniques, Customs can develop more efficient Customs Risk Management models.

4.3 DATA MINING TECHNIQUES

The articles analyzed used 24 data mining techniques. The table below summarizes
all the data mining techniques presented in the papers. In total, 27 algorithms were developed.
Some of them were combined to create hybrid models, while others were applied to the
same dataset and the performance of each was evaluated. However, each of them
addressed the problem for which it was built with very high accuracy.

Table 4. Most used data mining methods and their usage frequency
| No | Technique | Frequency | Description |
|----|-----------|-----------|-------------|
| 1 | Naïve Bayes | 1 | Uses Bayes' rule of conditional probability to compute the probability of a label given a set of features. |
| 2 | Tree-Augmented Naïve Bayes | 1 | Relaxes the independence assumption by specifying a tree structure on the feature set in which each feature has as parents only the label and at most one other feature. |
| 3 | Markov blanket | 1 | The set of all parents, children and children's other parents of a given node. |
| 4 | Mahalanobis | 1 | Measures the distance between observations while taking into account the correlations between them. |
| 5 | Density based method | 1 | The cluster expands as long as the neighborhood of the adjacent points has the required density. |
| 6 | X-means | 1 | An advanced form of the k-means clustering algorithm in which the number of clusters does not need to be determined exactly in advance. |
| 7 | Apriori | 2 | Prunes many of the itemsets that are unlikely to be frequent (after measuring their support), thus saving effort. Candidate itemsets are generated using only the frequent itemsets found in the previous pass. |
| 8 | GBAD-MDL | 1 | Uses MDL to find the best substructure in a graph and then examines all the instances of this substructure to find those similar to this pattern. |
| 9 | GBAD-P | 1 | Uses MDL and then examines whether any "extensions" of the pattern have a low probability. |
| 10 | GBAD-MPS | 1 | Uses MDL and then examines all the instances of ancestral substructures that are missing edges and vertices. |
| 11 | Logistic Regression | 2 | Widely used in problems where the dependent variable is binary (e.g. smuggling or non-smuggling). |
| 12 | Stacked auto encoder | 1 | Uses autoencoders to build networks: layered networks in which each layer takes as input the output features of the previous layer. |
| 13 | Random Forest | 1 | Creates a set of decision trees from randomly selected subsets of the training set and merges the scores of the different decision trees to select the final class of a test instance. |
| 14 | Oriented Basic Image Feature | 1 | BIFs are a scheme for classifying each pixel of an image into categories according to local symmetry. Oriented BIFs (oBIFs) extend BIFs by including the orientation of rotationally asymmetric features. |
| 15 | Intensity domains | 1 | Encode information about image intensity and its spatial distribution. |
| 16 | Interpretation Rules | 1 | Specify how to extract the value of a feature from the noisy information given in an unstructured text. The system then constructs the feature-value pair returned by the strong rule. |
| 17 | chi-square | 1 | Determines whether there is a significant difference between the expected frequencies and the observed frequencies in one or more categories. |
| 18 | C4.5 | 1 | Uses the gain ratio as the splitting criterion, i.e. the ratio of the information gain to the entropy of the split. |
| 19 | multi-dimension-criterion | 1 | Discovers association relationships between attributes, in a way that differs from the usual association algorithm. |
| 20 | k-means | 1 | n samples are randomly selected as the centers of the n groups. Each sample is then assigned to the nearest cluster and the center of each newly formed cluster is updated. |
| 21 | Dynamic k-means | 1 | Through repeated clustering, splits heterogeneous clusters until the optimum number of clusters is reached. |
| 22 | J-48 Decision Trees | 1 | The C4.5 algorithm for building decision trees is implemented in Weka as a classifier called J48. |
| 23 | Back Propagation NN | 1 | If the output differs from the expected one, the error is propagated back towards the input and the weights are recalculated until the output agrees with the expected value. |
| 24 | Q-cluster | 1 | An effective clustering is one in which the samples within one cluster are more similar to each other and samples from different clusters are less similar. |
| 25 | Tertius | 1 | Creates rules from the values of pairs of attributes in the training data. It uses a first-order logic representation and shows dependence on the number of lines in the rules. |
| 26 | Predictive Apriori | 1 | Candidates are ranked according to predicted accuracy. It tries to maximize the predictive accuracy of an association rule rather than its confidence. |
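
The splitting criterion of C4.5 listed in Table 4, commonly called the gain ratio, can be sketched in plain Python as follows. This is an illustrative calculation for a single categorical attribute only, not a full decision-tree learner; the attribute "channel" and the eight records are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def gain_ratio(values, labels):
    """Information gain of splitting on `values`, divided by the split information."""
    total = len(labels)
    # Group the class labels by the attribute value of each record.
    partitions = {}
    for v, y in zip(values, labels):
        partitions.setdefault(v, []).append(y)
    # Information gain: entropy before the split minus weighted entropy after it.
    remainder = sum(len(p) / total * entropy(p) for p in partitions.values())
    gain = entropy(labels) - remainder
    # Split information: entropy of the attribute's own value distribution.
    split_info = entropy(values)
    return gain / split_info if split_info > 0 else 0.0


# Hypothetical declarations: transport channel vs. inspection outcome.
channel = ["sea", "sea", "sea", "sea", "air", "air", "air", "air"]
outcome = ["fraud", "fraud", "fraud", "ok", "ok", "ok", "ok", "ok"]
print(round(gain_ratio(channel, outcome), 3))
```

A tree learner of this kind would evaluate the gain ratio for every candidate attribute and split on the one with the highest value. Dividing by the split information penalizes attributes with many distinct values, which would otherwise be favored by information gain alone.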

Supervised learning tools have been used more often, and the Decision Trees model appears to be
the leading one, since it was used in 4 articles and detected different types of fraud. Neural
Networks and Logistic Regression follow. Thus, it could be argued that supervised learning
techniques are better-performing tools than unsupervised ones in addressing the customs
difficulties that arise during controls. Among the unsupervised learning techniques, the
Apriori algorithm was used most often and also gave high accuracy.
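
As an illustration of how Apriori works, the sketch below finds frequent itemsets in a handful of hypothetical declarations, treating each attribute value as an "item". It is a minimal teaching version under these assumptions, not the implementation used in any of the reviewed studies, and it stops at frequent itemsets without generating association rules.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support count} for all itemsets meeting `min_support`."""
    transactions = [frozenset(t) for t in transactions]

    def count(candidates):
        # Support = number of transactions that contain the whole candidate set.
        return {c: sum(c <= t for t in transactions) for c in candidates}

    # Pass 1: frequent single items.
    items = {frozenset([i]) for t in transactions for i in t}
    frequent = {c: n for c, n in count(items).items() if n >= min_support}
    result = dict(frequent)
    size = 2
    while frequent:
        # Join step: build candidates of the next size from the previous pass.
        prev = list(frequent)
        candidates = {a | b for a, b in combinations(prev, 2) if len(a | b) == size}
        # Prune step: every (size-1)-subset of a candidate must itself be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in frequent for s in combinations(c, size - 1))
        }
        frequent = {c: n for c, n in count(candidates).items() if n >= min_support}
        result.update(frequent)
        size += 1
    return result


# Hypothetical declaration attributes treated as "items".
declarations = [
    {"textiles", "origin:X", "undervalued"},
    {"textiles", "origin:X", "undervalued"},
    {"textiles", "origin:Y"},
    {"electronics", "origin:X"},
    {"textiles", "origin:X", "undervalued"},
]
frequent_sets = apriori(declarations, min_support=3)
for itemset, support in sorted(frequent_sets.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), support)
```

Items that appear in fewer than three declarations are discarded in the first pass, so no larger candidate containing them is ever counted. This is the pruning that keeps the search tractable on large declaration databases.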

5. CONCLUSIONS

Customs are responsible for preserving commercial stability and the timely clearance of a large
number of transactions, while at the same time detecting and combating "fraudulent" behavior.
The detection of smuggling requires physical examination of all goods that are imported and
exported, which is impossible due to the large volume of international trade and the
limited manpower and material resources of customs. For this reason, automated targeting systems
that identify high-risk transactions are the most important tool for customs controls.
However, the present methods do not have the desired effect: a large share of transactions is
controlled, yet only a small percentage of violations is detected. Thus, customs
administrations are turning to data mining in order to create more sophisticated information
systems. Data mining provides useful knowledge by revealing the “valuable” information
hidden in vast databases. Through its techniques, systems based on neural networks, decision
trees, Bayes networks, clustering and association rules can be developed to support strategic
decisions in customs risk management. It has been used for years by a large number of
companies to increase profits and improve their competitiveness by describing past
trends and predicting future ones. It is also a key tool for the analysis of Big Data, i.e.
high-volume structured and unstructured data. Leveraging Big Data capabilities is
important for turning the available data into new insights. This will improve overall customs
operations and provide opportunities in many areas.
This study presents papers published in scientific journals and conference
proceedings worldwide and highlights the usefulness of data mining in solving major
problems faced by customs when conducting their controls. It describes the customs
problems encountered, the data sets used, and the data mining models developed. Their
comparative analysis shows that most of the studies have dealt with fraud detection
and risk targeting. Supervised learning techniques have been used more
often. This does not reduce the importance of the unsupervised learning techniques, which have
also yielded very high accuracy. Finally, in most studies, the data is derived from import /
export declarations. Therefore, the information held in customs databases can
contribute significantly to the detection of fraud through data mining techniques.
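To illustrate what a supervised technique learns from labelled declaration data, the sketch below fits a decision stump, that is, a decision tree with a single split. It is an illustration under stated assumptions and not the model of any reviewed study: the feature (a declared unit value), the labels and the function names `predict` and `fit_stump` are hypothetical, and a real system would use many attributes and a full tree-induction algorithm.

```python
def predict(value, threshold, flag_below):
    """Predict fraud (True/False) with a single-threshold rule.

    If flag_below is True, values below the threshold are flagged;
    otherwise values at or above the threshold are flagged.
    """
    return (value < threshold) == flag_below


def fit_stump(values, labels):
    """Fit a one-level decision tree (a "stump") on one numeric feature.

    Every midpoint between consecutive distinct values is tried as a
    threshold, in both orientations, and the rule with the fewest
    training errors is kept. Returns (threshold, flag_below, errors).
    """
    points = sorted(set(values))
    if len(points) < 2:
        raise ValueError("need at least two distinct feature values")
    best = None
    for low, high in zip(points, points[1:]):
        threshold = (low + high) / 2
        for flag_below in (True, False):
            errors = sum(
                predict(v, threshold, flag_below) != y
                for v, y in zip(values, labels)
            )
            if best is None or errors < best[2]:
                best = (threshold, flag_below, errors)
    return best


# Made-up training data: declared unit values and whether fraud
# (e.g. undervaluation) was confirmed for each declaration.
unit_values = [2.0, 2.1, 2.4, 3.0, 9.5, 9.8, 10.5, 11.2]
is_fraud = [True, True, True, False, False, False, False, False]

threshold, flag_below, errors = fit_stump(unit_values, is_fraud)
training_accuracy = 1 - errors / len(unit_values)
print(threshold, flag_below, training_accuracy)
```

The learned rule flags declarations whose declared unit value falls below the chosen threshold. The training accuracy on eight hand-made points says nothing about real-world performance; it only shows the mechanism by which labelled historical controls are turned into a targeting rule.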
The first contribution of this review is to highlight the usefulness of applying data
mining techniques in Customs decision-support systems, especially in the domain of fraud
detection, where the reviewed studies report considerably better accuracy than the systems that
Customs already use. Furthermore, it aspires to serve as guidance for both researchers and
practitioners in Customs administrations in selecting the most appropriate technique
when dealing with Customs Risk Management decisions.
It is necessary to mention that other factors should be exploited in order to make data mining
tools more efficient for Customs. The important ones are listed below:
a) Access to Big Data. Apart from the information derived from customs declarations,
customs authorities should take advantage of the information available from other sources,
such as other public services (tax authorities, etc.), other government agencies (Single
Window, E-government) and private operators. They should also exploit information
derived from the “cloud”, multilingual news sources and platforms
containing data from various electronic devices and objects, such as "connected
containers", which collect and exchange data as part of what is known as the "Internet of
Things".
b) The application of advanced data mining software such as SPSS, SAS, Maple,
Wolfram Mathematica, RapidMiner, Neural Designer, Oracle Data Miner, SAP, R,
Python, etc. could help analyze data in the least costly way.
c) Finally, there is a need to develop an enterprise culture in Customs based on the
“value of data analysis”.
An interesting direction for future research would be the application of data mining techniques
to Greek customs data for the detection of fraudulent declarations (undervaluation,
misclassification and smuggling). We leave these issues open for future research.

References

[1] A.A.D.E.gr. (2019). Customs Controls. Retrieved February 2, 2019, from:
https://www.aade.gr/sites/default/files/2018-2017.
[2] A.A.D.E.gr. (2019). Press Release. Retrieved February 2, 2019, from:
https://www.aade.gr/sites/default/files/2018-04/dt_05.04.2018.
[3] A.A.D.E.gr. (2019). Customs Revenue. Retrieved February 2, 2019, from:
https://www.aade.gr/sites/default/files/2018-2017.
[4] Bezabeh, B. (2017). The application of data mining techniques to support customer
relationship management: The case of Ethiopian revenue and customs authority, CoRR,
abs/1706.10050.
[5] Duan, J. (2013). Risk Identification and Evaluation of Customs Management Based on
Fuzzy Neural Network Algorithm. Applied Mechanics and Materials, 291-294, 2924-2927.
[6] Duca, A., Bacciu, C. and Marchetti, A. (2017). A K-nearest neighbor classifier for ship
route prediction. Paper presented at OCEANS 2017, Aberdeen, UK.
[7] Eberle, W. and Holder, L. (2007). Discovering Structural Anomalies in Graph-Based
Data. Paper presented at the Seventh IEEE International Conference on Data Mining
Workshops (ICDMW 2007), Omaha, USA.
[8] European Court of Auditors (2017). Special Report No 19. Import procedures:
shortcomings in the legal framework and an ineffective implementation impact the financial
interests of the EU (pursuant to Article 287(4), second subparagraph, TFEU). Luxembourg.
[9] Gebru, M. (2018). Association Pattern Discovery of Import Export Items in Ethiopia,
American Scientific Research Journal for Engineering, Technology, and Sciences
(ASRJETS), 44(1), 240-256.
[10] Shao, H., Zhao, H. and Chang, G. (n.d.). Applying data mining to detect fraud
behavior in customs declaration, Paper presented at the 2002 International Conference on
Machine Learning and Cybernetics, Beijing, China.
[11] Jaccard, N., Rogers, T., Morton, E. and Griffin, L. (2016). Tackling the x-ray cargo
inspection challenge using machine learning, Paper presented at the SPIE 9847, Anomaly
Detection and Imaging with X-Rays (ADIX), Baltimore, USA.
[12] Kirkos, E. (2015). Business Intelligence and Data Mining. Athens: Connection of
Hellenic academic libraries.
[13] Kumar, A. and Nagadevara, V. (2006). Development of Hybrid Classification
Methodology for Mining Skewed Data Sets - A Case Study of Indian Customs Data, Paper
presented at the 2006 IEEE International Conference on Computer Systems and Applications,
Dubai, UAE.
[14] Li Yan-hai, L. and Lin-yan, S. (2005). Study and applications of data mining to the
structure risk analysis of customs declaration cargo, Paper presented at the 2005 IEEE
International Conference on e-Business Engineering (ICEBE'05), Beijing, China.
[15] Okazaki, Y. (2017). Implications of Big Data for Customs - How It Can Support Risk
Management Capabilities, WCO Research Paper, No. 39.
[16] Rad, H., Arash, S., Rahbar, F., Rahmani, R., Heshmati, Z. and Fard, M. (2015). A Novel
Unsupervised Classification Method for Customs Fraud Detection, Indian Journal of Science
and Technology, 8(35).
[17] Roman, N., Ferreira, C., Meira, L., Rezende, R., Digiampietri, L. and Filho, J. (2019).
Attribute-Value Specification in Customs Fraud Detection: A Human-Aided Approach, Paper
presented at the 10th International Digital Government Research Conference, Puebla, Mexico.
[18] The European Parliament and the Council of the European Union. (2013). Union Customs
Code: Definitions. (Article 5). Strasbourg.
[19] The European Parliament and the Council of the European Union. (2013). Union Customs
Code: Risk management and customs control. (Article 46). Strasbourg.
[20] The European Parliament and the Council of the European Union. (2013). Union Customs
Code: Tariff classification of goods. (Article 57). Strasbourg.
[21] Triepels, R., Feelders, A. and Daniels, H. (2015). Uncovering Document Fraud in
Maritime Freight Transport Based on Probabilistic Classification, Computer Information
Systems and Industrial Management, 282-293.
[22] Wen, C., Hsu, P., Wang, C., Wu, T. and Hsu, M. (2012). E-government Information
Application: Identifying Smuggling Vessels with Data mining Technology, Electronic
Journal of e-Government, 10(2), 47-58.
[23] Xiao, Z., Xiao, H. and Wang, Y. (2016). A Risk Decision-making Approach to Customs
Targeting, The Open Cybernetics & Systemics Journal, 10(1), 250-262.
[24] Zehero, B., Soro, E., Gondo, Y., Brou, P. and Asseu, O. (2018). Elicitation of
Association Rules from Information on Customs Offences on the Basis of Frequent
Motives, Engineering, 10(09), 588-605.
[25] Zhu, Y., Wang, L. and Zhang, W. (2018). Detection of Contraband in Milk Powder
Cans by Using Stacked Auto-Encoders Combination with Support Vector Machine, IOP
Conference Series: Earth and Environmental Science, 170(3), 032114.
